.. _stats: ****************** stats - Statistics ****************** .. currentmodule:: gammapy.stats .. _stats-introduction: Introduction ============ `gammapy.stats` holds statistical estimators, fit statistics and algorithms commonly used in gamma-ray astronomy. It is mostly concerned with the evaluation of one or several observations that count events in a given region and time window, i.e. with Poisson-distributed counts measurements. For on-off methods we will use the following variable names following the notation in [Cousins2007]_: ================= ====================== ==================================================== Variable Dataset attribute name Definition ================= ====================== ==================================================== ``n_on`` ``counts`` Total observed counts in the on region ``n_off`` ``counts_off`` Total observed counts in the off region ``mu_on`` ``npred`` Total expected counts in the on region ``mu_off`` ``npred_off`` Total expected counts in the off region ``mu_sig`` ``npred_sig`` Signal expected counts in the on region ``mu_bkg`` ``npred_bkg`` Background expected counts in the on region ``a_on`` ``acceptance`` Relative background exposure in the on region ``a_off`` ``acceptance_off`` Relative background exposure in the off region ``alpha`` ``alpha`` Background efficiency ratio ``a_on`` / ``a_off`` ``n_bkg`` ``background`` Background estimate in the on region ================= ====================== ==================================================== The following formulae show how an on-off measurement :math:`(n_{on}, n_{off})` is related to the quantities in the above table: .. math:: n_{on} \sim Pois(\mu_{on})\text{ with }\mu_{on} = \mu_s + \mu_b n_{off} \sim Pois(\mu_{off})\text{ with }\mu_{off} = \mu_b / \alpha\text{ with }\alpha = a_{on} / a_{off} With the background estimate in the on region .. math:: n_{bkg} = \alpha\ n_{off}, the maximum likelihood estimate of a signal excess is .. math:: n_{excess} = n_{on} - n_{bkg}. When the background is known and there is only an "on" region (sometimes also called "source region"), we use the variable names ``n_on``, ``mu_on``, ``mu_sig`` and ``mu_bkg``. These are references describing the available methods: [LiMa1983]_, [Cash1979]_, [Stewart2009]_, [Rolke2005]_, [Feldman1998]_, [Cousins2007]_. Getting Started =============== Li \& Ma Significance --------------------- [LiMa1983]_ (see equation 17) As an example, assume you measured :math:`n_{on} = 18` counts in a region where you suspect a source might be present and :math:`n_{off} = 97` counts in a background control region where you assume no source is present and that is :math:`a_{off}/a_{on}=10` times larger than the on-region. Here's how you compute the statistical significance of your detection with the Li \& Ma formula: .. code-block:: python >>> from gammapy.stats import significance_on_off >>> significance_on_off(n_on=18, n_off=97, alpha=1. / 10, method='lima') 2.2421704424844875 Confidence Intervals -------------------- Assume you measured 6 counts in a Poissonian counting experiment with an expected background :math:`b = 3`. Here's how you compute the 90% upper limit on the signal strength :math:`\mu`: .. code-block:: python import numpy as np from scipy import stats import gammapy.stats as gstats x_bins = np.arange(0, 100) mu_bins = np.linspace(0, 50, 50 / 0.005 + 1, endpoint=True) matrix = [stats.poisson(mu + 3).pmf(x_bins) for mu in mu_bins] acceptance_intervals = gstats.fc_construct_acceptance_intervals_pdfs(matrix, 0.9) LowerLimitNum, UpperLimitNum, _ = gstats.fc_get_limits(mu_bins, x_bins, acceptance_intervals) mu_upper_limit = gstats.fc_find_limit(6, UpperLimitNum, mu_bins) The result is ``mu_upper_limit == 8.465``. Using `gammapy.stats` ===================== .. toctree:: :maxdepth: 1 feldman_cousins fit_statistics Reference/API ============= .. automodapi:: gammapy.stats :no-inheritance-diagram: :include-all-objects: