# Statistics tools (gammapy.stats)¶

## Introduction¶

gammapy.stats holds statistical estimators, fit statistics and algorithms commonly used in gamma-ray astronomy.

It is mostly concerned with the evaluation of one or several observations that count events in a given region and time window, i.e. with Poisson-distributed counts measurements.

For on-off methods we will use the following variable names following the notation in [Cousins2007]:

Variable Definition
n_on Total observed counts in the on region
n_off Total observed counts in the off region
mu_on Total expected counts in the on region
mu_off Total expected counts in the off region
mu_sig Signal expected counts in the on region
mu_bkg Background expected counts in the on region
a_on Relative background efficiency in the on region
a_off Relative background efficiency in the off region
alpha Background efficiency ratio a_on / a_off
n_bkg Background estimate in the on region

The following formulae show how an on-off measurement $$(n_{on}, n_{off})$$ is related to the quantities in the above table:

\begin{align}\begin{aligned}n_{on} \sim Pois(\mu_{on})\text{ with }\mu_{on} = \mu_s + \mu_b\\n_{off} \sim Pois(\mu_{off})\text{ with }\mu_{off} = \mu_b / \alpha\text{ with }\alpha = a_{on} / a_{off}\end{aligned}\end{align}

With the background estimate in the on region

$n_{bkg} = \alpha\ n_{off},$

the maximum likelihood estimate of a signal excess is

$n_{excess} = n_{on} - n_{bkg}.$

When the background is known and there is only an “on” region (sometimes also called “source region”), we use the variable names n_on, mu_on, mu_sig and mu_bkg.

These are references describing the available methods: [LiMa1983], [Cash1979], [Stewart2009], [Rolke2005], [Feldman1998], [Cousins2007].

## Getting Started¶

### Li & Ma Significance¶

As an example, assume you measured $$n_{on} = 18$$ counts in a region where you suspect a source might be present and $$n_{off} = 97$$ counts in a background control region where you assume no source is present and that is $$a_{off}/a_{on}=10$$ times larger than the on-region.

Here’s how you compute the statistical significance of your detection with the Li & Ma formula:

>>> from gammapy.stats import significance_on_off
>>> significance_on_off(n_on=18, n_off=97, alpha=1. / 10, method='lima')
2.2421704424844875


### Confidence Intervals¶

Assume you measured 6 counts in a Poissonian counting experiment with an expected background $$b = 3$$. Here’s how you compute the 90% upper limit on the signal strength $$\\mu$$:

import numpy as np
from scipy import stats
import gammapy.stats as gstats

x_bins = np.arange(0, 100)
mu_bins = np.linspace(0, 50, 50 / 0.005 + 1, endpoint=True)

matrix = [stats.poisson(mu + 3).pmf(x_bins) for mu in mu_bins]
acceptance_intervals = gstats.fc_construct_acceptance_intervals_pdfs(matrix, 0.9)
LowerLimitNum, UpperLimitNum, _ = gstats.fc_get_limits(mu_bins, x_bins, acceptance_intervals)
mu_upper_limit = gstats.fc_find_limit(6, UpperLimitNum, mu_bins)


The result is mu_upper_limit == 8.465.

## Reference/API¶

### gammapy.stats Package¶

Statistics utility functions and classes.

#### Functions¶

 background(n_off, alpha) Estimate background in the on-region from an off-region observation. background_error(n_off, alpha) Estimate standard error on background in the on region from an off-region observation. cash(n_on, mu_on) Cash statistic, for Poisson data. chi2(N_S, B, S, sigma2) Chi-square statistic with user-specified variance. chi2constvar(N_S, N_B, A_S, A_B) Chi-square statistic with constant variance. chi2datavar(N_S, N_B, A_S, A_B) Chi-square statistic with data variance. chi2gehrels(N_S, N_B, A_S, A_B) Chi-square statistic with Gehrel’s variance. chi2modvar(S, B, A_S, A_B) Chi-square statistic with model variance. chi2xspecvar(N_S, N_B, A_S, A_B) Chi-square statistic with XSPEC variance. combine_stats(stats_1, stats_2[, weight_method]) Combine using some weight method for the exposure. compute_total_stats(counts, exposure[, ...]) Compute total stats for arrays of per-bin stats. convert_likelihood(to[, probability, ...]) Convert between various equivalent likelihood measures. cov_to_corr(covariance) Compute correlation matrix from covariance matrix. cstat(n_on, mu_on[, n_on_min]) C statistic, for Poisson data. excess(n_on, n_off, alpha) Estimate excess in the on region for an on-off observation. excess_error(n_on, n_off, alpha) Estimate standard error on excess in the on region for an on-off observation. fc_construct_acceptance_intervals(...) Convenience function that calculates the PDF for the user. fc_construct_acceptance_intervals_pdfs(...) Numerically choose bins a la Feldman Cousins ordering principle. fc_find_acceptance_interval_gauss(mu, sigma, ...) Analytical acceptance interval for Gaussian with boundary at the origin. fc_find_acceptance_interval_poisson(mu, ...) Analytical acceptance interval for Poisson process with background. fc_find_average_upper_limit(x_bins, matrix, ...) Function to calculate the average upper limit for a confidence belt fc_find_limit(x_value, x_values, y_values) Find the limit for a given x measurement fc_fix_limits(lower_limit, upper_limit) Push limits outwards as described in the FC paper. fc_get_limits(mu_bins, x_bins, ...) Find lower and upper limit from acceptance intervals. get_wstat_gof_terms(n_on, n_off) Calculate goodness of fit terms for wstat get_wstat_mu_bkg(n_on, n_off, alpha, mu_sig) Calculate mu_bkg for wstat lstat() L statistic, for Poisson data with Poisson background (Bayesian). make_stats(signal, background, area_factor) Fill using some weight method for the exposure. pgstat() PG statistic, for Poisson data with Gaussian background. probability_to_significance_normal(probability) Convert one-sided tail probability to significance. probability_to_significance_normal_limit(...) Convert tail probability to significance in the limit of small p and large s. sensitivity(mu_bkg, significance[, ...]) Compute sensitivity. sensitivity_on_off(n_off, alpha, significance) Compute sensitivity of an on-off observation. significance(n_on, mu_bkg[, method, n_on_min]) Compute significance for an observed number of counts and known background. significance_on_off(n_on, n_off, alpha[, ...]) Compute significance of an on-off observation. significance_to_probability_normal(significance) Convert significance to one-sided tail probability. significance_to_probability_normal_limit(...) Convert significance to tail probability in the limit of small p and large s. wstat(n_on, n_off, alpha, mu_sig[, mu_bkg, ...]) W statistic, for Poisson data with Poisson background.

#### Classes¶

 Stats(n_on, n_off, a_on, a_off) Container for an on-off observation.