stats - Statistics¶
Introduction¶
gammapy.stats
holds statistical estimators, fit statistics and algorithms
commonly used in gamma-ray astronomy.
It is mostly concerned with the evaluation of one or several observations that count events in a given region and time window, i.e. with Poisson-distributed counts measurements.
For on-off methods we will use the following variable names following the notation in [Cousins2007]:
Variable | Definition |
---|---|
n_on |
Total observed counts in the on region |
n_off |
Total observed counts in the off region |
mu_on |
Total expected counts in the on region |
mu_off |
Total expected counts in the off region |
mu_sig |
Signal expected counts in the on region |
mu_bkg |
Background expected counts in the on region |
a_on |
Relative background exposure in the on region |
a_off |
Relative background exposure in the off region |
alpha |
Background efficiency ratio a_on / a_off |
n_bkg |
Background estimate in the on region |
The following formulae show how an on-off measurement \((n_{on}, n_{off})\) is related to the quantities in the above table:
With the background estimate in the on region
the maximum likelihood estimate of a signal excess is
When the background is known and there is only an “on” region (sometimes also
called “source region”), we use the variable names n_on
, mu_on
,
mu_sig
and mu_bkg
.
These are references describing the available methods: [LiMa1983], [Cash1979], [Stewart2009], [Rolke2005], [Feldman1998], [Cousins2007].
Getting Started¶
Li & Ma Significance¶
[LiMa1983] (see equation 17)
As an example, assume you measured \(n_{on} = 18\) counts in a region where you suspect a source might be present and \(n_{off} = 97\) counts in a background control region where you assume no source is present and that is \(a_{off}/a_{on}=10\) times larger than the on-region.
Here’s how you compute the statistical significance of your detection with the Li & Ma formula:
>>> from gammapy.stats import significance_on_off
>>> significance_on_off(n_on=18, n_off=97, alpha=1. / 10, method='lima')
2.2421704424844875
Confidence Intervals¶
Assume you measured 6 counts in a Poissonian counting experiment with an expected background \(b = 3\). Here’s how you compute the 90% upper limit on the signal strength \(\\mu\):
import numpy as np
from scipy import stats
import gammapy.stats as gstats
x_bins = np.arange(0, 100)
mu_bins = np.linspace(0, 50, 50 / 0.005 + 1, endpoint=True)
matrix = [stats.poisson(mu + 3).pmf(x_bins) for mu in mu_bins]
acceptance_intervals = gstats.fc_construct_acceptance_intervals_pdfs(matrix, 0.9)
LowerLimitNum, UpperLimitNum, _ = gstats.fc_get_limits(mu_bins, x_bins, acceptance_intervals)
mu_upper_limit = gstats.fc_find_limit(6, UpperLimitNum, mu_bins)
The result is mu_upper_limit == 8.465
.
Reference/API¶
gammapy.stats Package¶
Statistics.
Functions¶
background (n_off, alpha) |
Estimate background in the on-region from an off-region observation. |
background_error (n_off, alpha) |
Estimate standard error on background in the on region from an off-region observation. |
cash (n_on, mu_on) |
Cash statistic, for Poisson data. |
cash_sum_cython () |
Summed cash fit statistics. |
combine_stats (stats_1, stats_2[, weight_method]) |
Combine using some weight method for the exposure. |
compute_total_stats (counts, exposure[, …]) |
Compute total stats for arrays of per-bin stats. |
cstat (n_on, mu_on[, n_on_min]) |
C statistic, for Poisson data. |
cstat_sum_cython () |
Summed cstat fit statistics. |
excess (n_on, n_off, alpha) |
Estimate excess in the on region for an on-off observation. |
excess_error (n_on, n_off, alpha) |
Estimate error on excess for an on-off measurement. |
excess_matching_significance (mu_bkg, …[, …]) |
Compute excess matching a given significance. |
excess_matching_significance_on_off (n_off, …) |
Compute sensitivity of an on-off observation. |
excess_ul_helene (excess, excess_error, …) |
Compute excess upper limit using the Helene method. |
fc_construct_acceptance_intervals (…) |
Convenience function that calculates the PDF for the user. |
fc_construct_acceptance_intervals_pdfs (…) |
Numerically choose bins a la Feldman Cousins ordering principle. |
fc_find_acceptance_interval_gauss (mu, sigma, …) |
Analytical acceptance interval for Gaussian with boundary at the origin. |
fc_find_acceptance_interval_poisson (mu, …) |
Analytical acceptance interval for Poisson process with background. |
fc_find_average_upper_limit (x_bins, matrix, …) |
Function to calculate the average upper limit for a confidence belt |
fc_find_limit (x_value, x_values, y_values) |
Find the limit for a given x measurement |
fc_fix_limits (lower_limit, upper_limit) |
Push limits outwards as described in the FC paper. |
fc_get_limits (mu_bins, x_bins, …) |
Find lower and upper limit from acceptance intervals. |
get_wstat_gof_terms (n_on, n_off) |
Calculate goodness of fit terms for wstat |
get_wstat_mu_bkg (n_on, n_off, alpha, mu_sig) |
Calculate mu_bkg for wstat |
make_stats (signal, background, area_factor) |
Fill using some weight method for the exposure. |
probability_to_significance_normal (probability) |
Convert one-sided tail probability to significance. |
probability_to_significance_normal_limit (…) |
Convert tail probability to significance in the limit of small p and large s. |
significance (n_on, mu_bkg[, method, n_on_min]) |
Compute significance for an observed number of counts and known background. |
significance_on_off (n_on, n_off, alpha[, method]) |
Compute significance of an on-off observation. |
significance_to_probability_normal (significance) |
Convert significance to one-sided tail probability. |
significance_to_probability_normal_limit (…) |
Convert significance to tail probability in the limit of small p and large s. |
wstat (n_on, n_off, alpha, mu_sig[, mu_bkg, …]) |
W statistic, for Poisson data with Poisson background. |