This is a fixed-text formatted version of a Jupyter notebook

Try online
You may download all the notebooks as a tar file.
Source files: spectral_analysis.ipynb | spectral_analysis.py

Spectral analysis#

Prerequisites#

Understanding how spectral extraction is performed in Cherenkov astronomy, in particular regarding OFF background measurements.
Understanding the basics data reduction and modeling/fitting process with the gammapy library API as shown in the first gammapy analysis with the gammapy library API tutorial

Context#

While 3D analyses allow in principle to consider complex field of views containing overlapping gamma-ray sources, in many cases we might have an observation with a single, strong, point-like source in the field of view. A spectral analysis, in that case, might consider all the events inside a source (or ON) region and bin them in energy only, obtaining 1D datasets.

In classical Cherenkov astronomy, the background estimation technique associated with this method measures the number of events in OFF regions taken in regions of the field-of-view devoid of gamma-ray emitters, where the background rate is assumed to be equal to the one in the ON region.

This allows to use a specific fit statistics for ON-OFF measurements, the wstat (see gammapy.stats.wstat), where no background model is assumed. Background is treated as a set of nuisance parameters. This removes some systematic effects connected to the choice or the quality of the background model. But this comes at the expense of larger statistical uncertainties on the fitted model parameters.

Objective: perform a full region based spectral analysis of 4 Crab observations of H.E.S.S. data release 1 and fit the resulting datasets.

Introduction#

Here, as usual, we use the gammapy.data.DataStore to retrieve a list of selected observations (gammapy.data.Observations). Then, we define the ON region containing the source and the geometry of the gammapy.datasets.SpectrumDataset object we want to produce. We then create the corresponding dataset Maker.

We have to define the Maker object that will extract the OFF counts from reflected regions in the field-of-view. To ensure we use data in an energy range where the quality of the IRFs is good enough we also create a safe range Maker.

We can then proceed with data reduction with a loop over all selected observations to produce datasets in the relevant geometry.

We can then explore the resulting datasets and look at the cumulative signal and significance of our source. We finally proceed with model fitting.

In practice, we have to: - Create a gammapy.data.DataStore poiting to the relevant data - Apply an observation selection to produce a list of observations, a gammapy.data.Observations object. - Define a geometry of the spectrum we want to produce: - Create a ~regions.CircleSkyRegion for the ON extraction region - Create a gammapy.maps.MapAxis for the energy binnings: one for the reconstructed (i.e. measured) energy, the other for the true energy (i.e. the one used by IRFs and models) - Create the necessary makers : - the spectrum dataset maker : gammapy.makers.SpectrumDatasetMaker - the OFF background maker, here a gammapy.makers.ReflectedRegionsBackgroundMaker - and the safe range maker : gammapy.makers.SafeMaskMaker - Perform the data reduction loop. And for every observation: - Apply the makers sequentially to produce a gammapy.datasets.SpectrumDatasetOnOff - Append it to list of datasets - Define the gammapy.modeling.models.SkyModel to apply to the dataset. - Create a gammapy.modeling.Fit object and run it to fit the model parameters - Apply a gammapy.estimators.FluxPointsEstimator to compute flux points for the spectral part of the fit.

Setup#

As usual, we’ll start with some setup …

[1]:

%matplotlib inline
import matplotlib.pyplot as plt

[2]:

# Check package versions
import gammapy
import numpy as np
import astropy
import regions

print("gammapy:", gammapy.__version__)
print("numpy:", np.__version__)
print("astropy", astropy.__version__)
print("regions", regions.__version__)

gammapy: 0.20.1
numpy: 1.22.3
astropy 5.0.4
regions 0.6

[3]:

from pathlib import Path
import astropy.units as u
from astropy.coordinates import SkyCoord, Angle
from regions import CircleSkyRegion
from gammapy.maps import MapAxis, RegionGeom, WcsGeom
from gammapy.modeling import Fit
from gammapy.data import DataStore
from gammapy.datasets import (
    Datasets,
    SpectrumDataset,
    SpectrumDatasetOnOff,
    FluxPointsDataset,
)
from gammapy.modeling.models import (
    ExpCutoffPowerLawSpectralModel,
    create_crab_spectral_model,
    SkyModel,
)
from gammapy.makers import (
    SafeMaskMaker,
    SpectrumDatasetMaker,
    ReflectedRegionsBackgroundMaker,
)
from gammapy.estimators import FluxPointsEstimator
from gammapy.visualization import plot_spectrum_datasets_off_regions

Load Data#

First, we select and load some H.E.S.S. observations of the Crab nebula (simulated events for now).

We will access the events, effective area, energy dispersion, livetime and PSF for containement correction.

[4]:

datastore = DataStore.from_dir("$GAMMAPY_DATA/hess-dl3-dr1/")
obs_ids = [23523, 23526, 23559, 23592]
observations = datastore.get_observations(obs_ids)

Define Target Region#

The next step is to define a signal extraction region, also known as on region. In the simplest case this is just a CircleSkyRegion.

[5]:

target_position = SkyCoord(ra=83.63, dec=22.01, unit="deg", frame="icrs")
on_region_radius = Angle("0.11 deg")
on_region = CircleSkyRegion(center=target_position, radius=on_region_radius)

Create exclusion mask#

We will use the reflected regions method to place off regions to estimate the background level in the on region. To make sure the off regions don’t contain gamma-ray emission, we create an exclusion mask.

Using http://gamma-sky.net/ we find that there’s only one known gamma-ray source near the Crab nebula: the AGN called RGB J0521+212 at GLON = 183.604 deg and GLAT = -8.708 deg.

[6]:

exclusion_region = CircleSkyRegion(
    center=SkyCoord(183.604, -8.708, unit="deg", frame="galactic"),
    radius=0.5 * u.deg,
)

skydir = target_position.galactic
geom = WcsGeom.create(
    npix=(150, 150), binsz=0.05, skydir=skydir, proj="TAN", frame="icrs"
)

exclusion_mask = ~geom.region_mask([exclusion_region])
exclusion_mask.plot();

../../../_images/tutorials_analysis_1D_spectral_analysis_12_0.png

Run data reduction chain#

We begin with the configuration of the maker classes:

[7]:

energy_axis = MapAxis.from_energy_bounds(
    0.1, 40, nbin=10, per_decade=True, unit="TeV", name="energy"
)
energy_axis_true = MapAxis.from_energy_bounds(
    0.05, 100, nbin=20, per_decade=True, unit="TeV", name="energy_true"
)

geom = RegionGeom.create(region=on_region, axes=[energy_axis])
dataset_empty = SpectrumDataset.create(
    geom=geom, energy_axis_true=energy_axis_true
)

[8]:

dataset_maker = SpectrumDatasetMaker(
    containment_correction=True, selection=["counts", "exposure", "edisp"]
)
bkg_maker = ReflectedRegionsBackgroundMaker(exclusion_mask=exclusion_mask)
safe_mask_masker = SafeMaskMaker(methods=["aeff-max"], aeff_percent=10)

[9]:

%%time
datasets = Datasets()

for obs_id, observation in zip(obs_ids, observations):
    dataset = dataset_maker.run(
        dataset_empty.copy(name=str(obs_id)), observation
    )
    dataset_on_off = bkg_maker.run(dataset, observation)
    dataset_on_off = safe_mask_masker.run(dataset_on_off, observation)
    datasets.append(dataset_on_off)

CPU times: user 6.68 s, sys: 202 ms, total: 6.88 s
Wall time: 7.38 s

Plot off regions#

[10]:

plt.figure(figsize=(8, 8))
ax = exclusion_mask.plot()
on_region.to_pixel(ax.wcs).plot(ax=ax, edgecolor="k")
plot_spectrum_datasets_off_regions(ax=ax, datasets=datasets)

[10]:

<WCSAxesSubplot:xlabel='Right Ascension', ylabel='Declination'>

../../../_images/tutorials_analysis_1D_spectral_analysis_18_1.png

Source statistic#

Next we’re going to look at the overall source statistics in our signal region.

[11]:

info_table = datasets.info_table(cumulative=True)

[12]:

info_table

[12]:

Table length=4

name	counts	excess	sqrt_ts	background	npred	npred_background	npred_signal	exposure_min	exposure_max	livetime	ontime	counts_rate	background_rate	excess_rate	n_bins	n_fit_bins	stat_type	stat_sum	counts_off	acceptance	acceptance_off	alpha
								m2 s	m2 s	s	s	1 / s	1 / s	1 / s
str7	int64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	int64	int64	str5	float64	int64	float64	float64	float64
stacked	149	139.25	20.449683569684254	9.75	20.461539024432028	20.461539024432028	nan	2892003.25	841726208.0	1581.7367646954954	1687.0	0.09420025084179189	0.006164110373875643	0.08803614046791623	27	18	wstat	433.5372460592368	117	18.0	216.0	0.0833333358168602
stacked	303	280.75	28.446255766130005	22.250001907348633	43.84615575068094	43.84615575068094	nan	13397219.0	1572412928.0	3154.4235137812793	3370.0	0.09605558628263806	0.007053587386139234	0.08900199950115721	27	19	wstat	823.6909484036685	267	19.0	227.99996948242188	0.0833333432674408
stacked	439	408.7743835449219	36.17588855633591	30.225610733032227	50.88364346926001	50.88364346926001	nan	19239702.0	2077411712.0	4732.546993609518	5056.0	0.09276188923063906	0.006386753427667313	0.08637513459388795	27	19	wstat	1325.2637060766638	594	19.0	373.3919677734375	0.05088486522436142
stacked	550	512.135498046875	40.90086300803346	37.864498138427734	59.674911682627844	59.674911682627844	nan	21017612.0	2635248128.0	6313.811659421773	6742.0	0.08711061236349418	0.0059970902175905914	0.08111352154172066	27	19	wstat	1701.2360965968023	869	19.0	436.05487060546875	0.04357249662280083

[13]:

plt.plot(
    info_table["livetime"].to("h"), info_table["excess"], marker="o", ls="none"
)
plt.xlabel("Livetime [h]")
plt.ylabel("Excess");

../../../_images/tutorials_analysis_1D_spectral_analysis_22_0.png

[14]:

plt.plot(
    info_table["livetime"].to("h"),
    info_table["sqrt_ts"],
    marker="o",
    ls="none",
)
plt.xlabel("Livetime [h]")
plt.ylabel("Sqrt(TS)");

../../../_images/tutorials_analysis_1D_spectral_analysis_23_0.png

Finally you can write the extracted datasets to disk using the OGIP format (PHA, ARF, RMF, BKG, see here for details):

[15]:

path = Path("spectrum_analysis")
path.mkdir(exist_ok=True)

[16]:

for dataset in datasets:
    dataset.write(
        filename=path / f"obs_{dataset.name}.fits.gz", overwrite=True
    )

If you want to read back the datasets from disk you can use:

[17]:

datasets = Datasets()

for obs_id in obs_ids:
    filename = path / f"obs_{obs_id}.fits.gz"
    datasets.append(SpectrumDatasetOnOff.read(filename))

Fit spectrum#

Now we’ll fit a global model to the spectrum. First we do a joint likelihood fit to all observations. If you want to stack the observations see below. We will also produce a debug plot in order to show how the global fit matches one of the individual observations.

[18]:

spectral_model = ExpCutoffPowerLawSpectralModel(
    amplitude=1e-12 * u.Unit("cm-2 s-1 TeV-1"),
    index=2,
    lambda_=0.1 * u.Unit("TeV-1"),
    reference=1 * u.TeV,
)
model = SkyModel(spectral_model=spectral_model, name="crab")

datasets.models = [model]

fit_joint = Fit()
result_joint = fit_joint.run(datasets=datasets)

# we make a copy here to compare it later
model_best_joint = model.copy()

Fit quality and model residuals#

We can access the results dictionary to see if the fit converged:

[19]:

print(result_joint)

OptimizeResult

        backend    : minuit
        method     : migrad
        success    : True
        message    : Optimization terminated successfully.
        nfev       : 244
        total stat : 86.12

CovarianceResult

        backend    : minuit
        method     : hesse
        success    : True
        message    : Hesse terminated successfully.

and check the best-fit parameters

[20]:

datasets.models.to_parameters_table()

[20]:

Table length=5

model	type	name	value	unit	error	min	max	frozen	is_norm	link
str4	str8	str9	float64	str14	float64	float64	float64	bool	bool	str1
crab	spectral	index	2.2727e+00		1.566e-01	nan	nan	False	False
crab	spectral	amplitude	4.7913e-11	cm-2 s-1 TeV-1	3.600e-12	nan	nan	False	True
crab	spectral	reference	1.0000e+00	TeV	0.000e+00	nan	nan	True	False
crab	spectral	lambda_	1.2097e-01	TeV-1	5.382e-02	nan	nan	False	False
crab	spectral	alpha	1.0000e+00		0.000e+00	nan	nan	True	False

A simple way to inspect the model residuals is using the function ~SpectrumDataset.plot_fit()

[21]:

ax_spectrum, ax_residuals = datasets[0].plot_fit()
ax_spectrum.set_ylim(0.1, 40)
datasets[0].plot_masks(ax=ax_spectrum);

../../../_images/tutorials_analysis_1D_spectral_analysis_37_0.png

For more ways of assessing fit quality, please refer to the dedicated modeling and fitting tutorial.

Compute Flux Points#

To round up our analysis we can compute flux points by fitting the norm of the global model in energy bands. We’ll use a fixed energy binning for now:

[22]:

e_min, e_max = 0.7, 30
energy_edges = np.geomspace(e_min, e_max, 11) * u.TeV

Now we create an instance of the gammapy.estimators.FluxPointsEstimator, by passing the dataset and the energy binning:

[23]:

fpe = FluxPointsEstimator(
    energy_edges=energy_edges, source="crab", selection_optional="all"
)
flux_points = fpe.run(datasets=datasets)

Here is a the table of the resulting flux points:

[24]:

flux_points.to_table(sed_type="dnde", formatted=True)

[24]:

Table length=10

e_ref	e_min	e_max	dnde	dnde_err	dnde_errp	dnde_errn	dnde_ul	ts	sqrt_ts	npred [4]	npred_excess [4]	stat	is_ul	counts [4]	success	norm_scan [11]	stat_scan [11]
TeV	TeV	TeV	1 / (cm2 s TeV)	1 / (cm2 s TeV)	1 / (cm2 s TeV)	1 / (cm2 s TeV)	1 / (cm2 s TeV)
float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float32	float64	bool	float64	bool	float64	float64
0.823	0.737	0.920	6.950e-11	7.224e-12	7.462e-12	7.000e-12	8.496e-11	318.308	17.841	31.36646258047154 .. 23.889729207163217	28.875505 .. 21.720263	8.252	False	30.0 .. 25.0	True	0.200 .. 5.000	140.739 .. 451.124
1.148	0.920	1.434	2.884e-11	2.581e-12	2.654e-12	2.510e-12	3.430e-11	449.770	21.208	40.721594294634784 .. 29.627449286871947	38.467556 .. 27.449282	12.649	False	43.0 .. 35.0	True	0.200 .. 5.000	182.603 .. 693.653
1.790	1.434	2.235	1.104e-11	1.132e-12	1.168e-12	1.096e-12	1.345e-11	350.109	18.711	31.973739526878333 .. 21.124620512912387	29.863094 .. 19.905582	6.105	False	37.0 .. 21.0	True	0.200 .. 5.000	150.799 .. 424.855
2.790	2.235	3.483	3.462e-12	4.450e-13	4.631e-13	4.273e-13	4.426e-12	219.114	14.802	20.86827471761865 .. 13.385553096466168	19.729824 .. 12.434938	12.915	False	13.0 .. 17.0	True	0.200 .. 5.000	102.546 .. 293.446
3.892	3.483	4.348	8.756e-13	2.576e-13	2.800e-13	2.362e-13	1.482e-12	36.235	6.020	4.959042980000916 .. 2.702481330476382	4.168653 .. 2.5370095	4.035	False	8.0 .. 2.0	True	0.200 .. 5.000	13.594 .. 125.092
5.429	4.348	6.778	5.335e-13	1.079e-13	1.149e-13	1.011e-13	7.776e-13	86.926	9.323	8.52067876691984 .. 5.173151592973639	8.258547 .. 4.9781284	11.815	False	12.0 .. 6.0	True	0.200 .. 5.000	47.101 .. 132.109
8.462	6.778	10.564	1.623e-13	4.547e-14	4.941e-14	4.186e-14	2.695e-13	38.108	6.173	5.121380116204808 .. 3.1055485072447895	4.620307 .. 2.731345	8.980	False	5.0 .. 5.0	True	0.200 .. 5.000	28.474 .. 55.783
11.803	10.564	13.189	5.214e-14	3.008e-14	3.529e-14	2.546e-14	1.339e-13	7.911	2.813	1.2131436603915728 .. 0.7323859566564407	1.1362206 .. 0.6783319	7.934	False	0.0 .. 0.0	True	0.200 .. 5.000	12.285 .. 18.679
16.465	13.189	20.556	1.177e-14	7.816e-15	9.399e-15	6.393e-15	3.395e-14	5.269	2.295	0.9432263496672387 .. 0.6987575077748854	0.84752977 .. 0.50956833	6.611	False	1.0 .. 0.0	True	0.200 .. 5.000	9.622 .. 17.557
25.663	20.556	32.040	nan	nan	1.094e-15	nan	4.378e-15	-0.012	-0.000	0.07895530597715192 .. 0.10937377988399984	0.002032229 .. 0.0012656718	0.633	True	0.0 .. 0.0	False	0.200 .. 5.000	0.866 .. 6.774

Now we plot the flux points and their likelihood profiles. For the plotting of upper limits we choose a threshold of TS < 4.

[25]:

plt.figure(figsize=(8, 5))
ax = flux_points.plot(sed_type="e2dnde", color="darkorange")
flux_points.plot_ts_profiles(ax=ax, sed_type="e2dnde");

../../../_images/tutorials_analysis_1D_spectral_analysis_46_0.png

The final plot with the best fit model, flux points and residuals can be quickly made like this:

[26]:

flux_points_dataset = FluxPointsDataset(
    data=flux_points, models=model_best_joint
)

[27]:

flux_points_dataset.plot_fit();

../../../_images/tutorials_analysis_1D_spectral_analysis_49_0.png

Stack observations#

And alternative approach to fitting the spectrum is stacking all observations first and the fitting a model. For this we first stack the individual datasets:

[28]:

dataset_stacked = Datasets(datasets).stack_reduce()

Again we set the model on the dataset we would like to fit (in this case it’s only a single one) and pass it to the gammapy.modeling.Fit object:

[29]:

dataset_stacked.models = model
stacked_fit = Fit()
result_stacked = stacked_fit.run([dataset_stacked])

# make a copy to compare later
model_best_stacked = model.copy()

[30]:

print(result_stacked)

OptimizeResult

        backend    : minuit
        method     : migrad
        success    : True
        message    : Optimization terminated successfully.
        nfev       : 54
        total stat : 8.16

CovarianceResult

        backend    : minuit
        method     : hesse
        success    : True
        message    : Hesse terminated successfully.

[31]:

model_best_joint.parameters.to_table()

[31]:

Table length=5

type	name	value	unit	error	min	max	frozen	is_norm	link
str8	str9	float64	str14	float64	float64	float64	bool	bool	str1
spectral	index	2.2727e+00		1.566e-01	nan	nan	False	False
spectral	amplitude	4.7913e-11	cm-2 s-1 TeV-1	3.600e-12	nan	nan	False	True
spectral	reference	1.0000e+00	TeV	0.000e+00	nan	nan	True	False
spectral	lambda_	1.2097e-01	TeV-1	5.382e-02	nan	nan	False	False
spectral	alpha	1.0000e+00		0.000e+00	nan	nan	True	False

[32]:

model_best_stacked.parameters.to_table()

[32]:

Table length=5

type	name	value	unit	error	min	max	frozen	is_norm	link
str8	str9	float64	str14	float64	float64	float64	bool	bool	str1
spectral	index	2.2785e+00		1.563e-01	nan	nan	False	False
spectral	amplitude	4.7800e-11	cm-2 s-1 TeV-1	3.566e-12	nan	nan	False	True
spectral	reference	1.0000e+00	TeV	0.000e+00	nan	nan	True	False
spectral	lambda_	1.1830e-01	TeV-1	5.329e-02	nan	nan	False	False
spectral	alpha	1.0000e+00		0.000e+00	nan	nan	True	False

Finally, we compare the results of our stacked analysis to a previously published Crab Nebula Spectrum for reference. This is available in gammapy.modeling.models.create_crab_spectral_model.

[33]:

plot_kwargs = {
    "energy_bounds": [0.1, 30] * u.TeV,
    "sed_type": "e2dnde",
    "yunits": u.Unit("erg cm-2 s-1"),
}

# plot stacked model
model_best_stacked.spectral_model.plot(
    **plot_kwargs, label="Stacked analysis result"
)
model_best_stacked.spectral_model.plot_error(
    facecolor="blue", alpha=0.3, **plot_kwargs
)

# plot joint model
model_best_joint.spectral_model.plot(
    **plot_kwargs, label="Joint analysis result", ls="--"
)
model_best_joint.spectral_model.plot_error(
    facecolor="orange", alpha=0.3, **plot_kwargs
)

create_crab_spectral_model("hess_ecpl").plot(
    **plot_kwargs, label="Crab reference"
)
plt.legend()

[33]:

<matplotlib.legend.Legend at 0x146fe3af0>

../../../_images/tutorials_analysis_1D_spectral_analysis_58_1.png

Exercises#

Now you have learned the basics of a spectral analysis with Gammapy. To practice you can continue with the following exercises:

Fit a different spectral model to the data. You could try gammapy.modeling.models.ExpCutoffPowerLawSpectralModel or gammapy.modeling.models.LogParabolaSpectralModel.
Compute flux points for the stacked dataset.
Create a gammapy.datasets.FluxPointsDataset with the flux points you have computed for the stacked dataset and fit the flux points again with obe of the spectral models. How does the result compare to the best fit model, that was directly fitted to the counts data?

What next?#

The methods shown in this tutorial is valid for point-like or midly extended sources where we can assume that the IRF taken at the region center is valid over the whole region. If one wants to extract the 1D spectrum of a large source and properly average the response over the extraction region, one has to use a different approach explained in the extended source spectral analysis tutorial.