This is a fixed-text formatted version of a Jupyter notebook

Spectral analysis

Prerequisites

  • Understanding how spectral extraction is performed in Cherenkov astronomy, in particular regarding OFF background measurements.

  • Understanding the basic data reduction and modeling/fitting process with the gammapy library API, as shown in the first gammapy analysis with the gammapy library API tutorial

Context

While 3D analyses can in principle handle complex fields of view containing overlapping gamma-ray sources, in many cases we have an observation with a single, strong, point-like source in the field of view. A spectral analysis, in that case, can consider all the events inside a source (or ON) region and bin them in energy only, obtaining 1D datasets.

In classical Cherenkov astronomy, the background estimation technique associated with this method counts the events in OFF regions, placed in parts of the field of view devoid of gamma-ray emitters, where the background rate is assumed to be equal to the one in the ON region.

This allows us to use a specific fit statistic for ON-OFF measurements, the wstat (see gammapy.stats.fit_statistics), where no background model is assumed. Instead, the background is treated as a set of nuisance parameters. This removes some systematic effects connected to the choice or the quality of the background model, at the expense of larger statistical uncertainties on the fitted model parameters.
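To make the idea of a profiled background concrete, here is a minimal sketch of an ON-OFF statistic written from the Poisson likelihood of the two measurements. It is not the gammapy implementation: the function names are ours, constant terms are dropped, and a non-negative signal is assumed. The counts used at the end are the ones reported for the first observation in the source statistics table further down.

```python
import math


def profile_background(n_on, n_off, alpha, mu_sig):
    """Expected OFF counts that maximise the ON-OFF likelihood for a given signal.

    Solves d(log L)/d(mu_bkg) = 0 for
    L = Poisson(n_on | mu_sig + alpha * mu_bkg) * Poisson(n_off | mu_bkg).
    """
    c = alpha * (n_on + n_off) - (1 + alpha) * mu_sig
    d = math.sqrt(c**2 + 4 * alpha * (alpha + 1) * n_off * mu_sig)
    return (c + d) / (2 * alpha * (alpha + 1))


def wstat(n_on, n_off, alpha, mu_sig):
    """-2 log L with the background profiled out (terms constant in mu_sig dropped)."""
    mu_bkg = profile_background(n_on, n_off, alpha, mu_sig)
    stat = mu_sig + (1 + alpha) * mu_bkg
    if n_on > 0:
        stat -= n_on * math.log(mu_sig + alpha * mu_bkg)
    if n_off > 0:
        stat -= n_off * math.log(mu_bkg)
    return 2 * stat


# Counts of the first observation: 149 ON events, 117 OFF events, alpha = 1/12
n_on, n_off, alpha = 149, 117, 1 / 12
for mu_sig in (120.0, n_on - alpha * n_off, 160.0):
    print(f"mu_sig = {mu_sig:7.2f}  wstat = {wstat(n_on, n_off, alpha, mu_sig):.3f}")
```

The statistic is smallest at mu_sig = n_on - alpha * n_off, where the profiled background equals the measured OFF counts. No background model enters the calculation, only the OFF measurement itself.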

Objective: perform a full region-based spectral analysis of 4 Crab observations of H.E.S.S. data release 1 and fit the resulting datasets.

Introduction

Here, as usual, we use the gammapy.data.DataStore to retrieve a list of selected observations (gammapy.data.Observations). Then, we define the ON region containing the source and the geometry of the gammapy.datasets.SpectrumDataset object we want to produce. We then create the corresponding dataset Maker.

We have to define the Maker object that will extract the OFF counts from reflected regions in the field of view. To ensure we use data in an energy range where the quality of the IRFs is good enough, we also create a safe mask Maker.

We can then proceed with data reduction with a loop over all selected observations to produce datasets in the relevant geometry.

We can then explore the resulting datasets and look at the cumulative signal and significance of our source. We finally proceed with model fitting.

In practice, we have to:

  • Create a gammapy.data.DataStore pointing to the relevant data

  • Apply an observation selection to produce a list of observations, a gammapy.data.Observations object

  • Define a geometry of the spectrum we want to produce:

      • Create a regions.CircleSkyRegion for the ON extraction region

      • Create a gammapy.maps.MapAxis for each energy binning: one for the reconstructed (i.e. measured) energy, the other for the true energy (i.e. the one used by IRFs and models)

  • Create the necessary makers:

      • the spectrum dataset maker: gammapy.makers.SpectrumDatasetMaker

      • the OFF background maker, here a gammapy.makers.ReflectedRegionsBackgroundMaker

      • the safe mask maker: gammapy.makers.SafeMaskMaker

  • Perform the data reduction loop. For every observation:

      • Apply the makers sequentially to produce a gammapy.datasets.SpectrumDatasetOnOff

      • Append it to the list of datasets

  • Define the gammapy.modeling.models.SkyModel to apply to the dataset

  • Create a gammapy.modeling.Fit object and run it to fit the model parameters

  • Apply a gammapy.estimators.FluxPointsEstimator to compute flux points for the spectral part of the fit

Setup

As usual, we’ll start with some setup …

[1]:
%matplotlib inline
import matplotlib.pyplot as plt
[2]:
# Check package versions
import gammapy
import numpy as np
import astropy
import regions

print("gammapy:", gammapy.__version__)
print("numpy:", np.__version__)
print("astropy", astropy.__version__)
print("regions", regions.__version__)
gammapy: 0.19
numpy: 1.21.4
astropy 4.3.1
regions 0.5
[3]:
from pathlib import Path
import astropy.units as u
from astropy.coordinates import SkyCoord, Angle
from regions import CircleSkyRegion
from gammapy.maps import MapAxis, RegionGeom, WcsGeom
from gammapy.modeling import Fit
from gammapy.data import DataStore
from gammapy.datasets import (
    Datasets,
    SpectrumDataset,
    SpectrumDatasetOnOff,
    FluxPointsDataset,
)
from gammapy.modeling.models import (
    ExpCutoffPowerLawSpectralModel,
    create_crab_spectral_model,
    SkyModel,
)
from gammapy.makers import (
    SafeMaskMaker,
    SpectrumDatasetMaker,
    ReflectedRegionsBackgroundMaker,
)
from gammapy.estimators import FluxPointsEstimator
from gammapy.visualization import plot_spectrum_datasets_off_regions

Load Data

First, we select and load some H.E.S.S. observations of the Crab nebula.

We will access the events, effective area, energy dispersion, livetime and PSF for containment correction.

[4]:
datastore = DataStore.from_dir("$GAMMAPY_DATA/hess-dl3-dr1/")
obs_ids = [23523, 23526, 23559, 23592]
observations = datastore.get_observations(obs_ids)
No HDU found matching: OBS_ID = 23523, HDU_TYPE = rad_max, HDU_CLASS = None
No HDU found matching: OBS_ID = 23526, HDU_TYPE = rad_max, HDU_CLASS = None
No HDU found matching: OBS_ID = 23559, HDU_TYPE = rad_max, HDU_CLASS = None
No HDU found matching: OBS_ID = 23592, HDU_TYPE = rad_max, HDU_CLASS = None

Define Target Region

The next step is to define a signal extraction region, also known as the on region. In the simplest case this is just a CircleSkyRegion.

[5]:
target_position = SkyCoord(ra=83.63, dec=22.01, unit="deg", frame="icrs")
on_region_radius = Angle("0.11 deg")
on_region = CircleSkyRegion(center=target_position, radius=on_region_radius)

Create exclusion mask

We will use the reflected regions method to place off regions to estimate the background level in the on region. To make sure the off regions don’t contain gamma-ray emission, we create an exclusion mask.

Using http://gamma-sky.net/ we find that there’s only one known gamma-ray source near the Crab nebula: the AGN called RGB J0521+212 at GLON = 183.604 deg and GLAT = -8.708 deg.

[6]:
exclusion_region = CircleSkyRegion(
    center=SkyCoord(183.604, -8.708, unit="deg", frame="galactic"),
    radius=0.5 * u.deg,
)

skydir = target_position.galactic
geom = WcsGeom.create(
    npix=(150, 150), binsz=0.05, skydir=skydir, proj="TAN", frame="icrs"
)

exclusion_mask = ~geom.region_mask([exclusion_region])
exclusion_mask.plot();

Run data reduction chain

We begin with the configuration of the maker classes:

[7]:
energy_axis = MapAxis.from_energy_bounds(
    0.1, 40, nbin=10, per_decade=True, unit="TeV", name="energy"
)
energy_axis_true = MapAxis.from_energy_bounds(
    0.05, 100, nbin=20, per_decade=True, unit="TeV", name="energy_true"
)

geom = RegionGeom.create(region=on_region, axes=[energy_axis])
dataset_empty = SpectrumDataset.create(
    geom=geom, energy_axis_true=energy_axis_true
)
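As a rough check of what the two energy axes contain, the sketch below rebuilds log-spaced edges in plain Python. It assumes that the number of bins per decade is rounded up to a whole number of bins over the full range; the helper name is ours and is not part of gammapy. With that assumption the reconstructed axis has 27 bins, which agrees with the n_bins value reported in the source statistics table later on.

```python
import math


def log_energy_edges(e_min, e_max, bins_per_decade):
    """Log-spaced bin edges between e_min and e_max (both in the same unit)."""
    # Assumption: the bin count is rounded up so the range is fully covered
    n_bins = math.ceil(math.log10(e_max / e_min) * bins_per_decade)
    ratio = (e_max / e_min) ** (1 / n_bins)
    return [e_min * ratio**k for k in range(n_bins + 1)]


reco_edges = log_energy_edges(0.1, 40, 10)    # reconstructed energy, TeV
true_edges = log_energy_edges(0.05, 100, 20)  # true energy, TeV

print(len(reco_edges) - 1, "reconstructed-energy bins")
print(len(true_edges) - 1, "true-energy bins")
```

The true-energy axis is both wider and finer than the reconstructed one, so that events migrating across the edges of the reconstructed range are still described by the energy dispersion.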
[8]:
dataset_maker = SpectrumDatasetMaker(
    containment_correction=True, selection=["counts", "exposure", "edisp"]
)
bkg_maker = ReflectedRegionsBackgroundMaker(exclusion_mask=exclusion_mask)
safe_mask_maker = SafeMaskMaker(methods=["aeff-max"], aeff_percent=10)
[9]:
%%time
datasets = Datasets()

for obs_id, observation in zip(obs_ids, observations):
    # Fill counts, exposure and energy dispersion for the ON region
    dataset = dataset_maker.run(
        dataset_empty.copy(name=str(obs_id)), observation
    )
    # Add the OFF counts measured in the reflected regions
    dataset_on_off = bkg_maker.run(dataset, observation)
    # Mask the energy range where the IRFs are not reliable
    dataset_on_off = safe_mask_maker.run(dataset_on_off, observation)
    datasets.append(dataset_on_off)
CPU times: user 1.65 s, sys: 97.6 ms, total: 1.75 s
Wall time: 1.46 s
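The safe mask configured with methods=["aeff-max"] and aeff_percent=10 discards the low-energy range where the effective area is below 10% of its maximum. The sketch below illustrates that rule on a hypothetical effective area curve. The energies and areas are invented for illustration, and the function works on tabulated values only, whereas an actual maker may interpolate the IRF.

```python
def aeff_max_safe_mask(energies, aeff, aeff_percent=10):
    """Keep energies at or above the point where aeff first reaches
    aeff_percent of its maximum value."""
    threshold_area = max(aeff) * aeff_percent / 100
    first_good = next(i for i, area in enumerate(aeff) if area >= threshold_area)
    e_threshold = energies[first_good]
    return [energy >= e_threshold for energy in energies]


# Hypothetical effective area curve (not taken from the H.E.S.S. IRFs)
energies = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0]       # TeV
aeff = [2e3, 1.5e4, 9e4, 2.0e5, 2.6e5, 3.0e5, 2.8e5]  # m2

mask = aeff_max_safe_mask(energies, aeff)
print(mask)
```

Only the first two energies fall below the threshold in this example, so they are the ones excluded from the fit.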

Plot off regions

[10]:
plt.figure(figsize=(8, 8))
ax = exclusion_mask.plot()
on_region.to_pixel(ax.wcs).plot(ax=ax, edgecolor="k")
plot_spectrum_datasets_off_regions(ax=ax, datasets=datasets)
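The OFF regions plotted by the cell above are obtained by rotating the ON region around the pointing position, so that every OFF region sits at the same offset and sees the same acceptance. A minimal geometric sketch of this idea, in a flat-sky approximation, could look as follows. The function name and the pointing offset of 0.5 deg are assumptions made for the example. The real maker additionally works on the sphere, honours the exclusion mask and applies minimum-distance settings, so it can return fewer regions than this sketch.

```python
import math


def reflected_centers(pointing, on_center, radius):
    """Centres of circles obtained by rotating the ON circle around the
    pointing position, spaced so that neighbouring circles just touch."""
    dx, dy = on_center[0] - pointing[0], on_center[1] - pointing[1]
    offset = math.hypot(dx, dy)
    if offset <= radius:
        return []  # the ON region covers the pointing position
    start = math.atan2(dy, dx)
    step = 2 * math.asin(radius / offset)  # angle subtended by one circle
    n_slots = int(2 * math.pi / step)      # circles that fit on the ring
    return [
        (
            pointing[0] + offset * math.cos(start + k * step),
            pointing[1] + offset * math.sin(start + k * step),
        )
        for k in range(1, n_slots)  # k = 0 is the ON region itself
    ]


# Hypothetical pointing 0.5 deg away from the target, ON radius 0.11 deg
off_centers = reflected_centers(pointing=(0.0, 0.0), on_center=(0.5, 0.0), radius=0.11)
print(len(off_centers), "OFF regions")
```

Because all circles share the same offset from the pointing position, the ratio of ON to OFF acceptance reduces to one over the number of OFF regions.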

Source statistics

Next we’re going to look at the overall source statistics in our signal region.

[11]:
info_table = datasets.info_table(cumulative=True)
[12]:
info_table
[12]:
Table length=4 (shown transposed: one line per table column, rows 1 to 4 are the cumulative datasets)
column            unit   dtype    row 1                 row 2                 row 3                 row 4
name                     str7     stacked               stacked               stacked               stacked
counts                   int64    149                   303                   439                   550
background               float64  9.75                  22.250001907348633    30.225610733032227    37.864498138427734
excess                   float32  139.25                280.75                408.77438             512.1355
sqrt_ts                  float64  20.44968375700963     28.44625607119514     36.175888275385134    40.9008634601563
npred                    float64  20.461539024432028    43.84615575068094     50.88364346926001     59.674911682627844
npred_background         float64  20.461539024432028    43.84615575068094     50.88364346926001     59.674911682627844
npred_signal             float64  nan                   nan                   nan                   nan
exposure_min      m2 s   float64  2892003.25            13397218.0            19239694.0            21017612.0
exposure_max      m2 s   float64  841726208.0           1572412928.0          2077411584.0          2635248640.0
livetime          s      float64  1581.7367584109306    3154.4234824180603    4732.546999931335     6313.811640620232
ontime            s      float64  1687.0                3370.0                5056.0                6742.0
counts_rate       1 / s  float64  0.0942002512160688    0.0960555872376818    0.09276188910672614   0.08711061262289593
background_rate   1 / s  float64  0.006164110398366919  0.00705358745627034   0.006386753419135778  0.005997090235448986
excess_rate       1 / s  float64  0.08803614081770189   0.08900200038606985   0.08637513447850656   0.081113521783264
n_bins                   int64    27                    27                    27                    27
n_fit_bins               int64    18                    19                    19                    19
stat_type                str5     wstat                 wstat                 wstat                 wstat
stat_sum                 float64  433.5372460592368     823.6909484036685     1325.2637060766638    1701.2360965968023
counts_off               int64    117                   267                   594                   869
acceptance               float64  18.0                  19.0                  19.0                  19.0
acceptance_off           float64  216.0                 227.9999804550359     373.39195888161265    436.0549013389246
alpha                    float64  0.0833333432674408    0.08333335071802139   0.049657758325338364  0.04170095920562744
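The excess and sqrt_ts values of the info table can be cross-checked by hand from the ON counts, the OFF counts and alpha. The sketch below uses the Li & Ma significance formula for ON-OFF measurements and the numbers of the first row (149 ON counts, 117 OFF counts, alpha = 1/12). The function name is ours, and the formula as written requires non-zero ON and OFF counts.

```python
import math


def excess_and_sqrt_ts(n_on, n_off, alpha):
    """Excess counts and Li & Ma significance of an ON-OFF measurement."""
    excess = n_on - alpha * n_off
    total = n_on + n_off
    term_on = n_on * math.log((1 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1 + alpha) * n_off / total)
    # The sign of the significance follows the sign of the excess
    sqrt_ts = math.copysign(math.sqrt(2 * (term_on + term_off)), excess)
    return excess, sqrt_ts


excess, sqrt_ts = excess_and_sqrt_ts(n_on=149, n_off=117, alpha=1 / 12)
print(f"excess = {excess:.2f}, sqrt_ts = {sqrt_ts:.2f}")
# excess = 139.25, sqrt_ts = 20.45
```

Both numbers agree with the first row of the table, which suggests that the sqrt_ts column is this ON-OFF significance evaluated on the cumulative counts.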
[13]:
plt.plot(
    info_table["livetime"].to("h"), info_table["excess"], marker="o", ls="none"
)
plt.xlabel("Livetime [h]")
plt.ylabel("Excess");
[14]:
plt.plot(
    info_table["livetime"].to("h"),
    info_table["sqrt_ts"],
    marker="o",
    ls="none",
)
plt.xlabel("Livetime [h]")
plt.ylabel("Sqrt(TS)");

Finally you can write the extracted datasets to disk using the OGIP format (PHA, ARF, RMF, BKG):

[15]:
path = Path("spectrum_analysis")
path.mkdir(exist_ok=True)
[16]:
for dataset in datasets:
    dataset.write(
        filename=path / f"obs_{dataset.name}.fits.gz", overwrite=True
    )

If you want to read back the datasets from disk you can use:

[17]:
datasets = Datasets()

for obs_id in obs_ids:
    filename = path / f"obs_{obs_id}.fits.gz"
    datasets.append(SpectrumDatasetOnOff.read(filename))

Fit spectrum

Now we’ll fit a global model to the spectrum. First we do a joint likelihood fit to all observations. If you want to stack the observations see below. We will also produce a debug plot in order to show how the global fit matches one of the individual observations.

[18]:
spectral_model = ExpCutoffPowerLawSpectralModel(
    amplitude=1e-12 * u.Unit("cm-2 s-1 TeV-1"),
    index=2,
    lambda_=0.1 * u.Unit("TeV-1"),
    reference=1 * u.TeV,
)
model = SkyModel(spectral_model=spectral_model, name="crab")

datasets.models = [model]

fit_joint = Fit()
result_joint = fit_joint.run(datasets=datasets)

# we make a copy here to compare it later
model_best_joint = model.copy()

Fit quality and model residuals

We can print the fit result object to see if the fit converged:

[19]:
print(result_joint)
OptimizeResult

        backend    : minuit
        method     : migrad
        success    : True
        message    : Optimization terminated successfully.
        nfev       : 244
        total stat : 86.12


and check the best-fit parameters

[20]:
datasets.models.to_parameters_table()
[20]:
Table length=5
model  type      name       value       unit            error      min      max      frozen  link
str4   str8      str9       float64     str14           float64    float64  float64  bool    str1
crab   spectral  index      2.2727e+00                  1.566e-01  nan      nan      False
crab   spectral  amplitude  4.7913e-11  cm-2 s-1 TeV-1  3.600e-12  nan      nan      False
crab   spectral  reference  1.0000e+00  TeV             0.000e+00  nan      nan      True
crab   spectral  lambda_    1.2097e-01  TeV-1           5.382e-02  nan      nan      False
crab   spectral  alpha      1.0000e+00                  0.000e+00  nan      nan      True
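To get a feeling for what these parameters mean, the fitted spectrum can be evaluated by hand. The sketch below assumes the usual exponential cutoff power law form, dN/dE = amplitude * (E / reference)^(-index) * exp(-(lambda_ * E)^alpha), and plugs in the best-fit values from the table. The function is a stand-alone illustration, not the gammapy model class, and it ignores units (energies in TeV, flux in cm-2 s-1 TeV-1).

```python
import math


def exp_cutoff_power_law(energy, amplitude, index, lambda_, reference=1.0, alpha=1.0):
    """Differential flux of an exponential cutoff power law at the given energy."""
    power_law = amplitude * (energy / reference) ** (-index)
    cutoff = math.exp(-((lambda_ * energy) ** alpha))
    return power_law * cutoff


# Best-fit values of the joint fit, copied from the parameter table
best_fit = dict(
    amplitude=4.7913e-11, index=2.2727, lambda_=0.12097, reference=1.0, alpha=1.0
)

for energy in (1.0, 10.0):
    dnde = exp_cutoff_power_law(energy, **best_fit)
    print(f"{energy:5.1f} TeV: {dnde:.3e} cm-2 s-1 TeV-1")
```

With lambda_ of about 0.12 TeV-1 the cutoff energy 1 / lambda_ is roughly 8 TeV, so the exponential term already suppresses the flux noticeably at 10 TeV.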

A simple way to inspect the model residuals is to use the method SpectrumDataset.plot_fit():

[21]:
ax_spectrum, ax_residuals = datasets[0].plot_fit()
ax_spectrum.set_ylim(0.1, 40)
[21]:
(0.1, 40)

For more ways of assessing fit quality, please refer to the dedicated modeling and fitting tutorial.

Compute Flux Points

To round off our analysis we can compute flux points by fitting the norm of the global model in energy bands. We’ll use a fixed energy binning for now:

[22]:
e_min, e_max = 0.7, 30
energy_edges = np.geomspace(e_min, e_max, 11) * u.TeV
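Note that the e_min and e_max columns of the resulting flux point table do not start at 0.7 TeV or end at 30 TeV. The numbers are consistent with each requested edge being matched to the closest edge of the reconstructed energy axis of the datasets. The sketch below reproduces this under two assumptions that are inferred from the printed values rather than taken from the library: the reconstructed axis has 27 log-spaced bins between 0.1 and 40 TeV, and "closest" is measured in log-energy. The helper names are ours.

```python
import math


def geomspace(start, stop, num):
    """num log-spaced values from start to stop, both included."""
    ratio = (stop / start) ** (1 / (num - 1))
    return [start * ratio**k for k in range(num)]


def snap_to_axis(requested, axis_edges):
    """Replace each requested edge by the closest axis edge (in log-energy)."""
    snapped = []
    for edge in requested:
        nearest = min(axis_edges, key=lambda e: abs(math.log(e / edge)))
        if nearest not in snapped:  # avoid zero-width bands
            snapped.append(nearest)
    return snapped


axis_edges = geomspace(0.1, 40, 28)  # 27 reconstructed-energy bins, TeV
requested = geomspace(0.7, 30, 11)   # edges requested for the flux points, TeV

snapped = snap_to_axis(requested, axis_edges)
print([round(edge, 3) for edge in snapped])
```

A practical consequence is that flux point bands cannot be finer than the reconstructed energy binning of the datasets.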

Now we create an instance of the gammapy.estimators.FluxPointsEstimator by passing the energy binning and the name of the source model, and run it on the datasets:

[23]:
fpe = FluxPointsEstimator(
    energy_edges=energy_edges, source="crab", selection_optional="all"
)
flux_points = fpe.run(datasets=datasets)

Here is the table of the resulting flux points:

[24]:
flux_points.to_table(sed_type="dnde", formatted=True)
[24]:
Table length=10 (columns split into two blocks for readability)
e_ref    e_min    e_max    dnde             dnde_err         dnde_errp        dnde_errn        dnde_ul          ts       sqrt_ts
TeV      TeV      TeV      1 / (cm2 s TeV)  1 / (cm2 s TeV)  1 / (cm2 s TeV)  1 / (cm2 s TeV)  1 / (cm2 s TeV)
float64  float64  float64  float64          float64          float64          float64          float64          float64  float64
0.823    0.737    0.920    6.950e-11        7.224e-12        7.462e-12        7.000e-12        8.496e-11        318.308  17.841
1.148    0.920    1.434    2.884e-11        2.581e-12        2.654e-12        2.510e-12        3.430e-11        449.770  21.208
1.790    1.434    2.235    1.104e-11        1.132e-12        1.168e-12        1.096e-12        1.345e-11        350.109  18.711
2.790    2.235    3.483    3.462e-12        4.450e-13        4.631e-13        4.273e-13        4.426e-12        219.114  14.802
3.892    3.483    4.348    8.756e-13        2.576e-13        2.800e-13        2.362e-13        1.482e-12        36.235   6.020
5.429    4.348    6.778    5.335e-13        1.079e-13        1.149e-13        1.011e-13        7.776e-13        86.926   9.323
8.462    6.778    10.564   1.623e-13        4.547e-14        4.941e-14        4.186e-14        2.695e-13        38.108   6.173
11.803   10.564   13.189   5.214e-14        3.008e-14        3.529e-14        2.546e-14        1.339e-13        7.911    2.813
16.465   13.189   20.556   1.177e-14        7.816e-15        9.399e-15        6.393e-15        3.395e-14        5.269    2.295
25.663   20.556   32.040   nan              nan              1.094e-15        nan              4.378e-15        -0.000   -0.000

e_ref    npred [4]                                  npred_excess [4]                  stat     is_ul  counts [4]    success  norm_scan [11]  stat_scan [11]
float64  float64                                    float32                           float64  bool   float64       bool     float64         float64
0.823    31.366455739137695 .. 23.88976456824348    28.8755 .. 21.720297              8.252    False  30.0 .. 25.0  True     0.200 .. 5.000  140.739 .. 451.124
1.148    40.72158479422507 .. 29.62749378240008     38.46755 .. 27.449326             12.649   False  43.0 .. 35.0  True     0.200 .. 5.000  182.603 .. 693.653
1.790    31.973730651551723 .. 21.124655968545834   29.863085 .. 19.905617            6.105    False  37.0 .. 21.0  True     0.200 .. 5.000  150.799 .. 424.855
2.790    20.868268731138357 .. 13.385575909549214   19.72982 .. 12.434961             12.915   False  13.0 .. 17.0  True     0.200 .. 5.000  102.546 .. 293.446
3.892    4.959041534302395 .. 2.7024857325192877    4.1686516 .. 2.5370138            4.035    False  8.0 .. 2.0    True     0.200 .. 5.000  13.594 .. 125.092
5.429    8.520676253068467 .. 5.1731597610821325    8.258545 .. 4.9781365             11.815   False  12.0 .. 6.0   True     0.200 .. 5.000  47.101 .. 132.109
8.462    5.121378946137629 .. 3.105552925633015     4.6203055 .. 2.7313495            8.980    False  5.0 .. 5.0    True     0.200 .. 5.000  28.474 .. 55.783
11.803   1.2131431975079712 .. 0.732386930397399    1.1362201 .. 0.67833287           7.934    False  0.0 .. 0.0    True     0.200 .. 5.000  12.285 .. 18.679
16.465   0.9432260338197485 .. 0.6987582435799975   0.8475294 .. 0.50956905           6.611    False  1.0 .. 0.0    True     0.200 .. 5.000  9.622 .. 17.557
25.663   0.07692307692307691 .. 0.10810810810810809 -4.4549585e-17 .. -2.7745514e-17  0.620    True   0.0 .. 0.0    False    0.200 .. 5.000  0.866 .. 6.774

Now we plot the flux points and their likelihood profiles. For the plotting of upper limits we choose a threshold of TS < 4.

[25]:
plt.figure(figsize=(8, 5))
ax = flux_points.plot(sed_type="e2dnde", color="darkorange")
flux_points.plot_ts_profiles(ax=ax, sed_type="e2dnde");

The final plot with the best fit model, flux points and residuals can be quickly made like this:

[26]:
flux_points_dataset = FluxPointsDataset(
    data=flux_points, models=model_best_joint
)
[27]:
flux_points_dataset.plot_fit();

Stack observations

An alternative approach to fitting the spectrum is to stack all observations first and then fit a model. For this we first stack the individual datasets:

[28]:
dataset_stacked = Datasets(datasets).stack_reduce()
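Conceptually, stacking ON-OFF datasets adds up the ON counts, the OFF counts and the background estimates of the individual observations. The sketch below illustrates this on energy-integrated numbers. The per-observation values are obtained as differences between consecutive rows of the cumulative info table shown earlier, with the background values rounded. The actual stacking is done per energy bin and also combines the IRFs, which this sketch does not attempt.

```python
# (n_on, n_off, background) per observation, derived from the cumulative table
per_observation = [
    (149, 117, 9.75),
    (154, 150, 12.50),
    (136, 327, 7.9756),
    (111, 275, 7.6389),
]

n_on = sum(obs[0] for obs in per_observation)
n_off = sum(obs[1] for obs in per_observation)
background = sum(obs[2] for obs in per_observation)  # sum of alpha_i * n_off_i

# Effective ON/OFF exposure ratio that reproduces the summed background
alpha_eff = background / n_off
excess = n_on - alpha_eff * n_off

print(f"n_on = {n_on}, n_off = {n_off}")
print(f"background = {background:.2f}, alpha_eff = {alpha_eff:.4f}, excess = {excess:.2f}")
```

The totals match the last row of the cumulative table (550 ON counts, 869 OFF counts, an excess of about 512). The effective alpha is a weighted average, which is why observations with different numbers of OFF regions can be combined into a single ON-OFF dataset.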

Again we set the model on the dataset we would like to fit (in this case it’s only a single one) and pass it to the gammapy.modeling.Fit object:

[29]:
dataset_stacked.models = model
stacked_fit = Fit()
result_stacked = stacked_fit.run([dataset_stacked])

# make a copy to compare later
model_best_stacked = model.copy()
[30]:
print(result_stacked)
OptimizeResult

        backend    : minuit
        method     : migrad
        success    : True
        message    : Optimization terminated successfully.
        nfev       : 54
        total stat : 8.16


[31]:
model_best_joint.parameters.to_table()
[31]:
Table length=5
type      name       value       unit            error      min      max      frozen  link
str8      str9       float64     str14           float64    float64  float64  bool    str1
spectral  index      2.2727e+00                  1.566e-01  nan      nan      False
spectral  amplitude  4.7913e-11  cm-2 s-1 TeV-1  3.600e-12  nan      nan      False
spectral  reference  1.0000e+00  TeV             0.000e+00  nan      nan      True
spectral  lambda_    1.2097e-01  TeV-1           5.382e-02  nan      nan      False
spectral  alpha      1.0000e+00                  0.000e+00  nan      nan      True
[32]:
model_best_stacked.parameters.to_table()
[32]:
Table length=5
type      name       value       unit            error      min      max      frozen  link
str8      str9       float64     str14           float64    float64  float64  bool    str1
spectral  index      2.2785e+00                  1.563e-01  nan      nan      False
spectral  amplitude  4.7800e-11  cm-2 s-1 TeV-1  3.566e-12  nan      nan      False
spectral  reference  1.0000e+00  TeV             0.000e+00  nan      nan      True
spectral  lambda_    1.1830e-01  TeV-1           5.329e-02  nan      nan      False
spectral  alpha      1.0000e+00                  0.000e+00  nan      nan      True

Finally, we compare the results of our stacked analysis to a previously published Crab Nebula Spectrum for reference. This is available in gammapy.modeling.models.create_crab_spectral_model.

[33]:
plot_kwargs = {
    "energy_bounds": [0.1, 30] * u.TeV,
    "sed_type": "e2dnde",
    "yunits": u.Unit("erg cm-2 s-1"),
}

# plot stacked model
model_best_stacked.spectral_model.plot(
    **plot_kwargs, label="Stacked analysis result"
)
model_best_stacked.spectral_model.plot_error(
    facecolor="blue", alpha=0.3, **plot_kwargs
)

# plot joint model
model_best_joint.spectral_model.plot(
    **plot_kwargs, label="Joint analysis result", ls="--"
)
model_best_joint.spectral_model.plot_error(
    facecolor="orange", alpha=0.3, **plot_kwargs
)

create_crab_spectral_model("hess_ecpl").plot(
    **plot_kwargs, label="Crab reference"
)
plt.legend()
[33]:
<matplotlib.legend.Legend at 0x1597a5b80>

Exercises

Now you have learned the basics of a spectral analysis with Gammapy. To practice you can continue with the following exercises:

What next?

The method shown in this tutorial is valid for point-like or mildly extended sources where we can assume that the IRF taken at the region center is valid over the whole region. If one wants to extract the 1D spectrum of a large source and properly average the response over the extraction region, one has to use a different approach, explained in the extended source spectral analysis tutorial.