This is a fixed-text formatted version of a Jupyter notebook
- Source files: spectrum_analysis.ipynb | spectrum_analysis.py
Spectral analysis with Gammapy¶
Introduction¶
This notebook explains in detail how to use the classes in gammapy.spectrum and related ones.
Based on a dataset of four Crab observations with H.E.S.S. (simulated events for now), we will perform a full region-based spectral analysis, i.e. extract source and background counts from certain regions and fit them using the forward-folding approach. We will use the following classes:
Data handling:
- gammapy.data.DataStore
- gammapy.data.DataStoreObservation
- gammapy.data.ObservationStats
- gammapy.data.ObservationSummary
To extract the 1-dim spectral information:
- gammapy.spectrum.SpectrumExtraction
To perform the joint fit:
- gammapy.spectrum.SpectrumDatasetOnOff
- gammapy.spectrum.SpectrumDatasetOnOffStacker
- gammapy.spectrum.models.PowerLaw
- gammapy.spectrum.models.ExponentialCutoffPowerLaw
- gammapy.spectrum.models.LogParabola
To compute flux points (a.k.a. “SED” = “spectral energy distribution”):
- gammapy.spectrum.FluxPointsEstimator
Feedback welcome!
Setup¶
As usual, we’ll start with some setup …
[1]:
%matplotlib inline
import matplotlib.pyplot as plt
[2]:
# Check package versions
import gammapy
import numpy as np
import astropy
import regions
print("gammapy:", gammapy.__version__)
print("numpy:", np.__version__)
print("astropy", astropy.__version__)
print("regions", regions.__version__)
gammapy: 0.12
numpy: 1.16.2
astropy 3.1.1
regions 0.3
[3]:
import astropy.units as u
from astropy.coordinates import SkyCoord, Angle
from regions import CircleSkyRegion
from gammapy.maps import Map
from gammapy.utils.fitting import Fit
from gammapy.data import ObservationStats, ObservationSummary, DataStore
from gammapy.background import ReflectedRegionsBackgroundEstimator
from gammapy.spectrum.models import PowerLaw
from gammapy.spectrum import (
SpectrumExtraction,
SpectrumDatasetOnOff,
SpectrumDatasetOnOffStacker,
CrabSpectrum,
FluxPointsEstimator,
FluxPointsDataset,
)
Load Data¶
First, we select and load some H.E.S.S. observations of the Crab nebula (simulated events for now).
We will access the events, effective area, energy dispersion, livetime and PSF for containment correction.
[4]:
datastore = DataStore.from_dir("$GAMMAPY_DATA/hess-dl3-dr1/")
obs_ids = [23523, 23526, 23559, 23592]
observations = datastore.get_observations(obs_ids)
Define Target Region¶
The next step is to define a signal extraction region, also known as the on region. In the simplest case this is just a CircleSkyRegion, but here we will use the Target class in Gammapy, which is useful for book-keeping if you run several analyses in a script.
[5]:
target_position = SkyCoord(ra=83.63, dec=22.01, unit="deg", frame="icrs")
on_region_radius = Angle("0.11 deg")
on_region = CircleSkyRegion(center=target_position, radius=on_region_radius)
Create exclusion mask¶
We will use the reflected regions method to place off regions to estimate the background level in the on region. To make sure the off regions don’t contain gamma-ray emission, we create an exclusion mask.
Using http://gamma-sky.net/ we find that there’s only one known gamma-ray source near the Crab nebula: the AGN called RGB J0521+212 at GLON = 183.604 deg and GLAT = -8.708 deg.
[6]:
exclusion_region = CircleSkyRegion(
center=SkyCoord(183.604, -8.708, unit="deg", frame="galactic"),
radius=0.5 * u.deg,
)
skydir = target_position.galactic
exclusion_mask = Map.create(
npix=(150, 150), binsz=0.05, skydir=skydir, proj="TAN", coordsys="GAL"
)
mask = exclusion_mask.geom.region_mask([exclusion_region], inside=False)
exclusion_mask.data = mask
exclusion_mask.plot();
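The region_mask call with inside=False marks pixels inside the exclusion region as False, so the off-region finder will avoid them. A minimal plain-numpy sketch of that logic on a small flat pixel grid (ignoring the sky projection; sizes and radius here are made up for illustration, this is not the Map API):

```python
import numpy as np

# Toy 11 x 11 pixel grid at 0.05 deg per pixel, with an exclusion
# circle of radius 0.15 deg centred on the grid (illustrative only;
# the real Map handles the sky projection for us).
npix, binsz, radius = 11, 0.05, 0.15
y, x = np.mgrid[:npix, :npix]
center = (npix - 1) / 2
dist = np.hypot(x - center, y - center) * binsz  # offset in deg

# inside=False: pixels inside the exclusion region become False
mask = dist > radius

print(mask[5, 5], mask[0, 0])  # centre excluded, corner kept
```

Pixels flagged False are then simply unavailable when reflected off regions are placed.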
Estimate background¶
Next we will manually perform a background estimate by placing reflected regions around the pointing position and looking at the source statistics. This will result in a gammapy.background.BackgroundEstimate that serves as input for other classes in gammapy.
[7]:
background_estimator = ReflectedRegionsBackgroundEstimator(
observations=observations,
on_region=on_region,
exclusion_mask=exclusion_mask,
)
background_estimator.run()
[8]:
plt.figure(figsize=(8, 8))
background_estimator.plot(add_legend=True);
Source statistic¶
Next we’re going to look at the overall source statistics in our signal region. For more info about the debug plots you can create, check out the ObservationSummary class.
[9]:
stats = []
for obs, bkg in zip(observations, background_estimator.result):
stats.append(ObservationStats.from_observation(obs, bkg))
print(stats[1])
obs_summary = ObservationSummary(stats)
fig = plt.figure(figsize=(10, 6))
ax1 = fig.add_subplot(121)
obs_summary.plot_excess_vs_livetime(ax=ax1)
ax2 = fig.add_subplot(122)
obs_summary.plot_significance_vs_livetime(ax=ax2);
*** Observation summary report ***
Observation Id: 23526
Livetime: 0.437 h
On events: 201
Off events: 225
Alpha: 0.083
Bkg events in On region: 18.75
Excess: 182.25
Excess / Background: 9.72
Gamma rate: 6.95 1 / min
Bkg rate: 0.72 1 / min
Sigma: 21.86
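The significance quoted in the report follows directly from the on/off counts and alpha. A standalone numpy sketch using the Li & Ma (1983, Eq. 17) formula with the (rounded) numbers from the summary above, which reproduces the quoted ~21.9 sigma up to rounding of alpha:

```python
import numpy as np

def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983), Eq. 17 -- significance of an on/off measurement."""
    n_tot = n_on + n_off
    term_on = n_on * np.log((1 + alpha) / alpha * n_on / n_tot)
    term_off = n_off * np.log((1 + alpha) * n_off / n_tot)
    return np.sqrt(2 * (term_on + term_off))

# Numbers from the observation summary of run 23526 above
n_on, n_off, alpha = 201, 225, 0.083
excess = n_on - alpha * n_off
print(f"excess = {excess:.1f}")
print(f"sigma  = {li_ma_significance(n_on, n_off, alpha):.1f}")
```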
Extract spectrum¶
Now, we’re going to extract a spectrum using the SpectrumExtraction class. We provide the reconstructed energy binning we want to use. It is expected to be a Quantity with unit energy, i.e. an array with an energy unit. We also provide the true energy binning to use.
[10]:
e_reco = np.logspace(-1, np.log10(40), 40) * u.TeV
e_true = np.logspace(np.log10(0.05), 2, 200) * u.TeV
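The energy axes above are just arrays of logarithmically spaced bin edges. A quick plain-numpy check of their properties (astropy units dropped for brevity); note that the true-energy axis is deliberately wider than the reconstructed one, so that energy dispersion can migrate events into the reco range from outside it:

```python
import numpy as np

# Same construction as in the cell above, without the astropy unit
e_reco = np.logspace(-1, np.log10(40), 40)    # 40 edges -> 39 reco bins
e_true = np.logspace(np.log10(0.05), 2, 200)  # 200 edges, 0.05 to 100 TeV

print(len(e_reco) - 1, "reco bins from", e_reco[0], "to", e_reco[-1], "TeV")
# true-energy axis encloses the reco axis on both sides:
print(e_true[0] < e_reco[0], e_true[-1] > e_reco[-1])
```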
Instantiate a SpectrumExtraction object that will do the extraction. The containment_correction parameter allows correcting for PSF leakage if one is working with full-enclosure IRFs. We also compute a threshold energy and store the result in OGIP-compliant files (pha, rmf, arf). This last step may be omitted.
[11]:
extraction = SpectrumExtraction(
observations=observations,
bkg_estimate=background_estimator.result,
containment_correction=False,
e_reco=e_reco,
e_true=e_true,
)
[12]:
%%time
extraction.run()
CPU times: user 1.8 s, sys: 32.3 ms, total: 1.83 s
Wall time: 1.85 s
Now we can (optionally) compute the energy thresholds for the analysis, according to different methods. Here we choose the energy where the effective area drops below 10% of the maximum:
[13]:
# Add a condition on correct energy range in case it is not set by default
extraction.compute_energy_threshold(method_lo="area_max", area_percent_lo=10.0)
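The "area_max" criterion picks the lowest energy at which the effective area first exceeds the given fraction of its maximum. A toy numpy sketch of that logic (the effective-area curve here is entirely made up for illustration; the real one comes from the IRFs):

```python
import numpy as np

# Toy effective-area curve: rises steeply, then saturates (made-up shape)
energy = np.logspace(-1, 2, 100)          # TeV
aeff = 1e5 * (1 - np.exp(-energy / 5.0))  # m2, illustrative only

# "area_max" with area_percent_lo=10: first energy where the
# effective area reaches 10% of its maximum value
threshold = energy[np.argmax(aeff >= 0.1 * aeff.max())]
print(f"safe energy threshold: {threshold:.3f} TeV")
```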
Let’s take a look at the datasets we just extracted:
[14]:
# Requires IPython widgets
# extraction.spectrum_observations.peek()
extraction.spectrum_observations[0].peek()
Finally you can write the extracted datasets to disk using the OGIP format (PHA, ARF, RMF, BKG; see here for details):
[15]:
# ANALYSIS_DIR = "crab_analysis"
# extraction.write(outdir=ANALYSIS_DIR, overwrite=True)
If you want to read back the datasets from disk you can use:
[16]:
# datasets = []
# for obs_id in obs_ids:
# filename = ANALYSIS_DIR + "/ogip_data/pha_obs{}.fits".format(obs_id)
# datasets.append(SpectrumDatasetOnOff.from_ogip_files(filename))
Fit spectrum¶
Now we’ll fit a global model to the spectrum. First we do a joint likelihood fit to all observations. If you want to stack the observations see below. We will also produce a debug plot in order to show how the global fit matches one of the individual observations.
[17]:
model = PowerLaw(
index=2, amplitude=2e-11 * u.Unit("cm-2 s-1 TeV-1"), reference=1 * u.TeV
)
datasets_joint = extraction.spectrum_observations
for dataset in datasets_joint:
dataset.model = model
fit_joint = Fit(datasets_joint)
result_joint = fit_joint.run()
# we make a copy here to compare it later
model_best_joint = model.copy()
model_best_joint.parameters.covariance = result_joint.parameters.covariance
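The PowerLaw model corresponds to dN/dE = amplitude × (E/reference)^(-index). A plain-numpy sketch evaluating it with the starting values from the cell above (not the best-fit values), just to make the parametrization concrete:

```python
import numpy as np

def power_law(energy, amplitude=2e-11, index=2.0, reference=1.0):
    """dN/dE = amplitude * (energy / reference) ** (-index)
    (energies in TeV, amplitude in cm-2 s-1 TeV-1)."""
    return amplitude * (energy / reference) ** (-index)

print(power_law(1.0))   # at the reference energy, dN/dE equals the amplitude
print(power_law(10.0))  # a factor 100 lower one decade up, for index = 2
```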
[18]:
print(result_joint)
OptimizeResult
backend : minuit
method : minuit
success : True
message : Optimization terminated successfully.
nfev : 47
total stat : 113.58
[19]:
plt.figure(figsize=(8, 6))
ax_spectrum, ax_residual = datasets_joint[0].plot_fit()
ax_spectrum.set_ylim(0, 25)
[19]:
(0, 25)
Compute Flux Points¶
To round up our analysis we can compute flux points by fitting the norm of the global model in energy bands. We’ll use a fixed energy binning for now:
[20]:
e_min, e_max = 0.7, 30
e_edges = np.logspace(np.log10(e_min), np.log10(e_max), 11) * u.TeV
Now we create an instance of the FluxPointsEstimator, passing the datasets and the energy binning:
[21]:
fpe = FluxPointsEstimator(datasets=datasets_joint, e_edges=e_edges)
flux_points = fpe.run()
Here is the table of the resulting flux points:
[22]:
flux_points.table_formatted
[22]:
e_ref | e_min | e_max | ref_dnde | ref_flux | ref_eflux | ref_e2dnde | norm | loglike | norm_err | counts [4] | norm_errp | norm_errn | norm_ul | sqrt_ts | ts | norm_scan [11] | dloglike_scan [11] | dnde | dnde_ul | dnde_err | dnde_errp | dnde_errn |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TeV | TeV | TeV | 1 / (cm2 s TeV) | 1 / (cm2 s) | TeV / (cm2 s) | TeV / (cm2 s) | | | | | | | | | | | | 1 / (cm2 s TeV) | 1 / (cm2 s TeV) | 1 / (cm2 s TeV) | 1 / (cm2 s TeV) | 1 / (cm2 s TeV) |
float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | int64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 |
0.859 | 0.737 | 1.002 | 4.198e-11 | 1.120e-11 | 9.535e-12 | 3.099e-11 | 0.878 | 14.117 | 0.118 | 0 .. 0 | 0.123 | 0.113 | 1.134 | 13.903 | 193.295 | 0.200 .. 5.000 | 82.99951168352825 .. 358.0090550191403 | 3.685e-11 | 4.760e-11 | 4.959e-12 | 5.165e-12 | 4.758e-12 |
1.261 | 1.002 | 1.588 | 1.527e-11 | 9.087e-12 | 1.124e-11 | 2.429e-11 | 0.950 | 13.145 | 0.088 | 31 .. 36 | 0.091 | 0.085 | 1.137 | 20.656 | 426.658 | 0.200 .. 5.000 | 171.90332333110035 .. 642.1168926481937 | 1.450e-11 | 1.737e-11 | 1.347e-12 | 1.394e-12 | 1.302e-12 |
1.852 | 1.588 | 2.160 | 5.552e-12 | 3.193e-12 | 5.861e-12 | 1.905e-11 | 1.166 | 9.827 | 0.147 | 27 .. 12 | 0.153 | 0.141 | 1.484 | 14.824 | 219.758 | 0.200 .. 5.000 | 109.57521292555477 .. 250.39854651204217 | 6.474e-12 | 8.239e-12 | 8.169e-13 | 8.499e-13 | 7.847e-13 |
2.518 | 2.160 | 2.936 | 2.472e-12 | 1.933e-12 | 4.825e-12 | 1.568e-11 | 1.230 | 9.251 | 0.176 | 10 .. 11 | 0.184 | 0.167 | 1.613 | 13.446 | 180.807 | 0.200 .. 5.000 | 92.00048036311878 .. 178.3899607681866 | 3.040e-12 | 3.989e-12 | 4.341e-13 | 4.547e-13 | 4.141e-13 |
3.697 | 2.936 | 4.656 | 8.991e-13 | 1.569e-12 | 5.685e-12 | 1.229e-11 | 0.944 | 14.639 | 0.159 | 17 .. 11 | 0.168 | 0.151 | 1.297 | 10.654 | 113.498 | 0.200 .. 5.000 | 60.11957248672836 .. 211.0149297574274 | 8.489e-13 | 1.166e-12 | 1.433e-13 | 1.512e-13 | 1.357e-13 |
5.429 | 4.656 | 6.330 | 3.270e-13 | 5.512e-13 | 2.966e-12 | 9.637e-12 | 1.149 | 8.520 | 0.271 | 9 .. 5 | 0.292 | 0.251 | 1.774 | 8.126 | 66.024 | 0.200 .. 5.000 | 37.34928358492572 .. 78.59670588747657 | 3.757e-13 | 5.801e-13 | 8.859e-14 | 9.533e-14 | 8.213e-14 |
7.971 | 6.330 | 10.037 | 1.189e-13 | 4.473e-13 | 3.495e-12 | 7.556e-12 | 1.096 | 12.532 | 0.283 | 5 .. 4 | 0.308 | 0.259 | 1.758 | 7.172 | 51.440 | 0.200 .. 5.000 | 35.53849334782444 .. 76.41826285390175 | 1.303e-13 | 2.091e-13 | 3.363e-14 | 3.663e-14 | 3.081e-14 |
11.703 | 10.037 | 13.647 | 4.325e-14 | 1.572e-13 | 1.823e-12 | 5.924e-12 | 0.976 | 10.598 | 0.449 | 0 .. 1 | 0.511 | 0.390 | 2.131 | 3.660 | 13.396 | 0.200 .. 5.000 | 16.653005609307925 .. 36.15555421539119 | 4.223e-14 | 9.216e-14 | 1.940e-14 | 2.211e-14 | 1.689e-14 |
17.183 | 13.647 | 21.636 | 1.573e-14 | 1.275e-13 | 2.148e-12 | 4.645e-12 | 0.328 | 7.192 | 0.298 | 1 .. 0 | 0.378 | 0.328 | 1.253 | 0.941 | 0.885 | 0.200 .. 5.000 | 7.405550778656064 .. 40.490250200347795 | 5.159e-15 | 1.971e-14 | 4.683e-15 | 5.951e-15 | 5.159e-15 |
25.229 | 21.636 | 29.419 | 5.721e-15 | 4.482e-14 | 1.121e-12 | 3.642e-12 | 0.000 | 0.541 | 0.001 | 0 .. 0 | 0.298 | 0.000 | 1.192 | 0.002 | 0.000 | 0.200 .. 5.000 | 1.2117459554950087 .. 17.32009763836271 | 1.021e-20 | 6.819e-15 | 8.345e-18 | 1.705e-15 | 1.021e-20 |
Now we plot the flux points and their likelihood profiles. For the plotting of upper limits we choose a threshold of TS < 4.
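As a quick numpy illustration of that criterion, applied to the ts column from the table above (ts < 4 corresponds to sqrt_ts < 2, i.e. a detection below roughly 2 sigma), only the last two flux points become upper limits:

```python
import numpy as np

# "ts" column of the flux point table above
ts = np.array([193.295, 426.658, 219.758, 180.807, 113.498,
               66.024, 51.440, 13.396, 0.885, 0.000])
is_ul = ts < 4  # points detected below ~2 sigma are shown as upper limits
print(is_ul)
print(is_ul.sum(), "upper limits")
```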
[23]:
plt.figure(figsize=(8, 5))
flux_points.table["is_ul"] = flux_points.table["ts"] < 4
ax = flux_points.plot(
energy_power=2, flux_unit="erg-1 cm-2 s-1", color="darkorange"
)
flux_points.to_sed_type("e2dnde").plot_likelihood(ax=ax)
[23]:
<matplotlib.axes._subplots.AxesSubplot at 0x1c17251e48>
The final plot with the best fit model, flux points and residuals can be quickly made like this:
[24]:
flux_points_dataset = FluxPointsDataset(
data=flux_points, model=model_best_joint
)
[25]:
plt.figure(figsize=(8, 6))
flux_points_dataset.peek();
Stack observations¶
An alternative approach to fitting the spectrum is to stack all observations first and then fit a model. For this we first stack the individual datasets using the SpectrumDatasetOnOffStacker class:
[26]:
stacker = SpectrumDatasetOnOffStacker(datasets_joint)
dataset_stacked = stacker.run()
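Conceptually, stacking on/off datasets amounts to summing the on and off counts while averaging alpha so that the total background estimate is preserved. A simplified numpy sketch of that bookkeeping with made-up per-run numbers (the real stacker works per energy bin and also stacks livetime and IRFs):

```python
import numpy as np

# Per-observation on/off counts and alpha (illustrative numbers)
n_on = np.array([172, 201, 228, 187])
n_off = np.array([195, 225, 220, 204])
alpha = np.array([0.084, 0.083, 0.081, 0.085])

# Counts add; alpha is averaged weighted by the off counts, which
# preserves the total background estimate sum(alpha_i * n_off_i)
n_on_stacked = n_on.sum()
n_off_stacked = n_off.sum()
alpha_stacked = np.sum(alpha * n_off) / n_off.sum()

bkg_stacked = alpha_stacked * n_off_stacked
print(n_on_stacked, n_off_stacked, round(alpha_stacked, 4))
print(f"stacked excess: {n_on_stacked - bkg_stacked:.1f}")
```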
Again we set the model on the dataset we would like to fit (in this case it’s only a single one) and pass it to the Fit object:
[27]:
dataset_stacked.model = model
stacked_fit = Fit([dataset_stacked])
result_stacked = stacked_fit.run()
# make a copy to compare later
model_best_stacked = model.copy()
model_best_stacked.parameters.covariance = result_stacked.parameters.covariance
[28]:
print(result_stacked)
OptimizeResult
backend : minuit
method : minuit
success : True
message : Optimization terminated successfully.
nfev : 26
total stat : 28.85
[29]:
model_best_joint.parameters.to_table()
[29]:
name | value | error | unit | min | max | frozen |
---|---|---|---|---|---|---|
str9 | float64 | float64 | str14 | float64 | float64 | bool |
index | 2.633e+00 | 7.598e-02 | | nan | nan | False |
amplitude | 2.814e-11 | 1.891e-12 | cm-2 s-1 TeV-1 | nan | nan | False |
reference | 1.000e+00 | 0.000e+00 | TeV | nan | nan | True |
[30]:
model_best_stacked.parameters.to_table()
[30]:
name | value | error | unit | min | max | frozen |
---|---|---|---|---|---|---|
str9 | float64 | float64 | str14 | float64 | float64 | bool |
index | 2.634e+00 | 7.601e-02 | | nan | nan | False |
amplitude | 2.814e-11 | 1.892e-12 | cm-2 s-1 TeV-1 | nan | nan | False |
reference | 1.000e+00 | 0.000e+00 | TeV | nan | nan | True |
Finally, we compare the results of our stacked analysis to a previously published Crab Nebula spectrum for reference. This is available in gammapy.spectrum.
[31]:
plot_kwargs = {
"energy_range": [0.1, 30] * u.TeV,
"energy_power": 2,
"flux_unit": "erg-1 cm-2 s-1",
}
# plot stacked model
model_best_stacked.plot(**plot_kwargs, label="Stacked analysis result")
model_best_stacked.plot_error(**plot_kwargs)
# plot joint model
model_best_joint.plot(**plot_kwargs, label="Joint analysis result", ls="--")
model_best_joint.plot_error(**plot_kwargs)
CrabSpectrum().model.plot(**plot_kwargs, label="Crab reference")
plt.legend()
[31]:
<matplotlib.legend.Legend at 0x102bea198>
Exercises¶
Now you have learned the basics of a spectral analysis with Gammapy. To practice you can continue with the following exercises:
- Fit a different spectral model to the data. You could try e.g. an ExponentialCutoffPowerLaw or LogParabola model.
- Compute flux points for the stacked dataset.
- Create a FluxPointsDataset with the flux points you have computed for the stacked dataset and fit the flux points again with one of the spectral models. How does the result compare to the best-fit model that was directly fitted to the counts data?