Spectrum energy grouping

Introduction

These are some notes on spectrum energy grouping in Gammapy.

The main application is to compute energy binnings and then compute flux points.

This is work in progress, feedback welcome!

Other packages

HAP FitSpectrum

HAP has a Spectrum class, which is a vector of SpectrumStats, which contains a few numbers (nOn, nOff, exposureOn, recoAreaTimeOn, liveTime) that can be summed when grouping energy bins or observations and methods to compute derived quantities (excess, significance``).

To compute a grouping, the Spectrum::Rebin method is used, which returns a re-binned Spectrum object and has these options:

  • algorithm --RebinAlgo (string) – {‘NONE’, ‘Regroup’, ‘MinSignif’, ‘MinOnEvents’}
  • parameter --RebinParameters (float) – one single parameter, used differently by each algorithm

The adaptive rebinning methods are implemented via Spectrum::PairBinsInRange left to right.

To select the total energy range, the following options are available:

Sherpa

Sherpa has some spectral grouping functionality

In the Sherpa Python package this seems to be exposed mainly via the sherpa.astropy.data.DataPHA class and it’s group_* methods, that all call into the C grplib library eventually via a Python C extension.

Overall I find the documentation and code not very accessible, and instead of trying to figure out if we can coerce it to do all the spectral grouping algorithms we want, for now I’ll go ahead and re-implement grouping via simple Python functions and classes in Gammapy.

Gammapy Design

Existing functionality

The calculate_flux_point_binning function:

The gammapy.spectrum.DifferentialFluxPoints.compute method computes the grouping and then computes flux points with that grouping. The implementation is complex, because it fiddles with eps to re-compute the group ID vector from EBOUNDS.

New proposal

I’d like to implement a little toolbox replicating what HAP FitSpectrum does and more. Not all the bugs though, it shall be correct and well-tested.

  • The output should be a GROUP_ID vector or a SpectrumEnergyGroupMaker object (that would be a nice place to attach debug info, print output and plots)
  • For the input I’m not sure.
  • I’m not sure how to structure the code yet and what API to use. Ideally it should be simple to use, yet extensible with user-defined methods. Maybe it’s OK to just have a few pre-baked methods and users that want something different will have to write their own function or wrap or sub-class the pre-baked class?

Gammapy Examples

Some brainstorming how I’d like to compute spectrum energy groupings as a user.

The main point is to figure out which classes we want and how to configure and run the computation and return the results.

Let’s say we have a SpectrumObservation and / or stats summary table:

obs = SpectrumObservation.read('$GAMMAPY_EXTRA/datasets/hess-crab4_pha/pha_obs23523.fits')
table = obs.stats_table()

The first step is always to create a SpectrumEnergyGroupMaker object like this:

seg = SpectrumEnergyGroupMaker(obs=obs)

TODO: Should we take an obs object here or a table? A table would be more loosely coupled, but an obs might have convenient functionality?

Then one runs some of the compute_* methods on it, usually:

seg.compute_range_<method>()
seg.compute_groups_<method>()

Accessing results always goes like this:

print(seg) # print summary info
seg.plot() # make debug plots

>>> seg.groups.to_group_table()
# Table with one group per row

>>> table = seg.groups.to_total_table()
# Table with one energy bin per row and an energy group index column
>>> table['energy_group_idx'] # This is the vector defining the grouping

# Not implemented yet:
# seg.energy_group_idx # the group ID Numpy array, the main result
seg.energy_bounds # the energy bounds array (EnergyBounds object)

See the gammapy.spectrum.SpectrumGroups.to_group_table and gammapy.spectrum.SpectrumGroups.to_total_table docstrings for more info on what results are available.

The flux point computation should take either the seg.groups.to_total_table() table or the seg.groups.to_group_table() table as input. Both contain the grouping info as integer bin indices, so that the flux point computation doesn’t have to fiddle with float energy bounds.

User-supplied energy binning

For a given user-supplied energy binning:

seg.compute_range_safe() # apply safe energy range
ebounds = [0.3, 1, 3, 10, 30] * u.TeV
seg.compute_groups_fixed(ebounds=ebounds)

Adaptive binning

Here’s an example how to run the default HAP FitSpectrum method (min sigma, left to right):

seg.compute_range_safe() # uses obs to set the safe energy range
seg.compute_groups_adaptive(quantity='sigma', threshold=2.0)

Other examples

...

Other API

Should we expose some pre-baked common energy grouping options in the API that does the flux point computation, for convenience?

Most users will not have to / want to fiddle with this much.

Gammapy implementation

Astropy table and pandas dataframe has some groupby functionality that could be useful to compute aggregate stats (e.g. sum and mean) for groups of bins or anything via apply: