PIG 2 - Organization of low-level analysis code¶
The case of image and cube analysis¶
Author: Régis Terrier & Christoph Deil
Created: Jan 12, 2018
Accepted: Jul 27, 2018
Status: accepted
Discussion: GH 1277
Abstract¶
This PIG discusses the general structure of the low level analysis subpackages
of gammapy. Low level analysis is based on the gammapy building blocks from
gammapy.data
, gammapy.irf
and gammapy.maps
. Low level analysis
implements all the individual steps required to perform data reduction for IACT
from DL3 inputs (event lists and IRFs) to DL4 data (spectra, maps and cubes) and
their associated reduced IRFs. Low-level analysis should be structured in a very
modular way to allow easy implementation of high-level analysis classes and
scripts.
General code style guidelines¶
Functions or methods should be no longer than few tens of lines of code. Above that it is better to use multiple functions to make testing easier and allow more modular usage. One line functions are usually not needed unless this is a very complex line.
Similarly, classes should have 3-10 methods. 2 methods classes (e.g. only
__init__
and __call__
) should usually be functions. Above 10-20
methodes, the class should be split into several classes/functions.
It is important to keep the number of functions and classes needed by the user to a reasonnable level. Modularity is therefore very important, since it allows to easily implement high level interfaces that orchestrates the common analysis patterns.
Algorithms and data should be clearly separated. The naming scheme used should allow easy identification of the nature of a piece of code. For instance, functions creating maps and or cube should be named make_map_xxx.
Data analysis subpackages in gammapy¶
Low level analysis produces reduced datasets and IRFs from the general event
lists and multidimensional IRFs of each observation or GTI. The building blocks
on which it relies are coded in gammapy.data (EventList
, DataStore
,
DataStoreObservation
etc), in gammapy.maps (in particular WcsNDMap
used
both for images and cubes), in gammapy.irf (e.g. EffectiveAreaTable2D
,
EnergyDispersion2D
, EnergyDependentTablePSF
, etc).
Analysis subpackages are:
1D or spectral analysis (in
gammapy.spectrum
)2D and 3D (cube) analysis (in
gammapy.cube
)timing analysis (in
gammapy.time
)
Low level map and cube analysis¶
The low-level analysis cube package deals with the production of all maps/cubes
and PSF kernels required to perform 2D and 3D modeling and fitting. This
includes counts, exposure, acceptance and normalized background maps and cubes.
These reduced data and IRFs are stored using the gammapy.maps.WcsNDMap
class
which describes multidimensional maps with their World Coordinate System (WCS)
description and a set of non-spatial axis. The default map structure for most of
the typical analysis will be 3 dimensional maps with an energy axis (with a
single bin for 2D images).
The low level analysis is performed on an observation per observation (or GTI)
basis. This is required by the response and background rapid variations.
Therefore, all basic functions operate on a single EventList
or set of IRFs
(i.e. EffectiveAreaTable2D
, EnergyDispersion2D
,
EnergyDependentTablePSF
). The iterative production of the individual reduced
datasets and IRFs and their combination is realized by the higher level class.
The individual observation products can be serialized, mostly for analysis
debugging purposes or to avoid reprocessing large databases when new data are
added.
Depending on the type of analysis, different reduced IRFs are to be produced. The main difference lies in the type of energy considered: reconstructed or true (i.e. incident) energy. Counts, hadronic acceptance and background always use reconstructed (i.e. measured) energy. Exposure and PSF kernels will be defined in reconstructed energy for 2D analysis whereas they will be defined in true energies for 3D analysis with their own energy binning. A reduced energy dispersion will then be produced to convert from true to reconstructed energies and used later to predict counts.
The maker functions and the products have to clearly state what type of energy
they are using to avoid any confusion. The serialization has to include a way to
clearly differentiate the products. Some metadata, probably in the form of an
OrderedDict
as in the case of astropy.table.Table
could be used to do
so.
In order to perform likelihood analysis of maps and cubes, as well as to apply ON-OFF significance estimation techniques it is important to have integers values for counts and OFF maps produced by ring background estimation techniques (on an observation per observation basis). Therefore, we want to avoid reprojecting individual maps onto a global mosaic.
The approach should be to define the general geometry of the target mosaic map
and to perform cutouts for each observation. This can be done using for instance
astropy.Cutout2D
. The index range of the cutout in the general mosaic map
should be kept for easy summation. This step is performed with:
make_map_cutout
takes a
WcsNDMap
and a maximum offset angleAngle
orQuantity
returns the
WcsGeom
of the cutout and itsslice
For individual observations/gti, the general arguments of all maker functions are:
Reference image and energy range.
gammapy.maps.MapGeom
Maximum offset angle.
astropy.coordinates.Angle
The various maker functions are then:
make_map_counts
takes an
EventList
returns a count map/cube
make_map_exposure_true_energy
takes a pointing direction, an
EffectiveAreaTable2D
and a livetimereturns an exposure map/cube in true energy
make_map_exposure_reco_energy
takes a pointing direction, an
EffectiveAreaTable2D
, anEnergyDispersion2D
and a livetimereturns an exposure map/cube in reco energy
make_map_hadron_acceptance
takes a pointing direction, a
Background3D
and a livetimereturns an hadronic acceptance map, i.e. a predicted background map/cube.
make_map_FoV_background
takes maps/cube (
WcsNDMap
) of observed counts and hadron acceptance/predicted background and an exclusion mapreturns the map of background normalized on the observed counts in the whole FoV (excluding regions with significant gamma-ray emission).
Different energy grouping schemes should be available to ensure a reasonable number of events are used for the normalization. This scheme and the number of events used for normalization should be included in the optional serialization.
make_map_ring_background
takes maps/cube (
WcsNDMap
) of observed counts and hadron acceptance/predicted background and exclusion map. It also takes agammapy.background.AdaptiveRingBackgroundEstimator
or agammapy.background.RingBackgroundEstimator
returns the map of background normalized on the observed counts with a ring filter (excluding regions with significant gamma-ray emission). The background estimator object also contains the OFF map and the ON and OFF exposure maps.
Most likely this technique is not meant to be used for too small energy bands, so that energy grouping is probably not relevant here.
The general processing can then be performed by general classes or scripts, possibly config file driven. It should be sufficiently modular to allow for users to do their own scripts
Existing code¶
Currently, maps and cubes rely on the SkyImage
and SkyCube
classes.
There are various scripts and classes existing currently in gammapy to produce
maps and cubes (mostly developped by @adonath and @ljouvin).Image processing
can be performed with SingleObsImageMaker
and StackedObsImageMaker
,
while cube processing can be performed with SingleObsCubeMaker
and
StackedObsCubeMaker
. For images, one can also use the
IACTBasicImageEstimator
. All this code relies on high level class which
perform all the analysis sequentially (exposure, background, count maps etc).
This approach is not modular and creates a lot of code duplication. Some
cube-related analysis is required for images creating some cross-dependencies.
The proposed scheme should be much more modular and allow user to use gammapy as
a library to compose their own scripts and classes if needed. It should limit
code duplication. In particular, it uses the more general gammapy.maps
which
allows to get rid of the cross dependencies of the image and cube package we
have now.
The existing code will remain in gammapy for the moment, with possibly some bugs fixed. The new code is largely independent so that the new development should bot break user scripts.
Decision¶
This PIG was extensively discussed on Github, as well as in Gammapy weekly calls and at the Feb 2018 and July 2018 Gammapy meetings. Doing this move to new analysis code based on gammapy.maps was never controversial, bug API and implementation discussions were ongoing.
On July 27, 2018, Regis and Christoph noticed that the description in this PIG had been mostly implemented in Gammapy master already, and that further progress would come from individual improvements, not a rewrite / update of this PIG with a complete design. So we decided to merge this PIG with status “approved” to have it on the record as part of the design and evolution process for Gammapy.