PIG 11 - Light curves#
- Author: Régis Terrier, Axel Donath, David Fidalgo, Atreyee Sinha 
- Created: Jul 2, 2018 
- Withdrawn: Oct 29, 2019 
- Status: withdrawn 
- Discussion: GH 1451 
Abstract#
In this PIG we want to discuss a restructuring of the way light curves are computed and stored in Gammapy. Lightcurves in gamma-ray astronomy are the result of a series of fits of the source flux in each time bin. Lightcurve extraction covers therefore both the data reduction and the data modeling steps. The lightcurve estimation will therefore have these two steps.
Here, we propose to perform the data reduction step in each of the time bins and
store the result in the form of a Datasets (MapDataset or
SpectrumOnOffDataset depending on the selected approach). Each individual
Dataset is then modeled and fitted to extract the source flux and its errors
in each time bin. The result is then stored in a LightCurve object which
contains a Table of the fit results.
For this purpose, we propose to introduce two new classes to perform the data
reduction first and then then the fitting. The time dependent data reduction
could be a specific case of the high level pipeline, when a list of time bins is
passed with the configuration dict. The data fitting should be performed by
the new LightCurveEstimator class, which  should essentially be a wrapper
around the FluxPointEstimator class that does the same thing for spectrum
and map datasets.
Introduction#
Lightcurves in gamma-ray astronomy#
In photon counting experiments, lightcurves are often simply obtained by counting events in a given energy range in a set of time bins. In ground based gamma-ray astronomy, things are usually more complex.
The response and the instrumental background of the instruments can strongly vary over time on a night scale, e.g. because the source elevation changes or on longer time scales given the possible changes of the atmosphere transparency or the instrument efficiency.
Another complexity comes from the astrophysical background which can often pollute a source and needs to be properly removed to extract the intrinsic source flux at any given time.
For these reasons, gamma-ray lightcurve are usually the results of a fit of model on the data performed on a number of time bins to extract the source flux in these time bins.
This is more limited than e.g. time resolved spectral analysis. Although the latter share many similarities with the lightcurve extraction, it is a more complex task which we do not cover here.
Background / What we have now#
The current gammapy.time.LightCurveEstimator class assumes that a part of
the data reduction process has already been applied at it takes as input a
gammapy.spectrum.SpectrumExtraction instance for which a list of
gammapy.background.BackgroundEstimate is required. Apart from the time
intervals, the user also has to provide a gammapy.spectrum.SpectralModel
that is used to compute the expected counts in a time bin and to scale the
integral flux with respect to the excess events found in that time bin. The
parameters of the spectral model are generally obtained via a spectral fit to
the whole data sample. See current lightcurve tutorial.
Drawbacks of this approach: no clear separation of the data reduction and
modeling steps, and only 1D On-Off spectral analysis is supported and lacks
supports for MapDataset based analysis.
Proposal#
General organization of the new approach#
The approach will be split into 3 main phases:
* Time bin preparation
* Data reduction in time bins to produce a list of Dataset
* Light curve estimation to fit a model on the resulting Datasets
The end product should contain enough information to perform some rebinning and apply high level timing techniques without rerunning the whole data reduction and fitting steps.
Time bin preparation#
Independently of the actual data reduction technique chosen, the user should first provide a list/table of time intervals for which she/he wants to compute the source flux. The computation of this list/table will be done outside of the light curve estimation class.
While we could provide helper functions to prepare the time bins.
astropy.time.Time is relatively easy to use, so that a user would execute
code similar to the following example:
from astropy.time import Time
time_start = Time("2006-07-29 20:00:00.000")
time_step = 5.0 * u.min
nstep = 120
times = time_start + time_step * np.arange(nstep)
time_bins = []
for tmin, tmax in zip(times, times[1:]):
    time_bins.append((tmin,tmax))
Data reduction#
Once the time bins are determined, the user will have to select the relevant
Observations from the Datastore. The Observations are then filtered
and grouped according to the time bins using the ObservationFilter and
passed to the light curve extraction function or class. The latter could take a
geom or a region argument that will define the data reduction geometry
(and in particular, if the data reduction is 3D or 1D). In the absence, of a
RegionGeom we could simply expect a reco energy and true energy MapAxis.
We do not detail possible approaches in the current PIG. These should be re-evaluated in the more general context of data reduction. Some examples, using the current data reduction approach will be exposed to users in a dedicated notebook.
Both approaches will result in a list of Datasets consisting of
MapDataset or SpectrumDatasetOnOff with identical geometries for each
time bin.
In order to properly assign times to any Dataset, the latter must therefore
carry the time information. Minimally, this can be done with a meta information
such as time_start and time_stop. This does not give a full idea of the
coverage of time bin though. Ideally, the Dataset should track the GTI
table of the filtered Eventlist that were used for its production. It can be
stored on the Dataset itself and be serialized independently or be added as
an extension to the serialized count data container (a CountsSpectrum or
Map), as is done for OGIP spectra. In order to stack several Dataset
objects it is necessary to be able to combine GTI tables. While a simple
stacking of tables is enough in many cases, the situation of overlapping time
intervals should be considered and a proper GTI.union() should be introduced
(a functionality recurrent in HEA software see e.g. ftools mgtime)
Data Fitting#
The data fitting is the step were the Datasets are converted into a
lightcurve based on a model of the emission. The amplitude of the source model
is fitted while other parameters are usually, but not necessarily, fixed.
The fitting is very similar to the extraction of FluxPoints in the spectral
analysis and does not strongly depend on the type of Datasets considered.
The new LightCurveEstimator class should therefore be able to take as
input both SpectrumDatasetOnOff and MapDataset, in a similar fashion as
the current FluxPointEstimator.
After creating the Datasets, the user would first define the model to be
used and freeze its parameters. Then we would apply it to the various
Dataset objects before calling the LightCurveEstimator:
sky_model = SkyModel(spatial_model=spatial_model, spectral_model=spectral_model)
sky_model.parameters["index"].frozen = True
sky_model.parameters["lon_0"].frozen = True
sky_model.parameters["lat_0"].frozen = True
for dataset in datasets:
    dataset.model = sky_model
lc_estimator = LightCurveEstimator(datasets)
lightcurve = lc.run()
To help, the user deal with model parameters some helper function could be
introduced to freeze all parameters of a model. Another possibility is to assume
that all parameters are fixed except the amplitude in the
LightCurveEstimator.run() method, or pass as arguments the names of the
parameters to leave free.
Storing the results and further studies#
The results are returned as a gammapy.time.LightCurve instance (the current
container class for light curves) that, so far, essentially holds the integrated
flux + errors and the time_min/time_max of each time bin. There are many other
quantities which could be stored, such as the energy range of the flux, the
number of excess events in the time bin considered, etc. The current container
class already provides some methods to study variability, such as chi-square
test, fractional variance estimation.
Example usage:
lightcurve.plot('amplitude')
# Get fractional variance
print(lightcurve.fvar('amplitude'))
The LightCurve object should be able to rpovide some rebinning
functionalities. In particular, it stores the fit statistic scan, it should be
possible to perform minimal significance rebinning:
simple_rebin_lc = lightcurve.rebin(factor=2)
# or
min_significance_lc = lightcurve.rebin(min_significance=3)
Discussion / Alternatives#
Time bins#
To work with time bins, we could also rely on astropy.timeseries if we force
the dependency to astropy v>3.2. This would look like:
from astropy.timeseries import BinnedTimeSeries
times = BinnedTimeSeries(time_bin_start="2006-07-29 20:00:00.000",
                       time_bin_size=5 * u.min)
print(times.time_bin_start)
print(times.time_bin_end)
Light Curve Fitting#
We could provide distinct objects specialized for Map and Spectrum datasets
instead of a single object.
Lightcurve#
The Lightcurve contains  an astropy.table.Table object. It could be
improved by using the astropy_timeseries.BinnedTimeSeries object which
itself inherits from QTable and provides support for row selection with
time, rebinning and more complex methods for detailed timing studies such as
Lomb-Scargle periodograms.
Task list#
- Add - TSTARTand- TSTOPmeta info on the- Datasetas well as the- GTItable.
- Introduce - GTI.union()`method to merge- Datasets.
- Refactor the - Lightcurveclass to add a number of new content in the- Tablee.g.:- a - fit_stat_scancolumn.
- rebinning methods 
 
- Implement the new - LightCurveEstimatorclass
- Provide a notebook showing data reduction and fitting examples for both 3D and 1D cases 
Decision#
The authors have decided to withdraw the PIG. A significant part of the implementation is already there. Besides, the recent additions of the high level interface and of the new Maker scheme change the scope of the discussion.
