.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/api/datasets.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials_api_datasets.py>`
        to download the full example code. or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_api_datasets.py:


Datasets - Reduced data, IRFs, models
=====================================

Learn how to work with datasets

Introduction
------------

`~gammapy.datasets` are a crucial part of the gammapy API. `~gammapy.datasets.Dataset`
objects constitute `DL4` data - binned counts, IRFs, models and the associated
likelihoods. `~gammapy.datasets.Datasets` from the end product of the data reduction stage,
see :doc:`/tutorials/api/makers` tutorial and are passed on to the `~gammapy.modeling.Fit`
or estimator classes for modelling and fitting purposes.

To find the different types of `~gammapy.datasets.Dataset` objects that are supported see
:ref:`datasets-types`:

Setup
-----

.. GENERATED FROM PYTHON SOURCE LINES 23-40

.. code-block:: Python


    import astropy.units as u
    from astropy.coordinates import SkyCoord
    from regions import CircleSkyRegion
    import matplotlib.pyplot as plt
    from IPython.display import display
    from gammapy.data import GTI
    from gammapy.datasets import (
        Datasets,
        FluxPointsDataset,
        MapDataset,
        SpectrumDatasetOnOff,
    )
    from gammapy.estimators import FluxPoints
    from gammapy.maps import MapAxis, WcsGeom
    from gammapy.modeling.models import FoVBackgroundModel, PowerLawSpectralModel, SkyModel


.. GENERATED FROM PYTHON SOURCE LINES 41-43

Check setup
-----------

.. GENERATED FROM PYTHON SOURCE LINES 43-52

.. code-block:: Python

    from gammapy.utils.check import check_tutorials_setup
    from gammapy.utils.scripts import make_path

    # %matplotlib inline


    check_tutorials_setup()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    System:

            python_executable      : /home/runner/work/gammapy-docs/gammapy-docs/gammapy/.tox/build_docs/bin/python 
            python_version         : 3.9.19     
            machine                : x86_64     
            system                 : Linux      


    Gammapy package:

            version                : 1.3.dev241+g0271bebfc 
            path                   : /home/runner/work/gammapy-docs/gammapy-docs/gammapy/.tox/build_docs/lib/python3.9/site-packages/gammapy 


    Other packages:

            numpy                  : 1.26.4     
            scipy                  : 1.13.0     
            astropy                : 5.2.2      
            regions                : 0.8        
            click                  : 8.1.7      
            yaml                   : 6.0.1      
            IPython                : 8.18.1     
            jupyterlab             : not installed 
            matplotlib             : 3.8.4      
            pandas                 : not installed 
            healpy                 : 1.16.6     
            iminuit                : 2.25.2     
            sherpa                 : 4.16.0     
            naima                  : 0.10.0     
            emcee                  : 3.1.6      
            corner                 : 2.2.2      
            ray                    : 2.20.0     


    Gammapy environment variables:

            GAMMAPY_DATA           : /home/runner/work/gammapy-docs/gammapy-docs/gammapy-datasets/dev 


.. GENERATED FROM PYTHON SOURCE LINES 53-70

MapDataset
----------

The counts, exposure, background, masks, and IRF maps are bundled
together in a data structure named `~gammapy.datasets.MapDataset`. While the `counts`,
and `background` maps are binned in reconstructed energy and must have
the same geometry, the IRF maps can have a different spatial (coarsely
binned and larger) geometry and spectral range (binned in true
energies). It is usually recommended that the true energy bin should be
larger and more finely sampled and the reco energy bin.

Creating an empty dataset
~~~~~~~~~~~~~~~~~~~~~~~~~

An empty `~gammapy.datasets.MapDataset` can be directly instantiated from any
`~gammapy.maps.WcsGeom` object:


.. GENERATED FROM PYTHON SOURCE LINES 70-84

.. code-block:: Python


    energy_axis = MapAxis.from_energy_bounds(1, 10, nbin=11, name="energy", unit="TeV")

    geom = WcsGeom.create(
        skydir=(83.63, 22.01),
        axes=[energy_axis],
        width=5 * u.deg,
        binsz=0.05 * u.deg,
        frame="icrs",
    )

    dataset_empty = MapDataset.create(geom=geom, name="my-dataset")


.. GENERATED FROM PYTHON SOURCE LINES 85-89

It is good practice to define a name for the dataset, such that you can
identify it later by name. However if you define a name it **must** be
unique. Now we can already print the dataset:


.. GENERATED FROM PYTHON SOURCE LINES 89-93

.. code-block:: Python


    print(dataset_empty)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MapDataset
    ----------

      Name                            : my-dataset 

      Total counts                    : 0 
      Total background counts         : 0.00
      Total excess counts             : 0.00

      Predicted counts                : 0.00
      Predicted background counts     : 0.00
      Predicted excess counts         : nan

      Exposure min                    : 0.00e+00 m2 s
      Exposure max                    : 0.00e+00 m2 s

      Number of total bins            : 110000 
      Number of fit bins              : 0 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : nan

      Number of models                : 0 
      Number of parameters            : 0
      Number of free parameters       : 0


.. GENERATED FROM PYTHON SOURCE LINES 94-100

The printout shows the key summary information of the dataset, such as
total counts, fit statistics, model information etc.

`~gammapy.datasetsMapDataset.from_geom` has additional keywords, that can be used to
define the binning of the IRF related maps:


.. GENERATED FROM PYTHON SOURCE LINES 100-121

.. code-block:: Python


    # choose a different true energy binning for the exposure, psf and edisp
    energy_axis_true = MapAxis.from_energy_bounds(
        0.1, 100, nbin=11, name="energy_true", unit="TeV", per_decade=True
    )

    # choose a different rad axis binning for the psf
    rad_axis = MapAxis.from_bounds(0, 5, nbin=50, unit="deg", name="rad")

    gti = GTI.create(0 * u.s, 1000 * u.s)

    dataset_empty = MapDataset.create(
        geom=geom,
        energy_axis_true=energy_axis_true,
        rad_axis=rad_axis,
        binsz_irf=0.1,
        gti=gti,
        name="dataset-empty",
    )


.. GENERATED FROM PYTHON SOURCE LINES 122-124

To see the geometry of each map, we can use:


.. GENERATED FROM PYTHON SOURCE LINES 124-128

.. code-block:: Python


    print(dataset_empty.geoms)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'geom': <gammapy.maps.wcs.geom.WcsGeom object at 0x7f8afcd93bb0>, 'geom_exposure': <gammapy.maps.wcs.geom.WcsGeom object at 0x7f8afc919700>, 'geom_psf': <gammapy.maps.wcs.geom.WcsGeom object at 0x7f8afc919ca0>, 'geom_edisp': <gammapy.maps.wcs.geom.WcsGeom object at 0x7f8afcd933d0>}


.. GENERATED FROM PYTHON SOURCE LINES 129-132

Another way to create a `~gammapy.datasets.MapDataset` is to just read an existing one
from a FITS file:


.. GENERATED FROM PYTHON SOURCE LINES 132-140

.. code-block:: Python


    dataset_cta = MapDataset.read(
        "$GAMMAPY_DATA/cta-1dc-gc/cta-1dc-gc.fits.gz", name="dataset-cta"
    )

    print(dataset_cta)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MapDataset
    ----------

      Name                            : dataset-cta 

      Total counts                    : 104317 
      Total background counts         : 91507.70
      Total excess counts             : 12809.30

      Predicted counts                : 91507.69
      Predicted background counts     : 91507.70
      Predicted excess counts         : nan

      Exposure min                    : 6.28e+07 m2 s
      Exposure max                    : 1.90e+10 m2 s

      Number of total bins            : 768000 
      Number of fit bins              : 691680 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : nan

      Number of models                : 0 
      Number of parameters            : 0
      Number of free parameters       : 0


.. GENERATED FROM PYTHON SOURCE LINES 141-144

Accessing contents of a dataset
-------------------------------


.. GENERATED FROM PYTHON SOURCE LINES 147-150

To further explore the contents of a `Dataset`, you can use
e.g. `~gammapy.datasets.MapDataset.info_dict()`


.. GENERATED FROM PYTHON SOURCE LINES 152-154

For a quick info, use


.. GENERATED FROM PYTHON SOURCE LINES 154-157

.. code-block:: Python


    print(dataset_cta.info_dict())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'name': 'dataset-cta', 'counts': 104317, 'excess': 12809.3046875, 'sqrt_ts': 41.41009347393684, 'background': 91507.6953125, 'npred': 91507.68628538586, 'npred_background': 91507.6953125, 'npred_signal': nan, 'exposure_min': <Quantity 62842028. m2 s>, 'exposure_max': <Quantity 1.90242058e+10 m2 s>, 'livetime': <Quantity 5292.00010298 s>, 'ontime': <Quantity 5400. s>, 'counts_rate': <Quantity 19.71220672 1 / s>, 'background_rate': <Quantity 17.29170324 1 / s>, 'excess_rate': <Quantity 2.42050348 1 / s>, 'n_bins': 768000, 'n_fit_bins': 691680, 'stat_type': 'cash', 'stat_sum': nan}


.. GENERATED FROM PYTHON SOURCE LINES 158-160

For a quick view, use


.. GENERATED FROM PYTHON SOURCE LINES 160-164

.. code-block:: Python


    dataset_cta.peek()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_001.png
   :alt: Counts, Excess counts, Exposure, Background
   :srcset: /tutorials/api/images/sphx_glr_datasets_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 165-167

And access individual maps like:


.. GENERATED FROM PYTHON SOURCE LINES 167-173

.. code-block:: Python

    plt.figure()
    counts_image = dataset_cta.counts.sum_over_axes()
    counts_image.smooth("0.1 deg").plot()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_002.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 174-177

Of course you can also access IRF related maps, e.g. the psf as
`~gammapy.irf.PSFMap`:


.. GENERATED FROM PYTHON SOURCE LINES 177-181

.. code-block:: Python


    print(dataset_cta.psf)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    WcsNDMap

            geom  : WcsGeom 
            axes  : ['lon', 'lat', 'rad', 'energy_true']
            shape : (16, 12, 66, 14)
            ndim  : 4
            unit  : 1 / sr
            dtype : >f4


.. GENERATED FROM PYTHON SOURCE LINES 182-184

And use any method on the `~gammapy.irf.PSFMap` object:


.. GENERATED FROM PYTHON SOURCE LINES 184-187

.. code-block:: Python

    radius = dataset_cta.psf.containment_radius(energy_true=1 * u.TeV, fraction=0.95)
    print(radius)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0.08675] deg


.. GENERATED FROM PYTHON SOURCE LINES 188-194

.. code-block:: Python


    edisp_kernel = dataset_cta.edisp.get_edisp_kernel()
    edisp_kernel.plot_matrix()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_003.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_003.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 195-199

The `~gammapy.datasets.MapDataset` typically also contains the information on the
residual hadronic background, stored in `~gammapy.datasets.MapDataset.background` as a
map:


.. GENERATED FROM PYTHON SOURCE LINES 199-203

.. code-block:: Python


    print(dataset_cta.background)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    WcsNDMap

            geom  : WcsGeom 
            axes  : ['lon', 'lat', 'energy']
            shape : (320, 240, 10)
            ndim  : 3
            unit  : 
            dtype : >f4


.. GENERATED FROM PYTHON SOURCE LINES 204-207

As a next step we define a minimal model on the dataset using the
`~gammapy.datasets.MapDataset.models` setter:


.. GENERATED FROM PYTHON SOURCE LINES 207-215

.. code-block:: Python


    model = SkyModel.create("pl", "point", name="gc")
    model.spatial_model.position = SkyCoord("0d", "0d", frame="galactic")
    model_bkg = FoVBackgroundModel(dataset_name="dataset-cta")

    dataset_cta.models = [model, model_bkg]


.. GENERATED FROM PYTHON SOURCE LINES 216-219

Assigning models to datasets is covered in more detail in :doc:`/tutorials/api/model_management`.
Printing the dataset will now show the model components:


.. GENERATED FROM PYTHON SOURCE LINES 219-223

.. code-block:: Python


    print(dataset_cta)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MapDataset
    ----------

      Name                            : dataset-cta 

      Total counts                    : 104317 
      Total background counts         : 91507.70
      Total excess counts             : 12809.31

      Predicted counts                : 91719.65
      Predicted background counts     : 91507.69
      Predicted excess counts         : 211.96

      Exposure min                    : 6.28e+07 m2 s
      Exposure max                    : 1.90e+10 m2 s

      Number of total bins            : 768000 
      Number of fit bins              : 691680 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : 424649.33

      Number of models                : 2 
      Number of parameters            : 8
      Number of free parameters       : 5

      Component 0: SkyModel
  
        Name                      : gc
        Datasets names            : None
        Spectral model type       : PowerLawSpectralModel
        Spatial  model type       : PointSpatialModel
        Temporal model type       : 
        Parameters:
          index                         :      2.000   +/-    0.00             
          amplitude                     :   1.00e-12   +/- 0.0e+00 1 / (cm2 s TeV)
          reference             (frozen):      1.000       TeV         
          lon_0                         :      4.650   +/-    0.00 rad         
          lat_0                         :     -0.505   +/-    0.00 rad         
  
      Component 1: FoVBackgroundModel
  
        Name                      : dataset-cta-bkg
        Datasets names            : ['dataset-cta']
        Spectral model type       : PowerLawNormSpectralModel
        Parameters:
          norm                          :      1.000   +/-    0.00             
          tilt                  (frozen):      0.000                   
          reference             (frozen):      1.000       TeV         
  
  
.. GENERATED FROM PYTHON SOURCE LINES 224-227

Now we can use the `~gammapy.datasets.MapDataset.npred` method to get a map of the total predicted counts
of the model:


.. GENERATED FROM PYTHON SOURCE LINES 227-233

.. code-block:: Python


    npred = dataset_cta.npred()
    npred.sum_over_axes().plot()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_004.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_004.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 234-237

To get the predicted counts from an individual model component we can
use:


.. GENERATED FROM PYTHON SOURCE LINES 237-243

.. code-block:: Python


    npred_source = dataset_cta.npred_signal(model_names=["gc"])
    npred_source.sum_over_axes().plot()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_005.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_005.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 244-249

`~gammapy.datasets.MapDataset.background` contains the background map computed from the
IRF. Internally it will be combined with a `~gammapy.modeling.models.FoVBackgroundModel`, to
allow for adjusting the background model during a fit. To get the model
corrected background, one can use `~gammapy.datasets.MapDataset.npred_background`.


.. GENERATED FROM PYTHON SOURCE LINES 249-255

.. code-block:: Python


    npred_background = dataset_cta.npred_background()
    npred_background.sum_over_axes().plot()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_006.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_006.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 256-275

Using masks
~~~~~~~~~~~

There are two masks that can be set on a `~gammapy.datasets.MapDataset`, `~gammapy.datasets.MapDataset.mask_safe` and
`~gammapy.datasets.MapDataset.mask_fit`.

-  The `~gammapy.datasets.MapDataset.mask_safe` is computed during the data reduction process
   according to the specified selection cuts, and should not be changed
   by the user.
-  During modelling and fitting, the user might want to additionally
   ignore some parts of a reduced dataset, e.g. to restrict the fit to a
   specific energy range or to ignore parts of the region of interest.
   This should be done by applying the `~gammapy.datasets.MapDataset.mask_fit`. To see details of
   applying masks, please refer to :ref:`masks-for-fitting`.

Both the `~gammapy.datasets.MapDataset.mask_fit` and `~gammapy.datasets.MapDataset.mask_safe` must
have the same `~gammapy.maps.Map.geom` as the `~gammapy.datasets.MapDataset.counts` and
`~gammapy.datasets.MapDataset.background` maps.


.. GENERATED FROM PYTHON SOURCE LINES 275-282

.. code-block:: Python


    # eg: to see the safe data range

    dataset_cta.mask_safe.plot_grid()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_007.png
   :alt: Energy 100 GeV - 158 GeV, Energy 158 GeV - 251 GeV, Energy 251 GeV - 398 GeV, Energy 398 GeV - 631 GeV, Energy 631 GeV - 1.00 TeV, Energy 1.00 TeV - 1.58 TeV, Energy 1.58 TeV - 2.51 TeV, Energy 2.51 TeV - 3.98 TeV, Energy 3.98 TeV - 6.31 TeV, Energy 6.31 TeV - 10.0 TeV
   :srcset: /tutorials/api/images/sphx_glr_datasets_007.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 283-285

In addition it is possible to define a custom `~gammapy.datasets.MapDataset.mask_fit`:


.. GENERATED FROM PYTHON SOURCE LINES 285-299

.. code-block:: Python


    # To apply a mask fit - in energy and space

    region = CircleSkyRegion(SkyCoord("0d", "0d", frame="galactic"), 1.5 * u.deg)

    geom = dataset_cta.counts.geom

    mask_space = geom.region_mask([region])
    mask_energy = geom.energy_mask(0.3 * u.TeV, 8 * u.TeV)
    dataset_cta.mask_fit = mask_space & mask_energy
    dataset_cta.mask_fit.plot_grid(vmin=0, vmax=1, add_cbar=True)
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_008.png
   :alt: Energy 100 GeV - 158 GeV, Energy 158 GeV - 251 GeV, Energy 251 GeV - 398 GeV, Energy 398 GeV - 631 GeV, Energy 631 GeV - 1.00 TeV, Energy 1.00 TeV - 1.58 TeV, Energy 1.58 TeV - 2.51 TeV, Energy 2.51 TeV - 3.98 TeV, Energy 3.98 TeV - 6.31 TeV, Energy 6.31 TeV - 10.0 TeV
   :srcset: /tutorials/api/images/sphx_glr_datasets_008.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 300-309

To access the energy range defined by the mask you can use:

-  `~gammapy.datasets.MapDataset.energy_range_safe` : energy range defined by the `~gammapy.datasets.MapDataset.mask_safe`
-  `~gammapy.datasets.MapDataset.energy_range_fit` : energy range defined by the `~gammapy.datasets.MapDataset.mask_fit`
-  `~gammapy.datasets.MapDataset.energy_range` : the final energy range used in likelihood computation

These methods return two maps, with the `min` and `max` energy
values at each spatial pixel


.. GENERATED FROM PYTHON SOURCE LINES 309-317

.. code-block:: Python


    e_min, e_max = dataset_cta.energy_range

    # To see the low energy threshold at each point

    e_min.plot(add_cbar=True)
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_009.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_009.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 318-319

To see the high energy threshold at each point

.. GENERATED FROM PYTHON SOURCE LINES 319-324

.. code-block:: Python


    e_max.plot(add_cbar=True)
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_010.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_010.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 325-329

Just as for `~gammapy.maps.Map` objects it is possible to cutout a whole
`~gammapy.datasets.MapDataset`, which will perform the cutout for all maps in
parallel. Optionally one can provide a new name to the resulting dataset:


.. GENERATED FROM PYTHON SOURCE LINES 329-340

.. code-block:: Python


    cutout = dataset_cta.cutout(
        position=SkyCoord("0d", "0d", frame="galactic"),
        width=2 * u.deg,
        name="cta-cutout",
    )

    cutout.counts.sum_over_axes().plot()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_011.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_011.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 341-343

It is also possible to slice a `~gammapy.datasets.MapDataset` in energy:


.. GENERATED FROM PYTHON SOURCE LINES 343-351

.. code-block:: Python


    sliced = dataset_cta.slice_by_energy(
        energy_min=1 * u.TeV, energy_max=5 * u.TeV, name="slice-energy"
    )
    sliced.counts.plot_grid()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_012.png
   :alt: Energy 1.00 TeV - 1.58 TeV, Energy 1.58 TeV - 2.51 TeV, Energy 2.51 TeV - 3.98 TeV
   :srcset: /tutorials/api/images/sphx_glr_datasets_012.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 352-355

The same operation will be applied to all other maps contained in the
datasets such as `~gammapy.datasets.MapDataset.mask_fit`:


.. GENERATED FROM PYTHON SOURCE LINES 355-360

.. code-block:: Python


    sliced.mask_fit.plot_grid()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_013.png
   :alt: Energy 1.00 TeV - 1.58 TeV, Energy 1.58 TeV - 2.51 TeV, Energy 2.51 TeV - 3.98 TeV
   :srcset: /tutorials/api/images/sphx_glr_datasets_013.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 361-368

Resampling datasets
~~~~~~~~~~~~~~~~~~~

It can often be useful to coarsely rebin an initially computed datasets
by a specified factor. This can be done in either spatial or energy
axes:


.. GENERATED FROM PYTHON SOURCE LINES 368-375

.. code-block:: Python


    plt.figure()
    downsampled = dataset_cta.downsample(factor=8)
    downsampled.counts.sum_over_axes().plot()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_014.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_014.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 376-378

And the same down-sampling process is possible along the energy axis:


.. GENERATED FROM PYTHON SOURCE LINES 378-386

.. code-block:: Python


    downsampled_energy = dataset_cta.downsample(
        factor=5, axis_name="energy", name="downsampled-energy"
    )
    downsampled_energy.counts.plot_grid()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_015.png
   :alt: Energy 100 GeV - 1.00 TeV, Energy 1.00 TeV - 10.0 TeV
   :srcset: /tutorials/api/images/sphx_glr_datasets_015.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 387-390

In the printout one can see that the actual number of counts is
preserved during the down-sampling:


.. GENERATED FROM PYTHON SOURCE LINES 390-394

.. code-block:: Python


    print(downsampled_energy, dataset_cta)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MapDataset
    ----------

      Name                            : downsampled-energy 

      Total counts                    : 104317 
      Total background counts         : 91507.69
      Total excess counts             : 12809.31

      Predicted counts                : 91507.69
      Predicted background counts     : 91507.69
      Predicted excess counts         : nan

      Exposure min                    : 6.28e+07 m2 s
      Exposure max                    : 1.90e+10 m2 s

      Number of total bins            : 153600 
      Number of fit bins              : 22608 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : nan

      Number of models                : 0 
      Number of parameters            : 0
      Number of free parameters       : 0

     MapDataset
    ----------

      Name                            : dataset-cta 

      Total counts                    : 104317 
      Total background counts         : 91507.70
      Total excess counts             : 12809.31

      Predicted counts                : 91719.65
      Predicted background counts     : 91507.69
      Predicted excess counts         : 211.96

      Exposure min                    : 6.28e+07 m2 s
      Exposure max                    : 1.90e+10 m2 s

      Number of total bins            : 768000 
      Number of fit bins              : 67824 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : 44318.32

      Number of models                : 2 
      Number of parameters            : 8
      Number of free parameters       : 5

      Component 0: SkyModel
  
        Name                      : gc
        Datasets names            : None
        Spectral model type       : PowerLawSpectralModel
        Spatial  model type       : PointSpatialModel
        Temporal model type       : 
        Parameters:
          index                         :      2.000   +/-    0.00             
          amplitude                     :   1.00e-12   +/- 0.0e+00 1 / (cm2 s TeV)
          reference             (frozen):      1.000       TeV         
          lon_0                         :      4.650   +/-    0.00 rad         
          lat_0                         :     -0.505   +/-    0.00 rad         
  
      Component 1: FoVBackgroundModel
  
        Name                      : dataset-cta-bkg
        Datasets names            : ['dataset-cta']
        Spectral model type       : PowerLawNormSpectralModel
        Parameters:
          norm                          :      1.000   +/-    0.00             
          tilt                  (frozen):      0.000                   
          reference             (frozen):      1.000       TeV         
  
  
.. GENERATED FROM PYTHON SOURCE LINES 395-398

We can also resample the finer binned datasets to an arbitrary coarser
energy binning using:


.. GENERATED FROM PYTHON SOURCE LINES 398-405

.. code-block:: Python


    energy_axis_new = MapAxis.from_energy_edges([0.1, 0.3, 1, 3, 10] * u.TeV)
    resampled = dataset_cta.resample_energy_axis(energy_axis=energy_axis_new)
    resampled.counts.plot_grid(ncols=2)
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_016.png
   :alt: Energy 100 GeV - 251 GeV, Energy 251 GeV - 1.00 TeV, Energy 1.00 TeV - 2.51 TeV, Energy 2.51 TeV - 10.0 TeV
   :srcset: /tutorials/api/images/sphx_glr_datasets_016.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 406-409

To squash the whole dataset into a single energy bin there is the
`~gammapy.datasets.MapDataset.to_image()` convenience method:


.. GENERATED FROM PYTHON SOURCE LINES 409-415

.. code-block:: Python


    dataset_image = dataset_cta.to_image()
    dataset_image.counts.plot()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_017.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_017.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 416-426

SpectrumDataset
---------------

`~gammapy.datasets.SpectrumDataset` inherits from a `~gammapy.datasets.MapDataset`, and is specially
adapted for 1D spectral analysis, and uses a `~gammapy.maps.RegionGeom` instead of a
`~gammapy.maps.WcsGeom`. A `~gammapy.datasets.MapDataset` can be converted to a `~gammapy.datasets.SpectrumDataset`,
by summing the `counts` and `background` inside the `on_region`,
which can then be used for classical spectral analysis. Containment
correction is feasible only for circular regions.


.. GENERATED FROM PYTHON SOURCE LINES 426-438

.. code-block:: Python


    region = CircleSkyRegion(SkyCoord(0, 0, unit="deg", frame="galactic"), 0.5 * u.deg)
    spectrum_dataset = dataset_cta.to_spectrum_dataset(
        region, containment_correction=True, name="spectrum-dataset"
    )

    # For a quick look
    spectrum_dataset.peek()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_018.png
   :alt: Counts, Exposure, Energy Dispersion
   :srcset: /tutorials/api/images/sphx_glr_datasets_018.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 439-444

A `~gammapy.datasets.MapDataset` can also be integrated over the `on_region` to create
a `~gammapy.datasets.MapDataset` with a `~gammapy.maps.RegionGeom`. Complex regions can be handled
and since the full IRFs are used, containment correction is not
required.


.. GENERATED FROM PYTHON SOURCE LINES 444-449

.. code-block:: Python


    reg_dataset = dataset_cta.to_region_map_dataset(region, name="region-map-dataset")
    print(reg_dataset)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MapDataset
    ----------

      Name                            : region-map-dataset 

      Total counts                    : 5644 
      Total background counts         : 3323.66
      Total excess counts             : 2320.34

      Predicted counts                : 3323.66
      Predicted background counts     : 3323.66
      Predicted excess counts         : nan

      Exposure min                    : 4.66e+08 m2 s
      Exposure max                    : 1.89e+10 m2 s

      Number of total bins            : 10 
      Number of fit bins              : 6 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : nan

      Number of models                : 0 
      Number of parameters            : 0
      Number of free parameters       : 0


.. GENERATED FROM PYTHON SOURCE LINES 450-456

FluxPointsDataset
-----------------

`~gammapy.datasets.FluxPointsDataset` is a `~gammapy.datasets.Dataset` container for precomputed flux
points, which can be then used in fitting.


.. GENERATED FROM PYTHON SOURCE LINES 456-467

.. code-block:: Python


    flux_points = FluxPoints.read(
        "$GAMMAPY_DATA/tests/spectrum/flux_points/diff_flux_points.fits"
    )
    model = SkyModel(spectral_model=PowerLawSpectralModel(index=2.3))
    fp_dataset = FluxPointsDataset(data=flux_points, models=model)

    fp_dataset.plot_spectrum()
    plt.show()


.. image-sg:: /tutorials/api/images/sphx_glr_datasets_019.png
   :alt: datasets
   :srcset: /tutorials/api/images/sphx_glr_datasets_019.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 468-473

The masks on `~gammapy.datasets.FluxPointsDataset` are `~numpy.ndarray` objects
and the data is a
`~gammapy.datasets.FluxPoints` object. The `~gammapy.datasets.FluxPointsDataset.mask_safe`,
by default, masks the upper limit points.


.. GENERATED FROM PYTHON SOURCE LINES 473-476

.. code-block:: Python


    print(fp_dataset.mask_safe)  # Note: the mask here is simply a numpy array


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[ True]]

     [[False]]

     [[ True]]

     [[False]]

     [[False]]

     [[ True]]

     [[False]]

     [[False]]]


.. GENERATED FROM PYTHON SOURCE LINES 477-480

.. code-block:: Python


    print(fp_dataset.data)  # is a `FluxPoints` object


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    FluxPoints
    ----------

      geom                   : RegionGeom
      axes                   : ['lon', 'lat', 'energy']
      shape                  : (1, 1, 24)
      quantities             : ['norm', 'norm_err', 'norm_ul']
      ref. model             : pl
      n_sigma                : 1
      n_sigma_ul             : 2.0
      sqrt_ts_threshold_ul   : 2
      sed type init          : dnde


.. GENERATED FROM PYTHON SOURCE LINES 481-485

.. code-block:: Python


    print(fp_dataset.data_shape())  # number of data points


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (24,)


.. GENERATED FROM PYTHON SOURCE LINES 486-489

For an example of fitting `~gammapy.estimators.FluxPoints`, see :doc:`/tutorials/analysis-1d/sed_fitting`,
and for using source catalogs see :doc:`/tutorials/api/catalog`.


.. GENERATED FROM PYTHON SOURCE LINES 493-513

Datasets
--------

`~gammapy.datasets.Datasets` are a collection of `~gammapy.datasets.Dataset` objects. They can be of the
same type, or of different types, eg: mix of `~gammapy.datasets.FluxPointsDataset`,
`~gammapy.datasets.MapDataset` and `~gammapy.datasets.SpectrumDataset`.

For modelling and fitting of a list of `~gammapy.datasets.Dataset` objects, you can
either:

-  (A) Do a joint fitting of all the datasets together **OR**
-  (B) Stack the datasets together, and then fit them.

`~gammapy.datasets.Datasets` is a convenient tool to handle joint fitting of
simultaneous datasets. As an example, please see :doc:`/tutorials/analysis-3d/analysis_mwl`.

To see how stacking is performed, please see :ref:`stack`.

To create a `~gammapy.datasets.Datasets` object, pass a list of `~gammapy.datasets.Dataset` on init, eg


.. GENERATED FROM PYTHON SOURCE LINES 513-519

.. code-block:: Python


    datasets = Datasets([dataset_empty, dataset_cta])

    print(datasets)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Datasets
    --------

    Dataset 0: 

      Type       : MapDataset
      Name       : dataset-empty
      Instrument : 
      Models     : 

    Dataset 1: 

      Type       : MapDataset
      Name       : dataset-cta
      Instrument : 
      Models     : ['gc', 'dataset-cta-bkg']


.. GENERATED FROM PYTHON SOURCE LINES 520-524

If all the datasets have the same type we can also print an info table,
collecting all the information from the individual calls to
`~gammapy.datasets.Dataset.info_dict()`:


.. GENERATED FROM PYTHON SOURCE LINES 524-530

.. code-block:: Python


    display(datasets.info_table())  # quick info of all datasets

    print(datasets.names)  # unique name of each dataset


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

         name     counts       excess       ... stat_type      stat_sum     
                                            ...                             
    ------------- ------ ------------------ ... --------- ------------------
    dataset-empty      0                0.0 ...      cash                nan
      dataset-cta 104317 12809.313714614138 ...      cash 44318.324424288934
    ['dataset-empty', 'dataset-cta']


.. GENERATED FROM PYTHON SOURCE LINES 531-533

We can access individual datasets in `Datasets` object by name:


.. GENERATED FROM PYTHON SOURCE LINES 533-537

.. code-block:: Python


    print(datasets["dataset-empty"])  # extracts the first dataset


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MapDataset
    ----------

      Name                            : dataset-empty 

      Total counts                    : 0 
      Total background counts         : 0.00
      Total excess counts             : 0.00

      Predicted counts                : 0.00
      Predicted background counts     : 0.00
      Predicted excess counts         : nan

      Exposure min                    : 0.00e+00 m2 s
      Exposure max                    : 0.00e+00 m2 s

      Number of total bins            : 110000 
      Number of fit bins              : 0 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : nan

      Number of models                : 0 
      Number of parameters            : 0
      Number of free parameters       : 0


.. GENERATED FROM PYTHON SOURCE LINES 538-540

Or by index:


.. GENERATED FROM PYTHON SOURCE LINES 540-544

.. code-block:: Python


    print(datasets[0])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MapDataset
    ----------

      Name                            : dataset-empty 

      Total counts                    : 0 
      Total background counts         : 0.00
      Total excess counts             : 0.00

      Predicted counts                : 0.00
      Predicted background counts     : 0.00
      Predicted excess counts         : nan

      Exposure min                    : 0.00e+00 m2 s
      Exposure max                    : 0.00e+00 m2 s

      Number of total bins            : 110000 
      Number of fit bins              : 0 

      Fit statistic type              : cash
      Fit statistic value (-2 log(L)) : nan

      Number of models                : 0 
      Number of parameters            : 0
      Number of free parameters       : 0


.. GENERATED FROM PYTHON SOURCE LINES 545-547

Other list type operations work as well such as:


.. GENERATED FROM PYTHON SOURCE LINES 547-553

.. code-block:: Python


    # Use python list convention to remove/add datasets, eg:
    datasets.remove("dataset-empty")
    print(datasets.names)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['dataset-cta']


.. GENERATED FROM PYTHON SOURCE LINES 554-556

Or


.. GENERATED FROM PYTHON SOURCE LINES 556-561

.. code-block:: Python


    datasets.append(spectrum_dataset)
    print(datasets.names)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['dataset-cta', 'spectrum-dataset']


.. GENERATED FROM PYTHON SOURCE LINES 562-565

Let’s create a list of spectrum datasets to illustrate some more
functionality:


.. GENERATED FROM PYTHON SOURCE LINES 565-577

.. code-block:: Python


    datasets = Datasets()

    path = make_path("$GAMMAPY_DATA/joint-crab/spectra/hess")

    for filename in path.glob("pha_*.fits"):
        dataset = SpectrumDatasetOnOff.read(filename)
        datasets.append(dataset)

    print(datasets)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Datasets
    --------

    Dataset 0: 

      Type       : SpectrumDatasetOnOff
      Name       : 23526
      Instrument : 
      Models     : 

    Dataset 1: 

      Type       : SpectrumDatasetOnOff
      Name       : 23592
      Instrument : 
      Models     : 

    Dataset 2: 

      Type       : SpectrumDatasetOnOff
      Name       : 23523
      Instrument : 
      Models     : 

    Dataset 3: 

      Type       : SpectrumDatasetOnOff
      Name       : 23559
      Instrument : 
      Models     : 


.. GENERATED FROM PYTHON SOURCE LINES 578-580

Now we can stack all datasets using `~gammapy.datasets.Datasets.stack_reduce()`:


.. GENERATED FROM PYTHON SOURCE LINES 580-585

.. code-block:: Python


    stacked = datasets.stack_reduce(name="stacked")
    print(stacked)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    SpectrumDatasetOnOff
    --------------------

      Name                            : stacked 

      Total counts                    : 459 
      Total background counts         : 27.50
      Total excess counts             : 431.50

      Predicted counts                : 45.27
      Predicted background counts     : 45.27
      Predicted excess counts         : nan

      Exposure min                    : 2.13e+06 m2 s
      Exposure max                    : 2.63e+09 m2 s

      Number of total bins            : 80 
      Number of fit bins              : 43 

      Fit statistic type              : wstat
      Fit statistic value (-2 log(L)) : 1530.36

      Number of models                : 0 
      Number of parameters            : 0
      Number of free parameters       : 0

      Total counts_off                : 645 
      Acceptance                      : 43 
      Acceptance off                  : 1149 


.. GENERATED FROM PYTHON SOURCE LINES 586-588

Or slice all datasets by a given energy range:


.. GENERATED FROM PYTHON SOURCE LINES 588-594

.. code-block:: Python


    datasets_sliced = datasets.slice_by_energy(energy_min="1 TeV", energy_max="10 TeV")
    plt.show()

    print(datasets_sliced.energy_ranges)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (<Quantity [1.e+09, 1.e+09, 1.e+09, 1.e+09] keV>, <Quantity [1.e+10, 1.e+10, 1.e+10, 1.e+10] keV>)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 14.689 seconds)


.. _sphx_glr_download_tutorials_api_datasets.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/gammapy/gammapy-webpage/main?urlpath=lab/tree/notebooks/dev/tutorials/api/datasets.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: datasets.ipynb <datasets.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: datasets.py <datasets.py>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_