This is a fixed-text formatted version of a Jupyter notebook

Source detection with Gammapy

Context

The first task in a source catalogue production is to identify significant excesses in the data that can be associated to unknown sources and provide a preliminary parametrization in term of position, extent, and flux. In this notebook we will use Fermi-LAT data to illustrate how to detect candidate sources in counts images with known background.

Objective: build a list of significant excesses in a Fermi-LAT map

Proposed approach

This notebook show how to do source detection with Gammapy using the methods available in gammapy.detect. We will use images from a Fermi-LAT 3FHL high-energy Galactic center dataset to do this:

  • perform adaptive smoothing on counts image

  • produce 2-dimensional test-statistics (TS)

  • run a peak finder to detect point-source candidates

  • compute Li & Ma significance images

  • estimate source candidates radius and excess counts

Note that what we do here is a quick-look analysis, the production of real source catalogs use more elaborate procedures.

We will work with the following functions and classes:

Setup

As always, let’s get started with some setup …

[1]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from gammapy.maps import Map
from gammapy.detect import (
    ASmooth,
    TSMapEstimator,
    find_peaks,
    compute_lima_image,
)
from gammapy.catalog import SOURCE_CATALOGS
from gammapy.cube import PSFKernel
from gammapy.stats import significance
from astropy.coordinates import SkyCoord
from astropy.convolution import Tophat2DKernel
import astropy.units as u
import numpy as np
[2]:
# defalut matplotlib colors without grey
colors = [
    u"#1f77b4",
    u"#ff7f0e",
    u"#2ca02c",
    u"#d62728",
    u"#9467bd",
    u"#8c564b",
    u"#e377c2",
    u"#bcbd22",
    u"#17becf",
]

Read in input images

We first read in the counts cube and sum over the energy axis:

[3]:
counts = Map.read("$GAMMAPY_DATA/fermi-3fhl-gc/fermi-3fhl-gc-counts.fits.gz")
background = Map.read(
    "$GAMMAPY_DATA/fermi-3fhl-gc/fermi-3fhl-gc-background.fits.gz"
)
exposure = Map.read(
    "$GAMMAPY_DATA/fermi-3fhl-gc/fermi-3fhl-gc-exposure.fits.gz"
)

maps = {"counts": counts, "background": background, "exposure": exposure}

kernel = PSFKernel.read(
    "$GAMMAPY_DATA/fermi-3fhl-gc/fermi-3fhl-gc-psf.fits.gz"
)
WARNING: FITSFixedWarning: 'datfix' made the change 'Set DATE-REF to '1858-11-17' from MJD-REF'. [astropy.wcs.wcs]

Adaptive smoothing

For visualisation purpose it can be nice to look at a smoothed counts image. This can be performed using the adaptive smoothing algorithm from Ebeling et al. (2006). In the following example the threshold argument gives the minimum significance expected, values below are clipped.

[4]:
%%time
scales = u.Quantity(np.arange(0.05, 1, 0.05), unit="deg")
smooth = ASmooth(threshold=3, scales=scales)
images = smooth.run(**maps)
/Users/adonath/github/adonath/gammapy/gammapy/maps/image_utils.py:44: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
  return np.dstack(_fftconvolve_wrap(kernel, data) for kernel in kernels)
CPU times: user 905 ms, sys: 358 ms, total: 1.26 s
Wall time: 1.28 s
[5]:
plt.figure(figsize=(15, 5))
images["counts"].plot(add_cbar=True, vmax=10);
../_images/notebooks_detect_9_0.png

TS map estimation

The Test Statistic, TS = 2 ∆ log L (Mattox et al. 1996), compares the likelihood function L optimized with and without a given source. The TS map is computed by fitting by a single amplitude parameter on each pixel as described in Appendix A of Stewart (2009). The fit is simplified by finding roots of the derivative of the fit statistics (default settings use Brent’s method).

[6]:
%%time
estimator = TSMapEstimator()
images = estimator.run(maps, kernel.data)
CPU times: user 8.5 s, sys: 134 ms, total: 8.63 s
Wall time: 8.75 s

Plot resulting images

[7]:
plt.figure(figsize=(15, 5))
images["sqrt_ts"].plot(add_cbar=True);
../_images/notebooks_detect_13_0.png
[8]:
plt.figure(figsize=(15, 5))
images["flux"].plot(add_cbar=True, stretch="sqrt", vmin=0);
/Users/adonath/software/anaconda3/envs/gammapy-dev/lib/python3.7/site-packages/matplotlib/colors.py:527: RuntimeWarning: invalid value encountered in less
  xa[xa < 0] = -1
/Users/adonath/software/anaconda3/envs/gammapy-dev/lib/python3.7/site-packages/matplotlib/colors.py:527: RuntimeWarning: invalid value encountered in less
  xa[xa < 0] = -1
../_images/notebooks_detect_14_1.png
[9]:
plt.figure(figsize=(15, 5))
images["niter"].plot(add_cbar=True);
../_images/notebooks_detect_15_0.png

Source candidates

Let’s run a peak finder on the sqrt_ts image to get a list of point-sources candidates (positions and peak sqrt_ts values). The find_peaks function performs a local maximun search in a sliding window, the argument min_distance is the minimum pixel distance between peaks (smallest possible value and default is 1 pixel).

[10]:
sources = find_peaks(images["sqrt_ts"], threshold=8, min_distance=1)
nsou = len(sources)
sources
[10]:
Table length=8
valuexyradec
degdeg
float64int64int64float64float64
32.97220099266.41449-28.97054
27.9955260272.43197-23.54282
16.183298271.16056-21.74479
14.9396993270.40919-23.47797
14.8428092270.15899-23.98049
13.95136102270.86716-21.82076
9.7245273119263.18257-31.52587
8.8532124102268.46711-25.63326
[11]:
# Plot sources on top of significance sky image
plt.figure(figsize=(15, 5))

_, ax, _ = images["sqrt_ts"].plot(add_cbar=True)

ax.scatter(
    sources["ra"],
    sources["dec"],
    transform=plt.gca().get_transform("icrs"),
    color="none",
    edgecolor="w",
    marker="o",
    s=600,
    lw=1.5,
);
../_images/notebooks_detect_18_0.png

Note that we used the instrument point-spread-function (PSF) as kernel, so the hypothesis we test is the presence of a point source. In order to test for extended sources we would have to use as kernel an extended template convolved by the PSF. Alternatively, we can compute the significance of an extended excess using the Li & Ma formalism, which is faster as no fitting is involve.

Li & Ma significance maps

We can compute significance for an observed number of counts and known background using an extension of equation (17) from the Li & Ma (1983) (see gammapy.stats.significance for details). We can perform this calculation intergating the counts within different radius. To do so we use an astropy Tophat kernel with the compute_lima_image function.

[12]:
%%time
radius = np.array([0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.5])
pixsize = counts.geom.pixel_scales[0].value
nr = len(radius)
signi = np.zeros((nsou, nr))
excess = np.zeros((nsou, nr))
for kr in range(nr):
    npixel = radius[kr] / pixsize
    kernel = Tophat2DKernel(npixel)
    result = compute_lima_image(counts, background, kernel)
    signi[:, kr] = result["significance"].data[sources["y"], sources["x"]]
    excess[:, kr] = result["excess"].data[sources["y"], sources["x"]]
CPU times: user 171 ms, sys: 4.43 ms, total: 175 ms
Wall time: 103 ms

For simplicity we saved the significance and excess at the position of the candidates found previously on the TS map, but we could aslo have applied the peak finder on these significances maps for each scale, or alternatively implemented a 3D peak detection (in longitude, latitude, radius). Now let’s look at the significance versus integration radius:

[13]:
plt.figure()
for ks in range(nsou):
    plt.plot(radius, signi[ks, :], color=colors[ks])
plt.xlabel("Radius")
plt.ylabel("Li & Ma Significance")
plt.title("Guessing optimal radius of each candidate");
../_images/notebooks_detect_23_0.png

We can add the optimal radius guessed and the corresdponding excess to the source candidate properties table.

[14]:
# rename the value key to sqrt(TS)_PS
sources.rename_column("value", "sqrt(TS)_PS")

index = np.argmax(signi, axis=1)
sources["significance"] = signi[range(nsou), index]
sources["radius"] = radius[index]
sources["excess"] = excess[range(nsou), index]
sources
[14]:
Table length=8
sqrt(TS)_PSxyradecsignificanceradiusexcess
degdeg
float64int64int64float64float64float64float64float64
32.97220099266.41449-28.9705430.0315282778893470.2405.75250244140625
27.9955260272.43197-23.5428225.6331656567420860.15146.97657775878906
16.183298271.16056-21.7447923.8959217401539520.4398.72711181640625
14.9396993270.40919-23.4779720.8242126658741750.5404.0583801269531
14.8428092270.15899-23.9804921.7062666386120460.5414.585693359375
13.95136102270.86716-21.8207621.2205001658572460.5401.30413818359375
9.7245273119263.18257-31.525878.7254180221417230.0527.931381225585938
8.8532124102268.46711-25.633267.7395180970857940.0524.57577896118164
[15]:
# Plot candidates sources on top of significance sky image with radius guess
plt.figure(figsize=(15, 5))

_, ax, _ = images["sqrt_ts"].plot(add_cbar=True, cmap=cm.Greys_r)

phi = np.arange(0, 2 * np.pi, 0.01)
for ks in range(nsou):
    x = sources["x"][ks] + sources["radius"][ks] / pixsize * np.cos(phi)
    y = sources["y"][ks] + sources["radius"][ks] / pixsize * np.sin(phi)
    ax.plot(x, y, "-", color=colors[ks], lw=1.5);
../_images/notebooks_detect_26_0.png

Note that the optimal radius of nested sources is likely overestimated due to their neighbor. We limited this example to only the most significant source above ~8 sigma. When lowering the detection threshold the number of candidated increase together with the source confusion.

What next?

In this notebook, we have seen how to work with images and compute TS and significance images from counts data, if a background estimate is already available.

Here’s some suggestions what to do next:

  • Look how background estimation is performed for IACTs with and without the high-level interface in analysis_1 and analysis_2 notebooks, respectively

  • Learn about 2D model fitting in the image_analysis notebook

  • find more about Fermi-LAT data analysis in the fermi_lat notebook

  • Use source candidates to build a model and perform a 3D fitting (see analysis_3d, analysis_mwl notebooks for some hints)

[ ]: