{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "\n", "**This is a fixed-text formatted version of a Jupyter notebook**\n", "\n", "- Try online [![Binder](https://static.mybinder.org/badge.svg)](https://mybinder.org/v2/gh/gammapy/gammapy-webpage/v0.17?urlpath=lab/tree/mcmc_sampling.ipynb)\n", "- You can contribute with your own notebooks in this\n", "[GitHub repository](https://github.com/gammapy/gammapy/tree/master/tutorials).\n", "- **Source files:**\n", "[mcmc_sampling.ipynb](../_static/notebooks/mcmc_sampling.ipynb) |\n", "[mcmc_sampling.py](../_static/notebooks/mcmc_sampling.py)\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Fitting and error estimation with MCMC\n", "\n", "## Introduction\n", "\n", "The goal of Markov Chain Monte Carlo (MCMC) algorithms is to approximate the posterior distribution of your model parameters by random sampling in a probabilistic space. For most readers this sentence was probably not very helpful, so we will start straight away with an example. For a more detailed mathematical treatment of the method, see [here](https://www.pas.rochester.edu/~sybenzvi/courses/phy403/2015s/p403_17_mcmc.pdf) and [here](https://github.com/jakevdp/BayesianAstronomy/blob/master/03-Bayesian-Modeling-With-MCMC.ipynb).\n", "\n", "### How does it work ?\n", "\n", "The idea is that we use a number of walkers that will sample the posterior distribution (i.e. sample the Likelihood profile).\n", "\n", "The goal is to produce a \"chain\", i.e. a list of $\\theta$ values, where each $\\theta$ is a vector of parameters for your model.
\n", "If you start far away from the true value, the chain will take some time to converge until it reaches a stationary state. Once it has reached this stage, each successive element of the chain is a sample of the target posterior distribution.
\n", "This means that, once we have obtained the chain of samples, we have everything we need. We can approximate the distribution of each parameter by the histogram of the samples projected onto that parameter. This will provide the errors and correlations between parameters.\n", "\n", "\n", "Now let's try to put a picture on the ideas described above. With this notebook, we have simulated and carried out an MCMC analysis for a source with the following parameters:
\n", "$Index=2.0$, $Norm=5\\times10^{-12}$ cm$^{-2}$ s$^{-1}$ TeV$^{-1}$, $Lambda =(1/Ecut) = 0.02$ TeV$^{-1}$ (50 TeV) for 20 hours.\n", "\n", "The results that you can get from an MCMC analysis will look like this:\n", "\n", "\n", "\n", "In the top two panels, we show the pseudo-random walk of one walker from an offset starting value to see it evolve to a better solution.\n", "In the bottom right panel, we show the trace of each of the 16 walkers for 500 runs (the chain described previously). For the first 100 runs, the parameters evolve towards a solution (this can be viewed as a fitting step). Then they explore the local minimum for 400 runs, which will be used to estimate the parameter correlations and errors.\n", "The choice of the Nburn value (the number of runs after which the walkers have reached a stationary stage) can be done by eye but you can also look at the autocorrelation time.\n", "\n", "### Why should I use it ?\n", "\n", "When it comes to evaluating errors and investigating parameter correlations, one typically estimates the likelihood in a gridded search (2D likelihood profiles). Each point of the grid implies a new model fit. If we use 10 steps for each parameter, we will need to carry out 100 fitting procedures for a single pair of parameters. \n", "\n", "Now let's say that we have a model with $N$ parameters: we need to carry out that gridded analysis $N*(N-1)$ times. \n", "So for 5 free parameters you need 20 gridded searches, resulting in 2000 individual fits. \n", "Clearly this strategy doesn't scale well to high-dimensional models.\n", "\n", "Just for fun: if each fit procedure takes 10 s, we're talking about 5.6 h of computing time to estimate the correlation plots. \n", "\n", "There are many MCMC packages in the Python ecosystem but here we will focus on [emcee](https://emcee.readthedocs.io), a lightweight Python package. A description is provided here: [Foreman-Mackey, Hogg, Lang & Goodman (2012)](https://arxiv.org/abs/1202.3665)." 
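, "\n", "\n", "As a rough sketch, the cost estimate for the gridded search given earlier in this section can be written out explicitly. It assumes, as in the text, $N(N-1)$ parameter pairs (counting each unordered pair only once would halve the total), a grid of $10 \\times 10$ points per pair and $10$ s per fit. The symbols $N_{\\rm fits}$ and $T$ are introduced here only for illustration:\n", "\n", "$$N_{\\rm fits} = N(N-1) \\times 10^2 = 5 \\times 4 \\times 100 = 2000, \\qquad T = N_{\\rm fits} \\times 10\\,\\mathrm{s} = 2 \\times 10^4\\,\\mathrm{s} \\approx 5.6\\,\\mathrm{h}$$"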
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import astropy.units as u\n", "from astropy.coordinates import SkyCoord\n", "from gammapy.irf import load_cta_irfs\n", "from gammapy.maps import WcsGeom, MapAxis\n", "from gammapy.modeling.models import (\n", "    ExpCutoffPowerLawSpectralModel,\n", "    GaussianSpatialModel,\n", "    SkyModel,\n", ")\n", "from gammapy.datasets import MapDataset\n", "from gammapy.makers import MapDatasetMaker\n", "from gammapy.data import Observation\n", "from gammapy.modeling.sampling import (\n", "    run_mcmc,\n", "    par_to_model,\n", "    plot_corner,\n", "    plot_trace,\n", ")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import logging\n", "\n", "logging.basicConfig(level=logging.INFO)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simulate an observation\n", "\n", "Here we will start by simulating an observation: we build a `MapDataset` with `MapDatasetMaker`, attach a sky model and draw random counts with the `fake()` method." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "irfs = load_cta_irfs(\n", "    \"$GAMMAPY_DATA/cta-1dc/caldb/data/cta/1dc/bcf/South_z20_50h/irf_file.fits\"\n", ")\n", "\n", "observation = Observation.create(\n", "    pointing=SkyCoord(0 * u.deg, 0 * u.deg, frame=\"galactic\"),\n", "    livetime=20 * u.h,\n", "    irfs=irfs,\n", ")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "SkyModel\n", "\n", " Name : source\n", " Datasets names : None\n", " Spectral model type : ExpCutoffPowerLawSpectralModel\n", " Spatial model type : GaussianSpatialModel\n", " Temporal model type : None\n", " Parameters:\n", " index : 2.000 \n", " amplitude : 3.00e-12 1 / (cm2 s TeV)\n", " reference (frozen) : 1.000 TeV \n", " lambda_ : 0.050 1 / TeV \n", " alpha (frozen) : 1.000 \n", " lon_0 : 0.000 deg \n", " lat_0 : 0.000 deg \n", " sigma : 0.200 deg \n", " e (frozen) : 0.000 \n", " phi (frozen) : 0.000 deg \n", "\n", "\n" ] } ], "source": [ "# Define sky model to simulate the data\n", "spatial_model = GaussianSpatialModel(\n", "    lon_0=\"0 deg\", lat_0=\"0 deg\", sigma=\"0.2 deg\", frame=\"galactic\"\n", ")\n", "\n", "spectral_model = ExpCutoffPowerLawSpectralModel(\n", "    index=2,\n", "    amplitude=\"3e-12 cm-2 s-1 TeV-1\",\n", "    reference=\"1 TeV\",\n", "    lambda_=\"0.05 TeV-1\",\n", ")\n", "\n", "sky_model_simu = SkyModel(\n", "    spatial_model=spatial_model, spectral_model=spectral_model, name=\"source\"\n", ")\n", "print(sky_model_simu)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Define map geometry\n", "axis = MapAxis.from_edges(\n", "    np.logspace(-1, 2, 30), unit=\"TeV\", name=\"energy\", interp=\"log\"\n", ")\n", "geom = WcsGeom.create(\n", "    skydir=(0, 0), binsz=0.05, width=(2, 2), frame=\"galactic\", axes=[axis]\n", ")\n", "\n", "empty_dataset = MapDataset.create(geom=geom, name=\"dataset-mcmc\")\n", "maker = 
MapDatasetMaker(selection=[\"background\", \"edisp\", \"psf\", \"exposure\"])\n", "dataset = maker.run(empty_dataset, observation)\n", "dataset.models.append(sky_model_simu)\n", "dataset.fake()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "