{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "\n", "**This is a fixed-text formatted version of a Jupyter notebook**\n", "\n", "- Try online [![Binder](https://static.mybinder.org/badge.svg)](https://mybinder.org/v2/gh/gammapy/gammapy-webpage/v0.19?urlpath=lab/tree/tutorials/api/fitting.ipynb)\n", "- You may download all the notebooks in the documentation as a\n", "[tar file](../../_downloads/notebooks-0.19.tar).\n", "- **Source files:**\n", "[fitting.ipynb](../../_static/notebooks/fitting.ipynb) |\n", "[fitting.py](../../_static/notebooks/fitting.py)\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Fitting\n", "\n", "\n", "## Prerequisites\n", "\n", "- Knowledge of spectral analysis to produce 1D On-Off datasets, [see the following tutorial](../analysis/1D/spectral_analysis.ipynb)\n", "- Reading of pre-computed datasets, [see the MWL tutorial](../analysis/3D/analysis_mwl.ipynb)\n", "- General knowledge of statistics and optimization methods\n", "\n", "## Proposed approach\n", "\n", "This is a hands-on tutorial to `~gammapy.modeling`, showing how to perform a fit in Gammapy. The emphasis here is on interfacing with the `Fit` class and inspecting the errors. To see an analysis example of how datasets and models interact, see the [model management notebook](model_management.ipynb). As an example, in this notebook, we are going to work with HESS data of the Crab Nebula and show in particular how to:\n", "- perform a spectral analysis\n", "- use different fitting backends\n", "- access covariance matrix information and parameter errors\n", "- compute likelihood profiles\n", "- compute confidence contours\n", "\n", "See also: [Models gallery tutorial](models.ipynb) and `docs/modeling/index.rst`.\n", "\n", "\n", "## The setup" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:08.661595Z", "iopub.status.busy": "2021-11-22T21:09:08.660604Z", "iopub.status.idle": "2021-11-22T21:09:09.355080Z", "shell.execute_reply": "2021-11-22T21:09:09.355275Z" } }, "outputs": [], "source": [ "import numpy as np\n", "from astropy import units as u\n", "import matplotlib.pyplot as plt\n", "from matplotlib.ticker import StrMethodFormatter\n", "import scipy.stats as st\n", "from gammapy.modeling import Fit\n", "from gammapy.datasets import Datasets, SpectrumDatasetOnOff\n", "from gammapy.modeling.models import LogParabolaSpectralModel, SkyModel\n", "from gammapy.visualization.utils import plot_contour_line\n", "from itertools import combinations" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "## Model and dataset\n", "\n", "First we define the source model. Here we need only a spectral model, for which we choose a log-parabola:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.362612Z", "iopub.status.busy": "2021-11-22T21:09:09.362317Z", "iopub.status.idle": "2021-11-22T21:09:09.363532Z", "shell.execute_reply": "2021-11-22T21:09:09.363725Z" } }, "outputs": [], "source": [ "crab_spectrum = LogParabolaSpectralModel(\n", "    amplitude=1e-11 / u.cm ** 2 / u.s / u.TeV,\n", "    reference=1 * u.TeV,\n", "    alpha=2.3,\n", "    beta=0.2,\n", ")\n", "\n", "crab_spectrum.alpha.max = 3\n", "crab_spectrum.alpha.min = 1\n", "crab_model = SkyModel(spectral_model=crab_spectrum, name=\"crab\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data and background are read from pre-computed ON/OFF datasets of HESS observations; for simplicity we stack them together.\n", "Then we assign the model and the fit range to the resulting dataset." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.366152Z", "iopub.status.busy": "2021-11-22T21:09:09.365835Z", "iopub.status.idle": "2021-11-22T21:09:09.452875Z", "shell.execute_reply": "2021-11-22T21:09:09.453053Z" } }, "outputs": [], "source": [ "datasets = []\n", "for obs_id in [23523, 23526]:\n", "    dataset = SpectrumDatasetOnOff.read(\n", "        f\"$GAMMAPY_DATA/joint-crab/spectra/hess/pha_obs{obs_id}.fits\"\n", "    )\n", "    datasets.append(dataset)\n", "\n", "dataset_hess = Datasets(datasets).stack_reduce(name=\"HESS\")\n", "\n", "# Set model and fit range\n", "dataset_hess.models = crab_model\n", "e_min = 0.66 * u.TeV\n", "e_max = 30 * u.TeV\n", "dataset_hess.mask_fit = dataset_hess.counts.geom.energy_mask(e_min, e_max)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fitting options\n", "\n", "\n", "\n", "First let's create a `Fit` instance:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.455137Z", "iopub.status.busy": "2021-11-22T21:09:09.454833Z", "iopub.status.idle": "2021-11-22T21:09:09.456105Z", "shell.execute_reply": "2021-11-22T21:09:09.456270Z" } }, "outputs": [], "source": [ "scipy_opts = {\n", "    \"method\": \"L-BFGS-B\",\n", "    \"options\": {\"ftol\": 1e-4, \"gtol\": 1e-05},\n", "    \"backend\": \"scipy\",\n", "}\n", "fit_scipy = Fit(store_trace=True, optimize_opts=scipy_opts)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default the fit is performed using MINUIT. You can select alternative optimizers and set their options using the `backend` and `optimize_opts` arguments of the `Fit` class, as done above. In addition, we have requested that the trace of parameter values be stored during the fit.\n", "\n", "Note that, for now, the covariance matrix and errors are computed only when fitting with MINUIT. 
However, depending on the problem, other optimizers can perform better, so sometimes it can be useful to run a pre-fit with alternative optimization methods.\n", "\n", "For the \"scipy\" backend the available options are described in detail here: \n", "https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.533196Z", "iopub.status.busy": "2021-11-22T21:09:09.462717Z", "iopub.status.idle": "2021-11-22T21:09:09.578563Z", "shell.execute_reply": "2021-11-22T21:09:09.578750Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 116 ms, sys: 3.69 ms, total: 120 ms\n", "Wall time: 119 ms\n" ] } ], "source": [ "%%time\n", "result_scipy = fit_scipy.run([dataset_hess])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the \"sherpa\" backend you can choose the optimization algorithm between method = {\"simplex\", \"levmar\", \"moncar\", \"gridsearch\"}. 
\n", "Those methods are described and compared in detail on http://cxc.cfa.harvard.edu/sherpa/methods/index.html \n", "The available options of the optimization methods are described on the following page https://cxc.cfa.harvard.edu/sherpa/methods/opt_methods.html" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.581348Z", "iopub.status.busy": "2021-11-22T21:09:09.580986Z", "iopub.status.idle": "2021-11-22T21:09:09.936568Z", "shell.execute_reply": "2021-11-22T21:09:09.936757Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "No covariance estimate - not supported by this backend.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 188 ms, sys: 3.23 ms, total: 191 ms\n", "Wall time: 355 ms\n" ] } ], "source": [ "%%time\n", "sherpa_opts = {\"method\": \"simplex\", \"ftol\": 1e-3, \"maxfev\": int(1e4)}\n", "fit_sherpa = Fit(store_trace=True, backend=\"sherpa\", optimize_opts=sherpa_opts)\n", "results_simplex = fit_sherpa.run([dataset_hess])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the \"minuit\" backend see https://iminuit.readthedocs.io/en/latest/reference.html for a detailed description of the available options. If there is an entry `migrad_opts`, those options will be passed to [iminuit.Minuit.migrad](https://iminuit.readthedocs.io/en/latest/reference.html#iminuit.Minuit.migrad). Additionally you can set the fit tolerance using the [tol](https://iminuit.readthedocs.io/en/latest/reference.html#iminuit.Minuit.tol) option. The minimization will stop when the estimated distance to the minimum is less than 0.001*tol (by default tol=0.1). The [strategy](https://iminuit.readthedocs.io/en/latest/reference.html#iminuit.Minuit.strategy) option changes the speed and accuracy of the optimizer: 0 fast, 1 default, 2 slow but accurate. 
If you want more reliable error estimates, you should run the final fit with strategy 2.\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.983754Z", "iopub.status.busy": "2021-11-22T21:09:09.942622Z", "iopub.status.idle": "2021-11-22T21:09:09.985034Z", "shell.execute_reply": "2021-11-22T21:09:09.985209Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 44.7 ms, sys: 1.78 ms, total: 46.5 ms\n", "Wall time: 45.1 ms\n" ] } ], "source": [ "%%time\n", "fit = Fit(store_trace=True)\n", "minuit_opts = {\"tol\": 0.001, \"strategy\": 1}\n", "fit.backend = \"minuit\"\n", "fit.optimize_opts = minuit_opts\n", "result_minuit = fit.run([dataset_hess])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fit quality assessment\n", "\n", "There are various ways to check the convergence and quality of a fit. Among them:\n", "\n", "Refer to the automatically-generated results dictionary:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.986983Z", "iopub.status.busy": "2021-11-22T21:09:09.986702Z", "iopub.status.idle": "2021-11-22T21:09:09.987924Z", "shell.execute_reply": "2021-11-22T21:09:09.988083Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "OptimizeResult\n", "\n", "\tbackend : scipy\n", "\tmethod : L-BFGS-B\n", "\tsuccess : True\n", "\tmessage : CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH\n", "\tnfev : 60\n", "\ttotal stat : 30.35\n", "\n", "OptimizeResult\n", "\n", "\tbackend : scipy\n", "\tmethod : L-BFGS-B\n", "\tsuccess : True\n", "\tmessage : CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH\n", "\tnfev : 60\n", "\ttotal stat : 30.35\n", "\n", "\n" ] } ], "source": [ "print(result_scipy)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.989724Z", "iopub.status.busy": 
"2021-11-22T21:09:09.989424Z", "iopub.status.idle": "2021-11-22T21:09:09.990617Z", "shell.execute_reply": "2021-11-22T21:09:09.990775Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "OptimizeResult\n", "\n", "\tbackend : sherpa\n", "\tmethod : simplex\n", "\tsuccess : True\n", "\tmessage : Optimization terminated successfully\n", "\tnfev : 135\n", "\ttotal stat : 30.35\n", "\n", "\n" ] } ], "source": [ "print(results_simplex)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.992560Z", "iopub.status.busy": "2021-11-22T21:09:09.992272Z", "iopub.status.idle": "2021-11-22T21:09:09.993356Z", "shell.execute_reply": "2021-11-22T21:09:09.993525Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "OptimizeResult\n", "\n", "\tbackend : minuit\n", "\tmethod : migrad\n", "\tsuccess : True\n", "\tmessage : Optimization terminated successfully.\n", "\tnfev : 37\n", "\ttotal stat : 30.35\n", "\n", "OptimizeResult\n", "\n", "\tbackend : minuit\n", "\tmethod : migrad\n", "\tsuccess : True\n", "\tmessage : Optimization terminated successfully.\n", "\tnfev : 37\n", "\ttotal stat : 30.35\n", "\n", "\n" ] } ], "source": [ "print(result_minuit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the trace of the fit e.g. in case the fit did not converge properly" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:09.996766Z", "iopub.status.busy": "2021-11-22T21:09:09.996429Z", "iopub.status.idle": "2021-11-22T21:09:09.998260Z", "shell.execute_reply": "2021-11-22T21:09:09.998434Z" } }, "outputs": [ { "data": { "text/html": [ "
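<i>Table length=37</i>\n", "<table>\n",
"<thead><tr><th>total_stat</th><th>crab.spectral.amplitude</th><th>crab.spectral.alpha</th><th>crab.spectral.beta</th></tr></thead>\n",
"<thead><tr><th>float64</th><th>float64</th><th>float64</th><th>float64</th></tr></thead>\n",
"<tr><td>30.349530550400765</td><td>3.812242568529405e-11</td><td>2.1957469269201617</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.34972457253742</td><td>3.815793736236919e-11</td><td>2.1957469269201617</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.34971173244832</td><td>3.8086914008218915e-11</td><td>2.1957469269201617</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.349539326752044</td><td>3.812951391706827e-11</td><td>2.1957469269201617</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.349536722665142</td><td>3.811533745351983e-11</td><td>2.1957469269201617</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.35050886992802</td><td>3.812242568529405e-11</td><td>2.198424618760706</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.350543934671048</td><td>3.812242568529405e-11</td><td>2.193067774847982</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.34953899182041</td><td>3.812242568529405e-11</td><td>2.196014762144314</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.349542025993827</td><td>3.812242568529405e-11</td><td>2.1954790770936845</td><td>0.22648272085464238</td></tr>\n",
"<tr><td>30.35029517198719</td><td>3.812242568529405e-11</td><td>2.1957469269201617</td><td>0.22788952829627426</td></tr>\n",
"<tr><td>30.35033282113968</td><td>3.812242568529405e-11</td><td>2.1957469269201617</td><td>0.22507591341301048</td></tr>\n",
"<tr><td>30.349536744191</td><td>3.812242568529405e-11</td><td>2.1957469269201617</td><td>0.22662340159880556</td></tr>\n",
"<tr><td>30.34954002542583</td><td>3.812242568529405e-11</td><td>2.1957469269201617</td><td>0.22634204011047918</td></tr>\n",
"<tr><td>30.349530632960807</td><td>3.812180829128438e-11</td><td>2.195767328548783</td><td>0.22649745095864532</td></tr>\n",
"<tr><td>30.349530467417637</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349537982307382</td><td>3.8129257953704704e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.34953790131041</td><td>3.8115081487628087e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349538145487614</td><td>3.81221697206664e-11</td><td>2.195987422083998</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349537737092085</td><td>3.81221697206664e-11</td><td>2.195523337389243</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349537804377285</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.2266262374110642</td></tr>\n",
"<tr><td>30.349538077967896</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.22635141816984033</td></tr>\n",
"<tr><td>30.349530467417637</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349537982307382</td><td>3.8129257953704704e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.34953790131041</td><td>3.8115081487628087e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349538145487614</td><td>3.81221697206664e-11</td><td>2.195987422083998</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349537737092085</td><td>3.81221697206664e-11</td><td>2.195523337389243</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349537804377285</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.2266262374110642</td></tr>\n",
"<tr><td>30.349538077967896</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.22635141816984033</td></tr>\n",
"<tr><td>30.34953077465702</td><td>3.8123587367274056e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349530758129603</td><td>3.8120752074058735e-11</td><td>2.195755385216711</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349530807510003</td><td>3.81221697206664e-11</td><td>2.195801793467399</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.349530725235304</td><td>3.81221697206664e-11</td><td>2.195708976527616</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.34953073944574</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.22651630971457465</td></tr>\n",
"<tr><td>30.3495307932899</td><td>3.81221697206664e-11</td><td>2.195755385216711</td><td>0.22646134586632988</td></tr>\n",
"<tr><td>30.34953581489015</td><td>3.8129257953704704e-11</td><td>2.195987422083998</td><td>0.22648882779045226</td></tr>\n",
"<tr><td>30.34953715865348</td><td>3.8129257953704704e-11</td><td>2.195755385216711</td><td>0.2266262374110642</td></tr>\n",
"<tr><td>30.349559366841248</td><td>3.81221697206664e-11</td><td>2.195987422083998</td><td>0.2266262374110642</td></tr>\n",
"</table>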
" ], "text/plain": [ "\n", " total_stat crab.spectral.amplitude crab.spectral.alpha crab.spectral.beta\n", " float64 float64 float64 float64 \n", "------------------ ----------------------- ------------------- -------------------\n", "30.349530550400765 3.812242568529405e-11 2.1957469269201617 0.22648272085464238\n", " 30.34972457253742 3.815793736236919e-11 2.1957469269201617 0.22648272085464238\n", " 30.34971173244832 3.8086914008218915e-11 2.1957469269201617 0.22648272085464238\n", "30.349539326752044 3.812951391706827e-11 2.1957469269201617 0.22648272085464238\n", "30.349536722665142 3.811533745351983e-11 2.1957469269201617 0.22648272085464238\n", " 30.35050886992802 3.812242568529405e-11 2.198424618760706 0.22648272085464238\n", "30.350543934671048 3.812242568529405e-11 2.193067774847982 0.22648272085464238\n", " 30.34953899182041 3.812242568529405e-11 2.196014762144314 0.22648272085464238\n", "30.349542025993827 3.812242568529405e-11 2.1954790770936845 0.22648272085464238\n", " 30.35029517198719 3.812242568529405e-11 2.1957469269201617 0.22788952829627426\n", " 30.35033282113968 3.812242568529405e-11 2.1957469269201617 0.22507591341301048\n", " 30.349536744191 3.812242568529405e-11 2.1957469269201617 0.22662340159880556\n", " 30.34954002542583 3.812242568529405e-11 2.1957469269201617 0.22634204011047918\n", "30.349530632960807 3.812180829128438e-11 2.195767328548783 0.22649745095864532\n", "30.349530467417637 3.81221697206664e-11 2.195755385216711 0.22648882779045226\n", "30.349537982307382 3.8129257953704704e-11 2.195755385216711 0.22648882779045226\n", " 30.34953790131041 3.8115081487628087e-11 2.195755385216711 0.22648882779045226\n", "30.349538145487614 3.81221697206664e-11 2.195987422083998 0.22648882779045226\n", "30.349537737092085 3.81221697206664e-11 2.195523337389243 0.22648882779045226\n", "30.349537804377285 3.81221697206664e-11 2.195755385216711 0.2266262374110642\n", "30.349538077967896 3.81221697206664e-11 2.195755385216711 
0.22635141816984033\n", "30.349530467417637 3.81221697206664e-11 2.195755385216711 0.22648882779045226\n", "30.349537982307382 3.8129257953704704e-11 2.195755385216711 0.22648882779045226\n", " 30.34953790131041 3.8115081487628087e-11 2.195755385216711 0.22648882779045226\n", "30.349538145487614 3.81221697206664e-11 2.195987422083998 0.22648882779045226\n", "30.349537737092085 3.81221697206664e-11 2.195523337389243 0.22648882779045226\n", "30.349537804377285 3.81221697206664e-11 2.195755385216711 0.2266262374110642\n", "30.349538077967896 3.81221697206664e-11 2.195755385216711 0.22635141816984033\n", " 30.34953077465702 3.8123587367274056e-11 2.195755385216711 0.22648882779045226\n", "30.349530758129603 3.8120752074058735e-11 2.195755385216711 0.22648882779045226\n", "30.349530807510003 3.81221697206664e-11 2.195801793467399 0.22648882779045226\n", "30.349530725235304 3.81221697206664e-11 2.195708976527616 0.22648882779045226\n", " 30.34953073944574 3.81221697206664e-11 2.195755385216711 0.22651630971457465\n", " 30.3495307932899 3.81221697206664e-11 2.195755385216711 0.22646134586632988\n", " 30.34953581489015 3.8129257953704704e-11 2.195987422083998 0.22648882779045226\n", " 30.34953715865348 3.8129257953704704e-11 2.195755385216711 0.2266262374110642\n", "30.349559366841248 3.81221697206664e-11 2.195987422083998 0.2266262374110642" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result_minuit.trace" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that the fitted values and errors for all parameters are reasonable, and no fitted parameter value is \"too close\" - or even outside - its allowed min-max range" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:10.001464Z", "iopub.status.busy": "2021-11-22T21:09:10.001151Z", "iopub.status.idle": "2021-11-22T21:09:10.002550Z", "shell.execute_reply": "2021-11-22T21:09:10.002785Z" } }, 
"outputs": [ { "data": { "text/html": [ "
<i>Table length=4</i>\n", "<table>\n",
"<thead><tr><th>type</th><th>name</th><th>value</th><th>unit</th><th>error</th><th>min</th><th>max</th><th>frozen</th><th>link</th></tr></thead>\n",
"<thead><tr><th>str8</th><th>str9</th><th>float64</th><th>str14</th><th>float64</th><th>float64</th><th>float64</th><th>bool</th><th>str1</th></tr></thead>\n",
"<tr><td>spectral</td><td>amplitude</td><td>3.8122e-11</td><td>cm-2 s-1 TeV-1</td><td>3.546e-12</td><td>nan</td><td>nan</td><td>False</td><td></td></tr>\n",
"<tr><td>spectral</td><td>reference</td><td>1.0000e+00</td><td>TeV</td><td>0.000e+00</td><td>nan</td><td>nan</td><td>True</td><td></td></tr>\n",
"<tr><td>spectral</td><td>alpha</td><td>2.1958e+00</td><td></td><td>2.626e-01</td><td>1.000e+00</td><td>3.000e+00</td><td>False</td><td></td></tr>\n",
"<tr><td>spectral</td><td>beta</td><td>2.2649e-01</td><td></td><td>1.397e-01</td><td>nan</td><td>nan</td><td>False</td><td></td></tr>\n",
"</table>
" ], "text/plain": [ "\n", " type name value unit error min max frozen link\n", " str8 str9 float64 str14 float64 float64 float64 bool str1\n", "-------- --------- ---------- -------------- --------- --------- --------- ------ ----\n", "spectral amplitude 3.8122e-11 cm-2 s-1 TeV-1 3.546e-12 nan nan False \n", "spectral reference 1.0000e+00 TeV 0.000e+00 nan nan True \n", "spectral alpha 2.1958e+00 2.626e-01 1.000e+00 3.000e+00 False \n", "spectral beta 2.2649e-01 1.397e-01 nan nan False " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result_minuit.parameters.to_table()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot fit statistic profiles for all fitted parameters, using `~gammapy.modeling.Fit.stat_profile()`. For a good fit and error estimate each profile should be parabolic. The specification for each fit statistic profile can be changed on the `~gammapy.modeling.Parameter` object, which has `.scan_min`, `.scan_max`, `.scan_n_values` and `.scan_n_sigma` attributes." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:10.023266Z", "iopub.status.busy": "2021-11-22T21:09:10.010269Z", "iopub.status.idle": "2021-11-22T21:09:10.176446Z", "shell.execute_reply": "2021-11-22T21:09:10.176639Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "total_stat = result_minuit.total_stat\n", "\n", "fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(14, 4))\n", "\n", "for ax, par in zip(axes, crab_model.parameters.free_parameters):\n", "    par.scan_n_values = 17\n", "\n", "    profile = fit.stat_profile(datasets=[dataset_hess], parameter=par)\n", "    ax.plot(profile[f\"{par.name}_scan\"], profile[\"stat_scan\"] - total_stat)\n", "    ax.set_xlabel(f\"{par.unit}\")\n", "    ax.set_ylabel(\"Delta TS\")\n", "    ax.set_title(f\"{par.name}: {par.value:.1e} +- {par.error:.1e}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Inspect model residuals. Those can always be accessed using `~Dataset.residuals()`, which returns an array if the fitted `Dataset` is a `SpectrumDataset` and a full cube in the case of a `MapDataset`. For more details, we refer to the dedicated fitting tutorials: [analysis_3d.ipynb](../analysis/3D/analysis_3d.ipynb) (for `MapDataset` fitting) and [spectral_analysis.ipynb](../analysis/1D/spectral_analysis.ipynb) (for `SpectrumDataset` fitting)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Covariance and parameter errors\n", "\n", "After the fit the covariance matrix is attached to the model. 
You can get the error on a specific parameter by accessing the `.error` attribute:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:10.178835Z", "iopub.status.busy": "2021-11-22T21:09:10.178486Z", "iopub.status.idle": "2021-11-22T21:09:10.179865Z", "shell.execute_reply": "2021-11-22T21:09:10.180100Z" } }, "outputs": [ { "data": { "text/plain": [ "0.2625817875684573" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crab_model.spectral_model.alpha.error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And you can plot the total parameter correlation as well:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:10.196367Z", "iopub.status.busy": "2021-11-22T21:09:10.196042Z", "iopub.status.idle": "2021-11-22T21:09:10.261815Z", "shell.execute_reply": "2021-11-22T21:09:10.262076Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "crab_model.covariance.plot_correlation()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an example, this step is needed to produce a butterfly plot showing the envelope of the model taking into account parameter uncertainties." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:10.264241Z", "iopub.status.busy": "2021-11-22T21:09:10.263951Z", "iopub.status.idle": "2021-11-22T21:09:10.530467Z", "shell.execute_reply": "2021-11-22T21:09:10.530719Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "energy_bounds = [1, 10] * u.TeV\n", "crab_spectrum.plot(energy_bounds=energy_bounds, energy_power=2)\n", "ax = crab_spectrum.plot_error(energy_bounds=energy_bounds, energy_power=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Confidence contours\n", "\n", "\n", "In most studies, one wishes to estimate the distribution of model parameters from observed sample data.\n", "A 1-dimensional confidence interval gives an estimated range of values which is likely to include an unknown parameter.\n", "A confidence contour is a 2-dimensional generalization of a confidence interval, often represented as an ellipse around the best-fit value.\n", "\n", "Gammapy offers two ways of computing confidence contours, in the dedicated methods `Fit.stat_contour()` and `Fit.stat_surface()`. In the following sections we will describe them." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An important point to keep in mind is: *what does a $N\\sigma$ confidence contour really mean?* The answer is that it represents the points of the parameter space for which the fit statistic $-2\\mathrm{ln}\\,\\mathcal{L}$ exceeds its minimum by $N^2$. But one always has to keep in mind that **1 standard deviation in two dimensions has a smaller coverage probability than 68%**, and similarly for all other levels. In particular, in 2 dimensions the probability enclosed by the $N\\sigma$ confidence contour is $P(N)=1-e^{-N^2/2}$." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Computing contours using `Fit.stat_contour()` " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After the fit, MINUIT offers the possibility to compute the confidence contours.\n", "Gammapy provides an interface to this functionality through the `Fit` object using the `.stat_contour` method.\n", "Here we define a function to automate the contour production for the different parameters and confidence levels (expressed in terms of sigma):" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:10.533798Z", "iopub.status.busy": "2021-11-22T21:09:10.533494Z", "iopub.status.idle": "2021-11-22T21:09:10.534567Z", "shell.execute_reply": "2021-11-22T21:09:10.534786Z" } }, "outputs": [], "source": [ "def make_contours(fit, datasets, result, npoints, sigmas):\n", "    cts_sigma = []\n", "    for sigma in sigmas:\n", "        contours = dict()\n", "        for par_1, par_2 in combinations([\"alpha\", \"beta\", \"amplitude\"], r=2):\n", "            contour = fit.stat_contour(\n", "                datasets=datasets,\n", "                x=result.parameters[par_1],\n", "                y=result.parameters[par_2],\n", "                numpoints=npoints,\n", "                sigma=sigma,\n", "            )\n", "            contours[f\"contour_{par_1}_{par_2}\"] = {\n", "                par_1: contour[par_1].tolist(),\n", "                par_2: contour[par_2].tolist(),\n", "            }\n", "        cts_sigma.append(contours)\n", "    return cts_sigma" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can compute a few contours." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:10.615692Z", "iopub.status.busy": "2021-11-22T21:09:10.615385Z", "iopub.status.idle": "2021-11-22T21:09:15.046404Z", "shell.execute_reply": "2021-11-22T21:09:15.046594Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 4.49 s, sys: 19.1 ms, total: 4.51 s\n", "Wall time: 4.51 s\n" ] } ], "source": [ "%%time\n", "sigmas = [1, 2]\n", "cts_sigma = make_contours(\n", " fit=fit,\n", " datasets=[dataset_hess],\n", " result=result_minuit,\n", " npoints=10,\n", " sigmas=sigmas,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we prepare some aliases and annotations in order to make the plotting nicer." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.049677Z", "iopub.status.busy": "2021-11-22T21:09:15.049373Z", "iopub.status.idle": "2021-11-22T21:09:15.050552Z", "shell.execute_reply": "2021-11-22T21:09:15.050774Z" } }, "outputs": [], "source": [ "pars = {\n", " \"phi\": r\"$\\phi_0 \\,/\\,(10^{-11}\\,{\\rm TeV}^{-1} \\, {\\rm cm}^{-2} {\\rm s}^{-1})$\",\n", " \"alpha\": r\"$\\alpha$\",\n", " \"beta\": r\"$\\beta$\",\n", "}\n", "\n", "panels = [\n", " {\n", " \"x\": \"alpha\",\n", " \"y\": \"phi\",\n", " \"cx\": (lambda ct: ct[\"contour_alpha_amplitude\"][\"alpha\"]),\n", " \"cy\": (\n", " lambda ct: np.array(1e11)\n", " * ct[\"contour_alpha_amplitude\"][\"amplitude\"]\n", " ),\n", " },\n", " {\n", " \"x\": \"beta\",\n", " \"y\": \"phi\",\n", " \"cx\": (lambda ct: ct[\"contour_beta_amplitude\"][\"beta\"]),\n", " \"cy\": (\n", " lambda ct: np.array(1e11)\n", " * ct[\"contour_beta_amplitude\"][\"amplitude\"]\n", " ),\n", " },\n", " {\n", " \"x\": \"alpha\",\n", " \"y\": \"beta\",\n", " \"cx\": (lambda ct: ct[\"contour_alpha_beta\"][\"alpha\"]),\n", " \"cy\": (lambda ct: ct[\"contour_alpha_beta\"][\"beta\"]),\n", " 
},\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally we produce the confidence contour figures." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.064701Z", "iopub.status.busy": "2021-11-22T21:09:15.064403Z", "iopub.status.idle": "2021-11-22T21:09:15.221256Z", "shell.execute_reply": "2021-11-22T21:09:15.221507Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig, axes = plt.subplots(1, 3, figsize=(16, 5))\n", "colors = [\"m\", \"b\", \"c\"]\n", "for p, ax in zip(panels, axes):\n", "    xlabel = pars[p[\"x\"]]\n", "    ylabel = pars[p[\"y\"]]\n", "    for ks in range(len(cts_sigma)):\n", "        plot_contour_line(\n", "            ax,\n", "            p[\"cx\"](cts_sigma[ks]),\n", "            p[\"cy\"](cts_sigma[ks]),\n", "            lw=2.5,\n", "            color=colors[ks],\n", "            label=f\"{sigmas[ks]}\" + r\"$\\sigma$\",\n", "        )\n", "    ax.set_xlabel(xlabel)\n", "    ax.set_ylabel(ylabel)\n", "plt.legend()\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Computing contours using `Fit.stat_surface()`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This alternative method for the computation of confidence contours, although more time consuming than `Fit.stat_contour()`, is expected to be more stable. It is a generalization of `Fit.stat_profile()` to a 2-dimensional parameter space. The algorithm is very simple:\n", "- First, a 2-dimensional discrete parameter space is defined by passing two arrays of parameter values;\n", "- For each node of the parameter space, the two parameters of interest are frozen, and a likelihood value ($-2\\mathrm{ln}\\,\\mathcal{L}$, actually) is computed, with all nuisance parameters either frozen (default) or refit;\n", "- Finally, a 2-dimensional surface of $-2\\mathrm{ln}(\\mathcal{L})$ values is returned.\n", "\n", "Using that surface, one can easily compute a surface of $TS = -2\\Delta\\mathrm{ln}(\\mathcal{L})$ and compute confidence contours.\n", "\n", "Let's see it step by step." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First of all, we can notice that this method is \"backend-agnostic\", meaning that it can be run with MINUIT, sherpa or scipy as fitting tools. 
Here we will stick with MINUIT, which is the default choice." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an example, we can compute the confidence contour for the `alpha` and `beta` parameters of the `dataset_hess`. Here we define the parameter space:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.224056Z", "iopub.status.busy": "2021-11-22T21:09:15.223759Z", "iopub.status.idle": "2021-11-22T21:09:15.225074Z", "shell.execute_reply": "2021-11-22T21:09:15.225324Z" } }, "outputs": [], "source": [ "result = result_minuit\n", "par_alpha = result.parameters[\"alpha\"]\n", "par_beta = result.parameters[\"beta\"]\n", "\n", "par_alpha.scan_values = np.linspace(1.55, 2.7, 20)\n", "par_beta.scan_values = np.linspace(-0.05, 0.55, 20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we run the algorithm, choosing `reoptimize=False` to save time. In real-life applications, we strongly recommend using `reoptimize=True`, so that all free nuisance parameters are refit at each grid node. This is the correct way, statistically speaking, of computing confidence contours, but is expected to be time consuming." 
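, "\n", "\n", "The difference between the two settings can be sketched in formulas. The notation is introduced here for illustration only: $\\phi$ denotes the free nuisance parameters (in this example the amplitude), hats denote best-fit values, and the nuisance parameters are assumed to sit at their best-fit values when the scan starts. With `reoptimize=True` the scan evaluates the profile statistic\n", "\n", "$$TS_{\\rm profile}(\\alpha,\\beta)=\\min_{\\phi}\\left[-2\\,\\mathrm{ln}\\,\\mathcal{L}(\\alpha,\\beta,\\phi)\\right]+2\\,\\mathrm{ln}\\,\\mathcal{L}(\\hat\\alpha,\\hat\\beta,\\hat\\phi),$$\n", "\n", "whereas with `reoptimize=False` the nuisance parameters stay fixed at $\\phi=\\hat\\phi$:\n", "\n", "$$TS_{\\rm frozen}(\\alpha,\\beta)=-2\\,\\mathrm{ln}\\,\\mathcal{L}(\\alpha,\\beta,\\hat\\phi)+2\\,\\mathrm{ln}\\,\\mathcal{L}(\\hat\\alpha,\\hat\\beta,\\hat\\phi)\\;\\geq\\;TS_{\\rm profile}(\\alpha,\\beta).$$\n", "\n", "Since the frozen statistic can never be smaller than the profiled one, contours drawn from it tend to be tighter and may understate the uncertainties when the scanned parameters are correlated with the nuisance parameters."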
] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.229870Z", "iopub.status.busy": "2021-11-22T21:09:15.229448Z", "iopub.status.idle": "2021-11-22T21:09:15.664468Z", "shell.execute_reply": "2021-11-22T21:09:15.664639Z" } }, "outputs": [], "source": [ "fit = Fit(backend=\"minuit\", optimize_opts={\"print_level\": 0})\n", "stat_surface = fit.stat_surface(\n", " datasets=[dataset_hess],\n", " x=par_alpha,\n", " y=par_beta,\n", " reoptimize=False,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to easily inspect the results, we can convert the $-2\\mathrm{ln}(\\mathcal{L})$ surface to a surface of statistical significance (in units of Gaussian standard deviations from the surface minimum):" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.666513Z", "iopub.status.busy": "2021-11-22T21:09:15.666216Z", "iopub.status.idle": "2021-11-22T21:09:15.667529Z", "shell.execute_reply": "2021-11-22T21:09:15.667706Z" } }, "outputs": [], "source": [ "# Compute TS\n", "TS = stat_surface[\"stat_scan\"] - result.total_stat" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.669326Z", "iopub.status.busy": "2021-11-22T21:09:15.669036Z", "iopub.status.idle": "2021-11-22T21:09:15.670065Z", "shell.execute_reply": "2021-11-22T21:09:15.670236Z" } }, "outputs": [], "source": [ "# Compute the corresponding statistical significance surface\n", "stat_surface = np.sqrt(TS.T)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that, as explained before, $1\\sigma$ contour obtained this way will not contain 68% of the probability, but rather " ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.671800Z", "iopub.status.busy": "2021-11-22T21:09:15.671497Z", 
"iopub.status.idle": "2021-11-22T21:09:15.672542Z", "shell.execute_reply": "2021-11-22T21:09:15.672774Z" } }, "outputs": [], "source": [ "# Compute the corresponding statistical significance surface\n", "# p_value = 1 - st.chi2(df=1).cdf(TS)\n", "# gaussian_sigmas = st.norm.isf(p_value / 2).T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can plot the surface values together with contours:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2021-11-22T21:09:15.690479Z", "iopub.status.busy": "2021-11-22T21:09:15.690187Z", "iopub.status.idle": "2021-11-22T21:09:15.748281Z", "shell.execute_reply": "2021-11-22T21:09:15.748527Z" }, "nbsphinx-thumbnail": { "tooltip": "Learn how the model, dataset and fit Gammapy classes work together in a detailed modeling and fitting use-case." } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(figsize=(8, 6))\n", "x_values = par_alpha.scan_values\n", "y_values = par_beta.scan_values\n", "\n", "# plot surface\n", "im = ax.pcolormesh(x_values, y_values, stat_surface, shading=\"auto\")\n", "fig.colorbar(im, label=\"sqrt(TS)\")\n", "ax.set_xlabel(f\"{par_alpha.name}\")\n", "ax.set_ylabel(f\"{par_beta.name}\")\n", "\n", "# We choose to plot 1 and 2 sigma confidence contours\n", "levels = [1, 2]\n", "contours = ax.contour(\n", " x_values, y_values, stat_surface, levels=levels, colors=\"white\"\n", ")\n", "ax.clabel(contours, fmt=\"%.0f$\\,\\sigma$\", inline=3, fontsize=15);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that, if computed with `reoptimize=True`, this plot would be completely consistent with the third panel of the plot produced with `Fit.stat_contour` (try!)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, it is always remember that confidence contours are approximations. In particular, when the parameter range boundaries are close to the contours lines, it is expected that the statistical meaning of the contours is not well defined. That's why we advise to always choose a parameter space that com contain the contours you're interested in." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1.0, "eqLabelWithNumbers": true, "eqNumInitial": 1.0, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "nbsphinx": { "orphan": true } }, "nbformat": 4, "nbformat_minor": 4 }