Note

You are not reading the stable version of Gammapy documentation.
Access the latest stable version v1.3 or the list of Gammapy releases.

Derivation of the WStat formula

you can write down the likelihood formula as

L(non,noff,α;μsig,μbkg)=(μsig+μbkg)nonnon!exp((μsig+μbkg))×(μbkg/α)noffnoff!exp(μbkg/α),

where μsig and μbkg are respectively the number of expected signal and background counts in the ON region, as defined in the Introduction. By taking two time the negative log likelihood and neglecting model independent and thus constant terms, we define the WStat.

W=2(μsig+(1+1/α)μbkgnonlog(μsig+μbkg)nofflog(μbkg/α))

In the most general case, where μsrc and μbkg are free the minimum of W is at

μsig=nonαnoffμbkg=αnoff

Profile Likelihood

Most of the times you probably won’t have a model in order to get μbkg. The strategy in this case is to treat μbkg as so-called nuisance parameter, i.e. a free parameter that is of no physical interest. Of course you don’t want an additional free parameter for each bin during a fit. Therefore one calculates an estimator for μbkg by analytically minimizing the likelihood function. This is called ‘profile likelihood’.

dlogLdμbkg=0

This yields a quadratic equation for μbkg

αnonμsig+αμbkg+noffμbkg(α+1)=0

with the solution

μbkg=C+D2α(α+1)

where

C=α(non+noff)(α+1)μsigD2=C2+4(α+1)αnoffμsig

Goodness of fit

The best-fit value of the WStat as defined now contains no information about the goodness of the fit. We consider the likelihood of the data non and noff under the expectation of non and noff,

L(non,noff;non,noff)=nnononnon!exp(non)×nnoffoffnoff!exp(noff)

and add twice the log likelihood

2logL(non,noff;non,noff)=2(non(log(non)1)+noff(log(noff)1))

to WStat. In doing so, we are computing the likelihood ratio:

2logL(non,noff,α;μsig,μbkg)L(non,noff;non,noff)

Intuitively, this log-likelihood ratio should asymptotically behave like a chi-square with m-n degrees of freedom, where m is the number of measurements and n the number of model parameters.

Final result

W=2(μsig+(1+α)μbkgnonnoffnon(log(μsig+αμbkg)log(non))noff(log(μbkg)log(noff)))

Special cases

The above formula is undefined if non or noff are equal to zero, because of the nlogn terms, that were introduced by adding the goodness of fit terms. These cases are treated as follows.

If non=0 the likelihood formulae read

L(0,noff,α;μsig,μbkg)=exp((μsig+αμbkg))×(μbkg)noffnoff!exp(μbkg),

and

L(0,noff;0,noff)=nnoffoffnoff!exp(noff)

WStat is derived by taking 2 times the negative log likelihood and adding the goodness of fit term as ever

W=2(μsig+(1+α)μbkgnoffnoff(log(μbkg)log(noff)))

Note that this is the limit of the original Wstat formula for non0.

The analytical result for μbkg in this case reads:

μbkg=noffα+1

When inserting this into the WStat we find the simplified expression.

W=2(μsig+nofflog(1+α))

If noff=0 Wstat becomes

W=2(μsig+(1+α)μbkgnonnon(log(μsig+αμbkg)log(non))

and

μbkg=non1+αμsigα

For μsig>non(α1+α), μbkg becomes negative which is unphysical.

Therefore we distinct two cases. The physical one where

μsig<non(α1+α).

is straightforward and gives

W=2(μsig(1α)+nonlog(α1+α))

For the unphysical case, we set μbkg=0 and arrive at

W=2(μsig+non(log(non)log(μsig)1))