Derivation of the WStat formula
You can write down the likelihood as
\[L (n_{\mathrm{on}}, n_{\mathrm{off}}, \alpha; \mu_{\mathrm{sig}},
\mu_{\mathrm{bkg}}) = \frac{(\mu_{\mathrm{sig}}+
\mu_{\mathrm{bkg}})^{n_{\mathrm{on}}}}{n_{\mathrm{on}} !}
\exp{(-(\mu_{\mathrm{sig}}+ \mu_{\mathrm{bkg}}))}\times
\frac{(\mu_{\mathrm{bkg}}/\alpha)^{n_{\mathrm{off}}}}{n_{\mathrm{off}}
!}\exp{(-\mu_{\mathrm{bkg}}/\alpha)},\]
where \(\mu_{\mathrm{sig}}\) and \(\mu_{\mathrm{bkg}}\) are respectively
the number of expected signal and background counts in the ON region,
as defined in the Notations. Taking two
times the negative log-likelihood and neglecting model-independent, and thus
constant, terms, we define the WStat:
\[W = 2 \big(\mu_{\mathrm{sig}} + (1 + 1/\alpha)\mu_{\mathrm{bkg}}
- n_{\mathrm{on}} \log{(\mu_{\mathrm{sig}} + \mu_{\mathrm{bkg}})}
- n_{\mathrm{off}} \log{(\mu_{\mathrm{bkg}}/\alpha)}\big)\]
In the most general case, where \(\mu_{\mathrm{sig}}\) and
\(\mu_{\mathrm{bkg}}\) are both free, the minimum of \(W\) is at
\[\begin{split}\mu_{\mathrm{sig}} = n_{\mathrm{on}} - \alpha\,n_{\mathrm{off}} \\
\mu_{\mathrm{bkg}} = \alpha\,n_{\mathrm{off}}\end{split}\]
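The stated minimum can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `wstat_raw` and the example counts are assumptions made for illustration only. It evaluates \(W\) (without goodness-of-fit terms) at the stated minimum and at small perturbations around it.

```python
import math


def wstat_raw(n_on, n_off, alpha, mu_sig, mu_bkg):
    """Twice the negative log-likelihood, with model-independent terms dropped."""
    return 2.0 * (
        mu_sig
        + (1.0 + 1.0 / alpha) * mu_bkg
        - n_on * math.log(mu_sig + mu_bkg)
        - n_off * math.log(mu_bkg / alpha)
    )


# Illustrative counts (made up for this example, not taken from any data set)
n_on, n_off, alpha = 25, 40, 0.25

# Stated location of the minimum when both parameters are free
mu_sig_hat = n_on - alpha * n_off
mu_bkg_hat = alpha * n_off
w_min = wstat_raw(n_on, n_off, alpha, mu_sig_hat, mu_bkg_hat)

# Evaluate W on a small grid around the stated minimum
eps = 1e-3
neighbours = [
    wstat_raw(n_on, n_off, alpha, mu_sig_hat + ds, mu_bkg_hat + db)
    for ds in (-eps, 0.0, eps)
    for db in (-eps, 0.0, eps)
]

# True if no neighbouring point has a smaller W than the stated minimum
is_minimum = all(w >= w_min for w in neighbours)
```

Since \(W\) is convex in \((\mu_{\mathrm{sig}}, \mu_{\mathrm{bkg}})\), a local check of this kind is sufficient to confirm the global minimum.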
Profile likelihood
Most of the time you probably won’t have a model from which to obtain
\(\mu_{\mathrm{bkg}}\). The strategy in this case is to treat
\(\mu_{\mathrm{bkg}}\) as a so-called nuisance parameter, i.e. a free
parameter that is of no physical interest. Of course you don’t want an
additional free parameter for each bin during a fit. Therefore one calculates an
estimator for \(\mu_{\mathrm{bkg}}\) by analytically maximizing the
likelihood function with respect to it. This is called ‘profile likelihood’.
\[\frac{\mathrm d \log L}{\mathrm d \mu_{\mathrm{bkg}}} = 0\]
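This yields the following condition, which becomes a quadratic equation for
\(\mu_{\mathrm{bkg}}\) after multiplying through by
\(\mu_{\mathrm{bkg}}(\mu_{\mathrm{sig}}+\mu_{\mathrm{bkg}})\),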
\[\frac{\alpha n_{\mathrm{on}}}{\mu_{\mathrm{sig}}+\mu_{\mathrm{bkg}}} + \frac{\alpha n_{\mathrm{off}}}{\mu_{\mathrm{bkg}}} - (\alpha
+ 1) = 0\]
with the solution
\[\mu_{\mathrm{bkg}} = \frac{C + D}{2(\alpha + 1)}\]
where
\[\begin{split}C = \alpha(n_{\mathrm{on}} + n_{\mathrm{off}}) - (\alpha+1)\mu_{\mathrm{sig}} \\
D^2 = C^2 + 4 (\alpha+1)\alpha n_{\mathrm{off}} \mu_{\mathrm{sig}}\end{split}\]
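As a sanity check, the closed-form solution can be substituted back into the stationarity condition. The sketch below assumes the root \(C + D\) quoted above; the function names `profile_mu_bkg` and `stationarity_residual` and the example numbers are hypothetical and serve only as illustration.

```python
import math


def profile_mu_bkg(n_on, n_off, alpha, mu_sig):
    """Closed-form background estimate for a fixed mu_sig (root C + D)."""
    c = alpha * (n_on + n_off) - (alpha + 1.0) * mu_sig
    d = math.sqrt(c * c + 4.0 * (alpha + 1.0) * alpha * n_off * mu_sig)
    return (c + d) / (2.0 * (alpha + 1.0))


def stationarity_residual(n_on, n_off, alpha, mu_sig, mu_bkg):
    """Left-hand side of the condition above; zero at the profiled mu_bkg."""
    return (
        alpha * n_on / (mu_sig + mu_bkg)
        + alpha * n_off / mu_bkg
        - (alpha + 1.0)
    )


# Illustrative counts (made up for this example)
n_on, n_off, alpha = 25, 40, 0.25

# At mu_sig = n_on - alpha * n_off the profiled value should equal alpha * n_off
mu_bkg_at_best_fit = profile_mu_bkg(n_on, n_off, alpha, mu_sig=15.0)

# For any other signal value the residual of the condition should vanish
mu_bkg_other = profile_mu_bkg(n_on, n_off, alpha, mu_sig=8.0)
residual = stationarity_residual(n_on, n_off, alpha, 8.0, mu_bkg_other)
```

With these numbers the profiled background at \(\mu_{\mathrm{sig}} = 15\) reproduces \(\alpha\,n_{\mathrm{off}} = 10\), in agreement with the unconstrained minimum.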
Goodness of fit
The best-fit value of the WStat as defined so far contains no information about the
goodness of the fit. To include it, we consider the likelihood of the data
\(n_{\mathrm{on}}\) and \(n_{\mathrm{off}}\) under the expectation of exactly
\(n_{\mathrm{on}}\) and \(n_{\mathrm{off}}\) counts, i.e. the best possible fit,
\[L (n_{\mathrm{on}}, n_{\mathrm{off}}, \alpha; n_{\mathrm{on}} - \alpha n_{\mathrm{off}}, \alpha n_{\mathrm{off}}) =
\frac{n_{\mathrm{on}}^{n_{\mathrm{on}}}}{n_{\mathrm{on}} !}
\exp{(-n_{\mathrm{on}})}\times
\frac{n_{\mathrm{off}}^{n_{\mathrm{off}}}}{n_{\mathrm{off}} !}
\exp{(-n_{\mathrm{off}})}\]
and add twice its log likelihood (again dropping the model-independent factorial terms)
\[2 \log L (n_{\mathrm{on}}, n_{\mathrm{off}}, \alpha; n_{\mathrm{on}} - \alpha n_{\mathrm{off}},
\alpha n_{\mathrm{off}}) = 2 (n_{\mathrm{on}} ( \log{(n_{\mathrm{on}})} - 1 ) +
n_{\mathrm{off}} ( \log{(n_{\mathrm{off}})} - 1))\]
to WStat. In doing so, we are computing the likelihood ratio:
\[-2 \log \frac{L(n_{\mathrm{on}},n_{\mathrm{off}},\alpha;
\mu_{\mathrm{sig}},\mu_{\mathrm{bkg}})}
{L(n_{\mathrm{on}},n_{\mathrm{off}}, \alpha; n_{\mathrm{on}} - \alpha n_{\mathrm{off}}, \alpha n_{\mathrm{off}})}\]
Intuitively, this log-likelihood ratio should asymptotically behave like a
chi-square with \(m - n\) degrees of freedom, where \(m\) is the number of
measurements and \(n\) the number of model parameters.
Final result
\[W = 2 \big(\mu_{\mathrm{sig}} + (1 + \frac{1}{\alpha})\mu_{\mathrm{bkg}} -
n_{\mathrm{on}} - n_{\mathrm{off}} - n_{\mathrm{on}}
(\log{(\mu_{\mathrm{sig}} + \mu_{\mathrm{bkg}}) -
\log{(n_{\mathrm{on}})}}) - n_{\mathrm{off}} (\log(\frac{{\mu_{\mathrm{bkg}}}}{\alpha}) -
\log{(n_{\mathrm{off}})})\big)\]
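A direct consequence of the likelihood-ratio construction is that \(W\) vanishes when the model reproduces the data exactly and is positive otherwise. The following sketch illustrates this; the function name `wstat` and the numbers are assumptions chosen for the example, and the function is only valid for \(n_{\mathrm{on}} > 0\) and \(n_{\mathrm{off}} > 0\).

```python
import math


def wstat(n_on, n_off, alpha, mu_sig, mu_bkg):
    """WStat including goodness-of-fit terms (requires n_on > 0 and n_off > 0)."""
    return 2.0 * (
        mu_sig
        + (1.0 + 1.0 / alpha) * mu_bkg
        - n_on
        - n_off
        - n_on * (math.log(mu_sig + mu_bkg) - math.log(n_on))
        - n_off * (math.log(mu_bkg / alpha) - math.log(n_off))
    )


# Illustrative counts (made up for this example)
n_on, n_off, alpha = 25, 40, 0.25

# Model that reproduces the data exactly: the likelihood ratio is 1, so W = 0
w_perfect = wstat(n_on, n_off, alpha, n_on - alpha * n_off, alpha * n_off)

# Any other parameter choice gives a strictly positive value
w_other = wstat(n_on, n_off, alpha, 8.0, 12.0)
```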
Special cases
The above formula is undefined if \(n_{\mathrm{on}}\) or
\(n_{\mathrm{off}}\) is equal to zero, because of the \(n\log{{n}}\)
terms that were introduced by adding the goodness-of-fit terms. These cases are
treated as follows.
If \(n_{\mathrm{on}} = 0\), the likelihood formulae read
\[L (0, n_{\mathrm{off}}, \alpha; \mu_{\mathrm{sig}}, \mu_{\mathrm{bkg}}) =
\exp{(-(\mu_{\mathrm{sig}}+ \mu_{\mathrm{bkg}}))}\times
\frac{(\mu_{\mathrm{bkg}}/\alpha)^{n_{\mathrm{off}}}}{n_{\mathrm{off}}
!}\exp{(-\mu_{\mathrm{bkg}}/\alpha)},\]
and
\[L (0, n_{\mathrm{off}}, \alpha; 0 - \alpha n_{\mathrm{off}}, \alpha n_{\mathrm{off}} ) =
\frac{n_{\mathrm{off}}^{n_{\mathrm{off}}}}{n_{\mathrm{off}} !}
\exp{(-n_{\mathrm{off}})}\]
WStat is derived by taking two times the negative log likelihood and adding the
goodness-of-fit term as before:
\[W = 2 \big(\mu_{\mathrm{sig}} + (1 + \frac{1}{\alpha})\mu_{\mathrm{bkg}} -
n_{\mathrm{off}} - n_{\mathrm{off}} (\log{(\mu_{\mathrm{bkg}}/\alpha)} -
\log{(n_{\mathrm{off}})})\big)\]
Note that this is the limit of the original WStat formula for
\(n_{\mathrm{on}} \rightarrow 0\).
The analytical result for
\(\mu_{\mathrm{bkg}}\) in this case reads:
\[\mu_{\mathrm{bkg}} = \frac{\alpha n_{\mathrm{off}}}{\alpha + 1}\]
Inserting this into the WStat, we find the simplified expression
\[W = 2\big(\mu_{\mathrm{sig}} + n_{\mathrm{off}} \log{(1 + \alpha)}\big)\]
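The simplification for \(n_{\mathrm{on}} = 0\) can be confirmed numerically by comparing the unsimplified expression, evaluated at the profiled background, with the closed form. This is a sketch under assumed example values; `wstat_n_on_zero` is a hypothetical helper name.

```python
import math


def wstat_n_on_zero(n_off, alpha, mu_sig, mu_bkg):
    """WStat for n_on = 0 before inserting the profiled mu_bkg (n_off > 0)."""
    return 2.0 * (
        mu_sig
        + (1.0 + 1.0 / alpha) * mu_bkg
        - n_off
        - n_off * (math.log(mu_bkg / alpha) - math.log(n_off))
    )


# Illustrative values (made up for this example)
n_off, alpha, mu_sig = 40, 0.25, 3.0

# Profiled background for n_on = 0
mu_bkg_hat = alpha * n_off / (alpha + 1.0)

# Unsimplified expression at the profiled background versus the closed form
w_full = wstat_n_on_zero(n_off, alpha, mu_sig, mu_bkg_hat)
w_simple = 2.0 * (mu_sig + n_off * math.log(1.0 + alpha))
```

Both values agree to floating-point precision, and the profiled background does not depend on \(\mu_{\mathrm{sig}}\) in this case.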
If \(n_{\mathrm{off}} = 0\), WStat becomes
\[W = 2 \big(\mu_{\mathrm{sig}} + (1 + \frac{1}{\alpha})\mu_{\mathrm{bkg}} -
n_{\mathrm{on}} - n_{\mathrm{on}} (\log{(\mu_{\mathrm{sig}} +
\mu_{\mathrm{bkg}})} - \log{(n_{\mathrm{on}})})\big)\]
and
\[\mu_{\mathrm{bkg}} = \frac{\alpha n_{\mathrm{on}}}{1+\alpha} -
{\mu_{\mathrm{sig}}}\]
For \(\mu_{\mathrm{sig}} > n_{\mathrm{on}} (\frac{\alpha}{1 + \alpha})\),
\(\mu_{\mathrm{bkg}}\) becomes negative, which is unphysical.
Therefore we distinguish two cases. The physical one, where
\(\mu_{\mathrm{sig}} < n_{\mathrm{on}} (\frac{\alpha}{1 + \alpha})\),
is straightforward and gives
\[W = -2\big(\frac{\mu_{\mathrm{sig}}}{\alpha} +
n_{\mathrm{on}} \log{\left(\frac{\alpha}{1 + \alpha}\right)}\big)\]
For the unphysical case, we set \(\mu_{\mathrm{bkg}}=0\) and arrive at
\[W = 2\big(\mu_{\mathrm{sig}} + n_{\mathrm{on}}(\log{(n_{\mathrm{on}})} -
\log{(\mu_{\mathrm{sig}})} - 1)\big)\]
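The two branches for \(n_{\mathrm{off}} = 0\) can be combined into a single function. The sketch below is an illustration only: the name `wstat_n_off_zero`, the handling of the boundary value, and the example numbers are assumptions. It also checks that the two expressions join continuously at \(\mu_{\mathrm{sig}} = n_{\mathrm{on}}\,\alpha/(1+\alpha)\).

```python
import math


def wstat_n_off_zero(n_on, alpha, mu_sig):
    """Profiled WStat for n_off = 0 (requires n_on > 0 and mu_sig > 0)."""
    threshold = n_on * alpha / (1.0 + alpha)
    if mu_sig < threshold:
        # Physical case: the profiled mu_bkg is positive
        return -2.0 * (mu_sig / alpha + n_on * math.log(alpha / (1.0 + alpha)))
    # Otherwise the profiled mu_bkg would not be positive, so it is set to 0
    return 2.0 * (mu_sig + n_on * (math.log(n_on) - math.log(mu_sig) - 1.0))


# Illustrative values (made up for this example)
n_on, alpha = 25, 0.25
threshold = n_on * alpha / (1.0 + alpha)

# Values just below and exactly at the threshold should nearly coincide
w_below = wstat_n_off_zero(n_on, alpha, threshold - 1e-9)
w_at = wstat_n_off_zero(n_on, alpha, threshold)

# With mu_bkg = 0, a signal equal to n_on reproduces the data exactly
w_exact = wstat_n_off_zero(n_on, alpha, float(n_on))
```

At the boundary both expressions reduce to \(-2\,n_{\mathrm{on}}\big(\frac{1}{1+\alpha} + \log\frac{\alpha}{1+\alpha}\big)\), so assigning the boundary value to either branch gives the same result.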