4. Performance Measures

Here we explore various functions that evaluate the performance of a classifier. We start by defining some notation:

  • \(C\) is the confusion matrix, where entry \(C_{ij}\) is the number of objects that belong to class \(i\) but have been classified as class \(j\),
  • \(m_i\) is the number of objects in class \(i\): \(m_i = \sum_j C_{ij}\),
  • \(m\) is the total sample size: \(m = \sum_i m_i\),
  • \(k\) is the number of classes.

4.1. Naive Accuracy

One simple way to evaluate the overall performance of a classifier is to compute the naive accuracy rate, which is simply the total fraction of objects that have been correctly classified:

\[\text{Naive Accuracy Rate} = \dfrac{\sum_i C_{ii}}{m}\]

This is implemented in the function naive_accuracy().
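As a rough illustration, a minimal NumPy sketch of this computation might look as follows (the name naive_accuracy_sketch and the example matrix are hypothetical; the package's own naive_accuracy() may be implemented differently):

    import numpy as np

    def naive_accuracy_sketch(C):
        """Fraction of correctly classified objects: trace of the confusion matrix over the total."""
        C = np.asarray(C, dtype=float)
        return np.trace(C) / C.sum()

    # Example: 95 of 110 objects sit on the diagonal, so the naive accuracy is ~0.864.
    C = np.array([[90, 10],
                  [ 5,  5]])
    print(naive_accuracy_sketch(C))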

4.2. Balanced Accuracy

However, if the dataset is imbalanced, this measure can be misleading: a classifier that always predicts the majority class can still achieve a high naive accuracy. A better approach is to use the posterior balanced accuracy. Let \(A_i\) be the accuracy rate of class \(i\):

\[A_i = \dfrac{C_{ii}}{m_i}\]

Before running a classifier, we know nothing of its performance, so we can assume the accuracy rate follows a flat prior distribution. In particular, the Beta distribution with parameters \(\alpha = \beta = 1\) (i.e. a uniform distribution) seems appropriate here:

\[A_i \sim Beta(1, 1) \qquad \forall i\]

Given an accuracy rate \(A_i\) for each class \(i\), the number of correct predictions in class \(i\) will follow a Binomial distribution with \(A_i\) as the probability of success:

\[\big( C_{ii} \mid A_i \big) \sim Bin\big(m_i, A_i\big) \qquad \forall i\]

In Bayesian terms, this is our likelihood. Now, the Beta distribution is the conjugate prior for the Binomial likelihood, so the posterior distribution of \(A_i\) will also be Beta, with parameters:

\[\big( A_i \mid C_{ii} \big) \sim Beta \big( 1 + C_{ii}, 1 + m_i - C_{ii} \big) \qquad \forall i\]

get_beta_parameters() is a helper function that extracts the Beta parameters from a confusion matrix.
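A minimal sketch of this extraction, assuming the same Beta(1, 1) prior as above (the name beta_parameters_sketch and the example matrix are illustrative, not the package's get_beta_parameters()):

    import numpy as np

    def beta_parameters_sketch(C):
        """Posterior Beta parameters (1 + C_ii, 1 + m_i - C_ii) for each class,
        assuming a Beta(1, 1) prior and a Binomial likelihood."""
        C = np.asarray(C, dtype=float)
        correct = np.diag(C)                  # C_ii
        m_i = C.sum(axis=1)                   # class sizes
        return 1 + correct, 1 + m_i - correct

    C = np.array([[90, 10],
                  [ 5,  5]])
    print(beta_parameters_sketch(C))          # (array([91., 6.]), array([11., 6.]))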

4.2.1. Convolution

One way to define the balanced accuracy \(A\) is to take the average of the individual accuracy rates \(A_i\):

\[A = \dfrac{1}{k} \sum_i A_i\]

We call \(\big( A \mid C \big)\) the posterior balanced accuracy. One nice feature of this measure is that it is a full probability distribution rather than a simple point estimate. This allows us to construct credible intervals, etc. And even though there is no closed-form expression for the density function of \(\big( A \mid C \big)\), we can still approximate it numerically by convolving the \(k\) Beta densities. This is implemented in convolve_betas().
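As an illustration, the density of the sum can be approximated on a grid by repeatedly applying a discrete convolution. The sketch below uses SciPy's Beta density and is only an assumption about how such a routine might look, not the actual convolve_betas():

    import numpy as np
    from scipy import stats

    def convolve_betas_sketch(alphas, betas, n_grid=4096):
        """Approximate the density of sum_i A_i by repeated numerical convolution
        of the individual Beta densities tabulated on a regular grid."""
        k = len(alphas)
        x = np.linspace(0, k, n_grid)   # each A_i lies in [0, 1], so the sum lies in [0, k]
        dx = x[1] - x[0]
        density = stats.beta.pdf(x, alphas[0], betas[0])
        for a, b in zip(alphas[1:], betas[1:]):
            density = np.convolve(density, stats.beta.pdf(x, a, b))[:n_grid] * dx
        return x, density

    # Posterior parameters from the earlier example: Beta(91, 11) and Beta(6, 6).
    x, f = convolve_betas_sketch([91, 6], [11, 6])
    print(f.sum() * (x[1] - x[0]))      # ~1, the density integrates to one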

4.2.2. Expected Balanced Accuracy

We have just approximated the density of the sum of \(k\) Beta distributions. The next step is to present our results. One measure we can report is the expected value of the posterior balanced accuracy. This is implemented in balanced_accuracy_expected().
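Since the posterior of each \(A_i\) is Beta with known parameters, this expectation has a simple closed form: the mean of \(Beta(1 + C_{ii},\, 1 + m_i - C_{ii})\) is \((1 + C_{ii}) / (2 + m_i)\), and the expected balanced accuracy is the average of these means over classes. A hedged sketch (the function name is illustrative, not the package's balanced_accuracy_expected()):

    import numpy as np

    def expected_balanced_accuracy_sketch(C):
        """Posterior mean of the balanced accuracy: the average over classes of the
        Beta posterior means (1 + C_ii) / (2 + m_i)."""
        C = np.asarray(C, dtype=float)
        correct = np.diag(C)
        m_i = C.sum(axis=1)
        return np.mean((1 + correct) / (2 + m_i))

    C = np.array([[90, 10],
                  [ 5,  5]])
    print(expected_balanced_accuracy_sketch(C))   # (91/102 + 6/12) / 2 ≈ 0.696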

4.2.3. Distribution of Balanced Accuracy

We can also construct an empirical distribution for the posterior balanced accuracy. First we need to evaluate the pdf of the sum of the Beta distributions, \(\sum_i A_i\), on a set of points \(x\) in its domain. See beta_sum_pdf().

However, we are interested in the average of the accuracy rates, \(A = \dfrac{1}{k} \sum_i A_i = \dfrac{1}{k} A_T\), where \(A_T = \sum_i A_i\) denotes the sum. We can express the cdf of \(A\) in terms of the cdf of \(A_T\):

\[\begin{split}F_A (a) &= \mathbb{P} (A \leq a) \\ &= \mathbb{P}\bigg( \dfrac{1}{k} \sum_i A_i \leq a \bigg) \\ &= \mathbb{P}\bigg( \sum_i A_i \leq ka \bigg) \\ &= F_{A_T}(ka)\end{split}\]

Differentiating with respect to \(a\), we get:

\[\begin{split}f_A(a) &= \dfrac{\partial}{\partial a} F_A(a) \\ &= \dfrac{\partial}{\partial a} F_{A_T}(ka) \\ &= k \cdot f_{A_T} (ka)\end{split}\]

See beta_avg_pdf() for the implementation.
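The scaling rule above translates directly into code: evaluate the tabulated density of the sum at \(ka\) and multiply by \(k\). A small sketch (the function name is hypothetical, not the package's beta_avg_pdf(); the check uses the known triangular density of the sum of two uniform, i.e. Beta(1, 1), variables):

    import numpy as np

    def beta_avg_pdf_sketch(a, x_sum, f_sum, k):
        """Density of the average A = A_T / k from the tabulated density of the sum:
        f_A(a) = k * f_{A_T}(k a), evaluated here by linear interpolation."""
        return k * np.interp(k * np.asarray(a, dtype=float), x_sum, f_sum)

    # Check with k = 2 and two Beta(1, 1) accuracies: the average of two uniforms has a
    # triangular density on [0, 1], peaking at a density of 2 at a = 0.5.
    x_sum = np.linspace(0, 2, 4001)
    f_sum = np.where(x_sum <= 1, x_sum, 2 - x_sum)   # density of the sum of two uniforms
    print(beta_avg_pdf_sketch(0.5, x_sum, f_sum, k=2))   # 2.0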

To make a violin plot of the posterior balanced accuracy, we need to run a Monte Carlo simulation, which requires the inverse cdf of \(A\). beta_sum_cdf(), beta_avg_cdf(), and beta_avg_inv_cdf() are used to approximate the integral of the pdf using the trapezium rule.
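A sketch of how the cdf and the inverse-cdf sampling step might fit together, assuming the pdf has already been tabulated on a grid (the function names are illustrative, not the package's beta_sum_cdf(), beta_avg_cdf() or beta_avg_inv_cdf()):

    import numpy as np

    def cdf_from_pdf_sketch(x, f):
        """Approximate the cdf by integrating the tabulated pdf with the trapezium rule."""
        cdf = np.concatenate(([0.0], np.cumsum(np.diff(x) * (f[1:] + f[:-1]) / 2)))
        return cdf / cdf[-1]                  # normalise so the cdf ends at exactly 1

    def inverse_cdf_samples_sketch(x, f, n_samples, seed=None):
        """Inverse-transform sampling: map uniform draws through the interpolated inverse cdf."""
        rng = np.random.default_rng(seed)
        cdf = cdf_from_pdf_sketch(x, f)
        return np.interp(rng.uniform(size=n_samples), cdf, x)

    # Example with a Beta(2, 2) density as a stand-in; the samples could feed a violin plot.
    x = np.linspace(0, 1, 1001)
    f = 6 * x * (1 - x)
    samples = inverse_cdf_samples_sketch(x, f, 10_000, seed=0)
    print(samples.mean())                     # ~0.5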

4.3. Recall

We can also compute the recall of each class. The recall of class \(i\) is defined as:

\[\text{Recall}_i = \dfrac{C_{ii}}{\sum_j C_{ij}}\]

Intuitively, the recall measures a classifier’s ability to find all the positive samples (and hence to minimise the number of false negatives). See recall().
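A minimal sketch of the per-class recall (recall_sketch and the example matrix are illustrative, not the package's recall()):

    import numpy as np

    def recall_sketch(C):
        """Per-class recall: the diagonal entry divided by the row sum,
        i.e. the fraction of each class's objects that were correctly retrieved."""
        C = np.asarray(C, dtype=float)
        return np.diag(C) / C.sum(axis=1)

    C = np.array([[90, 10],
                  [ 5,  5]])
    print(recall_sketch(C))    # [0.9  0.5]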

4.4. Precision

Another useful measure is the precision. The precision of class \(i\) is defined as:

\[\text{Precision}_i = \dfrac{C_{ii}}{\sum_j C_{ji}}\]

Intuitively, the precision measures a classifier’s ability to minimise the number of false positives. See precision().
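And a matching sketch of the per-class precision (precision_sketch is illustrative, not the package's precision()):

    import numpy as np

    def precision_sketch(C):
        """Per-class precision: the diagonal entry divided by the column sum,
        i.e. the fraction of objects predicted as a class that truly belong to it."""
        C = np.asarray(C, dtype=float)
        return np.diag(C) / C.sum(axis=0)

    C = np.array([[90, 10],
                  [ 5,  5]])
    print(precision_sketch(C))    # [0.947  0.333]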