Let $tp, fp, fn,$ and $tn$ be the number of true positives, false positives, false negatives and true negatives obtained on some set by a binary classifier. Let $P$ and $R$ be the precision and recall given by $P=\frac{tp}{tp+fp}, R = \frac{tp}{tp+fn}$. Let $N=tp+fp+fn+tn$ be the total size of the set. The precision and recall for the negative class are $P'=\frac{tn}{tn+fn}, R'=\frac{tn}{tn+fp}$.
We want to show that $P'$ increases with $R$, and $R'$ increases with $P$ (and hence analyzing the performance of any one class is sufficient).
We use $$P'=\frac{N-(tp+fp)-fn}{N-(tp+fp)} = 1 - \frac{fn}{N-(tp+fp)}$$, and
$$R = \frac{tp}{tp+fn} \Rightarrow tp+fn = \frac{tp}{r} \Rightarrow fn = tp(\frac{1}{R}-1)$$, to get $$P' = 1 - (\frac{1}{R}-1) \frac{tp}{N-(tp+fp)} = 1 - (\frac{1}{R}-1) \frac{1}{\frac{N}{tp}-(1+\frac{fp}{tp})}$$.
$R$ increases when $tp$ increases (if $fn$ decreases, then $tp$ has to increase). Also, an increase in $tp$ has no effect on $fp$. Thus, by increasing $R$ (or $tp$), the second term in the RHS to decrease, thus increasing $P'$. QED.
A similar relation holds between $P$ and $R'$ by symmetry between the two classes.
So what?
We want to show that $P'$ increases with $R$, and $R'$ increases with $P$ (and hence analyzing the performance of any one class is sufficient).
We use $$P'=\frac{N-(tp+fp)-fn}{N-(tp+fp)} = 1 - \frac{fn}{N-(tp+fp)}$$, and
$$R = \frac{tp}{tp+fn} \Rightarrow tp+fn = \frac{tp}{r} \Rightarrow fn = tp(\frac{1}{R}-1)$$, to get $$P' = 1 - (\frac{1}{R}-1) \frac{tp}{N-(tp+fp)} = 1 - (\frac{1}{R}-1) \frac{1}{\frac{N}{tp}-(1+\frac{fp}{tp})}$$.
$R$ increases when $tp$ increases (if $fn$ decreases, then $tp$ has to increase). Also, an increase in $tp$ has no effect on $fp$. Thus, by increasing $R$ (or $tp$), the second term in the RHS to decrease, thus increasing $P'$. QED.
A similar relation holds between $P$ and $R'$ by symmetry between the two classes.
So what?
- When comparing the results from two binary classifiers, we might be interested in how the classifiers performs on both classes, rather than just one. In such cases, it is sufficient to compute both precision and recall for one of the classes and compare the same for the two classifiers. If one classifier has better precision on the positive class, it also means that it has better recall on the negative class.
- If we want to find the best hyper-parameters for a classifier using grid search, and we want to find the parameter for which the the classifier does well on both classes, then we could define "best" in terms of a score that incorporates both precision and recall (e.g. the F1 score). This ensures, e.g., that precision is not optimized at the cost of recall (i.e. at the cost of precision of the negative class).
No comments:
Post a Comment