## Classification

Often, the goal of logistic regression is to simply classify observations as one of the two possible outcomes that the model describes. We've already discussed how, based on a set of values for the independent (X) variables, the probability of "success" (Y=1) can be calculated. Classification works by using that predicted probability and applying the following set of rules:

1. Define a cutoff value (between 0 and 1)

2. Determine the probability of success for an observation and compare it to the cutoff value

3. If the determined probability is greater than the cutoff value, classify the observation as a success; if it is less than the cutoff value, classify the observation as a failure
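The cutoff rule above can be sketched in a few lines of Python. This is an illustrative sketch, not Prism's implementation; the function name and default cutoff of 0.5 are assumptions for the example:

```python
def classify(probability, cutoff=0.5):
    """Classify a predicted probability of success (Y=1) against a cutoff.

    Returns 1 (success) when the probability exceeds the cutoff,
    otherwise 0 (failure).
    """
    return 1 if probability > cutoff else 0

print(classify(0.73))  # above the cutoff -> 1 (success)
print(classify(0.21))  # below the cutoff -> 0 (failure)
```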

Prism provides two methods that report results based on the logistic model's classification of the data. These include the area under the ROC curve (AUC) and the 2x2 Classification table, both described below.

## Area under the ROC curve (AUC)

Area under the ROC curve (AUC) provides an aggregate value of how well the model correctly classifies the 0s and 1s with all possible cutoff values. AUC values range between 0.5 and 1, where an area of 0.5 means that the model predicts which outcomes will be 1 or 0 no better than flipping a coin, and an area of 1 means that the model predicts perfectly. To understand reported AUC values in more detail, look at some examples of various extremes for ROC curves from logistic regression.
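One way to see why AUC summarizes classification across all cutoffs is its pairwise interpretation: the AUC equals the probability that a randomly chosen observed 1 receives a higher predicted probability than a randomly chosen observed 0. The sketch below computes AUC directly from that definition; it is an illustration of the concept, not Prism's algorithm:

```python
def auc(probabilities, outcomes):
    """AUC as the fraction of (observed 1, observed 0) pairs in which the
    observed 1 has the higher predicted probability (ties count as 0.5)."""
    pos = [p for p, y in zip(probabilities, outcomes) if y == 1]
    neg = [p for p, y in zip(probabilities, outcomes) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Probabilities that perfectly separate the 1s from the 0s give AUC = 1
print(auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0
```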

## Classification table (2x2)

The classification table reports a 2x2 table that displays the numbers of correctly classified values at the user-specified cutoff. This table has four entries that report the number of observed 0s (and 1s) that were correctly (and incorrectly) predicted. Additionally, the classification table provides the total number of observed 1s and 0s, the total number of predicted 1s and 0s, the percent of correctly classified 1s and 0s, the percent of total correctly classified observations, and the positive and negative predictive power.

|            | Predicted 0 | Predicted 1 |
|------------|-------------|-------------|
| Observed 0 | A           | B           |
| Observed 1 | C           | D           |

The total number of observed (entered) 0s = A + B

The total number of observed (entered) 1s = C + D

The total number of predicted 0s = A + C

The total number of predicted 1s = B + D

The percent of observed 0s correctly classified = (A/(A+B))*100

The percent of observed 1s correctly classified = (D/(C+D))*100

The percent of all observations correctly classified = ((A+D)/(A+B+C+D))*100

The negative predictive power (%) = (A/(A+C))*100

The positive predictive power (%) = (D/(B+D))*100
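The quantities above are simple arithmetic on the four table entries. The following sketch collects them in one function; the counts used in the example (A=40, B=10, C=5, D=45) are made up for illustration:

```python
def classification_summary(A, B, C, D):
    """Summary statistics from a 2x2 classification table, where
    A = observed 0, predicted 0    B = observed 0, predicted 1
    C = observed 1, predicted 0    D = observed 1, predicted 1
    """
    total = A + B + C + D
    return {
        "observed 0s": A + B,
        "observed 1s": C + D,
        "predicted 0s": A + C,
        "predicted 1s": B + D,
        "% observed 0s correct": 100 * A / (A + B),
        "% observed 1s correct": 100 * D / (C + D),
        "% all correct": 100 * (A + D) / total,
        "negative predictive power %": 100 * A / (A + C),
        "positive predictive power %": 100 * D / (B + D),
    }

summary = classification_summary(A=40, B=10, C=5, D=45)
print(summary["% all correct"])  # (40 + 45) / 100 * 100 = 85.0
```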

There are a number of other values that can be obtained from the classification table that Prism doesn’t report directly. For example, common values obtained from this sort of data include the False Discovery Rate (B/(B+D)), the False Negative Rate (C/(C+D)), and many others. For additional information, read more about values that can be calculated from classification tables.

Note also that two common values – Sensitivity and Specificity – can be calculated from the classification table. For the selected cutoff value (default 0.5), the sensitivity and specificity can be calculated from the values in the classification table as:

Sensitivity = D/(C+D)

Specificity = A/(A+B)

Prism reports these values (as percentages) in the "percent correctly classified" column: the "percent of observed 1s correctly classified" is the sensitivity, and the "percent of observed 0s correctly classified" is the specificity.
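These two formulas can be sketched directly from the table entries. The counts in the example are made up for illustration:

```python
def sensitivity(A, B, C, D):
    """Fraction of observed 1s correctly classified: D / (C + D)."""
    return D / (C + D)

def specificity(A, B, C, D):
    """Fraction of observed 0s correctly classified: A / (A + B)."""
    return A / (A + B)

# With A=40, B=10, C=5, D=45:
print(sensitivity(A=40, B=10, C=5, D=45))  # 45 / 50 = 0.9
print(specificity(A=40, B=10, C=5, D=45))  # 40 / 50 = 0.8
```

Multiplying each by 100 gives the corresponding "percent correctly classified" entries reported by Prism.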