In previous sections, it was established that principal components (PCs) are ranked: the first PC “explains” (or accounts for) the largest share of the variance in the data, the second PC “explains” the next largest share, and so on. Recall also that:

• The primary objective of PCA is dimensionality reduction

• The variance explained by each PC is given by its eigenvalue

• There are as many possible PCs as there are original variables (assuming more observations than variables in the data)
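The last two points can be checked numerically. The sketch below runs PCA by hand with NumPy on a small simulated dataset. The data, the variable names, and the choice to standardize the variables first are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical data: 200 observations of 4 correlated variables
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 4))
X = latent @ mixing + 0.3 * rng.normal(size=(200, 4))

# Standardize each variable (assumption: PCA on standardized data)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigendecomposition of the covariance matrix of the standardized data;
# eigh returns eigenvalues in ascending order, so reverse to rank the PCs
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Z, rowvar=False))
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

# PC scores are linear combinations of the standardized variables
scores = Z @ eigenvectors
score_variances = scores.var(axis=0, ddof=1)

# Proportion of total variance explained by each PC
explained = eigenvalues / eigenvalues.sum()

print("number of PCs:", eigenvalues.size)
print("eigenvalues:", np.round(eigenvalues, 3))
print("variance of each PC's scores:", np.round(score_variances, 3))
print("proportion of variance explained:", np.round(explained, 3))
```

With four variables there are four eigenvalues, and the variance of each PC's scores matches the corresponding eigenvalue.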

Taken together, this means that if we were to define all possible PCs for a dataset, we would end up with a new dataset that has the same dimension as the original, not one with reduced dimension. So we must decide which PCs to retain and which to discard. This overall process of defining components as linear combinations of the original variables and retaining the “most important” of them is called feature extraction (not to be confused with feature selection, which keeps a subset of the original variables unchanged).
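To make the dimension argument concrete, here is a minimal sketch of feature extraction, assuming standardized variables. The helper name `extract_features` and the random data are hypothetical, and the number of PCs to keep, `k`, is simply supplied by hand rather than chosen by any selection method.

```python
import numpy as np

def extract_features(X, k):
    """Project X onto its first k principal components (hypothetical helper)."""
    # Standardize each variable (assumption: PCA on standardized data)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Z, rowvar=False))
    # Rank components from largest to smallest eigenvalue and keep the first k
    order = np.argsort(eigenvalues)[::-1]
    loadings = eigenvectors[:, order[:k]]
    # Each retained column is a linear combination of all original variables
    return Z @ loadings

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))          # hypothetical data: 100 rows, 5 variables

all_scores = extract_features(X, 5)    # keep every PC: no reduction
reduced = extract_features(X, 2)       # keep only the first two PCs

print(all_scores.shape)
print(reduced.shape)
```

Keeping all five PCs returns a dataset with as many columns as the original, while keeping two returns a two-column dataset whose columns are the first two columns of the full set of scores.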

There are several common ways to select a subset of PCs from the total number possible, and almost all of them rely on eigenvalues (discussed previously).

## Methods for component selection

The following pages describe different techniques for selecting a subset of PCs, along with the strengths of each. Many of the classical techniques described here are based on rudimentary criteria and were historically relied on before computational simulations became widely accessible. Parallel Analysis improves on many of these techniques through Monte Carlo simulation. If you read about only one of these methods, make it Parallel Analysis.
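As a rough illustration of the Monte Carlo idea, the sketch below implements one common form of Parallel Analysis: eigenvalues from the observed data are compared with eigenvalues from random data of the same size, and a PC is retained only while its eigenvalue exceeds what random noise alone would be expected to produce. The function name, the use of normally distributed noise, the 95th percentile threshold, and the number of simulations are all assumptions for this sketch, and the procedure described on the following pages may differ in its details.

```python
import numpy as np

def parallel_analysis(X, n_sims=500, percentile=95, seed=0):
    """Sketch of Parallel Analysis (hypothetical helper, assumed settings)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Eigenvalues of the observed correlation matrix, largest first
    observed = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated random data with the same shape
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.normal(size=(n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]

    # Per-component threshold: the chosen percentile of the simulated eigenvalues
    threshold = np.percentile(simulated, percentile, axis=0)

    # Retain leading PCs up to the first one that fails to beat its threshold
    exceeds = observed > threshold
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold

# Hypothetical data: 6 variables driven by 2 underlying factors plus noise
rng = np.random.default_rng(42)
factors = rng.normal(size=(300, 2))
loadings = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]], dtype=float)
X = factors @ loadings + 0.5 * rng.normal(size=(300, 6))

n_retain, observed, threshold = parallel_analysis(X, n_sims=200)
print("observed eigenvalues:", np.round(observed, 2))
print("simulated thresholds:", np.round(threshold, 2))
print("PCs retained:", n_retain)
```

Because the simulated data contain no real structure, their eigenvalues serve as a baseline for how large an eigenvalue can get by chance alone.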

Classic methods for selecting PCs

Parallel Analysis