2.8.7. Variable Compression by Principal Component Analysis

Principal Component Analysis (PCA), which originates from psychometrics, can be used as a preprocessing tool for neural networks. The PCA compresses the independent variables into fewer principal components, which are then used as new input variables for the neural networks. The PCA finds the direction in space along which the variance of the data is largest; this direction is called the first principal component. The second principal component is the direction in space orthogonal to the first that describes the maximum variance not covered by the first principal component, and so on. The
data matrix X is decomposed by the PCA into a product of a score matrix T and a loading matrix P plus a matrix containing the residuals E:

X = T Pᵀ + E                (19)
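The decomposition in equation (19) can be sketched numerically via the singular value decomposition; the data shape, the number of retained components k, and all variable names below are illustrative choices, not part of the original text.

```python
import numpy as np

# Illustrative PCA compression via SVD (data and names are our own choices).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))        # 50 samples, 10 independent variables
Xc = X - X.mean(axis=0)              # PCA operates on mean-centered data

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 3                                # keep only the first k principal components
T = U[:, :k] * s[:k]                 # score matrix T (50 x k)
P = Vt[:k].T                         # loading matrix P (10 x k)
E = Xc - T @ P.T                     # residual matrix E, so that Xc = T P^T + E

# The k score columns in T would serve as the compressed
# input variables for the subsequent neural network.
print(T.shape)                       # (50, 3)
```

The columns of T are ordered by explained variance, so truncating to the first k components retains the largest-variance directions of the input data.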

As with the PLS, only the first few principal components are used, and similar methods are applied to determine the optimal number (see section 2.5).

However, variable compression by principal components suffers from some (at least theoretical) drawbacks. Using only a few principal components does not ensure that the information retained in these components is useful for the calibration of the relationship of interest. For example, if noise dominates the variations of the input variables, the variations caused by the sensor responses to the analytes might not be included, as the corresponding principal components with small singular values are discarded [107]. Additionally, nonlinear relationships are often spread over many principal components, which might not all be included in the model (see also the discussions in sections 6.1 and 9.2.5). Since, in contrast to the PLS, the principal components are determined only on the basis of the variances of the independent variables and not on the basis of an optimal regression, no synergistic effects from the combination of the PCA and the neural networks can be expected.
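The first drawback can be illustrated with a toy example: if the property of interest depends on a low-variance input direction, a one-component PCA compression discards exactly the relevant signal. All variances, names, and thresholds below are our own illustrative assumptions.

```python
import numpy as np

# Toy data: the target y depends only on a small-variance "analyte" direction,
# while an irrelevant large-variance direction dominates the input variations.
rng = np.random.default_rng(1)
n = 200
noise_dir = 10.0 * rng.normal(size=n)   # large-variance, irrelevant variation
signal = 0.1 * rng.normal(size=n)       # small-variance analyte response
X = np.column_stack([noise_dir, signal])
y = signal                              # target depends only on the weak direction

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
t1 = Xc @ Vt[0]                         # scores on the retained first component
t2 = Xc @ Vt[1]                         # scores on the discarded second component

# The retained component is nearly uncorrelated with y,
# while the discarded one carries essentially all of the signal.
corr_pc1 = abs(np.corrcoef(t1, y)[0, 1])
corr_pc2 = abs(np.corrcoef(t2, y)[0, 1])
print(corr_pc1 < corr_pc2)
```

A PLS decomposition, which weights directions by their covariance with y, would rank these two directions in the opposite order.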