Having
defined the type and concentration range of the analytes of interest and of
the additional factors like temperature or humidity (or generally the independent
variables), a plan has to be set up that determines the number and compositions
of the samples to be measured. This plan is known as experimental design in
chemometrics. An experimental design aims to cover the space spanned by the
independent variables as well as possible with as few samples as possible, so
that the effects of these variables can be understood and the relationships
between the dependent
and independent variables can be modeled. Among the many existing types of
experimental designs, some are specialized for optimization strategies, such as
the Central Composite Designs, the Doehlert Design, or the Box-Behnken Design
[2],[3]; some are mixture designs, used when all components add up to 100%; and
some, such as the D-optimal designs [4]-[6], are specialized for a constrained
variable space. In this study, the concentrations of the different analytes
should be varied independently, and the number of concentration levels, and
thus the number of samples, should not be constrained, which renders most of
these designs unsuitable. Thus, full factorial designs are used, which combine
all levels of all independent variables (all defined concentration levels of
all analytes). This results in a rapidly increasing number n of samples for an
increasing number x of analytes and for an increasing number l of concentration
levels per analyte:
$$ n = l^{x} \qquad (1) $$
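The growth of the sample count can be illustrated with a minimal sketch. It assumes, purely for illustration, 3 analytes with the same 4 relative concentration levels each; the level values and the helper name `full_factorial` are hypothetical and not taken from this work. A full factorial design is the Cartesian product of the level sets, so its size is the product of the level counts, which for l levels on each of x analytes equals l to the power x.

```python
from itertools import product


def full_factorial(levels_per_analyte):
    """Return every combination of the given concentration levels.

    levels_per_analyte: one sequence of levels per analyte.
    """
    return list(product(*levels_per_analyte))


# Hypothetical example: x = 3 analytes, l = 4 levels per analyte.
levels = [0.0, 0.25, 0.5, 1.0]  # illustrative, non-equidistant levels
design = full_factorial([levels] * 3)

# Each entry of `design` is one sample composition (one level per analyte).
print(len(design))  # l**x = 4**3 = 64 samples
```

Adding one more analyte with the same 4 levels would already raise the count from 64 to 256 samples, which is why the number of levels per analyte has to be chosen with care.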
In this work,
full factorial designs with and without equidistant levels are used for the
calibration data sets. Full factorial designs are also used for most validation
data sets. The meshes of the calibration and validation designs are interleaved
such that the distance between the two meshes is maximal, which allows the
validation data to give a realistic estimate of the network performance in a
real-world situation [7].
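One way to picture interleaved meshes is sketched below. The sketch assumes that "maximum distance" means placing each validation level halfway between two neighbouring calibration levels; the text does not spell out the exact construction, so this is an assumption, as are the level values, the number of analytes, and the helper name `midpoints`.

```python
from itertools import product


def midpoints(levels):
    """Levels halfway between neighbouring calibration levels (assumed rule)."""
    ordered = sorted(levels)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


# Hypothetical equidistant calibration levels (relative concentrations).
calibration_levels = [0.0, 0.25, 0.5, 0.75, 1.0]
validation_levels = midpoints(calibration_levels)

n_analytes = 2  # illustrative choice
calibration = list(product(calibration_levels, repeat=n_analytes))
validation = list(product(validation_levels, repeat=n_analytes))

# No validation sample coincides with a calibration sample.
print(len(calibration), len(validation))
```

With these assumed levels, the validation mesh has one level fewer per analyte than the calibration mesh, and every validation sample lies in the centre of a cell of the calibration mesh.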