Abstract:
This paper considers the application of machine learning and deep learning to the spectral analysis of multicomponent gas mixtures. The experimental setup consists of a quantum cascade laser with a tuning range of 5.3–12 $\mu$m and a peak power of up to 150 mW, and an astigmatic Herriott gas cell with an optical path length of up to 76 m. Acetone, ethanol, methanol, and their mixtures are used as test substances. For the detection and clustering of substances, including molecular biomarkers, machine learning methods such as t-distributed stochastic neighbor embedding (t-SNE) and principal component analysis, together with classification methods such as random forest, gradient boosting, and logistic regression, are proposed. A shallow convolutional neural network built with TensorFlow (Google) and Keras is used for the spectral analysis of gas mixtures. Model spectra of substances are used as the training sample, and both model and experimental spectra are used as the test sample. It is shown that neural networks trained on model spectra (NIST database) can recognize substances in experimental gas mixtures. We propose using machine learning methods for the clustering and classification of pure substances and gas mixtures, and neural networks for the identification of gas mixture components. With the described setup, the experimentally obtained concentration limits are 80 ppb for acetone and 100–120 ppb for ethanol and methanol. The proposed methods are also shown to be applicable to the analysis of spectra of human exhaled air, which is significant for biomedical applications.
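As a rough illustration of the clustering step mentioned above, the following minimal sketch projects synthetic spectra onto their leading principal components using NumPy. It is not the authors' pipeline: the band centers and widths in `BANDS`, the noise level, and the helper names `model_spectrum`, `make_dataset`, and `pca_project` are all hypothetical, and the Gaussian bands only stand in for real model spectra such as those from the NIST database.

```python
import numpy as np

# Hypothetical band centers and widths (micrometers); illustrative only,
# NOT taken from NIST data or from the paper.
BANDS = {
    "acetone": [(7.3, 0.15), (8.2, 0.12)],
    "ethanol": [(9.5, 0.20), (11.3, 0.15)],
    "methanol": [(9.7, 0.25)],
}
# Wavelength grid spanning the tuning range quoted in the abstract.
WAVELENGTHS = np.linspace(5.3, 12.0, 400)


def model_spectrum(name):
    """Sum of Gaussian bands standing in for a pure-substance model spectrum."""
    spectrum = np.zeros_like(WAVELENGTHS)
    for center, width in BANDS[name]:
        spectrum += np.exp(-0.5 * ((WAVELENGTHS - center) / width) ** 2)
    return spectrum


def make_dataset(n_per_class=30, noise=0.02, seed=0):
    """Noisy, randomly scaled copies of each model spectrum with integer labels."""
    rng = np.random.default_rng(seed)
    names = sorted(BANDS)
    spectra, labels = [], []
    for label, name in enumerate(names):
        base = model_spectrum(name)
        for _ in range(n_per_class):
            scale = rng.uniform(0.5, 1.5)  # assumed stand-in for varying concentration
            spectra.append(scale * base + rng.normal(0.0, noise, base.shape))
            labels.append(label)
    return np.array(spectra), np.array(labels), names


def pca_project(X, n_components=2):
    """Project mean-centered rows of X onto the leading principal components."""
    centered = X - X.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]


X, y, names = make_dataset()
scores, explained = pca_project(X)
```

Here `scores` holds one two-dimensional point per spectrum, which could be plotted and colored by `y` to inspect how the substances group; a classifier or t-SNE embedding could be applied to the same matrix `X`.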