Spectral data fusion in chemometrics: An approach in petroleum chemistry

Publisher

Universidade Federal do Espírito Santo

Abstract

Fusing data from different analytical sources can be a viable alternative to using a single analytical technique for estimating physicochemical properties or classifying crude oil samples. Outputs from different instrumental techniques can carry complementary information and act synergistically during calibration. In this work, the potential of data fusion strategies was investigated for estimating seven properties of crude oil: sulfur content (S), total nitrogen content (NT), basic nitrogen content (NB), total acid number (NAT), and the saturated (SAT), aromatic (ARO) and polar (POL) fractions. A total of 125 crude oil samples were available, with 70% used for calibration and 30% for prediction. The physicochemical properties of the samples were determined using standardized experimental methods. Partial least squares (PLS) regression models were built from pre-treated mid-infrared (MIR) spectroscopy data and 1H and 13C nuclear magnetic resonance (NMR) spectroscopy data. The data were used individually and also fused, in different combinations, at low level, at mid level (with features extracted by PCA or by PLS) and at high level. The optimal number of latent variables was determined by cross-validation using the venetian blinds method. The results showed that mid-level data fusion by PLS increased the accuracy of some of the models, whereas low-level, mid-level by PCA and high-level fusion provided negligible improvements. Using mid-level data fusion by PLS, the contents of S, NT, NB, NAT, SAT, ARO and POL were estimated with RMSEP values of 0.057% m/m, 0.041% m/m, 0.0067% m/m, 0.16 mg KOH g⁻¹, 4.73% m/m, 3.66% m/m and 6.50% m/m, respectively, and coefficients of determination of 0.90, 0.83, 0.98, 0.91, 0.84, 0.67 and 0.66, respectively.

Low-level and mid-level by PCA data fusion were also applied to near-infrared (NIR) and MIR spectroscopy data to discriminate crude oil samples into sweet/sour, poor/rich in nitrogen, non-acidic/acidic/highly acidic and light/medium/heavy classes. The classification models were obtained by partial least squares discriminant analysis (PLS-DA). Classification models based on individual spectra and on the low-level and mid-level fused spectra were compared using sensitivity, specificity, efficiency, accuracy and total prediction error as performance parameters. Compared to the individual models, low-level and/or mid-level data fusion provided greater accuracy and a lower total prediction error rate in all four classifications proposed in this study. The classification models built from fused data showed high accuracy, reaching values equal to or greater than 92%. These results demonstrate that the proposed spectroscopic techniques can complement each other through data fusion, producing a synergistic effect and, consequently, models with greater predictive capacity.

Keywords

Data fusion, Crude oil, NMR, PLS
