Intelligent Discrimination of Growing Areas Based on Near-Infrared Spectra
The tobacco growing area is an important factor in the consistency of cigarette aroma and in the control of cigarette quality. The fragrance of tobacco leaves varies with climate and planting environment, such as soil and rainfall. Accurately discriminating tobacco growing areas is therefore important for maintaining cigarette specifications. In this thesis, the relationship between tobacco near-infrared (NIR) spectra and growing areas is studied. Soft computing models and statistical classifiers are established, and the performance of the developed classifiers is compared in terms of prediction accuracy and of evaluation measures derived from the confusion matrix. An artificial neural network (ANN) classifier and a statistical model are developed first. The best prediction accuracy of the ANN model reaches 79.3% on 226 training samples and 78.7% on 66 testing samples, which is 2.2 and 4.5 percentage points higher than the best results of the conventional statistical model in training (77.1%) and in testing (74.2%), respectively. A support vector machine (SVM) model is then proposed to investigate the characteristics of growing areas based on the principle of risk minimization; it produces a higher classification accuracy than the ANN model, demonstrating the effectiveness and robustness of the SVM model. In addition, a genetic algorithm (GA) optimized SVM (GA-SVM) model is proposed to take into account the influence of interactions among individual inputs on classifier performance. With the application of the GA, the sensitive input subset is identified and used in the discrimination models. The simulation results demonstrate that the GA-SVM model performs best among all the developed models and that the model is simplified, as it requires fewer inputs to achieve equivalent prediction accuracy. The GA-SVM classifier is therefore preferred for solving multi-category problems.
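The comparison of classifiers by accuracy and by confusion-matrix measures can be illustrated with a minimal sketch. The abstract does not say which measures were derived from the confusion matrix, so per-class precision and recall are an assumption here. The class labels and predictions are made up for illustration and are not data from the thesis, and the function names `confusion_matrix` and `evaluate` are hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Count samples per (true class, predicted class) pair.

    Rows are true classes, columns are predicted classes.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def evaluate(matrix: List[List[int]], labels: Sequence[str]) -> dict:
    """Derive overall accuracy and per-class precision/recall from the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {"accuracy": correct / total}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                          # correctly predicted samples
        actual = sum(matrix[i])                    # samples truly in this class
        predicted = sum(row[i] for row in matrix)  # samples assigned to this class
        report[label] = {
            "recall": tp / actual if actual else 0.0,
            "precision": tp / predicted if predicted else 0.0,
        }
    return report


# Illustrative labels for three hypothetical growing areas (not thesis data).
labels = ["area_A", "area_B", "area_C"]
y_true = ["area_A", "area_A", "area_B", "area_B", "area_C", "area_C"]
y_pred = ["area_A", "area_B", "area_B", "area_B", "area_C", "area_A"]

matrix = confusion_matrix(y_true, y_pred, labels)
report = evaluate(matrix, labels)
print(matrix)
print(report)
```

Per-class measures of this kind show which growing areas are confused with one another, which a single accuracy figure hides in a multi-category problem.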