Regularized Regression Methods and Neural Networks for Modeling Fish Population Health with Water Quality Variables in the Athabasca Oil Sands Region

McMillan, Patrick
University of Guelph

This thesis develops statistical models for fish population health measures, including adjusted trout-perch body weight, gonad weight, and liver weight, using climate, environmental, and water quality variables measured in the Athabasca River. To identify relevant variables, we considered three variable selection techniques: stepwise regression, the lasso, and the elastic net (EN). The lasso and EN generally produced regression models that performed better than stepwise regression for each response. Uranium (U), tungsten, tellurium (Te), pH, molybdenum (Mo), and antimony were found to be important for at least one response. U, Te, and Mo had relatively large coefficients in both the adjusted gonad and liver weight models, suggesting that they may influence the development of trout-perch organs. Neural networks (NNs) were also considered as a way to improve the prediction accuracy for the fish population endpoints. The NNs outperformed the regularization techniques in predicting the adjusted body weight, but not the adjusted gonad or liver weights.
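
For readers unfamiliar with the two regularization techniques named above, the following is a sketch of the elastic net objective in one common parameterization. The abstract does not state which parameterization the thesis uses, so the notation here is an assumption: $n$ observations, response $y_i$, predictor vector $x_i$, intercept $\beta_0$, penalty strength $\lambda \ge 0$, and mixing parameter $\alpha \in [0, 1]$.

```latex
\[
  (\hat{\beta}_0, \hat{\beta})
  = \operatorname*{arg\,min}_{\beta_0,\, \beta}
    \; \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \beta_0 - x_i^{\top} \beta \bigr)^2
    + \lambda \Bigl[ \alpha \, \lVert \beta \rVert_1
    + \frac{1 - \alpha}{2} \, \lVert \beta \rVert_2^2 \Bigr]
\]
```

Setting $\alpha = 1$ gives the lasso, whose $\ell_1$ penalty can shrink some coefficients exactly to zero and so performs variable selection. Values of $\alpha$ strictly between 0 and 1 give the elastic net, which adds a ridge-type $\ell_2$ term that tends to stabilize the fit when predictors are correlated.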

Neural Network, Variable Selection, Bayesian Hyperparameter Optimization, Sentinel Fish Populations, Oil Sands, Environmental Monitoring