Advanced Analytics for Disease Forecasting - A Comparative Analysis of Statistical and Machine Learning Methods

dc.contributor.advisor: Berke, Olaf
dc.contributor.advisor: Ng, Victoria
dc.contributor.author: Orang, Armin
dc.date.accessioned: 2024-01-08T22:27:19Z
dc.date.available: 2024-01-08T22:27:19Z
dc.date.copyright: 2024-01
dc.date.created: 2023-12-15
dc.degree.department: Department of Population Medicine
dc.degree.grantor: University of Guelph
dc.degree.name: Doctor of Philosophy
dc.degree.programme: Population Medicine
dc.description.abstract: Infectious diseases continue to evolve and, as seen during the recent coronavirus (COVID-19) pandemic, remain a serious threat to the health of human populations. To keep pace with the evolution of infectious diseases, surveillance methods must similarly advance. This thesis explores the application of statistical and machine learning methods to disease forecasting and outbreak detection. Methods were applied to temporal and spatio-temporal data from infectious disease surveillance in Canada. The infectious diseases of interest were seasonal influenza, COVID-19, and Lyme disease. Accurately forecasting the timing and magnitude of peak seasonal influenza incidence is important for public health preparedness. In 2019, COVID-19 emerged and posed a substantial risk to the health of Canadians. COVID-19 incidence data provide an opportunity to evaluate the ability of statistical and machine learning methods to model emerging diseases. The expanding geographic range of Lyme disease is also a growing concern for Canadian public health. As Lyme disease incidence has been linked to changing weather patterns, projecting its incidence under the climate scenarios Representative Concentration Pathway (RCP) 4.5 and 8.5 is of interest. Seasonal Autoregressive Integrated Moving Average (SARIMA) models outperformed artificial neural networks in forecasting seasonal influenza activity in Canada. However, when applied to COVID-19 incidence in the public health units of Toronto and Wellington-Dufferin-Guelph, random forest outperformed several statistical learning models. Additionally, machine learning accurately forecasted spatio-temporal Lyme disease incidence in Ontario. For the same dataset, Bayesian statistical models did not converge. Endemic-Epidemic modeling performed well on measures of power of detection, sensitivity, specificity, and timeliness when detecting simulated COVID-19 outbreaks in spatio-temporal data structures. Farrington Flexible (FF) required tuning before demonstrating robust performance. The results indicate that both statistical and machine learning methods are valuable for disease surveillance. Machine learning is a flexible tool that displays strong forecasting performance across different data structures. With advances in computational power and the availability of “big data”, machine learning will continue to play an important role in disease forecasting. However, the “black box” problem of machine learning makes it unfit for explanatory purposes. Therefore, traditional statistical models should still be applied to identify possible risk factors for disease incidence.
dc.description.sponsorship: Public Health Agency of Canada
dc.identifier.uri: https://hdl.handle.net/10214/28061
dc.language.iso: en
dc.publisher: University of Guelph
dc.rights: Attribution-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nd/4.0/
dc.subject: Disease forecasting
dc.subject: Statistical learning
dc.subject: Machine learning
dc.title: Advanced Analytics for Disease Forecasting - A Comparative Analysis of Statistical and Machine Learning Methods
dc.type: Thesis

Files

Original bundle
Name: Orang_Armin_202401_PhD.pdf
Size: 2.66 MB
Format: Adobe Portable Document Format
Name: Orang_Armin_202401_PhD_Erratum.pdf
Size: 404.95 KB
Format: Adobe Portable Document Format
Description: Erratum document
License bundle
Name: license.txt
Size: 353 B
Format: Item-specific license agreed upon to submission