Computational Inference for Network-based Individual-level Models of Infectious Disease Transmission
Infectious disease data are very often only partially observed; for example, the exact time of infection for an individual is generally missing, or it may be known only approximately because of measurement error. We can account for such data uncertainty in our analysis. However, doing so may cause computational problems. In the first part of this thesis, a simulation study is performed to ascertain the consequences of ignoring infection-time uncertainty. We detail results on the trade-off between model-inferential quality and computation time, using a family of discrete-time heterogeneous infectious disease transmission models known as individual-level models (ILMs). We focus particularly on network-based ILMs fitted using Markov chain Monte Carlo (MCMC) under a Bayesian framework. The modeling approaches considered range from those under "fixed data" assumptions to those under a "full data augmentation" approach. The impact of using a misspecified distribution to describe the infectious period is also considered. Methods that may help to overcome the inferential and/or computational issues involved in the use of such models are examined. In the second part, we quantify the ability of aggregated infectious disease transmission data, obtained under varying levels of clustering, to produce a substantive reduction in the computation time required to approximate the posterior distribution while maintaining data quality. Results obtained under different clustering assumptions are compared. We also examine the effect of using different model terms to account for inter-cluster variability when fitting ILMs to aggregated data. We consider the impact on the fit of assuming linear effects, as well as the impact of relaxing this assumption.
Finally, we investigate the effectiveness of various MCMC algorithms in sampling from a series of highly correlated, discrete target distributions. The relative effectiveness of various adaptive multistage MCMC approaches, based upon hybrid combinations of independence samplers, is considered. Results are compared to those obtained from traditional single-stage MCMC algorithms and from a direct Monte Carlo method (our gold standard). Root mean square error, mean absolute difference, and effective sample size rate are used to assess and compare the performance of these algorithms.
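
As an illustration of the three performance measures named above, the following is a minimal sketch and not the implementation used in the thesis. The function names are hypothetical, the effective sample size is estimated with a simple rule that truncates the autocorrelation sum at the first non-positive lag (one of several common estimators, assumed here for brevity), and the "rate" is assumed to mean effective sample size per second of computation time.

```python
import math


def rmse(estimate, reference):
    """Root mean square error between estimated and reference values,
    e.g. estimated posterior probabilities versus a gold-standard reference."""
    diffs = [(e - r) ** 2 for e, r in zip(estimate, reference)]
    return math.sqrt(sum(diffs) / len(diffs))


def mean_absolute_difference(estimate, reference):
    """Mean absolute difference between estimated and reference values."""
    diffs = [abs(e - r) for e, r in zip(estimate, reference)]
    return sum(diffs) / len(diffs)


def effective_sample_size(chain):
    """Estimate ESS as n / (1 + 2 * sum of autocorrelations).

    Assumption: the sum is truncated at the first non-positive
    autocorrelation, a simple heuristic rather than a definitive estimator.
    """
    n = len(chain)
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    var = sum(c * c for c in centred) / n
    if var == 0.0:
        return float("nan")  # ESS is undefined for a constant chain
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ess_rate(chain, seconds):
    """Effective sample size per second of computation time (assumed definition)."""
    return effective_sample_size(chain) / seconds
```

Under these assumptions, an algorithm that yields a larger `ess_rate` for the same target produces more nearly independent draws per unit of computation, while `rmse` and `mean_absolute_difference` measure how closely its estimates match the reference values.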