On Optimization and Regularization for Grouped Dirichlet-multinomial Regression

Date

2018-05-07

Authors

Crea, Catherine

Publisher

University of Guelph

Abstract

This thesis focuses on developing the grouped Dirichlet-multinomial (DM) regression model for ecological applications with particular attention to optimization (for parameter estimation) and regularization (for variable selection). We adapt the grouped DM regression model for discrete choice behaviour to the analysis of mutualistic interactions between plant and pollinator species within a given ecosystem. The DM model provides a flexible approach to modelling over-dispersed grouped data and is fully parametric, but has not been well studied. The first part of this thesis focuses on establishing the DM model as a viable approach for analyzing pollination networks that can provide insights into the mechanisms driving ecological processes. Next, we study the behaviour of various parameterizations of the DM likelihood and identify non-convex regions that are either flat or non-smooth. Correspondingly, we evaluate the performance of three optimization methods (derivative and derivative-free) and assess their robustness to misspecification of dispersion structure. The last part of this thesis implements regularized regression for most parameterizations of the grouped Dirichlet-multinomial model using standard and adaptive lasso methods. Tuning parameters are selected using an information criterion while optimization is achieved via the fast iterative shrinkage-thresholding algorithm. All the proposed methods are evaluated via simulated and empirical data sets and all implementations of the standard and regularized grouped DM regression model are publicly available as routines in R.
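As an illustration of two ingredients named in the abstract, the following Python sketch evaluates the Dirichlet-multinomial log-likelihood for a single vector of counts and applies the soft-thresholding operator, which is the proximal step for a lasso penalty in proximal gradient methods such as the fast iterative shrinkage-thresholding algorithm. This is not the thesis's R implementation: the function names `dm_log_likelihood` and `soft_threshold` are hypothetical, and the sketch takes the concentration parameters directly rather than through any of the regression parameterizations studied in the thesis.

```python
import math


def dm_log_likelihood(counts, alpha):
    """Log-probability of one count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per category.
    alpha:  positive concentration parameters, one per category
            (hypothetical input; a regression model would derive
            these from covariates).
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    a_sum = sum(alpha)
    # Multinomial coefficient: log n! - sum_k log x_k!
    ll = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of normalizing constants: log Gamma(A) - log Gamma(n + A)
    ll += math.lgamma(a_sum) - math.lgamma(n + a_sum)
    # Per-category terms: log Gamma(x_k + a_k) - log Gamma(a_k)
    ll += sum(math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha))
    return ll


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)
```

With two categories and both concentration parameters equal to 1, the model reduces to a beta-binomial with a uniform prior, so every split of n observations has probability 1/(n + 1); this gives a quick sanity check on the likelihood.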

Keywords

overdispersion, quasi-Newton methods, Dirichlet-multinomial regression, adaptive lasso, variable selection, information criteria, proximal gradient methods, plant-pollinator networks, linkage rules, network structure, log-likelihood slice
