
Data Integration from Multiple Historical Sources to Study Canadian Casualties of WWI


Title: Data Integration from Multiple Historical Sources to Study Canadian Casualties of WWI
Author: Gadgil, Harshavardhan
Department: School of Computer Science
Program: Computer Science
Advisor: Antonie, Luiza
Abstract: Longitudinal data (data that observe the same entities at different points in time) are of interest to historians and social scientists because they create opportunities to study populations over time. In this thesis, we construct longitudinal data by integrating data from four historical sources to study Canadian casualties of World War I. Because labeled data are unavailable for two of the three linkage tasks and our application has a low tolerance for false matches, we develop a simple stepwise deterministic strategy to integrate the four datasets. For the one linkage task where labeled data are available, we compare the strategy with linkage that incorporates a Support Vector Machine. With the longitudinal dataset constructed, we demonstrate its utility by performing a multivariate regression analysis to determine the factors that influenced a Canadian soldier's likelihood of survival in World War I. The findings of this research indicate that a cautious stepwise deterministic strategy that incorporates approximate comparisons and domain knowledge can perform on par with a linkage approach that incorporates a supervised learning algorithm, without requiring labeled data. The regression analysis reveals several fascinating patterns of historical importance in early 20th century Canada that warrant further historical investigation.
URI: http://hdl.handle.net/10214/10284
Date: 2017-03
Rights: Attribution-NonCommercial 2.5 Canada
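
The abstract describes a cautious stepwise deterministic linkage strategy that uses approximate comparisons. The following is a minimal sketch of that general idea, not the thesis's actual rules: the field names (`id`, `name`, `birth_year`), the two passes, the 0.9 similarity threshold, and the sample records are all hypothetical, and Python's standard `difflib` stands in for whatever string comparator the thesis uses. Each pass accepts a link only when the match is unique in both directions, which reflects the low tolerance for false matches.

```python
from collections import Counter
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Approximate string comparison in [0, 1] (hypothetical choice of metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def stepwise_link(source, target, threshold=0.9):
    """Link records in ordered passes, strictest rule first.

    A pair is accepted only if it is the sole candidate for both records
    in the current pass; ambiguous records are left unlinked.
    """
    passes = [
        # Pass 1: exact name and exact birth year.
        lambda s, t: s["name"].lower() == t["name"].lower()
        and s["birth_year"] == t["birth_year"],
        # Pass 2: approximate name, exact birth year.
        lambda s, t: name_similarity(s["name"], t["name"]) >= threshold
        and s["birth_year"] == t["birth_year"],
    ]
    links = {}
    used_targets = set()
    for rule in passes:
        free_s = [s for s in source if s["id"] not in links]
        free_t = [t for t in target if t["id"] not in used_targets]
        pairs = [(s["id"], t["id"]) for s in free_s for t in free_t if rule(s, t)]
        s_counts = Counter(sid for sid, _ in pairs)
        t_counts = Counter(tid for _, tid in pairs)
        for sid, tid in pairs:
            if s_counts[sid] == 1 and t_counts[tid] == 1:
                links[sid] = tid
                used_targets.add(tid)
    return links


# Invented sample records, for illustration only.
soldiers = [
    {"id": "s1", "name": "John Smith", "birth_year": 1890},
    {"id": "s2", "name": "William Tremblay", "birth_year": 1895},
    {"id": "s3", "name": "Arthur Brown", "birth_year": 1888},
]
census = [
    {"id": "c1", "name": "John Smith", "birth_year": 1890},
    {"id": "c2", "name": "Wiliam Tremblay", "birth_year": 1895},
    {"id": "c3", "name": "Arthur Brown", "birth_year": 1888},
    {"id": "c4", "name": "Arthur Brown", "birth_year": 1888},
]
links = stepwise_link(soldiers, census)
```

In this sketch, `s1` links in the exact pass, `s2` links in the approximate pass despite the misspelled name, and `s3` stays unlinked because two census records match it equally well. Leaving ambiguous records unlinked trades recall for precision, which is the kind of trade-off a low tolerance for false matches implies.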


Files in this item

File: Gadgil_Harshavardhan_201703_Msc.pdf
Size: 2.365Mb
Format: PDF
Description: MSc. Thesis


Except where otherwise noted, this item's license is described as Attribution-NonCommercial 2.5 Canada.