Show simple item record

dc.contributor.author: Erion, Gabriel Gandhi [en_US]
dc.date.accessioned: 2015-07-16T16:26:20Z
dash.embargo.terms: 2016-05-01 [en_US]
dc.date.created: 2015-05 [en_US]
dc.date.issued: 2015-06-26 [en_US]
dc.date.submitted: 2015 [en_US]
dc.identifier.citation: Erion, Gabriel Gandhi. 2015. Addressing Missing Data in Viral Genetic Linkage Analysis Through Multiple Imputation and Subsampling-Based Likelihood Optimization. Bachelor's thesis, Harvard College. [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:17417574
dc.description.abstract: This thesis addresses the intersection of two important areas in epidemiology and statistics: genetic linkage analysis and missing data methods, respectively. Genetic linkage analysis is a promising method in viral epidemiology which involves learning about transmission patterns by studying clusters of similar gene sequences. For example, similar sequences found in a pair of geographically distinct communities may imply disease transmission between the two locations. However, this analysis is sensitive to missing data, which can introduce substantial bias. This thesis presents a multiple-imputation approach which corrects for much, though not all, of the bias in genetic linkage analysis. It also introduces a novel resampling-based approach that generates a weighted distribution of complete datasets and is even more effective than imputation for reducing bias. This work highlights the importance of missing data in genetic linkage studies and presents ways to provide more accurate epidemiological information by correcting for missing data. The new resampling-based approach presented in this paper is also general enough to be applied to many types of missing-data problems involving complex datasets; such broader applications are a promising avenue for future research. [en_US]
dc.description.sponsorship: Applied Mathematics [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.language.iso: en [en_US]
dash.license: LAA [en_US]
dc.subject: Biology, Biostatistics [en_US]
dc.subject: Statistics [en_US]
dc.subject: Health Sciences, Epidemiology [en_US]
dc.title: Addressing Missing Data in Viral Genetic Linkage Analysis Through Multiple Imputation and Subsampling-Based Likelihood Optimization [en_US]
dc.type: Thesis or Dissertation [en_US]
dash.depositing.author: Erion, Gabriel Gandhi [en_US]
dc.date.available: 2016-05-02T07:31:22Z
thesis.degree.date: 2015 [en_US]
thesis.degree.grantor: Harvard College [en_US]
thesis.degree.level: Undergraduate [en_US]
thesis.degree.name: AB [en_US]
dc.type.material: text [en_US]
thesis.degree.department: Applied Mathematics [en_US]
dash.identifier.vireo: http://etds.lib.harvard.edu/college/admin/view/53 [en_US]
dash.title.page: 1 [en_US]
dash.author.email: gabe.erion@gmail.com [en_US]
dash.identifier.drs: urn-3:HUL.DRS.OBJECT:25267861 [en_US]
dash.contributor.affiliated: Erion, Gabriel


