
dc.contributor.advisor: Trippa, Lorenzo
dc.contributor.advisor: Huttenhower, Curtis
dc.contributor.author: Ren, Boyu
dc.date.accessioned: 2019-05-20T10:23:33Z
dc.date.created: 2017-05
dc.date.issued: 2017-05-10
dc.date.submitted: 2017
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:40046490
dc.description.abstract: High-dimensional count data arising from multinomial sampling are ubiquitous in microbiome studies. This dissertation develops a flexible Bayesian framework for modeling high-dimensional count data, providing reliable and automatic inference for biological questions in microbiome studies. In Chapter 1, we present a nonparametric Bayesian model for dependent distributions that describes multiple species sampling sequences simultaneously. Our marginal prior for each sampling sequence is a normalized Gamma process, and the dependence between the sequences is represented by low-dimensional latent factors. The resulting posterior samples of model parameters can be used to evaluate uncertainty in analyses routinely applied in microbiome studies, such as ordination. In Chapter 2, we extend the latent factor model of Chapter 1 to enable estimation of covariate effects. We show analytically and numerically that this augmented model is identifiable and that it accurately separates the effects of covariates from those of latent factors. We provide techniques for transforming model parameters into interpretable results. An application to a longitudinal microbiome dataset illustrates the use of this model in microbiome studies. Chapter 3 focuses on a bioinformatics tool that simulates realistic microbiome data and benchmarks statistical tools for microbiome studies. We model the counts as over-dispersed Poisson outcomes using a hierarchical lognormal distribution. We then propose a heuristic algorithm that generates data resembling real microbiome data. A benchmark of a previously published method illustrates that the simulated data provide an accurate characterization of the method.
dc.description.sponsorship: Biostatistics
dc.format.mimetype: application/pdf
dc.language.iso: en
dash.license: LAA
dc.subject: Biology, Biostatistics
dc.subject: Biology, Bioinformatics
dc.subject: Statistics
dc.title: Bayesian Statistical Framework for High-Dimensional Count Data and its Application in Microbiome Studies
dc.type: Thesis or Dissertation
dash.depositing.author: Ren, Boyu
dc.date.available: 2019-05-20T10:23:33Z
thesis.degree.date: 2017
thesis.degree.grantor: Graduate School of Arts & Sciences
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
dc.contributor.committeeMember: Parmigiani, Giovanni
dc.contributor.committeeMember: Bacallado, Sergio
dc.type.material: text
thesis.degree.department: Biostatistics
dash.identifier.vireo: http://etds.lib.harvard.edu/gsas/admin/view/1573
dc.description.keywords: Bayesian nonparametrics; Microbiome
dc.identifier.orcid: 0000-0002-5300-1184
dash.author.email: boyuren158@gmail.com

