Batch effects correction for microbiome data with Dirichlet-multinomial regression

Published in Bioinformatics. 2019 Mar 1;35(5):807-14

Recommended citation: Zhenwei Dai, Sunny H Wong, Jun Yu, Yingying Wei. Batch effects correction for microbiome data with Dirichlet-multinomial regression. Bioinformatics. 2019 Mar 1;35(5):807-14.


Abstract

Metagenomic sequencing techniques enable quantitative analyses of the microbiome. However, combining microbial data from different experiments is challenging because of the variation between experiments. Existing methods for correcting batch effects do not account for the interactions between variables (microbial taxa, in microbiome studies) or for the overdispersion of microbiome data, so they are not applicable to such data. We develop a new method, Bayesian Dirichlet-multinomial regression meta-analysis (BDMMA), to simultaneously model the batch effects and detect the microbial taxa associated with phenotypes. BDMMA automatically models the dependence among microbial taxa and is robust to the high dimensionality of microbiome data and to the sparsity of taxon-phenotype associations. Simulation studies and real data analysis show that BDMMA can successfully adjust for batch effects and substantially reduce false discoveries in microbial meta-analyses.
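
To make "overdispersion" and "batch effects" concrete, the following is a minimal simulation sketch. It is not the BDMMA implementation. It assumes a log-linear link from phenotype and batch to the Dirichlet concentration parameters, and every name and number in it (`BASELINE`, `PHENOTYPE_EFFECT`, `BATCH_EFFECT`, `DEPTH`, the helper functions) is hypothetical and chosen only for illustration.

```python
import numpy as np

DEPTH = 1000  # reads per sample (assumed value)

# All parameter values below are made up for illustration, not taken from the paper.
BASELINE = np.log(np.array([8.0, 6.0, 4.0, 2.0, 1.0]))   # log concentration per taxon
PHENOTYPE_EFFECT = np.array([0.0, 0.0, 0.8, 0.0, 0.0])   # sparse: only one taxon is associated
BATCH_EFFECT = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],    # batch 0 (reference)
    [0.5, -0.5, 0.0, 0.3, 0.0],   # batch 1 shifts several taxa
])


def concentration(batch, phenotype):
    """Dirichlet concentration under the assumed log-linear link."""
    return np.exp(BASELINE + phenotype * PHENOTYPE_EFFECT + BATCH_EFFECT[batch])


def simulate_counts(rng, batch, phenotype, n_samples):
    """Draw Dirichlet-multinomial counts: proportions ~ Dirichlet, counts ~ Multinomial."""
    alpha = concentration(batch, phenotype)
    proportions = rng.dirichlet(alpha, size=n_samples)
    return np.array([rng.multinomial(DEPTH, p) for p in proportions])


rng = np.random.default_rng(0)
batch0 = simulate_counts(rng, batch=0, phenotype=0, n_samples=500)
batch1 = simulate_counts(rng, batch=1, phenotype=0, n_samples=500)

# Overdispersion: compare the empirical variance of each taxon's counts with
# the variance a plain multinomial with the same mean proportions would have.
alpha0 = concentration(0, 0)
p0 = alpha0 / alpha0.sum()
multinomial_var = DEPTH * p0 * (1 - p0)
overdispersion_ratio = batch0.var(axis=0) / multinomial_var

# Batch effect: same phenotype, different batch, different mean composition.
mean_prop_batch0 = batch0.mean(axis=0) / DEPTH
mean_prop_batch1 = batch1.mean(axis=0) / DEPTH

print("variance ratio vs. multinomial:", np.round(overdispersion_ratio, 1))
print("mean proportions, batch 0:", np.round(mean_prop_batch0, 3))
print("mean proportions, batch 1:", np.round(mean_prop_batch1, 3))
```

In this toy setting the two batches share the same phenotype yet differ in mean composition, which is the kind of shift that could be mistaken for a phenotype association if batches were pooled without adjustment.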