清华合成与系统生物学中心
Tsinghua Center for Synthetic and Systems Biology

A Penalized Bayesian Approach to Predicting Sparse Protein-DNA Binding Landscapes

Time: 10:30 AM-12:00 PM, Dec 4
Place: FIT 1-312
Speaker: Prof. Qing Zhou, UCLA
ABSTRACT:
Cellular processes are controlled, directly or indirectly, by the binding of hundreds of different DNA binding factors (DBFs) to the genome. One key to a deeper understanding of the cell is discovering where, when, and how strongly these DBFs bind to the DNA sequence. Direct measurement of DBF binding sites (e.g. through ChIP-Chip or ChIP-Seq experiments) is expensive, noisy, and not available for every DBF in every cell type. Naïve and most existing computational approaches to detecting which DBFs bind in a set of genomic regions of interest often perform poorly because of high false discovery rates and restrictive requirements for prior knowledge. We develop SparScape, a penalized Bayesian method for identifying the DBFs active in the considered regions and predicting a joint probabilistic binding landscape. Using a sparsity-inducing penalty, SparScape selects, from a much larger candidate set, a small subset of DBFs whose binding sites are enriched in a set of DNA sequences. This substantially reduces false positives in binding site prediction. Analyses of ChIP-Seq data from mouse embryonic stem cells and of simulated data show that SparScape dramatically outperforms both the naïve motif scanning method and comparable computational approaches in DBF identification and binding site prediction.