P B E A M

Welcome to the homepage of PBEAM

1 What is PBEAM?

PBEAM is a parallel version of BEAM (Bayesian Epistasis Association Mapping). BEAM uses Markov Chain Monte Carlo (MCMC) to search for both single-marker and interaction effects in case-control SNP data.

BEAM was developed by Yu Zhang and Jun S. Liu. http://www.people.fas.harvard.edu/~junliu/BEAM/

Reference:

Zhang, Y. and J.S. Liu, Bayesian inference of epistatic interactions in case-control studies. Nat Genet, 2007. 39(9): p. 1167-73.

1.1 Basic ideas of PBEAM

A genome-wide association study (GWAS) is a computationally intensive task, especially for methods that target the epistatic effects of marker combinations. Handling this problem requires more sophisticated algorithms and more computing power, such as parallel computing on a cluster.

Monte Carlo methods estimate a result by drawing samples from a distribution, so they are intrinsically well suited to parallelization. BEAM runs one Markov chain at a time and samples from the chain(s) to estimate the result, while PBEAM runs several chains simultaneously and combines the samples from these chains to estimate the final result.
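The idea of combining samples from several independent chains can be illustrated with a minimal sketch. It is not PBEAM's code: the target below is a toy standard normal density rather than BEAM's posterior, the sampler is a plain random-walk Metropolis step, and the names `log_target`, `run_chain` and `pooled_estimate` are hypothetical. The chains run one after another here, whereas PBEAM would run each one in its own MPI process.

```python
import math
import random


def log_target(x):
    # Toy target: standard normal log-density up to a constant (NOT BEAM's posterior).
    return -0.5 * x * x


def run_chain(seed, n_iter, burn_in, step=1.0):
    """Run one random-walk Metropolis chain and return its post-burn-in samples."""
    rng = random.Random(seed)  # each chain gets its own seed, as each process would
    x = 0.0
    samples = []
    for i in range(n_iter):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, target(proposal) / target(x)).
        log_ratio = log_target(proposal) - log_target(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        if i >= burn_in:
            samples.append(x)
    return samples


def pooled_estimate(n_chains, n_iter, burn_in):
    """Pool the samples of independent chains and estimate the mean of the target."""
    pooled = []
    for chain_id in range(n_chains):
        pooled.extend(run_chain(seed=chain_id, n_iter=n_iter, burn_in=burn_in))
    return sum(pooled) / len(pooled), len(pooled)


if __name__ == "__main__":
    mean, count = pooled_estimate(n_chains=4, n_iter=5000, burn_in=500)
    print(f"pooled mean from {count} samples: {mean:.3f}")
```

Because the chains never exchange state, the loop over `chain_id` is the part that can be handed to separate processes; only the final pooling step needs the samples in one place.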

1.2 Performance of PBEAM

Asymptotically, PBEAM and BEAM give the same results. Our experiments showed that their results are also consistent within a limited number of iterations.

PBEAM runs the chains independently. The communication cost among processes is negligible when the data set is large enough. That is, PBEAM running on n CPUs can reduce the execution time to about 1/n of that of BEAM at the same precision.
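The 1/n claim can be checked with a back-of-the-envelope model. The sketch below is an assumption made for illustration, not a measurement of PBEAM: it supposes a fixed total number of iterations split evenly over the processes, plus a hypothetical fixed communication cost `comm_seconds`. All function and parameter names are made up for this example.

```python
def estimated_time(total_iterations, seconds_per_iteration, n_processes, comm_seconds=0.0):
    """Wall-clock estimate when iterations are split evenly over independent chains."""
    compute = (total_iterations / n_processes) * seconds_per_iteration
    return compute + comm_seconds


def speedup(total_iterations, seconds_per_iteration, n_processes, comm_seconds=0.0):
    """Ratio of the single-process time to the n-process time under the same model."""
    serial = estimated_time(total_iterations, seconds_per_iteration, 1)
    parallel = estimated_time(total_iterations, seconds_per_iteration, n_processes, comm_seconds)
    return serial / parallel


if __name__ == "__main__":
    # Illustrative numbers only: 1e6 iterations at 1 ms each on 4 processes.
    print(speedup(1_000_000, 0.001, 4))                     # no communication cost
    print(speedup(1_000_000, 0.001, 4, comm_seconds=10.0))  # with a fixed overhead
```

Under this model the speedup approaches n as the communication term becomes small relative to the compute term, which is the situation described above for large data sets.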

2 Download

PBEAM uses MPI to parallelize BEAM. We provide executables for different platforms and MPI implementations.

However, it is better for users to compile the executable from source, which avoids possible version conflicts.

2.1 Source

PBEAM source download

In principle, PBEAM can be compiled with any MPI implementation. Our builds are based on mpich2 and Microsoft MPI.

PBEAM uses the GNU Scientific Library (GSL), so Windows users must also compile GSL on their platform.

A port of the GSL for Microsoft VS2008: http://fp.gladman.plus.com/computing/gnu_scientific_library.htm

For Linux users, we provide a simple makefile to facilitate compilation.

2.2 Executables

Configuration                    Executable   Note
Linux 64bit + mpich2             download
Linux 32bit + mpich2             download
Windows 32bit + mpich2           download
Windows 64bit + Microsoft MPI    download     This can be executed directly on Windows Compute Cluster Server.

data file example download

parameter file example download

3 Running PBEAM

You must put the PBEAM executable in a place that all computers in the cluster can read (e.g. a shared directory, or the same path on every computer).

For Linux users, please refer to the mpich2 documentation for how to launch an MPI program. The command line to run PBEAM looks like:

mpiexec -n 4 /path2program/PBEAM inputfile outputfile log_directory parameterfile

For Windows CCS (Compute Cluster Server) users, you can submit an MPI job to the job scheduler:

mpiexec -n 4 \path2program\PBEAM inputfile outputfile log_directory parameterfile

Please refer to the README.txt in the packages for an explanation of the command-line parameters.
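For scripted runs, the command lines above can be assembled programmatically. The following is a small sketch rather than part of PBEAM: the helper names `build_pbeam_command` and `run_pbeam` are hypothetical, the argument order simply mirrors the command lines shown above, and the meaning of each argument is the one given in README.txt.

```python
import subprocess


def build_pbeam_command(n_processes, program, input_file, output_file,
                        log_directory, parameter_file):
    """Return the argument list for the mpiexec invocation shown above."""
    return [
        "mpiexec", "-n", str(n_processes),
        program, input_file, output_file, log_directory, parameter_file,
    ]


def run_pbeam(*args, **kwargs):
    # Launch the job; raises CalledProcessError if mpiexec exits with a non-zero status.
    return subprocess.run(build_pbeam_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Placeholder paths copied from the example command line; replace with real ones.
    cmd = build_pbeam_command(4, "/path2program/PBEAM", "inputfile",
                              "outputfile", "log_directory", "parameterfile")
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces, which is common on Windows.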

4 Contact

All questions and comments are welcome.

You can direct them to Tao Peng or Pufeng Du at the Department of Automation, Tsinghua University, Beijing, P.R. China, 100084.

pengtao@mails.tsinghua.edu.cn

dpf05@mails.tsinghua.edu.cn