PhyloGibbs Online

PhyloGibbs is an algorithm for discovering regulatory sites in a collection of DNA sequences, including multiple alignments of orthologous sequences from related organisms. Many existing approaches search either for sequence motifs that are overrepresented in the input data or for sequence segments that are more evolutionarily conserved than expected. PhyloGibbs combines these two approaches and identifies significant sequence motifs by taking both overrepresentation and conservation signals into account.

PhyloGibbs runs on arbitrary collections of multiple local sequence alignments of orthologous sequences. The algorithm searches over all ways in which an arbitrary number of binding sites for an arbitrary number of transcription factors can be assigned to the multiple sequence alignments. These binding site configurations are scored by a Bayesian probabilistic model that treats aligned sequences with an explicit model for the evolution of binding sites and 'background' intergenic DNA, one that takes the phylogenetic relationship between the species in the alignment into account. The algorithm uses simulated annealing and Markov chain Monte Carlo sampling to rigorously assign posterior probabilities to all the binding sites that it reports.
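
The sampling idea described above can be illustrated with a much simpler toy. The sketch below is not the PhyloGibbs implementation: it assumes a single motif of fixed width, exactly one site per sequence, unaligned sequences over the letters A, C, G and T, a uniform background, and no phylogenetic model at all. The function names (column_counts, window_log_odds, toy_gibbs) and the annealing schedule are hypothetical and chosen only for illustration. Each step removes one sequence, builds column counts from the sites in the other sequences, and resamples the site in the removed sequence with probability proportional to its motif-versus-background odds raised to an inverse temperature that increases over the run.

```python
import math
import random

BASES = "ACGT"


def column_counts(seqs, starts, width, skip):
    """Count bases in each motif column, leaving out sequence `skip`."""
    counts = [[1.0] * 4 for _ in range(width)]  # pseudocount of 1 per base
    for i, (seq, start) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][BASES.index(seq[start + j])] += 1.0
    return counts


def window_log_odds(seq, start, counts, background=0.25):
    """Log-odds of one window: motif columns versus a uniform background."""
    score = 0.0
    for j, col in enumerate(counts):
        prob = col[BASES.index(seq[start + j])] / sum(col)
        score += math.log(prob / background)
    return score


def toy_gibbs(seqs, width, sweeps=200, seed=0):
    """Toy site sampler: one site per sequence, no phylogeny (assumption)."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for sweep in range(sweeps):
        # Arbitrary annealing schedule: inverse temperature rises from 1 to 3.
        beta = 1.0 + 2.0 * sweep / max(1, sweeps - 1)
        for i, seq in enumerate(seqs):
            counts = column_counts(seqs, starts, width, skip=i)
            n_pos = len(seq) - width + 1
            scores = [beta * window_log_odds(seq, p, counts) for p in range(n_pos)]
            top = max(scores)  # subtract the maximum for numerical stability
            weights = [math.exp(s - top) for s in scores]
            starts[i] = rng.choices(range(n_pos), weights=weights)[0]
    return starts


if __name__ == "__main__":
    example = ["ACGTTGACCA", "TTGACGGATC", "GGATTGACAT"]
    print(toy_gibbs(example, width=5, sweeps=50, seed=1))
```

The toy returns one start position per sequence and reports no posterior probabilities. Estimating those would require recording how often each site is occupied during a sampling phase at fixed temperature, which this sketch leaves out.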

List of the most important features:


PhyloGibbs should be cited as:

Siddharthan R, Siggia ED, van Nimwegen E
PhyloGibbs: A Gibbs sampling motif finder that incorporates phylogeny
PLoS Comput Biol 1(7): e67 (2005)


Siddharthan R, van Nimwegen E, Siggia ED. (2004)
PhyloGibbs: A Gibbs sampler incorporating phylogenetic information,
in Eskin E, Workman C (eds), RECOMB 2004 Satellite Workshop on Regulatory Genomics,
LNBI 3318, (Springer-Verlag Berlin Heidelberg 2005).
NOTE: This is a preliminary report that has been largely superseded by significant changes to the code since then.

