IQ Biology Blog: Probabilistic Models in Genomics Conference
The Probabilistic Models in Genomics conference at Cold Spring Harbor (Oct. 14-17) featured talks from Richard Durbin, Michael I. Jordan and Elizabeth Thompson. Being a graduate student of only three years (and maybe still naïve to the ways of scientific etiquette), I was unabashedly giddy. To name only a small portion of their contributions, these speakers helped shape entire fields: Hidden Markov Models (Durbin), Bayesian nonparametrics (Jordan) and pedigree biostatistics (Thompson). I signed up for the conference to present a poster, and although I received nice comments and very useful questions, this conference was less about asserting my work and more an attempt to gain new perspective from the many talks I attended. In short, I would rank this as a “must attend” conference for any budding computational biologist / bioinformatician / biostatistician.
The big “take home” for me is the current status of gene regulatory inference. As I see it, the very ambitious goal of this field is to define, for each gene, its key regulators. Whether this regulation arises from open chromatin (ATAC-seq was certainly a popular topic at CSH) or from causal linkage to another gene (via expression data), this field attempts to integrate all forms of high-throughput sequencing data in the hope of working out the vastly complicated regulatory network (referred to, unoriginally, as the regulome). Competitions like the DREAM challenge provide “gold standards” of experimentally validated gene networks, and I was surprised to learn that current state-of-the-art algorithms achieve accuracies of around 20 percent. In this era of “Big Data,” why the poor level of prediction?
Well, genome-wide association studies show that 82 percent of disease-associated single nucleotide polymorphisms occur inside, or are in linkage disequilibrium with, enhancer regions. What matters is not necessarily the protein-coding regions (although mutations in p53 and BRCA1 are critical to cancerous phenotypes), but how genes are regulated by the vast non-coding regions of our genome. Network inference via microarray or RNA-seq expression has little to say about these regions, given that we measure changes only over annotated, coding regions.
We might look instead to the emerging technologies of Hi-C and ChIA-PET. Excitingly, these methods pair genomic loci by their three-dimensional proximity, effectively destroying the old notion that the genome is some static, linear coordinate system. In short, I would encourage those working extensively with expression data to collaborate with those in the 3D-genome community, as together we might annotate enhancer-gene interactions more extensively and work out some of the latent and causal intermediaries between distinct genes.
In total, the conference surveyed many diverse fields, and I can’t do justice to its far-reaching scope in a single blog post. But I was exhilarated by the many collaborations and the exciting atmosphere that surrounds probabilistic modeling in biology. With any luck, upcoming years of this conference will feature more Interdisciplinary Quantitative Biology students.