Monday, September 10, 2012

Human regulatory network architecture.

Architecture of the human regulatory network derived from ENCODE data
Gerstein et al. 2012 Nature.

This paper uses ChIP sequencing to identify binding sites for 119 transcription factors in 5 cell lines (from only one human, I think, though I'm not sure). They used this data to construct a network of transcription factors and the genes they regulate. Their overall goal was to describe the architecture of the regulatory network, identify correlations between network position and other genomic properties, and test whether selection acts differently on different places in the network.

They have a lot of results, and a lot of the data presented in the main text feels a bit anecdotal, so instead of providing a laundry list of all of them, I'll just point out things I found interesting with the caveat that I don't really understand most of their methods.

1) They looked at situations where two transcription factors have an overlapping binding site, which they call coassociation. Transcription factors tend to coassociate with different partners in sites that are near a gene ('proximal') and far from a gene ('distal'). However, this conclusion appears to be based on supplementary figure 2C3, which only shows associations between one focal transcription factor and those factors that differ between proximal and distal sites.

2) The researchers constructed a network of associations between transcription factors and their targets and found they could group transcription factors into three levels of hierarchy. Highly connected factors tend to be highly expressed across tissues, which is unsurprising to me.

3) The researchers used diversity data from the 1000 Genomes Project to measure constraint on target genes and transcription factors. They found the strongest constraint on genes that are regulated by many transcription factors, followed by transcription factors that regulate many genes. They also found that transcription factors at the top level of the network are more constrained than those at the middle and lower levels.

4) They also took a stab at one of my pet interests: allele-specific expression. It's a bit complicated, but what I think is going on is that when transcription factors bind preferentially to an allele, this allele is also more likely to be preferentially expressed downstream. However, this section is really unclear to me because allele-specific expression is generally defined as any difference in expression level between alleles, not a preference for one allele, so I'm not sure what they mean when they say things like "X% of genes show allele-specific expression from the paternal allele". If my interpretation is right, then this suggests that most allele-specific binding is enhancing expression? But who knows. It's a bit frustrating that with 271 pages of supplement, they can't find the space to clearly define their terms.

5) Finally, the researchers compared diversity in transcription factor binding sites that show allele-specific binding to those that don't. They found that the allele-specific sites have a higher SNP density, suggesting that they're under less constraint than binding sites without allele-specific binding. The authors think that this result is 'surprising'. I don't find it surprising AT ALL. If the genetic variation that causes allele-specific binding is deleterious and subject to purifying selection (which we think is the case for most variation), then variants that alter binding should persist mainly at sites where constraint is weak, and this result makes perfect sense.
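
To make the comparison in point 5 concrete, here is a minimal sketch of how SNP density could be compared between two classes of binding sites. This is not the authors' pipeline, which I don't fully understand. The function name, the intervals, and the SNP positions are all invented for illustration, and the sketch assumes non-overlapping sites on a single chromosome.

```python
from bisect import bisect_left

def snp_density(sites, snp_positions):
    """SNPs per base pair across half-open (start, end) intervals.

    Hypothetical helper: assumes all sites lie on one chromosome
    and do not overlap each other.
    """
    snps = sorted(snp_positions)
    total_bp = 0
    total_snps = 0
    for start, end in sites:
        total_bp += end - start
        # Count SNPs with start <= position < end.
        total_snps += bisect_left(snps, end) - bisect_left(snps, start)
    return total_snps / total_bp if total_bp else 0.0

# Toy data, made up for illustration only.
allele_specific_sites = [(100, 120), (300, 320)]
other_sites = [(500, 520), (700, 720)]
snps = [105, 110, 310, 505, 900]

as_density = snp_density(allele_specific_sites, snps)
other_density = snp_density(other_sites, snps)
print(as_density, other_density)
```

In this toy example the allele-specific sites carry three SNPs in 40 bp and the other sites carry one SNP in 40 bp, which is the direction of the difference the paper reports. A real analysis would also need a test of whether the difference is larger than expected by chance.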

2 comments:

  1. I'm confused about point #4. Could you clarify a bit? I thought that allele-specific expression was defined as a preference for one allele? (After all, if there is a difference between two alleles, isn't one 'preferred'?) Talking about ASE always makes me think about imprinting, which is what I assume they're getting at with paternal expression.

  2. Yeah, I didn't explain that well. Allele-specific expression is defined as preferential expression of one allele of a gene, and the term is usually applied to the gene. I've never noticed it used in the way the authors did -- to say that an allele itself shows allele-specific expression, meaning that it's the preferentially expressed allele. But the world of human systems biology is pretty far away from E&E, so they clearly have a different way of using the term.
