Berthelot C, Muffato M, Abecassis J, Roest Crollius H.

Cell Rep. 2015 Mar 24;10(11):1913-24. doi: 10.1016/j.celrep.2015.02.046

The 3D organization of chromatin explains evolutionary fragile genomic regions.

Explains the power-law distribution of evolutionary breakpoints in mammals and yeast using chromosome contacts (Hi-C) and open chromatin (ENCODE).

“We [...] reconstruct the ancestral gene order in the 95-million-year-old ancestral genome of Boreoeutheria, the last common ancestor of primates, rodents, and laurasiatherians. [...] This reconstructed genome was further annotated with respect to its intergenic regions [...] their lengths, GC content and their proportion of conserved non-coding sequence as defined by GERP. [...] We then identified evolutionary rearrangement breakpoints that have occurred in the human, mouse, dog, cow, and horse lineages. [...] We identified a total of 751 breakpoints, 20 of which correspond to independent breakpoint reuse. [...] The identified breakpoints show the typical characteristics of rearrangement breakpoints; i.e., they occur in GC-rich, gene-dense regions possessing lower proportions of conserved non-coding sequence. [...] Breakpoint events per intergene increase as a power law of intergene length rather than a proportionality law. [...] Ancestral intergenes with high CNE content have been disrupted by significantly fewer breakpoints than intergenes of similar length with lower CNE content. [...] Repeated elements and recombination frequencies are distributed radically differently from breakpoints, eliminating them as potential candidates to explain the breakpoint pattern. [...] The density of open chromatin is similar to the pattern of breakpoints with the proportion of DNA in an open state decreasing as intergene size increases. [...] Simulating inversions in the human genome according to contact probability [...] rearrangements were allowed to occur only between open chromatin regions, using chromatin state profiles for different cell types published by the ENCODE consortium. Under this model, the simulated average number of breakpoints per intergene closely reproduces the relationship with intergene length observed in real data.”
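
Note on "power law rather than proportionality": if breakpoints per intergene follow b = a * L^k, then log(b) is linear in log(L) with slope k. Proportionality is the special case k = 1; a slope clearly below 1 means long intergenes break less often than their length alone would predict. The sketch below is only an illustration of that check, not the authors' method. The function name fit_power_law and the data are assumptions made up for this note; the values a = 0.002 and k = 0.3 are arbitrary and are not taken from the paper.

```python
import math


def fit_power_law(lengths, counts):
    """Least-squares fit of log(count) = log(a) + k * log(length).

    Returns (a, k). Points with non-positive values are skipped because
    their logarithm is undefined.
    """
    pts = [(math.log(l), math.log(c))
           for l, c in zip(lengths, counts) if l > 0 and c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive data points")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("all lengths are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    k = sxy / sxx                      # exponent of the power law
    a = math.exp(mean_y - k * mean_x)  # prefactor
    return a, k


# Synthetic, noise-free data (NOT from the paper): intergene lengths in bp
# and mean breakpoints per intergene generated from an assumed exponent.
lengths = [1e3, 1e4, 1e5, 1e6]
counts = [0.002 * l ** 0.3 for l in lengths]

a, k = fit_power_law(lengths, counts)
is_proportional = abs(k - 1.0) < 0.05  # k close to 1 would mean b is proportional to L
```

On real data the points would be noisy, so the fitted slope should be read together with its uncertainty (for example from binned averages or bootstrap resampling) before ruling out k = 1.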