One of the challenges of working with ancient DNA samples is that damage accumulates over time, breaking up the structure of the double helix into ever smaller fragments. In the samples we've worked with, these fragments scatter and mix with contaminants, which makes reconstructing a genome a major technical undertaking.
But a dramatic paper released on Thursday shows that this scattering isn't inevitable. Damage does create progressively smaller fragments of DNA over time, but if the fragments are trapped in the right sort of material, they'll stay right where they are, essentially preserving some key features of ancient chromosomes even as the underlying DNA decays. Researchers have now used that to detail the chromosome structure of mammoths, with some implications for how these mammals regulated some key genes.
DNA meets Hi-C
The backbone of DNA's double helix consists of alternating sugars and phosphates, chemically linked together (the bases of DNA are chemically linked to these sugars). Damage from things like radiation can break these chemical linkages, with fragmentation increasing over time. When samples reach the age of something like a Neanderthal, very few fragments are longer than 100 base pairs. Since chromosomes are millions of base pairs long, it was thought that this would inevitably destroy their structure, as many of the fragments would simply diffuse away.
But that will only be true if the medium they're in allows diffusion. And some scientists suspected that permafrost, which preserves the tissue of some now-extinct Arctic animals, might block that diffusion. So, they set out to test this using mammoth tissues, obtained from a sample termed YakInf that's roughly 50,000 years old.
The challenge is that the molecular techniques we use to probe chromosomes take place in liquid solutions, where fragments would just drift away from each other in any case. So, the team focused on an approach termed Hi-C, which specifically preserves information about which bits of DNA were close to each other. It does this by exposing chromosomes to a chemical that will link any pieces of DNA that are in close physical proximity. So, even if those pieces are fragments, they'll be stuck to each other by the time they end up in a liquid solution.
A few enzymes are then used to convert these linked molecules to a single piece of DNA, which is then sequenced. This data, which will contain sequence information from two different parts of the genome, then tells us that those parts were once close to each other inside a cell.
Interpreting Hi-C
On its own, a single bit of data like this isn't especially interesting; two bits of genome might end up next to each other at random. But when you have millions of bits of data like this, you can start to construct a map of how the genome is structured.
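To make the idea of a map concrete, here is a minimal sketch of how millions of linked pairs could be tallied into a table of contacts. Everything in it is an illustrative assumption rather than the researchers' actual pipeline: the 1-megabase bin size, the function names `bin_of` and `contact_map`, and the made-up example positions.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # hypothetical resolution: 1-megabase bins


def bin_of(chrom, pos, bin_size=BIN_SIZE):
    """Map a genomic coordinate to a (chromosome, bin index) pair."""
    return (chrom, pos // bin_size)


def contact_map(pairs, bin_size=BIN_SIZE):
    """Count how often each pair of bins ended up linked together.

    `pairs` is an iterable of ((chrom_a, pos_a), (chrom_b, pos_b)) tuples,
    one per sequenced Hi-C molecule.
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in pairs:
        a = bin_of(chrom_a, pos_a, bin_size)
        b = bin_of(chrom_b, pos_b, bin_size)
        # Store each pair in a canonical order so (a, b) and (b, a) merge.
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts


# Made-up contacts, purely for illustration.
example_pairs = [
    (("chr12", 1_200_000), ("chr12", 1_900_000)),
    (("chr12", 1_950_000), ("chr12", 1_100_000)),
    (("chr12", 1_500_000), ("chr12", 7_300_000)),
    (("chr12", 2_400_000), ("chr3", 5_000_000)),
]
example_map = contact_map(example_pairs)
```

A single count in `example_map` says little, but with enough pairs the counts for each pair of bins start to reflect how often those two stretches of genome sat near each other.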
There are two basic rules governing the pattern of interactions we'd expect to see. The first is that interactions within a chromosome are going to be more common than interactions between two chromosomes. The second is that, within a chromosome, parts that are physically closer to each other on the molecule are more likely to interact than those that are farther apart.
So, if you are looking at a specific segment of, say, chromosome 12, most of the locations Hi-C will find it interacting with will also be on chromosome 12. And the frequency of interactions will go up as you move to sequences that are ever closer to the one you're interested in.
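Both rules can be checked with a simple tally. The sketch below is a toy under stated assumptions, not the paper's analysis: the function name `summarize_contacts`, the 1-megabase distance bins, and the example positions are all invented for illustration.

```python
from collections import Counter


def summarize_contacts(pairs, distance_bin=1_000_000):
    """Return (same-chromosome fraction, Counter of separations).

    `pairs` is an iterable of ((chrom_a, pos_a), (chrom_b, pos_b)) tuples.
    Separations are only defined for contacts within one chromosome and
    are grouped into bins of `distance_bin` base pairs.
    """
    same = 0
    total = 0
    by_distance = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in pairs:
        total += 1
        if chrom_a == chrom_b:
            same += 1
            by_distance[abs(pos_a - pos_b) // distance_bin] += 1
    same_fraction = same / total if total else 0.0
    return same_fraction, by_distance


# Made-up contacts: three within chromosome 12, one reaching chromosome 3.
toy_pairs = [
    (("chr12", 1_200_000), ("chr12", 1_900_000)),
    (("chr12", 1_950_000), ("chr12", 1_100_000)),
    (("chr12", 1_500_000), ("chr12", 7_300_000)),
    (("chr12", 2_400_000), ("chr3", 5_000_000)),
]
same_fraction, by_distance = summarize_contacts(toy_pairs)
```

In real data, the first rule would show up as a high same-chromosome fraction, and the second as counts in `by_distance` that fall off as the separation grows.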
On its own, Hi-C can help reconstruct a chromosome even if you start with nothing but fragments. But the exceptions to the expected pattern also tell us things about biology. For example, genes that are active tend to be on loops of DNA, with the two ends of the loop held together by proteins; the same is true for inactive genes. Interactions within these loops tend to be more frequent than interactions between them, subtly altering the frequency with which two fragments end up linked together during Hi-C.
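One common way to spot exceptions like these is to divide each observed count by the count you'd expect for that separation, so anything well above 1 stands out. The following is a minimal sketch of that "observed over expected" idea on a made-up five-bin chromosome. The function name, the matrix values, and the use of a plain per-distance average as the expectation are all assumptions for illustration, and nothing here should be read as the method used in the paper.

```python
def observed_over_expected(matrix):
    """Divide each contact count by the mean count at the same separation.

    `matrix` is a square, symmetric list of lists in which matrix[i][j]
    is the number of contacts between bin i and bin j of one chromosome.
    """
    n = len(matrix)
    # Expected count at separation d: the average along diagonal d.
    expected = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    ratio = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            e = expected[abs(i - j)]
            ratio[i][j] = matrix[i][j] / e if e else 0.0
    return ratio


# Made-up counts: bins 1 and 3 touch more often than their distance predicts.
toy_matrix = [
    [20, 6, 2, 1, 1],
    [6, 20, 6, 8, 1],
    [2, 6, 20, 6, 2],
    [1, 8, 6, 20, 6],
    [1, 1, 2, 6, 20],
]
toy_ratio = observed_over_expected(toy_matrix)
```

In this toy, `toy_ratio[1][3]` comes out at 2.0, meaning those two bins meet twice as often as the average for bins two steps apart, which is the kind of signal a loop might leave.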