For lots of different organisms—humans, mice, and even a few exotic creatures like seahorses—we’ve figured out the whole genome. We know on which chromosomes specific genes are located, and we also know the order of the millions of bases that make up each chromosome. We organize this information into a “map” of each genome.
A genome contains all of the instructions for building and operating an organism. Having a genome map helps us do things like diagnose and treat diseases, improve crops, track down the basis of inherited traits, and much more.
Read on to learn more about what it takes to build a genome map.
Mapping a species' genome for the first time is a lot of work. But once you've done it, you have a reference sequence that you can use as a basis for comparison.
The human genome is a great example. The first map took more than a decade to complete. The second was published just three years later, and four more the following year. One reason for the increase in speed is that there have been tremendous advances in technology. But another factor is that any newly sequenced human genome is assembled using another human genome as a reference. Once you know the order and sequence of all of the genes, you just need to figure out the 1 out of 1,000 or so nucleotides where that individual's sequence differs from the reference.
Scientists very often make their reference genomes publicly available online. There they are available to help researchers around the world do any kind of genetic study on that organism.
Today, a human genome can be sequenced in a day. To learn more about the advances that make this possible, visit Why the Time is Right.
Some Assembly Required
Putting together a genome map for the first time is a big job. One issue is that genomes contain a huge amount of information: billions of base pairs in complex organisms like crops, pets, and people. Another issue is that current DNA sequencing technology can read only a few hundred to a few thousand bases at a time. You can't just put a chromosome into a machine and read its whole sequence.
To get around the limitations of technology, we break up a genome into millions of pieces that are short enough to sequence. Once we know the sequence of the pieces, we stitch them back together to make a complete genome. In other words, sequencing technology can provide details about individual puzzle pieces, but it takes another step to figure out how those pieces fit together to make entire chromosomes. The final challenge is understanding what all of the sequences mean: which parts are genes, and what those genes do.
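The "stitching" step can be sketched in a few lines of code. The example below is a toy illustration only, assuming error-free reads and using a simple greedy strategy (always merge the two pieces with the longest overlap). The function names, the made-up reads, and the minimum overlap of 3 bases are all assumptions chosen for the example; real assembly software uses far more sophisticated methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of pieces with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Three made-up overlapping reads from a 15-base "genome".
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
assembled = greedy_assemble(reads)
print(assembled)
```

With these three reads, each neighboring pair shares five bases, so the pieces merge back into a single sequence.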
Mapping at Different Levels of Detail
On its own, sequencing and assembling short fragments of DNA generally won't give you enough information to map an entire genome. For example, if a genome contains a lot of repetitive DNA sequences or multiple copies of very similar genes (both are very common scenarios), it won't work to just match up short overlapping DNA sequences—there would be too many possible solutions to the puzzle.
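To see why repeats make the puzzle ambiguous, consider the small sketch below. It builds two made-up genomes that contain the same 10-base repeat three times but have the segments between the repeats in a different order. The sequences, the function name, and the read lengths are hypothetical, and the reads are idealized (error-free, one starting at every position). When the reads are shorter than the repeat, both genomes produce exactly the same collection of reads, so overlap matching alone cannot tell them apart.

```python
from collections import Counter


def reads_of_length(genome: str, k: int) -> Counter:
    """Every length-k window of the genome, with counts (idealized reads)."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


REPEAT = "TTAGGCATTC"  # hypothetical 10-base repeat

# Same pieces, but the two segments between the repeats are swapped.
genome_a = "ACA" + REPEAT + "CCGC" + REPEAT + "GGTG" + REPEAT + "TAT"
genome_b = "ACA" + REPEAT + "GGTG" + REPEAT + "CCGC" + REPEAT + "TAT"

# Reads shorter than the repeat: the two read collections are identical.
same_with_short_reads = reads_of_length(genome_a, 8) == reads_of_length(genome_b, 8)

# Reads long enough to span a repeat plus both flanks: the collections differ.
same_with_long_reads = reads_of_length(genome_a, 16) == reads_of_length(genome_b, 16)

print(same_with_short_reads)  # True
print(same_with_long_reads)   # False
```

The second comparison hints at why longer-range information helps: a fragment that reaches across a whole repeat and into the sequence on both sides pins down which segments are neighbors.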
To make a genome map, researchers generally combine information from multiple methods, some of which are described below. With the help of computer software, fragments of a genome generated from different kinds of mapping can be computationally stitched together.
Using specific sequences of DNA that have fluorescent dyes on them, it is possible to visualize the positions of genes on chromosomes. Whole, condensed chromosomes are visible under a microscope. Through complementary base pairing, the dye-labeled single strands of DNA bind to their matching sequences in the chromosomes, and the fluorescent dyes make these regions light up under the microscope.
Researchers use optical mapping to measure the distances between specific, short DNA sequences (or markers). This method uses DNA fragments that are 100 to 150 times longer than what can be read with DNA sequencing, often providing enough information to read through repetitive or duplicated segments of DNA. To learn more about this technique, visit Physical Mapping.
Depth of Coverage
Depth of coverage refers to the degree to which individual sequencing "reads" overlap one another across the genome. Individual DNA sequencing reads often have small errors, sort of like typos, and areas where the sequence is a little unclear. For example, it may be difficult to tell whether there are 3 T's in a row or 4.
By looking at multiple overlapping reads that cover the same area, it is possible to spot errors and clear up ambiguity. And the more overlapping reads you have—say 2 out of 3 reads agree, or 4 out of 5—the easier it is to find and disregard the errors.
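The idea of letting overlapping reads "vote" can be sketched as follows. This is a minimal illustration, assuming the reads have already been lined up against one another, with "-" marking positions a read does not cover. The function name and the example reads are made up, and a position with no clear majority is reported as "N".

```python
from collections import Counter


def consensus(aligned_reads: list[str]) -> str:
    """Majority base at each position; '-' means the read does not cover it."""
    length = max(len(r) for r in aligned_reads)
    result = []
    for col in range(length):
        bases = [r[col] for r in aligned_reads
                 if col < len(r) and r[col] != "-"]
        if not bases:
            result.append("N")  # no read covers this position
            continue
        base, votes = Counter(bases).most_common(1)[0]
        # Accept the base only if more than half of the covering reads agree.
        result.append(base if votes > len(bases) / 2 else "N")
    return "".join(result)


aligned = [
    "ACGTTTGA--",
    "--GTTTGACC",
    "ACGTATGA--",  # contains one error: A instead of T at the fifth base
]
print(consensus(aligned))  # ACGTTTGACC
```

Here two of the three reads agree on T at the fifth position, so the lone A is treated as an error and dropped.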
Greater depth of coverage also makes it possible to tell the difference between errors and genetic variation. For organisms with two parents, each individual has two copies of every chromosome. Those two copies are nearly but not exactly identical. If half of the reads show an A at a certain position, and half show G, then both are probably correct—they're just reading from different chromosomes.
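A rough sketch of this reasoning is shown below. It looks at the bases that all overlapping reads report at a single position and asks how many distinct bases are well supported. The function name and the 25% cutoff are assumptions made for illustration; real variant-detection software relies on statistical models rather than a fixed threshold.

```python
from collections import Counter


def classify_site(bases: str, min_fraction: float = 0.25) -> str:
    """Classify one genome position from the bases seen in overlapping reads.

    `min_fraction` is an illustrative cutoff: a base seen in fewer than this
    share of reads is treated as a sequencing error.
    """
    counts = Counter(bases)
    total = len(bases)
    supported = sorted(b for b, n in counts.items() if n / total >= min_fraction)
    if len(supported) == 1:
        return f"homozygous {supported[0]}"
    if len(supported) == 2:
        return f"heterozygous {supported[0]}/{supported[1]}"
    return "ambiguous"


# 20 reads, one of which shows G: most likely a sequencing error.
print(classify_site("A" * 19 + "G"))       # homozygous A

# 20 reads split evenly: most likely the two chromosome copies differ.
print(classify_site("A" * 10 + "G" * 10))  # heterozygous A/G
```

With only a handful of reads, a 1-to-2 split could be either case, which is why deeper coverage makes the call more reliable.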
For sequencing a new genome for the first time, researchers usually aim for about 50x coverage (that is, an average of 50 overlapping reads at each position). For detecting human genetic variation in comparison to a reference genome, 10x to 30x coverage is usually deep enough.
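To get a feel for what these numbers mean in practice, average coverage is commonly estimated as the total number of bases sequenced divided by the length of the genome. The short calculation below uses that relationship. The genome size of 3 billion bases (roughly the size of the human genome) and the read length of 500 bases are assumptions picked for illustration, and in a real experiment reads are not spread perfectly evenly, so some regions end up with more or less coverage than the average.

```python
import math


def average_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth: total bases sequenced divided by genome length."""
    return num_reads * read_length / genome_size


def reads_needed(genome_size: int, read_length: int, coverage: float) -> int:
    """Number of reads required to reach a target average coverage."""
    return math.ceil(coverage * genome_size / read_length)


GENOME_SIZE = 3_000_000_000  # assumption: roughly the size of the human genome
READ_LENGTH = 500            # assumption: a read of a few hundred bases

for target in (10, 30, 50):
    n = reads_needed(GENOME_SIZE, READ_LENGTH, target)
    print(f"{target}x coverage needs about {n:,} reads")
```

Under these assumptions, 50x coverage works out to about 300 million reads, while 10x needs about 60 million.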
Genetic Science Learning Center. (2010, December 9) Genome Mapping.
Retrieved July 17, 2018, from https://learn.genetics.utah.edu/content/cotton/genome/