Behind the Sequence

In this blog post we begin to discuss the purpose of whole genome sequencing: what information scientists are looking at and analyzing during the process. The technology is complicated, so this post provides some of the scientific context behind genetic sequencing. Look out for our next few posts, which will continue this discussion and cover the ways scientists are using the information from whole genome sequencing in research and with patients.

All living things, from your house cat to the plants in your window box to the mold growing on the blueberries in your fridge, have a genome. A genome works like a chain. You can think of it like a barrel of monkeys. If each monkey hooked hand-to-hand in the chain represented a base, A, T, C, or G, then the sheer length of the monkey chain would represent the length or size of the genome. Genomes are divided into chromosomes, so in reality the genome is composed of several separate chains of bases. However, whole genome sequencing examines all of these chains at once, as if each chain of bases were set end to end.

The same four bases make up the DNA of all living things, but as we know, you look very different from your house cat and moldy blueberries, and there are even considerable differences between you and other humans. How is this possible? Well, if the same four bases make up every genome in every living thing, then the things that must determine our differences are the length of our genome and the order of the bases that make up the chain. It’s true that the genomes of different species vary considerably in length, but all human genomes are about 3 billion base pairs long, so the differences among humans must lie mainly in the order of the bases themselves.

SNPs (pronounced “snips”), or Single Nucleotide Polymorphisms, are differences in single, individual nucleotides or bases (A, T, C, or G) throughout the genome. While my genome might start off with an A, yours might start with a G, and this would be an example of a single nucleotide polymorphism or SNP. There are about 10-15 million base pair differences in each human genome, and these differences are what are known as SNPs. Together they account for many of the differences that you see among human beings. They can cause differences in physical traits like hair color and height, as well as internal differences, like how people respond to the same medication or how susceptible they are to particular diseases.
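To make the idea concrete, here is a minimal sketch in Python of spotting single-base differences between two short sequences. The function name find_snps and the two six-letter sequences are made up for illustration, and the sketch assumes the sequences are already lined up and the same length; real sequencing pipelines are far more involved.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_a, base_b) for every position where the bases differ.

    Assumes both sequences are already aligned and have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [
        (position, base_a, base_b)
        for position, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


# Hypothetical toy sequences, not real genomic data.
mine = "ATCGGA"
yours = "GTCGTA"

for position, my_base, your_base in find_snps(mine, yours):
    print(f"position {position}: {my_base} vs {your_base}")
```

Here the two toy sequences differ at their first base (A vs G), just like the example above, and at one more position further along the chain.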

Of the 3 billion base pairs that reside inside each and every one of your own cells, 99.5% are identical to those that make up the genome sequence in other humans. If only 0.5% of your base pairs are unique, then how can they be responsible for so many significant differences among human beings?

It’s important to remember that even though only 0.5% of your base pairs are responsible for most of the differences between you and other humans, 0.5% of 3 billion is still 15 million base pairs. Not only is there a lot of room for differences in 15 million SNPs, but one single change can make a world of difference. Some conditions, like Progeria, are caused by a single base pair change. One base pair out of 3 billion is a minuscule difference, but small changes in the genome can have an immense effect on the human body.
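The arithmetic is easy to check. The short Python sketch below takes the two figures used in this post, a genome of about 3 billion base pairs and 99.5% identity between people, and works out how many base pairs the remaining share represents. The function name bases_in_percent is just an illustrative choice.

```python
GENOME_LENGTH = 3_000_000_000   # approximate human genome size in base pairs
PERCENT_IDENTICAL = 99.5        # share of bases identical between two people


def bases_in_percent(genome_length, percent):
    """Number of base pairs that a given percentage of the genome represents."""
    return round(genome_length * percent / 100)


percent_different = 100 - PERCENT_IDENTICAL   # 0.5
differing_bases = bases_in_percent(GENOME_LENGTH, percent_different)
print(f"{percent_different}% of {GENOME_LENGTH:,} is {differing_bases:,} base pairs")
```

Note how sensitive the result is to the decimal point: 0.5% gives 15 million base pairs, while 0.05% would give only 1.5 million.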

SNPs are one of the main things examined in whole genome sequencing. Comparing the sequence of a sick individual with the sequences of healthy individuals and noting the differences can help doctors determine where a problem might be coming from. Larger changes or mutations that, unlike SNPs, involve more than one base can also be found using whole genome sequencing and used in much the same way as SNPs: to determine the cause of something that is occurring in the body.

This video from 23andMe explains the importance of SNPs and how they cause many of the differences we see among people, as well as many that we cannot see. For more information on SNPs and the ways that certain differences are passed down or inherited through generations, see the Genes in Life section on Genetics 101.