Finishing the Human Genome

A consortium of researchers announces it’s finally sequenced the complete genome, uncovering more than 100 new genes.

By Nathaniel Scharping
Dec 24, 2021 6:00 AM | Mar 21, 2023 8:11 PM
Colorful bars representing DNA’s four complementary bases
(Credit: Gio.tto/shutterstock)

This article appeared in the January/February 2022 issue of Discover magazine as "Finishing the Human Blueprint."


At long last, scientists have declared “mission accomplished” on the complete sequencing of the human genome — one of the most ambitious research undertakings of the past few decades. The news may trigger déjà vu: Scientists with the Human Genome Project first announced they had sequenced the human genome in 2003.

That initial effort came with some notable omissions, though. A sizable chunk of the genome remained inaccessible because the era’s technology could not parse more complex DNA regions. Though additional work added more clarity, around 8 percent of the human genome remained a mystery until 2021, when an international collaboration called the Telomere-to-Telomere (T2T) Consortium filled the gaps.

Many of these tricky regions include long stretches of highly repetitive DNA sequences. Though they often don’t code for proteins, the body’s building blocks, these sequences likely contain important clues to understanding rare genetic diseases, says Karen Miga, a satellite DNA biologist at the University of California, Santa Cruz. The sections might also alter what is known about the basics of human biology, such as cell division.

“We had a pretty darn good first sequence of the human genome,” says Eric Green, director of the National Human Genome Research Institute and a member of the Human Genome Project. But when it came to more complex stretches of the genome, the computers and “the little chemical tricks we do in the test tube, they just choke.”

Initially, scientists used the so-called “shotgun sequencing” technique. It broke longer DNA sequences into small, overlapping pieces that computer algorithms sometimes struggled to stitch back together. Today, more advanced methods let geneticists read continuous stretches hundreds of thousands of base pairs (the “letters” that compose DNA) long, and occasionally millions. That allowed them to “thread through and resolve some of these trickier bits,” says Miga, who helped lead the recent project.
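To see why short pieces stumble on repetitive DNA, consider a toy illustration. The Python sketch below is not the consortium's software or a real assembler. It cuts a made-up sequence into overlapping reads and counts how many of them match more than one spot in that sequence. A read that matches several places cannot be positioned with confidence. The sequence, the read lengths, and the function names are all invented for the example.

```python
def make_reads(genome: str, read_len: int, step: int = 1) -> list[str]:
    """Cut a sequence into overlapping reads of a fixed length."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def count_occurrences(genome: str, read: str) -> int:
    """Count every position (overlaps included) where the read matches."""
    count = 0
    start = genome.find(read)
    while start != -1:
        count += 1
        start = genome.find(read, start + 1)
    return count


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Fraction of reads that match more than one place in the sequence."""
    reads = make_reads(genome, read_len)
    ambiguous = sum(1 for r in reads if count_occurrences(genome, r) > 1)
    return ambiguous / len(reads)


# Hypothetical toy data: two unique flanks around a repetitive block.
REPEAT = "GATTACA" * 6                      # 42 bases of a repeated motif
TOY_GENOME = "ACGTTGCAAGCT" + REPEAT + "TTCGGACCTAGA"

# Short reads fit entirely inside the repeat; long reads span it.
short_frac = ambiguous_fraction(TOY_GENOME, 8)
long_frac = ambiguous_fraction(TOY_GENOME, 50)

print(f"8-base reads with more than one match:  {short_frac:.0%}")
print(f"50-base reads with more than one match: {long_frac:.0%}")
```

In this toy case most of the 8-base reads fall wholly inside the repeat and look identical to their neighbors, while every 50-base read reaches into at least one unique flank and so has a single possible position. Real genomes and real sequencing errors make the problem far harder, but the principle is the same: reads longer than the repeat can anchor it.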

That effort, involving dozens of scientists from around 30 institutions, finalized the human genome sequence in a series of papers posted to bioRxiv, a preprint server, in May 2021. The researchers added nearly 200 million base pairs to the archive of the genome, including 115 genes that likely code for proteins.

The new additions offer a wealth of information for geneticists to comb through. Some genes “probably have new roles that we haven’t even imagined yet for how the cell functions,” Miga says.

In the meantime, there’s work still to be done. For one, the current version of the genome represents a single person. The T2T team, now merged with the Human Pangenome Reference Center at Washington University, is working to add more diverse sequences to their database — so the human genome may contain further surprises.

Copyright © 2024 LabX Media Group