What really happened when the human genome was deciphered – and what is happening now
The Human Genome Project was an international research project whose main goal was to determine the sequence of nucleotides that make up human DNA and to identify the 25,000 to 30,000 genes in the human genome. It has been called the largest international collaboration ever undertaken in biology. (Wikipedia)
The sequence of the human genome is contained in our DNA and is made up of long chains of “base pairs” that form our 23 pairs of chromosomes. Along the chromosomes lie the sequences of base pairs that make up our roughly 30,000 genes.
21 years ago, US President Bill Clinton, British Prime Minister Tony Blair and Celera Genomics Corporation announced that the “initial sequencing of the human genome” under the Human Genome Project, that is, the initial determination of its DNA sequence, had been completed. From that moment on, biology entered the “post-genomic era.”
Calling 2000 “the year the human genome was decoded” is accurate only with reservations: scientists had produced only a first “draft” of the DNA sequence of the human chromosomes, with many gaps. The “clean” version appeared three years later, in 2003. With that, the Human Genome Project ended, but the genome had not truly been decoded. Interpreting and extending the data obtained then continues to this day.
The Human Genome Project is sometimes called the most successful international scientific collaboration in the history of science. At the start, however, the scientific community was far from unanimously optimistic: the preparations were accompanied by public debate and scathing articles whose authors argued that reading the sequence of human DNA was impossible and that taxpayers’ money should be spent on something more useful.
Although the technical ability to sequence a genome (that is, to determine its sequence) had been demonstrated back in the 1970s, when the first viral genome was deciphered, researchers did not immediately dare to take on the human genome. The idea took shape thanks to molecular biologist Robert Sinsheimer of the University of California, Santa Cruz.
Sinsheimer was not only a major researcher but also an outstanding organizer of science. As head of the University of California, Santa Cruz, he turned it into a first-class academic institution. In 1985, he organized a symposium at the university at which he presented his idea of determining the complete sequence of nucleotides in human DNA. This idea later grew into the powerful international Human Genome Project.
His colleagues in astronomy were then working on building what would be the largest ground-based telescope of its time, and Sinsheimer was thinking about a project of comparable magnitude in biology.
In 1985, he brought together several leading geneticists to discuss a project for sequencing the human genome. The team concluded that the idea was tempting but not feasible.
At that time, even the E. coli genome, “only” about five million base pairs in size, had not been deciphered, and the longest nucleotide sequence that could be read in one run by Sanger’s method was several hundred nucleotides.
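To give a feel for the scale that made the idea look unfeasible, here is a rough back-of-the-envelope sketch in Python. It assumes a read length of 500 nucleotides as a stand-in for “several hundred” (an assumption, not a figure from the article) and computes only a lower bound: the number of non-overlapping reads needed to cover a genome once. Real projects needed many times more, because reads must overlap to be assembled.

```python
def min_reads(genome_size_bp: int, read_length_bp: int) -> int:
    """Lower bound on the number of reads needed to cover a genome once,
    assuming perfectly placed, non-overlapping reads (ceiling division)."""
    if genome_size_bp <= 0 or read_length_bp <= 0:
        raise ValueError("sizes must be positive")
    return (genome_size_bp + read_length_bp - 1) // read_length_bp


# Assumed read length standing in for "several hundred nucleotides".
READ_LENGTH_BP = 500

ecoli_reads = min_reads(5_000_000, READ_LENGTH_BP)      # E. coli, about 5 million bp
human_reads = min_reads(3_000_000_000, READ_LENGTH_BP)  # human, about 3 billion bp

print(f"E. coli: at least {ecoli_reads:,} reads")
print(f"Human:   at least {human_reads:,} reads")
```

Under these assumptions, the human genome needs about 600 times as many reads as E. coli, a genome that had itself not yet been sequenced in 1985.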
The discussion was attended by Walter Gilbert, who, 10 years earlier, had proposed his method of DNA sequencing (known as the Maxam-Gilbert method or the method of chemical degradation of DNA), almost simultaneously with Frederick Sanger.
Walter Gilbert was inspired by the idea of creating a genomic institute and brought in James Watson, co-discoverer of the structure of DNA, and Charles DeLisi, who headed the health and environment division of the US Department of Energy. DeLisi saw the genome project as a logical continuation of research on the effects of radiation on humans. By 1986, they had already estimated the cost of sequencing the human genome.
Skeptical colleagues estimated that the project would take decades of routine work if the DNA were “read” by small research teams, which in their opinion was the only way the work could be done well. The amount of work to be done seemed enormous. Watson and DeLisi, however, decided to bet on large automated centers and international cooperation. The final plan for the US part of the project was designed to run for 15 years at a cost of three billion dollars.
This figure seems large, but the Apollo space program, carried out twenty years earlier, cost the Americans about 10 times more (not adjusted for inflation). At the same time, the scientists behind the Human Genome Project promised something no less significant than a flight into space: at the very least, to understand the nature of 4,000 hereditary diseases and to advance medical genetics and related technologies.
Despite the criticism and the price tag, they managed to push the project through both the Department of Energy and the US National Institutes of Health (NIH). In 1990, the project started.
The $3 billion project was formally launched in 1990 by the US Department of Energy and the National Institutes of Health and was expected to last 15 years. Researchers in the International Human Genome Sequencing Consortium, from 20 scientific groups in the USA, Great Britain, Germany, France, Japan, China and Russia, worked on the project.
A few years after the start, the lone marathon of the international consortium became a race. Craig Venter, who originally headed one of the laboratories at the US National Institutes of Health (NIH), developed a new way to study genomes called “expressed sequence tags”, which significantly accelerated the process of finding genes from their transcripts.
Armed with this technology and the backing of venture capitalists, he left the NIH and founded The Institute for Genomic Research (TIGR).
In 1998, Venter teamed up with a manufacturer of automated sequencers to form a company named Celera Genomics and announced that it, too, would work on sequencing the human genome. Starting eight years later than the Human Genome Project, Venter intended to complete the task in just three years, while the international consortium did not expect to finish for another seven.
His company planned to capitalize on this by patenting genes associated with hereditary diseases. However, in 2000 Clinton declared that the genome sequence was public and could not be patented, so the businessman’s efforts were, in a sense, in vain.
The emergence of a competitor spurred the Human Genome Project on, and its goal was eventually achieved two years ahead of schedule. The federal project reached an agreement with Celera, and the results of both projects were announced together at a single press conference on June 26, 2000.
Present in the room were Jim Watson, the founder of the Human Genome Project, and John White, director of PE Corporation, which sponsored Venter. Both made it clear that the war had ended in an uneasy peace. The article by Venter’s group was published in Science a day after the consortium’s article on the human genome appeared in Nature.
By the time the project was launched in 1990, only a few short viral genomes and plasmids (auxiliary circular DNA molecules from bacteria), no larger than tens of thousands of base pairs, had been sequenced. The Human Genome Project set out to read a genome several orders of magnitude larger: three billion base pairs, which is how many “letters” a single set of human chromosomes (23 chromosomes) contains. Most researchers expected this “chronicle” to contain about 100 thousand genes.
Unsurprisingly, this task seemed insurmountable to many leading geneticists. However, as the project progressed, advances in technology made it easier for scientists to work.
First results and further development
By 2000, it was possible to get an idea of the human DNA sequence within euchromatin, the regions that are actively transcribed, that is, read by RNA polymerase. According to scientists, euchromatin makes up about 95 percent of the entire genome. The rest of the DNA is hidden in tightly packed complexes with proteins and is silent most of the time.
After sequencing, the estimated number of genes in the human genome had to be revised down from 100 thousand to 30 thousand. This is only about twice as many as in a fly or a worm, the authors of the historic publication in Nature wrote.
Scientists also learned that the human genome contains many repeats and mobile elements, the vast majority of which no longer function. In addition, the human genome is very diverse: geneticists estimated that the number of single-nucleotide polymorphisms in it (positions at which different people can have different nucleotides) reaches 1.5 million. This became clear partly because the project used DNA from a large number of volunteers rather than from one person.
Genome for medicine
In the twenty years since the draft version of the genome was assembled, sequencing and sequence analysis technologies have advanced so far that today finding out the sequence of the coding regions of your genome (the exome) costs not three billion dollars but only a few hundred.
Genomic studies can show that carriers of one variant of a gene develop a disease, for example, five times more often than carriers of another variant. This knowledge can help you adjust your lifestyle to minimize the likelihood of adverse health effects. But the calculation of disease risk rests entirely on statistics, and the mathematical methods behind it still have room to develop.
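The “five times more often” comparison is what statisticians call a relative risk. The sketch below shows one simple way such a figure could be computed from counts of affected and unaffected carriers. The counts are hypothetical numbers chosen only for illustration, and real studies also correct for confounding factors and report confidence intervals, which this sketch does not attempt.

```python
def relative_risk(cases_a: int, total_a: int, cases_b: int, total_b: int) -> float:
    """Ratio of disease frequency among carriers of variant A
    to disease frequency among carriers of variant B."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group sizes must be positive")
    if cases_b == 0:
        raise ValueError("relative risk is undefined when group B has no cases")
    # (cases_a / total_a) / (cases_b / total_b), rearranged to a single division
    return (cases_a * total_b) / (cases_b * total_a)


# Hypothetical counts, for illustration only:
# 50 of 1,000 carriers of variant A are affected, versus 10 of 1,000 for variant B.
rr = relative_risk(cases_a=50, total_a=1000, cases_b=10, total_b=1000)
absolute_risk_a = 50 / 1000

print(f"Relative risk: {rr:.1f}")
print(f"Absolute risk for variant A carriers: {absolute_risk_a:.1%}")
```

Note that in this made-up example a fivefold relative risk still corresponds to an absolute risk of only 5 percent, which is one reason such figures need careful interpretation.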
As for predicting a child’s aptitude for sports or music from the genome, this is, of course, still in the realm of fantasy. However, genome and exome sequencing are already being applied to fetal genetics, helping parents who carry harmful mutations to have “healthy” children.