Human Genome Project: 18 years since it was declared complete

Since Francis Collins connects both BioLogos and the Human Genome Project, and because genetics has been showing up around me quite often lately, my thoughts have recently been orbiting around it. I’m not a biologist; mathematical physics is the only science I know a little about, so it is impossible for me to assess the project’s legacy since it was declared complete in 2003.

A rounder date, like 2023, would probably bring much more attention to it. At the same time, I found information that the “complete genome” phase ended only in May of this year (2021). Does anyone here know what this means for biology? And what have we learned from it since 2003? I think this is a good place to talk about it a little.

Great topic. Knowledge of the genome has led to profound advances in medical genetics. Parents and future parents can receive counseling, genetic conditions can be diagnosed, treatment can be customized, and the risk of future disease can be assessed and acted on. While many specific examples could be given, I’ll give one that directly affected a friend and affects many others. My friend had several close relatives diagnosed with breast cancer, and genetic testing showed her positive for a BRCA gene mutation, placing her at very high risk for breast and ovarian cancer. She elected for removal of her ovaries and a prophylactic mastectomy (which is not for everyone but was her preference) and now has a much better chance of living a long and healthy life.

If the HGP did say that they had a complete sequence of the human genome in 2003 then they would have been wrong. Francis Collins described it as the “first draft of the human genome” which is probably the best description. There wasn’t a gap-free sequence of the reference genome until very recently (May 2021). I wouldn’t be surprised if there was still some QC to be done on the May 2021 sequence, but I could be wrong.

What it means for science is that people can look for important function in the last few percent of DNA that wasn’t a part of the reference sequence. The last chunks of the genome to be sequenced were highly repetitive, so there wasn’t an expectation of finding a lot of new genes, but it looks like they were still able to find a few coding genes.

I haven’t read the entire paper, but I would guess that the 2,226 paralogous gene copies are mostly pseudogenes of gene duplicates.

Will any of these newly discovered genes and sequences be important in medicine? Who knows. They could be. I’m sure there are many scientists crunching that data right now. One of the first things I would do is look at old RNA-seq data to see if there are any differentially expressed RNAs that match up to these newly discovered genes. In other words, are any of these genes expressed differently in cancers, heritable diseases, infections, or other pathologies? The data is probably sitting out there right now, ready for someone to run the analysis.
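As a rough illustration of that kind of check, here is a minimal Python sketch that compares expression between two groups of samples using a log2 fold change. Everything in it is an assumption made for illustration: the gene names, the counts, the pseudocount, and the threshold are all invented, and a real analysis would use a dedicated statistical package with proper normalization and significance testing rather than a bare fold change.

```python
import math

# Hypothetical normalized expression values per gene, one value per sample.
# Gene names and numbers are invented for illustration only.
normal = {
    "NEW_GENE_A": [10.0, 11.0, 12.0],
    "NEW_GENE_B": [5.0, 6.0, 4.0],
    "NEW_GENE_C": [6.0, 7.0, 8.0],
}
tumor = {
    "NEW_GENE_A": [45.0, 47.0, 49.0],
    "NEW_GENE_B": [4.0, 5.0, 6.0],
    "NEW_GENE_C": [1.0, 1.0, 1.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def log2_fold_change(case, control, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return math.log2((mean(case) + pseudocount) / (mean(control) + pseudocount))


def flag_changed_genes(case_counts, control_counts, threshold=1.0):
    """Return {gene: log2FC} for genes whose |log2FC| meets the threshold."""
    flagged = {}
    for gene in sorted(case_counts.keys() & control_counts.keys()):
        lfc = log2_fold_change(case_counts[gene], control_counts[gene])
        if abs(lfc) >= threshold:
            flagged[gene] = lfc
    return flagged


flagged = flag_changed_genes(tumor, normal)
for gene, lfc in flagged.items():
    print(f"{gene}: log2FC = {lfc:+.2f}")
```

With this toy data, the first and third genes are flagged (one up, one down in the tumor samples) and the second is not. A fold change alone says nothing about statistical significance, so this is only the first filtering step.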

Whoever declared it finished is a fool, as a human is not a single-celled organism. We should understand by now that what makes us human is far more than the genome that people have looked at. Genetically speaking, the genome present in a single stem cell is a genetic minority in a human :slight_smile:

I don’t see how those two are connected. We can determine the number of bases missing in our sequencing runs while also believing that humans are more than the DNA sequence of their genome.

What do you mean by that?

There are at least a few public talks by Francis Collins in which he states that the HGP was finished “ahead of schedule and under budget” (he used similar words). So I was quite surprised when I found information about the full genome being delivered a few months ago.

Of course, a public talk doesn’t need to be as precise and rigorous as a scientific article, so Prof. Collins or anybody else is allowed to make such simplifications. I just imagine that he meant something else by “the HGP was finished”.

They never had the goal of getting a complete sequence, so that makes sense. The very rapid development of sequencing techniques in the late 1990’s is what made it possible to get the first draft sequences done under budget and ahead of schedule. They were also in a race with Celera which spiced things up.

Scientific jargon and hedging don’t always translate well into common language. For example, “the initial sequence is complete” is translated to “the sequencing is complete”.

Apart from the genetic variability within the cells derived from the original, most of the cells that make up a living human are acquired along the way :slight_smile:

Even if you choose to define a living human that way, this has nothing to do with the sequencing of a complete human genome.

There are somatic mutations along the way, and from what I have read they outnumber the number of mutations passed along in the germline cells.

I think we are all aware that we start out as a single cell and develop into a full grown human through cellular replication.

Still missed the point. Humans are symbiotic organisms, so if we only look at the genome of the cells without walls we miss a lot :slight_smile: It is the interaction that counts

There are efforts to sequence the human microbiome, though it’s not easy to be confident about what might just be passing through and what is actually resident in some fashion. (Note that the popular claim that far more cells in you are bacterial than human is wrong; it was based on one rough estimate that got popularized, and a more systematic calculation suggests that the numbers are similar to each other in order of magnitude.) But because there are many different microbes, not evenly distributed, thorough sampling is much harder than targeting the genome, which can be documented from any one cell. (Yes, there are going to be a number of individual mutations, but the level of variation is generally small unless something’s wrong, e.g. cancer).

No one is saying anything different. No one is claiming that the DNA sequence of the human genome holds every answer to every question. To use an analogy, we can determine how many pages are missing in a dictionary without also believing that a dictionary holds the meaning of life.

On the other hand, most human cells don’t have genomes in them.

Actually, all live human cells have genomes - the set of all genes in them. Mature mammal red blood cells lose the nucleus, but still have mitochondrial genomes. Depending on your definition, platelets might also be thought of as cells that lose the nucleus, or just as pieces of cells.

As another complication, “genomic sequencing” or “genomic analysis” often refers to any technique that gets a lot more sequence at a time than the old one gene at a time approach. This often is only a subset, and sometimes a small subset, of the total genome, but still is a lot of data. E.g., the name “genomic skimming” conveys the idea that this is a relatively small and select bit of the total genome.

Mature erythrocytes lose both their nuclei and their mitochondria and therefore do not have genomes. Since most human cells are mature erythrocytes, most human cells do not have genomes.

Is “genome” a synonym for “complete set of DNA”? Because I thought every cell carries a complete set of our DNA. Not that I teach any life science courses, but still, I should stop repeating that outsider’s “folk wisdom” if it isn’t true.

There is some ambiguity, but basically yes.

Every cell but red blood cells (and also cells in the lens of the eye, I gather). Those cells lose their DNA at some point in their maturation.

Ahh! And those are the “erythrocytes” (which I just now looked up!). Thanks for the clarifications and education.

They must not lose all of it though? Because blood left at a crime scene still has enough DNA to be useful, right? Or is it just skin or hair cells they look for?

Blood contains white blood cells, which do have DNA. There is also a variable amount of extracellular DNA floating around in blood, which ultimately comes from cells.
