For two decades, the reference human genome has been based on just a few people. Researchers have now created a new version that combines genetic material from people in different parts of the world.
The genome is the set of DNA instructions that helps every living organism develop and function. Genome sequences vary slightly between individuals. In humans, the genomes of any two people are more than 99 percent identical. The small differences that remain contribute to each person's uniqueness and can, for example, provide information about health, help diagnose diseases and guide the development of treatments.
To understand these genetic differences, scientists created a reference sequence of the human genome (known as GRCh38), built by digitally merging sequences and used as a ‘standard’ for comparison, helping to align, group and study other human genome sequences.
Despite its importance and continuous improvements, GRCh38 has limitations when it comes to representing the diversity of the human species, because it consists of the genomes of only about 20 individuals, and most of the reference sequence comes from just one of them.
Now, the Human Pangenome Reference Consortium (HPRC) is publishing in Nature a new, cutting-edge set of sequences that improves on this ‘standard’ genome and brings together much more diversity than was previously available. It is a new pangenome reference.
So far, a first draft has been presented. It includes genome sequences from 47 people from around the world and of different ancestries (African, American, Asian and European), and the researchers plan to increase this number to 350 by mid-2024. Since people carry chromosomes in pairs, the current reference includes 94 different genomic sequences, and the goal is to reach 700 different sequences when the project is finished.
New “messages” in the DNA
Compared with the current human genome reference, the pangenome adds 119 million base pairs, or “letters,” of DNA and 1,115 gene duplications (mutations in which a region of DNA containing a gene is duplicated). It also increases the number of structural variants detected by 104 percent, providing a more complete picture of genetic diversity within the human genome.
“Since the publication of the first draft human genome, several projects have worked to improve its quality and complete this reference (e.g., the recent Telomere-to-Telomere Project, T2T). However, a single, linear reference still does not correctly model the genetic diversity of our species, since there are multiple genomic variants that are not common to all people,” explains Santiago Marco Sola, of the Autonomous University of Barcelona, one of several co-authors on this work.
“The proposed solution was to model a non-linear reference that contains the genetic variations present in the population, taking into account the genetic diversity of our species,” he explains. This is called the pangenome, and it uses a graph structure to model the genomic variations that occur in different individuals.
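To make the idea of a graph-shaped reference more concrete, here is a minimal sketch in Python. It is not the consortium's method or software: the node sequences, the sample names and the helper functions `spell`, `edges` and `shared_nodes` are all invented for illustration. Each node holds a short stretch of DNA, and each person's haplotype is simply a route through the nodes, so shared stretches are stored once while variants appear as alternative branches.

```python
# Toy sequence graph: every node holds a stretch of DNA, and every haplotype
# is a path (an ordered list of node IDs) through the graph.
# All sequences and sample names are hypothetical, chosen only for illustration.
nodes = {
    1: "ACGT",  # stretch shared by every haplotype
    2: "A",     # one version of a variable position
    3: "G",     # the alternative version of that position
    4: "TTCA",  # stretch shared by every haplotype
    5: "GGG",   # insertion carried by only some haplotypes
    6: "CC",    # stretch shared by every haplotype
}

paths = {
    "sample_A_hap1": [1, 2, 4, 6],
    "sample_A_hap2": [1, 3, 4, 6],
    "sample_B_hap1": [1, 3, 4, 5, 6],
}


def spell(path):
    """Return the linear DNA sequence obtained by walking a path."""
    return "".join(nodes[node_id] for node_id in path)


def edges(all_paths):
    """Collect every node-to-node link used by at least one haplotype."""
    links = set()
    for path in all_paths.values():
        links.update(zip(path, path[1:]))
    return links


def shared_nodes(all_paths):
    """Return the nodes that every haplotype passes through."""
    return set.intersection(*(set(path) for path in all_paths.values()))


if __name__ == "__main__":
    for name, path in paths.items():
        print(name, spell(path))
    print("links:", sorted(edges(paths)))
    print("shared by all:", sorted(shared_nodes(paths)))
```

A linear reference would have to pick one of these routes as the standard. The graph keeps all of them, which is why a sequence carrying the hypothetical insertion in node 5 still has a place to align.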
With supercomputers
Marco Sola, who is also affiliated with the Barcelona Supercomputing Center (BSC-CNS), points out that this project would not be possible without supercomputers: “If creating a linear genome reference (such as GRCh38) requires the alignment and assembly of hundreds of billions of DNA bases, the pangenomic reference needs to process several orders of magnitude more information.”
Methods that were later included in the pangenome project were developed on the BSC-CNS supercomputer MareNostrum 4, although the computation and processing of the final results presented were carried out on other international supercomputing infrastructures.
Applications
“The human pangenome reference will allow us to represent tens of thousands of novel genetic variants in previously inaccessible regions of the genome,” says co-author and researcher Wen-Wei Liao, from Yale University (USA). “We can accelerate clinical research by improving our understanding of the link between genes and disease traits.”
“Everyone has a unique genome, so using a single reference genome sequence for every person can lead to inequities in genomic analyses. For example, predicting a genetic disease may not work as well for someone whose genome differs more from the reference genome.”
Hence the importance of the new pangenome. “Basic researchers and clinicians working with genomics need access to reference sequences that reflect the remarkable diversity of human populations. This will help make the reference useful for all people, which will help reduce the chances of perpetuating health disparities,” said Eric Green, NHGRI director.
“Creating and improving a reference for humankind is in line with the goal of our institute to fight for global diversity in all aspects of genomics research, which is critical to advancing genetic knowledge and implementing genomic medicine in an equitable manner.”
The work of this international consortium has a budget of approximately $40 million over five years, including efforts to establish the human pangenome reference, improve DNA sequencing technology, operate a clearinghouse, conduct outreach activities, and generate resources so that the scientific community can use this new reference.