Don't throw away 'Junk DNA'

Human genome found to be a control panel switching disease-related genes; could revolutionize diagnostics, treatments.

DNA (photo credit: REUTERS)
Geneticists looking for the origin of heritable diseases “now have a new sandbox to play in,” said Shaare Zedek Medical Center’s medical genetics director, Prof. Ephrat Levy-Lahad, following the discovery that so-called “junk DNA” in the human genome has an important purpose after all.
Rather than 99 percent of the genome being “irrelevant filler,” as had been thought, with only the remaining 1% of its 3 billion base pairs encoding vital proteins, the “junk DNA” serves as millions of DNA switches that power the human genome’s operating system. It thus comprises a massive control panel; without these switches, genes would not work, and mutations in these regions might lead to human disease.
The locations of some four million switches were discovered by an international research team of hundreds of scientists led by the University of Washington in Seattle, and published Thursday in three journals: Nature, Genome Biology and Genome Research.
The name of the project, ENCODE, stands for “ENCyclopedia Of DNA Elements.”
The on/off switches controlling genes were encrypted within the remaining genome. Without these switches, named “regulatory DNA,” genes are inert.
Researchers around the world have been focused on identifying regulatory DNA in order to understand how the genome works. The researchers created the first detailed maps showing where regulatory DNA is located within hundreds of different kinds of living cells. They also compiled a dictionary of the instructions written within regulatory DNA in the genome’s programming language.
Levy-Lahad told The Jerusalem Post that “scientists haven’t called the 99% ‘junk DNA’ for years, because it became clear that it wasn’t wasted. These parts of the DNA that don’t encode for protein must have had a reason to be there, but geneticists didn’t know the purpose.”
The ENCODE team has now made it possible to understand the process much better. In general, researchers are beginning to understand how genes are regulated – turned on and off. Many diseases are due not to protein-coding regions but to the switches that turn genes on and off, she continued.
“I would anticipate that many diseases that we don’t understand well will become clearer. For example, there is a large heritable component to type II diabetes, but [we] haven’t yet found specific genes responsible for it,” the Shaare Zedek geneticist said.
Although the onset of diabetes is usually due to excess weight, poor diet and lack of exercise, there is a much greater likelihood of developing the disease if one parent had it, Levy-Lahad said.
“The defective genes could have been there, but in previous generations, people were leaner than now,” she explained, “so it might have not showed up. Autism also has a genetic component, and so do many other diseases. Now, with this discovery, we will have more places in the genome to look for genetic changes that matter. So far, we geneticists haven’t fished around even in the whole 1%.”
The ENCODE discovery, Levy-Lahad continued, “will in the long term lead to better diagnostics and then to improved treatments and cures.”
Just as the Human Genome Project revolutionized biomedical research, ENCODE will drive new understanding and open new avenues for biomedical science, said the research leaders from the US National Human Genome Research Institute (NHGRI) and the EMBL-European Bioinformatics Institute (EMBL-EBI) in the UK.
“Our genome is simply alive with switches: millions of places that determine whether a gene is switched on or off,” said Ewan Birney of EMBL-EBI, the lead analysis coordinator.
“With ENCODE, we can see that around 80% of the genome is actively doing something. We found that a much bigger part of the genome – a surprising amount, in fact – is involved in controlling when and where proteins are produced, than in simply manufacturing the building blocks.”
“ENCODE data can be used by any disease researcher, whatever pathology they may be interested in,” said Ian Dunham of EMBL-EBI, who played a key role in coordinating the analysis.
“In many cases, you may have a good idea of which genes are involved in your disease, but you might not know which switches are involved. Sometimes these switches are very surprising, because their location might seem more logically connected to a completely different disease. ENCODE gives us a set of very valuable leads to follow to discover key mechanisms at play in health and disease. Those can be exploited to create entirely new medicines or to repurpose existing treatments.”
Until recently, generating and storing large volumes of data has been a challenge in biomedical research. Now, with the falling cost and rising productivity of genome sequencing, the focus has shifted to analysis – making sense of the data produced in genome-wide association studies.
ENCODE combined the efforts of 442 scientists in 32 labs in the UK, US, Spain, Singapore and Japan. They generated and analyzed over 15 trillion bytes of raw data – all of which is now publicly available. The study used around 300 years’ worth of computer time studying 147 tissue types to determine what turns specific genes on and off, and how that ‘switch’ differs between cell types. All of the published ENCODE content, in all three journals, is connected digitally through topical ‘threads’, so that readers can follow their area of interest between papers and all the way down to the original data.