A research team headed by Weizmann Institute scientists has cracked the genetic code that determines where tiny spheres called nucleosomes are positioned along the DNA strand. Their findings appeared Thursday in the prestigious journal Nature.
Nucleosomes provide the lowest level of compaction required to fit DNA into the cell nucleus. They are made up of DNA and four pairs of proteins called histones, and are important in regulating transcription, the transfer of genetic information from DNA to RNA that enables the cell to follow a gene's instructions. Nucleosomes do this by preventing RNA polymerase from accessing the promoter regions of genes the cell does not need. If the cell's requirements change, enzymes known as remodeling factors can remove or reposition the nucleosome to allow access.
Several diseases, including cancer, are typically accompanied or caused by DNA mutations and by the way DNA organizes itself to form chromosomes. Such mutational processes may be influenced by the relative accessibility of the DNA to various proteins and by the organization of the DNA in the cell nucleus. The researchers therefore believe that the nucleosome positioning code they discovered may help in understanding the mechanisms underlying many diseases.
Until now, no one knew what determined how, when and where a nucleosome would be positioned along the DNA sequence. The answer came from Dr. Eran Segal, of the Rehovot institute's computer science and applied mathematics department, and research student Yair Field, working together with colleagues from Chicago's Northwestern University.
Nucleosomes, which resemble "beads on a string of DNA" when observed through an electron microscope, are known to play an important role in the cell's day-to-day function. Access to DNA wrapped in a nucleosome is blocked for many proteins, including those responsible for some of life's most basic processes. Among these barred proteins are factors that trigger DNA replication, transcription and repair. The positioning of nucleosomes defines the segments in which these processes can and can't take place.
These limitations are significant, as most of the DNA is packaged into nucleosomes. A single nucleosome contains about 150 genetic bases (the "letters" that make up a genetic sequence), while the free area between neighboring nucleosomes is only about 20 bases long. It is in these nucleosome-free regions that processes such as transcription can be initiated.
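To make that proportion concrete, here is a rough back-of-the-envelope calculation in Python. It uses only the two approximate figures quoted above and assumes, purely for illustration, that nucleosomes and free stretches alternate in a perfectly regular pattern; the variable and function names are invented for this sketch.

```python
NUCLEOSOME_BASES = 150  # approximate DNA length wrapped in one nucleosome
LINKER_BASES = 20       # approximate free stretch between neighboring nucleosomes

def packaged_fraction(nucleosome=NUCLEOSOME_BASES, linker=LINKER_BASES):
    """Fraction of one repeating nucleosome-plus-linker unit that is wrapped.

    Assumes an idealized, perfectly regular arrangement (an illustration,
    not a measurement).
    """
    return nucleosome / (nucleosome + linker)

fraction = packaged_fraction()
print(f"{fraction:.0%} of the DNA is wrapped in nucleosomes")
```

Under that simplifying assumption, roughly seven out of every eight bases sit inside a nucleosome.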
For many years, scientists had been unable to agree on whether the placement of nucleosomes in living cells was controlled by the genetic sequence itself. Segal's team managed to prove that the DNA sequence indeed encodes "zoning" information on where to place nucleosomes. They also characterized this code and then, using the DNA sequence alone, were able to accurately predict a large number of nucleosome positions in yeast cells.
Since the proteins that form the core of the nucleosome are among the most evolutionarily conserved in nature, the scientists believe the code they identified is also conserved in many organisms, including humans.
To unravel the code, the scientists examined 200 different nucleosome sites on DNA to determine whether the sequences around them had something in common. Mathematical analysis revealed similarities between the nucleosome-bound sequences, and eventually uncovered a specific "code word." This "code word" consists of a periodic signal that appears every 10 bases on the sequence.
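The idea of a signal that recurs every 10 bases can be illustrated with a short Python sketch. This is not the team's analysis: the motif ("AA"), the toy sequence and the function names are all assumptions chosen for demonstration. The code simply counts how often two copies of a short motif sit a given distance apart and reports which distance is most common.

```python
def motif_positions(seq, motif):
    """Return the start indices at which motif occurs in seq (overlaps allowed)."""
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == motif]

def lag_counts(seq, motif, max_lag):
    """For each lag 1..max_lag, count pairs of motif occurrences that far apart."""
    pos = set(motif_positions(seq, motif))
    return {lag: sum(1 for p in pos if p + lag in pos)
            for lag in range(1, max_lag + 1)}

# Synthetic toy sequence (hypothetical): one "AA" at the start of every 10 bases.
unit = "AACGCGGCCG"   # 10 bases
toy = unit * 15       # 150 bases, about the length of one nucleosome site

counts = lag_counts(toy, "AA", 20)
best_lag = max(counts, key=counts.get)
print("Most common spacing between motif copies:", best_lag)
```

In real nucleosome-bound sequences the periodicity is a statistical tendency rather than an exact repeat, so an analysis of this kind would be averaged over many aligned sites.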
The regular repetition of this signal helps the DNA segment bend sharply into the spherical shape required to form a nucleosome. To identify this nucleosome positioning code, the research team used probabilistic models to characterize the sequences bound by nucleosomes and developed a computer algorithm to predict the encoded organization of nucleosomes along an entire chromosome.
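The general approach, learning a probabilistic description of nucleosome-bound sequences and then scanning a longer sequence with it, can be sketched as follows. This is a deliberately simplified stand-in and not the team's algorithm: it assumes a per-position, single-base model scored against a uniform background, and the training sequences, the "chromosome" and all function names are hypothetical.

```python
import math

BASES = "ACGT"

def train_position_model(aligned, pseudocount=1.0):
    """Estimate per-position base probabilities from equal-length sequences."""
    width = len(aligned[0])
    model = []
    for i in range(width):
        counts = {b: pseudocount for b in BASES}
        for seq in aligned:
            counts[seq[i]] += 1
        total = sum(counts.values())
        model.append({b: counts[b] / total for b in BASES})
    return model

def window_score(model, window):
    """Log-likelihood ratio of a window under the model vs. a uniform background."""
    return sum(math.log(model[i][b] / 0.25) for i, b in enumerate(window))

def scan(model, sequence):
    """Score every window of the model's width along the sequence."""
    width = len(model)
    return [window_score(model, sequence[s:s + width])
            for s in range(len(sequence) - width + 1)]

# Toy data (hypothetical, far shorter than a real site of about 150 bases).
training = ["AACGTAACGT", "AACGAAACGA", "AACCTAACCT"]
model = train_position_model(training)

chromosome = "GGGGGGAACGTAACGTGGGGGG"
scores = scan(model, chromosome)
best_start = max(range(len(scores)), key=scores.__getitem__)
print("Highest-scoring window starts at position", best_start)
```

A high score marks a stretch that resembles the training sequences; predicting an arrangement along a whole chromosome would additionally have to ensure that neighboring nucleosomes do not overlap, which this sketch does not attempt.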
The team's findings provided insight into another mystery that has long puzzled molecular biologists: how cells direct transcription factors to their intended sites on the DNA, as opposed to the many similar but functionally irrelevant sites along the genomic sequence.