TA researchers: Genome can reveal surname

Algorithm makes it possible to identify family names on the basis of genetic data in the Y chromosome.

Chromosomes (photo credit: Memorial University of Newfoundland)
Show geneticists the chromosomes in your genome – and they are quite likely to identify your surname.
This astounding feat has been accomplished by Tel Aviv University researchers, who have just published their findings in the prestigious journal Science.
The Israeli researchers, who work at TAU and at the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, developed an algorithm that makes it possible to identify family names on the basis of genetic data in the Y chromosome, which, conveniently, is handed down through the generations from father to son (except for slight mutations along the way), much as surnames usually are. This discovery has long-term implications, including for the privacy of personal information.
Since the human genetic code was cracked, many people have been trying to learn about their genetic heritage, said Prof. Eran Halperin of TAU’s school of computer sciences and the department of microbiology and biotechnology. To meet this need, Dr. Yaniv Ehrlich of Whitehead, together with Halperin and doctoral student David Golan of TAU’s statistics department, built a computer algorithm that could determine an individual’s family line from his Y chromosome alone.
Y-chromosome data from a sample of 900 American men were checked against an Internet databank holding the genomes of 135,000 people. The sample was accurately representative of surnames, especially those of European ancestry.
The algorithm identified the surname of one in eight of the men tested, said Halperin.
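The general idea of matching a Y-chromosome profile against a databank of surnames can be illustrated with a minimal sketch. This is not the researchers' algorithm: it assumes the genetic data take the form of repeat counts at a handful of Y-chromosome markers, and the databank entries, the distance measure and the `max_distance` threshold are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy databank: each record pairs a surname with repeat
# counts at six Y-chromosome markers. All values are invented.
DATABANK = [
    ("Smith", (13, 24, 14, 11, 11, 14)),
    ("Smith", (13, 24, 14, 11, 11, 15)),
    ("Jones", (14, 22, 15, 10, 13, 14)),
    ("Jones", (14, 22, 15, 10, 13, 13)),
    ("Garcia", (12, 25, 13, 12, 12, 16)),
]


def marker_distance(a, b):
    """Sum of absolute differences in repeat counts across markers."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


def guess_surname(profile, databank, max_distance=2):
    """Return the most common surname among close records, or None.

    A small nonzero distance is tolerated because the Y chromosome
    picks up slight mutations as it passes from father to son.
    """
    close = [
        name
        for name, markers in databank
        if marker_distance(profile, markers) <= max_distance
    ]
    if not close:
        return None  # no sufficiently similar record in the databank
    return Counter(close).most_common(1)[0][0]


print(guess_surname((13, 24, 14, 11, 11, 14), DATABANK))
print(guess_surname((20, 30, 20, 20, 20, 20), DATABANK))
```

In this sketch a profile with no close record returns no guess at all, which is one simple way to picture why only some of the men tested could be matched to a surname.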
In one case, the researchers managed to identify a man’s name and the fact that he lived in California, all from his Y chromosome. They also identified the Y-chromosome information of famed geneticist Craig Venter, who headed the Genome Project, after narrowing down the identity to only two men in California.
The significance of the research, said Halperin, involves “more than a handful of useful applications such as locating relatives and identifying bodies in natural disasters and other calamities. But there is also something sinister, as if a person publishes his genome on the Internet, even if done anonymously, his identity could be exposed, and this was from only one chromosome and not from all 22 pairs,” he added.
Despite this shortcoming, the researchers regard the discovery as positive. People who disclose their genome must be made aware of the risk of exposure, said Ehrlich. “We believe that legislators must take special care when they plan such databanks.”