From a certain perspective, the dog is just another mammal, albeit with a genome slightly smaller and “cleaner”—“there is less junk,” Lindblad-Toh said—than that of its human companion or the ubiquitous lab rat, to which the researchers also compared it in Nature. Tucked within the 2.4 billion base pairs of the dog’s DNA are some 19,300 genes. By comparison, the human genome consists of approximately 2.9 billion base pairs of DNA and, at most recent count, approximately 22,000 genes. Approximately 72 percent of the dog genes are orthologous, meaning they correspond on a one-to-one basis with genes found in the human and rat genomes, although their functions might differ.
Comparisons of the mouse, human and dog genomes have identified a core 812,000,000 base pairs (5.3% of the total human genome) of ancestral sequence common to all three species. This DNA encodes proteins (1-2% of the total genome), and includes specific sequences that control gene expression. This portion of the genome is under what biologists call purifying selection, wherein variations on a gene or changes in a sequence are selected against, or weeded out. The sequencing of additional mammalian genomes, including those of Rhesus monkey, cow, opossum, elephant, rabbit, cat and shrew, should help to sharpen the focus on the DNA definition of mammal-ness.
Despite the similarities between all three species, it appears the genome better reflects the social reality of dogs and humans than does taxonomy, which places the rat closer evolutionarily to humans than the dog. The researchers reported in Nature that some sets of functional genes, like those involved in brain development, showed signs of having evolved similarly in dogs and humans—and more rapidly than in rats. It is a suggestive finding.
The dog’s value in comparative genomics lies in large measure in its breed structure, and here the researchers offer some support to a couple of recent suggestions that repetitive segments of DNA are somehow tied to the dog’s physical plasticity, its ability to assume so many different shapes and sizes, as well as to fall victim to various inherited diseases. Geneticists have focused not only on SNPs—changes in a single base—but also on repetitive blocks of DNA, including “short interspersed nuclear elements” and “tandem repeats.”
SINE elements, as they are known, are repetitive segments of DNA between 150 and 750 bases long. Interspersed throughout a genome, they move around over time, and some are species-specific.
On the whole, the dog genome has fewer SINE elements than the rat or human. But it has a “highly active carnivore-specific SINE family” that is full of mutations that vary between breeds, Lindblad-Toh and her co-authors wrote in Nature. These SINE elements are at least 10 times more frequent than any found in humans, and are believed to play a role in gene expression. When inserted into genes, they can cause diseases, like narcolepsy in Doberman Pinschers and centronuclear myopathy (a muscle disease) in Labrador Retrievers.
Wei Wang and Ewen F. Kirkness of the Institute for Genomic Research, writing in Genome Research, argue that SINE elements are a major source of genetic diversity in the dog. Citing their research, Lindblad-Toh and her colleagues speculate in Nature that the variation from SINE elements “has provided important raw material for the selective breeding programs that have produced the wide phenotypic variations among modern breeds.” In that event, SINE elements may have been what has allowed humans to produce everything from the Pug to the Irish Wolfhound.
But no one knows. In December 2004, John W. Fondon, III, and Harold R. Garner of the University of Texas Southwestern Medical Center proposed in the Proceedings of the National Academy of Sciences—in a paper that caught the attention of scientists, if not the press—that changes in the length of “tandem repeats” found within genes are responsible for the phenotypic variation between breeds and the speed with which breeders can change a breed’s appearance. Once called “junk DNA,” like so many other parts of the sequence whose purpose was then unknown, these “tandem repeats” occur when two or more nucleotides form a pattern that repeats itself over a short stretch of the genome.