My View

Richard Gayle

Processing March 31, 2000

It is not often that virtually an entire edition of Science is taken up by papers dealing with a single project. Recently, Science devoted a large part of one issue to the complete sequencing of the Drosophila genome, its assembly and its annotation. I am still trying to wade through all of it, because it is a very significant event. This week's column will be short since I am still processing the information, but here are a few quick thoughts.

Although C. elegans had its genome decoded sooner than Drosophila, there are several key differences. One has to do with heterochromatic versus euchromatic regions of the genome. Although the original designation of these regions depended on cytological distinctions (i.e., staining properties and time of replication), there is a real functional difference. Euchromatin can be stably cloned into artificial chromosomes in yeast (YACs) or bacteria (BACs), while heterochromatin cannot. YACs and BACs allow huge amounts of DNA (several hundred kilobases) to be cloned and examined, making mapping much easier. It turns out that C. elegans does not have a lot of this form of heterochromatin, allowing its complete genome to be cloned in YACs. Drosophila and humans do have a lot of this type. Almost 1/3 of the Drosophila genome is heterochromatic and essentially unclonable.

Drosophila is a sentimental favorite since it was the first experimental model for genetics, Mendel's peas notwithstanding. Thomas Hunt Morgan started using the flies in 1910, before anyone even knew that chromosomes were the carriers of heredity. He and his collaborators demonstrated just this over the next 5 years. More genetics has been done with this fly than with any other single organism. Many of us have worked with it. I remember using Drosophila to generate linkage maps in high school biology.

And Drosophila genetics, more than anything else, demonstrates the playfulness and creativity of scientists. Look at the names of some of the Drosophila mutations. We have sevenless, and its siblings bride of sevenless and son of sevenless. 18-wheeler has body segments that look like a semi-truck. There is ether a-go-go, decapentaplegic, mothers against decapentaplegic (identified by looking at maternal enhancers of decapentaplegic) and Medea (another group of maternal enhancers obviously discovered by someone with a liberal education). A Drosophila mutation that results in a fly with no heart is called tinman.

Drosophila has only twice as many predicted genes as yeast and fewer genes than C. elegans. This is a surprise and will be an area of intense scrutiny, because there is a huge difference in the development, morphology and behavior of a worm and a fly, yet the number of genes appears to be similar. Somewhat of a paradox.

The sequencing of the Drosophila genome is quite an accomplishment and really sets the stage for the human sequence. But remember, this is not a "complete" sequence. There are over 60 million bases of heterochromatic DNA that are not available. It is generally believed that this heterochromatic DNA is not important, consisting largely of massive amounts of repetitive DNA. Are there important sequences in the heterochromatic regions discussed earlier? Maybe. This project generated over 3 million bases of sequence that are clonable but cannot be mapped to any of the euchromatic regions. Are these in heterochromatic regions? Could there be even more? Probably.

I'll be reading up on this work for a future column. The next year will see a real explosion in data dealing with genomic structure. While Immunex uses this to identify new potential therapeutics, let's not forget that these data will mark the true beginning of the fundamental understanding of ourselves: what we are made of, where we came from and maybe where we are going. Instead of dealing with isolated fragments of the information, we will have reasonably detailed road maps. A tremendous number of questions will be answered by our examination of these data, but, most likely, an even larger number will be posed. It is really an exciting time for biologists. Lucky for us that the processing power of computers is still following Moore's law. We will need it.