Pacific Symposium on Biocomputing
January 3 Report

I always hate rereading what I write. I tend to find things that I want to change. After I had uploaded my report last night, I checked it over one more time. And what do I read but "The Big Island is one of the unique areas in the world." Unique means one of a kind. You cannot really add a modifier to something that is unique. What I wrote was redundant. I promise not to do it again. Reread my reports, that is.

Yesterday did not have a lot of science since the meetings had not started. Today is filled with tutorials. There are six of them, given two at a time for three hours each. It is usually an interesting exercise to pick the 'right' one from each pair. Today, I am going to try "Predictive Methods Using RNA Sequences" in the morning and "Information Extraction from Scientific Texts" in the afternoon.

The RNA one was pretty interesting. There is a lot of work being done examining the tertiary structure of RNA sequences. It has real potential for therapeutics. Many drugs bind to RNA better than they do to proteins. Unfortunately, they do not bind to RNA with the specificity that they show for proteins. Knowing RNA structure would help in designing better drugs.

The main problem with determining RNA structure is that, in contrast to proteins, which have two torsion angles around the backbone, RNA has six. There are also two different conformations of the base and two conformations of the sugar moiety. There are only four bases but a tremendous number of possible conformations.
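To get a feel for how quickly those numbers grow, here is a back-of-the-envelope sketch. It is my own illustration, not something from the tutorial, and it assumes each torsion angle can sit in only three discrete states, which is a made-up simplification. The function names are hypothetical too.

```python
# Rough counting sketch. ASSUMPTION: every torsion angle is limited to a
# small number of discrete states (3 here). Real angles are continuous and
# correlated, so these numbers only illustrate the growth, nothing more.
STATES_PER_TORSION = 3  # hypothetical value


def conformations_per_residue(n_torsions, two_state_choices=0,
                              states=STATES_PER_TORSION):
    """Count discrete conformations for a single residue.

    n_torsions        -- backbone torsion angles (2 for protein, 6 for RNA)
    two_state_choices -- extra either/or choices (base and sugar for RNA)
    """
    return states ** n_torsions * 2 ** two_state_choices


def chain_conformations(per_residue, length):
    """Count conformations for a chain, treating residues as independent."""
    return per_residue ** length


protein_residue = conformations_per_residue(2)
rna_residue = conformations_per_residue(6, two_state_choices=2)

print("protein, per residue:", protein_residue)
print("RNA, per residue:    ", rna_residue)
print("ratio per residue:   ", rna_residue // protein_residue)
print("10-mer protein:      ", chain_conformations(protein_residue, 10))
print("10-mer RNA:          ", chain_conformations(rna_residue, 10))
```

Even with this crude counting, each nucleotide has a few hundred times more options than each amino acid, and the gap compounds with every residue added to the chain.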

In fact, the next three hours were spent showing exactly how poorly computational methods work. Trying to align nucleotide sequences is not very easy. Evolution conserves RNA structure, not sequence. Often both nucleotides of a base pair will be mutated so that the pairing survives while the sequence changes, making BLAST-type analysis virtually impossible.
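A toy example makes the point about paired mutations concrete. The two sequences below are invented for illustration, as is the list of paired positions; nothing here comes from the tutorial, and no real alignment tool is being run. The sketch only compares position-by-position identity with whether the same stem can still form.

```python
# Toy illustration of compensatory mutations. ASSUMPTION: both sequences
# and the stem pairing are made up; this is not BLAST or any real tool.

# Watson-Crick pairs plus the G-U wobble pair.
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
               ("G", "U"), ("U", "G")}

# A 10-nucleotide hairpin: positions 0-2 pair with positions 9-7,
# and positions 3-6 form the loop.
STEM_PAIRS = [(0, 9), (1, 8), (2, 7)]

SEQ_A = "GGGAAAACCC"
SEQ_B = "CAUAAAAAUG"  # every stem position mutated, pairing preserved


def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def all_pairs_form(seq, pairs):
    """True if every listed pair of positions can base-pair."""
    return all((seq[i], seq[j]) in VALID_PAIRS for i, j in pairs)


print("sequence identity:", identity(SEQ_A, SEQ_B))
print("A forms the stem: ", all_pairs_form(SEQ_A, STEM_PAIRS))
print("B forms the stem: ", all_pairs_form(SEQ_B, STEM_PAIRS))
```

The two hairpins share only their loop, so a method that scores sequence similarity sees little in common, yet both fold into the same stem.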

Even more so than proteins, RNA structural analysis needs a tremendous amount of experimental data to verify structure. Until recently, most of the data centered around tRNAs, because they were easy to crystallize to get tertiary structure. The elucidation of the three-dimensional structures of both ribosome subunits and the complete ribosome has provided a large number of possible structures for the databases. This should help with further computational methods. But there is still a long way to go before we can get a good handle on the structural conformations RNA can take.

This part of the tutorial was reasonably useful because it provided a good overview of the field and its problems. The next part dealt with using XML to display RNA structure. Not nearly as interesting, but it has some potential uses for the future, not only in this field but in a lot of scientific endeavors. XML holds the promise of collating different databases written in different formats. We shall see.

The other tutorial I attended was not as worthwhile. It was a discussion of approaches to condense information out of scientific text, such as papers. Unfortunately, the presentation left something to be desired. It was very jargony, with few real-world examples. Mainly just obvious rules, like the fact that ambiguities can be a problem. There was a lot of discussion about how a program can accurately parse a sentence and its negation (e.g. "John runs" and "John does not run"). Careless programming may allow the computer to see these two sentences as the same.
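Here is one way that kind of careless programming could happen. This is my own minimal sketch, not anything shown in the tutorial: a keyword extractor that throws away common words, including "not", and crudely strips a trailing "s". The stopword list, the stemming rule, and the function names are all assumptions made for the example.

```python
# Sketch of how a naive keyword extractor can lose negation.
# ASSUMPTION: the stopword list and the crude "strip a trailing s"
# stemmer are invented here purely for illustration.
STOPWORDS = {"a", "the", "do", "does", "not"}
NEGATIONS = {"not", "never", "no"}


def tokenize(sentence):
    """Lowercase the sentence, drop a final period, split on whitespace."""
    return sentence.lower().rstrip(".").split()


def crude_stem(word):
    """Strip one trailing 's' (far too simple for real text)."""
    return word[:-1] if word.endswith("s") else word


def naive_keywords(sentence):
    """Drop stopwords, then stem. Negation words are silently discarded."""
    return frozenset(crude_stem(w) for w in tokenize(sentence)
                     if w not in STOPWORDS)


def negation_aware(sentence):
    """Same keywords, plus a flag recording whether a negation appeared."""
    negated = any(w in NEGATIONS for w in tokenize(sentence))
    return naive_keywords(sentence), negated


s1 = "John runs"
s2 = "John does not run"

print(naive_keywords(s1) == naive_keywords(s2))  # the two look identical
print(negation_aware(s1) == negation_aware(s2))  # the flag tells them apart
```

Keeping even a single flag for negation is enough to separate the two sentences here, though real text needs far more than that.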

What I got from this presentation is that information extraction schemes are best used today to identify keywords and such for bibliographic databases like Medline, but are a long way from intelligently extracting information from text. But we will have a session devoted to this, so maybe current approaches will have some surprises.

The scientific presentations start tomorrow. The meeting officially opens at 8:20 am with coffee, juice and carbohydrates from 7:30 to 8:15. I'll be loading up on the free stuff. This picture shows what the food counters look like BEFORE the food arrives. Maybe tomorrow I'll show what it looks like full. The morning sessions are on High Performance Computing for Computational Biology. That looks to examine new methods to find genes in the databases. We will also have the keynote lecture by David Haussler on A Working Draft of the Human Genome. The afternoon will deal with protein evolution and the evening will be roundtable discussions of the day's presentations. In previous years, bottles of wine were provided, making these very free-wheeling discussions. Hope they are this year.

Presenters - 3. Presentation Methods - Windows with PowerPoint - 3. Arrgh!
