My View

Richard Gayle

The Intro(n) and the Outro(n) May 5, 2000

I love word play. New words follow new ideas. I'm not talking about slang, whose main purpose is to be obscure to anyone not IN the group. The title of this column is taken from a song by my favorite obscure late '60s British band. Obviously, if you have an intro, you have to have an outro. This band called themselves the Bonzo Dog Band. (Here is a link to a short Real Audio sample of their song, The Intro and the Outro.)

There was a lot of cross-fertilizing in the British music scene at the time, and the members of this band knew EVERYONE, although they had only one hit. They had gone to school with Eric Clapton, they appeared in Magical Mystery Tour with the Beatles, and Paul McCartney produced one of their albums. They provided music for an early BBC TV program that featured many of the people who went on to perform under another obscure name, Monty Python's Flying Circus. Their songs had tremendous variety, which required a fair amount of preexisting knowledge from their fans. Somewhat elitist (they came out of art schools), but very funny.

The Intro and the Outro parodies a type of jazz performance where the leader introduces each performer, who stands up, does a little riff, and sits down, continuing to play. So the Bonzo Dog Band does the same, although the leader is very smarmy with his dialogue, and all the music is essentially just the same bits spliced together. They built up what sounds like a very complex piece, but it was really done by putting together little bits and pieces.

Science is full of neologisms, since it is full of new ideas. Some are punny, some are obscure, but some have interesting stories. I have always disliked the term intron and its cousin, exon. My title posits a different possibility, but there are still problems. They have always seemed backwards to me. The exon should be the part of an mRNA that is removed, that goes out, that is excised. The intron should be the part that remains in. But no one asked me, so they mean the opposite. The exon is the part that remains after splicing, to be translated. The intron is the portion that is removed, to be forgotten and degraded. (Much like the poor Bonzo Dog Band has been today.)

Now, when splicing was first discovered in the '70s, many of us referred to the extraneous sequences as intervening sequences. That makes sense, and it is a lot less jargony than intron. So why did these terms come into use? Well, this is the story I was told. It fits human nature, so it may even be true.

Around the same time that all this work was being done, work that first demonstrated that genes were interrupted by junk sequences, a large oil company was going through some changes. See, it had bought or merged with several other companies over the years but had maintained the local brand names (many arising out of the original 1911 breakup of Standard Oil into 34 different companies). So it was known as Humble Oil in some parts of the country and as Enco in others. I think at one time it was known by seven different names. Hard to have a brand, and just think about all the stationery costs. So, it decided to rename itself. It did a worldwide search to find a name that no one else used and that did not mean anything in any language. Then it ran a big publicity campaign to let everyone know that the name of the company was now unique, that there was NO other word in any language in the entire world that was similar. Do you know what the name of the company was? Exxon!! So scientists, being the iconoclasts that they are, came up with a new word that was very similar, to take away the glory of Exxon's new brand. Worked pretty well, huh!! Well, I still find the term confusing.

Now, ever since introns were discovered, there has been the question: why? It seems like a waste of energy to have sequences transcribed that are not used. And then additional pathways need to be developed to put the exons back together to allow proper translation. One of the first ideas, proposed by Wally Gilbert, I believe, was that exons could be shuffled around, creating new genetic diversity by mixing and matching with other exons. And we deal with alternative splicing all the time. But the question still remains: why did introns first appear?

The introns and exons seen today are part of a pretty complex regulatory system. The spliceosome, a large cellular complex containing lots of small RNAs and proteins, is required to properly remove the introns and stitch the exons back together. Without the spliceosome, there is no splicing. So we have a chicken-and-egg problem: how did one develop before the other?

Well, we may be getting very close to a good answer, because it turns out that there are still remnants of the sequences that may have led to introns. They are found in some very old genomes: those of eubacteria, chloroplasts, and mitochondria. In fact, they may derive from some of the oldest forms of RNA, the self-splicing RNAs that were the first ribozymes to be discovered. It turns out that some sequences, called group II introns, display many of the same properties found in spliceosomal introns, except that they are capable of self-splicing.

Splicing in both the group II introns and the spliceosomal introns proceeds by a similar two-step transesterification mechanism. In the first step, the 2'-hydroxyl of an adenosine near the 3' end of the intron attacks the 5' splice site, so the intron's 5' end becomes linked to that adenosine, forming something that looks like a cowboy's lariat and is called, by the scientific term, an intron lariat. In the second step, the 3' splice site is cleaved as the exon ends are joined, releasing the intron lariat.
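Since this is really just cut-and-paste chemistry, the two steps can be sketched as string operations in a few lines of Python. This is purely a toy model: the sequences, the function name, and the use of the last adenosine as a stand-in branch point are all my own inventions for illustration.

```python
# Toy model of two-step splicing: a pre-mRNA is exon1 + intron + exon2.
# Step 1: the branch-point adenosine attacks the 5' splice site (lariat forms).
# Step 2: the exons are joined and the lariat is released.

def splice(exon1: str, intron: str, exon2: str):
    """Return (mRNA, lariat) from a toy pre-mRNA."""
    # Step 1: the intron's 5' end is now tied to the branch-point A,
    # which we mark with parentheses (the last A stands in for the branch point).
    branch = intron.rindex("A")
    lariat = intron[:branch] + "(" + intron[branch] + ")" + intron[branch + 1:]
    # Step 2: join the exons; the lariat is released.
    mrna = exon1 + exon2
    return mrna, lariat

mrna, lariat = splice("AUGGCC", "GUAAGUACUAACAG", "GGAUAA")
print(mrna)    # AUGGCCGGAUAA
print(lariat)  # GUAAGUACUAAC(A)G
```

The point of the sketch is simply that the exons come out joined and the intron comes out as a closed loop, exactly the two products the chemistry produces.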

Besides splicing by a similar chemical process, there are also sequence similarities near the intron/exon junctions in both types of introns. However, group II introns can splice by themselves, while spliceosomal introns require help. There is one more important difference: some group II introns are mobile and have the ability to insert themselves into the genome in a process called retrohoming (new word: retro, because it goes through an mRNA stage, and homing, because it only inserts back into its 'home' DNA). Put a form of the split gene into a cell, but remove the group II intron. Then place the group II intron into the cell on another piece of DNA. Guess what? The intron will "splice" its way back into the intronless gene, in exactly the same spot where it would normally be found. This activity requires a reverse transcriptase, an endonuclease, and a maturase activity, all coded for by the group II intron itself. So this intron, besides being able to splice itself out to bring together two coding regions, also codes for several proteins that allow it to insert itself back into DNA.

This process has been pretty well worked out. Remember that self-splicing, like any other catalytic process, is reversible. What happens is that a free intron lariat can find the sequence it was excised from, only in this case it is the intronless form of the gene, in DNA form. The endonuclease activity clips the DNA, the intron lariat 'reverse-splices' itself back in, and the reverse transcriptase activity fills in the first and second strands, reforming the gene WITH an intron present. These are the only enzymes needed; the cell's own recombination systems are not used. This does seem kind of arcane. What use would it be to splice an intron BACK into the sequence it was spliced out of? Well, how about making sure that it is never removed from the genome by other means? A cell might want to rid itself of this useless sequence by deleting it. Retrohoming helps make sure that the intron is always added back to any DNA sequence from which it is removed.
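Those retrohoming steps (clip the home site, reverse-splice, fill in by reverse transcription) can likewise be mocked up as string surgery. Again this is a toy sketch: the function name, the 'home site' motif, and the sequences are invented, and real retrohoming works on double-stranded DNA, not a single string.

```python
def retrohome(intronless_gene: str, home_site: str, intron: str) -> str:
    """Reinsert an intron at its home site in an intronless gene (toy model)."""
    # Endonuclease step: find and "clip" within the exon junction (the home site).
    cut = intronless_gene.index(home_site) + len(home_site) // 2
    # Reverse splicing + reverse transcriptase fill-in: the intron is copied
    # back into the DNA between the two exon halves. Note that no host
    # recombination machinery is involved at any step.
    return intronless_gene[:cut] + intron + intronless_gene[cut:]

# The spliced, intronless form of the gene gets its intron right back:
homed = retrohome("ATGGCCGGATAA", "CCGG", intron="GTAAGTACTAACAG")
print(homed)  # ATGGCCGTAAGTACTAACAGGGATAA
```

Run it twice on the output and nothing sensible happens, which hints at the real biology: the intron only recognizes the intact, intronless junction, its 'home'.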

The first transition from an RNA world to a DNA one would have required reverse transcriptase, and self-splicing RNAs were probably formed early in an RNA world. So the group II introns may be a remnant of such a world. How do we get to spliceosomal introns? We have an RNA molecule that can put itself back into DNA, although in this case it only inserts itself back into the coding sequence from which it was removed. Is it possible that it could insert itself into a novel sequence, one that never had anything like an intron, in a process known as retrotransposition? That would be really useful.

This very process has just been described in a recent Nature article. What the authors did was put the group II intron on a temperature-sensitive plasmid that would be lost when the cells were grown at the nonpermissive temperature. An antibiotic resistance marker was placed in the intron to select for insertions of the group II intron into the genome after the intron-carrying plasmid was removed. So they had a system that allowed them to look for ectopic insertions of the intron INTO the bacterial genome when it had no other option (i.e., the chromosome did not carry anything resembling a group II intron sequence). The cells would die under antibiotic selection unless the group II intron could find some way to splice itself into the bacterial chromosome. The authors could then sequence out of the inserted intron to see where it had landed.

Not too surprisingly, the frequency of ectopic insertion was much lower than that of retrohoming. They found 8 different insertion sites, all apparently in the sense strand. Two of these sites were in the 23S rRNA gene. What was interesting was that all 8 showed some sequence similarity around the sites of insertion, indicating that the lariat sequence had some preference. And since they were in the sense strand, they could all be transcribed and propagated AFTER the insertion event. So, are the introns now present in the chromosome capable of self-splicing, even though they are surrounded by novel sequences? Well, the insertions in the 23S rRNA give an easy check, because the unspliced form could be easily identified. The answer was yes. It appeared that the intron could splice itself out of the rRNA about 1/8th as efficiently as it could from its normal position.

How did the group II intron get itself into new sequences? Clues came from mutants. Recombination-negative mutants of the bacteria reduced the amount of retrotransposition into ectopic sites by 80%. So, in contrast to normal retrohoming, recombination is required. In addition, endonuclease-negative forms of the group II intron still retrotransposed quite well, much better than those same forms retrohomed. So retrotransposition requires recombination but not endonuclease activity; retrohoming requires endonuclease activity but not recombination. Both require reverse transcriptase activity. The model the paper proposed has the intron lariat attacking mRNA sequences rather than genomic DNA. Reverse transcriptase then makes this into double-stranded DNA. This DNA can then recombine with the normal gene in the chromosome, which now carries an intron splitting it in two. But the insert is in the sense strand, so it can self-splice out to recreate the original RNA transcript. NEAT.
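That proposed model (lariat attacks an mRNA, reverse transcriptase makes DNA, host recombination swaps it into the chromosome) can be sketched in the same toy style. Every name and sequence here is an illustrative stand-in, and the exact-match 'motif' is a gross simplification: the real target preference was only a weak sequence similarity.

```python
def retrotranspose(chromosome: str, gene: str, intron: str, motif: str) -> str:
    """Toy model of ectopic retrotransposition through an mRNA intermediate."""
    # 1. Transcription: the target gene is expressed as mRNA.
    mrna = gene.replace("T", "U")
    # 2. The intron lariat reverse-splices into the mRNA just after a
    #    loosely preferred motif (exact match here, for simplicity).
    hit = mrna.find(motif)
    if hit == -1:
        return chromosome        # no plausible target; nothing happens
    cut = hit + len(motif)
    invaded = mrna[:cut] + intron.replace("T", "U") + mrna[cut:]
    # 3. Reverse transcriptase copies the invaded mRNA into DNA.
    cdna = invaded.replace("U", "T")
    # 4. Host recombination (the step blocked in recombination-negative
    #    mutants) replaces the chromosomal copy of the gene with the
    #    intron-containing cDNA.
    return chromosome.replace(gene, cdna)

chrom = "AAATTT" + "ATGGCCGGATAA" + "CCCGGG"
new_chrom = retrotranspose(chrom, "ATGGCCGGATAA", "GTAAGTACTAACAG", motif="GCCGG")
print(new_chrom)  # AAATTTATGGCCGGGTAAGTACTAACAGATAACCCGGG
```

Notice that no endonuclease step appears anywhere, and step 4 cannot happen without recombination, matching the mutant results: recombination required, endonuclease not.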

And now natural selection will try to alter the intron so that it splices out more efficiently. That is easier than trying to remove the intron, because retrohoming will always try to replace it, right? So it now becomes a one-way street. The intron is there. It is not leaving. So selection will try to overcome its presence by making the splicing more efficient. And, due to retrohoming, it will now try to spread to other alleles of the introned gene (I love taking a noun and creating a new verb: "That gene has been introned.") found on homologous regions of paired chromosomes. So it will rapidly spread through a population, until all versions of that gene have an intron. Until another retrotransposition occurs, spreads it to another gene, and the cycle begins again.

What an elegant process! RNA splicing was probably around from very early on, so self-splicing regions would be found on early DNA genomes. Retrohoming would make sure the group II intron was not removed, and retrotransposition would allow it to spread. So introns would spread and spread until... obviously the cell would HAVE to bring them under control or they would eventually take over. So perhaps the cell did a very smart thing. Selection would seem to work here quite well. Enzymes could splice better and faster than the RNA could on its own. There would be selective pressure to make splicing as efficient as possible, so as to allow the cell to live! It needs the spliced mRNAs to do their job.

And there is an added bonus. As the cell took over, there would be no selective pressure to maintain either the endonuclease activity or the reverse transcriptase of the introns. Lose these and the introns are pretty much unable to do much at all, so losing any self-splicing activity would not really matter either. However, the splicing apparatus was now in place and was needed to deal with the introns. And the presence of splicing now offered some added advantages, since alternative splicing could increase the variation of the genome. Perhaps an adaptation to a hostile retrohoming piece of RNA resulted in the positive effects of splicing.

Maybe spliceosomal introns are the leftover remnants of a golden age of introns, just as the Real Audio snippets are the only remnants of the golden age of an obscure band. Poor intron, doomed to obscurity. But wait: it is very likely that the spliceosomal introns were not the final end for these primitive mobile elements. Next week, I will talk about some of the new data regarding important mobile elements found in our very own genomes today. They take some of the basic principles of the group II introns, but they are so much more sophisticated in their abilities.

(I guess exon is better than outron.)