Richard Gayle
Networking Kevin Bacon
June 2, 2000
Connections is the name of a column by James Burke that runs in Scientific American. It is usually a rambling essay showing how Charles Dickens had a math teacher whose second cousin married the grandchild of the King of Spain whose uncle created the first venture capital company that was instrumental in founding the orphanage that Dickens describes in Oliver Twist. We might marvel at the fact that so many people are connected by circumstances, but, statistically, it is not so unusual. Most people can be connected through their interactions to almost any other person with only a few steps. This is the anecdotal 'Six Degrees of Separation.' The mathematics behind such networks is called graph theory, and it is an area of active research.
One of the more popular parlor games recently has been the Kevin Bacon game. In this game, you attempt to link Kevin Bacon to any other actor by only using movies that the actors have appeared together in. For example:
Adam West was in Drop Dead Gorgeous (1999) with Denise Richards
Denise Richards was in Wild Things (1998) with Kevin Bacon
So 'Batman' Adam West links to Kevin Bacon with a Bacon number of 2.
Now this was an interesting game on the Internet several years ago, with people trying to find the lowest Bacon number for any given actor. Many could be linked in fewer than 5 steps. Now, the Internet being the Internet (meaning that many people on it have WAY too much free time, i.e. college students), a web site at the University of Virginia was posted that allows you to play the Kevin Bacon game yourself.
They use a database from the Internet Movie Database, a great site for people like me who love movies. This database lists actors for almost every American movie ever released. So, these guys at Virginia took this database and wrote the software that will give you the shortest link between any actor and Kevin Bacon.
Orson Welles (Bacon Number=2)
Orson Welles was in Muppet Movie, The (1979) with Steve Martin
Steve Martin was in Novocaine (2000) with Kevin Bacon (even unreleased movies)
Stan Laurel (Bacon Number=3)
Stan Laurel was in Jitterbugs (1943) with Vivian Blaine
Vivian Blaine was in Dark, The (1979) with William Devane
William Devane was in Hollow Man, The (2000) with Kevin Bacon
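Finding the shortest link like this is a classic job for a breadth-first search. Here is a minimal sketch of the idea in Python. It is not the Virginia site's actual software: the function name bacon_path is made up for illustration, and the tiny cast table holds only the movies from the examples above, not the real database.

```python
from collections import deque

# Toy cast lists taken from the examples in this column (not the real IMDb data).
CASTS = {
    "Drop Dead Gorgeous (1999)": ["Adam West", "Denise Richards"],
    "Wild Things (1998)": ["Denise Richards", "Kevin Bacon"],
    "Muppet Movie, The (1979)": ["Orson Welles", "Steve Martin"],
    "Novocaine (2000)": ["Steve Martin", "Kevin Bacon"],
    "Jitterbugs (1943)": ["Stan Laurel", "Vivian Blaine"],
    "Dark, The (1979)": ["Vivian Blaine", "William Devane"],
    "Hollow Man, The (2000)": ["William Devane", "Kevin Bacon"],
}


def bacon_path(casts, start, target="Kevin Bacon"):
    """Return the shortest chain [(actor, movie, co_star), ...] or None."""
    # Index: actor -> list of movies they appeared in.
    movies_of = {}
    for movie, actors in casts.items():
        for actor in actors:
            movies_of.setdefault(actor, []).append(movie)

    if start not in movies_of or target not in movies_of:
        return None

    # Breadth-first search reaches actors in order of increasing distance,
    # so the first route found to the target is a shortest one.
    came_from = {start: None}  # actor -> (previous actor, shared movie)
    queue = deque([start])
    while queue:
        actor = queue.popleft()
        if actor == target:
            break
        for movie in movies_of[actor]:
            for co_star in casts[movie]:
                if co_star not in came_from:
                    came_from[co_star] = (actor, movie)
                    queue.append(co_star)

    if target not in came_from:
        return None

    # Walk back from the target to rebuild the chain.
    chain = []
    actor = target
    while came_from[actor] is not None:
        previous, movie = came_from[actor]
        chain.append((previous, movie, actor))
        actor = previous
    chain.reverse()
    return chain


for name in ("Adam West", "Stan Laurel"):
    links = bacon_path(CASTS, name)
    print(f"{name} (Bacon Number={len(links)})")
    for actor, movie, co_star in links:
        print(f"  {actor} was in {movie} with {co_star}")
```

The Bacon number is just the length of the chain. On the toy table this gives 2 for Adam West and 3 for Stan Laurel, matching the examples above.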
But now, with a database, you can actually ask questions that have interesting answers. Such as, what is the distribution of Bacon numbers throughout the whole database? Can everyone be linked in fewer than 6 steps? The computer can determine this level of connectivity easily once the proper algorithm is devised. See how these guys could turn a casual idea into a research project!
Anyway, here are some of the numbers for Kevin Bacon. Out of almost 400,000 actors, 1,461 have worked directly with him, for a Bacon number of 1. Over 100,000 have a Bacon number of 2, 230,000 have a Bacon number of 3 and 51,000 have a Bacon number of 4. 99.2% of all the actors link to Kevin Bacon with a Bacon number of 4 or less. You have to try really hard to find someone with a Bacon number of 5. Here is one:
Grace Ariyawimal (Bacon Number=5)
Grace Ariyawimal was in Visidela (1994) with Jackson Anthony
Jackson Anthony was in Lokuduwa (1994) with Joe Abeywickrama
Joe Abeywickrama was in Sorungeth soru (1967) with Liz (I) Wilson
Liz (I) Wilson was in Life 101 (1995) with Mickey O'Rourke
Mickey O'Rourke was in Sleepers (1996) with Kevin Bacon
There is one poor guy who has a Bacon number of 8. Everyone else links to Kevin Bacon in fewer than 8 steps. The average works out to be 2.86 steps between any actor and Kevin Bacon.
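You can check that average yourself, since it is just a weighted average of the counts quoted above. A rough sketch in Python follows. It rests on two assumptions: the "over" figure for a Bacon number of 2 is treated as exactly 100000, and the small tail of actors with Bacon numbers 5 through 8 is left out. The function name average_bacon_number is only for illustration.

```python
# Rounded counts quoted in this column: Bacon number -> number of actors.
# Assumptions: the "over" figure for Bacon number 2 is treated as exactly
# 100000, and the small tail with Bacon numbers 5 through 8 is left out.
counts = {1: 1461, 2: 100000, 3: 230000, 4: 51000}


def average_bacon_number(counts):
    """Weighted average: sum(number * actors) / sum(actors)."""
    total_actors = sum(counts.values())
    total_steps = sum(number * actors for number, actors in counts.items())
    return total_steps / total_actors


print(round(average_bacon_number(counts), 2))  # prints 2.86
```

Even with the rounded figures, the result agrees with the 2.86 reported by the site.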
So, indeed, most actors can be connected to Bacon in fewer than 5 steps. But, since we have a database, more things can be examined. Such as, is Kevin Bacon the center of the known universe or are there other actors that are more connected? So, they generated tables for every other actor and found that there are 668 other people who are better connected. The actor with the fewest average steps to reach is Rod Steiger (with an average of 2.57415 steps), just barely beating out Christopher Lee (2.57515) and Donald Pleasence (2.57517). Also, some poor actor, whom they do not name, has an average of 9 steps. He only worked with 1 other actor who, himself, only worked with 16 other actors.
So, why am I talking about this? It turns out that this sort of interconnectedness is pretty common. And the degrees of separation between things or people often turn out to be very small. And, access to information in databases can very quickly identify these connections. Heck, even the Santa Fe Institute has gotten mixed up in this. It is leading to real research. A paper in Nature from September indicates that any web page can be reached from any other in about 19 clicks on average.
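Finding the "center" comes down to running the same kind of search once for every actor and comparing the averages. Here is a small sketch in Python of how that could look. The five-person graph and the function names average_distance and rank_centers are hypothetical, and the real project may well have done it differently.

```python
from collections import deque

# Hypothetical co-star graph: actor -> set of actors they shared a movie with.
CO_STARS = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}


def average_distance(graph, source):
    """Average number of steps from source to every other reachable actor."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        for co_star in graph[actor]:
            if co_star not in distance:
                distance[co_star] = distance[actor] + 1
                queue.append(co_star)
    others = [d for actor, d in distance.items() if actor != source]
    return sum(others) / len(others) if others else float("inf")


def rank_centers(graph):
    """Sort actors from best connected (lowest average) to worst."""
    return sorted(graph, key=lambda actor: (average_distance(graph, actor), actor))


for actor in rank_centers(CO_STARS):
    print(actor, round(average_distance(CO_STARS, actor), 2))
```

In this toy graph B and D tie for the center with an average of 1.25 steps, while A and E, who each worked with only one other person, come last with 2.0. One search per actor is a lot of work for a database of 400,000 people, but it only has to be done once.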
Now, the event that started me on this journey was a simple one but it was mind-boggling in its ramifications. I had written a column on retrotransposons that linked to an article in Science. At the bottom of that article, the references had links to Medline articles and a full text link to another Science article. But here is what was really cool - a link to an article that referenced this one. Now that is neat. Normal references or citations can only work backwards in time. They can only link to things that happened BEFORE the paper was published. When we do literature searches we try to find a recent article that will be useful and lead us to other papers. This only works in one direction. But now, this sort of reverse-referencing can lead us in another direction, forward in time, to current articles from wherever we started.
I was pretty excited by this revelation and I mentioned it to Judy Reaveley. I showed her the page and... just about popped my eyeballs and broke my jaw. (I had not been that flabbergasted since I showed a friend my new Monty Python LP and discovered that instead of two sides it had three.) Instead of the one forward link that I had seen the day before, there were now several new links to articles citing the paper. The page had been updated with new articles and will presumably keep being updated as more articles are published. Let me repeat. This page was not static and would continue to change in the future as more papers were published.
And, none of the referring papers were from Science. One was Genome Research and the other was PNAS. So, not only was Science linking to new papers and updating the older pages, it was doing this with articles that it did not even publish. How did this happen? How was AAAS getting access to the other journals' databases? Here is a clue. Look at several of the online sites for some of the journals published by scientific organizations (e.g. JBC, PNAS, etc.). They all kind of have the same format. So, being the Internet junkie I am, I checked out who had the domain names for some of these journals. Many link to a place at Stanford called Highwire Press.
Turns out that this is the online publishing arm of the Stanford University Libraries. It is formally a department at Stanford but it has a really interesting mission. They recognized that many of the scientific associations would have problems getting their material online. So, instead of each association designing and maintaining its own site, Highwire would provide the technology for them. The publishers would just have to provide the databases.
Now, once you put together the database, smart people can do amazing things. Like linking back to articles when they are cited in future publications. Like linking to related articles. So you can easily move back in time by using the references or move forward in time by using the citations. This is a whole new way of searching that is quite different from using Ovid. See, these new sorts of links have already been sorted for some sort of relatedness (sorry, could not resist) by the very authors that write the papers. So you know that there is something important here from the get-go.
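The forward links are really just the reference lists turned inside out. A minimal sketch of that inversion in Python is below. I have no idea how Highwire actually does it, so treat this purely as an illustration; the paper identifiers and the function name build_cited_by are invented.

```python
from collections import defaultdict

# Hypothetical reference lists: paper -> papers it cites (all older).
REFERENCES = {
    "science-2000": ["arb-1997", "pnas-1998"],
    "genome-research-2000": ["science-2000", "arb-1997"],
    "pnas-2000": ["science-2000"],
}


def build_cited_by(references):
    """Invert the reference lists: paper -> sorted list of papers citing it."""
    cited_by = defaultdict(list)
    for paper, cited_papers in references.items():
        for cited in cited_papers:
            cited_by[cited].append(paper)
    return {paper: sorted(citers) for paper, citers in cited_by.items()}


cited_by = build_cited_by(REFERENCES)
print(REFERENCES["science-2000"])  # backward in time: what it cites
print(cited_by["science-2000"])    # forward in time: what cites it
```

Every time a new paper and its reference list go into the database, rebuilding this table adds that paper to the "cited by" list of everything it references. That is why an old article's page can keep changing long after it was published.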
Think of it. You want to check out the latest on Ribosomes and Translation in the 1997 Annual Review of Biochemistry. You can move backwards in time via links in the Literature Cited. But you can also go to the end of the paper and find recent papers which cite this article. Papers as recent as this month, dealing with ribosome function. It makes it so much easier to get the latest research on a topic. You can even have them e-mail you when new articles cite this paper. Cool.
And, Highwire has been at the forefront in providing free articles. It now has over 150,000 free articles available. Plus, it has set up easy access to many journals on a pay-per-use basis. So, you could access any article in Science for 24 hours for $5. You can get unlimited access for 24 hours for $10. And, they will take your credit card online. No more waiting for Infotrieve to send a fax. This aspect I really like. I want a corporate credit card ;-)
Highwire Press is demonstrating where scientific publishing is going. You are going to want to publish your papers in journals that will be this connected. So that someone can move backward to relevant articles you cite and move forward to relevant articles that cite you. And, in many cases, the articles will be free. Seems to me that many of the for-profit journals will be hurting. Even without the government providing support, it appears that a lot of the scientific information available in just a few years will be cheap, easy to search and connected in ways that we are just starting to appreciate.
So, maybe in the future, someone will be able to put together a site showing how many steps there are between any two scientific papers. I wonder what the Gayle number is for Francis Crick?