NEDCC workshop meeting at MIT. The title of the workshop was “The Tectonics of Digital Curation;” I’m a sucker for cool conference titles and this got me in two ways – first, anything with the word “tectonics” in it is automatically cool and second, it said “digital curation” not digitization. This is important because (within natural history at least) most of the current energy is being devoted to capturing more information digitally rather than thinking about what you might do with it afterwards. The concept that you need to curate the information once you have it is an important one. Plus I wanted to go see the Stata Center (above), where the meeting was being hosted - back in my Cambridge days I watched it going up, but I'd never seen the completed thing.
Anyway, it was a great program, although at first sight it had nothing to do with natural history collections, and the talk that I found most thought-provoking was actually the one that looked the least promising. Megan Winget, who is an associate professor at UT Austin, gave a talk on the challenges of collecting and preserving videogames. Now it may surprise you to think that there are challenges associated with collecting videogames – you buy the game, you put a catalog number on it, you put it in a storage cabinet, right?
Wrong. Big-time wrong. You see, a video game is quite a complex entity. First of all, there is the game itself, which is basically a computer program. But then there’s also the platform that it runs on – its operating system if you like. Then there’s the hardware associated with it; this might be the arcade cabinet for Pac Man, or an X-Box, or a Wii controller. Then there are the marketing materials and the packaging.
And that’s just the start. More recent online videogames have tools built into them that allow other developers to modify the game. These customized versions, or Mods, have a life of their own and in some cases end up more popular than the original game they were based on. Then there are massively multiplayer online role-playing games (MMORPGs), where you have whole communities of people interacting with the game, creating and uploading supporting materials, producing wiki-based game manuals, etc. How do you record and preserve all this material?
And what about the development process itself, how the game was created? You would think that all of this information is recorded in exhaustive detail by the companies that produce the games. Wrong again. It seems that there is little or no consistency of documentation and the game developer community lacks a tradition of recording and archiving information, beyond routine back-ups. Many of the critical early steps take place on a white board in a meeting room – if you’re lucky, someone may snap a picture of this. Development documents are seen as living documents that can be overwritten at any time, and archive copies are rarely produced. And even if they are, once programming starts the programmers make a multitude of changes to the original schema; these are almost never recorded. As a result, there’s often a yawning gulf between the original development document and the final manual for the completed game.
Assuming you manage to preserve all this information, there is then the question of how you give people access to it. The obvious way, at first sight, is to make it possible for them to play the game. But this is actually harder than it looks. Many of the earlier generation of games can no longer run except on the original hardware. In some cases, it’s possible to use emulation software to get them to run on modern machines, but experienced gamers criticize this because the authentic experience of playing the game can’t be recaptured when you play it on a different system (presumably, as someone at the meeting pointed out, you also need a basement and an old couch that smells of beer, but that’s another issue).
I was fascinated by this, but initially more from an intellectual standpoint – it was interesting, but obviously it had nothing much to do with natural history specimens. But as I drove home along the Pike I began to think some more. And this is what I thought.
As museum people, we spend a lot of time worrying about objects – collecting them, preserving them, and making them accessible. This carries over into the way that we look at collections data. We take an object-centric view of data; there is a specimen, it has various pieces of information associated with it, and we link object to information by some form of unique identifier, usually a catalog number.
Megan’s talk made me wonder whether there was another way to look at this relationship. Suppose we take the position that there is actually an entity at the core of our museum object that no longer exists – the organism itself, which once existed at a particular place and time. The “specimen” is one aspect of this entity – much as the X-Box or Wii controller is one part of the game – but it’s not the whole thing. For one thing it’s dead, and it’s also gone through a process of “preservation” that is actually anything but – most prep methods involve the destruction of various parts of the specimen and its conversion to a state which is even further from the original organism. All of this means that there are various categories of information relating to the original entity that cannot be deduced from studying the specimen. Or, to put it in the gamer context, you can’t “experience” the organism in the same way as you could before you collected it.
That’s not to say that the specimen isn’t important. Far from it. Study of the specimen can reveal a multitude of information about, for example, the organism’s evolutionary history that would not necessarily be gained from studying it while it was alive. But the associated data adds even more value to it and can actually provide you with a way to restore at least some of the information that was lost as a result of collection and preservation. So field notes can provide valuable information on behavior; preserved gut contents can give you insights into diet, as can casts of fossil teeth that reveal microwear patterns. This is another way in which natural history specimens mimic computer games. Megan was arguing that preserving the material culture that surrounds the game – the supporting materials produced by MMORPG communities, for example – gives you insights about the game that you might not be able to glean from studying the game itself.
Just as games have players, natural history collections have users. And like the gamers, the community of researchers is constantly doing things that modify the specimens in the collection. They study them, generate data, publish the data, and in doing so change our understanding of the nature of the specimens. They produce derivative materials – CT scans, laser surface scans, MRI images, X-Ray images, SEM images, molds, casts, and plain old photos. They carry out additional preparation to the physical specimen, removing additional matrix to reveal an obscured feature of a fossil, or clearing and staining a fluid-preserved specimen. All of these, I would argue, are part of the wider definition of the specimen and need to be curated as such. Some of them, such as casts of fossils, are in some ways like the Mods that exist in the gaming world. Like Mods they have a utility that goes beyond the original specimen – they are assigned their own numbers in the catalogs of other institutions and they get studied, loaned, and exhibited as objects in their own right.
We also have the same problems with documentation that exist in the gaming world. Documentation standards have improved and are still improving, but for many specimens we do not have adequate field data, or even records of who prepared them and the methods that they used. And while researchers do publish new information on collections, they are frequently reluctant to share all of their data – an issue covered by another speaker at the meeting, Heather Piwowar. Without easy access to raw data, we are faced with the same gulf that exists between game development documents and the final product – you know what the result is, but not necessarily how the researcher got there and certainly not what insights and alternative approaches were discarded en route.
These parallels are important to keep in mind as we push onwards towards the development of a national cyberinfrastructure for biological collections. At the moment, rightly or wrongly, we are fixated on the idea that we are capturing data about our collections and making it more widely available. What struck me on the long haul down I-395 was that we are actually creating a new type of collection – a collection of complex objects, made up of physical and virtual components that may “live” in more than one place. We understand certain aspects of these collections very well, others less well, and there may be emergent properties that we can hardly guess at.
We may have to relearn how we curate these collections. A good start might be to mirror the approach adopted by Megan Winget and her coworkers, who are working with IMLS funding to study the game industry’s creation methods, behaviors, and attitudes. Their idea is that by understanding the process by which these complex objects are created, you can build more meaningful models of preservation and collection. It seems to me that this would be a good place for the natural history collections community to start – and an acknowledgement that we are dealing with something relatively new in our centuries-old profession and that we may need to look for models in surprising places.