Wednesday, September 17, 2014

The Price of Knowledge

In the last post, I touched on one issue of collection economics... namely, where does collection value reside and how do you measure it? The other question that I've been ruminating on is how far you can go in trading on that value to offset the cost of caring for the collection. Or, to put it another way, when is it OK to charge?

This question came up in a discussion on the NatSCA listserver regarding the rights and wrongs of charging the general public for enquiries, which then got picked up in a comment piece by Jan Freedman, and finally an online poll by Museums Journal that basically showed an overwhelming majority of respondents thought it was "unethical" to charge for enquiries.

So what do we make of all this? Well first, we can dump the whole "unethical" angle. This seems to have come about because a number of people, especially in the listserver discussion, believed that you shouldn't charge the public for something that they are already paying for through their taxes. Setting aside the fact that not all museums are publicly-funded, there is no reason why a public service can't charge people, especially if the funding available from taxpayers doesn't meet the full cost of providing the service. Do I get to ride the bus for free, because it's public transport and I'm a member of the public?

Once we've got rid of the emotive language of ethics, we can get into more substantive arguments - namely, is it a good idea to charge? This is the thrust of Freedman's opinion piece, and he makes some good points. We want to encourage people to appreciate museums and their collections as common property; to educate and to inspire. Answering enquiries is a great way to engage with the public and an alternative route to further the institution's educational agenda. And finally... and I think this is perhaps the most persuasive argument of all... it will probably end up costing you more to collect and process payments than it would to just answer the question.

Unless, of course, it's a very complicated question. Most enquiries are along the lines of "is this a fossil?" and can be answered quickly and concisely, with "no" being the default response on 99% of occasions. But there are exceptions, especially once you start dealing with academic users of the collection. So let's back off a bit and ask what to me is the main question. When you charge someone for an enquiry, what are they actually paying for?

Yes, I know that at one level they're paying for an answer. But in most of the cases that people have been talking about so far, we’re charging for access to our expertise. So, what does it actually cost to answer a question? Let’s assume that all we’re really talking about is what it costs my institution to employ me for the period of time that it takes to answer the question. That’s my salary, my fringe benefits, and an overhead that covers the basic services that Yale has to provide in order for me to do my job, which all adds up to about $1.30 per minute.

Most public enquiries don’t take very long to answer – say 5 to 10 minutes if they come in via email, as the majority of them do these days. So what we’re basically saying is that in order to recover the cost of me taking time to answer someone’s question, I’d have to charge them between $6.50 and $13.00. If we handled a lot of public enquiries, this might quickly add up to a lot of money.
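For anyone who wants to check the beer-mat arithmetic, here is a minimal sketch of the per-enquiry calculation. The $1.30-per-minute rate and the 5 to 10 minute range come from the post; the function and constant names are purely illustrative.

```python
# Rough staff cost of answering a single enquiry.
# The rate is the post's figure (salary + fringe benefits + overhead).
STAFF_RATE_PER_MINUTE = 1.30  # dollars per minute


def enquiry_cost(minutes, rate_per_minute=STAFF_RATE_PER_MINUTE):
    """Return the cost in dollars of an enquiry that takes `minutes` to answer."""
    return round(minutes * rate_per_minute, 2)


if __name__ == "__main__":
    low = enquiry_cost(5)    # quick email answer
    high = enquiry_cost(10)  # slightly longer email answer
    print(f"${low:.2f} to ${high:.2f} per enquiry")
```

Swapping in a different rate or duration shows how quickly the cost scales once an answer stops being a quick email.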

But in fact, we don’t. The majority of our traffic comes from professional users of the collection, with maybe 15 or 20 enquiries a year from “the public.” So, following on from Freedman’s argument that this is all about educating the public as to the value of our collections, my Division is spending around $170 a year on public outreach, which is a *very* small fraction of my annual operating budget.
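The $170 figure can be roughly reproduced from the numbers already given. The sketch below assumes, and this is my assumption rather than anything stated in the post, that the annual estimate takes the midpoint of the 15 to 20 enquiries and the midpoint of the 5 to 10 minutes each; the names are hypothetical.

```python
# Rough annual cost of answering public enquiries, using the post's figures.
RATE_PER_MINUTE = 1.30        # dollars per minute of staff time (from the post)
MINUTES_PER_ENQUIRY = (5, 10)  # typical range for an email enquiry (from the post)
ENQUIRIES_PER_YEAR = (15, 20)  # public enquiries per year (from the post)


def midpoint(low, high):
    """Midpoint of a range; using it here is an assumption, not the author's stated method."""
    return (low + high) / 2


def annual_outreach_cost(enquiries=ENQUIRIES_PER_YEAR,
                         minutes=MINUTES_PER_ENQUIRY,
                         rate=RATE_PER_MINUTE):
    """Estimate the yearly cost in dollars of answering public enquiries."""
    return midpoint(*enquiries) * midpoint(*minutes) * rate


if __name__ == "__main__":
    print(f"About ${annual_outreach_cost():.0f} a year")
```

Taking the extremes instead of the midpoints gives a range of roughly $100 to $260 a year, which is still a very small number.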

True, you might argue that spending $170 to reach 20 people is not a particularly efficient strategy for outreach. But many of us, myself included, can remember writing to a scientist as children and being psyched when we got a letter back (and it was a real letter in those days...). It's an important way to engage and potentially inspire and falls squarely into the Museum's outreach mission.

But suppose that it’s not a 5 or 10 minute answer, but is instead a one to two hour answer. These sorts of questions are more usually generated by our professional users, but they can come from the public as well. Now you’re looking at a cost of maybe $150 in terms of my time. Would this be worth recovering?

The answer to this, I think, lies in the fact that a 1-2 hour question is fundamentally different to a 5-10 minute question. If a question takes me only 5-10 minutes to answer, that’s likely because I either know the answer already, or because I can find the information needed to provide an answer more quickly than they can, by virtue of my professional training. But if a question takes 1-2 hours to answer, it probably means that I have to access the collections to get the information – that’s where the major time-sink comes in. And to my mind, this is an altogether different scenario.

As we've discussed in previous posts, you can spend hours arguing what museums are “for,” but in the case of the VP collections at the Peabody at least, I would argue that Yale is paying a goodly chunk of money to operate a facility that provides resources for research and education. There are different ways in which you can use this facility – you can search for information on-line, you can visit the collections in person, or you can ask me or my staff to answer your question for you.

From our perspective in the collection - given our limited time and funding - we’d like to push as much of the expense of using our facility back onto the users. So for us, the third option – where we answer the question for you – seems far and away the least cost-effective. But from the user’s perspective, travelling to the Peabody to do 1-2 hours’ work might or might not be cost-effective, depending on where they’re travelling from and whether there are other things they might do when they’re here.

So this is a different sort of cost calculation than the one we used for the 5-10 minute enquiry… $150 for us to answer the question for you, weighed against (probably) several hundred bucks of travel and accommodation expenses, to say nothing of your time, for you to come here and answer it yourself. When you put it that way, a $150 fee looks like a bargain, right?

Of course, all of this assumes that we have 1-2 hours to spare. It doesn't do us much good to charge for a service that means that we can’t perform the other, basic collections operations that our museum is paying us to perform. So could we make enough money charging for enquiries to support the salary of someone dedicated to answering those enquiries?

Here at Yale, we employ Yale work-study students at around $12 an hour to provide collections support, which includes dealing with enquiries. The primary rationale is educational; the students get the experience of working in a museum environment and, as many of them have gone on to museum-related grad school programs, it’s plainly a meaningful experience. But we also get a motivated group of temporary employees that significantly reduces the burden on our permanent staff.

In a system like this, where we have flexible employment paid at an hourly rate, we could relatively easily charge a fee for a time-heavy enquiry and use the fees to support a student to answer the enquiry. That seems like a relatively “ethical” solution – we’re charging for a service, but the funds are being used to directly support an educational program that benefits the collection user’s professional community by helping to generate trained workers.

And it’s cheap. Work study students at Yale don’t pay for their health benefits and even with the overhead their cost works out at about 30 cents a minute, meaning that the $150 enquiry now costs around $45. So it would appear that I've solved the entire conundrum and that every vertebrate paleontologist in the world should be slapping me on my virtual back and thanking me for saving them hundreds of dollars in airfares and hotel bills.

The problem, of course, is that $45 is a bargain unless you’re used to paying nothing. The majority of collection users operate in an entirely un-monetized economy, where all of the services that they use are provided on a quid-pro-quo basis; as most of us, or at least our faculty curators, are collection users as well as providers, once we start charging we would inevitably start paying as well.

And that is perhaps the biggest conundrum of all – this whole vast edifice of science, consisting of billions of specimens, tens of thousands of people, and hundreds of buildings, is supported on not much more than the belief of the museums that it’s worth investing funds to provide a free service that benefits the scientific community, and by extension society as a whole.

This would almost make me feel warm and fuzzy inside, were it not for the nagging concern that a system like this is terribly vulnerable. We can, if we choose, quantify the cost of the service we provide – I've done it in this post, albeit as a series of back-of-a-beer-mat calculations that any half-decent economist would shred in seconds. But we really don’t have much idea of how to quantify benefit, or at least not in dollar terms, which means that we can’t talk in a meaningful way about value (which is not the same as cost). Nor can we have a conversation about efficiency, or the cost-effectiveness of the service we provide.

And when we don’t have answers to questions like that, then our prowess at answering questions about dinosaurs looks a lot less impressive.

[The observant among you will have noticed that I mentioned three ways of accessing the collection, but I only talked about two of them. That was deliberate. Digitization costs got touched on in the last post and I'll come back to them in a future post]

Saturday, September 6, 2014

Ethics or Economics

Having not blogged for a while I've missed a couple of juicy controversies that others have cheerfully piled into. So I'm left playing catch-up, and feeling like I'm not quite as cutting edge as I used to be. But sometimes even half-warmed leftovers can prove surprisingly tasty, so I decided that it was worth revisiting both stories in a pair of posts.

Controversy #1 relates to a paper published in Science by Ben Minteer and colleagues back in April (see what I mean about being late to the party?) which said, and I paraphrase, that there really was no reason for biologists to kill things anymore because you can get all the data you need to describe a new species from DNA samples and digital sound and image records. If you want to read the paper - and frankly there are better uses of your time - you can find it here: 10.1126/science.1250953

There were only two surprising things about the paper, namely that Minteer et al. bothered to write it and that Science thought it worth publishing. This particular issue has been round the block so many times, particularly in the ornithological community, that it has no tread left on its tires. I've even covered it myself in an earlier post. There really isn't any case to answer as far as the impact of scientific collecting on endangered species is concerned and those involved should have known better.

Nonetheless, an impressive array of worthies from the biocollections community formed a line to beat the living daylights out of Minteer's thesis in various blogs, interviews, etc., including nearly a hundred people who signed a riposte to the original paper that was also published in Science. The mainstream media, who like nothing better than the sight of two groups of egg-head scientists pulling what's left of each other's hair, dutifully took notice and the whole mess was extensively reported in a wide range of venues including NPR, Slate, and the CSM.

As it happened, this turned out to be a good opportunity to emphasize the vital importance of collections to our understanding of the natural world, and many people from the collections community did so, eloquently and effectively, so I'm not going to rehash the arguments again. But it did make me think about a related issue, which is our oft-repeated mantra that much of the value of natural history specimens lies in their associated data.

I'm currently talking to some colleagues about a potential project on the economics of museum collections as large-scale distributed research facilities (yeah, yeah, I know… it doesn't sound all that interesting. You'll just have to take my word for it that it is). Anyway, it's made me think a lot about cost/benefit calculations.

Suppose that what we say is true, and that most of the value of natural history specimens is in their data. Now consider the fraction of curation costs per specimen that is devoted to data storage and distribution versus physical storage and specimen access. My guess - and it is only a guess, I haven't quantified it (yet) - is that the cost of storing and serving data is significantly less than the cost of housing and maintaining a physical collection. So if you say that most of the value lies in the data… do I have to draw you a picture of where this is leading?

Now obviously, I'm not the first person to have thought of this. In fact, since we began the major effort to digitize the nation's biocollections, there has been a small, but persistent niggle of concern about what the long term implications will be for the collections we curate. It's usually expressed in terms of diverting some grant funds away from physical collections care and towards data capture. Since most of us are already doing some form of data capture, what we're talking about here is a relatively short-term injection of funds to accelerate the process and deal with the (admittedly gigantic) backlog. But I don't think that, as a community, we've really got to grips with what the much longer-term implications of mass digitization might be. Are we making physical collections redundant?

Clearly there's a strong counter argument, in that specimen data, in isolation, are actually not that valuable. The value is contextual - it's linked to the specimen. The specimen without data is much less valuable than the specimen with data, but the reverse is also true. Having data allows you to better interpret the specimen. It also improves your ability to study the specimen and generate more data. To some extent, data-minus-specimen is a bit of a dead end.

This is particularly true when we enter the bright and shiny new world of "Big Data." As I'm sure you all know by now, Big Data is all about correlation, not causality. It reveals patterns, but it doesn't provide explanations for why those patterns came about, or even if they are "real" patterns, as opposed to statistical artifacts. To answer those sorts of questions, you have to go back and reexamine the sources of the data, which in our case are the specimens.

But what exactly are those specimens, or rather, what should they be? Traditionally they might have been a skin and a skull, a whole animal in fluid, leaves and flowers on a herbarium sheet, a pinned insect, a microscope slide, or something else depending on the discipline concerned. Now these "traditional" preparations are likely to be supplemented by tissue samples, digital imagery, sound and video recordings, etc. And, of course, data - because the data are an integral part of the specimen.

There's a cost/benefit curve to every specimen. The cost is what it takes to collect, prepare, house, maintain, and provide access. The benefit is what you get out of it in terms of research, education, entertainment, etc. The calculations are complex and "value" may be positively or negatively impacted by a number of factors: for example, the number of other specimens in existence, changing research priorities, the invention of new analytical techniques. But just because it's hard to do this doesn't mean that it shouldn't be done.

What we haven't really grasped about digitization, IMO, is that the curve is changing. We're still stuck with a paradigm that assumes that most users will want/need to physically access a traditional prep type housed in a museum. That might be true, but we need to quantify and justify it if we're to continue to argue for resources. At the moment, our most sophisticated argument seems to be that we can't predict how collections will be used in the future, so we'd better not change anything now. If pressure continues to build on funding, as it likely will do, then we need a more nuanced and better supported position.

Minteer et al. used an ethical argument to challenge our traditional methods of collecting, but perhaps they'd have been more successful deploying an economic one. As a community it behoves us to think about these issues before someone else does it for us….