Some objections to our use of library statistics
The use of certain library statistics, mainly related to circulation and its electronic semi-equivalents, has taken on a high degree of importance in library management since 1979, when Charlie Robinson introduced the “give ’em what they want” philosophy of collection development at Baltimore County Public Library. Circulation statistics provide an easy way to present an argument to higher-level administrators that you are moving in the right direction, if you can take steps to increase them. But there are a number of problems in the way that we often use these statistics. I would like to talk briefly about some of the problems that I have observed in an academic library setting.
1. A download does not imply relevance.
From the perspective of a reference librarian who works with students and faculty who are conducting various types of research, it is important to keep in mind that the circulation of a book or the download of an article is not the end point of our concern. Especially in an educational institution, it is important not only that the resources the students walk away with will help satisfy the formal requirements of their assignment (which may include instructions to “use five articles from scholarly journals”), but that they will help them learn. Sometimes students who haven’t yet figured out how to weed through search results to find what is relevant to the needs of their argument will download numerous articles that they will not read. That boosts circulation stats, but also represents a failure on our part. Policies that are geared toward boosting stats will not address that problem and will not promote learning.
2. Some usage counts for more than other usage.
While it may go against the grain for many democratic-minded librarians, not all usage of library materials is equal. When a lower division undergraduate reads a book that introduces them to a topic, their use of that book is very different from the way a professional scholar uses a book in her own field. This is not to say that the scholar’s use should count for more, necessarily, but to acknowledge that it counts differently and may be more expensive due to being specialized. It is a difference that should affect our use of statistics to draw conclusions. Some might look at patterns of use and argue that resources should be shifted toward whatever is used more. It is important to see that if a library goes in that direction, it is not simply a shift of resources toward “what users want,” but a shift of resources toward what one group of users wants (lower division undergrads) and away from what another group of users wants (researchers). The proportion of the budget for resources in each of these areas should be determined differently at each institution according to its own educational policies and the type of place that it is, rather than according to a simple market calculus that says “give ’em what they want,” which would tend automatically to hurt users of more specialized resources.
3. Incompatible data.
Because of the perceived need for statistics in reporting to entities outside the library, it is tempting to compare data that is collected differently or simply represents different things. A play of a track in a music database does not represent the same thing as a download of an article in JSTOR, yet they are stats that exist side-by-side in the context of electronic resource statistics. While a person will likely leaf through a book in the stacks before deciding to check it out, an electronic book needs to be “checked out” (counted as a circulation) in order to leaf through it to decide whether or not to use it, yet these stats exist side-by-side in the context of circulation statistics. COUNTER-compliant statistics aim to solve this problem, but they still only measure interactions with the interface rather than use of the material. (If we were to gather information about actual use of our information resources, there could be multiple dimensions to the data.)
4. Patterns of use differ across scholarly communities.
The ways that people use information resources in different disciplines and sub-disciplines, and for different purposes in general, can create distortions in our interpretation of the data if the potential for these differences is not kept in mind. We may calculate a “cost per download” for a given database and compare it to the cost per download for a database used by another department, in order to determine where our money is being spent most effectively, without realizing that a single download of an article may be more or less significant as a part of the overall research for a given project. One scholar may need to download 50 “articles” (which, depending on which resource is being used, may not be actual journal articles) in order to get 50 facts, while a researcher in another discipline will download a few texts for the purpose of close reading. The first database might seem to have a much lower “cost per download” than the second (if these scholars’ use of the resource were typical) while the “cost per project” or “cost per research hour” may be the same.
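The distortion can be made concrete with a small back-of-the-envelope calculation. All of the figures below (database costs, downloads per project, number of projects) are invented for illustration, not drawn from any real library’s data:

```python
# Two hypothetical databases with identical annual cost and identical
# real research value, but very different usage patterns:
#   Database A - fact-finding use, ~50 downloads per project
#   Database B - close-reading use, ~5 downloads per project

def cost_per_download(annual_cost, downloads):
    return annual_cost / downloads

def cost_per_project(annual_cost, projects):
    return annual_cost / projects

projects = 100                       # projects supported by each database (invented)
a_cost, a_downloads = 10_000, 50 * projects   # 5,000 downloads
b_cost, b_downloads = 10_000, 5 * projects    # 500 downloads

print(cost_per_download(a_cost, a_downloads))  # 2.0   -> A looks "efficient"
print(cost_per_download(b_cost, b_downloads))  # 20.0  -> B looks 10x "wasteful"
print(cost_per_project(a_cost, projects))      # 100.0
print(cost_per_project(b_cost, projects))      # 100.0 -> identical real cost
```

The point of the sketch is only that the choice of denominator drives the conclusion: measured per download, the close-reading database looks ten times more expensive; measured per project supported, the two are indistinguishable.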
5. The problem of accurately identifying causes of change over time.
The answer to many of these sorts of objections is sometimes to say, “We can still use these stats to identify changes over time,” and that is true, but drawing operational conclusions from those observations requires correctly identifying the causes. For example, when undergraduates began turning to Google to do research for writing assignments, librarians and vendors concluded that in order to compete with Google they needed to implement simplified, Google-like search interfaces to multiple databases. But these simple-to-use federated search products have not done anything to boost download statistics in full text databases of scholarly articles. This means the reason for students’ preference for Google may have been something else. I think it may have been because lower division undergraduates found more content that they could make sense of through Google than in the high-level original research in scholarly journals (which librarians, most of the time, to our great discredit, described to them simply as “more reliable”). If that turned out to be the true cause, the implications for planning would be different (not to mention not automatically clear).
To take another example, I think it’s not uncommon for library administrators to have the attitude that declining book circulation stats indicate a need to shift funds to electronic resources, while declining download stats indicate underutilization and a need to better promote those resources. The unstated assumptions are with respect to the cause of the change in the numbers – on the one hand obsolescence of the format and on the other hand insufficient publicity. It is important to realize that while the stats (these hypothetical ones) give us some information, they do not tell us that this common interpretation of the data is correct. Another possible explanation could be a decline in the amount of reading done by students, regardless of format.
6. Even if the cause of a change in the statistics is correctly understood, the response may be a question of philosophy and little else.
Let’s say that through some research done internally at a university, it is learned that undergraduates are arriving as first year students less well prepared, are spending less time studying, and have fewer and easier writing assignments than ten years ago, and that faculty have less time to carefully evaluate their work. That finding would provide a good explanation for a decline in overall circulation. And let’s say that a healthy demand for DVDs for entertainment purposes had grown, that service to university staff members had increased as a proportion of the total, and that students viewed the library increasingly as a study space and decreasingly as an information resource. Is an obvious response implicit in these findings? I think not. I think there are many ways that a library administrator could respond to these changes, and the response would depend greatly on his own philosophy and the prevailing philosophy at the university. The data provide essential information, but they do not say that a) it is imperative that the library boost circulation, or b) circulation should be boosted by following a particular strategy.
7. Steps taken to boost statistics may have unintended pedagogical consequences.
A further potential error in the thinking that says that every use of a resource is equivalent lies in the potential pedagogical effects as well as the potential content implications of formats. If we shift funding from monographs to academic journals or from physical to electronic formats on the basis of perceived demand or future demand, we should not do it without acknowledging that the choice may have an impact beyond “serving more users” (assuming that we are even correct about the direction of demand). There are general differences in the content of monographs and academic journals, and collection development should attend to the question of what those formats contain in relation to the curriculum, aside from what may be suggested by data about use patterns. (It is commonly said that students “want articles that they can use from home and that aren’t too lengthy to deal with,” but seldom acknowledged that the original research in scholarly journals is mostly out of reach of lower division undergrads in terms of the expected background knowledge. One of the most common requests at academic library reference desks is for a “scholarly journal article” that provides an introductory overview to a major topic; such things are rare.) Likewise, as EDUCAUSE recognizes but tells us not to worry about, technologies affect the way that people learn, and these effects should be considered in terms of the educational objectives of the institution, and not taken for granted or taken as a market imperative.
8. Inherent conflict between pedagogy and markets.
The use of circulation and download statistics to guide decision making in libraries reflects a broader trend in higher education and in public institutions to orient themselves as a part of the market economy. (See John E. Buschman’s Dismantling the Public Sphere: Situating and Sustaining Librarianship in the Age of the New Public Philosophy for a thorough treatment of this problem.) In higher education in particular, there is a sharpening conflict between the logic of the market (what students demand) and pedagogical needs that stem from the institution’s desired educational outcomes. Simply stated, the conflict is brought to us by students whose educational aims only somewhat overlap their teachers’. Educators can rightly say that students don’t yet know enough to direct the curriculum (and in a very relevant sense, in terms of information literacy, our own bailiwick, they do not yet know how to separate themselves from the influence of advertising and commercial propaganda, which to a great extent shapes their demand for our resources). And students can rightly say that their tuition is paying for most of the show, and that the piper has a right to call the tune.
In real terms, the economic shift away from subsidized higher education to a more tuition-based model puts educators in a weaker position in terms of the conflict between pedagogy and markets. But it does not change the fact that the conflict is inherent to the educational project as long as professors profess to have something to teach and students have their own reasons for going to college, which are more and more about class anxiety. (Collection development librarians, to the extent that we claim relevant expertise, are caught in the same conflict between pedagogy and markets.)
9. Internal qualitative research.
A way forward for administrators who are stuck in a market situation may be to use more internal qualitative research about library users and use. In-depth profiles of a variety of users and focus groups designed to elicit unanticipated information are approaches to qualitative research that can provide both ideas for new directions for growth and useful talking points for reporting purposes. These techniques allow data collection to preserve important differences among types of use and types of users, and also allow for the generation of insights regarding the causes of change over time that circulation and download statistics cannot provide. Some qualitative data is already collected as part of many libraries’ assessment programs. What I am advocating here is a shift of emphasis, as a way of better capturing the connection between library collections and services and the mission of the university.
My philosophy about this.
Regarding the role of philosophy in interpreting library statistics and acting on them, I will be up-front and say that I favor an alternative to chasing after the majority of users or potential users. I start with the assumption that educators are not there to educate students (transitive verb), but to provide opportunities and assistance to students who want to educate themselves. If I worry about all of the students who don’t use the library or who use it poorly, I will die of depression, because the more we dumb down the collection or our interaction with our users, the more we will find ourselves in competition with mass media. I prefer to make a range of serious resources available to students who have the motivation to make use of them. Their numbers may be small at times, but when a student who is motivated downloads an article from one of our databases and actually reads it and thinks about it, that download is worth immeasurably more to me, as a librarian, than the more numerous downloads of articles by uninterested students who are doing the minimum amount of work required to pass a class. The university is responsible for providing the best educational opportunities possible to its students, but students are responsible for their own education. Our use of circulation statistics should consider the fact that what we provide are opportunities for intellectual growth. The students have to meet us halfway.