Assessing Library Services: A Practical Guide for the Nonexpert

Lisa R. Horowitz

The idea of assessing library services can intimidate even the most seasoned librarian. This article is a straightforward introduction to assessment for the nonexpert, written for reference and public service managers and coordinators who want to analyze data about existing services or incorporate assessment into plans for a new service. It addresses how to approach assessment, how to make it more useful, and how it can improve a service over time. Those who are new to assessment will find this article a helpful starting point for service evaluation; those who are experienced in assessment can share this article with nonexpert staff to explain the basics of assessment, dispel anxieties, and generate support for assessment projects.

Background in Assessment

Many books and articles have been written that contribute to knowledge in the library field of how to do assessment. A few offer much more information than this article does, yet still provide introductory information for the nonexpert learning how to do assessment. A major contribution to the field is the Reference Assessment Manual, published by the American Library Association (ALA) in 1995. This comprehensive volume of articles about reference assessment aims to provide practicing librarians, reference managers, and researchers with access to a wide range of evaluation instruments useful in assessing reference service effectiveness. 1 The book comprises three major parts: (1) a set of essays that describe the appropriate research processes and the best tools to answer certain kinds of questions, (2) an extensive annotated bibliography of articles about reference assessment and performance evaluation of reference librarians, and (3) a summary of many of the instruments described in the bibliography, most of which are included on a (sadly now outdated) 3½" floppy disk.

As that book is well more than a decade old, several authors have worked to bring the information in it up to date. Jo Bell Whitlatch's book, Evaluating Reference Services: A Practical Guide, updates the Reference Assessment Manual, comprehensively including articles from 1995 through the book's publication date, again with annotations. In her book, she also includes a valuable, in-depth description of how to assess reference services. 2 It is an excellent introduction to evaluating reference services, using simple language and practical suggestions, including sections such as "What is a Survey, and Why Would I Want To Conduct One?" and "Ethical Issues" regarding observation techniques. More recently, Eric Novotny's book of articles specifically addresses how to assess reference in a digital age. 3 The articles cover a broad range of reference modes (chat, e-mail, online tutorials, etc.), and a number of instruments are included as well. Although not necessarily written for the nonexpert, it does update the work of the Reference Assessment Manual and Evaluating Reference Services.

Reference is only one of many services that libraries assess, and there are a number of books that more broadly address service assessment. A recommended introduction to the evaluation process as a whole is Joseph Matthews's The Evaluation and Measurement of Library Services. 4 The chapters describe what evaluation is and its purposes, and what tools to use and why. There are several individual chapters devoted to specific library services as well.
A number of other books have been written that provide broader knowledge of how to do research and conduct data analysis, but those fall outside the scope of the audience for this article. 5 Others have focused on service quality. Peter Hernon and Ellen Altman describe how to assess service quality with a focus on library users in Assessing Service Quality: Satisfying the Expectations of Library Customers. 6 Individual chapters explain how to define performance standards, and when to use certain kinds of tools. Darlene Weingand's book, Customer Service Excellence: A Concise Guide for Librarians, uses a libraries-specific approach to the business concept of customer service excellence, with a focus on public libraries. 7

After the turn of the twenty-first century, the need for assessment data became greater as library resources became more and more restricted. New literature was produced that showcases different ways to assess in order to produce data with more impact. Roswitha Poll and Philip Payne explain that because resources within institutions and communities have become more and more scarce, libraries must find ways to prove their contribution to learning, teaching, and research quantifiably. 8

Lisa R. Horowitz (lisah@mit.edu) is Coordinator of Central Reference Services at Massachusetts Institute of Technology Libraries in Cambridge.

They do so by concentrating on impact measures for libraries and information services. Larry Nash White provides a case for assessment of intangibles in libraries, as the world has moved toward a focus that is more service and information oriented than production oriented. He stresses that library "goodness" could be considered one of these intangibles. 9 Peggy Maki, former director of assessment at the American Association for Higher Education, combines an assessment focus on student outcomes with a description of how to plan assessment for that purpose. 10 Steve Hiller and Jim Self, two well-known library assessment experts, describe in one article how data gathered from assessment can help with planning and decision making. 11 That article includes a comprehensive literature review of the use of data in libraries and offers contemporary examples of libraries that have used data successfully for planning and management.

In 2002, a committee of ALA's Reference and User Services Association (RUSA) was tasked to create guidelines for assessment of reference. In the end, there were too many variables to set definitive guidelines, but the committee did develop a toolkit, introduced in 2007, that explains in detail what kinds of analysis are best used with which assessment tools. The toolkit points to a variety of tools used by various libraries, and gives an extensive bibliography of references. 12 This toolkit can be a valuable resource for the nonexpert.

What none of these articles or books offers is a concise description of what the nonexpert needs to know in order to begin an assessment process. This article attempts to explain the basic premises that assessment requires and how to move toward an assessment of a service, having defined a scope for the project that is within budget, works with the staff available, and serves a valuable purpose. The rest of the article has four parts. The first section describes general principles upon which all assessment efforts should be founded. The second section explains how to plan for assessment within the context of the resources and staff time available. The third section explains how to implement these principles and planning, and the fourth section explains post-assessment evaluation: how to assess the effectiveness of your assessment project, and how to move forward. 13

SECTION I: General Principles

Poll, a frequent contributor of articles on performance measurement (especially for the International Federation of Library Associations), defines performance measurement as "the collection of statistical and other data describing the performance of the library, and the analysis of these data in order to evaluate performance. Or, in other words: Comparing what a library is doing (performance) with what it is meant to do (mission) and wants to achieve (goals)." 14 Library staff and administrators want to assess services for a variety of reasons: to justify positions or resources, describe patterns of use and demand for services, determine appropriate service hours or staffing, compare with peers, improve services, evaluate new services, etc. Most important in the recent literature is assessing the impact of the library and its services on its constituencies. 15 The word assessment is used to describe all the various strategies used to satisfy these myriad needs.
Although assessment can be initiated for any number of reasons, with questions that can be answered with a variety of tools, all assessment strategies should meet common standards to ensure that they are accurate, will lead to improvements, and are appropriate to their specific purpose and institution. By adhering to the following seven basic tenets of good assessment, results will be richer and more likely to be trusted by administration. 16

The Seven Basic Tenets of Good Assessment

1. Devise a set of performance standards based on the values of your library. A standard, according to The Oxford American Dictionary of Current English, is "an object or quality or measure serving as a basis or example or principle to which others conform or should conform or by which others are judged." 17 Performance standards in this context are those targets, outcomes, or metrics that are desired to indicate that a service is successful. As Whitlatch explains, typically "you need to assess against some agreed upon performance standards. You develop goals for service and measurable objectives, and then determine whether the objectives are met" 18 by comparing your actual performance to the standards that you have set. The performance standards that you employ will be defined by the values of your library. Your institution or unit may already have its own rules about service (e.g., "We respond to any e-mail question by 5 p.m. the next business day.") that you can use as your standard, and you can also rely on or modify other standards for reference and information service (see a list of possibilities in the appendix at the end of this article). When you devise performance standards, consider your goals and what will make your service a success within the context of the values held by your library. Hernon and Altman describe a method that leads from values to standards. They suggest that staff look at a list of service quality expectations and identify first-, second-, and third-level priorities among them: "Such a list of priorities serves as a reminder that the intent is to identify those expectations that staff believe are most essential to meet and to lay the foundation both for expanding the list over time and for deciding where to set [standards]." 19

The set of priorities identified by staff becomes the foundation for performance standards. Self explains how, at the University of Virginia, where a balanced scorecard 20 method of assessment is used, values lead to standards: "By limiting the number of scorecard metrics [items that indicate success within the specific standard], it forces us to decide what is important, and to identify those numbers that truly make a difference." 21 These values will also influence what you assess and why, which leads to the next principle.

2. Determine what questions you want answered. Many people begin by looking at data that already exists, such as data reported for accreditation or generated by the catalog, and then try to determine what that data tells them about their services. Although there is value to analyzing preexisting data, preexisting data can only answer very specific questions; as a result, limiting oneself to mining preexisting data severely restricts the range of questions that can be answered. The most relevant assessment instead begins by asking questions generated by your performance standards. If a performance standard requires that you respond to any question by 5 p.m. the next day, your questions might include, "Are we meeting our question response-time goal? How often do we miss that goal? Can that be improved? Are our users satisfied with our response time?" As questions based on specific standards are generated, it will be found that there are many possible directions to explore.

3. Make sure to use the most appropriate data-gathering tool for your questions. Some people begin assessment by choosing a particular data-gathering tool, even before their questions have been determined. For example, they begin by deciding that a focus group or a survey would be a good idea because they need input from library users, and they feel these are the logical ways to gather it. However, this is an ill-considered practice that can restrict you to answering only those questions that can be answered by that tool. The list of possible questions to ask is unlimited if it is considered before choosing a tool. A best practice is to let the questions determine what data is needed, and then allow data needs to point toward the most appropriate tools. 22 There are a number of tools that can be used, and different tools may offer quantitative or qualitative feedback. Qualitative feedback is non-numerical and may be generated by surveys, interviews, focus groups, or even observation. Quantitative feedback is often generated through unobtrusive observation, such as gate counts or circulation numbers, but is sometimes imposed on qualitative feedback, such as where a scale from 1 to 5 is assigned to something that is qualitative. This is described in more detail in tenet 4. Not all assessment tools serve every purpose, and both quantitative and qualitative methods have value. As Danny Wallace and Connie Van Fleet explain, "the specific methodology for the evaluation project devolves from the perspective established for the evaluation effort. Certain methodologies lend themselves to certain perspectives. Focus groups, for instance, are very useful in determining patron perceptions, are of limited use in determining patterns of use, and are of extremely limited value in assessing efficiency or effectiveness." 23 Focus groups are guided, moderated discussions on a topic familiar or meaningful in some way to the participants involved.
Because of their format, focus groups and one-on-one interviews are not usually representative enough to be generalized to a large population. This kind of qualitative feedback can best be used to solicit ideas, such as for new library initiatives, or to corroborate and expand on ideas learned from a larger-scale survey. Surveys, in turn, are useful for gaining perceptual and opinion feedback from a large population, but are less useful for confirming factual information. For example, if users respond that they visit the library once a month, it means that users reported they visited the library once a month. But to learn about actual patterns of use or to corroborate facts, direct observation of intermediated transactions or indirect examinations of electronic transactions, gate counts, etc., are more useful. Poll and Payne explain the value of combining qualitative and quantitative methods: "The results of qualitative methods will of course have a subjective bias; they show the perceived outcome. They should therefore be compared with results of quantitative methods or with statistics of library use in order to validate the results. But the anecdotal evidence will be invaluable in reporting to the public and the institution, as it serves to make statistics understandable and believable." 24 For those interested, an assessment glossary has been compiled that offers complete definitions of these methodologies, which can help determine the best use of each tool. 25 Tools are also described in more depth later.

4. Define plans for data use. Regardless of the tools used, ensure that all data collected is used in a contextually appropriate way. Additionally, make sure the interpretation of data will suggest actions that can improve services or fulfill the needs of assessment identified above. Communication of the use of data and its results needs to be clear as well. Staff can get frustrated when they must gather data simply because it is easy to gather, but they see no outcomes from the data collection.

Users can become frustrated if they are asked questions, but no action results. Maki writes, "Assessment is certain to fail if an institution does not develop channels that communicate assessment interpretations and proposed changes to its centers of institutional decision-making, planning and budgeting." 26 It is equally important to share that information with users as well. This demonstrates to all the stakeholders that their participation in assessment activities has had value, and creates an atmosphere where change and improvement are expected and accepted.

How data is used also has a bearing on the kind of data that needs to be collected. Decisions will need to be made about whether qualitative or quantitative data will answer your question, and, if quantitative, what measurement level is needed. 27 This level describes which scale is used in rating questions. One of four common types of scales is used, depending on whether the numbers in the scale are truly continuous. In library studies, a numerical rating is often assigned to satisfaction, success, or an evaluation of an instructor. The importance of the measurement level is that with continuous numbers, statistical measures are legitimate. Using certain statistical and arithmetical functions on non-continuous numbers leads to invalid or questionable results. The least numerical scale is the nominal scale, in which numbers are merely used as named categories; for example, library call numbers or a phone number. The numbers do not offer a measurement, and cannot be used in a statistical way. On the other end of the spectrum is the ratio scale, the most truly continuous scale, which can be used in any statistical calculation. It has a true zero point and can be divided into infinitesimal pieces which remain meaningful. This scale includes numbers such as people's ages, or lengths of time. The interval scale is similar to a ratio scale, in that the differences between the values are fixed and can be divided, but the value of zero is arbitrary, such as with temperature scales or calendar dates. The zero point does not actually represent a lack of the feature being measured. This means that certain kinds of statistical work can be done with these numbers, but not all. Library assessment often involves the ordinal scale. Ordinal scales define rank (how satisfied users are, or who is the best teacher) but do not define a particular distance from one level to another. In library studies, there is often an ordinal scale that shows a satisfaction rating, a success rating, or an evaluation of an instructor. For example, a write-up explaining that library satisfaction scored 3.1 on a scale from 1 to 4 is ordinal. The problem is that the distances from 1 to 2 or 3 to 4 are not necessarily the same ("I am very satisfied" vs. "I am satisfied" vs. "I am dissatisfied," etc.). Even the number ranks themselves likely mean something different from one person to the next. So although the ordinal scale can be valuable, using it arithmetically to get a mean, for example, can be misleading. 28 Consider the implications of the measurement scale used when creating assessment tools and applying them in interpreting results.
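To make the caution about ordinal scales concrete, the short sketch below (a minimal illustration in Python; the satisfaction ratings are invented, not drawn from any study cited here) contrasts the mean, which assumes equal distances between ranks, with the median and the frequency distribution, which rely only on rank order.

```python
from collections import Counter
from statistics import median

# Invented ordinal satisfaction ratings (1 = dissatisfied ... 4 = very satisfied)
ratings = [4, 4, 4, 4, 3, 3, 2, 1, 1, 1]

# A mean treats the gap between 1 and 2 as identical to the gap between 3 and 4,
# which an ordinal scale does not guarantee, so report it cautiously if at all.
mean_rating = sum(ratings) / len(ratings)

# The median and the frequency distribution depend only on rank order,
# so they are usually the safer summary for ordinal data.
median_rating = median(ratings)
distribution = Counter(ratings)

print(f"mean (interpret with caution): {mean_rating:.2f}")
print(f"median: {median_rating}")
for level in sorted(distribution):
    share = distribution[level] / len(ratings)
    print(f"rating {level}: {distribution[level]} responses ({share:.0%})")
```

Reporting the full distribution alongside any single summary number keeps the write-up honest about what an ordinal scale can and cannot say.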
5. Triangulate the data. Corroborate results by collecting data through several different methods of assessment, ideally three. As RUSA's guide states, "the more sources of data, the better your analyses will be... [T]riangulation increases the validity of your analyses and results." 29 Valid results flow from multiple data streams, rather than relying on a single assessment tool. This may involve compiling the results of several quantitative analyses, or could be a combination of qualitative and quantitative data. For example, in a study of virtual reference at the Massachusetts Institute of Technology (MIT), the results of two user surveys were analyzed alongside chat and e-mail transcripts, activity levels and return rates for usage of the chat service, and usability studies. 30 Taken together, the results offered a rich variety of data indicating how MIT Libraries' virtual reference services could be improved. In another example, when deciding how to envision MIT reference services for the future, users answered a survey and focus groups were held to support or extend the information received from the survey, which was more comprehensive. Staff members were surveyed, and all the data was examined together and used to write a reference vision that could be used into the future. 31

6. Endeavor to maintain statistical validity. Maintaining statistical validity (occurring when variables truly do measure what they represent) and reliability (occurring when the tools used can be trusted to measure it consistently) is probably the hardest thing for the nonexpert to do. However, the most reliable results are statistically valid, will stand up to investigation by administrators and outside reviewers, and can be corroborated. Gathering numerical data with the appropriate measurement level is a good start. 32 There are, additionally, some tools to assist with statistical validity, such as random number generators and sample size calculators. 33 RUSA's guide offers suggestions for analytical tools that are best in certain contexts. If ensuring that results are statistically valid is a concern, a statistician can often be found at an institution or locally. Other sources are RUSA's Directory of Peer Consultants and Speakers, or conferring with someone from ALA's Office of Research and Statistics.
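For readers curious about what a sample size calculator (such as the one mentioned in note 33) is doing behind the scenes, here is a minimal sketch assuming the standard formula most such calculators use; the function name, the defaults of 95 percent confidence and a 5 percent margin of error, and the population of 10,000 are illustrative choices, not figures from this article.

```python
import math

def sample_size(population: int, confidence_z: float = 1.96,
                margin_of_error: float = 0.05, proportion: float = 0.5) -> int:
    """Estimate how many responses a simple random sample needs.

    Uses the common formula n0 = z^2 * p * (1 - p) / e^2,
    followed by a finite-population correction.
    """
    n0 = (confidence_z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    corrected = n0 / (1 + (n0 - 1) / population)
    return math.ceil(corrected)

# Hypothetical pool of 10,000 potential survey respondents
print(sample_size(10_000))  # roughly 370 responses needed
```

The point is less the arithmetic than the assumptions it exposes: a tighter margin of error raises the required sample size, while assuming a proportion other than 0.5 lowers it.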

7. Pretest data collection instruments. Make sure that questions in each instrument are understood by those who will be answering them, and that they measure what is desired. Ensure the data gathered by the tool actually answer the initial questions. Use a small group representing those who will be taking the survey, and watch them take the test. Consider having them answer the questions out loud. Be certain that subjects interpret the questions as intended, and do not find your questions ambiguous. Look at the results as though they are real, to be sure the results obtained answer the questions you asked. When you analyze the results, can you interpret them? Sometimes, as much as your questions seem to make sense, the data gathered in response to the questions does not actually add anything to what you already know. Testing the tools helps to avoid that problem.

By incorporating these seven basic principles into your thinking about and planning for assessment, you will find that assessment is easier to manage and less intimidating.

SECTION II: Planning Assessment

Assessment involves a number of steps. The first, and perhaps most important, is to take time to plan. Writing out a plan helps to manage the scope of the assessment, and makes it less intimidating. By writing or at least outlining a plan, how much is done at any time can be controlled. Doing so will also help reassure staff that incorporating assessment into their daily workflow will not be burdensome. An assessment plan will be most effective if it meets the following three criteria: (1) keeping a focus on actions for improvement or change, (2) maintaining a pace that makes the plan manageable, and (3) avoiding or addressing potential obstacles that can hinder the assessment process.

Focus on Actions

Assessment is most welcome when it results in positive changes or improvements, whether to users, workflow, use of resources, or determination about how to proceed with a new service. When planning assessment, keep in mind anticipated positive changes or improvements. This will help limit the scope of the assessment, as activities which are less likely to produce any sort of change can be eliminated or minimized. Efforts can then be focused on activities that will result in beneficial changes.

Make it Manageable

Assessment becomes unmanageable when its scope reaches beyond the control of the person implementing it. Make it manageable by limiting the scope where possible. Assessment takes time for all the people concerned: the plan writers, the creators of the tools, those who input data, those who analyze the data. Make sure the assessment plan can be managed within the amount of staff time available. Make the project smaller if staff are not able to take on a large project. That does not mean that the assessment will necessarily be less valuable. The impact of a smaller assessment project may pave the way for administrative support for additional staff time for assessment in the future. 34 Another way to ensure that time is available for assessment is to tie assessment to a project that is already planned. When planning to redesign reference space, incorporate assessment into that project. If starting a new training initiative, describe how to assess the success of the initiative. By making sure that assessment is not separate but is part of projects already planned, there is a logical space for assessment in the workload, and time becomes easier to manage. Plan the assessment over a cycle of several years, rather than considering it a one-time project. This enables it to be ongoing, allows for new questions over time, and limits expectations of ever having a complete assessment at any one point, which is certainly not manageable. 35 Make the plan modular. Having an assessment plan with different components that can be handled simultaneously or at unrelated times works well within a cycle of assessment. Tie various standards to certain years (for example, get user feedback on services every three to five years, but focus on workflow improvement in other years). Alternatively, focus on certain units or services within each year, without overloading staff with assessment activity.
This modularity also keeps assessment manageable by reducing the expectation of what can be achieved at any one time, while nonetheless illustrating that in the long term, a variety of assessment projects (and related improvements!) can take place.

Do not overwork your human resources devoted to assessment. There is a limited pool of survey subjects, focus group participants, willing interviewees, and library staff. Try not to overload them with too many assessment requests at any time, so as not to use up their willingness to participate. Make sure that the plan does not dictate the use of these resources many times within a short period. 36

Finally, assign responsibility for assessment. Assessment is less likely to stall if a particular person or group champions it. This champion should represent the stakeholders in the assessment process, people who will be affected both by the assessment process itself and by the changes that may take place based on the results. This champion can then advocate for funding support and staff commitment to the assessment process, and can maintain history and continuity over time, as assessment needs change.

Overcoming Obstacles

The previous section on managing assessment hints at obstacles that may hinder assessment, with the goal of handling them during the planning process. This section highlights possible obstacles and suggests how to prevent them from hampering that process. There are three kinds of obstacles that can hamper good assessment. The first involves administrative support, which comprises buy-in, funding, and allocated staff time or positions. If administrators do not believe that assessment can result in positive changes and improvements, start smaller.

Select an area that requires less support from them, and show them how valuable assessment can be. Consider writing the assessment plan in conjunction with stakeholders to encourage buy-in. Maki advises communicating regularly with stakeholders to promote support for assessment: "If an institution aims to sustain its assessment efforts to improve continually the quality of education, it needs to develop channels of communication whereby it shares interpretations of students' results and incorporates recommended changes into its budgeting, decision making, and strategic planning as these processes will likely need to respond to and support proposed changes." 37 Assessment can require tools, supplies such as postage or incentives, or even simply training for staff. Ideally, library administration will support the assessment initiative and provide some funding when needed. 38 However, lack of funding does not mean that assessment cannot be done. If there is not an assessment budget, look for tools that are free or inexpensive (see RUSA's guide for ideas). Consider allotting more staff time for a homegrown product, if possible. Again, when shown that assessment produces results, it may be possible to get future funding for something with a larger scope. In the written plan or outline, make it clear what can be done within the funding currently available, and explain what might be done with more funding.

The second type of obstacle is internal. As mentioned before, it is crucial that staff support the assessment and believe that it will result in positive change. Otherwise, it will not be supported, and may be resented. Staff who are resentful or who do not believe in the process may forget or refuse to provide data or input when it requires manual action. Conversely, they may anticipate an outcome that negatively impacts them, and therefore may provide inaccurate data in an attempt to skew the results. The best way to combat this type of obstacle is to share the assessment plan and what it means with the staff involved. Make sure that they understand the likely positive impacts that their cooperation will produce.

The third type of obstacle is external: the users themselves. Respondents can hinder assessment through low response rates or through sham responses. Response rate is important because low response rates from a representative sample can reduce the validity of generalizing the results to the whole population. 39 (Response rates from a non-representative sample are less critical in that the results cannot be applied to the general population anyway.) Response rates can be raised through incentives for participation, support and encouragement from a critical stakeholder whose opinion makes a difference to the respondents, or through clear benefits in how the results will be used. Sham responses are much less likely but can actually occur as the result of incentives. If the incentive is very good (e.g., every responder will receive $10), there may be respondents who answer randomly simply to obtain the incentive. Generally, this is something that cannot be totally avoided except by limiting incentives. 40

SECTION III: Implementing the Assessment Plan

Up to this point, the basic tenets for doing assessment and the planning required to make it effective have been described. The following is a suggested list of steps that lead to assessment that is manageable and actionable. It relies on the standards created, which in turn depend on the values of the institution. This leads to a set of measurable objectives, which then point to the appropriate tools to obtain those measurements.
Finally, there are suggestions for where to go for input on how to analyze and interpret the data that is gathered. 41

Ask Questions Based on Standards

As described above, the assessment will rely upon a set of standards. In the MIT Libraries' Public Services department, for instance, the following were developed based on standards and guidelines from the library literature (see the appendix) as well as from the libraries' own goals:

1. Staff provide consistent and excellent service in all interactions.
2. Service is both available and delivered in modes appropriate to user needs.
3. Users get accurate, appropriate and timely service.
4. Users are aware of and understand our services and policies.
5. Staff are empowered to be successful in their roles.
6. There is effective, efficient and balanced use of resources. 42

The standards that are listed above are not actually measurable. They suggest, in vague terms, what the ideal service would provide. The University of Virginia Library has similar standards. For example, it includes a standard, "Provide excellent service to users of the University of Virginia Library." 43 Tenet number 2 from the General Principles section provides the next step to get to something measurable: ask questions for which answers are desired. Look at one or all of your standards, and brainstorm questions that will tell whether or not the standard has been met. An example will make this clearer. At MIT in 2007, public services administrators chose to focus assessment on the standard, "Users are aware of and understand our services and policies."

Many committees and units analyzed that standard from their own point of view. The Research and Instruction Support Group (RISG) brainstormed questions related to research and instruction, such as, "Do people know we offer research assistance in a variety of formats?" and "How do users find out about our research and instruction services?" The group brainstormed a total of almost forty questions. At the same time, the Client Focus Group brainstormed questions related to outreach. In each case, the standard tied the questions together, so that all units were considering users' awareness and understanding of our services. However, generating the questions is only the first step. Be sure that the project is manageable by reducing the number of questions that will be addressed. It is possible to combine questions into categories and, again going back to the stated values, determine which are the most important to analyze at the moment. Prioritize the questions that are most pertinent now, whether because administrators need answers or because they lead to actions that can improve the service. RISG generated forty questions, but in the end chose to focus on only a single one, given the time available for this project and the staff available to work on it.

Define Measurable Objectives for Each Question

A measurable objective is just that: an objective, goal, or target that can be measured. Look at the top priority questions generated in the last step, and then reword them into something measurable. Moving from the questions to something measurable can be the most challenging part of assessment because it means assigning numbers to the institution's values. Especially if you have no idea how well your service is currently doing, it is hard to assign what seems like an arbitrary number to an objective, but it is important to do so. Hernon and Altman agree: "Because there are no universally accepted standards or norms for performance on any aspect of library processes, functions, or service, senior staff in each library have to decide on the acceptable level of performance." 44 There are several ways to accomplish this. The first is to define numbers that are desirable as well as reasonable. For example, if a standard already exists (let's say that staff e-mail response turnaround is by 5 p.m. the next business day), a measurable objective can be developed that states, "Eighty percent of e-mail questions get a complete answer or significant starting help by 5 p.m. the next day, while the other twenty percent get some sort of response in that same time period." Those numbers seem completely arbitrary without context. The context of the institution (anecdotal evidence or other data) will guide what is desirable and reasonable. At the University of Virginia (UVa), standards are translated into what they call metrics, which are the specifically measurable way to view the standard. UVa developed two metrics for the standard example above ("Provide excellent service to users of the University of Virginia Library"): "Overall rating in student and faculty surveys" [given by the institution] and "Customer service rating in student and faculty surveys" [given by the library]. Each of these metrics then has two targets, which are numbers defined by the staff that have been determined to be desirable and reasonable.
The first target for the first metric, "At least 4.25 out of 5.0 from each of the major user groups: undergraduate students, graduate students, humanities faculty, social science faculty, and science faculty," is their most desirable target, while their second target, "A score of at least 4.0 from each of the major user groups," is reasonable and still desirable. 45 If they meet these targets, they feel that they have been successful in this particular area, while if they do not, they have room for improvement.

The second way to develop measurable objectives is more conservative, but can lead to improvements without first assigning a seemingly arbitrary number. This involves answering the questions through data that already exists (or newly generated data) to provide a starting point from which to base future improvement. For example, the question that RISG decided to answer was, "How aware are students, faculty and library service desk staff of our subject guides?" Rather than deciding that a certain percentage of users should indicate awareness of the subject pages (thus imposing an arbitrary number based on our values), the group decided to look at a variety of existing data to see whether users did use and understand them. Gathering a variety of data about the subject pages helps determine whether to improve the pages, advertise them better, or discontinue investing resources in them. If the second method is chosen, the idea is to gather data for a baseline starting point, but the next time the same service is assessed, a number that is less arbitrary is applied. For example, after RISG gathered its assessment information in 2007, finding that 20 percent of users were aware of these tools, it set a measurable objective of 50 percent awareness overall (undergrads, grad students, faculty) within three years. The first is a benchmark, while the second is a new measurable objective to strive for.

Once a list of measurable objectives is developed, it can be trimmed to make it manageable. Each objective may have one or several methods that could be used to determine its success. It is ideal to triangulate, as indicated in the principles above, using more than one method to corroborate a set of results. A small assessment project would therefore need to focus on only one or perhaps two measurable objectives. If there is more staff time allotted to assessment, or more projects occurring in which it is logical to incorporate assessment, it may be possible to manage more measurable objectives. No matter how big the project is, the process has now begun: "Setting targets and intended results ensures that the library engages in regular assessment, consistently uses the same questions or statements, and commits the resources necessary to meet its goals."
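As an illustration of how a measurable objective like the e-mail turnaround example above can be checked once data exists, the sketch below (in Python; the log entries are invented, and weekends and holidays are ignored for brevity, so this is a simplification rather than a working policy check) computes how often replies met a 5 p.m. next-day deadline and compares the rate against an 80 percent target.

```python
from datetime import datetime, timedelta

# Invented log of e-mail questions: (received, answered) timestamps.
# In practice these would come from an e-mail or ticketing system.
log = [
    (datetime(2009, 3, 2, 9, 15), datetime(2009, 3, 2, 16, 40)),
    (datetime(2009, 3, 2, 14, 5), datetime(2009, 3, 3, 11, 20)),
    (datetime(2009, 3, 3, 10, 30), datetime(2009, 3, 5, 9, 0)),
]

def met_deadline(received: datetime, answered: datetime) -> bool:
    # Objective: a reply by 5 p.m. the next day (a real check would
    # skip weekends and holidays to honor "next business day").
    deadline = (received + timedelta(days=1)).replace(
        hour=17, minute=0, second=0, microsecond=0)
    return answered <= deadline

met = sum(met_deadline(r, a) for r, a in log)
rate = met / len(log)
print(f"{rate:.0%} answered by 5 p.m. the next day (target: 80%)")
print("objective met" if rate >= 0.80 else "objective not met")
```

Whatever system supplies the timestamps, the comparison against the stated target works the same way, which is what turns a vague standard into something that can be tracked over time.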

Design Appropriate Tools

Once measurable objectives for each question have been defined, the appropriate tool can be applied. The kinds of information needed will determine the best tool for the job. For example, in the MIT Libraries, RISG evaluated the use of subject guides by looking at the results of a previous library survey question which asked about users' use and awareness of subject guides. Additionally, they compared the number of hits to the subject guide pages versus the libraries' other webpages, to see how much the subject guides were used in comparison to how much effort was being put into them. They also examined which of MIT's nonlibrary webpages linked to the subject guides, to get a sense of how often faculty and students advocate the libraries' subject guides. These latter two methods were both unobtrusive observation techniques. For some additional qualitative input, staff were surveyed for anecdotal information about their interactions with users and their use of subject guides, and RISG analyzed a staff self-appraisal that had asked some questions about staff knowledge and use of subject guides. Other than the anecdotal evidence, this was all data that already existed in some format, but by pulling it all together, RISG was able to evaluate both quantitative and qualitative evidence of the use and awareness of the subject guides as well as their value to users.

The Assessment Glossary 47 can assist in explaining the various tools and what they can be used for, as does the RUSA Guide. The UVa balanced scorecard, where the tools are described following the metrics and targets, can also be consulted. In addition, numerous articles and books explain how to use a particular tool or describe an assessment that used a specific tool. For example, there are many excellent books written about how to conduct focus groups, often for commercial settings. One article that is especially useful was written by a consultant for nonprofit organization development, Judy Sharken Simon. It is concise and explains very simply the important points to keep in mind when conducting focus groups. Because its focus is on nonprofits, it has an understanding of the value of a focus group that differs from a traditional commercial approach. 48 Hernon and Altman's book also has a helpful chapter describing how to work with focus groups. 49 For an explanation of surveys, there is a useful chapter in Assessing Service Quality. 50 An especially practical article by the Rand Corporation discusses how to conduct surveys via the Web. 51 It describes the pros and cons of offering a survey in electronic form, explaining various issues related to design and implementation. It is intended for businesses, but much can be gleaned from its uncomplicated description of concerns. Observation, as explained previously, describes data that is gathered electronically, such as gate counts or circulation statistics. In library terminology, though, observation is often used to describe personal observation of intermediated transactions, whether carried out when the participants are aware (as when a supervisor automatically reviews chat transcripts or watches interactions at a service desk), or through unobtrusive evaluations, for example, when performed by mystery shoppers. The mystery shopper idea, which comes from business, is when people are asked (or paid) to present questions to a service point, whether planned questions or impromptu, and report certain indicators back to an assessment team. Andrew Hubbertz wrote a rather scathing rejection of this kind of observation that offers thoughtful ideas. 52
His article describes and gives solid evidence showing how unobtrusive observations of actual interactions are often neither legitimate nor needed forms of evaluation. The measurable objectives, combined with an understanding of the various kinds of tools available, will point toward the most appropriate tools for the job.

Final Steps in Implementation

One key step in implementing the plan is analyzing and interpreting the data. That cannot be described adequately in an article such as this. A good introduction to the variety of possible statistical analysis tools is Hernon and Altman's chapter on "Interpreting Findings to Improve Customer Service." 53 Another set of pointers to statistical tools is in the RUSA guide, and experts can be consulted for assistance, especially if extensive statistical analysis is required. Many librarians do choose to interpret quantitative findings in qualitative ways rather than using traditional statistical tools. Keeping in mind the tenet to maintain statistical validity, this is possible, as long as caveats related to sampling or other biases in the way the tools are used and/or interpreted are reported. Alternatively, using quantitative data to validate qualitative information can be useful as well. Library directors need to be able to emphasize the story behind the numbers to make their case for the value of the library. 54 Once results are analyzed, they can be applied to make improvements to the service or workflow, to justify use of resources, etc. It is critical to communicate with the stakeholders involved. Hernon and Altman point out that organizational values again come to the fore as reported items are highlighted. 55

SECTION IV: Final Analysis for Success

Assessing the Assessment

After analyzing and interpreting the results, and implementing improvements or actions based on those results, the assessment continues. It is important to assess the assessment process and products regularly. The following are questions that should be asked about the assessment process to help define future assessment:

Was the process supported by the staff involved? If not, reconsider the process that led to the assessment effort. Involve more staff in the design of the assessment, or in the analysis and implementation of results.

Were the results shared with users? Did users support and respect the process and the results? It is always valuable for users to understand the work that goes into improving libraries, and the more that is explained and shared with them about the process and resulting actions, the more they will support continued assessment efforts.

Was the service actually improved? Did the assessment lead to positive action? If not, reconsider the questions asked. Maybe the tool used did not answer that question and therefore led to results that could not be applied. Maybe the tool was less effective than it could have been, and care should be taken to revamp this tool if used again.

Are the standards still applicable? At UVa, the standards, metrics and targets are reconsidered each year. As priorities and values change, standards may also change.

At MIT Libraries, the time came to assess the assessment plan described previously. 56 Several parts of the plan's implementation did not work well. For example, a cycle was imposed in which each standard was examined in a particular year, as described previously. This led to the creation of assessment projects in order to meet the need of the standard, rather than assessment being part of projects already deemed priorities. Second, the standards themselves were reexamined, because it was found that they combined factors that might be better assessed as separate entities. Fortunately, it was also found that the assessment plan had begun to create a sense that assessment was part of the workflow. All new public services projects included an assessment plan, defining how it would be determined that the project was successful. Staff were asking when to switch focus onto a new standard, because there was something else that they thought would be useful to measure but it was not part of the current cycle. Reassessing your assessment process provides meaning and context to the assessment that you do, and allows it to become a continuous process. When assessment is a regular part of everyone's daily workflow, continuous improvement is a goal that can be obtained.

Conclusion

Assessment of library services does not need to be intimidating, nor does it need to tie up a huge number of resources. If done well, it can lead to positive improvements and changes in services that strengthen a library's relationship to its community. By following basic principles for assessment, planning the assessment to limit its scope, implementing a plan that focuses on positive change, and assessing the process regularly, a library can comfortably incorporate assessment into its regular workflow.

The author would gratefully like to acknowledge the input and advice given by Susan Ware of Pennsylvania State University, Marlene Manoff and Patsy Baudoin of MIT, and the MIT Libraries for providing research leave to allow me time to write.

References and Notes

1. American Library Association and Reference and User Services Association Management of User Services Section (MOUSS) Evaluation of Reference and Adult Services Committee, The Reference Assessment Manual (Ann Arbor, Mich.: Pierian Pr., 1995).
2. Jo Bell Whitlatch, Evaluating Reference Services: A Practical Guide (Chicago: ALA, 2000).
3. Eric Novotny, ed., Assessing Reference and User Services in a Digital Age (Binghamton, N.Y.: The Haworth Pr., 2006).
4. Joseph R. Matthews, The Evaluation and Measurement of Library Services (Westport, Conn.: Libraries Unlimited, 2007).
5. Ronald R. Powell and Lynn Silipigni Connaway, Basic Research Methods for Librarians, 4th ed. (Westport, Conn.: Libraries Unlimited, 2004); G. E. Gorman, Peter Clayton, Sydney J. Shep, and Adela Clayton, Qualitative Research for the Information Professional: A Practical Handbook, 2nd ed. (London: Facet, 2005). For example, Powell and Connaway's Basic Research Methods for Librarians and Gorman and Clayton's Qualitative Research for the Information Professional were recommended to me for that purpose. However, their focus is much less on a nonexpert attempting to do evaluation to improve services than on a librarian trying to do authentic research to improve the profession or for publication purposes.
6. Peter Hernon and Ellen Altman, Assessing Service Quality: Satisfying the Expectations of Library Customers (Chicago: ALA, 1998).
7. Darlene E. Weingand, Customer Service Excellence: A Concise Guide for Librarians (Chicago: ALA, 1997).
8. Roswitha Poll and Philip Payne, "Impact Measures for Libraries and Information Services," Library Hi Tech 24, no. 4 (2006).
9. Larry Nash White, "Unseen Measures: The Need To Account for Intangibles," The Bottom Line: Managing Library Finances 20, no. 2 (2007).
10. Peggy L. Maki, "Developing an Assessment Plan to Learn about Student Learning," The Journal of Academic Librarianship 28, no. 1 (2002).
11. Steve Hiller and James Self, "From Measurement to Management: Using Data Wisely for Planning and Decision-Making," Library Trends 53, no. 1 (2004).
12. RUSA, RSS Evaluation of Reference and User Services Committee, Measuring and Assessing Reference Services and Resources: A Guide, sections/rss/rsssection/rsscomm/evaluationofref/measrefguide.cfm (accessed Mar. 22, 2009).

13. Although I am not a scholar in assessment or a statistician, for almost twelve years I have been working on assessment projects for the MIT Libraries and for the American Library Association's Reference and User Services Association (RUSA). I have been involved in assessing in-person and virtual reference services using surveys, focus groups, and staff forums, as well as detailed analyses of reference transaction statistics, e-mail and chat transcripts, and time logs. Within RUSA, I chaired the Evaluation of Reference and User Services Committee and was the initiator of, as well as a contributor to, a project to produce recommendations for measuring and assessing reference services. In 2007, I led the final process for producing a modern definition of reference for RUSA (guidelines/definitionsreference.cfm, accessed July 19, 2009). I wrote this article to provide the kind of information that would have helped me in my earlier assessment work.
14. Roswitha Poll, Measuring Quality: International Guidelines for Performance Measurement in Academic Libraries, IFLA Publication 76 (1996).
15. See for examples: Maki, "Developing an Assessment Plan"; Poll and Payne, "Impact Measures"; White, "Unseen Measures"; and Susan J. Beck, "Making Informed Decisions: The Implications of Assessment," Proceedings of the Eleventh National Conference of the Association of Research Libraries (Charlotte, N.C.: April 10-13, 2003), mgrps/divs/acr/events/pdf/beck.pdf (accessed Mar. 22, 2009).
16. These tenets reflect knowledge that I have accumulated over time. They have been captured in slightly different form in the RUSA Guide, to which I was a contributor.
17. Oxford Reference Online, s.v. "standard," reference.com/views/entryhtm?subview=main&entry=t21.e
18. The idea of basing assessment on guidelines and standards is commonplace. This quote is from personal correspondence with Jo Bell Whitlatch, a well-known scholar in reference assessment, on July 1. Hernon and Altman said similarly, "Any measure, including customer-related ones, should be linked to performance targets and outcomes, which monitor, and are intended to improve, the quality of service over time," Assessing Service Quality.
19. Hernon and Altman, Assessing Service Quality.
20. Oxford Reference Online, s.v. "balanced scorecard," Oxford Reference Online.
21. Jim Self, "Using Data to Make Choices: The Balanced Scorecard at the University of Virginia Library," ARL: A Bimonthly Newsletter of Research Library Issues and Actions 230/231 (2003): 2. See also current scorecard metrics (accessed Mar. 22, 2009).
22. RUSA, Measuring and Assessing Reference Services and Resources: A Guide, Section
23. Danny P. Wallace and Connie Jean Van Fleet, Library Evaluation: A Casebook and Can-Do Guide (Englewood, Colo.: Libraries Unlimited, 2001).
24. Poll and Payne, "Impact Measures."
25. Lisa R. Horowitz, Assessing Library Services: A Practical Guide for the Non-Expert Glossary, Access: web.mit.edu/~lisah/www/professional/glossary%20-%20Jan% pdf (accessed Mar. 22, 2009).
26. Maki, "Developing an Assessment Plan."
27. Measurement level is defined as "the type of measurement scale used for generating scores, which determines the permissible transformations that can be performed on the scores without changing their relationship to the attributes being measured," from A Dictionary of Psychology, ed. Andrew M. Colman (Oxford Univ. Pr., 2006); Oxford Reference Online, reference.com/views/entryhtm?subview=main&entry=t87.e4905 (accessed Mar. 22, 2009).
28. Horowitz, Glossary, offers full definitions of the different scales used for quantitative data and the implications for them, as well as other terminology used in assessment.
29. RUSA, Measuring and Assessing Reference Services and Resources: A Guide, Section
30. Lisa R. Horowitz, Pat Flanagan, and Deb Helman, "The Viability of Live Online Reference: An Assessment," portal: Libraries and the Academy 5, no. 2 (2005).
31. MIT Libraries, Reference Vision Project Report, 2002, refvision_psmg.pdf (accessed Mar. 22, 2009).
32. For full definitions, see Horowitz, Glossary.
33. One example of a sample size calculator is offered by Creative Research Systems (accessed Mar. 22, 2009). This also helps define confidence intervals (how confident can you be that your results can be extrapolated). One type of random number generator can be found online (accessed Mar. 22, 2009). However, there are many different types available online as well as in a variety of reference works.
34. Maki, "Developing an Assessment Plan," 13; indicates that "Initially, limiting the number of outcomes colleagues will assess enables them to determine how an assessment cycle will operate based on existing structures and processes or proposed new ones."
35. Maki, "Developing an Assessment Plan," 11; includes a variety of suggestions for how to establish a schedule for assessment.
36. Matthias Schonlau, Ronald D. Fricker Jr., and Marc N. Elliott, Conducting Research Surveys via Email and the Web (Santa Monica, Calif.: Rand, 2002) (accessed Mar. 22, 2009), 40. Schonlau et al. say, "With both volunteer and recruited panels, one concern that researchers have is that participants may tire of filling out surveys, a condition called panel fatigue, or learn to provide the easiest responses, a phenomenon called panel conditioning."
37. Maki, "Developing an Assessment Plan."
38. Susan J. Beck, "The Extent and Variety of Data-Informed Decisions at Academic Libraries: An Analysis of Nine Libraries of the Association of Research Libraries in the U.S.A. and Canada," presentation at the 5th Northumbria International Conference on Performance Measurement in Libraries (Durham, England: July 29, 2003). Personal paper shared by author. In Beck's presentation she explained that one library demonstrates its commitment to assessment by spending more than 5 percent of their 17 million dollar ($850,000, 2002 figure) annual budget on "[a formal Management Information Systems department staffed with three people]."
39. Hernon and Altman, Assessing Service Quality.
40. See also panel conditioning, Schonlau et al., Conducting Research Surveys.
41. Poll and Payne, "Impact Measures," describes a similar set of steps for assessment based on an analysis of