Why do you ask?

6. Data collection methods

Annual estimates

In Norway, public libraries count the number of questions during one typical week in the spring and one in the autumn. The two weekly counts together, multiplied by 26 (or by 25 to take account of Christmas, Easter and other holidays), give the annual value.

The numbers are clearly rough estimates. The counting unit is problematic. The border between reference and non-reference is unclear. Library staff have not been trained in systematic data collection. Different people and different libraries probably apply different criteria. When the workload is heavy, registration may be incomplete.

Data registration is a technical process. It could be improved by more detailed instructions and by brief local training sessions. In addition, annual totals could probably be estimated more accurately by utilizing data on visits. Statistics on visits do not depend on manual registration. Most libraries use electronic counters that collect the data automatically throughout the year.

Let us compare the two estimation methods. Today we say:

Questions/Year = Questions/Week * Weeks/Year

We estimate Q/W from the two sample weeks. But the "typical weeks" in mid-spring and mid-autumn tend to be busy weeks. The estimated number of questions therefore tends to be too high (note 1).

The number of visits during a particular week also varies from year to year. Libraries are most attractive under moderate weather conditions. Heat and sun draw people outdoors. Storms and heavy rain keep them at home. Local or national events - sports competitions, strikes, royal weddings - may influence the level of traffic.

The alternative estimate would be based on the formula:

Q/Year = Questions/Visit * Visits/Year

We can probably increase the precision and reduce the variability of Q/Year by estimating Q/Visit rather than Q/Week. The actual data collection could continue as before. It is only the estimation method that would be changed (note 2).
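The two formulas can be put side by side in a minimal Python sketch. All figures below are invented for illustration, and the function names are hypothetical. The sketch assumes that questions and visits are counted in the same two sample weeks and that the annual visit total comes from the door counter.

```python
def annual_by_weeks(sample_week_questions, weeks_per_year=52):
    """Current method: mean questions per sample week times weeks per year."""
    mean_per_week = sum(sample_week_questions) / len(sample_week_questions)
    return mean_per_week * weeks_per_year


def annual_by_visits(sample_week_questions, sample_week_visits, visits_per_year):
    """Alternative method: questions per visit times counted annual visits."""
    questions_per_visit = sum(sample_week_questions) / sum(sample_week_visits)
    return questions_per_visit * visits_per_year


# Hypothetical figures for one library (not taken from any real data).
questions = [300, 340]   # questions counted in the spring and autumn sample weeks
visits = [2000, 2000]    # door-counter visits in the same two weeks
visits_year = 80000      # door-counter total for the whole year

by_weeks = annual_by_weeks(questions)
by_visits = annual_by_visits(questions, visits, visits_year)
print(f"Q/Week method:  {by_weeks:.0f} questions a year")
print(f"Q/Visit method: {by_visits:.0f} questions a year")
```

In this invented case the sample weeks are busier than average (2,000 visits a week against a yearly mean of about 1,540), so the week-based estimate (16,640) comes out higher than the visit-based one (12,800). The visit-based method corrects for this because it scales by the traffic actually counted over the year.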

Duration

Collecting data on transaction times is not very hard. The number of queries, we have said, is measured during two typical weeks, one in the spring and one in the autumn. If the staff are equipped with stopwatches, durations could be measured at the same time.

There is, by the way, no need to collect thousands of values. With a well-planned sample, a few hundred data points will be sufficient.

We already know that most orientation questions are brief. Directional questions (where do I find X?) and administrative questions (opening hours) can often be answered in a minute or less. Basic reference questions usually take a couple of minutes. If questions take longer than three or four minutes, they normally involve more complex operations and responses and should be classified as professional reference work.
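As a rough illustration, measured durations could be sorted into these three groups with a few lines of Python. This is only a sketch: it treats duration alone as a proxy for question type, and the cut-offs of 60 and 180 seconds are assumptions chosen to match "a minute or less" and "three or four minutes". The function name and the sample durations are hypothetical.

```python
def classify_by_duration(seconds, orientation_max=60, basic_max=180):
    """Assign a transaction to a rough category from its duration in seconds.

    The default cut-offs are illustrative assumptions, not fixed standards.
    """
    if seconds < 0:
        raise ValueError("duration cannot be negative")
    if seconds <= orientation_max:
        return "orientation"
    if seconds <= basic_max:
        return "basic reference"
    return "professional reference"


# Hypothetical stopwatch readings from one shift, in seconds.
durations = [20, 45, 90, 150, 400]
labels = [classify_by_duration(s) for s in durations]

for seconds, label in zip(durations, labels):
    print(f"{seconds:>4} s  {label}")
```

A library that prefers the four-minute limit can pass basic_max=240 instead.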

Content

To understand reference work in depth we need information on the actual flow of questions and answers, and on the search processes that lead from questions to answers. But it is very hard to collect such data from traditional reference services. The basic problem is time - which is a nicer word for money.

Information desks are busy places. Customers often queue for attention. If the queues get too long, back-up staff may be called away from office tasks to man the battle posts. When the queues disappear, so do the back-up staff. There is always work to be done.

The pressure on the desk remains high. With competing tasks at hand, there is never time for idleness. Under such circumstances, extensive data collection is a real burden. Filling in forms while customers wait is not popular. Detailed forms can increase the time needed by 50 or 100%.

This means, effectively, that you need a second person to collect the data. Researchers carry out this kind of continuous data collection all the time. But the cost is high. A typical information desk (with one person) may handle 10-15 questions an hour, so each transaction occupies four to six minutes of the observer's time. A full-time person costs about one euro a minute. Collecting good content data at the desk therefore costs 4-6 euros per transaction. If we want routine management statistics, with a regular flow of data on content, this is rather expensive.
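The arithmetic behind the 4-6 euro figure can be written out as a short Python sketch. It assumes that the observer is occupied for the whole hour and that the cost is spread evenly over the questions handled. The function name is hypothetical.

```python
def cost_per_transaction(questions_per_hour, euro_per_minute=1.0):
    """Observer cost per transaction when a second person records every question."""
    minutes_per_question = 60 / questions_per_hour
    return minutes_per_question * euro_per_minute


low = cost_per_transaction(15)   # busier desk: 15 questions an hour
high = cost_per_transaction(10)  # quieter desk: 10 questions an hour
print(f"Cost per transaction: {low:.0f}-{high:.0f} euros")
```

The busier the desk, the lower the cost per transaction, since the observer's hour is shared among more questions.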

Virtual reference desks, on the other hand, collect data on questions and answers automatically, as a by-product of providing the service. If VRD data are representative of reference patterns and trends in general, the data collection problem is essentially solved.

The challenge then moves from collection to analysis. We must develop categories, dimensions and indicators that allow us to interpret the data. Access to rich data on reference transactions opens up a new world of information. But we need maps to find our way in the new landscape.
