Print item usage analysis


This project analyzes a twelve-year series of print items in the NCSU Libraries collection to determine the proportion of items used, by number of years in the collection, and examines the correlation between an item's years in the collection and its circulation status.


Project Details

This project analyzes a data set of print items added to the NCSU Libraries' collection over the past twelve years. The analysis includes 393,062 unique items from the collections of the main D.H. Hill Library and the four branch libraries: Design, Natural Resources, Textiles, and Veterinary Medicine.

This project differs from many studies of library circulation by examining the categorical status of items (i.e., 'used at least once' or 'never used') rather than the overall circulation of a collection. Overall circulation counts are heavily right-skewed, with a long tail of heavily used items. Using total circulation as the base metric underestimates the use and value of a collection.

The proportion of items that have been used increases the longer an item is available in the collection, with only a slight dip at year eight. The NCSU Libraries data indicate that about 72% of items have been used after twelve years in the collection.

[Figure: line graph of circulation status by years in collection]
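As a rough check on the trend described above, the percent-used figures can be recomputed from the counts in the table under "Additional info". This is only a sketch: the counts are copied from that table, while the names `COUNTS`, `percent_used`, and `used_by_year` are hypothetical and not part of the original analysis.

```python
# Counts copied from the "Additional info" table on this page:
# years in collection -> (never used, used at least once).
COUNTS = {
    1: (12577, 3332),
    2: (11496, 6623),
    3: (16279, 14685),
    4: (17872, 17387),
    5: (12785, 16775),
    6: (12834, 17804),
    7: (14616, 21750),
    8: (15427, 20977),
    9: (14645, 27285),
    10: (12735, 27207),
    11: (10480, 27192),
    12: (11142, 29157),
}


def percent_used(never: int, used: int) -> float:
    """Share of items used at least once, as a percentage of the row total."""
    return 100.0 * used / (never + used)


def used_by_year(counts):
    """Map each year in the collection to its percent-used figure (two decimals)."""
    return {
        year: round(percent_used(never, used), 2)
        for year, (never, used) in sorted(counts.items())
    }


if __name__ == "__main__":
    for year, pct in used_by_year(COUNTS).items():
        print(f"{year:>2} years in collection: {pct:5.2f}% used at least once")
```

Running the script prints one line per year, which can be compared against the row percentages in the table.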

Circulation status can also be coded to allow for a more detailed view of the use of the collection.

[Figure: line graph of detailed circulation status by years in collection]
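One way to produce the more detailed coding is to bin each item's total circulation count into the five labels this page uses for the correlation scale ('Never', 'Once', '2-5 circs', '6-10 circs', 'More than 10'). The sketch below assumes each item has a non-negative integer charge count; the function name `circulation_status` and the sample counts are hypothetical, not taken from the project's data.

```python
from collections import Counter


def circulation_status(total_circs: int) -> str:
    """Bin a total circulation count into one of five ordered status labels."""
    if total_circs < 0:
        raise ValueError("circulation count cannot be negative")
    if total_circs == 0:
        return "Never"
    if total_circs == 1:
        return "Once"
    if total_circs <= 5:
        return "2-5 circs"
    if total_circs <= 10:
        return "6-10 circs"
    return "More than 10"


if __name__ == "__main__":
    # Hypothetical charge counts for a handful of items (illustration only).
    sample_counts = [0, 0, 1, 3, 7, 12, 0]
    tally = Counter(circulation_status(c) for c in sample_counts)
    for label, n_items in tally.items():
        print(f"{label}: {n_items}")
```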

Calculating the Spearman rank-order correlation coefficient between years in the collection and circulation status (measured on the scale 'Never', 'Once', '2-5 circs', '6-10 circs', 'More than 10') indicates a relationship between the two variables. The Spearman correlation (r_s) for this 12-year sample is 0.26. Although the correlation is significant at p < .0001, it is relatively weak. More investigation is needed into other variables, such as publisher, content level, required reading, and discipline, that may be related to whether or not an item is used.
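For readers who want to see how a Spearman coefficient can be computed from grouped counts, here is a minimal pure-Python sketch. It assigns midranks to tied values and then takes the Pearson correlation of the ranks. Note the assumption: it uses the two-level status ('Never' vs. 'At Least Once') from the table under "Additional info", because the five-level counts are not published on this page, so it should not be expected to reproduce the reported 0.26. The names `midranks`, `spearman_from_table`, and `TABLE` are hypothetical.

```python
from math import sqrt

# Rows: years in collection 1..12. Columns: (never used, used at least once).
# Counts copied from the "Additional info" table on this page.
TABLE = [
    (12577, 3332), (11496, 6623), (16279, 14685), (17872, 17387),
    (12785, 16775), (12834, 17804), (14616, 21750), (15427, 20977),
    (14645, 27285), (12735, 27207), (10480, 27192), (11142, 29157),
]


def midranks(level_totals):
    """Average rank shared by all items at each ordered level (ties get midranks)."""
    ranks = []
    seen = 0
    for n in level_totals:
        ranks.append(seen + (n + 1) / 2)
        seen += n
    return ranks


def spearman_from_table(table):
    """Spearman correlation from a cross-tab whose rows and columns are ordered levels."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    row_ranks = midranks(row_totals)
    col_ranks = midranks(col_totals)
    mean_rank = (n + 1) / 2  # mean of ranks 1..n, unchanged by ties
    cov = sum(
        count * (row_ranks[i] - mean_rank) * (col_ranks[j] - mean_rank)
        for i, row in enumerate(table)
        for j, count in enumerate(row)
    )
    var_rows = sum(t * (r - mean_rank) ** 2 for t, r in zip(row_totals, row_ranks))
    var_cols = sum(t * (r - mean_rank) ** 2 for t, r in zip(col_totals, col_ranks))
    if var_rows == 0 or var_cols == 0:
        raise ValueError("correlation is undefined when a variable has a single level")
    return cov / sqrt(var_rows * var_cols)


if __name__ == "__main__":
    print(f"Spearman r_s (two-level status): {spearman_from_table(TABLE):.3f}")
```

With item-level data and the full five-level scale, a statistics package would normally be used instead; this sketch only shows the mechanics of the rank-based calculation.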

These findings contrast with the impression that most items never circulate. This has implications as libraries begin to use patron-driven purchasing strategies: the data argue against removing ebook records from the catalog after a two-year period of non-use.

Additional info

Circulation status by years in collection (item counts, with row percentages in parentheses)

Years in Collection    Never             At Least Once     Total
 1                     12577 (79.06)      3332 (20.94)     15909
 2                     11496 (63.45)      6623 (36.55)     18119
 3                     16279 (52.57)     14685 (47.43)     30964
 4                     17872 (50.69)     17387 (49.31)     35259
 5                     12785 (43.25)     16775 (56.75)     29560
 6                     12834 (41.89)     17804 (58.11)     30638
 7                     14616 (40.19)     21750 (59.81)     36366
 8                     15427 (42.38)     20977 (57.62)     36404
 9                     14645 (34.93)     27285 (65.07)     41930
10                     12735 (31.88)     27207 (68.12)     39942
11                     10480 (27.82)     27192 (72.18)     37672
12                     11142 (27.65)     29157 (72.35)     40299
Total                  162888            230174            393062

Total circulation counts are heavily right-skewed.

[Figure: distribution of total charges]

Reports and Presentations

Raschke, Gregory, and John Vickery. 2010. "Clay Shirky, Fantasy Football, and Using Data to Glean the Future of Library Collections." Charleston Conference, November 3, 2010.