Text and Data Mining

Text and data mining is the computer-based process of extracting relevant information and/or patterns from machine-readable text or data. The work usually involves examining large data sets. Text and data mining is used across disciplines, from history to biology and from computer science to political science.
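As a rough illustration of what "extracting patterns from machine-readable text" can mean at the smallest scale, the following Python sketch counts the most frequent words in a passage. The function name top_terms, the short stopword list, and the sample sentence are all invented for this example; real projects typically use much larger corpora and dedicated libraries.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = frozenset({"the", "a", "of", "and", "was", "to", "in"})

def top_terms(text, n=3, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens in text."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Count every token that is not a stopword.
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)

# Invented sample sentence, not taken from any real collection.
sample = (
    "The prisoner was indicted for theft. The theft of a watch was proved, "
    "and the prisoner was found guilty."
)

print(top_terms(sample, 2))
```

Running the same function over thousands of documents instead of one sentence is, in spirit, how a simple word-frequency study begins.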

Depending on your discipline and the kind of data you want to mine, the process may be called something other than text mining or data mining. Related terms include: text data mining, text analytics, content mining, audio mining, software mining, image mining, metadata mining, and video mining.

Try it out

The Proceedings of the Old Bailey, 1674-1913, is a fully searchable collection of 197,745 criminal trials held at London's central criminal court. The Old Bailey Demonstrator facilitates the dynamic exploration of trial results and the export of trial texts and collections of trial URLs to the suite of linguistic analysis tools from Voyant Tools.

The Google Books Ngram Viewer and Bookworm let you graph how often words occur over time in Google Books and in a variety of open digital collections, respectively.
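To give a sense of the kind of calculation behind such graphs, here is a minimal Python sketch that computes how often a phrase occurs per year, relative to all phrases of the same length. It is not the implementation or API of either tool; the function names ngrams and relative_frequency_by_year and the three sample documents are assumptions made up for illustration.

```python
import re
from collections import Counter

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def relative_frequency_by_year(docs, phrase):
    """docs is an iterable of (year, text) pairs.

    Returns {year: share of that year's n-grams that equal phrase},
    where n is the number of words in phrase.
    """
    phrase = phrase.lower()
    n = len(phrase.split())
    hits = Counter()    # occurrences of the phrase per year
    totals = Counter()  # all n-grams seen per year
    for year, text in docs:
        tokens = re.findall(r"[a-z']+", text.lower())
        grams = ngrams(tokens, n)
        hits[year] += grams.count(phrase)
        totals[year] += len(grams)
    # Skip years with no n-grams to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}

# Invented miniature "collection" of dated texts.
docs = [
    (1800, "the steam engine changed the mill"),
    (1800, "a steam engine"),
    (1900, "the motor car replaced the steam engine slowly"),
]

for year, share in relative_frequency_by_year(docs, "steam engine").items():
    print(year, round(share, 3))
```

Plotting the resulting year-to-share mapping as a line chart gives a small-scale version of the trend lines these tools display.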