Hesburgh Libraries

Tuesday, November 1, 2022

11:00am to noon

247 Hesburgh Library, Navari Family Center for Digital Scholarship

Text mining, a process for extracting information from unstructured text, requires everyday files (PDF, Word, HTML, etc.) to be transformed into plain text files. Once your files are in a plain text format (no bold, no italics, no underlining, etc.), they are ready for automated processing and computer analysis.

This hands-on workshop will demonstrate and facilitate the use of Tika, a free Java-based program, to do this work. More specifically, attendees will install Tika and use it to convert just about any file into plain text. With plain text files in hand, participants can then use the many text mining services available on the 'Net.
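For a sense of what the conversion step can look like, here is a minimal sketch in Python that calls the Tika application from the command line and saves the result as a plain text file. It assumes Java is installed and that the Tika application jar has been downloaded; the file name `tika-app.jar`, the function names, and the example paths are all hypothetical and are not taken from the workshop materials.

```python
import subprocess
from pathlib import Path


def tika_command(jar_path, source):
    """Build the command that asks the Tika app for plain text output."""
    # "--text" is the Tika app option for plain text output.
    return ["java", "-jar", str(jar_path), "--text", str(source)]


def text_path(source, out_dir):
    """Return the .txt path where the converted file will be saved."""
    return Path(out_dir) / (Path(source).stem + ".txt")


def convert_to_text(jar_path, source, out_dir):
    """Convert one file (PDF, Word, HTML, etc.) to plain text with Tika."""
    target = text_path(source, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Tika writes the extracted text to standard output; capture and save it.
    result = subprocess.run(
        tika_command(jar_path, source), capture_output=True, check=True
    )
    target.write_bytes(result.stdout)
    return target


# Hypothetical usage, assuming both files exist on your laptop:
# convert_to_text("tika-app.jar", "report.pdf", "plain-text")
```

Looping over a folder of files and calling `convert_to_text` on each one would produce a matching folder of plain text files ready for analysis.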

Please bring your own laptop.

Related LibGuide: Text Mining and Analysis by Eric Lease Morgan

Event: Preparing Files for Text and Data Mining

Hesburgh Library

284 Hesburgh Library, Notre Dame, IN 46556

Circulation Desk Phone (574) 631-6679

Security Monitors Phone (574) 631-6350

asklib@nd.edu
