Hesburgh Libraries

Preparing Files for Text and Data Mining

Friday, September 24, 2021

12:30 pm – 1:30 pm

247 Hesburgh Library, Navari Family Center for Digital Scholarship

Text mining, a process for extracting information from unstructured text, requires everyday files (PDF, Word, HTML, etc.) to be converted into plain text files. Once your files are in plain text format (no bold, italics, underlining, etc.), they are ready for automated processing and computer analysis.

This hands-on workshop will demonstrate and facilitate the use of a free Java-based program called Tika. Attendees will install Tika, learn how to convert just about any file into plain text, and gain the knowledge and confidence to use text mining services available online.
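
For a rough idea of what such a conversion can look like, here is a minimal Python sketch that calls the Tika application from a script. It is an illustration under stated assumptions, not the workshop's own material: it assumes Java is installed, that the Tika app jar has been downloaded and saved as `tika-app.jar` (the real download carries a version number in its name), and the function names `build_tika_command` and `convert_to_text` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_tika_command(jar_path, input_path):
    """Return the command that asks the Tika app to print plain text.

    jar_path is an assumed location of the downloaded Tika app jar.
    """
    # --text is the Tika app option for plain text output.
    return ["java", "-jar", str(jar_path), "--text", str(input_path)]


def convert_to_text(jar_path, input_path, output_path):
    """Run Tika on one file and save the extracted plain text."""
    result = subprocess.run(
        build_tika_command(jar_path, input_path),
        capture_output=True,
        check=True,  # raise CalledProcessError if Java or Tika fails
    )
    # Tika writes the extracted text to standard output; save the bytes as-is.
    Path(output_path).write_bytes(result.stdout)
    return Path(output_path)
```

With those assumptions in place, `convert_to_text("tika-app.jar", "report.pdf", "report.txt")` would write the text of a hypothetical `report.pdf` to `report.txt`.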

Please bring your laptop to this session.


284 Hesburgh Library, Notre Dame, IN 46556

Circulation Desk Phone (574) 631-6679

Security Monitors Phone (574) 631-6350

asklib@nd.edu
