Apache Tika is an open source toolkit that makes it easy for search engines, content management systems and other applications to detect and extract content from digital documents in all major file formats. Tika in Action is a hands-on guide for developers working with search engines, content management systems and other similar applications who want to exploit the information locked in digital d