After trying about a half-dozen different indexing techniques and technologies, including native MySQL full-text, MongoDB, TokuDB, HBase, Lucene, and CouchDB, I found that none of them were even close to being fast enough to keep up with a sustained log stream of more than a few thousand logs per second when indexing each word in the message.