The disclosed embodiments relate generally to data processing systems and methods, and in particular to a document compression system and method for use with a collection of documents with an associated index (hereinafter also referred to as a "tokenspace repository").