It will be understood that the terms "semi-structured" and "unstructured" relate to documents, such as web pages and news articles, that do not consist solely of data of known data types specified in a plurality of predefined fields (such as data from relational databases).