Hypernymy relations are an important asset in many applications and a central ingredient of Semantic Web ontologies. The IsA database is a large collection of such hypernymy relations extracted from the Common Crawl. WebIsALOD is the Linked Open Data version of the IsA database, containing 11.7M hypernymy relations, each provided with rich provenance information and confidence estimates.
Linked Open Data Endpoint
We provide a Linked Data endpoint using dereferenceable URIs. To browse the LOD endpoint, start from a concept such as president (http://webisa.webdatacommons.org/concept/_president_).
Content negotiation is also provided. For example, you can retrieve the data for the above concept as N-Quads or CSV:
curl -v -H "Accept: application/n-quads" http://webisa.webdatacommons.org/concept/_president_
curl -v -H "Accept: text/csv" http://webisa.webdatacommons.org/concept/_president_
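The same content negotiation can be scripted. The following is a minimal Python sketch using only the standard library. The helper names `build_concept_request` and `fetch_concept` are made up for illustration, and the URI pattern (concept name wrapped in underscores) is inferred from the single curl example above, so it may not hold for every concept.

```python
import urllib.request

# Base URI taken from the curl examples above.
BASE = "http://webisa.webdatacommons.org/concept/"


def build_concept_request(concept, mime_type):
    """Build a GET request for a concept URI, asking for one serialization.

    Assumption: concept identifiers are wrapped in underscores,
    as in _president_ in the curl examples.
    """
    url = f"{BASE}_{concept}_"
    return urllib.request.Request(url, headers={"Accept": mime_type})


def fetch_concept(concept, mime_type="application/n-quads"):
    """Send the request and return the response body as text (hits the network)."""
    with urllib.request.urlopen(build_concept_request(concept, mime_type)) as resp:
        return resp.read().decode("utf-8")
```

Under these assumptions, `fetch_concept("president", "text/csv")` would correspond to the second curl call.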
Schema
Example depiction of a hypernymy relation with its metadata:
SPARQL Endpoint
The SPARQL endpoint is available at /sparql.
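Queries can also be sent from a script. The sketch below assumes that the endpoint lives at http://webisa.webdatacommons.org/sparql (the host from the curl examples plus /sparql) and that it accepts GET requests with a `query` parameter and returns `application/sparql-results+json`, as the SPARQL 1.1 Protocol describes. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL: host from the curl examples plus /sparql.
ENDPOINT = "http://webisa.webdatacommons.org/sparql"


def build_sparql_url(query, endpoint=ENDPOINT):
    """Encode a SPARQL query string as a GET URL with a `query` parameter."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_select(query, endpoint=ENDPOINT):
    """Send a SELECT query and return the list of result bindings (hits the network)."""
    req = urllib.request.Request(
        build_sparql_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["results"]["bindings"]
```

Keeping URL construction separate from the network call makes it easy to inspect or log the request before sending it.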
- The following query retrieves relations with a confidence score of at least 0.9 (Open results in browser):
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?hyponymLabel ?hypernymLabel ?confidence
WHERE {
  GRAPH ?g { ?hyponym skos:broader ?hypernym . }
  ?hyponym rdfs:label ?hyponymLabel .
  ?hypernym rdfs:label ?hypernymLabel .
  ?g <http://webisa.webdatacommons.org/ontology#hasConfidence> ?confidence .
  FILTER (?confidence >= 0.9)
}
ORDER BY ?hyponymLabel
LIMIT 100
- This query retrieves all hypernyms of a concept, ordered by confidence (Open results in browser):
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX isa: <http://webisa.webdatacommons.org/concept/>
PREFIX isaont: <http://webisa.webdatacommons.org/ontology#>
SELECT ?hyponymLabel ?hypernymLabel ?confidence
WHERE {
  GRAPH ?g { isa:_president_ skos:broader ?hypernym . }
  isa:_president_ rdfs:label ?hyponymLabel .
  ?hypernym rdfs:label ?hypernymLabel .
  ?g isaont:hasConfidence ?confidence .
}
ORDER BY DESC(?confidence)
- Popular hyponyms of a concept, e.g., popular singers (Open results in browser):
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?x ?c
WHERE {
  GRAPH ?g { ?x skos:broader <http://webisa.webdatacommons.org/concept/_singer_> . }
  ?g <http://webisa.webdatacommons.org/ontology#hasConfidence> ?c .
}
ORDER BY DESC(?c)
LIMIT 100
- Instances of two classes, e.g., singers that also act, ranked by the mean of the two confidence scores (Open results in browser):
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?x ((?c1 + ?c2) / 2 AS ?c)
WHERE {
  GRAPH ?g1 { ?x skos:broader <http://webisa.webdatacommons.org/concept/_singer_> . }
  ?g1 <http://webisa.webdatacommons.org/ontology#hasConfidence> ?c1 .
  GRAPH ?g2 { ?x skos:broader <http://webisa.webdatacommons.org/concept/_actor_> . }
  ?g2 <http://webisa.webdatacommons.org/ontology#hasConfidence> ?c2 .
}
ORDER BY DESC(?c)
LIMIT 100
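The two-class query generalizes to any pair of concepts. The following Python sketch builds such a query from two concept names and flattens a result in the W3C SPARQL 1.1 Query Results JSON Format (its `results.bindings` array) into plain dictionaries. The function names are hypothetical, the concept URI pattern is inferred from the examples above, and the query uses the standard parenthesized `(expr AS ?var)` projection.

```python
CONCEPT = "http://webisa.webdatacommons.org/concept/"
HAS_CONFIDENCE = "http://webisa.webdatacommons.org/ontology#hasConfidence"


def two_class_query(class_a, class_b, limit=100):
    """Return a SPARQL query for hyponyms shared by two concepts.

    Results are ranked by the mean of the two confidence scores.
    Assumption: concept URIs follow the _name_ pattern of the examples.
    """
    return f"""
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?x ((?c1 + ?c2) / 2 AS ?c)
WHERE {{
  GRAPH ?g1 {{ ?x skos:broader <{CONCEPT}_{class_a}_> . }}
  ?g1 <{HAS_CONFIDENCE}> ?c1 .
  GRAPH ?g2 {{ ?x skos:broader <{CONCEPT}_{class_b}_> . }}
  ?g2 <{HAS_CONFIDENCE}> ?c2 .
}}
ORDER BY DESC(?c)
LIMIT {int(limit)}
""".strip()


def flatten(bindings):
    """Reduce SPARQL JSON bindings to a list of {variable: value} dicts."""
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]
```

Note that `flatten` keeps every value as a string, so confidence scores need an explicit `float(...)` conversion before numeric comparison.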
Patterns
All patterns used for the extraction can be retrieved with the following query (Open results in browser):
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX isao: <http://isadb.webdatacommons.org/ontology#>
SELECT *
WHERE{
?pattern_activity a prov:Activity;
prov:used ?pattern.
?pattern a prov:Entity;
rdfs:label ?pattern_label;
rdfs:comment ?pattern_comment;
prov:wasDerivedFrom ?pattern_source;
isao:hasRegex ?pattern_regex;
isao:hasType ?pattern_type.
}
ORDER BY ?pattern_label
Dataset description
The VoID file is located at http://webisa.webdatacommons.org/.well-known/void. The dataset is also described on Datahub under the name webisalod.
Type breakdown of the instances linked to DBpedia
Data Dumps
Links to the dumps of the dataset (gzipped n-quads):
Crowdsourcing results
Templates:
Results:
- relation judgement (threshold 0, threshold 1, threshold 2, threshold 3, threshold 5, threshold 10, threshold 20)
- mapping to wikipedia (raw)
Code Repository
The code repository with all results is hosted on GitHub: sven-h/webisalod
Citing WebIsALOD
- Sven Hertling and Heiko Paulheim. WebIsALOD: Providing Hypernymy Relations extracted from the Web as Linked Open Data. Proceedings of the 16th International Semantic Web Conference 2017. [pdf]