Document entailment and turtle
README.md
255 | 255 | #### **Scheme Procedure**: `recognize graph vocabulary` | |
256 | 256 | ||
257 | 257 | Transforms a graph to replace every instance of recognized IRIs in the | |
258 | - | vocabulary by an RDF datatype. | |
258 | > | ||
259 | 0 | > | \ No newline at end of file |
258 | + | vocabulary by an RDF datatype. | |
259 | + | ||
260 | + | ### RDF Semantics | |
261 | + | ||
262 | + | RDF gives graphs a semantics. It defines four entailment regimes, each of | |
263 | + | which defines what a *valid graph* is and what *entailment* means. Entailment | |
264 | + | is similar to implication, when we interpret graphs as statements about the | |
265 | + | world: a graph G entails a graph E if, in any world where G is "true", E is | |
266 | + | also "true". | |
267 | + | ||
268 | + | In order to prove an entailment, we need to check the validity of the claims of | |
269 | + | every triple of E, with regards to G. Only one rule is common to every | |
270 | + | entailment regime: any triple is valid with regards to G if G itself is not valid. | |
271 | + | ||
272 | + | #### The Simple Entailment Regime | |
273 | + | ||
274 | + | The first entailment regime is the *simple entailment regime*, defined in | |
275 | + | `(rdf entailment simple)`. In this regime, any graph is valid, so we cannot | |
276 | + | derive False. Since E can contain blank nodes, we need to create a mapping | |
277 | + | from blank nodes in E to nodes (or blank nodes) in G. G entails E if and | |
278 | + | only if such a mapping exists and is valid, i.e. every mapped triple of E is | |
279 | + | a triple of G. | |
280 | + | ||
281 | + | The following procedures are available: | |
282 | + | ||
283 | + | **Scheme Procedure**: `consistent-graph? graph` | |
284 | + | ||
285 | + | Returns whether a graph is consistent in the simple entailment regime. | |
286 | + | ||
287 | + | **Scheme Procedure**: `entails? G E` | |
288 | + | ||
289 | + | Returns whether a graph G entails another graph E. | |
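| + | ||
| + | As a minimal sketch of how these procedures might be called, assuming guile-rdf | |
| + | is on the load path. The `simple:` prefix and the example IRIs are choices made | |
| + | for this illustration, and the graphs are built from Turtle strings with the | |
| + | `turtle->rdf` procedure of `(turtle tordf)`, using `#f` as the base: | |
| + | ||
| + | ```scheme | |
| + | ;; Sketch only: assumes guile-rdf is installed and on the load path. | |
| + | (use-modules (turtle tordf) | |
| + |              ;; The prefix is our choice, to tell the regimes apart. | |
| + |              ((rdf entailment simple) #:prefix simple:)) | |
| + | ||
| + | ;; G states one fact; E asks whether some node has the same property. | |
| + | (define G | |
| + |   (turtle->rdf | |
| + |     "<http://example.org/a> <http://example.org/p> <http://example.org/b> ." | |
| + |     #f)) | |
| + | (define E | |
| + |   (turtle->rdf "_:x <http://example.org/p> <http://example.org/b> ." #f)) | |
| + | ||
| + | (simple:consistent-graph? G) | |
| + | (simple:entails? G E) | |
| + | ``` | |
| + | ||
| + | By the definition above, mapping `_:x` to `<http://example.org/a>` makes every | |
| + | triple of E a triple of G, so the last call is expected to report an entailment. | |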
290 | + | ||
291 | + | #### The D Entailment Regime | |
292 | + | ||
293 | + | The second entailment regime is the *D entailment regime*, defined in | |
294 | + | `(rdf entailment d)`. This regime is parameterized by a vocabulary D (defined | |
295 | + | datatypes). A graph is valid if and only if all its recognized literals | |
296 | + | (whose type is in D) have their lexical value in their lexical space. | |
297 | + | ||
298 | + | For instance the following is not a valid graph: | |
299 | + | ||
300 | + | ``` | |
301 | + | _:a1 <http://example.org/prop> "ten"^^xsd:integer . | |
302 | + | ``` | |
303 | + | ||
304 | + | because the lexical space of `xsd:integer` does not include `"ten"`. | |
305 | + | ||
306 | + | Entailment works in a similar fashion to the simple entailment regime, but, | |
307 | + | for literals of a recognized datatype, it is sufficient to have the same value | |
308 | + | (the simple entailment regime restricts literals to having the same lexical | |
309 | + | form). For instance, these two triples are equivalent in the D entailment regime: | |
310 | + | ||
311 | + | ``` | |
312 | + | _:a1 <http://example.org/prop> "010"^^xsd:integer . | |
313 | + | _:a1 <http://example.org/prop> "10"^^xsd:integer . | |
314 | + | ``` | |
315 | + | ||
316 | + | because their objects both have the same value `10` (but a different lexical | |
317 | + | form). | |
318 | + | ||
319 | + | The following procedures are available: | |
320 | + | ||
321 | + | **Scheme Procedure**: `consistent-graph? graph vocabulary` | |
322 | + | ||
323 | + | Returns whether a graph is D-consistent, with regards to the vocabulary, an | |
324 | + | `rdf-vocabulary` object. | |
325 | + | ||
326 | + | **Scheme Procedure**: `entails? G E vocabulary` | |
327 | + | ||
328 | + | Returns whether a graph G D-entails another graph E, with regards to the | |
329 | + | vocabulary, an `rdf-vocabulary` object. | |
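| + | ||
| + | The equivalence of the `"010"` and `"10"` triples above can be read as mutual | |
| + | entailment. A minimal sketch, assuming guile-rdf is on the load path; the | |
| + | `d-equivalent?` helper and the `d:` prefix are hypothetical names introduced | |
| + | for this illustration, not part of guile-rdf: | |
| + | ||
| + | ```scheme | |
| + | ;; Sketch only: assumes guile-rdf is installed and on the load path. | |
| + | (use-modules ((rdf entailment d) #:prefix d:)) | |
| + | ||
| + | ;; Hypothetical helper: two graphs are D-equivalent when each one | |
| + | ;; D-entails the other. `vocabulary` is an rdf-vocabulary object | |
| + | ;; supplied by the caller. | |
| + | (define (d-equivalent? g1 g2 vocabulary) | |
| + |   (and (d:entails? g1 g2 vocabulary) | |
| + |        (d:entails? g2 g1 vocabulary))) | |
| + | ``` | |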
330 | + | ||
331 | + | #### The RDF Entailment Regime | |
332 | + | ||
333 | + | The third entailment regime is the *RDF entailment regime*, defined in | |
334 | + | `(rdf entailment rdf)`. This regime is parameterized by a vocabulary. A graph | |
335 | + | is valid if it is D-valid and if the types of every node are compatible. | |
336 | + | ||
337 | + | In RDF, a node can have zero, one or more types. When it has more than one type, | |
338 | + | it is only valid if its types are compatible, meaning that there is at least | |
339 | + | one value (in the value space, not the lexical space) that is in the value | |
340 | + | space of all its types. For instance, a node can be both an integer and a | |
341 | + | decimal because `10` is in the value space of both types. A node cannot be | |
342 | + | a decimal and a boolean because no value is in both spaces at the same time. | |
343 | + | ||
344 | + | Entailment in this regime is more complex and we will not describe it here. | |
345 | + | Suffice it to say that some derivation rules are added, and we can implement them | |
346 | + | by first extending the graph G with new facts about the world that can | |
347 | + | be derived from it. Once we have exhausted all possible extensions of G, we can | |
348 | + | apply the D entailment regime. | |
349 | + | ||
350 | + | The following procedures are available: | |
351 | + | ||
352 | + | **Scheme Procedure**: `consistent-graph? graph vocabulary` | |
353 | + | ||
354 | + | Returns whether a graph is RDF-consistent, with regards to the vocabulary, an | |
355 | + | `rdf-vocabulary` object. | |
356 | + | ||
357 | + | **Scheme Procedure**: `entails? G E vocabulary` | |
358 | + | ||
359 | + | Returns whether a graph G RDF-entails another graph E, with regards to the | |
360 | + | vocabulary, an `rdf-vocabulary` object. | |
361 | + | ||
362 | + | #### The RDFS Entailment Regime | |
363 | + | ||
364 | + | The last entailment regime is the *RDFS entailment regime*, defined in | |
365 | + | `(rdf entailment rdfs)`. This regime is parameterized by a vocabulary. A graph | |
366 | + | is valid if it is RDF-valid and if the subclasses are compatible. | |
367 | + | ||
368 | + | In RDFS, nodes can have a class, and a class system exists that orders classes | |
369 | + | in terms of subclasses. The class system is valid if and only if, for any type | |
370 | + | B which is a subclass of A, its value space is included in that of A. For instance, | |
371 | + | `xsd:int` is a subclass of `xsd:integer` (because its value space, a finite interval, | |
372 | + | is a subset of the value space of `xsd:integer`, which is infinite), but | |
373 | + | `xsd:int` is not a subclass of `xsd:string`. | |
374 | + | ||
375 | + | As with RDF, the RDFS entailment regime adds more deduction rules and we use them | |
376 | + | to extend the graph G. When the graph is fully extended, we use the D entailment | |
377 | + | regime to check whether the extended G entails E. | |
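| + | ||
| + | The "extend until nothing new can be derived" step can be sketched as a | |
| + | fixed-point loop. This is an illustration only, not guile-rdf's implementation: | |
| + | it assumes a toy representation where a triple is a list of three symbols, and | |
| + | it applies a single rule (a node typed with a class is also typed with its | |
| + | superclasses). The names `rdfs9` and `saturate` are hypothetical. | |
| + | ||
| + | ```scheme | |
| + | (use-modules (srfi srfi-1)) | |
| + | ||
| + | ;; Toy representation (an assumption): a triple is (subject predicate object). | |
| + | (define (rdfs9 graph) | |
| + |   ;; If x has type C and C is a subclass of D, derive that x has type D. | |
| + |   (append-map | |
| + |     (lambda (typing) | |
| + |       (if (eq? (cadr typing) 'rdf:type) | |
| + |           (filter-map | |
| + |             (lambda (sub) | |
| + |               (and (eq? (cadr sub) 'rdfs:subClassOf) | |
| + |                    (equal? (car sub) (caddr typing)) | |
| + |                    (list (car typing) 'rdf:type (caddr sub)))) | |
| + |             graph) | |
| + |           '())) | |
| + |     graph)) | |
| + | ||
| + | (define (saturate graph rule) | |
| + |   ;; Apply the rule until it produces no triple that is not already known. | |
| + |   (let loop ((g graph)) | |
| + |     (let ((new (remove (lambda (t) (member t g)) | |
| + |                        (delete-duplicates (rule g))))) | |
| + |       (if (null? new) | |
| + |           g | |
| + |           (loop (append g new)))))) | |
| + | ||
| + | (saturate '((ex:rex rdf:type ex:Dog) | |
| + |             (ex:Dog rdfs:subClassOf ex:Mammal) | |
| + |             (ex:Mammal rdfs:subClassOf ex:Animal)) | |
| + |           rdfs9) | |
| + | ``` | |
| + | ||
| + | Here the loop adds the typing of `ex:rex` as `ex:Mammal` on the first pass and | |
| + | as `ex:Animal` on the second, then stops because a third pass derives nothing new. | |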
378 | + | ||
379 | + | The following procedures are available: | |
380 | + | ||
381 | + | **Scheme Procedure**: `consistent-graph? graph vocabulary` | |
382 | + | ||
383 | + | Returns whether a graph is RDFS-consistent, with regards to the vocabulary, an | |
384 | + | `rdf-vocabulary` object. | |
385 | + | ||
386 | + | **Scheme Procedure**: `entails? G E vocabulary` | |
387 | + | ||
388 | + | Returns whether a graph G RDFS-entails another graph E, with regards to the | |
389 | + | vocabulary, an `rdf-vocabulary` object. | |
390 | + | ||
391 | + | ### Turtle Format | |
392 | + | ||
393 | + | Turtle is a textual format to represent RDF graphs. We include a parser and | |
394 | + | a generator in guile-rdf. The `(turtle tordf)` module defines a parser: | |
395 | + | ||
396 | + | #### **Scheme Procedure**: `turtle->rdf str-or-file base` | |
397 | + | ||
398 | + | Generates an RDF graph from the file or string passed as first argument | |
399 | + | (we first check whether the string is a file on the filesystem, then we | |
400 | + | parse it as a string). The `base` is the document base or `#f` if there is | |
401 | + | none. When a document is downloaded from the internet, the base is typically | |
402 | + | the URL of that document, or the value of a base header. | |
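| + | ||
| + | A minimal sketch of both ways of passing `base`, assuming guile-rdf is on the | |
| + | load path and that the base is given as a string (an assumption; the text above | |
| + | only says it is the document base). The variable names and IRIs are illustrative: | |
| + | ||
| + | ```scheme | |
| + | ;; Sketch only: assumes guile-rdf is installed and on the load path. | |
| + | (use-modules (turtle tordf)) | |
| + | ||
| + | ;; With a base: relative IRIs in the document are resolved against it. | |
| + | (define with-base | |
| + |   (turtle->rdf "<a> <p> <b> ." "http://example.org/")) | |
| + | ||
| + | ;; Without a base: the document only uses absolute IRIs. | |
| + | (define without-base | |
| + |   (turtle->rdf | |
| + |     "<http://example.org/a> <http://example.org/p> <http://example.org/b> ." | |
| + |     #f)) | |
| + | ``` | |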
403 | + | ||
404 | + | #### **Scheme Procedure**: `rdf->turtle graph` | |
405 | + | ||
406 | + | Generates a string representing a Turtle document for the `graph`. This is more | |
407 | + | accurately an N-Triples representation of the graph, but that format is a subset | |
408 | + | of Turtle. | |
408 | < | ||
0 | 409 | < | \ No newline at end of file |