Document entailment and turtle
README.md
| 255 | 255 | #### **Scheme Procedure**: `recognize graph vocabulary` | |
| 256 | 256 | ||
| 257 | 257 | Transforms a graph to replace every instance of recognized IRIs in the | |
| 258 | - | vocabulary by an RDF datatype. | |
| 258 | > | ||
| 259 | 0 | > | \ No newline at end of file |
| 258 | + | vocabulary by an RDF datatype. | |
| 259 | + | ||
| 260 | + | ### RDF Semantics | |
| 261 | + | ||
| 262 | + | RDF gives a semantics to graphs. It defines four entailment regimes, each | |
| 263 | + | with its own definition of a *valid graph* and of *entailment*. Entailment | |
| 264 | + | is similar to implication, when we interpret graphs as statements about the | |
| 265 | + | world: a graph G entails a graph E if, in any world where G is "true", E is | |
| 266 | + | also "true". | |
| 267 | + | ||
| 268 | + | In order to prove an entailment, we need to check the validity of the claims of | |
| 269 | + | every triple of E, with regard to G. There is only one rule common to every | |
| 270 | + | entailment regime: any triple is valid with regard to G if G itself is not valid. | |
| 271 | + | ||
| 272 | + | #### The Simple Entailment Regime | |
| 273 | + | ||
| 274 | + | The first entailment regime is the *simple entailment regime*, defined in | |
| 275 | + | `(rdf entailment simple)`. In this regime, any graph is valid, so we cannot | |
| 276 | + | derive False. Since E can contain blank nodes, we need to create a mapping | |
| 277 | + | from blank nodes in E to nodes (or blank nodes) in G. G entails E if and | |
| 278 | + | only if such a mapping exists and is valid, i.e. every mapped triple of E is | |
| 279 | + | a triple of G. | |
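
The mapping search described above can be sketched as a small backtracking procedure. This is only an illustration under stated assumptions, not the code of `(rdf entailment simple)`: the triple representation (a list of three terms, with symbols standing for blank nodes and strings for everything else) and the names `blank?`, `match-term`, `match-triple` and `simple-entails?` are hypothetical.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical representation: a triple is a list of three terms, a blank
;; node is a symbol, and any other term (IRI or literal) is a string.
(define (blank? term) (symbol? term))

;; Try to extend MAPPING (an alist from blank nodes of E to terms of G) so
;; that E-TERM corresponds to G-TERM.  Return the new mapping, or #f.
(define (match-term e-term g-term mapping)
  (cond
    ((not mapping) #f)
    ((blank? e-term)
     (let ((bound (assq e-term mapping)))
       (cond
         ((not bound) (cons (cons e-term g-term) mapping))
         ((equal? (cdr bound) g-term) mapping)
         (else #f))))
    ((equal? e-term g-term) mapping)
    (else #f)))

;; Match the three terms of a triple of E against a triple of G.
(define (match-triple e-triple g-triple mapping)
  (fold match-term mapping e-triple g-triple))

;; Backtracking search: map each triple of E onto some triple of G, keeping
;; the blank node mapping consistent across all triples of E.
(define (simple-entails? g e)
  (let loop ((remaining e) (mapping '()))
    (or (null? remaining)
        (any (lambda (g-triple)
               (let ((m (match-triple (car remaining) g-triple mapping)))
                 (and m (loop (cdr remaining) m))))
             g))))

(define g '(("ex:alice" "ex:knows" "ex:bob")
            ("ex:bob" "ex:knows" "ex:carol")))
(define e '((b1 "ex:knows" b2)
            (b2 "ex:knows" "ex:carol")))

(simple-entails? g e)                       ; => #t
(simple-entails? g '((b1 "ex:knows" b1)))   ; => #f
```

The second query fails because no triple of `g` has the same subject and object, so no single value can be assigned to `b1`.
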
| 280 | + | ||
| 281 | + | The following procedures are available: | |
| 282 | + | ||
| 283 | + | **Scheme Procedure**: `consistent-graph? graph` | |
| 284 | + | ||
| 285 | + | Returns whether a graph is consistent in the simple entailment regime. | |
| 286 | + | ||
| 287 | + | **Scheme Procedure**: `entails? G E` | |
| 288 | + | ||
| 289 | + | Returns whether a graph G entails another graph E. | |
| 290 | + | ||
| 291 | + | #### The D Entailment Regime | |
| 292 | + | ||
| 293 | + | The second entailment regime is the *D entailment regime*, defined in | |
| 294 | + | `(rdf entailment d)`. This regime is parameterized by a vocabulary D (defined | |
| 295 | + | datatypes). A graph is valid if and only if all its recognized literals | |
| 296 | + | (whose type is in D) have a lexical form that belongs to the lexical space of their datatype. | |
| 297 | + | ||
| 298 | + | For instance the following is not a valid graph: | |
| 299 | + | ||
| 300 | + | ``` | |
| 301 | + | _:a1 <http://example.org/prop> "ten"^^xsd:integer . | |
| 302 | + | ``` | |
| 303 | + | ||
| 304 | + | because the lexical space of `xsd:integer` does not include `"ten"`. | |
| 305 | + | ||
| 306 | + | Entailment works in a similar fashion to the simple entailment regime, but, | |
| 307 | + | for literals of a recognized datatype, it is sufficient to have the same value | |
| 308 | + | (the simple entailment regime restricts literals to having the same lexical | |
| 309 | + | form). For instance, the following two triples are equivalent in the D entailment regime: | |
| 310 | + | ||
| 311 | + | ``` | |
| 312 | + | _:a1 <http://example.org/prop> "010"^^xsd:integer . | |
| 313 | + | _:a1 <http://example.org/prop> "10"^^xsd:integer . | |
| 314 | + | ``` | |
| 315 | + | ||
| 316 | + | because their objects both have the same value `10` (but a different lexical | |
| 317 | + | form). | |
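
Comparing literals by value rather than by lexical form might look like the sketch below. It only handles `xsd:integer` and is not taken from `(rdf entailment d)`: the pair representation of literals and the names `integer-lexical?`, `integer-value` and `same-literal?` are assumptions made for the example.

```scheme
;; Hypothetical representation: a literal is a pair (lexical-form . datatype).
(define xsd-integer "http://www.w3.org/2001/XMLSchema#integer")

(define (ascii-digit? c)
  (and (char>=? c #\0) (char<=? c #\9)))

;; The lexical space of xsd:integer: an optional sign followed by at least
;; one decimal digit.
(define (integer-lexical? str)
  (let* ((len (string-length str))
         (start (if (and (> len 0)
                         (memv (string-ref str 0) '(#\+ #\-)))
                    1
                    0)))
    (and (> len start)
         (string-every ascii-digit? str start))))

;; Map a lexical form to its value, or #f when it is outside the lexical space.
(define (integer-value str)
  (and (integer-lexical? str) (string->number str)))

;; Recognized literals are compared by value, any other literal by lexical form.
(define (same-literal? a b)
  (and (equal? (cdr a) (cdr b))
       (if (equal? (cdr a) xsd-integer)
           (let ((va (integer-value (car a)))
                 (vb (integer-value (car b))))
             (and va vb (= va vb)))
           (equal? (car a) (car b)))))

(integer-value "ten")                                  ; => #f
(same-literal? (cons "010" xsd-integer)
               (cons "10" xsd-integer))                ; => #t
```

A literal whose `integer-value` is `#f` is the kind of ill-typed literal that makes a graph invalid in this regime.
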
| 318 | + | ||
| 319 | + | The following procedures are available: | |
| 320 | + | ||
| 321 | + | **Scheme Procedure**: `consistent-graph? graph vocabulary` | |
| 322 | + | ||
| 323 | + | Returns whether a graph is D-consistent, with regards to the vocabulary, an | |
| 324 | + | `rdf-vocabulary` object. | |
| 325 | + | ||
| 326 | + | **Scheme Procedure**: `entails? G E vocabulary` | |
| 327 | + | ||
| 328 | + | Returns whether a graph G D-entails another graph E, with regards to the | |
| 329 | + | vocabulary, an `rdf-vocabulary` object. | |
| 330 | + | ||
| 331 | + | #### The RDF Entailment Regime | |
| 332 | + | ||
| 333 | + | The third entailment regime is the *RDF entailment regime*, defined in | |
| 334 | + | `(rdf entailment rdf)`. This regime is parameterized by a vocabulary. A graph | |
| 335 | + | is valid if it is D-valid and if the types of every node are compatible. | |
| 336 | + | ||
| 337 | + | In RDF, a node can have zero, one or more types. When it has more than one type, | |
| 338 | + | it is only valid if its types are compatible, meaning that there is at least | |
| 339 | + | one value (in the value space, not the lexical space) that is in the value | |
| 340 | + | space of all its types. For instance, a node can be both an integer and a | |
| 341 | + | decimal because `10` is in the value space of both types. A node cannot be | |
| 342 | + | a decimal and a boolean because no value is in both spaces at the same time. | |
| 343 | + | ||
| 344 | + | Entailment in this regime is more complex and we will not describe it here. | |
| 345 | + | Suffice it to say that some derivation rules are added, and we can implement them | |
| 346 | + | by first extending the graph G with new facts about the world that can | |
| 347 | + | be derived from it. Once we have exhausted all possible extensions of G, we can | |
| 348 | + | apply the D entailment regime. | |
| 349 | + | ||
| 350 | + | The following procedures are available: | |
| 351 | + | ||
| 352 | + | **Scheme Procedure**: `consistent-graph? graph vocabulary` | |
| 353 | + | ||
| 354 | + | Returns whether a graph is RDF-consistent, with regards to the vocabulary, an | |
| 355 | + | `rdf-vocabulary` object. | |
| 356 | + | ||
| 357 | + | **Scheme Procedure**: `entails? G E vocabulary` | |
| 358 | + | ||
| 359 | + | Returns whether a graph G RDF-entails another graph E, with regards to the | |
| 360 | + | vocabulary, an `rdf-vocabulary` object. | |
| 361 | + | ||
| 362 | + | #### The RDFS Entailment Regime | |
| 363 | + | ||
| 364 | + | The last entailment regime is the *RDFS entailment regime*, defined in | |
| 365 | + | `(rdf entailment rdfs)`. This regime is parameterized by a vocabulary. A graph | |
| 366 | + | is valid if it is RDF-valid and if the subclasses are compatible. | |
| 367 | + | ||
| 368 | + | In RDFS, nodes can have a class, and a class system exists that orders classes | |
| 369 | + | in terms of subclasses. The class system is valid if and only if, for any type | |
| 370 | + | B which is a subclass of A, the value space of B is included in that of A. For instance, | |
| 371 | + | `xsd:int` is a subclass of `xsd:integer` (because its value space, a finite interval, | |
| 372 | + | is a subset of the value space of `xsd:integer`, which is infinite), but | |
| 373 | + | `xsd:int` is not a subclass of `xsd:string`. | |
| 374 | + | ||
| 375 | + | As with RDF, the RDFS entailment regime adds more deduction rules and we use them | |
| 376 | + | to extend the graph G. When the graph is fully extended, we use the D-entailment | |
| 377 | + | regime to check whether the extended G entails E. | |
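
Extending a graph until nothing new can be derived is a fixpoint computation. The sketch below applies only two of the RDFS rules (rdfs9, which propagates types along `rdfs:subClassOf`, and rdfs11, which makes `rdfs:subClassOf` transitive); the full regime has more rules, and the actual `(rdf entailment rdfs)` module may proceed differently. The list-based triples, the abbreviated predicate names and the procedures `derive` and `saturate` are hypothetical.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical representation: a triple is a list of three strings.
(define rdf-type "rdf:type")
(define sub-class-of "rdfs:subClassOf")

;; One round of rule application: every triple derivable from GRAPH in a
;; single step by rdfs9 or rdfs11.
(define (derive graph)
  (append-map
   (lambda (t1)
     (if (equal? (cadr t1) sub-class-of)
         (filter-map
          (lambda (t2)
            (cond
              ;; rdfs9: (x type C) and (C subClassOf D) give (x type D)
              ((and (equal? (cadr t2) rdf-type)
                    (equal? (caddr t2) (car t1)))
               (list (car t2) rdf-type (caddr t1)))
              ;; rdfs11: (B subClassOf C) and (C subClassOf D)
              ;; give (B subClassOf D)
              ((and (equal? (cadr t2) sub-class-of)
                    (equal? (caddr t2) (car t1)))
               (list (car t2) sub-class-of (caddr t1)))
              (else #f)))
          graph)
         '()))
   graph))

;; Repeat DERIVE until no new triple appears.
(define (saturate graph)
  (let ((new (delete-duplicates
              (remove (lambda (triple) (member triple graph))
                      (derive graph)))))
    (if (null? new)
        graph
        (saturate (append graph new)))))

(define graph
  (list (list "ex:rex" rdf-type "ex:Dog")
        (list "ex:Dog" sub-class-of "ex:Mammal")
        (list "ex:Mammal" sub-class-of "ex:Animal")))

(saturate graph)
```

With this input, the saturated graph also contains the facts that `ex:rex` is an `ex:Mammal` and an `ex:Animal`, and that `ex:Dog` is a subclass of `ex:Animal`. The search terminates because the rules never introduce new terms, so only finitely many triples can be derived.
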
| 378 | + | ||
| 379 | + | The following procedures are available: | |
| 380 | + | ||
| 381 | + | **Scheme Procedure**: `consistent-graph? graph vocabulary` | |
| 382 | + | ||
| 383 | + | Returns whether a graph is RDFS-consistent, with regards to the vocabulary, an | |
| 384 | + | `rdf-vocabulary` object. | |
| 385 | + | ||
| 386 | + | **Scheme Procedure**: `entails? G E vocabulary` | |
| 387 | + | ||
| 388 | + | Returns whether a graph G RDFS-entails another graph E, with regards to the | |
| 389 | + | vocabulary, an `rdf-vocabulary` object. | |
| 390 | + | ||
| 391 | + | ### Turtle Format | |
| 392 | + | ||
| 393 | + | Turtle is a textual format to represent RDF graphs. We include a parser and | |
| 394 | + | a generator in guile-rdf. The `(turtle tordf)` module defines a parser: | |
| 395 | + | ||
| 396 | + | #### **Scheme Procedure**: `turtle->rdf str-or-file base` | |
| 397 | + | ||
| 398 | + | Generates an RDF graph from the file or string passed as first argument | |
| 399 | + | (we first check whether the string names a file on the filesystem; if not, we | |
| 400 | + | parse the string itself). The `base` is the document base or `#f` if there is | |
| 401 | + | none. When a document is downloaded from the internet, the base is typically | |
| 402 | + | the URL of that document, or the value of a base header. | |
| 403 | + | ||
| 404 | + | #### **Scheme Procedure**: `rdf->turtle graph` | |
| 405 | + | ||
| 406 | + | Generates a string representing a Turtle document for the `graph`. This is more | |
| 407 | + | accurately an N-Triples representation of the graph, but that format is a subset | |
| 408 | + | of Turtle. | |
| 408 | < | ||
| 0 | 409 | < | \ No newline at end of file |