Monday, September 12, 2011

Some unsorted facts about RDF



Here are some facts taken from the book "Practical RDF" by Shelley Powers.

about RDF
  • RDF provides a standard way of expressing graphs of data and sharing them with other people and with machines.
  • RDF is a language for expressing data models using statements expressed as triples. Each statement is composed of a subject, a predicate, and an object.
  • RDF conceptualizes anything (and everything) in the universe as a resource. A resource is simply anything that can be identified with a Uniform Resource Identifier (URI). And by design, anything we can talk about can be assigned a URI.
  • URLs are a subset of URIs that identify where digital information can be retrieved.
  • Because URIs uniquely identify resources (things in the world), we consider them strong identifiers. There is no ambiguity about what they represent, and they always represent the same thing, regardless of the context we find them in.
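
To make the subject/predicate/object idea concrete, here is a minimal sketch in plain Python (no RDF library). The `Triple` type and the `http://example.org/` namespace are made up for this illustration and do not come from the book.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace used only for this illustration.
EX = "http://example.org/"

statements = {
    Triple(EX + "book/practical-rdf", EX + "terms/author", EX + "people/shelley-powers"),
    Triple(EX + "people/shelley-powers", EX + "terms/name", "Shelley Powers"),
}

# The same URI is the object of one statement and the subject of the other,
# which is what links the two statements into a graph.
for s, p, o in sorted(statements):
    print(s, p, o)
```

Because the URI is the same string wherever it appears, both statements are understood to talk about the same resource.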

about the RDF graph model
  • the RDF data model is best represented by a directed labeled graph     
  • the RDF directed graph consists of a set of nodes connected by arcs, forming a pattern of node-arc-node. Additionally, the nodes come in three varieties: urirefs, blank nodes, and literals.
  • there are RDF data models that can be represented in RDF graphs, but not in RDF/XML. The addition of rdf:nodeIDs provided some of the necessary syntactic elements that allow RDF/XML to record all RDF graphs. However, RDF/XML still can't encode graphs whose properties (predicates) cannot be recorded as namespace-qualified XML names, or QNames.
  • the components of the RDF graph - the uriref, bnode, literal, and arc - are the only components used to document a specific instance of an RDF data model.
  • there is no rule or regulation within the RDF graph that insists that all nodes be somehow connected with one another.
  • an RDF graph is considered grounded if there are no blank nodes.
  • an instance of an RDF graph is a graph in which each blank node has been replaced by an identifier, becoming a named node. 
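
The three node varieties and the notions of "grounded" and "instance" can be sketched as follows. This is only an illustration under my own assumptions: the class names, the helper functions `is_grounded` and `instantiate`, and the example URIs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    uri: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    text: str


def is_grounded(triples):
    """True when no subject or object in the graph is a blank node."""
    return not any(isinstance(n, BNode) for s, _p, o in triples for n in (s, o))


def instantiate(triples, names):
    """Replace each blank node with the named node given in `names`."""
    def swap(node):
        return names.get(node, node)

    return {(swap(s), p, swap(o)) for s, p, o in triples}


EX = "http://example.org/"  # hypothetical namespace
author = BNode("a1")
graph = {
    (URIRef(EX + "book"), URIRef(EX + "author"), author),
    (author, URIRef(EX + "name"), Literal("Shelley Powers")),
    # Not connected to the two triples above, which is allowed.
    (URIRef(EX + "other"), URIRef(EX + "label"), Literal("unrelated")),
}

named = instantiate(graph, {author: URIRef(EX + "people/shelley-powers")})
print(is_grounded(graph))  # False
print(is_grounded(named))  # True
```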

about RDF triples

  • each RDF triple is a complete and unique fact
  • each RDF triple can be joined with other RDF triples, but it still retains its own unique meaning, regardless of the complexity of the model in which it is included
  • regardless of how complex an RDF graph is, it still consists of only a grouping of unique, simple RDF triples, and each is made up of a subject, predicate and object.
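
One way to picture this: if a graph is modelled as a plain set of triples (an assumption of this sketch, with made-up URIs), joining two graphs is just a set union, and every triple survives unchanged.

```python
EX = "http://example.org/"  # hypothetical namespace

graph_a = {
    (EX + "book", EX + "title", "Practical RDF"),
}
graph_b = {
    (EX + "book", EX + "author", EX + "people/shelley-powers"),
    (EX + "book", EX + "title", "Practical RDF"),  # same fact as in graph_a
}

# Joining the graphs keeps each distinct triple exactly once.
merged = graph_a | graph_b

print(len(merged))        # 2
print(graph_a <= merged)  # True
```

The repeated title triple collapses into one entry because it is the same fact, while the author triple keeps its meaning regardless of what else is in the merged set.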

about urirefs

  • a uriref within an RDF model does not have to be resolvable (that is, point to something accessible on the web). Because RDF is designed to be a generic means of recording data, it can't restrict urirefs to being "real" data sources.
  • URIs provide a common syntax for naming a resource regardless of the protocol used to access the resource.
  • A URI is only an identifier. A specific protocol does not need to be specified, nor must the object identified physically exist on the Web.
  • you could use a UUID (Universally Unique Identifier) as a URI, referencing a COM or other technology component.
  • A URL is the location of an object, while a URI can function as a name or a location.
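
As a small illustration of the common syntax, Python's standard `urllib.parse` can split very different identifiers the same way, without ever trying to fetch them. The identifiers below (including the UUID value) are made up for this sketch.

```python
import uuid
from urllib.parse import urlparse

# Made-up UUID standing in for some component identifier.
component_id = uuid.UUID("12345678-1234-5678-1234-567812345678")

identifiers = [
    "http://example.org/people/shelley-powers",  # looks like a location
    "mailto:someone@example.org",                # a name, not a web page
    component_id.urn,                            # urn:uuid:... form of the UUID
]

# No network access happens here: the strings are only parsed.
schemes = [urlparse(u).scheme for u in identifiers]
print(schemes)  # ['http', 'mailto', 'urn']
```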

about blank nodes

  • blank nodes are also called bnodes or anonymous nodes
  • blank nodes are nodes that do not have a URI
  • most RDF parsers generate a unique identifier (genid:xxxxx) for each blank node. This is needed to distinguish blank nodes from each other within a single instance of the graph.
  • blank nodes are never merged in a graph because there is no way of determining whether two nodes are the same.
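
The sketch below imitates what a parser might do. It is not how any particular parser works: the `genid:` prefix comes from the note above, but the counter-based numbering, the `_:` placeholder convention in the input, and the function names are assumptions of mine.

```python
import itertools

EX = "http://example.org/"  # hypothetical namespace
_ids = itertools.count(1)


def new_bnode():
    """Return a label that has not been handed out before."""
    return f"genid:{next(_ids)}"


def load(document):
    """Give every '_:' placeholder in one document its own fresh label."""
    labels = {}

    def relabel(term):
        if not term.startswith("_:"):
            return term
        if term not in labels:
            labels[term] = new_bnode()
        return labels[term]

    return {(relabel(s), p, relabel(o)) for s, p, o in document}


doc_a = [("_:x", EX + "name", "Alice"), ("_:x", EX + "age", "30")]
doc_b = [("_:x", EX + "name", "Alice")]

graph_a = load(doc_a)  # "_:x" becomes genid:1 in both triples
graph_b = load(doc_b)  # "_:x" becomes genid:2, a different node

merged = graph_a | graph_b
print(len(merged))  # 3
```

Even though both documents describe a nameless thing called "Alice", the merged graph keeps two separate blank nodes, since nothing says they are the same resource.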

about literals

  • a literal consists of up to three parts: a character string, plus an optional language tag and an optional data type.
  • literals represent RDF objects only, never subjects or predicates
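
A literal and the "objects only" rule could be modelled like this. The `Literal` class and `make_triple` helper are hypothetical names for this sketch, and the literal values are made up; only the XML Schema integer datatype URI is a real identifier.

```python
from dataclasses import dataclass
from typing import Optional

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
EX = "http://example.org/"  # hypothetical namespace


@dataclass(frozen=True)
class Literal:
    text: str                       # the character string
    language: Optional[str] = None  # optional language tag, e.g. "en"
    datatype: Optional[str] = None  # optional datatype URI


def make_triple(subject, predicate, obj):
    """Build a triple, refusing literals as subject or predicate."""
    if isinstance(subject, Literal) or isinstance(predicate, Literal):
        raise TypeError("a literal may only appear as the object of a triple")
    return (subject, predicate, obj)


title = Literal("Practical RDF", language="en")
count = Literal("42", datatype=XSD_INTEGER)  # made-up value

ok = make_triple(EX + "book", EX + "title", title)

try:
    make_triple(title, EX + "describes", EX + "book")
except TypeError as err:
    print("rejected:", err)
```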

about predicates

  • every arc (predicate), without exception, must be labeled within the graph.