Antoine Logean's bookmarks: 2011

Wednesday, December 28, 2011

Introduction to RDFLib

>>> import rdflib
>>> from rdflib import ConjunctiveGraph
>>> graph = ConjunctiveGraph()
>>> graph.parse("http://semantictweet.com/ecolix")
>>> for triple in graph:
...     print(triple)

Tuesday, December 6, 2011

Git is your friend !

> mkdir myproject
> cd myproject
> git clone git@github.com:myproject.git .
> git branch --track abranch origin/abranch

Some tutorials:

Sunday, October 30, 2011

Notes about "JavaScript: The Definitive Guide" by David Flanagan

Chapter 1: Introduction to JavaScript
There are two parts:

the core JavaScript language = minimal API without I/O functions
client-side JavaScript = hosting environment (browser)

Development environment: Firebug
To debug: console.log() or alert()

Chapter 2: Lexical Structure

Character Set

supports Unicode
is case sensitive
ignores whitespace between tokens

Comments

 // This is a single-line comment.
 /* This is also a comment */  // and here is another comment.
 /*
  * This is yet another comment.
  * It has multiple lines.
  */

Literals

 12             // The number twelve
 1.2            // The number one point two
 "hello world"  // A string of text
 'Hi'           // Another string
 true           // A Boolean value
 false          // The other Boolean value
 /javascript/gi // A "regular expression" literal (for pattern matching)
 null           // Absence of an object
 { x:1, y:2 }   // An object initializer
 [ 1, 2, 3, 4 ] // An array initializer

Monday, September 12, 2011

Some unsorted facts about RDF

Here are some facts taken from the book "Practical RDF" by Shelley Powers

about RDF

RDF provides a standard way of expressing graphs of data and sharing them with other people and with machines.
RDF is a language for expressing data models using statements expressed as triples. Each statement is composed of a subject, a predicate, and an object.
RDF conceptualizes anything (and everything) in the universe as a resource. A resource is simply anything that can be identified with a Uniform Resource Identifier (URI). And by design, anything we can talk about can be assigned a URI.
URLs are a subset of URIs that identify where digital information can be retrieved.
Because URIs uniquely identify resources (things in the world), we consider them strong identifiers. There is no ambiguity about what they represent, and they always represent the same thing, regardless of the context we find them in.

about RDF graph model

the RDF data model is best represented by a directed labeled graph
the RDF directed graph consists of a set of nodes connected by arcs, forming a pattern of node-arc-node. Additionally, the nodes come in three varieties: uriref, blank nodes, and literals.
there are RDF data models that can be represented in RDF graphs, but not in RDF/XML. The addition of rdf:nodeIDs provided some of the necessary syntactic elements that allow RDF/XML to record all RDF graphs. However, RDF/XML still can't encode graphs whose properties (predicates) cannot be recorded as namespace-qualified XML names, or QNames.
the components of the RDF graph - the uriref, bnode, literal, and arc - are the only components used to document a specific instance of an RDF data model.
there is no rule or regulation within the RDF graph that insists that all nodes be somehow connected with one another.
an RDF graph is considered grounded if there are no blank nodes.
an instance of an RDF graph is a graph in which each blank node has been replaced by an identifier, becoming a named node.

about RDF triples

each RDF triple is a complete and unique fact
each RDF triple can be joined with other RDF triples, but it still retains its own unique meaning, regardless of the complexity of the model in which it is included
regardless of how complex an RDF graph is, it still consists only of a grouping of unique, simple RDF triples, and each is made up of a subject, a predicate, and an object.
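
A minimal sketch of the points above, assuming a triple is modelled as a plain Python tuple and a graph as a set of such tuples (blank nodes are left out). The example.org URIs and the names are made up for illustration.

```python
# Hypothetical example data: these URIs exist only for this illustration.
EX = "http://example.org/"

graph_a = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "alice", EX + "name", "Alice"),
}
graph_b = {
    (EX + "bob", EX + "name", "Bob"),
    (EX + "alice", EX + "knows", EX + "bob"),  # same fact as in graph_a
}

# Joining two graphs is a set union: a fact stated twice is kept once,
# and every triple keeps its meaning independently of the others.
merged = graph_a | graph_b

for subject, predicate, obj in sorted(merged):
    print(subject, predicate, obj)
```

However large the merged graph becomes, it remains a flat collection of subject, predicate, object tuples.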

about urirefs

a uriref within an RDF model does not have to be resolvable (that is, point to something accessible on the web). Because RDF is designed to be a generic means of recording data, it can't restrict urirefs to being "real" data sources.
URIs provide a common syntax for naming a resource regardless of the protocol used to access the resource.
A URI is only an identifier. A specific protocol does not need to be specified, nor must the object identified physically exist on the Web.
you could use a UUID (Universally Unique Identifier) referencing a COM or other technology component as a URI.
A URL is the location of an object, while a URI can function as a name or a location.

about blank nodes

blank nodes are also called bnodes or anonymous nodes
blank nodes are nodes that do not have a URI
most RDF parsers generate a unique identifier (genid:xxxxx) for each blank node. This is needed to distinguish blank nodes from each other within a single instance of the graph.
blank nodes are never merged in a graph because there is no way of determining whether two nodes are the same.
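
A small sketch of how a parser could hand out such identifiers, and of the "grounded" test mentioned earlier. This is not how any particular RDF library does it: the BNode class, the new_bnode and is_grounded helpers, the exact label format and the example URIs are all assumptions for illustration.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class BNode:
    """A blank node, identified only by a label local to one graph."""
    label: str


_ids = itertools.count(1)


def new_bnode():
    # Imitates the genid:xxxxx style of label; the exact format is an assumption.
    return BNode("genid:%05d" % next(_ids))


def is_grounded(graph):
    """Return True if no term of any triple is a blank node."""
    return not any(isinstance(term, BNode) for triple in graph for term in triple)


# Hypothetical data: an address that has no URI of its own.
address = new_bnode()
graph = {
    ("http://example.org/alice", "http://example.org/address", address),
    (address, "http://example.org/city", "Zurich"),
}
print(address.label, is_grounded(graph))
```

Each call to new_bnode() returns a fresh label, so two blank nodes are never accidentally treated as the same node.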

about literals

a literal consists of three parts: a character string, an optional language tag, and an optional data type.
literals represent RDF objects only, never subjects or predicates
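
The two statements above can be sketched as a small data structure plus a check. The Literal class and the check_triple helper are hypothetical names for this note, not the API of an RDF library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """A character string with an optional language tag and data type."""
    value: str
    language: Optional[str] = None
    datatype: Optional[str] = None


def check_triple(subject, predicate, obj):
    """Reject triples that use a literal as subject or predicate."""
    if isinstance(subject, Literal) or isinstance(predicate, Literal):
        raise ValueError("a literal may only appear as the object of a triple")
    return (subject, predicate, obj)


# Hypothetical example values.
greeting = Literal("Guten Tag", language="de")
age = Literal("42", datatype="http://www.w3.org/2001/XMLSchema#integer")
triple = check_triple("http://example.org/alice", "http://example.org/age", age)
print(triple)
```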

about predicates

every arc (predicate), without exception, must be labeled within the graph.

Saturday, August 13, 2011

Molecular substructure & similarity search

Surveys

Fingerprint

Substructure Search

Structure Editors

http://merian.pch.univie.ac.at/~nhaider/cheminf/moldb.html

Markush search (R-group)

Chemistry Databases

Chemoinformatics Libs

Chemistry Development Kit (CDK)

JoeLib

http://sourceforge.net/projects/joelib/

OpenBabel

http://openbabel.org/wiki/Main_Page

LOD Visualisation using JavaScript

Graphical Representation:

jQuery Sparkline
jQuery plugin generates sparklines (small inline charts) directly in the browser using data supplied either inline in the HTML, or via javascript.

Protovis

D3.js

gRaphaël

smoothiecharts

Processing.js

Geo Information :

http://openlayers.org/
Mapstraction: a JavaScript mapping abstraction library

JavaScript 3D Engine
three.js

SVG

Polymaps

Tuesday, July 19, 2011

World Bank is giving public access to 7,000 data sets

The World Bank Data Catalog now provides download access to over 7,000 indicators from World Bank data sets.

Schema.org: Spoonfeeding Library Data to Search Engines

Not sure what Schema.org is about? Read the nice post by Eric Hellman.

The Internet of Things

Nice explanation of what the Internet of Things could be: http://blogs.cisco.com/news/the-internet-of-things-infographic/

Definition of an Open Government Data Ontology (OGDO)

Let's say we aim to build an open government data catalog with the following properties:

the solution should be based on open source software
it should need minimal custom development: one should be able to configure an existing framework
the content of the catalog should be readable by computers (as LOD) and by humans (as HTML)
everyone should be able to edit and add content.

The first step will be to develop a formal context for the catalog, that is, to define an OGD ontology. An OGD ontology can be subdivided into 3 orthogonal parts, which can be defined separately:

a Data-ontology (D-ontology)
an Open-ontology (O-ontology)
a Government-ontology (G-ontology)

1) a Data ontology (og[D]): this ontology (a SKOS taxonomy could be enough at first) defines the semantics of the data. It would have concepts like "Healthcare", "Army", "Defence", "Religion", "Education", ... simply all the drawers we need to file OGD into. Before defining a proper OWL ontology, one can start here with a SKOS taxonomy. That should also be easier to integrate into Semantic MediaWiki. This ontology is also quite generic: in principle it applies to every country. A good start would be to look at the open.gov catalog. This has probably been done already, but I have not found anything yet. So far the best I have is [ ]

2) an Open ontology ([O]gd): this ontology describes how the data are published, the so-called service contract, the non-functional aspects of the interface: where can the data be found (URI)? In which format (in LOD this should rather be handled with content negotiation)? Is there a fee? If so, how much? Which copyright applies to the data? How large are the data? When were they last updated? Is there a contact person? ... Just like the Data ontology, this ontology is quite generic and not at all specific to Switzerland. Something similar has certainly been defined already in the SOA world.

3) a Government ontology (o[G]d): with this ontology one can specify the political organisation and structures of the country. Here we have concepts like "Confederation", "Cantons", "Municipality", "Department", ... In principle, such an ontology is defined once for Switzerland and should not change much (it has hardly changed over the last 100 years ...). Here too, a taxonomy should be enough at first.

These 3 ontologies define the formal framework of the catalog. Semantic MediaWiki should then be configured so that it is only possible to instantiate this OGD ontology/taxonomy. I do not know exactly whether the semantic extension of MediaWiki allows this. Basically, one can define 2 views on the data: a data view and a government view. The data view simply lists the different kinds of data (more likely as a tree hierarchy). For a given OGD data category (for example "Culture") see

Tuesday, July 12, 2011

Semantic Wiki with Referata

Referata offers hosting of semantic wikis (MediaWiki + semantic MediaWiki extension ) : http://tinyurl.com/67sm9u3

Data.gov catalogs

An interactive dataset containing the metadata for the Data.gov raw datasets and tools catalogs : http://explore.data.gov/Other/Data-gov-Catalog/pyv4-fkgv

Friday, May 20, 2011

DNS entries caching by Windows

Do you want to know which websites your child has visited? On Windows, simply enter the following at a command prompt: ipconfig /displaydns
This command shows the cached DNS domain names of the web servers that a user has visited.

ipconfig /flushdns:
clears that list.

The Berlin SPARQL Benchmark

The Benchmark

Query 1: Find products for a given set of generic features

SELECT DISTINCT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product rdf:type %ProductType% .
  ?product bsbm:productFeature %ProductFeature1% .
  ?product bsbm:productFeature %ProductFeature2% .
  ?product bsbm:productPropertyNumeric1 ?value1 .
  FILTER (?value1 > %x%)
}
ORDER BY ?label
LIMIT 10
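
The %...% markers in these queries are placeholders that have to be replaced by concrete values before a query can be sent to a store. A minimal sketch of that substitution step, assuming a simple regular-expression approach: the fill_template helper and the ex: parameter values are hypothetical, and the template is a shortened copy of Query 1.

```python
import re

# Shortened copy of the Query 1 template.
QUERY_1 = """
SELECT DISTINCT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product rdf:type %ProductType% .
  ?product bsbm:productFeature %ProductFeature1% .
  ?product bsbm:productPropertyNumeric1 ?value1 .
  FILTER (?value1 > %x%)
}
ORDER BY ?label
LIMIT 10
"""

PLACEHOLDER = re.compile(r"%(\w+)%")


def fill_template(template, params):
    """Replace every %Name% marker with params["Name"]; fail on missing values."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError("no value for placeholder " + name)
        return str(params[name])
    return PLACEHOLDER.sub(substitute, template)


# Hypothetical parameter values, chosen only for illustration.
query = fill_template(QUERY_1, {
    "ProductType": "ex:ProductType1",
    "ProductFeature1": "ex:ProductFeature7",
    "x": 100,
})
print(query)
```

Raising on a missing parameter, rather than leaving the marker in place, avoids sending a syntactically broken query.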

Query 2: Retrieve basic information about a specific product for display purposes

SELECT ?label ?comment ?producer ?productFeature ?propertyTextual1
       ?propertyTextual2 ?propertyTextual3 ?propertyNumeric1
       ?propertyNumeric2 ?propertyTextual4 ?propertyTextual5
       ?propertyNumeric4
WHERE {
  %ProductXYZ% rdfs:label ?label .
  %ProductXYZ% rdfs:comment ?comment .
  %ProductXYZ% bsbm:producer ?p .
  ?p rdfs:label ?producer .
  %ProductXYZ% dc:publisher ?p .
  %ProductXYZ% bsbm:productFeature ?f .
  ?f rdfs:label ?productFeature .
  %ProductXYZ% bsbm:productPropertyTextual1 ?propertyTextual1 .
  %ProductXYZ% bsbm:productPropertyTextual2 ?propertyTextual2 .
  %ProductXYZ% bsbm:productPropertyTextual3 ?propertyTextual3 .
  %ProductXYZ% bsbm:productPropertyNumeric1 ?propertyNumeric1 .
  %ProductXYZ% bsbm:productPropertyNumeric2 ?propertyNumeric2 .
  OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual4 ?propertyTextual4 }
  OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual5 ?propertyTextual5 }
  OPTIONAL { %ProductXYZ% bsbm:productPropertyNumeric4 ?propertyNumeric4 }
}

Query 3: Find products having some specific features and not having one feature

SELECT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product rdf:type %ProductType% .
  ?product bsbm:productFeature %ProductFeature1% .
  ?product bsbm:productPropertyNumeric1 ?p1 .
  FILTER ( ?p1 > %x% )
  ?product bsbm:productPropertyNumeric3 ?p3 .
  FILTER (?p3

Query 4: Find products matching two different sets of features

SELECT ?product ?label
WHERE {
  {
    ?product rdfs:label ?label .
    ?product rdf:type %ProductType% .
    ?product bsbm:productFeature %ProductFeature1% .
    ?product bsbm:productFeature %ProductFeature2% .
    ?product bsbm:productPropertyNumeric1 ?p1 .
    FILTER ( ?p1 > %x% )
  } UNION {
    ?product rdfs:label ?label .
    ?product rdf:type %ProductType% .
    ?product bsbm:productFeature %ProductFeature1% .
    ?product bsbm:productFeature %ProductFeature3% .
    ?product bsbm:productPropertyNumeric2 ?p2 .
    FILTER ( ?p2 > %y% )
  }
}
ORDER BY ?label
LIMIT 10 OFFSET 10

Query 5: Find products that are similar to a given product

SELECT DISTINCT ?product ?productLabel
WHERE {
  ?product rdfs:label ?productLabel .
  FILTER (%ProductXYZ% != ?product)
  %ProductXYZ% bsbm:productFeature ?prodFeature .
  ?product bsbm:productFeature ?prodFeature .
  %ProductXYZ% bsbm:productPropertyNumeric1 ?origProperty1 .
  ?product bsbm:productPropertyNumeric1 ?simProperty1 .
  FILTER (?simProperty1 < (?origProperty1 + 120) && ?simProperty1 > (?origProperty1 - 120))
  %ProductXYZ% bsbm:productPropertyNumeric2 ?origProperty2 .
  ?product bsbm:productPropertyNumeric2 ?simProperty2 .
  FILTER (?simProperty2 < (?origProperty2 + 170) && ?simProperty2 > (?origProperty2 - 170))
}
ORDER BY ?productLabel
LIMIT 5

Query 6: Find products having a label that contains a specific string

SELECT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product rdf:type bsbm:Product .
  FILTER regex(?label, "%word1%")
}

Query 7: Retrieve in-depth information about a product including offers and reviews

SELECT ?productLabel ?offer ?price ?vendor ?vendorTitle ?review
       ?revTitle ?reviewer ?revName ?rating1 ?rating2
WHERE {
  %ProductXYZ% rdfs:label ?productLabel .
  OPTIONAL {
    ?offer bsbm:product %ProductXYZ% .
    ?offer bsbm:price ?price .
    ?offer bsbm:vendor ?vendor .
    ?vendor rdfs:label ?vendorTitle .
    ?vendor bsbm:country .
    ?offer dc:publisher ?vendor .
    ?offer bsbm:validTo ?date .
    FILTER (?date > %currentDate% )
  }
  OPTIONAL {
    ?review bsbm:reviewFor %ProductXYZ% .
    ?review rev:reviewer ?reviewer .
    ?reviewer foaf:name ?revName .
    ?review dc:title ?revTitle .
    OPTIONAL { ?review bsbm:rating1 ?rating1 . }
    OPTIONAL { ?review bsbm:rating2 ?rating2 . }
  }
}

Query 8: Give me recent English language reviews for a specific product

SELECT ?title ?text ?reviewDate ?reviewer ?reviewerName ?rating1
       ?rating2 ?rating3 ?rating4
WHERE {
  ?review bsbm:reviewFor %ProductXYZ% .
  ?review dc:title ?title .
  ?review rev:text ?text .
  FILTER langMatches( lang(?text), "EN" )
  ?review bsbm:reviewDate ?reviewDate .
  ?review rev:reviewer ?reviewer .
  ?reviewer foaf:name ?reviewerName .
  OPTIONAL { ?review bsbm:rating1 ?rating1 . }
  OPTIONAL { ?review bsbm:rating2 ?rating2 . }
  OPTIONAL { ?review bsbm:rating3 ?rating3 . }
  OPTIONAL { ?review bsbm:rating4 ?rating4 . }
}
ORDER BY DESC(?reviewDate)
LIMIT 20

Query 9: Get information about a reviewer.

DESCRIBE ?x
WHERE {
  %ReviewXYZ% rev:reviewer ?x
}

Query 10: Get cheap offers which fulfill the consumer’s delivery requirements.

SELECT DISTINCT ?offer ?price
WHERE {
  ?offer bsbm:product %ProductXYZ% .
  ?offer bsbm:vendor ?vendor .
  ?offer dc:publisher ?vendor .
  ?vendor bsbm:country %CountryXYZ% .
  ?offer bsbm:deliveryDays ?deliveryDays .
  FILTER (?deliveryDays %currentDate% )
}
ORDER BY xsd:double(str(?price))
LIMIT 10

Query 11: Get all information about an offer.

SELECT ?property ?hasValue ?isValueOf
WHERE {
  { %OfferXYZ% ?property ?hasValue }
  UNION
  { ?isValueOf ?property %OfferXYZ% }
}

Query 12: Export information about an offer into another schema.

CONSTRUCT {
  %OfferXYZ% bsbm-export:product ?productURI .
  %OfferXYZ% bsbm-export:productlabel ?productlabel .
  %OfferXYZ% bsbm-export:vendor ?vendorname .
  %OfferXYZ% bsbm-export:vendorhomepage ?vendorhomepage .
  %OfferXYZ% bsbm-export:offerURL ?offerURL .
  %OfferXYZ% bsbm-export:price ?price .
  %OfferXYZ% bsbm-export:deliveryDays ?deliveryDays .
  %OfferXYZ% bsbm-export:validuntil ?validTo
}
WHERE {
  %OfferXYZ% bsbm:product ?productURI .
  ?productURI rdfs:label ?productlabel .
  %OfferXYZ% bsbm:vendor ?vendorURI .
  ?vendorURI rdfs:label ?vendorname .
  ?vendorURI foaf:homepage ?vendorhomepage .
  %OfferXYZ% bsbm:offerWebpage ?offerURL .
  %OfferXYZ% bsbm:price ?price .
  %OfferXYZ% bsbm:deliveryDays ?deliveryDays .
  %OfferXYZ% bsbm:validTo ?validTo
}

Tuesday, May 17, 2011

BibSonomy

A blue social bookmark and publication sharing system.

Monday, May 16, 2011

Mac Keyboard Shortcuts

Invert display colors (white on black)	ctrl + alt + cmd + 8
Take picture of the entire screen	cmd + shift + 3

Tuesday, May 10, 2011

rNews: embedding metadata in online news (RDFa)

http://dev.iptc.org/rNews

Thursday, May 5, 2011

A Simple Linked Data and JavaScript Tutorial

http://dailyjs.com/2010/11/26/linked-data-and-javascript/

Getting data from the Semantic Web (Ruby)

http://semanticweb.org/wiki/Getting_data_from_the_Semantic_Web_%28Ruby%29

Wednesday, May 4, 2011

publishMydata

http://publishmydata.com/

Getting Started with RDF and SPARQL Using 4store and RDF.rb

See the post of Jeni Tennison : http://www.jenitennison.com/blog/node/152

Tuesday, April 26, 2011

Open Data Directory

A free search engine for open data sets published by governments, private companies and other organizations.

Thursday, April 21, 2011

Information security: the situation in Switzerland

http://www.news.admin.ch/NSBSubscriber/message/attachments/22741.pdf

Sunday, April 17, 2011

How RDF Databases Differ from Other NoSQL Solutions

The article by Arto (2010/04/22)

Saturday, April 16, 2011

Open Data Showroom

Open data projects from Germany, Europe and the world

Wednesday, April 13, 2011

Data visualizations for a changing world

Explore the data

New Film about Open Government Data

http://opengovernmentdata.org/film/

or

http://blog.okfn.org/2011/04/13/opendata-new-film-about-open-government-data/

Tuesday, April 12, 2011

Open Knowledge Foundation Blog

http://blog.okfn.org/

Crime maps

Putting Government Data online from Tim Berners-Lee

Putting Government Data online from Tim Berners-Lee:

Abstract

Government data is being put online to increase accountability, contribute valuable information about the world, and to enable government, the country, and the world to function more efficiently. All of these purposes are served by putting the information on the Web as Linked Data. Start with the "low-hanging fruit". Whatever else, the raw data should be made available as soon as possible. Preferably, it should be put up as Linked Data. As a third priority, it should be linked to other sources. As a lower priority, nice user interfaces should be made to it -- if interested communities outside government have not already done it. The Linked Data technology, unlike any other technology, allows any data communication to be composed of many mixed vocabularies. Each vocabulary is from a community, be it international, national, state or local; or specific to an industry sector. This optimizes the usual trade-off between the expense and difficulty of getting wide agreement, and the practicality of working in a smaller community. Effort toward interoperability can be spent where most needed, making the evolution with time smoother and more productive.

Monday, April 11, 2011

Monism and Dualism

All philosophies are either monist or dualist. Monists think that the material world is the only one that exists. Dualists believe in a spiritual world in addition to the material world.

Repository of Open Government Data Catalogs

Repository of Open Government Data Catalogs [1]. Until now, only the Open Data initiatives led by either governments or public agencies have been published. Now, any public sector information catalog (managed by citizen movements, transparency commissions, NGOs, and other institutions) is welcomed. The only requirement is that those catalogs must contain public sector information.

The second main feature is the collaborative aspect of the catalog. Anyone may contribute by submitting new catalogs using a simple form [2]. All changes are moderated to avoid spam or inaccuracies. After submission and approval, the meta-information of the initiative will be available through a SPARQL endpoint [3].

[1] http://datos.fundacionctic.org/sandbox/catalog/faceted/
[2] http://datos.fundacionctic.org/sandbox/catalog/manage/new
[3] http://data.fundacionctic.org/sparql

Freeing Greater Manchester's Public Data

http://datagm.org.uk/

Friday, April 8, 2011

twouse: Semantic Web Enabled Software Engineering

http://code.google.com/p/twouse/

Thursday, April 7, 2011

GoodURIs

What qualities make a URI work well in RDF and on the web in general?

Semantic Web Development Tools

http://www.w3.org/2001/sw/wiki/Tools

Saturday, April 2, 2011

Applications of Ontologies in Software Engineering

From Hans-Jörg Happel and Stefan Seedorf :
The Article

Abstract. The emerging field of semantic web technologies promises new stimulus for Software Engineering research. However, since the underlying concepts of the semantic web have a long tradition in the knowledge engineering field, it is sometimes hard for software engineers to overlook the variety of ontology-enabled approaches to Software Engineering. In this paper we therefore present some examples of ontology applications throughout the Software Engineering lifecycle. We discuss the advantages of ontologies in each case and provide a framework for classifying the usage of ontologies in Software Engineering.

Documentary about the Semantic Web by Kate Ray

Kate Ray, a Journalism/Psychology major at NYU made a documentary about the Semantic Web. The film gives a nice overview of what the Semantic Web is and what it is trying to achieve.

Semantic Web technologies applied to Software Engineering

http://www.ifi.uzh.ch/pax/uploads/pdf/publication/1207/icse2009_tutorial.pdf

serialization involves turning a graph into a tree

serialization involves turning a graph into a tree : explain
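
One possible reading of this note, as a sketch: a graph may contain cycles and shared nodes, while a nested document (XML, JSON) is a tree, so a serializer has to pick a root, nest what it reaches, and emit a reference whenever it meets a node it has already written. The graph_to_tree function and the short node names are hypothetical, and real serializers differ in the details.

```python
import json


def graph_to_tree(triples, root):
    """Unfold the graph reachable from root into a nested dict (a tree)."""
    outgoing = {}
    for subject, predicate, obj in triples:
        outgoing.setdefault(subject, []).append((predicate, obj))

    seen = set()

    def build(node):
        if node in seen:
            # Already written once: point back to it instead of nesting again.
            return {"ref": node}
        seen.add(node)
        properties = [
            {"predicate": predicate, "object": build(obj)}
            for predicate, obj in outgoing.get(node, [])
        ]
        return {"id": node, "properties": properties}

    return build(root)


# Hypothetical graph with a cycle: alice knows bob, bob knows alice.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "alice"),
    ("alice", "name", "Alice"),
]
tree = graph_to_tree(triples, "alice")
print(json.dumps(tree, indent=2))
```

The back-reference is what keeps the output finite: without it, the alice/bob cycle would nest forever.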

Friday, April 1, 2011

A Simple HTML5 RDFa Example

http://blog.3kbo.com/2010/11/10/simple-html5-rdfa-example/

Wednesday, March 30, 2011

Semantic Turkey: A Semantic Web Knowledge Management and Acquisition Platform based on the Firefox Web Browser

http://semanticturkey.uniroma2.it/

OWLGrEd: Editor for Compact UML-style OWL Graphic Notation

http://owlgred.lumii.lv/

Friday, February 25, 2011

Open Government Data in Switzerland

Articles & news

Organisations:

Blog Digitale Nachhaltigkeit

Companies:

http://www.itopia.ch/index.de.html

Key players:

Presentations:

Open government obl_20101006

Podcasts:

Behördendaten als wertvolles Rohmaterial

Open Government Data

What is open government data?

By “open” one means open as in the Open (Knowledge) Definition — in essence material (data) is open if it can be freely used, reused and redistributed by anyone.
By “government data” one means data and information produced or commissioned by government or government controlled entities.

see http://opengovernmentdata.org/

Monday, February 21, 2011

Harald Sack's blog

Harald Sack's blog: http://moresemantic.blogspot.com/

Videos about linked data

by Harald Sack: http://www.tele-task.de/archive/video/flash/12488/
by Georgi Kobilarov: http://www.yovisto.com/video/17144

Saturday, January 29, 2011

Lexing and Parsing

Lexing is the process of dividing an input stream into meaningful units, or tokens, which are then processed. Parsing refers to discovering semantic meaning out of a series of tokens according to the rules of a grammar.
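
A small sketch of both steps for arithmetic expressions, assuming a hand-written lexer and a recursive-descent parser. The grammar (expr, term, factor), the token names and the tuple-based tree are choices made for this example only.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def lex(text):
    """Lexing: split the input stream into (kind, value) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character near position %d" % pos)
        lexeme = match.group(1)
        tokens.append(("NUM", int(lexeme)) if lexeme.isdigit() else ("OP", lexeme))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parsing: build a tree from the tokens according to the grammar
    expr := term (('+'|'-') term)*, term := factor (('*'|'/') factor)*,
    factor := NUM | '(' expr ')'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, value = advance()
        if kind == "NUM":
            return value
        if (kind, value) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError("unexpected token %r" % (value,))

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            _, op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            _, op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing input")
    return tree


print(lex("1 + 2 * 3"))
print(parse(lex("1 + 2 * 3")))
```

The lexer knows nothing about precedence; it is the grammar in the parser that makes 2 * 3 bind tighter than the addition.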

Thursday, January 27, 2011

RDF versus RDB schema

Fixed RDB schema:

Image taken from ontotext

The same using RDF :

Image taken from ontotext
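
As a rough illustration of the contrast (not a reproduction of the ontotext images), here is a sketch that turns one row of a relational table into RDF-style triples. The row_to_triples helper, the table and column names and the example.org base URI are all hypothetical.

```python
def row_to_triples(table, primary_key, row, base="http://example.org/"):
    """Map one table row to triples: the row becomes the subject,
    each column a predicate, each non-null cell an object."""
    subject = "%s%s/%s" % (base, table, row[primary_key])
    return [
        (subject, base + column, value)
        for column, value in row.items()
        if column != primary_key and value is not None
    ]


# Hypothetical row from a "person" table with a fixed set of columns.
row = {"id": 7, "name": "Ada", "email": None}
triples = row_to_triples("person", "id", row)

# A new property needs no schema change in the triple form: just add a triple.
triples.append(("http://example.org/person/7", "http://example.org/nickname", "ada"))
print(triples)
```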

Friday, January 21, 2011

Semantic Web Architecture

Thursday, January 20, 2011

Some definitions about languages

A language consists of a structure definition (called abstract syntax or meta model), a definition of the notation (also called concrete syntax) and semantics. In semantics we distinguish between the static semantics (constraints, type system) and the operational semantics (what something means as it is executed).
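
To make the four notions concrete, here is a toy language with numbers, booleans and addition, sketched under my own assumptions: the classes are the abstract syntax, render gives one concrete syntax, type_of is the static semantics and evaluate the operational semantics. All names are invented for this example.

```python
from dataclasses import dataclass
from typing import Union


# Abstract syntax: the structure of programs, independent of notation.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Bool:
    value: bool


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Bool, Add]


def render(e):
    """Concrete syntax: one possible textual notation."""
    if isinstance(e, Add):
        return "(%s + %s)" % (render(e.left), render(e.right))
    if isinstance(e, Bool):
        return "true" if e.value else "false"
    return str(e.value)


def type_of(e):
    """Static semantics: constraints checked without running the program."""
    if isinstance(e, Num):
        return "int"
    if isinstance(e, Bool):
        return "bool"
    if type_of(e.left) == "int" and type_of(e.right) == "int":
        return "int"
    raise TypeError("'+' needs two int operands")


def evaluate(e):
    """Operational semantics: what the program means when executed."""
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    return e.value


program = Add(Num(1), Add(Num(2), Num(3)))
print(render(program), type_of(program), evaluate(program))
```

The same abstract syntax could be given a different notation (for example a graphical one) without touching type_of or evaluate.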