This page is out of date!
 

What is this about?

Our ultimate goal is to bootstrap the transformation of government data into RDF and, better still, into Linked Data. While many governments are opening up their data, we want to help them make it both open and linked :-).

We present here a couple of contributions:

  1. dcat

    dcat is an RDF vocabulary for the exchange of data catalogs. Its primary purpose is the expression of government data catalogs, such as data.gov or data.gov.uk, in RDF.

    As a feasibility study, we expressed four existing catalogs (namely data.gov, data.london.gov.uk, data.australia.gov.au and datasf.org) in RDF using dcat. We provide a Linked Data interface and a SPARQL endpoint to access this data.

  2. Integrating with and extending Freebase Gridworks

    If you are not aware of Freebase Gridworks (yet?)... watch one of the great screencasts on their website now!

    Freebase Gridworks is a power tool that allows you to load data, understand it, clean it up, reconcile it internally, augment it with data coming from Freebase, and optionally contribute your data to Freebase for others to use. All in the comfort and privacy of your own computer.

    But it does not provide a direct way to export RDF!

    We enabled navigating a catalog represented in RDF using dcat through Freebase Gridworks (the catalog can be provided as an RDF dump file or through a SPARQL endpoint). We also added RDF export functionality to Gridworks.

In conclusion, we provide a way to represent government catalogs, which are hubs for very valuable government data, in RDF. We then provide an easy way to navigate through this data and open it in the very powerful Freebase Gridworks tool, where it can be cleaned, linked and enhanced. Finally, we enable exporting this data as RDF. We believe that this two-step approach to RDFizing government data is necessary to manage the variety of datasets that governments provide, i.e. it enables tackling domain-specific datasets on a case-by-case basis.

Download

download (Updated:13/08/2010)

Download, unzip, navigate to the folder and run

java -jar gw.jar

How to get it running?

Currently, the application runs only on Java 1.6. We will provide a distribution that runs on Java 1.5 soon.
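Since the application is tied to a specific Java version, it can help to check what is installed before launching it. The following is only a minimal sketch, not part of the distribution: the helper names are our own invention, and it assumes the usual banner format printed by `java -version` (for example `java version "1.6.0_20"`).

```python
import re
import subprocess


def parse_java_version(banner):
    """Return (major, minor) parsed from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def is_java_1_6(banner):
    """True when the banner reports a 1.6.x runtime."""
    return parse_java_version(banner) == (1, 6)


def installed_java_banner():
    """Run `java -version`; the JVM writes its banner to stderr."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True
    )
    return result.stderr


# Example use (requires a `java` executable on the PATH):
#   if not is_java_1_6(installed_java_banner()):
#       print("Warning: this build was only tested on Java 1.6")
```

The parsing is kept separate from the subprocess call so that it can be tried on sample banners without a Java installation.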

Screenshots

The image below shows the starting screen. Arrow 1 points to the new section added to Freebase Gridworks, which enables browsing a government catalog represented as RDF according to the dcat vocabulary and recommendations. Note that you can browse either a SPARQL endpoint or an RDF dump file (arrow 2).

start screen

The example shows the results from our experimental SPARQL endpoint, which contains the dcat representation of four catalogs: data.gov, data.australia.gov.au, data.london.gov.uk and datasf.org.
When using the application, you can use this SPARQL endpoint URL: http://lab.linkeddata.deri.ie/govcat/sparql (please be patient, as the endpoint is running on limited resources),
or use one of these dump files:

  • data.australia.gov.au: http://lab.linkeddata.deri.ie/2010/dcat/files/data_australia.rdf
  • data.london.gov.uk: http://lab.linkeddata.deri.ie/2010/dcat/files/data_london.rdf
  • datasf.org: http://lab.linkeddata.deri.ie/2010/dcat/files/data_sf.rdf
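The endpoint can also be queried directly, outside Gridworks. The sketch below is only an illustration of how such a request could be built with the standard SPARQL protocol (a `query` parameter on a GET request). The dcat namespace URI and the use of `dct:title` for dataset titles are assumptions on our part and may not match the data actually served; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint URL as given on this page.
ENDPOINT = "http://lab.linkeddata.deri.ie/govcat/sparql"

# Assumption: the dcat namespace used by the catalog data.
DCAT_NS = "http://vocab.deri.ie/dcat#"
DCT_NS = "http://purl.org/dc/terms/"


def dataset_titles_query(limit=10):
    """Build a SPARQL query listing datasets with their titles."""
    return (
        f"PREFIX dcat: <{DCAT_NS}>\n"
        f"PREFIX dct: <{DCT_NS}>\n"
        "SELECT ?dataset ?title WHERE {\n"
        "  ?dataset a dcat:Dataset ;\n"   # assumption: datasets are typed
        "           dct:title ?title .\n"  # assumption: titles use dct:title
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_query_url(endpoint, query):
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


def fetch_results(endpoint, query, timeout=60):
    """Send the query and decode a SPARQL JSON result document."""
    request = Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example use (needs network access and a live endpoint):
#   results = fetch_results(ENDPOINT, dataset_titles_query(5))
```

A generous timeout is used because the endpoint runs on limited resources.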

After loading the RDF data, you can browse through the available datasets. Full-text search (arrow 1) and the category and data format facets (arrow 2) can be used to search the catalog.

Any dataset can be downloaded, and tabular ones can be opened in Freebase Gridworks (arrow 3), where you have all of its goodness and power to clean and polish the data.

catalog screen

Inside Freebase Gridworks you can now find an "Edit RDF Schema" option under the Schemas menu.

catalog screen

The dialog lets you shape the RDF the way you want. You can set the base URI (arrow 2), and all relative URIs will be resolved against it. You can add rdf:type to resources (arrow 1). You can define your own property if the autocomplete popup does not help (arrow 4): entering a relative URI will coin a new property within the namespace determined by the base URI you entered. At any point, you can preview the resulting RDF (arrow 3), which shows up to the first 20 rows represented in Turtle. The Vocabulary Manager (arrow 5) lets you manage the vocabularies/ontologies in use.
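To make the effect of the base URI concrete, here is a small sketch of relative URI resolution using Python's standard `urljoin`, which follows the generic URI resolution rules. Whether the dialog applies exactly these rules is an assumption, and the `example.org` URIs and the `resolve` helper are made up for illustration.

```python
from urllib.parse import urljoin


def resolve(base_uri, reference):
    """Resolve a possibly relative URI against a base URI."""
    return urljoin(base_uri, reference)


# Hypothetical base URI, ending in a slash.
BASE = "http://example.org/gov/"

# A relative resource URI lands under the base.
dataset_uri = resolve(BASE, "dataset/42")
# -> "http://example.org/gov/dataset/42"

# A relative property name coins a new property in the base namespace.
property_uri = resolve(BASE, "hasBudget")
# -> "http://example.org/gov/hasBudget"

# An absolute URI is left untouched.
foaf_name = resolve(BASE, "http://xmlns.com/foaf/0.1/name")
# -> "http://xmlns.com/foaf/0.1/name"

# Without the trailing slash, the last path segment of the base is replaced.
no_slash = resolve("http://example.org/gov", "hasBudget")
# -> "http://example.org/hasBudget"
```

The last case suggests why it is usually safer to end the base URI with `/` (or `#`) so that coined terms stay inside the intended namespace.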


Vocabulary manager. A handful of popular vocabularies are predefined for convenience.

catalog screen

Autocomplete options. Terms are based on the vocabularies defined in the vocabulary manager prefix.cc

Clicking on a node shows a dialog where you can specify all the details of the intended RDF resources.

catalog screen

To define your custom URIs, you have the full power of the Gridworks Expression Language (GEL). We also added a urlify function to it.
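The exact behavior of urlify is not documented on this page. Purely as an illustration of the kind of transformation such a function might perform, the sketch below turns a free-text cell value into something safe to use as a URI path segment. This is a hypothetical stand-in written in Python, not the actual GEL implementation, and the `example.org` base is invented.

```python
from urllib.parse import quote


def urlify(value):
    """Hypothetical stand-in for a urlify-style helper.

    Collapses runs of whitespace into single underscores, then
    percent-encodes every character that is not URL-safe.
    """
    collapsed = "_".join(value.split())
    return quote(collapsed, safe="")


def city_uri(base_uri, city_name):
    """Build a resource URI from a base URI and a cell value."""
    return base_uri + "city/" + urlify(city_name)


# Example with a made-up base URI:
example = city_uri("http://example.org/gov/", "San Francisco")
# -> "http://example.org/gov/city/San_Francisco"
```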

catalog screen

Issues

While this is a work in progress that still has some bugs and missing features, we want to highlight a further issue here. Catalogs (especially data.gov) might declare the format of a dataset as CSV but actually provide the data in a different format (usually exe), and things will just not work as expected. So at least some unexpected behavior is not our fault :-)
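One way to guard against such mislabeled datasets is to inspect the first bytes of a download before treating it as CSV. The sketch below is not something the application currently does; it is a rough heuristic with hypothetical function names. It relies on the fact that Windows executables start with the bytes `MZ`, and on Python's `csv.Sniffer` to guess whether the content is delimited text.

```python
import csv


def looks_like_windows_executable(data):
    """Windows executables begin with the two-byte signature 'MZ'."""
    return data[:2] == b"MZ"


def looks_like_csv(data, sample_size=4096):
    """Heuristically decide whether raw bytes are delimited text."""
    sample_bytes = data[:sample_size]
    if looks_like_windows_executable(sample_bytes):
        return False
    # NUL bytes are a strong hint that the content is binary.
    if b"\x00" in sample_bytes:
        return False
    # Replace undecodable bytes so a character cut at the sample
    # boundary does not cause a false negative.
    sample = sample_bytes.decode("utf-8", errors="replace")
    try:
        csv.Sniffer().sniff(sample, delimiters=",;\t")
    except csv.Error:
        return False
    return True


# Example use on the first chunk of a downloaded file:
#   with open("dataset.csv", "rb") as handle:
#       if not looks_like_csv(handle.read(4096)):
#           print("Catalog says CSV, but the file does not look like it")
```

A check like this cannot prove that a file is valid CSV, but it can catch the obvious cases where the declared format is wrong.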

Credits

Freebase Gridworks

Freebase Gridworks was originally developed by Metaweb. It is now an open source project hosted on Google Code. The people involved are listed here.

We are very grateful to the Gridworks developers and Metaweb for making such a great tool open source!

dcat

dcat was originally developed at DERI, Galway by Fadi Maali, Richard Cyganiak and Vassilios Peristeras.

Work on dcat is now pursued under the W3C eGov Interest Group. You can follow the work here.

Extending Gridworks

The additional functionality, namely exporting RDF and browsing government catalogs described in dcat, was developed by Fadi Maali and Richard Cyganiak.

Contacts

Please feel free to contact us with any comments, questions, bug reports or feature requests.

This site is © Copyright , Linked Data Research Centre (LiDRC), DERI 2010, All Rights Reserved