Thing which seemed very Thingish inside you is quite different when it gets out into the open and has other people looking at it

Friday, September 23, 2011

Expose your cloud data as RDF Resources

Since it's all about the semantic web (Web 3.0) and RDF data linking, I am going to explain RDF data and how to expose it in the cloud in 5 to 10 minutes :) using nothing more than the WSO2 Stratos Data Services Server.

The Resource Description Framework (RDF) is one of the most powerful techniques for exposing and interlinking data (knowledge) in a decentralized world. It is also a current trend in publishing and consuming linked data in the cloud, so let's discuss how we can expose our data as an RDF resource in the cloud using the WSO2 Stratos Data Services Server.

1) Use the RDF data model to publish structured data on the Web

The RDF data model consists of a set of statements, each published on the web as a triple of subject, predicate and object. In simple terms, the RDF model is a way of representing machine-understandable data on the web, as shown in the diagram below.
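
To make the triple idea concrete, here is a minimal Python sketch that models statements as subject/predicate/object tuples. It is not part of any WSO2 or RDF library: the `Triple` type, the `objects_of` helper and the `productLine` predicate are hypothetical names chosen for illustration, while the URIs are borrowed from the examples used later in this post.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


# Illustrative statements about one product (URIs borrowed from this post).
PRODUCT = "http://www.product.fake/cd/S10_1678"
triples = [
    Triple(PRODUCT, "http://www.product.fake/cd#productName",
           "1969 Harley Davidson Ultimate Chopper"),
    # An object can itself be a URI, which is how one resource links to another.
    # The "productLine" predicate is an assumption made for this sketch.
    Triple(PRODUCT, "http://www.product.fake/cd#productLine",
           "http://productLines/cycle"),
]


def objects_of(subject: str, predicate: str) -> list[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


print(objects_of(PRODUCT, "http://www.product.fake/cd#productName"))
```

A plain list of tuples is enough to show the model; a real application would normally rely on a dedicated RDF library instead.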

2) Use RDF links to interlink data from different data sources

Everything described by RDF is called a resource. An RDF link expresses the relationship between one resource and another, and is written mainly with URIs.

------------------- Simple RDF file -------------------
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cd="http://www.product.fake/cd#">
<rdf:Description rdf:about="http://www.product.fake/cd/S10_1678">
<cd:productName>1969 Harley Davidson Ultimate Chopper</cd:productName>
</rdf:Description>
</rdf:RDF>
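
As a rough illustration of how such a file can be read back as triples, the sketch below parses a completed copy of the snippet with Python's standard `xml.etree.ElementTree`. The closing tags and the standard RDF namespace URI are written out in the string so that it is well-formed. `description_to_triples` is a hypothetical helper, and it only handles this simple shape of RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Completed, well-formed copy of the snippet (closing tags written out here).
RDF_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cd="http://www.product.fake/cd#">
  <rdf:Description rdf:about="http://www.product.fake/cd/S10_1678">
    <cd:productName>1969 Harley Davidson Ultimate Chopper</cd:productName>
  </rdf:Description>
</rdf:RDF>
"""


def description_to_triples(xml_text):
    """Yield (subject, predicate, object) for each property element."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree names tags as "{namespace}local"; the RDF predicate
            # is the namespace and the local name concatenated.
            namespace, local = prop.tag[1:].split("}")
            yield subject, namespace + local, prop.text


for triple in description_to_triples(RDF_XML):
    print(triple)
```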

Now that we have a brief understanding of RDF and why RDF data matters, let's see how we can generate an RDF data source from a Google spreadsheet.

First, create a Google spreadsheet of your choice that holds some sensible information. To get the full benefit of RDF you would need to create several RDF resources to link together; for clarity, however, I will demonstrate how to create a single RDF resource and link it with existing RDF resources.

Let's expose a Google spreadsheet with product information on vehicle sales.

    Product – Describes the products currently available from a car sales vendor.

    ProductID  Model                                 Classification  Qty
    S10_1678   1996 Moto Guzzi 1100i                 Motorcycles     12
    S10_1949   2003 Harley-Davidson Eagle Drag Bike  Classic Cars    23
    S10_2016   1972 Alfa Romeo GTA                   Motorcycles     18
    S10_4698   1962 LanciaA Delta 16V                Motorcycles     15
    S10_4757   1968 Ford Mustang                     Classic Cars    13
    S10_4962   2001 Ferrari Enzo                     Classic Cars    12
    S12_1099   1968 Ford Mustang                     Classic Cars    4
    S12_1108   2001 Ferrari Enzo                     Classic Cars    10

Let's assume we have another set of RDF resources describing product lines (with information on each product line type), i.e. http://productLines/car, http://productLines/cycle and http://productLines/bus.

Now let's create a data service to expose our spreadsheet data as an RDF resource. To expose this data in the cloud you need a StratosLive account. Once you have created your StratosLive account, you can access a set of Stratos services such as Enterprise Service Bus, Application Server and Data Services (to try out the Stratos services you can easily create a demo account for free).

After creating your Stratos account you can log in to your tenant domain and start working in the cloud!
Now let's get back to exposing spreadsheet data as a service. To do that we use the WSO2 Stratos Data Services Server, which provides a powerful set of features for exposing data as a service, along with a set of service utility methods. To access it, go to the StratosLive manager home page and click on WSO2 Stratos Data Services.

To create a data service, go to the left-hand menu and click on Create under webservices->Add->Data Services. You will then get a wizard as shown below. Give the data service a proper name and click on Next.

Once you click on Next you will be directed to the Add Data Source page. Enter the details of the Google spreadsheet you created, along with your credentials.

You can click on Test Connection to confirm your connection.

Click on Next to go to the Query page. The query describes how to extract your data from the data source (the Google spreadsheet). Let's extract ProductID, Model, Classification and Qty.

Since our output is an RDF result set, we need to specify the output type as RDF. The RDF Base URI is the format of the rdf:about URI, which uniquely identifies each resource.

We will give the RDF Base URI as http://www.product/cd/{1}; for each row, this takes the value of spreadsheet column 1 (the ID) and substitutes it into the rdf:about attribute of the rdf:Description element.

Output Type – RDF
RDF Base URI :- http://www.product/cd/{1}
Row namespace :- http://www.product/cd#
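
The `{1}` placeholder behaves like a per-row template. The sketch below imitates that substitution in plain Python. It only approximates the idea and is not the Data Services Server's actual implementation; `expand_uri` is a hypothetical helper, and column numbers are assumed to be 1-based as in the settings above.

```python
import re


def expand_uri(template, row):
    """Replace each {n} in the template with the n-th (1-based) column of row."""
    def column(match):
        return str(row[int(match.group(1)) - 1])
    return re.sub(r"\{(\d+)\}", column, template)


# First row of the product spreadsheet: ID, Model, Classification, Qty.
row = ["S10_1678", "1996 Moto Guzzi 1100i", "Motorcycles", 12]
print(expand_uri("http://www.product/cd/{1}", row))
```

The same helper works for any column number, so a template that uses `{3}` would pick up the Classification value instead.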

To generate the response in RDF format, click on the "Add New Output Mappings" button. There are two mapping types in RDF output mapping: 1) as an element, and 2) as a resource. When mapping a column as a resource, you need to give the resource URI with the column to be mapped in curly brackets, as shown below. This way we can link two RDF resources together and create a relationship between them.

Let's map ID, Model and Qty as elements and Classification as a resource, linking the Classification column to the product line resources I mentioned earlier (http://productLines/car, http://productLines/cycle, http://productLines/bus).

Mappings of RDF resource

Resource URI – http://productLines/{3} (as you can see, we use column 3 to get the classification type of each product).

Resource Field Name – Classification

Mappings of RDF element

Following diagram shows the output mappings which we mapped from google spreadsheet to RDF resource.
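
As a rough picture of what these mappings produce, the Python sketch below builds one `rdf:Description` for a single row: element mappings become child elements with text, and the resource mapping becomes an `rdf:resource` link. This is an assumption about the general shape of the RDF/XML, not output captured from the Data Services Server. `row_to_description`, the `cd` prefix and the element names are chosen for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ROW_NS = "http://www.product/cd#"  # the row namespace from the query settings

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("cd", ROW_NS)  # prefix chosen for this sketch


def row_to_description(parent, row):
    """Append one rdf:Description for an (ID, Model, Classification, Qty) row."""
    product_id, model, classification, qty = row
    desc = ET.SubElement(
        parent, f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"http://www.product/cd/{product_id}"})
    # Element mappings: the column value becomes the element text.
    for name, value in (("ID", product_id), ("Model", model), ("Qty", qty)):
        ET.SubElement(desc, f"{{{ROW_NS}}}{name}").text = str(value)
    # Resource mapping: the column value is substituted into the resource URI.
    ET.SubElement(
        desc, f"{{{ROW_NS}}}Classification",
        {f"{{{RDF_NS}}}resource": f"http://productLines/{classification}"})
    return desc


root = ET.Element(f"{{{RDF_NS}}}RDF")
row_to_description(root, ("S10_1678", "1996 Moto Guzzi 1100i", "Motorcycles", 12))
print(ET.tostring(root, encoding="unicode"))
```

Note that the classification value is substituted as-is, so the link only resolves if a resource with exactly that URI exists on the product line side.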

Once we have created the query, click on Next to add resources. Since we are exposing the data as an RDF resource, we need to create a resource for it. Let's give our query information when creating the resource.

Resource Path – Products
Resource Method – Get
Query ID – RDFQuery

Click on Finish to deploy the data service. Once you do, you can see your deployed data service in the service list, as shown below.

Now that we have created our RDF resource, we can test it by accessing it with a REST call or by using the Try It feature.

REST URL (replace the tenant name with your tenant domain)

You can validate this RDF resource with the online RDF validator by copying and pasting the RDF document into it (right-click on the page, view the page source, and copy the source into the validator).
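
Before pasting the document into the online validator, a quick local sanity check is also possible. The sketch below only confirms that the text is well-formed XML with an `rdf:RDF` root and that every `rdf:Description` carries an `rdf:about` attribute. It is far weaker than a real RDF validator, and `looks_like_rdf` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def looks_like_rdf(xml_text):
    """Return True if xml_text is well-formed and has the basic RDF/XML shape."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well-formed XML
    if root.tag != f"{{{RDF_NS}}}RDF":
        return False
    # Anonymous nodes are legal RDF in general, but the service described in
    # this post is expected to emit rdf:about for every row.
    return all(desc.get(f"{{{RDF_NS}}}about")
               for desc in root.findall(f"{{{RDF_NS}}}Description"))


sample = (
    f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:cd="http://www.product/cd#">'
    '<rdf:Description rdf:about="http://www.product/cd/S10_1678">'
    '<cd:Model>1996 Moto Guzzi 1100i</cd:Model>'
    '</rdf:Description></rdf:RDF>'
)
print(looks_like_rdf(sample))
print(looks_like_rdf("<rdf:RDF>unclosed"))
```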

We have now exposed our spreadsheet data in the cloud within just 10 minutes :) You can create more RDF data sources in the same manner from different data sources (CSV/Excel/RDBMS) and expose that data as RDF. In my next blog post I will explain how we can extract RDF data using SPARQL :)
