In my last post I explained how to expose your data in the cloud as RDF resources. Today I am going to explain how to query RDF resources on the web or in the cloud and expose the extracted data as a service. As usual, to expose data as a service we first need to create a data service using WSO2 Data Services Server.
To demonstrate RDF data extraction I am going to use a popular RDF data source that stores information about NASA aircraft, and we are going to extract aircraft information by agency. The following diagram shows how NASA keeps its aircraft information as a data source.
If you click the RDF/XML link in the top right-hand corner, you can view the RDF source of the data.
To create data services you can either download and install WSO2 Data Services Server or create them straight away on the StratosLive cloud platform.
Once you log in to Data Services Server, click on Create under Web Services -> Data Services -> Add. Give the service an appropriate name and click Next.
Then create an RDF data source that points to our NASA RDF data. Give the Data Source Id, Data Source Type, and RDF File Location.
RDF File Location - http://nasa.dataincubator.org/~search.rdf?query=all
Click Next and then click on new query to create a new query. To query RDF data we need to write queries in the SPARQL query language, which is similar to SQL. The following SPARQL query is used to extract aircraft information:
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX relevance: <http://a9.com/-/opensearch/extensions/relevance/1.0/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?homepage ?name ?alternateName ?internationalDesignator ?mass ?score ?launch ?agency ?description
WHERE {
?craft foaf:homepage ?homepage.
?craft foaf:name ?name.
?craft space:alternateName ?alternateName.
?craft space:internationalDesignator ?internationalDesignator.
?craft space:mass ?mass.
?craft relevance:score ?score.
?craft space:launch ?launch.
?craft space:agency ?agency.
?craft dc:description ?description.
}
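To make the shape of this query concrete, here is a minimal pure-Python sketch of what the WHERE clause asks for. It is not how Data Services Server or a SPARQL engine is implemented. It only illustrates that a craft yields a row only when every triple pattern matches, and it assumes the agency input parameter acts as a pre-bound value for ?agency. The sample triples, the `select` helper and the reduced set of three patterns are all invented for illustration.

```python
from itertools import product

# Variable name -> predicate it is bound through. Only three of the nine
# patterns from the query are used, to keep the sketch short.
PATTERNS = {
    "name": "foaf:name",
    "agency": "space:agency",
    "mass": "space:mass",
}

# Hypothetical sample triples (subject, predicate, object); not real NASA data.
TRIPLES = [
    ("craft:1", "foaf:name", "Example One"),
    ("craft:1", "space:agency", "United States"),
    ("craft:1", "space:mass", "100.0"),
    ("craft:2", "foaf:name", "Example Two"),
    ("craft:2", "space:agency", "Japan"),
    ("craft:2", "space:mass", "250.0"),
    ("craft:3", "foaf:name", "No Mass Recorded"),
    ("craft:3", "space:agency", "United States"),
]


def select(triples, patterns, bindings=None):
    """Return one row per combination of values that satisfies every pattern."""
    bindings = bindings or {}
    # Index the triples as subject -> predicate -> list of objects.
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {}).setdefault(p, []).append(o)

    variables = list(patterns)
    rows = []
    for subject in sorted(by_subject):
        props = by_subject[subject]
        # Every triple pattern must match, otherwise the subject yields no row.
        if not all(pred in props for pred in patterns.values()):
            continue
        # Multi-valued properties produce one row per combination of values.
        for values in product(*(props[patterns[v]] for v in variables)):
            row = dict(zip(variables, values))
            # A pre-bound variable (such as an input parameter) must agree.
            if all(row.get(k) == v for k, v in bindings.items()):
                rows.append(row)
    return rows


all_rows = select(TRIPLES, PATTERNS)
us_rows = select(TRIPLES, PATTERNS, {"agency": "United States"})
```

In this sketch `craft:3` never appears in the results because it has no mass triple, which mirrors how a craft missing any of the nine properties would be left out of the real query's results.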
Click on add input mapping to define the input mapping parameters. Since we are going to pass agency as an input parameter, we need to specify it in our rdfQuery. Then go back to the main configuration.
Give the Query information and the SPARQL query as shown below.
Now we need to arrange how the results are displayed in the Result (output mapping) section.
- Output type – xml
- Grouped by element – Aircrafts
- Row name – Aircraft
Click on add new output mappings to add output mapping elements. Give the mapping type, output field name and data source column name as shown in the table below.
Mapping Type | Output Field Name | Data Source Column Name |
---|---|---|
element | homepage | homepage |
element | name | name |
element | alternateName | alternateName |
element | internationalDesignator | internationalDesignator |
element | mass | mass |
element | score | score |
element | launch | launch |
element | agency | agency |
element | description | description |
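With this output mapping, a client can expect one Aircrafts element wrapping one Aircraft element per result row, each with the nine child elements listed in the table. The following is a rough sketch of how a client might read such a response with Python's standard library. The sample XML is hypothetical: its values are invented, and it omits any XML namespace that a real response may carry. The `parse_aircrafts` helper is likewise only an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical response in the shape configured above; values are invented
# and a real response may carry an XML namespace that this sample omits.
SAMPLE_RESPONSE = """\
<Aircrafts>
  <Aircraft>
    <homepage>http://example.org/craft/1</homepage>
    <name>Example One</name>
    <alternateName>EX-1</alternateName>
    <internationalDesignator>0000-000A</internationalDesignator>
    <mass>100.0</mass>
    <score>1.0</score>
    <launch>http://example.org/launch/1</launch>
    <agency>United States</agency>
    <description>Placeholder description.</description>
  </Aircraft>
</Aircrafts>
"""

# Output field names taken from the output mapping table.
FIELDS = [
    "homepage", "name", "alternateName", "internationalDesignator",
    "mass", "score", "launch", "agency", "description",
]


def parse_aircrafts(xml_text):
    """Turn each Aircraft row element into a dict keyed by output field name."""
    root = ET.fromstring(xml_text)
    rows = []
    for aircraft in root.findall("Aircraft"):
        # findtext returns None when a field element is absent.
        rows.append({field: aircraft.findtext(field) for field in FIELDS})
    return rows


aircrafts = parse_aircrafts(SAMPLE_RESPONSE)
```

All values come back as strings here; converting mass or score to numbers is left to the caller.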
Click on save to save the query. Now we'll add an operation called getAircrafts to return our extracted data. Click Next -> Add New Operation, and give the query name and the operation name.
Click on finish to deploy the data service. Once you click on finish you can see your deployed data service in the service list as shown below.
To try this service, click on the try-it feature. Let's test our service by giving “United States” as the input. You can see that we get the details of all aircraft belonging to the United States agency.