Developing for the Semantic Web with Ruby on Rails, the 2012 guide
30th January 2012 at 10:47

The Semantic Web, an Internet of connected knowledge, meaningful data and machine-readable information, is seen by many as the future of the Internet. Envisioned by Tim Berners-Lee, its purpose is to transition from a web of unstructured documents into a “web of data”, in which we could browse and search for data based not only on syntactic keywords, but also on semantic meaning. However, it’s awkward to realise that most of the work around the Semantic Web is done in Java, and many technologies with great future potential, such as Ruby on Rails, are seemingly forgotten.
Having developed a Semantic Web project using that Ruby framework, I found only scattered information and tiny bits of contributions, many already obsolete. It’s almost shocking to see so little information, and so much of it outdated, around such a modern framework with such a vibrant and active community developing for it. As such, having collected many of those bits of information, I believe it’s time to put them together and finally create an updated guide to developing for the Semantic Web with Ruby on Rails.
Disclaimer: This is not a step-by-step tutorial, so don’t expect to know how to develop a Semantic Web application in Ruby on Rails as soon as you finish reading this. The main goal of this article is to save hours of research and show the best and most up-to-date tools available. Also, the code examples aren’t necessarily complete or 100% correct, as their goal is to explain something, not to be copy+pasted into a tutorial application.
Building the ontology and fetching relevant data
If you’re going to develop a website using the concept of the Semantic Web, one of the first things you’ll need is an ontology. You may find many useful ontologies capable of describing the data your web application is going to use (the Swoogle project might be a good place to start your search), but you may well have to develop your own. If you’re building your own ontology, I recommend using Protégé for it.
The development of the ontology is out of the scope of this article, as it has no relation to the platform used to develop the web application itself, so if you don’t know how to start I’d recommend reading some simpler guides first (here’s a short collection of links that might help you with that).
After you have the right ontology to power your web application, it’s time to fetch the data. Unless you’re aiming for a very limited and controlled set of elements, you’ll probably want to develop a screen scraper or use an external API. Using the Ruby gem Nokogiri, you can easily browse and fetch the information you want from any useful website. Here is a simple example of a screen scraper for your application:
require 'nokogiri'
require 'open-uri'

# Collects the name and description of every item listed at website_url
def fetch_items(website_url)
  items = []
  doc = Nokogiri::HTML(URI.open(website_url))
  doc.css(".item").each do |item|
    # Follow the link to the item's own page
    link = item.at_css(".info h4 a")[:href]
    item_doc = Nokogiri::HTML(URI.open(link))
    name = item_doc.at_css("h2#title a").content.to_s
    description = item_doc.at_css("p.description").content.to_s
    # Store each item as a hash, so item["name"] works later on
    items << { "name" => name, "description" => description }
  end
  items
end
As you see, this little example simply iterates through all the elements of the website of the class “item”, browses to their specific page and then extracts the name and description. This technique may have its flaws, like being vulnerable to major website redesigns, but it’s also a very simple and rapid way of extracting large quantities of information.
Ideally, you’ll want to save your new information in XML files or in a database, so you can use them later without having to run the fetcher again. This, of course, is just a temporary location, ready to change as soon as the Semantic Web enters the game.
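As a rough sketch of that caching step, the snippet below writes the fetched items to a local file and reads them back on later runs. It uses JSON from Ruby’s standard library instead of XML, purely to keep the example free of dependencies. The helper names save_items and load_items and the cache file name are made up for illustration.

```ruby
require 'json'
require 'tmpdir'

# Write the scraped items (an array of hashes) to disk as JSON.
def save_items(items, path)
  File.write(path, JSON.pretty_generate(items))
end

# Read the items back, or return nil if nothing has been cached yet.
def load_items(path)
  return nil unless File.exist?(path)
  JSON.parse(File.read(path))
end

# Hypothetical cache location; any writable path would do.
cache_file = File.join(Dir.tmpdir, "items_cache.json")

items = [{ "name" => "First item", "description" => "An example description" }]
save_items(items, cache_file)
cached = load_items(cache_file)
```

A fetcher could call load_items first and only run the scraper when it gets nil back.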
Putting the Semantic Web in Ruby on Rails
It’s easy enough to develop for the Semantic Web in Java, as you can easily find a large set of tools to help you in every stage of the development. In Ruby it’s not that easy at all. After some research you’ll probably come across two Ruby gems that handle the semantic graph: ActiveRDF and RDF.rb. However, ActiveRDF is now outdated and it doesn’t seem to work as it should in Rails 3. So RDF.rb it is, then.
Now that we have the tool to manage the semantic data, we should save our raw data in a semantic graph. We’ll use the N-Triples format for that. First you’ll need to define all the semantic relations of your data:
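require 'rdf'

MyOnt = RDF::Vocabulary.new("http://ricardolopes.net/myont.owl#")

# Builds a graph with one typed, named resource per item
def build_graph(items)
  graph = RDF::Graph.new
  items.each do |item|
    # item["uri"] is the unique URI of the item (more on that below)
    graph << [item["uri"], RDF.type, MyOnt.Item]
    graph << [item["uri"], MyOnt.hasName, item["name"]]
  end
  graph
end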
This example assumes that you have an ontology with the URI http://ricardolopes.net/myont.owl, which has at least a class Item and a property hasName. This example, like all the others in this guide, is far from complete, as it is only meant as a quick demonstration. One thing it doesn’t show is how to create the unique URI for each item. That is up to you to decide, as every project is different. However, once you have a unique identifier for every item, you should probably do something like item["uri"] = MyOnt[unique_identifier]. Beware, though, that if you use MyOnt to store individuals, you’ll no longer have an empty ontology serving only as a schema. You’ll probably find it more useful to use one URI for the ontology schema and another for the individuals (MyOnt and Data, for instance).
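One possible way to build such identifiers, sketched here with plain strings only, is to combine a date with a slug of the item’s name under a separate data namespace. The helper item_uri and the constant names are hypothetical, and the date-plus-slug scheme is just an assumption modelled on the URIs used in this article’s examples.

```ruby
require 'date'

# Two separate namespaces: one for the schema, one for the individuals.
ONTOLOGY_NS = "http://ricardolopes.net/myont.owl#"
DATA_NS     = "http://ricardolopes.net/data/"

# Hypothetical helper: builds a URI string from a name and a date.
# Uniqueness is only as good as the (name, date) pair you feed it.
def item_uri(name, date)
  slug = name.downcase.gsub(/[^a-z0-9]+/, "")
  "#{DATA_NS}#{date.strftime('%Y/%m/%d')}/#{slug}"
end

uri = item_uri("Semantic Web article", Date.new(2012, 1, 30))
# The string can then be wrapped with RDF::URI.new(uri) before it goes into the graph.
```

Keeping the individuals under DATA_NS leaves the ontology namespace free to act as a pure schema.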
After that step, you’ll then need to store the information of the semantic graph you created in an N-Triples file:
require 'rdf/ntriples'

# The .nt extension lets RDF.rb pick the N-Triples writer
RDF::Writer.open("data/graph.nt") do |writer|
  graph.each_statement do |stmt|
    writer << stmt
  end
end
That’s it. You should now have a complete N-Triples file with all your semantic information. Now that you have the semantic data, you may start using its full potential.
Querying the graph
If you’re not new to Semantic Web development, you’re probably used to SPARQL when it comes to querying for information. In Ruby on Rails you can run such queries in pure Ruby, thanks to the RDF.rb gem. Imagine you have the following information:
<http://ricardolopes.net/data/2011/01/10/ricardolopes> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ricardolopes.net/myont.owl#Developer> .
<http://ricardolopes.net/data/2011/01/10/ricardolopes> <http://ricardolopes.net/myont.owl#hasName> "Ricardo Lopes" .
<http://ricardolopes.net/data/2012/01/30/semanticwebarticle> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ricardolopes.net/myont.owl#Article> .
<http://ricardolopes.net/data/2012/01/30/semanticwebarticle> <http://ricardolopes.net/myont.owl#hasName> "Semantic Web article" .
<http://ricardolopes.net/data/2011/01/10/ricardolopes> <http://ricardolopes.net/myont.owl#isAuthorOf> <http://ricardolopes.net/data/2012/01/30/semanticwebarticle> .
This means that we have an individual of the class Developer that has the name “Ricardo Lopes”, and an individual of the class Article that has the name “Semantic Web article”. We also have a relation that states that the Developer “Ricardo Lopes” is the author of the Article “Semantic Web article”.
Imagine now that you’re showing the article “Semantic Web article” and want to show the name of its author. Using the RDF.rb gem, you could do it simply using Ruby:
article_name = "Semantic Web article"
author_name = nil  # declared here so it is still visible after the blocks
# ...
graph.query([nil, MyOnt.hasName, article_name]) do |stmt1|
  # stmt1.subject now holds the URI of the article
  graph.query([nil, MyOnt.isAuthorOf, stmt1.subject]) do |stmt2|
    # stmt2.subject now holds the URI of the author
    graph.query([stmt2.subject, MyOnt.hasName, nil]) do |stmt3|
      # stmt3.object now holds the name of the author;
      # to_s gives the plain value rather than a debug representation
      author_name = stmt3.object.to_s
    end
  end
end
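To make the role of nil in those patterns concrete, here is a toy sketch that does not use any library. It stores triples as plain arrays and treats nil as “match anything”. It only illustrates the idea and is not how RDF.rb is implemented; the short prefixed strings stand in for the full URIs, and match and author_names are made-up helper names.

```ruby
# Toy data: each triple is a plain [subject, predicate, object] array.
TRIPLES = [
  ["data:ricardolopes", "rdf:type", "myont:Developer"],
  ["data:ricardolopes", "myont:hasName", "Ricardo Lopes"],
  ["data:semanticwebarticle", "rdf:type", "myont:Article"],
  ["data:semanticwebarticle", "myont:hasName", "Semantic Web article"],
  ["data:ricardolopes", "myont:isAuthorOf", "data:semanticwebarticle"]
]

# Return every triple matching the pattern; nil matches anything.
def match(triples, pattern)
  triples.select do |triple|
    pattern.zip(triple).all? { |wanted, actual| wanted.nil? || wanted == actual }
  end
end

# Same three steps as the nested queries: article -> author -> author's name.
def author_names(triples, article_name)
  match(triples, [nil, "myont:hasName", article_name]).flat_map do |(article, _, _)|
    match(triples, [nil, "myont:isAuthorOf", article]).flat_map do |(author, _, _)|
      match(triples, [author, "myont:hasName", nil]).map { |(_, _, name)| name }
    end
  end
end

names = author_names(TRIPLES, "Semantic Web article")
```

Each nesting level narrows the search using a value found by the level above it, which is the same thing the shared variables do in a SPARQL query.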
This is equivalent to the SPARQL query:
PREFIX MyOnt: <http://ricardolopes.net/myont.owl#>
SELECT ?author_name
WHERE {
?article MyOnt:hasName "Semantic Web article" .
?author MyOnt:isAuthorOf ?article .
?author MyOnt:hasName ?author_name .
}
If you still prefer to use SPARQL instead of pure Ruby, there’s a gem for that: SPARQL-Grammar. Beware, though, that, as I’ve said before, the Semantic Web in Ruby still has a long way to go to catch up with languages such as Java. The work to bring SPARQL to Ruby is still very incomplete and there are many things you can’t do (sometimes I couldn’t even define prefixes, which resulted in some very ugly queries). If you want to give that gem a try, your code should be somewhat similar to this:
require 'rdf/ntriples'
require 'sparql/grammar'

queryable = RDF::Repository.load("data/graph.nt")
query = <<~SPARQL
  PREFIX MyOnt: <http://ricardolopes.net/myont.owl#>
  SELECT ?author_name
  WHERE {
    ?article MyOnt:hasName "Semantic Web article" .
    ?author MyOnt:isAuthorOf ?article .
    ?author MyOnt:hasName ?author_name .
  }
SPARQL
sse = SPARQL::Grammar.parse(query)
results = sse.execute(queryable)
# Collect each author name once
authors = results.map { |result| result[:author_name].to_s }.uniq
You now have the tools you need to make full use of the semantic information of your application. With many Semantic Web applications, this is all you need to know. However, there’s more we can do. If you’re developing in Ruby on Rails, chances are you’re intending to put your web application on the cloud. And there’s also a good possibility that you may want to change some of your data in the future. That’s when a simple file is no longer an option to store your data and you must upgrade it to a Triple Store.
Leveling up to a Triple Store
A Triple Store is to an N-Triples file what a database is to an XML file (sort of). So if you ever need to update the semantic information of your web application, that’s what you should use.
There are plenty of Triple Stores in Java for you to run locally or on your server, like Jena, AllegroGraph, Sesame and many more. There are interfaces that let your Ruby application interact with those Java Triple Stores, so if your server supports Java or localhost is as far as you’ll get, then one of those might be the right solution for you.
However, if you’re developing the webapp in Ruby on Rails, chances are you want it online and your server might not support other languages. If that’s the case, then you need to develop an interface that creates an abstraction in your server’s database so it acts as a Triple Store. Such an abstraction would probably use a single 3-column table (for subject, predicate and object) and would return a set of triples for every query received. The hardest step should be converting from SPARQL or pure-Ruby RDF.rb queries into the proper SQL query.
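As a minimal sketch of that conversion step, the function below turns a [subject, predicate, object] pattern into a parameterised SQL query against a single 3-column table. The table name triples, its column names and the helper pattern_to_sql are all assumptions made for this illustration.

```ruby
# Assumed schema (hypothetical names):
#   CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT);
COLUMNS = %w[subject predicate object].freeze

# Turn a [subject, predicate, object] pattern into SQL plus bind values.
# nil means "any value", so that column is left out of the WHERE clause.
def pattern_to_sql(pattern)
  conditions = []
  binds = []
  COLUMNS.zip(pattern).each do |column, value|
    next if value.nil?
    conditions << "#{column} = ?"
    binds << value.to_s
  end
  sql = "SELECT subject, predicate, object FROM triples"
  sql += " WHERE " + conditions.join(" AND ") unless conditions.empty?
  [sql, binds]
end

sql, binds = pattern_to_sql(
  [nil, "http://ricardolopes.net/myont.owl#hasName", "Semantic Web article"]
)
```

The bind values would be passed to the database driver alongside the SQL string. A real store would also need to record whether each object is a URI or a literal, which this sketch ignores.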
Fortunately, if you use PostgreSQL and/or SQLite, some great work has already been done to integrate those databases with RDF.rb data. It’s the Ruby gem RDF-do, and it may save you days of work and many frustrations.
To sum up
Once one grasps the power of the Semantic Web, it’s not easy to ignore its advantages and future potential. However, as you can now see, the work around it in Ruby feels scattered and incomplete. Because it was so difficult to come across these scattered pieces of great work, and because of what can be achieved when they are all connected, there was a great need to put all this knowledge together. Yes, there are other similar guides online, but many are so outdated that they may put you on the wrong track.
As I’ve pointed out before, this wasn’t intended to be a step-by-step tutorial for you to copy+paste all the code you need to develop a basic application. If this article is already as long as it is, imagine how long it would be with all that information. It’s more like an updated guide you can use to save hours of research and frustration.
As I’m fairly new in this area, there are probably many mistakes in this article. Still, I hope that the bundled information can make up for that.




10 Comments
Alexander
19/02/2012
Thank you, you helped me a lot
Juan Francisco Reyes
11/03/2012
I am planning to develop a Semantic Web application for news content as part of my thesis project. Actually, I am planning to focus on the cognitive processes of consuming news in the old fashion vs. the semantic fashion, so I need to have a running news content application, but I have not chosen which language/platform/framework fits my project. Since I have previous experience with RoR I was thinking of choosing it for my purposes, but I am still not sure. Any suggestions?
ricardolopes
11/03/2012
Thank you for your comments.
RoR is an excellent framework for web development, but I feel like it’s still far behind Java when talking about the Semantic Web.
This post proves that it’s possible to develop a full Semantic Web application in RoR, but for many people with good Java knowledge I believe Java is still the best option (this decision, of course, should be based on one’s experience with all possible alternatives).
I doubt many Rails fans would love that option, but there are also good MVC frameworks for Java, like Play Framework (which I have never used, BTW).
Juan Francisco Reyes
16/03/2012
Hey Ricardo, are you developing Semantic Web apps using RoR? I am planning to start doing it. What materials would you recommend I read?
ricardolopes
17/03/2012
I built one as a project for a Semantic Web course. Unfortunately I didn’t find much literature about Semantic Web in RoR (and because of that I felt the need to write this post), so I believe the best I can recommend is the pages of the tools I talked about (like RDF.rb and SPARQL-Grammar).
Mauricio Mercado
26/06/2012
Hello Ricardo, great post!
I have a specific question… do you have an example of how to parse an RDF file and save it into a triplestore… for example Sesame… or preferably a library with good/great documentation?
I’m looking to build a small Semantic Web app, and want to use Ruby or maybe Python… what would you recommend?
ricardolopes
26/06/2012
Between Ruby and Python I would choose Ruby, because Python seems to have fewer and less stable tools for the Semantic Web (from what I understood by watching people use Python for SW, as I only used Ruby).
Unfortunately I didn’t use any Triple Store because of technical constraints: I had to use an abstraction in a relational database. The best tool I can recommend for that is RDF-do.
Srinivas
15/07/2012
Hi Richard
I Need a step by step implementation of a rails application with sqlite store. Although https://s3.amazonaws.com/video-encoding-outputs/rails-rdf-agraph.mp4 helped me a lot.
But still I don’t understand how to integrate ActiveRecord(which is a Relational Database ORM) with other RDF storages
Srinivas
15/07/2012
Also tell how to setup a storage for RDF files
Ruby and Semantic Web | First Blog for Social Web Semantics
13/11/2012
[...] Developing for the Semantic Web with Ruby on Rails, the 2012 guide was my main guide while learning integration of ruby with semantic web. It is accoringly an update site still some links mentioned in blog are broken. [...]