Friday, July 28, 2006

Reifying reified relationships

In our recent modeling exercises with real-world customers it became (once more) evident that reified relationships are a key requirement in many domains. Reified relationships are everywhere.

A typical solution to this is to introduce a relationship class that connects a source/subject with a target/object, so that you could attach additional attributes to the relationship objects. For example if we want to be able to attach an index to each parent-child relationship (first child, second child etc), then we can introduce an object to link parent, child and the index.

However, in many such cases it becomes very complicated and inconvenient if you need to do reasoning with such reified relationships. In many cases you want to talk about a simple relationship such as isParentOf instead of having to construct statements involving reified objects like ChildParentRelationship.
As a small exercise I have created an ontology pattern to automatically synchronize a reified relationship with a plain (object) property. I have no idea if this is useful to someone (and whether this has been shown elsewhere already), but anyway I'd like to share this idea here. An example based on this pattern is demonstrated below:



We have a generic base class reif:BinaryRelationship with properties reif:subject and reif:object. A subclass of this is reif:SynchronizedBinaryRelationship, and you can create subclasses of this, such as ChildParentRelationship. Then let's assume we have a class Person with instances, and a simple property pair hasParent and isParentOf. Now, whenever we create a reified ChildParentRelationship between A and B, we want to automatically also get the triple A hasParent B. This is expressed by linking our relationship class to the hasParent property through reif:synchronizedProperty (OWL Full).

To drive this, automatic inferences can be expressed in a rule language and then used by engines such as the generic Jena rule engine. Here is the corresponding rule in Jena notation:

[synchronize:
(?r rdf:type ?rt)
(?rt rdfs:subClassOf reif:SynchronizedBinaryRelationship)
(?r reif:subject ?s)
(?r reif:object ?o)
(?rt reif:synchronizedProperty ?p) ->
(?s ?p ?o)
]
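To make the effect of the rule more tangible, here is a minimal sketch in plain Python rather than Jena. It treats triples as simple string tuples and applies the same five conditions. The ex: names, the ex:index attribute and the synchronize function are made up for illustration, and unlike a real reasoner the sketch only looks at directly asserted rdfs:subClassOf statements.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
SYNC_BASE = "reif:SynchronizedBinaryRelationship"


def synchronize(triples):
    """Return the plain triples implied by the synchronize rule."""
    triples = set(triples)

    def objects(subject, predicate):
        # All values o such that (subject, predicate, o) is asserted.
        return {o for (s, p, o) in triples if s == subject and p == predicate}

    inferred = set()
    for (r, p, rt) in triples:
        if p != "rdf:type":
            continue
        # The rule only fires for classes asserted as subclasses of the base class.
        if SYNC_BASE not in objects(rt, "rdfs:subClassOf"):
            continue
        for prop in objects(rt, "reif:synchronizedProperty"):
            for subj in objects(r, "reif:subject"):
                for obj in objects(r, "reif:object"):
                    inferred.add((subj, prop, obj))
    # Report only triples that were not asserted already.
    return inferred - triples


# Hypothetical example data: one reified relationship between ex:A and ex:B.
EXAMPLE = {
    ("ex:ChildParentRelationship", "rdfs:subClassOf", SYNC_BASE),
    ("ex:ChildParentRelationship", "reif:synchronizedProperty", "ex:hasParent"),
    ("ex:rel1", "rdf:type", "ex:ChildParentRelationship"),
    ("ex:rel1", "reif:subject", "ex:A"),
    ("ex:rel1", "reif:object", "ex:B"),
    ("ex:rel1", "ex:index", "1"),  # extra attribute carried by the relationship
}

print(synchronize(EXAMPLE))
```

For this data the only inferred triple is ex:A ex:hasParent ex:B, while the index stays attached to the reified object.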

A worked-out example can be found here and the generic pattern can be imported from

http://www.topbraidcomposer.com/owl/2006/07/reification.owl

If you load the example file into TopBraid Composer, and set the rule engine to incremental mode and to operate on top of a Pellet inference graph, then the system will automatically infer that Holger is an instance of Parent. Note that you need to run the rules engine twice to combine Pellet with Jena and again with Pellet. A similar solution could be used for rdf:Statements as well.



Standards and Pseudo Standards

I participated in some of our company's training and consulting activities this week. One of the interesting issues that came up was the question of standards and how well they are supported by tools. Many participants elaborated on the pain of using SQL. SQL is a database query and interchange language that has been around for many, many years. However, all vendors have their own dialects and refuse to export their databases into an SQL dialect that other tools would understand without choking.

A similar case came up today when a user attempted to convert UML files into OWL, using TopBraid Composer. Due to changes in the UML specification and its reference implementation for Eclipse, his UML files became slightly incompatible and failed to load. We are still in the process of figuring out how to adjust MagicDraw to create the format that Eclipse needs and vice versa. I find it just incredible how slowly the UML standard comes along, given that the OMG has spent so many years on it already.

Enter the Semantic Web: For OWL you have a fine selection of tools that can really use the same input files without choking. Except perhaps in extreme cases, you can nowadays seamlessly move back and forth between SWOOP, Protege, TopBraid or whatever else you need. This is probably because no large tool vendors are trying to hijack the standard (as in the SQL and UML worlds), but likely also a result of rigorous and efficient standardization processes at the W3C.

Joy and pain of plug-in architectures

Both ontology editors that I have developed over the last few years have been based upon a pre-existing platform, using that platform's plug-in mechanism. In the case of Protege-OWL this was the Protege platform, which was originally built for frame-based knowledge modeling in the 1990s. In the case of TopBraid Composer, we selected Eclipse, a well-known general-purpose platform for all kinds of tools.

To those who have followed the evolution of Protege-OWL in 2003 and 2004 it should be obvious that we had to iterate and change quite a bit, leading to a lot of frustration among our dear users. The original vision of building an OWL plugin on top of Protege-frames was to be able to reuse pre-existing features and user interface components. This turned out to be a very unrealistic vision, and in the end I found myself rewriting essentially the whole platform, even though my fellow programmers made significant efforts to adjust the core platform to accommodate the OWL requirements as well. It just didn't fit: widgets that were developed for a frame-based knowledge model did not know anything about namespaces, supported different value types, had different layout conventions, assumed a closed world, and so on. They were obstacles rather than reusable components.

In TopBraid Composer the situation has been very different because I could start from scratch and could pick the platform and API that I thought would make the most sense. The obvious choices were Eclipse and Jena. I did not need to pay attention to the pseudo-backward compatibility of Protege and did not have to reuse unsuitable components and UI frameworks. Eclipse does a marvelous job as a base platform and is very enjoyable to implement against. It provides a clean plug-in mechanism through so-called extension points. Here is an example.

The screenshot below shows a new feature for the upcoming 1.1.4 build of TBC: A task list that allows users to attach "TODO" list items to resources in their ontology. For example, this feature can be used to annotate that a certain class definition should be changed in the future. Following the convention that we introduced in Protege-OWL, we use the owl:versionInfo property to store such items: any versionInfo starting with the string "TODO" is considered to be a todo list item. Java IDEs and other tools follow similar conventions to embed TODO items into source code.
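The convention itself is simple enough to sketch in a few lines of plain Python. This is only an illustration of the filtering idea over string triples, not how TopBraid Composer implements it; the todo_items function and the ex: resources are hypothetical.

```python
def todo_items(triples):
    """Return (resource, text) pairs for owl:versionInfo values starting with "TODO"."""
    return sorted(
        (s, o)
        for (s, p, o) in triples
        if p == "owl:versionInfo" and o.startswith("TODO")
    )


# Hypothetical annotations: only the first one follows the TODO convention.
ANNOTATIONS = [
    ("ex:Person", "owl:versionInfo", "TODO: split into Adult and Child"),
    ("ex:Person", "owl:versionInfo", "1.0"),
    ("ex:hasParent", "rdfs:comment", "TODO in a comment is ignored"),
]

for resource, text in todo_items(ANNOTATIONS):
    print(resource, "->", text)
```

Ordinary version strings and TODO-like text in other properties are left alone, so the convention does not interfere with the regular use of owl:versionInfo.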




One nice thing to say about Eclipse here is that it provides a complete framework to manage such todo items as so-called markers. Eclipse also comes with default windows to display all markers of a given type. One just needs to declare the new marker types and their attributes, and the system knows how to display and persist them. Besides the Task view shown above, I used the same mechanism for a (closed world) constraint checker that helps users find violations of owl:Restrictions in their instances.



Eclipse is flexible enough to support navigation so that a double-click on any item in the list will take you to the problem resource. All this is very nice and took less than a day to implement. However, no matter how flexible a UI platform may be, there will always be plug-in requirements that cannot be fulfilled. My experience is that while adding new features is always easy and nice, it is extremely challenging or impossible to remove existing features. In the case of the Task list I wanted to replace the button to create a new Task so that it automatically also creates a corresponding owl:versionInfo entry. Since Eclipse does not provide a clean mechanism to remove or replace existing buttons, I had to go through some rather painful work-arounds to get this going (ask me how if you're interested). At least it is working and no changes to Eclipse were needed :)

Sunday, July 23, 2006

Geospatial Ontologies in TopBraid Composer

Geospatial information such as latitude/longitude properties can serve as the foundation of many cool Semantic Web applications. It is great to see that the W3C has launched an Incubator Activity to establish a (de-facto) standard RDF vocabulary for geography. A preliminary version of this vocabulary can already be used, and tools are starting to support it.

I have always been very interested in all things related to maps. In fact, besides the library of Lonely Planet travel guides, I have two large world atlases as bedside reading. It was therefore a pleasure to implement geographical mapping support for our ontology platform TopBraid Composer this week. Here is a screenshot of TBC 1.1.3, which has been published today:


The new features allow users to specify coordinates for any resource in an ontology, and to visualize the coordinates in a dynamic (Google) map. I have created a short screencam video to demonstrate how it works:


Geography in TopBraid Composer Video

I particularly like the idea of combining geographical information with intelligent query languages such as SPARQL + Pellet. While these ideas are already being explored by several mash-up scripts and AJAX applications on the Web, I think it is invaluable to also have geographical support inside of an ontology development environment. Full-blown ontology editors can offer much greater flexibility and more rapid turn-around times for schema and instance creation.
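As a rough illustration of what querying over coordinates can look like, here is a small plain-Python sketch that selects resources inside a bounding box. It assumes the coordinates are stored with the lat and long properties of the W3C Basic Geo (WGS84) vocabulary; the within_box function, the ex: resources and their coordinates are made up, and a real application would more likely express this as a SPARQL filter.

```python
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"


def coordinates(triples):
    """Map each resource that has both geo:lat and geo:long to a (lat, long) pair."""
    lat, lon = {}, {}
    for s, p, o in triples:
        if p == GEO + "lat":
            lat[s] = float(o)
        elif p == GEO + "long":
            lon[s] = float(o)
    return {r: (lat[r], lon[r]) for r in lat if r in lon}


def within_box(triples, south, west, north, east):
    """Return resources inside the box (boxes crossing the antimeridian are not handled)."""
    return sorted(
        r
        for r, (la, lo) in coordinates(triples).items()
        if south <= la <= north and west <= lo <= east
    )


# Hypothetical resources with illustrative coordinates.
PLACES = [
    ("ex:SiteA", GEO + "lat", "37.4"),
    ("ex:SiteA", GEO + "long", "-122.0"),
    ("ex:SiteB", GEO + "lat", "48.1"),
    ("ex:SiteB", GEO + "long", "11.6"),
    ("ex:SiteC", GEO + "lat", "10.0"),  # no longitude, so it is skipped
]

print(within_box(PLACES, south=30.0, west=-130.0, north=40.0, east=-110.0))
```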

There are many more ideas for similar new features in our minds right now, and I am looking forward to bringing ontology design closer to ontology use.

Thursday, July 20, 2006

Composing the Semantic Web

Welcome to my blog! I will use this space to muse about various topics related to ontology development and the Semantic Web in general. This is based on my background as a tool developer, and my experience with ongoing ontology design projects.

As some may know, I am working with TopQuadrant, in their office at NASA Ames Research Park in Moffett Field, California. Before joining TopQuadrant, I was responsible for the Protege-OWL ontology editor from Stanford, and before that I worked in various research areas including agile software development and multi-agent systems.

Besides various customer projects with NASA, I am developing a tool known as TopBraid Composer, and some of my comments here should be seen in the light of this tool. I'll try to keep the product advertisement ratio low, while at the same time I will use this channel to notify our users about new or hidden features of TopBraid.