Scalable Ontology Systems

Event Type: Seminar

Date: March 06, 2008

Time: 10:00AM - 11:30AM

Venue: S-3-028

Abstract:

Ontologies have become commonplace as a way to represent both knowledge and data. The bio-medical field is a clear success story, with many bio-medical ontologies testing the limits of current knowledge representation systems.

Such systems typically use a relational database representation to store ontological data. However, the access patterns associated with querying and reasoning about ontologies differ substantially from those of traditional database queries, to the extent that performance degrades significantly when relational models are used.

In this talk, I will address three of the main challenges in building a scalable ontology system. First, I will describe an efficient alternative to reification for RDF data annotated with information such as probabilities, validity intervals or provenance. The Annotated RDF framework allows a user to add any type of partially-ordered metadata to an ontology, while keeping query processing times short compared with reified representations.
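To illustrate the contrast with reification, here is a minimal sketch in Python. It assumes the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object); the ex: names, the reify helper and the AnnotatedTriple type are hypothetical and are not the syntax of the Annotated RDF framework itself.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object


class AnnotatedTriple(NamedTuple):
    # Hypothetical shape: the annotation travels with the triple itself.
    subject: str
    predicate: str
    obj: str
    annotation: object


def reify(stmt_id, s, p, o, annotation_property, value):
    """Describe one statement (s, p, o) via standard RDF reification,
    then attach the annotation to the reified statement node."""
    return [
        Triple(stmt_id, "rdf:type", "rdf:Statement"),
        Triple(stmt_id, "rdf:subject", s),
        Triple(stmt_id, "rdf:predicate", p),
        Triple(stmt_id, "rdf:object", o),
        Triple(stmt_id, annotation_property, value),
    ]


# Illustrative data only: one statement with a probability of 0.8.
reified = reify("ex:stmt1", "ex:aspirin", "ex:treats", "ex:headache",
                "ex:probability", 0.8)
annotated = AnnotatedTriple("ex:aspirin", "ex:treats", "ex:headache", 0.8)

print(len(reified), "triples when reified versus 1 annotated triple")
```

In this sketch a query over reified data has to reassemble each statement from several rows before it can read the annotation, whereas the annotated form keeps the statement and its metadata in a single record.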

Second, I will describe methods of indexing RDF ontologies which are several times faster than their relational counterparts. Our GRIN indexing method avoids the computationally complex self-joins inherent in a relational-backed representation by relying on the locality property of queries.

More specifically, we show that we only need to iterate over a small subset of RDF resources to locate the smallest portion of the ontology guaranteed to contain the answers to a given query. I will also describe several experimental findings from comparisons to leading systems such as Jena2, RDFBroker and 3store.
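The self-joins mentioned above arise because a relational-backed store typically keeps all triples in one table, so a query with two connected patterns must join that table with itself. The Python sketch below illustrates only this cost, using made-up ex: data and a hypothetical drugs_for_conditions helper; it is not an implementation of GRIN.

```python
# A single "triple table", as a relational-backed store might hold it.
# All names and facts here are illustrative assumptions.
triples = [
    ("ex:aspirin", "ex:treats", "ex:headache"),
    ("ex:ibuprofen", "ex:treats", "ex:fever"),
    ("ex:headache", "ex:symptomOf", "ex:migraine"),
    ("ex:fever", "ex:symptomOf", "ex:influenza"),
]


def drugs_for_conditions(table):
    """Answer the two-pattern query
         (?drug, ex:treats, ?symptom) AND (?symptom, ex:symptomOf, ?condition)
    by joining the triple table with itself on ?symptom."""
    return sorted(
        (t1[0], t2[2])
        for t1 in table      # first copy of the table
        for t2 in table      # second copy: the self-join
        if t1[1] == "ex:treats"
        and t2[1] == "ex:symptomOf"
        and t1[2] == t2[0]   # join condition on the shared variable
    )


print(drugs_for_conditions(triples))
```

Each additional connected pattern in a query adds another copy of the table to the join, which is the cost an index that narrows the search to a small portion of the ontology aims to avoid.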

Third, I will present a novel ontology integration algorithm called ILIADS that combines statistical and logical inference to improve the quality of integrated ontologies. Some of our most interesting findings show that (i) matching schema and data at the same time yields significantly better recall than existing leading algorithms; (ii) the robustness of integrating two ontologies depends on how similar their characteristics are; and (iii) a little logical inference goes a long way in improving result quality.

Speaker: Octavian Udrea

Speaker Bio:

Octavian Udrea is currently a PhD student at the University of Maryland, College Park. His primary research interests include knowledge representation, automated reasoning and heterogeneous databases. He has also published several papers on activity-based querying of video databases and on automated code verification.