I attended the Association for the Advancement of Artificial Intelligence Spring Symposia, 2009, at
The symposium was organized by Li Ding and Jie Bao from Rensselaer Polytechnic Institute and Mark Greaves from Vulcan, Inc. Li Ding opened the discussion by describing how Semantic Web technologies may be poised to increase the range and effectiveness of Web 2.0 tools for information retrieval, social networking and collaboration. We spent the next two and a half days discussing examples of this technology and the issues its use introduces into how people interact with the Web.
A number of applications were described that bridge the gap between collaborative technology and semantics. Twine is a site that allows users to group links into what are called twines. A twine is a group of sites that are topically related. Tags are generated when a site is added to a twine, and domain ontologies are used to link different twines together and to recommend other twines that may interest a user. Radar Networks Inc. developed Twine, and its CEO, Nova Spivack, gave the first presentation. Twine looks like a very useful application. It is somewhat similar to del.icio.us in concept, but with explicit semantics.
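To make the idea of linking twines through their tags a little more concrete, here is a minimal sketch in Python. It is not Twine's actual method, which was described as relying on domain ontologies; this sketch swaps in plain tag overlap (Jaccard similarity) as a much simpler stand-in. The twine names, tags, and the `jaccard` and `recommend` functions are all hypothetical and made up for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Share of tags two twines have in common (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(twines: dict, followed: str, top_n: int = 3) -> list:
    """Rank the other twines by tag overlap with the one a user follows.

    Assumption: a twine is represented only by its set of tags. A real
    system would also consult an ontology to relate tags that differ
    in spelling but are close in meaning.
    """
    base = twines[followed]
    scored = [
        (name, jaccard(base, tags))
        for name, tags in twines.items()
        if name != followed
    ]
    # Keep only twines that share at least one tag, best match first.
    scored = [(name, score) for name, score in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical twines and tags, invented for this example.
twines = {
    "semantic-web": {"rdf", "owl", "ontology", "sparql"},
    "wikis": {"wiki", "collaboration", "ontology"},
    "cooking": {"recipes", "baking"},
    "linked-data": {"rdf", "sparql", "uri"},
}

suggestions = recommend(twines, "semantic-web")
for name, score in suggestions:
    print(f"{name}: {score:.2f}")
```

With this toy data, a user following the hypothetical "semantic-web" twine would be pointed to "linked-data" first and "wikis" second, while "cooking" is dropped because it shares no tags. An ontology-backed version could go further by recognizing that two differently named tags refer to related concepts.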
Denny Vrandecic from Institut AIFB,
Semantic MediaWiki was the basis for a number of other applications discussed at the symposium. One was Metavid.org, an “open video archive of the US Congress.” Metavid.org captures video and closed captioning of Congressional proceedings. Semantic MediaWiki’s extensions allow for categorical searches of recorded speeches.
The Halo Project, funded by Paul Allen’s Vulcan Inc. and sponsored by Mark Greaves, has developed extensions to Semantic MediaWiki that go a long way toward showing the power of embedding semantics in applications. The work was done by Ontoprise and they have produced a video of its features that is worth viewing.
Some of the applications discussed provide collaborative, distributed development environments for authoring ontologies. Tania Tudorache of the
In regard to architecting systems that use semantics to leverage Web 2.0 features, a number of approaches kept coming up. A few of the presenters mentioned ontologies for describing users' tagging behavior. These capture the relationships between taggers (two users who tag the same site with the same or similar tags) and the temporal dimension of tagging (“who tagged what tag when?”). Another common thread was defining a semantic layer to describe the syntactic or functional layers of a system. Hans-Georg Fill of the
Some other applications described at the conference use existing collaborative technology, such as Wikipedia, to jumpstart Semantic Web applications. Tim Finin described an approach that he and his colleagues at the
Our team presented a paper that described how the location of bloggers could be inferred from location entity mentions in their blog posts. We described an experiment in which we were able to correctly geolocate 61% of blogs in a test set of ~800 blogs with known locations. While our work was somewhat tangential to the Semantic Web, it is a demonstration of the “inference problem,” where information not stated directly can be inferred from other available information. This raises privacy issues, given the explosion in the use of social networking sites such as Facebook and the proliferation of personal Web logs. Three other papers presented at the symposium addressed privacy and access control issues. Mary-Ann Williams of the
Panels presented during the symposium addressed some cross-cutting issues for Web 2.0 and Semantic Web applications: usability, scale and privacy. On the 25th, the panel included Steve White of Radar Networks, Denny Vrandecic, Natasha Noy, Jamie Taylor, Minister of Information for Metaweb, the home of Freebase (an excellent open collaborative database), and Jeff Pollock of Oracle, the author of the recently published “The Semantic Web for Dummies.” This panel was dedicated to the topic of usability, but also addressed the issue of scale. All agreed that usability issues on the Semantic Web are the same as with Webs 1.0 and 2.0: simple is better, hide confusing bits like RDF and OWL tags, etc. Noy made the point, however, that there are different classes of users for semantic applications on the Web, such as the users of BioPortal and those actually involved in ontology development. A lot of time was spent talking about users of applications such as Excel and how even a killer application like the Semantic Web can be overtaken by simple, inelegant solutions. The issue of scale came down to how Semantic Web applications will handle billions of triples, and the difficulty of doing anything more than simple reasoning over such large amounts of data.
The next day’s panel included Paul Groth, Denny Vrandecic, Tim Finin and Rajesh Balakrishnan and touched on issues of privacy and trust. One conclusion of this discussion was that the structured metadata that comes with the Semantic Web, along with the ability to reason over the data (albeit probably in small bites), will just multiply the inference problem. There was no real consensus on what can be done about that.
This symposium did a great job of framing how social computing and semantics are quickly coming together. There was quite a bit of excitement about Twine and the success of Semantic MediaWiki. There was no clear consensus whether this technology will revolutionize the user experience or just provide enabling technology to intelligently link applications and make current functionalities such as search more effective. For developers, however, there is a whole new universe of challenges here.