FGIR 2007

From September 24 to September 26 I was in Halle (on the river Saale) in Germany attending FGIR 2007. FGIR is part of LWA, an annual event of several interest groups of the German Computer Society. The venue was the computer science institute of the Martin Luther University. The community represented by the different interest groups (all of them related to knowledge in some form) was very interesting, and for a 20€ registration fee you even got refreshments and snacks :) .
We had two papers there. In the first one presented, we focused in detail on the evolution of the associative retrieval component (which was first presented at i-Semantics). Besides presenting the results obtained, we argued in this paper why our chosen approach to evaluating information retrieval on the Semantic Desktop is valid. With the German interest group for information retrieval we had the perfect audience for this talk.
The other paper presented is a survey of current approaches to information retrieval in the Semantic Web and on the Semantic Desktop. In this paper we also try to define what information on the Semantic Web actually is.

Posted in academia, paper, search, semantic web

Search on the Semantic Web

Recently I discovered a nice survey on semantic web search engines. If you are interested in this topic (as I am) you should not miss it:

A Categorization Scheme for Semantic Web Search Engines (Esmaili & Abolhassani, 2006)

 

In addition, check the references on the Search on the Semantic Web page (of course there is some overlap; search on the Semantic Web is a young discipline :) ).

Posted in academia, paper, search, semantic web

(Semantic) Similarity-Blog

In my research on search in the Semantic Web, (semantic) similarity plays a crucial role. Today I discovered a great resource for semantic similarity: (Semantic) Similarity-Blog by Krzysztof Janowicz.

Posted in academia, blog, paper

DBLP data available in RDF format

As Danny Ayers notes, Chris Bizer and Richard Cyganiak yesterday announced an RDF version of DBLP, the popular computer science publication database. More info can be found on their D2R Server DBLP page.

Posted in academia, paper, semantic web, services

Semantic annotation for knowledge management contd.: The seven requirements

I previously blogged about a very comprehensive survey paper on semantic annotation. Besides the survey, the paper also lists seven requirements for semantic annotation systems in the context of a document-centred knowledge management approach and reviews them in the light of the annotation tools surveyed. Here are the seven requirements from Uren et al. (2006):

  • Requirement 1—standard formats: … Using standard formats is preferred, wherever possible, because the investment in marking up resources is considerable and standardization builds in future proofing because new tools, services, etc., which were not envisaged when the original semantic annotation was performed may be developed. Compliance with standards also frees companies from the constraints of proprietary formats when choosing knowledge management software. It is the activity of the W3C in developing and promoting international standards for the Semantic Web that has convinced us that this route is worth following in knowledge management. Two types of standard are required, standards for describing ontologies such as the Web Ontology Language OWL and standards for annotations such as the W3C’s RDF annotation schema.
    Many of the reviewed tools already use W3C standards.
  • Requirement 2—user centered/collaborative design: Annotation can potentially become a bottleneck if it is done by knowledge workers with many demands on their time. Since few organizations have the capacity to employ professional annotators, it is crucial to provide knowledge workers with easy to use interfaces that simplify the annotation process and place it in the context of their everyday work. A good approach would be a single point of entry interface, so that the environment in which users annotate documents is integrated with the one in which they create, read, share and edit them. System design also needs to facilitate collaboration between users, which is a key facet of knowledge work with experts from different fields contributing to and reusing intelligent documents. …
    … More attention needs to be paid to build in or plug-in semantic annotation facilities in commonly used packages to encourage knowledge workers to view annotation as part of the authoring process not as an afterthought, and also to supporting annotation in collaborative environments, …
  • Requirement 3—ontology support (multiple ontologies and evolution): … annotation tools need to be able to support multiple ontologies. For example, in a medical context, there may be one ontology for general metadata about a patient and other technical ontologies that deal with diagnosis and treatment. … In addition, systems will have to cope with changes made to ontologies over time, such as incorporating new classes or modifying existing ones. In this case, the problem is ensuring consistency between ontologies and annotations with respect to ontology changes. …
    … Ontology maintenance, which directly affects the maintenance of annotations, is poorly supported, or not supported at all, by the current generation of tools. This perhaps reflects the intended user groups; with the assumption being that knowledge workers will use existing ontologies rather than editing or creating them. … A genuinely integrated semantic annotation environment should give the user automatic support for ontology maintenance, for example, using text mining methods to suggest new classes as they emerge in documents and spotting inconsistencies between new and existing annotations. …
  • Requirement 4—support of heterogeneous document formats: … Documents will be in many different formats including word processor files, spreadsheets, graphics files and complex mixtures of different formats. This presents a technical challenge rather than a research challenge, but dealing with multiple document formats is a prerequisite for integrating annotation into existing work practices.
  • Requirement 5—document evolution (document and annotation consistency): Ontologies change sometimes but some documents change many times. … What should happen to the annotations on a document when it is revised, poses both technical and application specific questions. … Annotation environments need to help knowledge workers maintain appropriate annotations as documents change.
    The survey did not discover any concerted work on these lines.
  • Requirement 6—annotation storage options: The Semantic Web model assumes that annotations will be stored separately from the original document, whereas the “word processor” model assumes that comments are stored as an integral part of the document, which can be viewed or not as the reader prefers. …
    … However, separate storage of annotations has advantages for KM. … It also makes it easy to produce different views of a document for users with different roles in an organization or different access rights, thus facilitating knowledge sharing and collaboration. We therefore argue that separate storage is the better model, even when extra overheads are required to maintain links between a document and its annotations.
  • Requirement 7—automation: Another aspect of easing the knowledge acquisition bottleneck is the provision of facilities for automatic mark-up of document collections to facilitate the economical annotation of large document collections. To achieve this, the integration of knowledge extraction technologies into the annotation environment is vital. …
    Language technologies present usability challenges when deployed for knowledge workers since most are research tools or designed for use by specialists. … In addition to the usability challenges there are also research challenges, among which we have highlighted the extraction of relations as important for semantic annotation.
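
To make Requirement 6 a bit more concrete for myself, here is a minimal sketch of what "separate storage" could look like: annotations live in their own store and only point into the document via character offsets, so different roles can get different views of the same unchanged text. All names here (`Annotation`, `AnnotationStore`, the `med:` concepts, the roles) are my own hypothetical illustration and are not taken from Uren et al. or from any of the surveyed tools.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """A standoff annotation: it points into a document instead of living inside it."""
    doc_id: str
    start: int        # character offset where the annotated span begins
    end: int          # character offset just past the annotated span
    concept: str      # hypothetical ontology class, e.g. "med:Diagnosis"
    roles: frozenset  # roles that are allowed to see this annotation


@dataclass
class AnnotationStore:
    """Keeps annotations apart from the documents they describe."""
    _by_doc: dict = field(default_factory=dict)

    def add(self, annotation):
        self._by_doc.setdefault(annotation.doc_id, []).append(annotation)

    def view(self, doc_id, text, role):
        """Return (span text, concept) pairs visible to the given role."""
        return [
            (text[a.start:a.end], a.concept)
            for a in self._by_doc.get(doc_id, [])
            if role in a.roles
        ]


# The document itself is never modified; only the store knows about annotations.
documents = {"report-1": "Patient Smith was diagnosed with asthma."}

store = AnnotationStore()
store.add(Annotation("report-1", 8, 13, "med:Patient", frozenset({"clinician"})))
store.add(Annotation("report-1", 33, 39, "med:Diagnosis",
                     frozenset({"clinician", "researcher"})))

clinician_view = store.view("report-1", documents["report-1"], "clinician")
researcher_view = store.view("report-1", documents["report-1"], "researcher")
```

The price of this design is exactly the overhead the authors mention: the offsets are only valid as long as the document text does not change, which is where Requirement 5 (document evolution) comes in.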

Posted in annotation, knowledge management, paper, semantic web

Characterizing Semantic Web Applications contd.

These are directions for next generation Semantic Web applications (NGSWA) summarized from Language Technologies and the Evolution of the Semantic Web (Motta and Sabou 2006a):

  • Decoupling the process of engineering from that of exploiting the Semantic Web. NGSWA assume that they operate in an environment characterized by large scale, distributed semantic markup.
  • Operating with heterogeneous semantic markup and multiple ontologies. NGSWA have to deal with heterogeneous semantic markup.
  • Openness with respect to semantic resources. NGSWA allow new sources to be added and new ontologies to be integrated.
  • Scale more important than quality. While a lot of the emphasis in first-generation tools was on quality, NGSWA move away from traditional quality-centered expert systems, just as the Web differentiated itself from hypertext by allowing broken links.
  • WWW – We Want Web! Early Semantic Web applications are far more similar to classic knowledge-based systems than to the Semantic Web applications of the future. NGSWA try to bring the Semantic Web closer to the Web and also integrate Web Services in their functionalities.
  • From intelligent applications to harvesting collective intelligence. In NGSWA intelligence is also a byproduct of operating with large amounts of data. The users act as catalysts in deriving value from collectively gathered, tagged and shared semantic data, thus using the system to harvest collective intelligence.

See also my previous post on Characterizing Semantic Web Applications.

Posted in paper, semantic web

Semantic annotation for knowledge management: Requirements and a survey of the state of the art (Uren et al. 2006)

This seems to be the mother of all survey papers on Semantic Annotation. I have never seen such a complete overview of this topic: Semantic annotation for knowledge management: Requirements and a survey of the state of the art (Uren et al. 2006)

Besides the detailed survey on Semantic Annotation, they also present seven requirements for Semantic Annotation in a document-centric approach to Knowledge Management.

Posted in annotation, paper, semantic web

Characterizing Semantic Web Applications

Enrico Motta from KMI gave a talk at the Fourth Summer School on Ontological Engineering and the Semantic Web (SSSW’06) called Characterizing Semantic Web Applications, where he listed several (desired) characteristics of a (typical) Semantic Web application. I wrote him an email regarding his talk and asked if he had any additional resources on this topic. He kindly pointed me to the following two papers written by him and Marta Sabou:

These are (my) key points of the second paper (Motta and Sabou 2006b).
Properties of next generation Semantic Web applications (NGSWA) are:

  • Semantic data generation vs reuse. NGSWA are designed to operate with the semantic data that already exist. In other words, they worry less about bootstrapping a Semantic Web, than about providing mechanisms to exploit available semantic markup.
  • Single-ontology vs multi-ontology systems. NGSWA can consume any number of ontologies at the same time. It does not make much sense to make a ‘closed domain’ assumption.
  • Openness with respect to semantic resources. NGSWA take into account RDF data available from a particular Web site, in response to a request from a user who wishes to use them.
  • Scale as important as data quality. NGSWA are designed to operate at scale. NGSWA do not require any extra effort to bring in new sources.
  • Openness with respect to Web (non-semantic) resources. NGSWA integrate data acquisition mechanisms in their architecture.
  • Compliance with the Web 2.0 paradigm. NGSWA need to provide mechanisms for users to add and annotate data. Tools that support user annotation are still rather primitive, and better tools are badly needed.
  • Open to services. NGSWA seamlessly integrate scraping services into their data acquisition architectures.

Posted in paper, semantic web

Creating a Science of the Web (Berners-Lee et al. 2006)

James Hendler of Mindswap spreads the news of the paper Creating a Science of the Web, which he published together with Tim Berners-Lee, Wendy Hall, Nigel Shadbolt, and Daniel J. Weitzner.
In the paper they argue for a science of the web dealing with both technical and societal aspects of the web. According to them, this science “has its own ethos: decentralization to avoid social and technical bottlenecks, openness to the reuse of information in unexpected ways, and fairness.” (Berners-Lee et al. 2006)

Posted in paper, web