Queries
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-Information Networks
Existing approaches for link prediction, in the domain of network science, exploit a network's topology to predict future connections by assessing existing edges and connections, and inducing links given the presence of mutual nodes. Despite the rise in popularity of Attention-Information Networks (i.e. microblogging platforms) and the production of content within such platforms, no existing work has attempted to exploit the semantics of published content when predicting network links. In this paper we present an approach that fills this gap by a) predicting follower edges within a directed social network by exploiting concept graphs and thereby significantly outperforming a random baseline and models that rely solely on network topology information, and b) assessing the different behavior that users exhibit when making followee-addition decisions. This latter contribution exposes latent factors within social networks and the existence of a clear need for topical affinity between users for a follow link to be created.
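The abstract above combines topical affinity with network topology. As a minimal illustrative sketch (not the paper's model; all names and the blending weight are assumptions), one could score a candidate follow edge by mixing the cosine similarity of two users' concept vectors with a simple shared-followee feature:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two concept-frequency dicts."""
    common = set(u) & set(v)
    dot = sum(u[c] * v[c] for c in common)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def follow_score(concepts_a, concepts_b, followees_a, followees_b, alpha=0.5):
    """Blend topical affinity with a simple topology feature
    (fraction of shared followees); both components lie in [0, 1]."""
    topical = cosine(concepts_a, concepts_b)
    union = followees_a | followees_b
    topology = len(followees_a & followees_b) / len(union) if union else 0.0
    return alpha * topical + (1 - alpha) * topology
```

Here `alpha` trades off content against structure; the paper's finding that topical affinity matters corresponds to a non-trivial weight on the first term.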
Monika Solanki
Stanford University
Jamie Taylor
Coffee Break
Tejaswini Pendurthi
Université de Namur
Politecnico di Milano
Tsinghua University
Ricardo Kawase
The Linked Data Visualization Model
The potential of the semantic data available in the Web is enormous, but in most cases it is very difficult for users to explore and use this data. Applying information visualization techniques to the Semantic Web helps users easily explore large amounts of data and interact with them. We devise a formal Linked Data Visualization Model (LDVM), which allows data to be dynamically connected with visualizations.
Sebastian Walter
Knowledge Discovery
Web APIs have gained increasing popularity in recent Web service technology development, owing to the simplicity of their technology stack and the proliferation of mashups. However, efficiently discovering Web APIs and their documentation on the Web remains a challenging task even with the best resources available. In this paper we cast the detection of Web API documentation as a text classification problem: classifying a given Web page as Web API associated or not. We propose a supervised generative topic model called feature latent Dirichlet allocation (feaLDA), which offers a generic probabilistic framework for automatic detection of Web APIs. feaLDA not only captures the correspondence between data and the associated class labels, but also provides a mechanism for incorporating side information, such as labeled features automatically learned from data, that can effectively help improve classification performance. Extensive experiments on our Web API documentation dataset show that the feaLDA model outperforms three strong supervised baselines, including naive Bayes, support vector machines, and the maximum entropy model, by over 3% in classification accuracy. In addition, feaLDA gives superior performance when compared against other existing supervised topic models.
Feature LDA: a Supervised Topic Model for Automatic Detection of Web API Documentations from the Web
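To make the "labeled features as side information" idea concrete, here is a much cruder stand-in than feaLDA (a sketch, not the paper's model): a multinomial naive Bayes classifier where words believed to indicate a class receive extra pseudo-counts. All names, documents, and the boost value are illustrative assumptions.

```python
from collections import defaultdict
from math import log

def train_nb(docs, labels, labeled_features=None, boost=5.0):
    """Multinomial naive Bayes; words listed in `labeled_features`
    (word -> class) get `boost` pseudo-counts, a crude analogue of
    feaLDA's incorporation of labeled features."""
    labeled_features = labeled_features or {}
    counts = {c: defaultdict(float) for c in set(labels)}
    priors = defaultdict(float)
    for doc, c in zip(docs, labels):
        priors[c] += 1
        for w in doc.split():
            counts[c][w] += 1
    for w, c in labeled_features.items():
        counts[c][w] += boost
    return counts, priors

def classify(doc, counts, priors):
    """Pick the class with the highest smoothed log-likelihood."""
    vocab = {w for c in counts for w in counts[c]}
    total_docs = sum(priors.values())
    best, best_lp = None, float("-inf")
    for c in counts:
        denom = sum(counts[c].values()) + len(vocab)
        lp = log(priors[c] / total_docs)
        for w in doc.split():
            lp += log((counts[c][w] + 1) / denom)  # Laplace smoothing
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

This is the kind of supervised baseline the abstract compares against, not the topic model itself.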
Fluid Operations
Yongchun Xu
Thomas Eiter
Coffee Break
Bielefeld University
NTT DoCoMo
Chiara Ghidini
Lorenz Bühmann
The Semantic Vernacular System is a novel naming system for creating named, machine-interpretable descriptions for groups of organisms. Unlike the traditional scientific naming system, which is based on evolutionary relationships, it emphasizes the observable features of organisms. By independently naming the descriptions composed of sets of observational features, as well as maintaining connections to scientific names, it preserves the observational data used to identify organisms. The system is designed to support a peer-review mechanism for creating new names, and uses a controlled vocabulary encoded in the Web Ontology Language to represent the observational features. A prototype of the system is currently under development in collaboration with the Mushroom Observer website. It allows users to propose new names and descriptions, provide feedback on those proposals, and ultimately have them formally approved. This effort aims at offering the mycology community a knowledge base of fungal observational features and a tool for identifying fungal observations.
Semantic Vernacular System: an Observation-based, Community-powered, and Semantics-enabled Naming System for Organisms
Description Logic
Best Buy
Coffee Break
Tim Clark
Ontology-Based Access to Probabilistic Data with OWL QL
We propose a framework for querying probabilistic instance data in the presence of an OWL 2 QL ontology, arguing that the interplay of probabilities and ontologies is fruitful in many applications such as managing data that was extracted from the web. The prime inference problem is computing answer probabilities, and it can be implemented using standard probabilistic database systems. We establish a PTime vs. #P dichotomy for the data complexity of this problem by lifting a corresponding result from probabilistic databases. We also demonstrate that query rewriting (backwards chaining) is an important tool for our framework, show that non-existence of a rewriting into first-order logic implies #P-hardness, and briefly discuss approximation of answer probabilities.
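For intuition on "computing answer probabilities": under the usual tuple-independence assumption, and in the easy case where an answer's derivations use pairwise-disjoint tuples, the probability that the answer holds is one minus the probability that every derivation fails. A sketch (illustrative only; the general, non-safe case is exactly where the #P-hardness in the abstract arises):

```python
def answer_probability(derivations):
    """Probability that at least one derivation of an answer holds.
    Assumes a tuple-independent probabilistic database and derivations
    over pairwise-disjoint tuple sets (the 'safe' case).
    Each derivation is a list of tuple probabilities (a conjunction)."""
    p_none = 1.0
    for tuples in derivations:
        p_deriv = 1.0
        for p in tuples:           # conjunction: multiply tuple probabilities
            p_deriv *= p
        p_none *= (1.0 - p_deriv)  # independence across derivations
    return 1.0 - p_none
```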
Rahul Parundekar
North Carolina State University
Jacopo Urbani
Simplifying MIREOT; a MIREOT Protege Plugin
The Web Ontology Language (OWL) is a commonly used standard for creating ontology artifacts. However, its capabilities for reusing existing OWL artifacts in the creation of new artifacts are limited to the import of whole ontologies, even when only a small handful of classes, object properties, and so on (which we refer to generically as OWL components) are relevant. This situation can result in extremely large and unwieldy, or even broken, ontologies. To address this problem while still promoting ontology reuse, the OBI Consortium has elucidated the Minimum Information to Reference an External Ontology Term (MIREOT). We provide a suite of plugins to the Protege editor that greatly simplifies the use of MIREOT principles during ontology creation and editing.
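As a rough sketch of the MIREOT minimum (the term's IRI, its source ontology, and its direct superclass), hedged as an illustration rather than the plugin's actual behavior, with all IRIs hypothetical:

```python
def mireot_extract(term, superclass_of, labels, source_iri):
    """Return the MIREOT minimum for one external term instead of
    importing the whole source ontology.
    `superclass_of` maps a class IRI to its direct superclass IRI;
    `labels` carries human-readable labels for readability."""
    if term not in superclass_of:
        raise KeyError(f"unknown term: {term}")
    return {
        "term_iri": term,
        "source_ontology": source_iri,
        "direct_superclass": superclass_of[term],
        "label": labels.get(term, ""),
    }
```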
User Interfaces and Personalization
University of Applied Sciences and Arts Western Switzerland
Alexander Borgida
Andreas Thor
INRIA & LIG
Maximilian Nickel
Lunch
Monash University
Ontology Constraints in Incomplete and Complete Data
Ontology and other logical languages are built around the idea that axioms enable the inference of new facts about the available data. In some circumstances, however, the data is meant to be complete in certain ways, and deducing new facts may be undesirable. Previous approaches to this issue have relied on syntactically specifying certain axioms as constraints or adding in new constructs for constraints, and providing a different or extended meaning for constraints that reduces or eliminates their ability to infer new facts without requiring the data to be complete. We propose to instead directly state that the extension of certain concepts and roles are complete by making them DBox predicates, which eliminates the distinction between regular axioms and constraints for these concepts and roles. This proposal eliminates the need for special semantics and avoids problems of previous proposals.
Seppo Törmä
SPARQLoid - a Querying System using Own Ontology and Ontology Mappings with Reliability
Heterogeneity among ontologies on the Web of Data is an important problem, and a large body of research addresses it through ontology mapping, alignment, and matching. This paper presents SPARQLoid, an application that uses a query rewriting method to let users query any SPARQL endpoint with their own ontology, even when the available mappings are not sufficiently reliable. Ontology matching is a hard problem, and it often produces mappings of only limited reliability. Based on the reliability degrees assigned to those mappings, SPARQLoid allows users to query data in the target SPARQL endpoints using their own (or any specified) ontology, with results ordered by mapping reliability.
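The reliability-ordered rewriting can be sketched as follows (an assumption-laden illustration, not SPARQLoid's algorithm: term names, scores, and the product-based combination are all hypothetical):

```python
from itertools import product

def rank_rewritings(query_terms, mappings):
    """Enumerate candidate rewritings of a query's ontology terms and
    order them by combined reliability (product of per-term scores).
    `mappings` maps each user-ontology term to a list of
    (target_term, reliability) pairs."""
    candidates = [mappings[t] for t in query_terms]
    rewritings = []
    for combo in product(*candidates):
        terms = [t for t, _ in combo]
        score = 1.0
        for _, r in combo:
            score *= r
        rewritings.append((terms, score))
    return sorted(rewritings, key=lambda x: x[1], reverse=True)
```

Running the highest-scoring rewriting first yields the "sorted by mapping reliability" behavior the abstract describes.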
University of Victoria
Top-k queries, i.e. queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g. thousands) even if only a limited number k (e.g. ten) are requested. The SPARQL-RANK algebra is an extended SPARQL algebra that treats order as a first class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for SPARQL-RANK queries, we compare the performance of alternative physical operators, and we propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open source implementation of a SPARQL-RANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.
Efficient Execution of Top-K SPARQL Queries
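The contrast between materialize-then-sort and rank-aware evaluation can be illustrated in miniature (a sketch only; real engines like the SPARQL-RANK one interleave ranking with joins, which this does not show):

```python
import heapq

def topk_materialize_then_sort(solutions, score, k):
    """Baseline scheme: score every solution, sort them all, cut to k."""
    return sorted(solutions, key=score, reverse=True)[:k]

def topk_incremental(solutions, score, k):
    """Rank-aware alternative: keep only the k best seen so far in a
    bounded heap, so selection work stays O(n log k) instead of
    materializing and sorting all n solutions."""
    return heapq.nlargest(k, solutions, key=score)
```

Both return the same top-k; the second avoids holding and sorting the full result set, which is the source of the speedups the abstract reports.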
University of Liverpool
Using Linked Data To Improve Investment Performance
Instance Matching
The Curious Case for Semantics and Linked Data at the Enterprise
EPFL
Shilpa Arora
Birkbeck, University of London
Bell Labs
Nitish Aggarwal
The paper presents an approach for cost-based query planning for SPARQL queries issued over an OWL ontology using the OWL Direct Semantics entailment regime of SPARQL 1.1. The costs are based on information about the instances of classes and properties that are extracted from a model abstraction built by an OWL reasoner. A static and a dynamic algorithm are presented which use these costs to find optimal or near optimal execution orders for the atoms of a query. For the dynamic case, we improve the performance by exploiting an individual clustering approach that allows for computing the cost functions based on one individual sample from a cluster. Our experimental study shows that the static ordering usually outperforms the dynamic one when accurate statistics are available. This changes, however, when the statistics are less accurate, e.g., due to non-deterministic reasoning decisions.
Cost based Query Ordering over OWL Ontologies
Federico Piccinini
Viktor Prasanna
Free University of Bozen-Bolzano
Streaming and Geospatial DBMSs
Stardog Linked Data Catalog
Holger Wache
Alessio Palmero Aprosio
Zhe Wu
Matthias Thimm
MODUL University Vienna
Birmingham City University
Alternative Knowledge Representation Approaches
Giuliana Ucelli
Hybrid SPARQL queries: fresh vs. fast results
For Linked Data query engines, there are inherent trade-offs between centralised approaches that can efficiently answer queries over data cached from parts of the Web, and live decentralised approaches that can provide fresher results over the entire Web at the cost of slower response times. Herein, we propose a hybrid query execution approach that returns fresher results from a broader range of sources vs. the centralised scenario, while speeding up results vs. the live scenario. We first compare results from two public SPARQL stores against current versions of the Linked Data sources they cache; results are often missing or out-of-date. We thus propose using coherence estimates to split a query into a sub-query for which the cached data have good fresh coverage, and a sub-query that should instead be run live. Finally, we evaluate different hybrid query plans and split positions in a real-world setup. Our results show that hybrid query execution can improve freshness vs. fully cached results while reducing the time taken vs. fully live execution.
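The coherence-based split can be sketched as follows (an illustration under assumptions: the pattern representation, the per-predicate coherence estimates, and the threshold are all hypothetical, not the paper's):

```python
def split_query(patterns, coherence, threshold=0.9):
    """Split a query's triple patterns into a sub-query answered from
    the cache and a sub-query run live against the sources.
    `coherence` maps a predicate to an estimate in [0, 1] of how well
    the cached triples for it still match the live source."""
    cached, live = [], []
    for p in patterns:
        target = cached if coherence.get(p["predicate"], 0.0) >= threshold else live
        target.append(p)
    return cached, live
```

The engine would then join the cached sub-query's results with live lookups for the rest, trading freshness against response time as the abstract describes.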
Chris Chaulk
The Trials and Tribulations of a Semantic Technology Evangelist
Garlik
Manolis Koubarakis
Veronica Rizzi
Hassan Saif
AIFB, University of Karlsruhe
Tomi Kauppinen
Razan Paul
We present Tipalo, an algorithm and tool for automatically typing DBpedia entities. Tipalo identifies the most appropriate types for an entity by interpreting its natural language definition, which is extracted from its corresponding Wikipedia page abstract. Types are identified by means of a set of heuristics based on graph patterns, disambiguated to WordNet, and aligned to two top-level ontologies: WordNet supersenses and a subset of DOLCE+DnS Ultra Lite classes. The algorithm has been tuned against a gold standard built online by a group of selected users, and further evaluated in a user study.
Automatic typing of DBpedia entities
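For flavor, one definition-interpretation heuristic in the spirit of Tipalo (far cruder than the paper's graph patterns, and entirely hypothetical) reads the candidate type off the copular clause of a Wikipedia-style abstract:

```python
import re

# 'X is a Y' clause: optional leading 'The', entity, copula, optional
# article, then the head noun taken as the candidate type.
DEF_PATTERN = re.compile(
    r"^(?:The\s+)?(?P<entity>[\w\s]+?)\s+(?:is|was|are)\s+(?:an?\s+)?(?P<type>\w+)",
    re.IGNORECASE,
)

def type_from_definition(sentence):
    """Extract a candidate type from the first 'X is a Y' clause of a
    definition sentence; returns None when no clause matches."""
    m = DEF_PATTERN.match(sentence.strip())
    return m.group("type").lower() if m else None
```

The real system additionally disambiguates the extracted noun against WordNet and aligns it to the top-level ontologies named above.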
Chris Baillie
Massachusetts General Hospital
Provenance and Verification
Kno.e.sis, Wright State University
INRIA & University of Grenoble
Thomas Scharrenbach
Samsung Information Systems America
Paolo Ciancarini
W3C
Oleg Ruchayskiy
James Michaelis
Blazej Bulka
Doctoral Consortium
Sören Auer
Patrick Siehndel
Nigel Shadbolt
We present QAKiS, a system for open domain Question Answering over linked data. It addresses the problem of question interpretation as a relation-based match, where fragments of the question are matched to binary relations of the triple store, using relational textual patterns automatically collected. For the demo, the relational patterns are automatically extracted from Wikipedia, while DBpedia is the RDF data set to be queried using a natural language interface.
QAKiS: an Open Domain QA System based on Relational Patterns
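A toy version of the relation-based match (hand-written patterns standing in for the Wikipedia-mined ones; the DBpedia property names are illustrative assumptions):

```python
import re

# Each textual pattern is resolved to a DBpedia property; QAKiS mines
# such patterns from Wikipedia automatically, whereas these are manual.
PATTERNS = [
    (re.compile(r"who wrote (?P<ent>.+)\?", re.IGNORECASE), "dbo:author"),
    (re.compile(r"where was (?P<ent>.+) born\?", re.IGNORECASE), "dbo:birthPlace"),
]

def interpret(question):
    """Return (relation, entity) for the first matching pattern,
    or None when no pattern applies."""
    for pattern, relation in PATTERNS:
        m = pattern.match(question.strip())
        if m:
            return relation, m.group("ent").strip()
    return None
```

The matched relation and entity would then be turned into a SPARQL query against the DBpedia triple store.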
Thanos G. Stavropoulos
Michael Fink
Dan Brickley
Francoise Baude
Reusing XML Schemas' Information as a Foundation for Designing Domain Ontologies
Designing domain ontologies from scratch is a time-consuming endeavor requiring close collaboration with domain experts. However, domain descriptions such as XML Schemas are often available in early stages of the ontology development process. For my dissertation, I propose a method to convert XML Schemas to OWL ontologies automatically. The approach addresses the transformation of any XML Schema document by using the XML Schema metamodel, which is completely represented by the XML Schema Metamodel Ontology. All Schema declarations and definitions are automatically converted to class axioms, which are intended to be enriched with additional domain-specific semantic information in the form of domain ontologies.
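A bare-bones illustration of one step of such a conversion (a sketch of the general idea, not the dissertation's metamodel-based method; the base IRI is a placeholder): each named complex type becomes an OWL class declaration in Turtle.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def xsd_to_owl_classes(xsd_text, base="http://example.org/onto#"):
    """Emit an owl:Class declaration (as a Turtle line) for every
    named complexType in the given XML Schema document."""
    root = ET.fromstring(xsd_text)
    axioms = []
    for ct in root.iter(XS + "complexType"):
        name = ct.get("name")
        if name:  # anonymous complex types have no direct class counterpart
            axioms.append(f"<{base}{name}> a owl:Class .")
    return axioms
```

A full conversion would also map element declarations, attributes, and type derivations onto properties and class axioms.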
KAIST
Marta Sabou
Ludwig Maximilian University of Munich
This paper introduces the ontology mapping approach of a system that automatically integrates data sources into an ontology-based data integration system (OBDI). In addition to the domain and source ontologies, the mapping algorithm requires a SPARQL query to determine the ontology mapping. Further, the mapping algorithm is dynamic, running each time a query is processed and producing only a partial mapping sufficient to reformulate the query. This approach enables the mapping algorithm to exploit query semantics to correctly choose among ontology mappings that are indistinguishable when only the ontologies are considered. Also, the mapping associates paths with paths, instead of entities with entities. This approach simplifies query reformulation. The system achieves favorable results when compared to the algorithms developed for Clio, the best automated relational data integration system.
Queries, the Missing Link in Automatic Data Integration
Bastien Rance
Rommel N. Carvalho
Distributed Reasoning on Semantic Data Streams
Data streams are being continually generated in diverse application domains such as traffic monitoring, smart buildings, and so on. Stream Reasoning is the area that aims to combine reasoning techniques with data streams. In this paper, we present our approach to enable rule-based reasoning on semantic data streams using data flow networks in a distributed manner.
Swiss Federal Institute for Forest, Snow and Landscape Research
Philipp Fleiss
Seema Sundara
Anisa Rula
University of Oxford
University of Arkansas at Little Rock
Lora Aroyo
Roberto Garcia
Dave Reynolds
Zhejiang University
EXALEAD, Dassault Systèmes
Elena Cabrio
Semantic Web Challenge
Raúl García-Castro
Julien Law-To
Delft University of Technology
Konstantina Bereta
Evren Sirin
Irene Celino
Benjamin Grosof
Spyros Kotoulas
Robert Stevens
Reasoning in RDFS is Inherently Serial, at least in the worst case
Although it appears that reasoning in RDFS is embarrassingly parallel, this is not the case. Because all vocabulary is treated the same way in RDF, it is possible to extend the RDFS ontology vocabulary. This ability permits the creation of useful constructs that are not amenable to parallelism and that in the end require serial processing.
Kewen Wang
Charles University in Prague
William Hogan
Scalable Geo-thematic Query Answering
First order logic (FOL) rewritability is a desirable feature for query answering over geo-thematic ontologies because in most geoprocessing scenarios one has to cope with large data volumes. Hence, there is a need for combined geo-thematic logics that provide a sufficiently expressive query language allowing for FOL rewritability. The DL-Lite family of description logics is tailored towards FOL rewritability of query answering for unions of conjunctive queries, hence it is a suitable candidate for the thematic component of a combined geo-thematic logic. We show that a weak coupling of DL-Lite with the expressive region connection calculus RCC8 allows for FOL rewritability under a spatial completeness condition for the ABox. Stronger couplings allowing for FOL rewritability are possible only for spatial calculi as weak as the low-resolution calculus RCC2. Already a strong combination of DL-Lite with the low-resolution calculus RCC3 does not allow for FOL rewritability.
Cristina Sarasua
Very Large Scale OWL Reasoning through Distributed Computation
Due to recent developments in reasoning algorithms of the various OWL profiles, the classification time for an ontology has come down drastically. For all of the popular reasoners, in order to process an ontology, an implicit assumption is that the ontology should fit in primary memory. The memory requirements for a reasoner are already quite high, and considering the ever increasing size of the data to be processed and the goal of making reasoning Web scale, this assumption becomes overly restrictive. In our work, we study several distributed classification approaches for the description logic EL+ (a fragment of OWL 2 EL profile). We present the lessons learned from each approach, our current results, and plans for future work.
Interactive ontology debugging incorporates a user who answers queries about entailments of their intended ontology. In order to minimize the amount of user interaction in a debugging session, a user must choose an appropriate query selection strategy. However, the choice of an unsuitable strategy may result in tremendous overhead in terms of time and cost. We present a learning method for query selection which unites the advantages of existing approaches while overcoming their flaws. Our tests show the utility of our approach when applied to a large set of real-world ontologies, its scalability and adequate reaction time allowing for continuous interactivity.
RIO: Minimizing User Interaction in Ontology Debugging
Linked Data
Brian Kettler
Amit Sheth
Yahoo! Research
Formal Verification of Data Provenance Records
Data provenance is the history of derivation of a data artifact from its original sources. As real-life provenance records can cover thousands of data items and derivation steps, one of the pressing challenges is the development of formal frameworks for their automated verification. In this paper, we consider data expressed in standard Semantic Web ontology languages, such as OWL, and define a novel verification formalism called provenance specification logic, building on dynamic logic. We validate our proposal by modeling the test queries presented in the First Provenance Challenge, and conclude that the logical core of such queries can be successfully captured in our formalism.
Systap
Provenance is an increasingly important aspect of data management that is often underestimated and neglected by practitioners. In our work, we target the problem of reconstructing provenance of files in a shared folder setting, assuming that only standard filesystem metadata are available. We propose a content-based approach that is able to reconstruct provenance automatically, leveraging several similarity measures and edit distance algorithms, adapting and integrating them into a multi-signal pipeline. We discuss our research methodology and show some promising preliminary results.
Reconstructing Provenance
Quest: Efficient SPARQL-to-SQL for RDF and OWL
In this demo we introduce Quest, a new system that provides SPARQL query answering with support for OWL 2 QL and RDFS entailments. Quest allows linking the vocabulary of an ontology to the content of a relational database through mapping axioms. These are then used together with the ontology to answer a SPARQL query by means of a single SQL query that is executed over the database. Quest uses highly optimised query-rewriting techniques to generate the SQL query, which not only takes into account the entailments of the ontology and data, but is also 'lean' and simple so that it can be executed efficiently by any SQL engine. Quest supports commercial and open-source databases, including database federation tools like Teiid, to allow for Ontology-Based Data Integration of relational and other sources (e.g., CSV, Excel, XML). Here we briefly describe the Quest mapping language, the query answering process, and the most relevant optimisation techniques used by the system. We conclude with a brief description of the content of this demo.
University of Texas at Dallas
Technical University of Hamburg
Scalable and Domain-Independent Entity Coreference: Establishing High Quality Data Linkages Across Heterogeneous Data Sources
Due to the decentralized nature of the Semantic Web, the same real world entity may be described in various data sources and assigned syntactically distinct identifiers. In order to facilitate data utilization in the Semantic Web, without compromising the freedom of people to publish their data, one critical problem is to appropriately interlink such heterogeneous data. This interlinking process can also be referred to as Entity Coreference, i.e., finding which identifiers refer to the same real world entity. This proposal will investigate algorithms to solve this entity coreference problem in the Semantic Web in several aspects. The essence of entity coreference is to compute the similarity of instance pairs. Given the diversity of domains of existing datasets, it is important that an entity coreference algorithm be able to achieve good precision and recall across domains represented in various ways. Furthermore, in order to scale to large datasets, an algorithm should be able to intelligently select what information to utilize for comparison and determine whether to compare a pair of instances to reduce the overall complexity. Finally, appropriate evaluation strategies need to be chosen to verify the effectiveness of the algorithms.
TELEIOS is a recent European project that addresses the need for scalable access to petabytes of Earth Observation data and the discovery and exploitation of knowledge hidden in them. In this demo paper we demonstrate a fire monitoring service that we have implemented in the context of the TELEIOS project and explain how Semantic Web and Linked Data technologies allow the service to go beyond relevant services currently deployed in various Earth Observation data centers.
Real Time Fire Monitoring Using Semantic Web and Linked Data Technologies
Heiko Paulheim
Volker Tresp
Charles III University of Madrid
Rolf Grütter
Jan Vitek
Alexandre Passant
Trevor Collins
Li Tian
Replication for Linked Data
With the Semantic Web scaling up, and more triple-stores with update facilities becoming available, the need to keep multiple triple-stores with identical information in sync becomes more and more urgent. However, while such data replication approaches are common in the database community, there is no comprehensive approach to data replication for the Semantic Web. In this research proposal, we discuss the problem space and scenarios of data replication in the Semantic Web, and explain how we plan to deal with this issue.
University of Texas at El Paso
SPARQL Update for Complex Event Processing
Complex event processing is currently done primarily with proprietary definition languages. Future smart environments will require collaboration of multi-platform sensors operated by multiple parties. The goal of my research is to verify the applicability of standard-compliant SPARQL for complex event processing tasks. If successful, semantic web standards RDF, SPARQL and OWL with their established base of tools have many other benefits for event processing including support for interconnecting disjoint vocabularies, enriching event information with linked open data and reasoning over semantically annotated content. A software platform capable of continuous incremental evaluation of multiple parallel SPARQL queries is a key enabler of the approach.
Epimorphics Ltd
Adila A. Krisnadhi
Yuan-Fang Li
Bhavani Thuraisingham
Sean Felten
CSIRO Australia
Knowledge Pattern Extraction and their usage in Exploratory Search
Knowledge interaction in the Web context is a challenging problem. For instance, it requires dealing with complex structures able to filter knowledge by drawing a meaningful context boundary around data. We assume that these complex structures can be formalized as Knowledge Patterns (KPs), aka frames. This Ph.D. work aims at developing methods for extracting KPs from the Web and at applying KPs to exploratory search tasks. We want to extract KPs by analyzing the structure of Web links from rich resources, such as Wikipedia.
Riccardo Rosati
James Pustejovsky
Leora Morgenstern
University of Lleida
Mathieu d'Aquin
George Mason University
TasLab, Informatica Trentina SpA
Telecom Bretagne
5th International Workshop on Semantic Sensor Networks
Annika Hinze
Feiyu Xu
Cisco Systems
Philippe Cudré-Mauroux
Berlin Institute of Technology
Rensselaer Polytechnic Institute
Hans Vasquez-Gross
Jeff Z. Pan
Mark Greaves
Esteban Zimanyi
Jinhyung Kim
Composition of Linked Data-based RESTful Services
We address the problem of developing a scalable composition framework for Linked Data-based services, which retains the advantages of the loose coupling fostered by REST.
Giusy Di Lorenzo
Jay Myers
Two RDF instances are said to corefer when they are intended to denote the same thing in the world, for example, when two nodes of type foaf:Person describe the same individual. This problem is central to integrating and inter-linking semi-structured datasets. We are developing an online, unsupervised coreference resolution framework for heterogeneous, semi-structured data. The online aspect requires us to process new instances as they appear and not as a batch. The instances are heterogeneous in that they may contain terms from different ontologies whose alignments are not known in advance. Our framework encompasses a two-phased clustering algorithm that is both flexible and distributable, a probabilistic multidimensional attribute model that will support robust schema mappings, and a consolidation algorithm that will be used to perform instance consolidation in order to improve accuracy rates over time by addressing data sparseness.
Online Unsupervised Coreference Resolution for Semi-Structured Heterogeneous Data
Semantic Web Success Story: Practical Integration of Semantic Web Technology and Linked Data Principles in the Architecture and Implementation of an Enterprise Product
James Fan
Lockheed Martin
Detection, Representation, and Exploitation of Events in the Semantic Web
Falco Cescolini
Hanmin Jung
Wolters Kluwer
Ljiljana Stojanovic
Matteo Palmonari
York Sure-Vetter
David Norheim
Raphaël Troncy
Michael Uschold
Abraham Bernstein
Purdue University
Olaf Hartig
Stefano Fumeo
KMi, The Open University
Vincenzo Maltese
IIT Varanasi
Paul Buitelaar
Google Inc.
Naoki Fukuta
Krzysztof Janowicz
In real-world cases, building reliable problem-centric views over Linked Data is a challenging task. An ideal method should include a formal representation of the requirements of the needed dataset and a controlled process moving from the original sources to the outcome. We believe that a goal-oriented approach, similar to the AI planning problem, could be successful in controlling the process of Linked Data fusion, as well as in formalizing the relations between requirements, process, and result.
Towards a theoretical foundation for the harmonization of linked data
University of Aberdeen
P.E.S. Institute of Technology
Thanh Tran
David De Roure
Anika Schumann
Yuan Ni
Caroline Barrière
Jennifer Golbeck
Olaf Görlitz
A Multi-Domain Framework for Community Building Based on Data Tagging
In this paper, we present a doctoral thesis which introduces a new approach to enriching time series with semantics. The paper shows the problem of assigning time series data to the right party of interest and why this problem could not be solved so far. We demonstrate a new way of processing semantic time series and the consequent ability to address users. The combination of time series processing and Semantic Web technologies leads us to a new, powerful method of data processing and data generation, which offers completely new opportunities to the expert user.
University of New South Wales
Danh Le Phuoc
Charalampos Nikolaou
Christophe Guéret
Emily Merrill
Jérôme Euzenat
Mathias Brochhausen
Luciano Serafini
University of Montpellier
Question Answering and NLP
Financial Market Sensing via Querying a Time-Dependent RDF Graph of Sentiment Indicators Aggregated from Web-Based News Articles
Ulrike Sattler
Philippe Thiran
Aston University
Miriam Fernandez
Manos Karpathiotakis
Jans Aasman
Nicola Guarino
Adrian Paschke
Fairtrace - Tracing the textile industry
Applications
Alexander Löser
Thomas Steiner
Richard Cyganiak
Nanjing University
Bernardo Cuenca Grau
Harokopio University
Scalable semantic processing of huge, distributed real-time streams: Semantics Between Event Processing and Cloud Computing
Registration
Drupal as a Semantic Web platform
eGov and Smart Cities
Ramakanth Kavuluru
Valerie Issarny
LD4D: Linked Data for Development
Hugh Glaser
Vit Novacek
Daniel Gerber
Customer Adoption of Semantic Web Technologies - Sharing our Experience at Oracle
Infrastructure
Registration
Chenyang Wu
SAIC
Universidade Nova de Lisboa
Financial Information Management Using the Semantic Web
On Demand Access to Big Data through Semantic Technologies
Pitney Bowes
Pedro Debevere
Sebastian Krause
University of Edinburgh
Vinh Nguyen
Heiner Stuckenschmidt
Getting to know PROV - the W3C Provenance Specifications
Research Track
Yohei Murakami
Shanghai Jiao Tong University
Vulcan Inc.
Linked Enterprise Data: leveraging the Semantic Web stack in a corporate IS environment
Vanessa Lopez
Xing Niu
Matthew Perry
National Library of Medicine
LL-NLP Tutorial: What to do with long literals? Ask the NLP community...
Martin Serrano
German Research Centre for Artificial Intelligence (DFKI)
Linked Data at The Open University: From Technical Challenges to Organizational Innovation
Spiros Skiadopoulos
Christos Tryfonopoulos
Tom De Nies
Paolo Ciccarese
Arantza Illarramendi
Armin Haller
The Web of Data for E-Commerce in Brief
Wolfgang Nejdl
Semantic Web in Biomedicine
Laurens Rietveld
Fortifying a SPARQL Endpoint for Enterprise Usage Scenarios
IBM China Research Laboratory
Siegfried Handschuh
Applied Reasoning and Querying
Michael Schmidt
Harry Halpin
Making Sense of Research with Rexplore
While there are many tools and services which support the exploration of research data, by and large these tend to provide a limited set of functionalities, which cover primarily ranking measures and mechanisms for relating authors, typically on the basis of simple co-authorship relations. To try and improve over the current state of affairs, we are developing a novel tool for exploring research data, which is called Rexplore. Rexplore builds on an intelligent algorithm for automatically identifying hierarchical and equivalence relations between research areas, to provide a variety of functionalities and visualizations to help users to make sense of a research area. These include visualizations to detect trends in research; ways to cluster authors according to several dynamic similarity measures; and fine-grained mechanisms for ranking authors, taking into account parameters such as ranking criterion, career stage, calendar years, publication venues, etc.
The amount of data available in the Linked Data cloud continues to grow. Yet, few services consume and produce linked data. There is recent work that allows a user to define a linked service from an online service, which includes the specifications for consuming and producing linked data, but building such models is time consuming and requires specialized knowledge of RDF and SPARQL. This paper presents a new approach that allows domain experts to rapidly create semantic models of services by demonstration in an interactive web-based interface. First, the user provides examples of the service request URLs. Then, the system automatically proposes a service model the user can refine interactively. Finally, the system saves a service specification using a new expressive vocabulary that includes lowering and lifting rules. This approach empowers end users to rapidly model existing services and immediately use them to consume and produce linked data.
Rapidly Integrating Services into the Linked Data Cloud
Semantic Web Rules: Fundamentals, Applications, and Standards
Harith Alani
Peter Patel-Schneider
Bernhard Schandl
Michael Kifer
Alexey Boyarsky
MeDetect: Domain Entity Annotation in Biomedical References Using Linked Open Data
Recently, with the ever-growing volume of textual medical records, annotating domain entities has become an important task in the biomedical field. At the same time, the process of interlinking open data sources is actively pursued within the Linking Open Data (LOD) project, and the numbers of entities and of properties describing semantic relationships between entities within the linked data cloud are very large. In this paper, we propose a knowledge-intensive approach based on LOD for entity annotation in the biomedical field. With this approach, we implement MeDetect, a prototype system that addresses the problems mentioned above. The experimental results verify the effectiveness and efficiency of our approach.
University of Peloponnese
Learning on Linked Data: Tensors and their Applications in Graph-Structured Domains
Link Discovery with Guaranteed Reduction Ratio in Affine Spaces with Minkowski Measures
Time-efficient algorithms are essential to address the complex linking tasks that arise when trying to discover links on the Web of Data. Although several lossless approaches have been developed for this exact purpose, they do not offer theoretical guarantees with respect to their performance. In this paper, we address this drawback by presenting the first Link Discovery approach with theoretical quality guarantees. In particular, we prove that given an achievable reduction ratio r, our Link Discovery approach HR3 can achieve a reduction ratio r'<=r in a metric space where distances are measured by means of a Minkowski metric of any order p >= 2. We compare HR3 and the HYPPO algorithm implemented in LIMES 0.5 with respect to the number of comparisons they carry out. In addition, we compare our approach with the algorithms implemented in the state-of-the-art frameworks LIMES 0.5 and SILK 2.5 with respect to runtime. We show that HR3 outperforms these previous approaches with respect to runtime in each of our four experimental setups.
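The space-tiling idea behind HR3-style link discovery can be illustrated in a few lines: index points into hypercubes whose side equals the distance threshold, then compare only points in the same or adjacent cubes. The sketch below is ours, not the LIMES implementation; it shows why such blocking stays lossless (every pair within the threshold lands in neighbouring cubes) while skipping most comparisons.

```python
from itertools import product

def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def grid_links(points, theta, p=2):
    """Lossless link discovery sketch: index points into hypercubes of
    side theta, then compare only points in the same or adjacent cubes.
    Any pair within distance theta differs by at most theta in each
    coordinate, so it falls into neighbouring cubes and is never missed;
    distant pairs are never compared."""
    index = {}
    for i, pt in enumerate(points):
        cube = tuple(int(c // theta) for c in pt)
        index.setdefault(cube, []).append(i)
    links, compared = set(), 0
    for cube, members in index.items():
        for offset in product((-1, 0, 1), repeat=len(cube)):
            neigh = tuple(c + o for c, o in zip(cube, offset))
            for i in members:
                for j in index.get(neigh, []):
                    if i < j:  # each unordered pair is checked exactly once
                        compared += 1
                        if minkowski(points[i], points[j], p) <= theta:
                            links.add((i, j))
    return links, compared
```

The `compared` counter makes the reduction ratio observable: on uniformly scattered points with a small threshold, it is far below the n(n-1)/2 comparisons of the brute-force approach while the discovered link set is identical.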
Daniela Ferrari
David Shamma
Rafael S. Gonçalves
Pavlos Fafalios
Tenforce
Josef Hardi
Michael Gruninger
Abir Qasem
University of Queensland
Mining Patterns from Clinical Trial Annotated Datasets by Exploiting the NCI Thesaurus
Annotations of clinical trials with controlled vocabularies of drugs and diseases encode scientific knowledge that can be mined to discover relationships between scientific concepts. We present PAnG (Patterns in Annotation Graphs), a tool that relies on dense subgraphs, graph summarization and taxonomic distance metrics, computed using the NCI Thesaurus, to identify patterns.
Jena-HBase: A Distributed, Scalable and Efficient RDF Triple Store
Lack of scalability is one of the most significant problems faced by single-machine RDF data stores. The advent of Cloud Computing and related tools and technologies has paved the way for a distributed ecosystem of RDF triple stores that can potentially allow up to planet-scale storage along with distributed query processing capabilities. Towards this end, we present Jena-HBase, an HBase-backed triple store that can be used with the Jena framework. Jena-HBase provides end-users with a scalable storage and querying solution that supports all features from the RDF specification.
Frank van Harmelen
Alejandro Rodríguez González
BBC
Kathryn Laskey
Bundeswehr University Munich
The Not-So-Easy Task of Computing Class Subsumptions in OWL RL
The lightweight ontology language OWL RL is used for reasoning with large amounts of data. To this end, the W3C standard provides a simple system of deduction rules, which operate directly on the RDF syntax of OWL. Several similar systems have been studied. However, these approaches are usually complete for instance retrieval only. This paper asks if and how such methods could also be used for computing entailed subclass relationships. Checking entailment for arbitrary OWL RL class subsumptions is co-NP-hard, but tractable rule-based reasoning is possible when restricting to subsumptions between atomic classes. Surprisingly, however, this cannot be achieved in any RDF-based rule system, i.e., the W3C calculus cannot be extended to compute all atomic class subsumptions. We identify syntactic restrictions to mitigate this problem, and propose a rule system that is sound and complete for many OWL RL ontologies.
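The rule-based flavour of reasoning discussed here can be illustrated with a single OWL RL deduction rule, scm-sco (transitivity of rdfs:subClassOf), applied to a naive fixpoint. The abstract's point is precisely that no such rule set can compute all atomic class subsumptions, so this sketch shows the mechanism only, not a complete calculus.

```python
def saturate_subclass(axioms):
    """Naive forward chaining of the OWL RL rule scm-sco:
    if A subClassOf B and B subClassOf C, then A subClassOf C.
    'axioms' is a set of (subclass, superclass) pairs over atomic
    class names; the result is their transitive closure."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        # re-scan until no rule application adds a new pair (fixpoint)
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure
```

Real rule engines replace this quadratic re-scan with semi-naive evaluation, but the fixpoint semantics is the same.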
Sebastian Tramp
Esko Nuutila
Fabio Ciravegna
Carlo Allocca
Kathryn Dunn
Linked Data Fusion in ODCleanStore
As part of the LOD2 project and the OpenData.cz initiative, we are developing ODCleanStore, a framework enabling management of Linked Data. In this paper, we focus on query-time data fusion in ODCleanStore, which provides data consumers with integrated views on Linked Data; the fused data (1) has conflicts resolved according to the preferred conflict resolution policies and (2) is accompanied by provenance and quality scores, so that consumers can judge the usefulness and trustworthiness of the data for the task at hand.
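Query-time conflict resolution of the kind described above can be sketched as a small policy function. The policy names and data shapes below are illustrative assumptions, not ODCleanStore's actual API: each candidate value arrives with its source and a source-quality score, and the chosen policy picks a value while keeping provenance and an aggregate quality.

```python
from collections import Counter

def fuse(values, policy="best_source"):
    """Fusion sketch: 'values' is a list of (value, source, quality)
    triples for one subject/property pair. Conflicts are resolved by
    the chosen policy, and the result carries provenance and a quality
    score so consumers can judge trustworthiness."""
    if policy == "best_source":
        # keep the value asserted by the highest-quality source
        value, source, quality = max(values, key=lambda v: v[2])
        return {"value": value, "provenance": [source], "quality": quality}
    if policy == "vote":
        # keep the most frequently asserted value; average the quality
        # of the sources that support it
        counts = Counter(v[0] for v in values)
        value, _ = counts.most_common(1)[0]
        supporters = [(s, q) for v, s, q in values if v == value]
        return {"value": value,
                "provenance": [s for s, _ in supporters],
                "quality": sum(q for _, q in supporters) / len(supporters)}
    raise ValueError("unknown policy: " + policy)
```
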
Robert Engels
Enrico Motta
RDF Query Processing in the Cloud
Pallavi Karanth
CMU
Trient Consulting Group Srl
Riichiro Mizoguchi
Thorsten Krueger
Xingjian Zhang
Jane Hunter
Pennsylvania State University
MORe: Modular Combination of OWL Reasoners for Ontology Classification
Classification is a fundamental reasoning task in ontology design, and there is currently a wide range of reasoners highly optimised for classification of OWL 2 ontologies. There are also several reasoners that are complete for restricted fragments of OWL 2 , such as the OWL 2 EL profile. These reasoners are much more efficient than fully-fledged OWL 2 reasoners, but they are not complete for ontologies containing (even if just a few) axioms outside the relevant fragment. In this paper, we propose a novel classification technique that combines an OWL 2 reasoner and an efficient reasoner for a given fragment in such a way that the bulk of the workload is assigned to the latter. Reasoners are combined in a black-box modular manner, and the specifics of their implementation (and even of their reasoning technique) are irrelevant to our approach.
Steven Jennings
Leigh Dodds
IBM
Vasant Honavar
FOAF project
Simón Bolívar University
Sapienza University of Rome
Key-Sun Choi
Université de Montréal
Posters and Demos
Université Paris-Est, LIGM
Pascal Hitzler
Tania Tudorache
Christian Meilicke
Özgür Lütfü Özcep
Technical University Munich
Trinity College, Dublin
Antidot
Matthias Klusch
Kelly Reynolds
Willem Robert Van Hage
ourSpaces - A Semantic Virtual Research Environment
In this demo we present ourSpaces, a semantic Virtual Research Environment designed to support inter-disciplinary research teams. The system utilizes technologies such as OWL, RDF and a rule-based reasoner to support the management of provenance information, social networks, online communication and policy enforcement within the VRE.
Sebastian Hellmann
EURECOM
On the Diversity and Availability of Temporal Information in Linked Open Data
An increasing amount of data is published and consumed on the Web according to the Linked Data paradigm. In consideration of both publishers and consumers, the temporal dimension of data is important. In this paper we investigate the characterisation and availability of temporal information in Linked Data at large scale. Based on an abstract definition of temporal information we conduct experiments to evaluate the availability of such information using the data from the 2011 Billion Triple Challenge (BTC) dataset. Focusing in particular on the representation of temporal meta-information, i.e., temporal information associated with RDF statements and graphs, we investigate the approaches proposed in the literature, performing both a quantitative and a qualitative analysis and proposing guidelines for data consumers and publishers. Our experiments show that the amount of temporal information available in the LOD cloud is still very small; several different models have been used on different datasets, with a prevalence of approaches based on the annotation of RDF documents.
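An availability measurement of this kind can be approximated with a simple scan over triples. The predicate list below is a small illustrative sample (Dublin Core terms properties), far narrower than the catalogue a study over the BTC dataset would actually use:

```python
# Illustrative sample of temporal predicates; a real survey would use a
# much broader catalogue of date/time properties.
TEMPORAL_PREDICATES = {
    "http://purl.org/dc/terms/created",
    "http://purl.org/dc/terms/modified",
    "http://purl.org/dc/terms/date",
}

def temporal_coverage(triples):
    """Fraction of subjects carrying at least one temporal property,
    over an iterable of (subject, predicate, object) triples."""
    subjects = {s for s, p, o in triples}
    dated = {s for s, p, o in triples if p in TEMPORAL_PREDICATES}
    return len(dated) / len(subjects) if subjects else 0.0
```
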
Yves Raimond
Denny Vrandecic
Image & Pervasive Access Lab (IPAL), UMI CNRS
IBM Haifa Research Laboratory
Diego Calvanese
Ivan Herman
Letizia Tanca
University of the Basque Country
University of Innsbruck
University of Southern California
Martin Stephenson
Geert-Jan Houben
Aldo Gangemi
The Open University
National and Kapodistrian University of Athens
Case Western Reserve University
Gnowsis
Minh Dao-Tran
Joint Workshop on Scalable and High-Performance Semantic Web Systems
Hima Yalamanchili
Krishnaprasad Thirunarayan
Kostis Kyzirakos
Arne Peters
Brandeis University
Padmashree Ravindra
Marco Balduini
Tommaso Di Noia
Jan Michelfeit
Qualcomm
Ian Dickinson
Diana Maynard
Kazuhiko Ohe
Render
Kouji Kozaki
DERI, Galway
Carsten Lutz
Dave Kolas
Nuance Communications
École Nationale Supérieure des Mines de Saint-Étienne
Extracting Relevant Subgraphs from Graph Navigation
The main goal of current Web navigation languages is to retrieve the set of nodes reachable from a given node. No information is provided about the fragments of the Web navigated to reach these nodes; in other words, information about their connections is lost. This paper presents an efficient algorithm to extract relevant parts of these Web fragments and shows the importance of producing subgraphs in addition to sets of nodes. We discuss examples with real data using an implementation of the algorithm in the EXpRESs tool.
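The difference between returning reachable nodes and returning the navigated fragment can be shown with a small breadth-first navigation that records the traversed edges. This is a simplification of what the paper's algorithm extracts: it keeps every traversed edge rather than pruning to the relevant part.

```python
from collections import deque

def navigate(graph, start):
    """Navigation that keeps the traversed fragment: returns both the
    set of nodes reachable from 'start' and the edges actually used
    during navigation, so the connection information is not lost.
    'graph' maps each node to a list of successor nodes."""
    reached, edges = {start}, set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            edges.add((node, succ))  # remember how we got there
            if succ not in reached:
                reached.add(succ)
                queue.append(succ)
    return reached, edges
```

A plain navigation language would return only the first component of the result; the second component is the subgraph that makes the answer interpretable.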
Marine Biological Laboratory
Julien Cojan
Aditya Kalyanpur
Alberto Lavelli
Concept-Based Semantic Difference in Expressive Description Logics
Detecting, much less understanding, the difference between two description logic based ontologies is challenging for ontology engineers due, in part, to the possibility of complex, non-local logic effects of axiom changes. First, it is often quite difficult to even determine which concepts have had their meaning altered by a change. Second, once a concept change is pinpointed, the problem of distinguishing whether the concept is directly or indirectly affected by a change has yet to be tackled. To address the first issue, various principled notions of ``semantic diff'' (based on deductive inseparability) have been proposed in the literature and shown to be computationally practical for the expressively restricted case of ELHr-terminologies. However, problems arise even for such limited logics as ALC: First, computation gets more difficult, becoming undecidable for logics such as SROIQ which underlie the Web Ontology Language (OWL). Second, the presence of negation and disjunction make the standard semantic difference too sensitive to change: essentially, any logically effectual change always affects all terms in the ontology. In order to tackle these issues, we formulate the central notion of finding the minimal change set based on model inseparability, and present a method to differentiate changes which are specific to (thus directly affect) particular concept names. Subsequently we devise a series of computable approximations, and compare the variously approximated change sets over a series of versions of the NCI Thesaurus (NCIt).
Héctor Pérez-Urbina
Evaluation of a layered approach to question answering over linked data
We present a question answering system architecture which processes natural language questions in a pipeline consisting of five steps: i) question parsing and query template generation, ii) lookup in an inverted index, iii) string similarity computation, iv) lookup in a lexical database in order to find synonyms, and v) semantic similarity computation. These steps are ordered with respect to their computational effort, following the idea of layered processing: questions are passed along the pipeline only if they cannot be answered on the basis of earlier processing steps, thereby invoking computationally expensive operations only for complex queries that require them. In this paper we present an evaluation of the system on the dataset provided by the 2nd Open Challenge on Question Answering over Linked Data (QALD-2). The main, novel contribution is a systematic empirical investigation of the impact of the individual processing components on the overall performance of question answering over linked data.
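The layered-processing idea described in the abstract — try cheap steps first, escalate only on failure — can be sketched generically. The function and layer names below are hypothetical, not the system's actual API.

```python
def layered_answer(question, layers):
    """Run processing layers ordered cheapest-first; return the name of the
    first layer that produces an answer, together with that answer.
    Layers signal "no answer" by returning None."""
    for name, layer in layers:
        answer = layer(question)
        if answer is not None:
            return name, answer  # later, more expensive layers are never invoked
    return None, None
```

For example, with an exact index lookup as the first layer and a similarity-based fallback as the second, a question found in the index never triggers the fallback:

```python
layers = [
    ("index", lambda q: {"capital of france": "Paris"}.get(q)),
    ("similarity", lambda q: "approximate answer"),
]
```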
Fouad Zablith
INSTANS: High-Performance Event Processing with Standard RDF and SPARQL
Smart environments require collaboration of multi-platform sensors operated by multiple parties. Proprietary event processing solutions do not offer enough interoperation flexibility, easily leading to overlapping functions that waste hardware and software resources as well as data communications. Our goal is to verify the applicability of standard-compliant SPARQL to any complex event processing task. If found feasible, the semantic web methods RDF, SPARQL, and OWL have built-in support for interconnecting disjoint vocabularies, enriching event information with linked open data, and reasoning over semantically annotated content, yielding a very flexible event processing environment. Our approach is designed to meet these requirements. Our INSTANS platform, based on continuous execution of interconnected SPARQL queries using the Rete algorithm, is a new approach showing improved performance on event processing tasks over current SPARQL-based solutions.
University of Surrey
Jie Liu
Isabel Cruz
Journal Workshop Session
Valentina Maccatrozzo
Mari Carmen Suárez-Figueroa
Czech Technical University in Prague
Michael Hausenblas
Angel García Crespo
Most of the semantic content available has been generated automatically by using annotation services for existing content. Automatic annotation is not of sufficient quality to enable focused search and retrieval: either too many or too few terms are semantically annotated. User-defined semantic enrichment allows for a more targeted approach. We developed a tool for semantic annotation of digital documents and conducted an end-user study to evaluate its acceptance by and usability for non-expert users. This paper presents the results of this user study and discusses the lessons learned about both the semantic enrichment process and our methodology of exposing non-experts to semantic enrichment.
Semantic Enrichment by Non-Experts: Usability of Manual Annotation Tools
Ian Horrocks
Linked Stream Data Processing Engines: Facts and Figures
Linked Stream Data, i.e., the RDF data model extended for representing stream data generated from sensors and social network applications, is gaining popularity. This has motivated considerable work on developing corresponding data models and associated processing engines. However, currently implemented engines have not been thoroughly evaluated to assess their capabilities. To enable reasonable, systematic evaluations, in this work we propose a novel, customizable evaluation framework and a corresponding methodology for realistic data generation, system testing, and result analysis. Based on this evaluation environment, extensive experiments have been conducted in order to compare the state-of-the-art Linked Stream Data engines with respect to qualitative and quantitative properties, taking into account the underlying principles of stream processing. Consequently, we provide a detailed analysis of the experimental outcomes that reveals useful findings for improving current and future engines.
DiscOU: A Flexible Discovery Engine for Open Educational Resources Using Semantic Indexing and Relationship Summaries
We demonstrate the DiscOU engine implementing a resource discovery approach where the textual components of open educational resources are automatically annotated with relevant entities (using a named entity recognition system), so that these rich annotations can be searched by similarity, based on existing resources of interest.
Roman Prokofyev
Michael Ward
Marco Ruzzi
Laboratoire d’Informatique de l’Université Paris-Nord (LIPN) - UMR 7030 Université Paris 13 - CNRS
Deborah L. McGuinness
Usability and user satisfaction are of paramount importance when designing interactive software solutions. Furthermore, the optimal design can depend not only on the task but also on the type of user. Evaluations can shed light on these issues; however, very few studies have focused on assessing the usability of semantic search systems. As semantic search becomes mainstream, there is a growing need for standardised, comprehensive evaluation frameworks. In this study, we assess the usability and user satisfaction of different semantic search query input approaches (natural language and view-based) from the perspective of different user types (expert and casual). Contrary to previous studies, we found that casual users preferred the form-based query approach whereas expert users found the graph-based approach to be the most intuitive. Additionally, the controlled-language model offered the most support for casual users but was perceived as restrictive by experts, thus limiting their ability to express their information needs.
Evaluating Semantic Search Query Approaches with Expert and Casual Users
Personalised Graph-based Selection of Web APIs
Modelling and understanding the various contexts of users is important to enable personalised selection of Web APIs in directories such as ProgrammableWeb. Currently, relationships between users and Web APIs are not clearly understood or utilized by existing selection approaches. In this paper, we present a semantic model of a Web API directory graph that captures entities such as Web APIs, mashups, developers, and categories, together with the relationships between them. We describe a novel, configurable graph-based method for selection of Web APIs with personalised and temporal aspects. The method gives users more control over their preferences and the recommended Web APIs, while exploiting information about their social links and preferences. We evaluate the method on a real-world dataset from ProgrammableWeb.com, and show that it provides more contextualised results than currently available popularity-based rankings.
Jean Christoph Jung
University of Toronto
Jesse Weaver
Everything is Connected: Using Linked Data for Multimedia Narration of Connections between Concepts
This paper introduces a Linked Data application for automatically generating a story between two concepts in the Web of Data, based on formally described links. A path between two concepts is obtained by querying multiple linked open datasets; the path is then enriched with multimedia presentation material for each node in order to obtain a full multimedia presentation of the found path.
Lightning Talks
Juan F. Sequeda
Elsevier
Souripriya Das
Tracking user interests over time is important for making accurate recommendations. However, the widely used time-decay-based approach worsens the sparsity problem because it deemphasizes old item transactions. We introduce two ideas to solve the sparsity problem. First, we divide the users' transactions into epochs, i.e., time periods, and identify epochs that are dominated by interests similar to the current interests of the active user. Thus, we can eliminate dissimilar transactions while making use of similar transactions that exist in prior epochs. Second, we use a taxonomy of items to model user item transactions in each epoch. This captures the interests of users in each epoch well, even if there are few transactions. It suits situations in which the items transacted by users change dynamically over time: the semantics behind classes do not change so often, while individual items often appear and disappear. Fortunately, many taxonomies are now available on the web because of the spread of the Linked Open Data vision, and we can use those to understand dynamic user interests semantically. We evaluate our method using two datasets: a music listening history extracted from users' tweets, and a restaurant visit history gathered from a gourmet guide site. The results show that our method predicts user interests much more accurately than the previous time-decay-based method.
Collaborative Filtering by Analyzing Dynamic User Interests Modeled by Taxonomy
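The epoch-selection idea in the abstract — keep past epochs whose taxonomy-level interest profile resembles the active user's current interests, rather than decaying all old transactions by age — can be sketched as follows. The profile representation, function names, and threshold are illustrative assumptions, not the paper's actual formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse interest vectors (dicts)."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def similar_epochs(epoch_profiles, current, threshold=0.5):
    """Return past epochs whose taxonomy-class interest vector is similar
    to the current profile; dissimilar epochs are dropped entirely instead
    of being merely down-weighted by age."""
    return [epoch for epoch, profile in epoch_profiles.items()
            if cosine(profile, current) >= threshold]
```

With class-level counts such as `{'2010': {'rock': 3}, '2011': {'jazz': 2, 'rock': 1}}` and a current profile of `{'rock': 2}`, only the rock-dominated epoch survives the filter.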
We tackle the problem of improving the relevance of automatically selected tags in large-scale ontology-based information systems. Contrary to traditional settings where tags can be chosen arbitrarily, we focus on the problem of recommending tags (e.g., concepts) directly from a collaborative, user-driven ontology. We compare the effectiveness of a series of approaches to selecting the best tags, ranging from traditional IR techniques such as TF/IDF weighting to novel techniques based on ontological distances and latent Dirichlet allocation. All our experiments are run against a real corpus of tags and documents extracted from the ScienceWise portal, which is connected to ArXiv.org and is currently used by a growing number of researchers. The datasets for the experiments are made available online for reproducibility purposes.
Tag Recommendation for Large-Scale Ontology-Based Information Systems
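TF/IDF weighting, named in the abstract as the traditional baseline, ranks a candidate tag highly when it is frequent in the document but rare across the corpus. A minimal sketch (the function name and smoothing choice are assumptions, not the paper's implementation):

```python
import math
from collections import Counter

def tfidf_rank(doc_tokens, corpus, candidate_tags):
    """Rank candidate tags for one document by TF/IDF.
    `corpus` is a list of token lists used only for document frequencies."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for tag in candidate_tags:
        df = sum(1 for doc in corpus if tag in doc)       # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1       # smoothed inverse df
        scores[tag] = tf[tag] * idf
    return sorted(scores, key=scores.get, reverse=True)
```

Here a tag appearing twice in the document outranks a rarer corpus-wide tag appearing once, because term frequency dominates when idf values are close.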
Southeast University
Embedded EL+ Reasoning on Programmable Logic Controllers
Many industrial use cases, such as machine diagnostics, can benefit from embedded reasoning, the task of running knowledge-based reasoning techniques on embedded controllers as widely used in industrial automation. However, due to the memory and CPU restrictions of embedded devices like programmable logic controllers (PLCs), state-of-the-art reasoning tools and methods cannot be easily migrated to industrial automation environments. In this paper, we describe an approach to porting lightweight OWL 2 EL reasoning to a PLC platform to run in an industrial automation environment. We report on initial runtime experiments carried out on a prototypical implementation of a PLC-based EL+ reasoner in the context of a use case about turbine diagnostics.
Giorgos Stoilos
Olivier Bodenreider
Michael Schumacher
Thorsten Liebig
Kavitha Srinivas
University of Economics, Prague
STI International
Humboldt University of Berlin
Marco de Gemmis
Toward an ecosystem of LOD in the field: LOD content generation and its consuming service
This paper proposes to apply semantic technologies in a new domain: field research. It is said that if "raw data" is openly available on the Web, it will be used by other people to do wonderful things; but it is better to show a use case together with that data, especially in the dawn of LOD. Therefore, we are proceeding with both LOD content generation and its application for a specific domain. The application addresses an issue of information retrieval in the field, and the mechanism of LOD generation from the Web might be applied to other domains. First, we demonstrate the use of our mobile application, which searches for a plant fitting the environmental conditions obtained by the smartphone's sensors. Then, we introduce our approach to LOD generation and present an evaluation showing its practical effectiveness.
Giulia Masotti
Joint Workshop on Large and Heterogeneous Data and Quantitative Formalization in the Semantic Web
Vaibhav Khadilkar
Songyun Duan
David Mizell
Web of Linked Entities
Grigoris Antoniou
Maciej Zaremba
José Luis Ambite
University of Modena and Reggio Emilia
Big Graph Data Panel
Wei Hu
Aris Gkoulalas-Divanis
Stuart Wrigley
Artificial Intelligence Journal
Tutorial Track
Jie Tang
Aristotle University of Thessaloniki
Massachusetts Institute of Technology
Peter Edwards
Robert Isele
Biomedical ontologies have become a mainstream topic in medical research. They represent important sources of evolved knowledge that may be automatically integrated in decision support methods. Grounding clinical and radiographic findings in concepts defined by a biomedical ontology, e.g., the Human Phenotype Ontology, enables us to compute semantic similarity between them. In this paper, we focus on using such similarity measures to predict disorders for undiagnosed patient cases in the bone dysplasia domain. Different methods for computing semantic similarity have been implemented, and all have been evaluated on how well they support achieving higher prediction accuracy. The outcome of this research enables us to understand the feasibility of developing decision support methods based on ontology-driven semantic similarity in the skeletal dysplasia domain.
Semantic similarity-driven decision support in the skeletal dysplasia domain
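One common way to compute semantic similarity between ontology-grounded findings, as described in the abstract, is to compare their ancestor sets in the is-a hierarchy. The sketch below uses a shared-ancestor (Jaccard) measure as a simple stand-in; the concept names are invented for illustration and the paper's actual measures may differ.

```python
def ancestors(term, parents):
    """All ancestors of `term` (inclusive) in an is-a hierarchy given as a
    child -> list-of-parents dict."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, []))
    return seen

def jaccard_similarity(a, b, parents):
    """Shared-ancestor similarity: concepts subsumed by many common
    classes score close to 1, unrelated concepts close to 0."""
    sa, sb = ancestors(a, parents), ancestors(b, parents)
    return len(sa & sb) / len(sa | sb)
```

Two sibling phenotype concepts under a common parent class share all ancestors except themselves, so their similarity is high but below 1.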
Andreas Harth
5th International Terra Cognita Workshop 2012
Leipzig University
YarcData
Semantic Reasoning in Context-Aware Assistive Environments to Support Ageing with Dementia
Robust solutions for ambient assisted living are numerous, yet predominantly specific in their scope of usability. In this paper, we describe the potential contribution of semantic web technologies to building more versatile solutions - a step towards adaptable context-aware engines and simplified deployments. With our conception and deployment work in hindsight, we highlight some implementation challenges and requirements for semantic web tools that would help to ease the development of context-aware services and thus generalize real-life deployment of semantically driven assistive technologies. We also compare available tools with regard to these requirements and validate our choices by providing some results from a real-life deployment.
International Workshop on Semantic Technologies meet Recommender Systems & Big Data
Extracting Justifications from BioPortal Ontologies
This paper presents an evaluation of state of the art black box justification finding algorithms on the NCBO BioPortal ontology corpus. This corpus represents a set of naturally occurring ontologies that vary greatly in size and expressivity. The results paint a picture of the performance that can be expected when finding all justifications for entailments using black box justification finding techniques. The results also show that many naturally occurring ontologies exhibit a rich justificatory structure, with some ontologies having extremely high numbers of justifications per entailment.
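The black-box justification-finding technique evaluated in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the `entails` oracle below stands in for a DL reasoner's entailment check, and the contraction loop computes a single minimal justification rather than enumerating all of them.

```python
def justification(axioms, entails):
    """Black-box justification extraction by contraction: try removing each
    axiom; keep it out whenever the entailment still holds without it.
    `entails(S)` is any oracle returning True iff axiom set S has the
    entailment of interest. Returns one minimal justification."""
    just = list(axioms)
    # Walk indices from the end so removals never invalidate the index.
    for i in range(len(just) - 1, -1, -1):
        trial = just[:i] + just[i + 1:]
        if entails(trial):  # axiom i is not needed for the entailment
            just = trial
    return set(just)
```

As a toy usage, with an oracle under which the entailment holds exactly when axioms 2 and 4 are present, `justification({1, 2, 3, 4, 5}, lambda s: {2, 4} <= set(s))` contracts to `{2, 4}`.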
Christopher Brewster
Marcel Karnstedt
Mayo Clinic
GESIS – Leibniz Institute for the Social Sciences
Mike Dean
Benchmarking Federated SPARQL Query Engines: Are Existing Testbeds Enough?
Testbeds proposed so far to evaluate, compare, and eventually improve SPARQL query federation systems still have some limitations. Some variables and configurations that may have an impact on the behavior of these systems (e.g., network latency, data partitioning and query properties) are not sufficiently defined; this affects the results and repeatability of independent evaluation studies, and hence the insights that can be obtained from them. In this paper we evaluate FedBench, the most comprehensive testbed to date, and empirically probe the need to consider additional dimensions and variables. The evaluation has been conducted on three SPARQL query federation systems, and the analysis of the results has allowed us to uncover properties of these systems that would normally remain hidden with the original testbeds.
Norwegian University of Science and Technology
Edoardo Pignotti
J William Murdock
Evaluation of techniques for inconsistency handling in OWL 2 QL ontologies
In this paper we present the Quonto Inconsistent Data handler (QuID). QuID is a reasoner for OWL 2 QL that is based on the system Quonto and is able to deal with inconsistent ontologies. The central aspect of QuID is that it implements two different, orthogonal strategies for dealing with inconsistency: ABox repairing techniques, based on data manipulation, and consistent query answering techniques, based on query rewriting. Moreover, by exploiting the ability of Quonto to delegate the management of the ABox to a relational database system (DBMS), such techniques are potentially able to handle very large inconsistent ABoxes. For the above reasons, QuID allows for experimentally comparing the above two different strategies for inconsistency handling in the context of OWL 2 QL. We thus report on the experimental evaluation that we have conducted using QuID. Our results clearly point out that inconsistency-tolerance in OWL 2 QL ontologies is feasible in practical cases. Moreover, our evaluation singles out the different sources of complexity for the data manipulation technique and the query rewriting technique, and allows for identifying the conditions under which one method is more efficient than the other.
Anna Lisa Gentile
University of Georgia
Dimitrios Michail
Bert van Nuffelen
Max Planck Institute for Computer Science
Martin Theobald
Kerry Taylor
Bioinformatics at Centre for Plant Biotechnology and Genomics UPM-INIA
Knowledge Extraction and Consolidation from Social Media
Sherif Sakr
Thomas Bouttaz
Evaluating Entity Summarization Using a Game-Based Ground Truth
In recent years, strategies for Linked Data consumption have caught attention in Semantic Web research. For direct consumption by users, Linked Data mashups, interfaces, and visualizations have become a popular research area. Many approaches in this field aim to make Linked Data interaction more user friendly to improve its accessibility for nontechnical users. A subtask for Linked Data interfaces is to present entities and their properties in a concise form. In general, these summaries take individual attributes and sometimes user contexts and preferences into account. But the objective evaluation of the quality of such summaries is an expensive task. In this paper we introduce a game-based approach aiming to establish a ground truth for the evaluation of entity summarization. We exemplify the applicability of the approach by evaluating two recent summarization approaches.
David Karger
Ying Zhang
Aibo Tian
Ian Davis
self
Peter Haase
David Lewis
Jim Hendler
Jürgen Umbrich
Chenghua Lin
Elena Simperl
Tomas Vitvar
3roundstones
Joaquim Gabarro
Alan Eckhardt
Adrian Mocan
Matthew Rowe
Poster/Demo Session, Semantic Web Challenge and Reception
University of Grenoble
Haofen Wang
Linköping University
Michael Benedikt
Gary Wills
Marco Rospocher
University of Würzburg
Sandia National Laboratories
Autonomous University of Madrid
University of Kentucky
Cory Henson
Institute for Infocomm Research
Todd Minning
Pavel Shvaiko
CERN
Laura Hollink
Pontifical Catholic University of Rio de Janeiro
Lina Zhou
University of Manchester
Stéphane Corlosquet
Frantisek Simancik
Industrial Track
Josep Maria Brunetti Fernández
Jaroslav Kuchar
University of Milan Bicocca
Federico Michele Facca
Alexander Seeliger
Murat Kantarcioglu
Elizabeth Daly
National Technical University of Athens
JWS Lunch
Alois Haselboeck
Milan Stankovic
Semantic Sentiment Analysis of Twitter
Sentiment analysis over Twitter offers organisations a fast and effective way to monitor the public's feelings towards their brand, business, directors, etc. A wide range of features and methods for training sentiment classifiers for Twitter datasets have been researched in recent years, with varying results. In this paper, we introduce a novel approach of adding semantics as additional features into the training set for sentiment analysis. For each extracted entity (e.g. iPhone) from tweets, we add its semantic concept (e.g. ''Apple product'') as an additional feature, and measure the correlation of the representative concept with negative/positive sentiment. We apply this approach to predict sentiment for three different Twitter datasets. Our results show an average increase in F harmonic accuracy score for identifying both negative and positive sentiment of around 6.5% and 4.8% over the baselines of unigrams and part-of-speech features respectively. We also compare against an approach based on sentiment-bearing topic analysis, and find that semantic features produce better Recall and F score when classifying negative sentiment, and better Precision with lower Recall and F score in positive sentiment classification.
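The core feature-augmentation idea in the abstract above can be sketched as follows. The entity-to-concept mapping here is a hypothetical stand-in for the semantic annotators the paper relies on (e.g. mapping "iPhone" to the concept "Apple product"); the names and entries are illustrative only.

```python
# Illustrative entity -> semantic concept lookup; in the paper this role is
# played by a semantic annotation service, not a hand-written dictionary.
CONCEPTS = {
    "iphone": "apple_product",
    "ipad": "apple_product",
    "obama": "politician",
}

def add_semantic_features(tokens):
    """Return the unigram features augmented with a CONCEPT= feature for
    every token that maps to a known semantic concept."""
    features = list(tokens)
    for t in tokens:
        concept = CONCEPTS.get(t.lower())
        if concept:
            features.append("CONCEPT=" + concept)
    return features
```

For example, the tweet tokens `["I", "love", "my", "iPhone"]` gain the extra feature `CONCEPT=apple_product`, letting the classifier generalize sentiment evidence from one Apple product to another.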
Tobias Schuchert
Marie-Christine Rousset
Technische Universität Ilmenau
CrowdMAP: Crowdsourcing Ontology Alignment with Microtasks
The last decade of research in ontology alignment has brought a variety of computational techniques to discover correspondences between ontologies. While the accuracy of automatic approaches has continuously improved, human contributions remain a key ingredient of the process: this input serves as a valuable source of domain knowledge that is used to train the algorithms and to validate and augment automatically computed alignments. In this paper, we introduce CROWDMAP, a model to acquire such human contributions via microtask crowdsourcing. For a given pair of ontologies, CROWDMAP translates the alignment problem into microtasks that address individual alignment questions, publishes the microtasks on an online labor market, and evaluates the quality of the results obtained from the crowd. We evaluated the current implementation of CROWDMAP in a series of experiments using ontologies and reference alignments from the Ontology Alignment Evaluation Initiative and the crowdsourcing platform CrowdFlower. The experiments clearly demonstrated that the overall approach is feasible, and can improve the accuracy of existing ontology alignment solutions in a fast, scalable, and cost-effective manner.
Computas
Klaas Dellschaft
Natasha F. Noy
FZI Forschungszentrum Informatik
Antoine Zimmermann
Browsing Causal Chains in a Disease Ontology
In order to realize sophisticated medical information systems, many medical ontologies have been developed. We proposed a definition of disease based on River Flow Model which captures a disease as a causal chain of clinical disorders. We also developed a disease ontology based on the model. It includes definitions of more than 6,000 diseases with 17,000 causal relationships. This demonstration summarizes the disease ontology and a browsing system for causal chains defined in it.
Daniel Bär
Daniel Bär
Daniel Bär
Polytechnic University of Catalonia
Polytechnic University of Catalonia
Polytechnic University of Catalonia
Trent Schmidt
Trent Schmidt
Trent Schmidt
Matthias Nickles
Matthias Nickles
Matthias Nickles
Ghent University
Ghent University
Ghent University
University of Münster
University of Münster
University of Münster
Trond Aalberg
Trond Aalberg
Trond Aalberg
David Newman
David Newman
David Newman
The Semantic Web in 2022
Large-Scale Learning of Relation-Extraction Rules with Distant Supervision from the Web
We present a large-scale relation extraction (RE) system which learns grammar-based RE rules from the Web by utilizing large numbers of relation instances as seed. Our goal is to obtain rule sets large enough to cover the actual range of linguistic variation, thus tackling the long-tail problem of real-world applications. A variant of distant supervision learns several relations in parallel, enabling a new method of rule filtering. The system detects both binary and n-ary relations. We target 39 relations from Freebase, for which 3M sentences extracted from 20M web pages serve as the basis for learning an average of 40K distinctive rules per relation. Employing an efficient dependency parser, the average run time for each relation is only 19 hours. We compare these rules with ones learned from local corpora of different sizes and demonstrate that the Web is indeed needed for a good coverage of linguistic variation.
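The distant-supervision labelling step described in the abstract above can be sketched minimally: a sentence mentioning both arguments of a known relation instance becomes a (possibly noisy) training example for that relation. The seed instance and relation name below are illustrative, not drawn from the paper's Freebase seed set, and the substring match stands in for proper entity linking.

```python
# Illustrative seed set: relation instances assumed known in advance.
SEEDS = {("Barack Obama", "Hawaii"): "born_in"}

def label_sentences(sentences):
    """Distant supervision: emit (sentence, subject, object, relation)
    training examples for every sentence containing both seed arguments."""
    examples = []
    for sent in sentences:
        for (subj, obj), rel in SEEDS.items():
            if subj in sent and obj in sent:
                examples.append((sent, subj, obj, rel))
    return examples
```

The rule-learning stage then generalizes the dependency paths connecting the two arguments in these labelled sentences; the filtering the paper introduces is needed precisely because this labelling heuristic also matches sentences that mention both arguments without expressing the relation.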
Tim Berners-Lee
Building Large Scale Relation KB from Text
Recently, more and more structured data in the form of RDF triples have been published and integrated into Linked Open Data (LOD). While the current LOD contains hundreds of data sources with billions of triples, it has a small number of distinct relations compared with its large number of entities. On the other hand, Web pages are growing rapidly, which yields a much larger volume of textual content to be exploited. With the popularity and wide adoption of open information extraction technology, extracting entities and the relations among them from text at Web scale is possible. In this paper, we present an approach to extract the subject individuals and the object counterparts for relations from text, and to determine the most appropriate domain and range as well as the most confident dependency path patterns for a given relation based on the EM algorithm. As a preliminary result, we built a knowledge base of relations extracted from Chinese encyclopedias. The experimental results show the effectiveness of our approach in extracting relations with reasonable domain, range and path pattern restrictions as well as high-quality triples.
Vladimir Mikhailov
University of Alberta
Kapila Ponnamperuma
Carlo Curino
Forward Look
Hypios
Mining Semantic Relations between Research Areas
For a number of years now we have seen the emergence of repositories of research data specified using OWL/RDF as representation languages, and conceptualized according to a variety of ontologies. This class of solutions promises both to facilitate the integration of research data with other relevant sources of information and also to support more intelligent forms of querying and exploration. However, an issue which has only been partially addressed is that of generating and characterizing semantically the relations that exist between research areas. This problem has been traditionally addressed by manually creating taxonomies, such as the ACM classification of research topics. However, this manual approach is inadequate for a number of reasons: these taxonomies are very coarse-grained and they do not cater for the fine-grained research topics, which define the level at which researchers (and even more so, PhD students) typically operate. Moreover, they evolve slowly, and therefore they tend not to cover the most recent research trends. In addition, as we move towards a semantic characterization of these relations, there is arguably a need for a more sophisticated characterization than a homogeneous taxonomy, to reflect the different ways in which research areas can be related. In this paper we propose Klink, a new approach to i) automatically generating relations between research areas and ii) populating a bibliographic ontology, which combines both machine learning methods and external knowledge, which is drawn from a number of resources, including Google Scholar and Wikipedia. We have tested a number of alternative algorithms and our evaluation shows that a method relying on both external knowledge and the ability to detect temporal relations between research areas performs best with respect to a manually constructed standard.
Valeria Fionda
Christian Dirschl
Creating Enriched YouTube Media Fragments With NERD Using Timed-Text
This demo enables the automatic creation of semantically annotated YouTube media fragments. A video is first ingested in the Synote system, and a new method retrieves its associated subtitles or closed captions. Next, NERD is used to extract named entities from the transcripts, which are then temporally aligned with the video. The entities are disambiguated in the LOD cloud, and a user interface enables browsing through the entities detected in a video or getting more information. We evaluated our application with 60 videos from 3 YouTube channels.
Gianluca Demartini
Applying Multidimensional Navigation and Explanation in Semantic Dataset Summarization
A key objective of multidimensional dataset analysis is to reveal patterns of interest to analysts. However, multidimensional analysis has been observed to be difficult for analysts, due to the challenges of both presenting and navigating large datasets. This work explores how initial summarizations of multidimensional datasets can be generated for consuming parties (designed to reduce the number of data points which would need to be displayed), driven by summarization policies based on provided dataset values. Additionally, functionality for explaining the derivation of summarizations is being developed - in line with prior work on aiding analyst interactions with data processing systems. To help drive development of this work, as well as provide illustrative use cases, we are presently developing a dataset summarization generator as part of greater work being done in the Foresight and Understanding from Scientific Exposition (FUSE) program.
Jönköping University
Sakthi Sundaram
FORTH-ICS
Thibaut Tiberghien
Clark & Parsia
Takeshi Imai
Christian Schallhart
Hoan Nguyen Mau Quoc
OCLC
University of Maryland, College Park
Claudio Gutierrez
Aidan Hogan
Mike Stonebraker
David Liu
Andrea Maurino
Retrieving the causes of road traffic congestions in quasi real-time is an important task that will enable city managers to get better insight into traffic issues and thus take appropriate corrective actions in a timely way. Our work, accepted at ISWC 2012 In-Use track, tackles this problem by integrating and reasoning over a variety of heterogeneous data sources including data streams. In this paper we present an initial prototype of our work for the city of Dublin, Ireland.
A Prototype for Semantic based Diagnosis of Road Traffic Congestions
Giovanni Grasso
Tim Furche
Christopher Matheus
University of Mannheim
Naimdjon Takhirov
Enrico Franconi
Yunjia Li
Jacco Van Ossenbruggen
Hitting the Sweetspot: Economic Rewriting of Knowledge Bases
Three conflicting requirements arise in the context of knowledge base (KB) extraction: the size of the extracted KB, the size of the corresponding signature and the syntactic similarity of the extracted KB with the original one. Minimal module extraction and uniform interpolation assign an absolute priority to one of these requirements, thereby limiting the possibilities to influence the other two. We propose a novel technique for EL that does not require such an extreme prioritization. We propose a tractable rewriting approach and empirically compare the technique with existing approaches with encouraging results.
Houda Khrouf
Ioannis Vlahavas
2nd Joint Workshop on Knowledge Evolution and Ontology Dynamics
Yannis Tzitzikas
On Direct Debugging of Aligned Ontologies
Modern ontology debugging methods allow efficient identification and localization of faulty axioms defined by a user while developing an ontology. However, in many use cases such as ontology alignment the ontologies might include many conflict sets, i.e. sets of axioms preserving the faults, thus making ontology diagnosis infeasible. In this paper we present a debugging approach based on a direct computation of diagnoses that omits calculation of conflict sets. Embedded in an ontology debugger, the proposed algorithm is able to identify diagnoses for an ontology which includes a large number of faults and for which application of standard diagnosis methods fails. The evaluation results show that the approach is practicable and is able to identify a fault in adequate time.
Ming Mao
Pol Mac Aonghusa
Venkat Krishnamurthy
EMC Corporation
Due to the high worst case complexity of the core reasoning problem for the expressive profiles of OWL 2, ontology engineers are often surprised and confused by the performance behaviour of reasoners on their ontologies. Even very experienced modellers with a sophisticated grasp of reasoning algorithms do not have a good mental model of reasoner performance behaviour. Seemingly innocuous changes to an OWL ontology can degrade classification time from instantaneous to too long to wait for. Similarly, switching reasoners (e.g., to take advantage of specific features) can result in wildly different classification times. In this paper we investigate performance variability phenomena in OWL ontologies, and present methods to identify subsets of an ontology which are performance-degrading for a given reasoner. When such (ideally small) subsets are removed from an ontology, and the remainder is much easier for the given reasoner to reason over, we designate them "hot spots". The identification of these hot spots allows users to isolate difficult portions of the ontology in a principled and systematic way. Moreover, we devise and compare various methods for approximate reasoning and knowledge compilation based on hot spots. We verify our techniques with a select set of varyingly difficult ontologies from the NCBO BioPortal, and were able to, firstly, successfully identify performance hot spots against the major freely available DL reasoners, and, secondly, significantly improve classification time using approximate reasoning based on hot spots.
Performance Heterogeneity and Approximate Reasoning in Description Logic Ontologies
Ingenta
David Huynh
Doctoral Consortium
Carlos Buil-Aranda
Mirko Graziosi
Raymond Lloyd
Do-Heon Jeong
The Seventh International Workshop on Ontology Matching
Predicting Reasoning Performance Using Ontology Metrics
A key issue in semantic reasoning is the computational complexity of inference tasks on expressive ontology languages such as OWL DL and OWL 2 DL. Theoretical works have established worst-case complexity results for reasoning tasks for these languages. However, hardness of reasoning about individual ontologies has not been adequately characterised. In this paper, we conduct a systematic study to tackle this problem using machine learning techniques, covering over 350 real-world ontologies and four state-of-the-art, widely-used OWL 2 reasoners. Our main contributions are two-fold. Firstly, we learn various classifiers that accurately predict classification time for an ontology based on its metric values. Secondly, we identify a number of metrics that can be used to effectively predict reasoning performance. Our prediction models have been shown to be highly effective, achieving an accuracy of over 80%.
Town Hall Meeting
Mark Wilkinson
Third International Workshop on Consuming Linked Data
Ralf Möller
Steve Harris
Norman Heino
Minh-Duc Pham
In this paper we present a demo for efficient detection of visitors' attention in a museum environment, based on the application of intelligent complex event processing and semantic technologies. The detection takes advantage of semantics: (i) at design time, for the correlation of sensor data via modeling of the interesting situations and annotation of artworks and their parts, and (ii) in real time, for more accurate and precise detection of the interesting situation. The results of the proposed approach have been applied in the EU project ARtSENSE.
Demo: Efficient Human Attention Detection in Museums based on Semantics and Complex Event Processing
Opening Ceremony
This paper presents an approach to automatically extract entities and relationships from textual documents. The main goal is to populate a knowledge base that hosts this structured information about domain entities. The extracted entities and their expected relationships are verified using two evidence-based techniques: classification and linking. This last process also enables the linking of our knowledge base to other sources which are part of the Linked Open Data cloud. We demonstrate the benefit of our approach through a series of experiments with real-world datasets.
An Evidence-based Verification Approach to Extract Entities and Relations for Knowledge Base Population
Hewlett Packard Laboratories
Guilin Qi
Vrije Universiteit Amsterdam
Paolo Castagna
Jonathon Hare
Tim Berners-Lee; John Gianandrea; Frank van Harmelen; Mike Stonebraker; Bryan Thompson
Karl Rieb
Sugar Labs
AKSW Group
Yong-Bin Kang
Demonstrating Blank Node Matching and RDF/S Comparison Functions
The ability to compute the differences that exist between two RDF/S Knowledge Bases (KBs) is important for aiding humans to understand the evolution of knowledge, and for reducing the amount of data that need to be exchanged and managed over the network in order to build synchronization, versioning and replication services. We will show how we can exploit blank node anonymity in order to reduce the delta size when comparing RDF/S KBs. We will show experimental results over real and synthetic data sets that demonstrate significant reductions of the sizes of the computed deltas, and how the reduced deltas can be visualized. (This demo paper accompanies a research paper accepted for ISWC'2012)
Giuseppe Pirró
Manfred Hauswirth
Qiang Yang
Bastian Krayer
Pablo N. Mendes
Thierry Declerck
Jacobs University Bremen
Rinke Hoekstra
University of Turin
Talis
Sebastian Rudolph
University of Arizona
Veli Bicer
EMC R&D Brazil
Line Pouchard
University of Illinois at Chicago
Weinan Zhang
Andreas Zankl
Miel Vander Sande
Axel Polleres
Giorgos Flouris
Ansgar Scherp
Thomas Malone
Oscar Corcho
Charalambos Kontoes
Lehigh University
Magnus Knuth
Mohsen Taheriyan
Mounir Mokhtari
IBM Research
Sa-Kwang Song
John Gianandrea
Industry Track I
Jill Wegrzyn
Industry Track II
Montiago Labute
Query Driven Hypothesis Generation for Answering Queries over NLP Graphs
It has become common to use RDF to store the results of Natural Language Processing (NLP) as a graph of the entities mentioned in the text with the relationships mentioned in the text as links between them. These NLP graphs can be measured with Precision and Recall against a ground truth graph representing what the documents actually say. When asking conjunctive queries on NLP graphs, the Recall of the query is expected to be roughly the product of the Recall of the relations in each conjunct. Since Recall is typically less than one, conjunctive query Recall on NLP graphs degrades geometrically with the number of conjuncts. We present an approach to address this Recall problem by hypothesizing links in the graph that would improve query Recall, and then attempting to find more evidence to support them. Using this approach, we confirm that in the context of answering queries over NLP graphs, we can use lower confidence results from NLP components if they complete a query result.
Markus Luczak-Rösch
Adding Realtime Coverage to the Google Knowledge Graph
In May 2012, the Web search engine Google introduced the so-called Knowledge Graph, a graph that understands real-world entities and their relationships to one another. Entities covered by the Knowledge Graph include landmarks, celebrities, cities, sports teams, buildings, movies, celestial objects, works of art, and more. The graph enhances Google search in three main ways: by disambiguation of search queries, by search-log-based summarization of key facts, and by explorative search suggestions. With this paper, we suggest a fourth way of enhancing Web search: through the addition of realtime coverage of what people say about real-world entities on social networks. We report on a browser extension that seamlessly adds relevant microposts from the social networking sites Google+, Facebook, and Twitter in the form of a panel to Knowledge Graph entities. In a true Linked Data fashion, we interlink detected concepts in microposts with Freebase entities, and evaluate our approach for both relevancy and usefulness. The extension is freely available; we invite the reader to reconstruct the examples of this paper to see how realtime opinions may have changed since the time of writing.
Technion – Israel Institute of Technology
Pedro Szekely
Industry Track III
Carsten Keßler
Philipp Cimiano
Sonia Bergamaschi
Experiences with modeling composite phenotypes in the SKELETOME project
Semantic annotation of patient data in the skeletal dysplasia domain (e.g., clinical summaries) is a challenging process due to the structural and lexical differences existing between the terms used to describe radiographic findings. In this paper we propose an ontology aimed at representing the intrinsic structure of such radiographic findings in a standard manner, in order to bridge the different lexical variations of the actual terms. Furthermore, we describe and evaluate an algorithm capable of mapping concepts of this ontology to exact or broader terms in the main phenotype ontology used in the bone dysplasia domain.
Takahiro Kawamura
Thetida Zetta
The 11th International Semantic Web Conference
Nicola Fanizzi
Zachary Daniels
Markus Strohmaier
Claudia d'Amato
University of Zurich
Conference Dinner
Technische Universität Darmstadt
Guus Schreiber
Jeff Heflin
Roger Hall
Carlos Viegas Damásio
Markus Krötzsch
Eric Goodman
Semantic Web Challenge
Khadija Elbedweihy
Bruno Alves
University of Tokyo
Thomas Rindflesch
Asunción Gómez-Pérez
Volker Haarslev
Ray W. Fergerson
Conor Hayes
Satya Sahoo
John Domingue
Tim Finin
Joint Workshop on Semantic Technologies Applied to Biomedical Informatics and Individualized Medicine (SATBI+SWIM 2012)
Diagnosis, or the method to connect causes to their effects, is an important reasoning task for obtaining insight on cities and reaching the concept of sustainable and smarter cities that is envisioned nowadays. This paper, focusing on transportation and its road traffic, presents how road traffic congestions can be detected and diagnosed in quasi real-time. We adapt pure Artificial Intelligence diagnosis techniques to fully exploit knowledge which is captured through relevant semantics-augmented stream and static data from various domains. Our prototype of semantic-aware diagnosis of road traffic congestions, experimented in Dublin, Ireland, works efficiently with large, heterogeneous information sources and delivers value-added services to citizens and city managers in quasi real-time.
Applying Semantic Web Technologies for Diagnosing Road Traffic Congestions
Huajun Chen
ourSpaces - Design and Deployment of a Semantic Virtual Research Environment
In this paper we discuss our experience with the design, development and deployment of the ourSpaces Virtual Research Environment. ourSpaces makes use of Semantic Web technologies to create a platform to support multidisciplinary research groups. This paper introduces the main semantic components of the system: a framework to capture the provenance of the research process, a collection of services to create and visualise metadata and a policy reasoning service. We also describe different approaches to support interaction between users and metadata within the VRE. We discuss the lessons learnt during the deployment process with three case study groups. Finally, we present our conclusions and future directions for exploration in terms of developing ourSpaces further.
Toshiba
Patrick Rodler
Sambhawa Priya
Russell Newman
Giuseppe Rizzo
Yogesh Simmhan
8th International Workshop on Uncertainty Reasoning for the Semantic Web
Shizuoka University
Bridgewater College
Marieke Van Erp
Industry Track IV
Hugh Williams
Rehab Albeladi
Griffith University
Semantic Web allows us to model and query time-invariant or slowly evolving knowledge using ontologies. Emerging applications in Cyber Physical Systems such as Smart Power Grids that require continuous information monitoring and integration present novel opportunities and challenges for Semantic Web technologies. Semantic Web is promising to model diverse Smart Grid domain knowledge for enhanced situation awareness and response by multi-disciplinary participants. However, current technology does pose a performance overhead for dynamic analysis of sensor measurements. In this paper, we combine semantic web and complex event processing for stream based semantic querying. We illustrate its adoption in the USC Campus Micro-Grid for detecting and enacting dynamic response strategies to peak power situations by diverse user roles. We also describe the semantic ontology and event query model that supports this. Further, we introduce and evaluate caching techniques to improve the response time for semantic event queries to meet our application needs and enable sustainable energy management.
Incorporating Semantic Knowledge into Dynamic Data Processing for Smart Power Grids
University of Fribourg
Takahisa Fujino
Bijan Parsia
Francesco Osborne
Laszlo Török
Francois Scharffe
OpenLink Software
DEQA: Deep Web Extraction for Question Answering
Despite decades of effort, intelligent object search remains elusive. Neither search engine nor semantic web technologies alone have managed to provide usable systems for simple questions such as "Find me a flat with a garden and more than two bedrooms near a supermarket." We introduce DEQA, a conceptual framework that achieves this elusive goal through combining state-of-the-art semantic technologies with effective data extraction. To that end, we apply DEQA to the UK real estate domain and show that it can answer a significant percentage of such questions correctly. DEQA achieves this by mapping natural language questions to SPARQL patterns. These patterns are then evaluated on an RDF database of current real estate offers. The offers are obtained using OXPATH, a state-of-the-art data extraction system, on the major agencies in the Oxford area and linked through LIMES to background knowledge such as the location of supermarkets.
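As a purely illustrative sketch of the question-to-SPARQL mapping step (this is not DEQA's actual grammar; the single regex template and the `ex:` vocabulary are invented for the example), a toy mapper might look like:

```python
import re

# One toy question template: "find me a X with a Y" -> a SPARQL basic graph
# pattern. The template and the ex: vocabulary are invented.
PATTERN = re.compile(r"find me a (\w+) with a (\w+)", re.IGNORECASE)

def to_sparql(question):
    """Map a matching question to a SPARQL query string, else None."""
    m = PATTERN.search(question)
    if m is None:
        return None
    cls, feature = m.group(1), m.group(2).lower()
    return (
        "SELECT ?x WHERE { "
        f"?x a ex:{cls.capitalize()} . "
        f"?x ex:hasFeature ex:{feature} . "
        "}"
    )
```

A real system would evaluate the resulting graph pattern against an RDF store of extracted offers; here the output is just a string.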
Mike Wald
QuerioCity: A Linked Data Platform for Urban Information Management
In this paper, we present QuerioCity, a platform to catalog, index and query highly heterogeneous information coming from complex systems, such as cities. A series of challenges are identified: namely, the heterogeneity of the domain and the lack of a common model, the volume of information and the number of data sets, the requirement for a low entry threshold to the system, the diversity of the input data, in terms of format, syntax and update frequency (streams vs static data), and the sensitivity of the information. We propose an approach for incremental and continuous integration of static and streaming data, based on Semantic Web technologies. The proposed system is unique in the literature in terms of handling of multiple integrations of available data sets in combination with flexible provenance tracking, privacy protection and continuous integration of streams. We report on lessons learnt from building the first prototype for Dublin.
Edna Ruckhaus
Eva Blomqvist
Benoit Christophe
Amel Bennaceur
Keynote 1: The Semantic Web and Collective Intelligence
Jyotishman Pathak
Achieving Interoperability through Semantics-based Technologies: The Instant Messaging Case
The success of pervasive computing depends on the ability to compose a multitude of networked applications dynamically in order to achieve user goals. However, applications from different providers are not able to interoperate due to incompatible interaction protocols or disparate data models. Instant messaging is a representative example of the current situation, where various competing applications keep emerging. To enforce interoperability at runtime and in a non-intrusive manner, mediators are used to perform the necessary translations and coordination between the heterogeneous applications. Nevertheless, the design of mediators requires considerable knowledge about each application as well as a substantial development effort. In this paper we present an approach based on ontology reasoning and model checking in order to generate correct-by-construction mediators automatically. We demonstrate the feasibility of our approach through a prototype tool and show that it synthesises mediators that achieve efficient interoperation of instant messaging applications.
Andriana Gkaniatsou
Osaka University
Nick Bassiliades
Richard Boyce
Keynote 2: Driving Innovation with Open Data and Interoperability
Hong Li
Peter Mika
Payam Barnaghi
Using SPARQL to Query BioPortal Ontologies and Metadata
BioPortal is a repository of biomedical ontologies - the largest such repository, with more than 300 ontologies to date. This set includes ontologies that were developed in OWL, OBO and other languages, as well as a large number of medical terminologies that the US National Library of Medicine distributes in its own proprietary format. We have published the RDF based serializations of all these ontologies and their metadata at sparql.bioontology.org. This dataset contains 203M triples, representing both content and metadata for the 300+ ontologies; and 9M mappings between terms. This endpoint can be queried with SPARQL which opens new usage scenarios for the biomedical domain. This paper presents lessons learned from having redesigned several applications that today use this SPARQL endpoint to consume ontological data.
Louiqa Raschid
Amar Djalil Mezaour
University of Crete
Keynote 3: Tackling Climate Change: Unfinished Business from the Last “Winter”
Renaud Delbru
Managing the life-cycle of Linked Data with the LOD2 Stack
The LOD2 Stack is an integrated distribution of aligned tools which support the whole life cycle of Linked Data from extraction, authoring/creation via enrichment, interlinking, fusing to maintenance. The LOD2 Stack comprises new and substantially extended existing tools from the LOD2 project partners and third parties. The stack is designed to be versatile; for all functionality we define clear interfaces, which enable the plugging in of alternative third-party implementations. The architecture of the LOD2 Stack is based on three pillars: (1) Software integration and deployment using the Debian packaging system. (2) Use of a central SPARQL endpoint and standardized vocabularies for knowledge base access and integration between the different tools of the LOD2 Stack. (3) Integration of the LOD2 Stack user interfaces based on REST enabled Web Applications. These three pillars comprise the methodological and technological framework for integrating the very heterogeneous LOD2 Stack components into a consistent framework. In this article we describe these pillars in more detail and give an overview of the individual LOD2 Stack components. The article also includes a description of a real-world usage scenario in the publishing domain.
Gerd Zechmeister
Enrico Daga
Kostyantyn Shchekotykhin
The 2nd International Workshop on Linked Science 2012 - Tackling Big Data
Francesco Draicchio
Jagannathan Srinivasan
Hamdi Aloulou
BBN Technologies
Alex Crow
Li Ding
Jun Zhao
Chemnitz University of Technology
Domain-aware Ontology Matching
The inherent heterogeneity of datasets on the Semantic Web has created a need to interlink them, and several tools have emerged that automate this task. In this paper we are interested in what happens if we enrich these matching tools with knowledge of the domain of the ontologies. We explore how to express the notion of a domain in terms usable for an ontology matching tool, and we examine various methods to decide what constitutes the domain of a given dataset. We show how we can use this in a matching tool, and study the effect of domain knowledge on the quality of the alignment. We perform evaluations for two scenarios: Last.fm artists and UMLS medical terms. To quantify the added value of domain knowledge, we compare our domain-aware matching approach to a standard approach based on a manually created reference alignment. The results indicate that the proposed domain-aware approach indeed outperforms the standard approach, with a large effect on ambiguous concepts but a much smaller effect on unambiguous concepts.
Nathan Wilson
Hai Zhuge
Christina Unger
Josh Hanna
Fraunhofer Institute
Robust Runtime Optimization and Skew-Resistant Execution of Analytical SPARQL Queries on Pig
We describe a system that incrementally translates SPARQL queries to Pig Latin and executes them on a Hadoop cluster. This system is designed to work efficiently on complex queries with many self-joins over huge datasets, avoiding job failures even in the case of joins with unexpected high-value skew. To be robust against cost estimation errors, our system interleaves query optimization with query execution, determining the next steps to take based on data samples and statistics gathered during the previous step. Furthermore, we have developed a novel skew-resistant join algorithm that replicates tuples corresponding to popular keys. We evaluate the effectiveness of our approach both on a synthetic benchmark known to generate complex queries (BSBM-BI) as well as on a Yahoo! case of data analysis using RDF data crawled from the web. Our results indicate that our system is indeed capable of processing huge datasets without pre-computed statistics while exhibiting good load-balancing properties.
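The skew-resistant join can be illustrated with a toy sketch in plain Python (not the authors' Pig/Hadoop implementation; the partition count and the sampled hot-key set are invented): for a popular key, one input's tuples are scattered across partitions while the other input's tuples for that key are replicated to every partition, so no single partition receives the entire hot key:

```python
import itertools
from collections import defaultdict

NUM_PARTITIONS = 3            # hypothetical number of reduce partitions
HOT_KEYS = {"rdf:type"}       # keys a sampling pass flagged as skewed (invented)

def partition_left(tuples):
    """Scatter side: hot-key tuples go round-robin across ALL partitions;
    normal keys are hash-partitioned as usual."""
    parts = defaultdict(list)
    rr = itertools.count()
    for key, val in tuples:
        p = next(rr) % NUM_PARTITIONS if key in HOT_KEYS else hash(key) % NUM_PARTITIONS
        parts[p].append((key, val))
    return parts

def partition_right(tuples):
    """Replicate side: hot-key tuples are copied to every partition so each
    scattered left tuple still finds its join partners locally."""
    parts = defaultdict(list)
    for key, val in tuples:
        targets = range(NUM_PARTITIONS) if key in HOT_KEYS else [hash(key) % NUM_PARTITIONS]
        for p in targets:
            parts[p].append((key, val))
    return parts

def skew_join(left, right):
    """Each partition joins independently, as reducers would."""
    lparts, rparts = partition_left(left), partition_right(right)
    out = []
    for p in range(NUM_PARTITIONS):
        for lk, lv in lparts[p]:
            for rk, rv in rparts[p]:
                if lk == rk:
                    out.append((lk, lv, rv))
    return out
```

Because each left-side hot tuple lands in exactly one partition and the right side is replicated, every matching pair is produced exactly once while the hot key's work is spread over all partitions.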
Michael Sintek
Gianluca Correndo
Awards and Closing Ceremony
First Workshop on Programming the Semantic Web
John Breslin
Manuel Salvadores
Rudi Studer
Fabien Duchateau
Mentor Lunch
SPLODGE: Systematic Generation of SPARQL Benchmark Queries for Linked Open Data
The distributed and heterogeneous nature of Linked Open Data requires flexible and federated techniques for query evaluation. In order to evaluate current federation querying approaches a general methodology for conducting benchmarks is mandatory. In this paper, we present a classification methodology for federated SPARQL queries. This methodology can be used by developers of federated querying approaches to compose a set of test benchmarks that cover diverse characteristics of different queries and allows for comparability. We further develop a heuristic called SPLODGE for automatic generation of benchmark queries that is based on this methodology and takes into account the number of sources to be queried and several complexity parameters. We evaluate the adequacy of our methodology and the query generation strategy by applying them on the 2011 billion triple challenge data set.
Karl Hammar
A Comparison of Hard Filters and Soft Evidence for Answer Typing in Watson
Questions often explicitly request a particular type of answer. One popular approach to answering natural language questions involves filtering candidate answers based on precompiled lists of instances of common answer types (e.g., countries, animals, foods). Such a strategy is poorly suited to an open domain in which there is an extremely broad range of answer types, and the most frequently occurring types cover only a small fraction of all answers. In this paper we present an alternative approach called TyCor, which employs soft filtering of candidates using multiple strategies and sources. We find that TyCor significantly outperforms a single-source, single-strategy hard filtering approach, demonstrating both that multiple sources and strategies outperform a single source and strategy, and that TyCor's fault tolerance yields significantly better performance than a hard filter.
Qunzhi Zhou
Yutaka Matsuo
Lyndon Nixon
Simone Contessa
Ana Armas
Nadeschda Nikitina
Rutgers University
Guillermo Palma
RDFS Reasoning on Massively Parallel Hardware
Recent developments in hardware have shown an increase in parallelism as opposed to clock rates. In order to fully exploit these new avenues of performance improvement, computationally expensive workloads have to be expressed in a way that allows for fine-grained parallelism. In this paper, we address the problem of describing RDFS entailment in such a way. Different from previous work on parallel RDFS reasoning, we assume a shared memory architecture. We analyze the problem of duplicates that naturally occur in RDFS reasoning and develop strategies towards its mitigation, exploiting all levels of our architecture. We implement and evaluate our approach on two real-world datasets and study its performance characteristics on different levels of parallelization. We conclude that RDFS entailment lends itself well to parallelization but can benefit even more from careful optimizations that take into account intricacies of modern parallel hardware.
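To see where the duplicate derivations mentioned above come from, here is a minimal single-threaded sketch of one RDFS rule (rdfs9) run to fixpoint; the set-membership test plays the role of the duplicate elimination the paper parallelizes (illustrative only, not the paper's shared-memory algorithm):

```python
def rdfs9_closure(triples):
    """Materialize rule rdfs9 to fixpoint:
        (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D)
    The membership test on `closed` discards duplicate derivations - the
    problem the paper mitigates at every level of the parallel hardware."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(c, d) for c, p, d in closed if p == "rdfs:subClassOf"]
        typings = [(x, c) for x, p, c in closed if p == "rdf:type"]
        for x, c in typings:
            for c2, d in subclass:
                if c2 == c and (x, "rdf:type", d) not in closed:
                    closed.add((x, "rdf:type", d))
                    changed = True
    return closed
```

Chains of subclasses are handled by the fixpoint loop: an instance of `ex:Dog` first gains `ex:Mammal`, and on the next pass `ex:Animal`.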
University of Graz
Trentino government linked open geo-data: a case study
Our work is set in the public administration domain, where data can come from different entities; can be produced, stored and delivered in different formats; and can have different levels of quality. Such heterogeneity has to be addressed when performing data integration tasks. We report our experimental work on publishing government linked open geo-metadata and geo-data of the Italian Trentino region. Specifically, we illustrate how 161 core geographic datasets were released by leveraging the geo-catalogue application within the existing geo-portal. We discuss the lessons we learned from deploying and using the application, as well as from the released datasets.
Laurens De Vocht
Hong Kong University of Science and Technology
SRI International
Christina Lantzaki
Kai-Uwe Sattler
Fondazione Bruno Kessler
Jens Lehmann
Martin Hepp
Fiona McNeill
Akihiko Ohsuga
University of Koblenz and Landau
Valentina Tamma
Denilson Barbosa
Data.gov, U.S. General Services Administration
Aalto University
Peter Boncz
Charis Kontoes
Chinese Academy of Sciences
CWI Amsterdam
Bojan Bozic
University of California, Santa Barbara
Dimitris Zeginis
Giovanni Semeraro
Pasquale Lops
Vuk Milicic
Carlos Pedrinaci
Sam Coppens
Feroz Farazi
Jean Paul Calbimonte
Hiroyuki Toda
Freddy Lecue
Yuzhong Qu
Pavel Klinov
Yue Ma
Joanne Luciano
Alexandra Poulovassilis
An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices
The primary challenge of machine perception is to define efficient computational methods to derive high-level knowledge from low-level sensor observation data. Emerging solutions are using ontologies for expressive representation of concepts in the domain of sensing and perception, which enable advanced integration and interpretation of heterogeneous sensor data. The computational complexity of OWL, however, seriously limits its applicability and use within resource-constrained environments, such as mobile devices. To overcome this issue, we employ OWL to formally define the inference tasks needed for machine perception - explanation and discrimination - and then provide efficient algorithms for these tasks, using bit-vector encodings and operations. The applicability of our approach to machine perception is evaluated on a smart-phone mobile device, demonstrating dramatic improvements in both efficiency and scale.
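A minimal sketch of how an explanation task can reduce to bit-vector operations (the property inventory and the explanation-to-property mappings below are invented for illustration, not the paper's OWL encoding): each candidate explanation becomes a bit vector over observable properties, and "candidate explains the observations" becomes a subset test done with a single bitwise AND:

```python
PROPERTIES = ["fever", "cough", "headache", "rash"]   # toy observable properties
PIDX = {p: i for i, p in enumerate(PROPERTIES)}

def to_bits(props):
    """Encode a set of observed properties as one integer used as a bit vector."""
    v = 0
    for p in props:
        v |= 1 << PIDX[p]
    return v

# Candidate explanations and the properties each accounts for (all invented).
EXPLANATIONS = {
    "flu": to_bits({"fever", "cough", "headache"}),
    "measles": to_bits({"fever", "cough", "rash"}),
}

def explains(candidate_bits, observed_bits):
    """Subset test via bitwise AND: the candidate explains the observations
    iff it covers every observed property."""
    return observed_bits & candidate_bits == observed_bits

def explanation(observed):
    """Return all candidates that explain the observed property set."""
    obs = to_bits(observed)
    return sorted(name for name, bits in EXPLANATIONS.items() if explains(bits, obs))
```

The appeal on constrained hardware is that the inner test is one AND and one comparison on machine words, instead of a description-logic reasoner call.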
Segreteria SIAT, Autonomous Province of Trento
Oktie Hassanzadeh
Joseph Benik
Axel-Cyrille Ngonga Ngomo
University of California, Davis
Cartic Ramakrishnan
Boris Motik
Mark A. Musen
Alberto Musetti
SRBench: A Streaming RDF/SPARQL Benchmark
We introduce SRBench, a general-purpose benchmark primarily designed for streaming RDF/SPARQL engines, completely based on real-world data sets from the Linked Open Data cloud. With the increasing problem of too much streaming data but not enough tools to gain knowledge from them, researchers have set out for solutions in which Semantic Web technologies are adapted and extended for publishing, sharing, analysing and understanding streaming data. To help researchers and users compare streaming RDF/SPARQL (strRS) engines in a standardised application scenario, we have designed SRBench, with which one can assess the abilities of a strRS engine to cope with a broad range of use cases typically encountered in real-world scenarios. The data sets used in the benchmark have been carefully chosen, such that they represent a realistic and relevant usage of streaming data. The benchmark defines a concise, yet comprehensive set of queries that cover the major aspects of strRS processing. Finally, our work is complemented with a functional evaluation on three representative strRS engines: SPARQLStream, C-SPARQL and CQELS. The presented results are meant to give a first baseline and illustrate the state-of-the-art.
Junfeng Pan
Social and Collaborative Semantics
Lalana Kagal
Michael Sioutis
Ryutaro Ichise
Valentina Presutti
Dimitris Vrakas
Jeanne Holm
Jennifer Sleeman
Discovering Concept Coverings in Ontologies of Linked Data Sources
Despite the recent increase in the number of linked instances in the Linked Data Cloud, the absence of links at the concept level has resulted in heterogeneous schemas, challenging the interoperability goal of the Semantic Web. In this paper, we address this problem by finding alignments between concepts from multiple Linked Data sources. Instead of only considering the existing concepts present in each ontology, we hypothesize new composite concepts defined as disjunctions of conjunctions of (RDF) types and value restrictions, which we call restriction classes, and generate alignments between these composite concepts. This extended concept language enables us to find more complete definitions and even to align sources that have rudimentary ontologies, such as those that are simple renderings of relational databases. Our concept alignment approach is based on analyzing the extensions of these concepts and their linked instances. Having explored the alignment of conjunctive concepts in our previous work, in this paper we focus on concept coverings (disjunctions of restriction classes). We present an evaluation of this new algorithm in the Geospatial, Biological Classification, and Genetics domains. The resulting alignments are useful for refining existing ontologies and determining the alignments between concepts in the ontologies, thus increasing the interoperability in the Linked Open Data Cloud.
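To make the concept language concrete, here is a minimal sketch of how a restriction class (a conjunction of property–value constraints) and a covering (a disjunction of restriction classes) can be evaluated extensionally; the data model, names, and toy data are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: evaluate restriction classes and coverings
# extensionally over toy linked-data instances (property -> value dicts).

def extension(instances, conjunction):
    """IDs of instances satisfying every (property, value) constraint."""
    return {iid for iid, props in instances.items()
            if all(props.get(p) == v for p, v in conjunction)}

def covering_extension(instances, disjunction):
    """Union of the extensions of the restriction classes."""
    ext = set()
    for conjunction in disjunction:
        ext |= extension(instances, conjunction)
    return ext

def covers(instances, disjunction, target):
    """Extensionally, does the disjunction cover the target class?"""
    return extension(instances, target) <= covering_extension(instances, disjunction)

# Toy instances and a covering of "Country" by its per-continent subclasses.
instances = {
    "geo:1": {"rdf:type": "Country", "continent": "Europe"},
    "geo:2": {"rdf:type": "Country", "continent": "Asia"},
    "geo:3": {"rdf:type": "City",    "continent": "Europe"},
}
covering = [
    [("rdf:type", "Country"), ("continent", "Europe")],
    [("rdf:type", "Country"), ("continent", "Asia")],
]
print(covers(instances, covering, [("rdf:type", "Country")]))  # True
```

The actual algorithm additionally scores candidate coverings against linked instances across sources rather than requiring exact set containment.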
Reinvent Technology
L3S Research Center
SAP AG
Stuart Brown
Semantic Web Science Association
ISTI-CNR
Korea Institute of Science and Technology Information
Alessandro Bozzon
Michael Watzke
Fabien Gandon
Anupriya Ankolekar
A Formal Semantics for Weighted Ontology Mappings
Ontology mappings are often assigned a weight or confidence factor by matchers. Nonetheless, few semantic accounts have been given so far for such weights. This paper presents a formal semantics for weighted mappings between different ontologies. It is based on a classificational interpretation of mappings: if O1 and O2 are two ontologies used to classify a common set X, then mappings between O1 and O2 are interpreted to encode how elements of X classified in the concepts of O1 are re-classified in the concepts of O2, and weights are interpreted to measure how precise and complete these re-classifications are. This semantics is justified by the extensional practice of ontology matching. It is a conservative extension of a semantics of crisp mappings. The paper also includes properties that relate mapping entailment with description logic constructors.
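On the classificational reading described above, a weight can be illustrated extensionally as the precision and completeness of re-classification — the following notation is our illustrative rendering, not necessarily the paper's definitions:

```latex
% [[C]] \subseteq X denotes the elements of the shared set X classified under concept C.
\mathrm{prec}(C_1 \rightarrow C_2) = \frac{|[\![C_1]\!] \cap [\![C_2]\!]|}{|[\![C_1]\!]|}
\qquad
\mathrm{compl}(C_1 \rightarrow C_2) = \frac{|[\![C_1]\!] \cap [\![C_2]\!]|}{|[\![C_2]\!]|}
```

On this reading, a precision weight of 1 on a mapping from C1 to C2 says that every element classified under C1 in O1 is re-classified under C2 in O2.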
Rik Van de Walle
Massimo Paolucci
A Machine Learning Approach for Instance Matching Based on Similarity Metrics
The Linking Open Data (LOD) project is an ongoing effort to construct a global data space, i.e. the Web of Data. One important part of this project is to establish owl:sameAs links among structured data sources. Such links indicate equivalent instances that refer to the same real-world object. The problem of discovering owl:sameAs links between pairs of data sources is called instance matching. Most of the existing approaches addressing this problem rely on the quality of prior schema matching, which is not always good enough in the LOD scenario. In this paper, we propose a schema-independent instance-pair similarity metric based on several general descriptive features. We transform the instance matching problem into a binary classification problem and solve it with machine learning algorithms. Furthermore, we employ transfer learning methods to utilize the existing owl:sameAs links in LOD to reduce the demand for labeled data. We carry out experiments on several datasets of OAEI 2010. The results show that our method performs well on real-world LOD data and outperforms the participants of OAEI 2010.
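The pipeline the abstract describes — schema-independent similarity features for an instance pair, fed to a binary classifier — can be sketched as follows; the specific features, toy data, and hand-rolled logistic regression are our illustrative choices, not the paper's.

```python
import math
import re

def literal_tokens(instance):
    """Schema-independent view of an instance: tokens over all literal values."""
    toks = set()
    for value in instance.values():
        toks |= set(re.findall(r"\w+", value.lower()))
    return toks

def features(inst_a, inst_b):
    """Two generic descriptive features for an instance pair."""
    ta, tb = literal_tokens(inst_a), literal_tokens(inst_b)
    inter = len(ta & tb)
    jaccard = inter / (len(ta | tb) or 1)
    containment = inter / (min(len(ta), len(tb)) or 1)
    return [jaccard, containment]

def train_logistic(pairs, epochs=500, lr=0.5):
    """Tiny logistic regression trained on labeled instance pairs."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (ia, ib), label in pairs:
            x = features(ia, ib)
            p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
            g = p - label  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def is_match(w, b, ia, ib):
    z = sum(wi * xi for wi, xi in zip(w, features(ia, ib))) + b
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5

labeled = [  # toy owl:sameAs training pairs: (instance pair, 1 = match)
    (({"name": "Berlin", "country": "Germany"},
      {"label": "berlin", "state": "germany"}), 1),
    (({"name": "Paris", "country": "France"},
      {"label": "berlin", "state": "germany"}), 0),
]
w, b = train_logistic(labeled)
print(is_match(w, b, {"name": "Rome", "country": "Italy"},
               {"label": "rome italy"}))  # True
```

Because the features only compare bags of literal tokens, the classifier needs no prior schema matching, which is the point of the schema-independent metric.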
Chris Welty
Luc Moreau
Coffee Break
Yulan He
Archana Venbakam
Sandro Hawke
University of Chile
Yuki Yamagata
Instance-Based Matching of Large Ontologies Using Locality-Sensitive Hashing
In this paper, we describe a mechanism for ontology alignment using instance-based matching of types (or classes). Instance-based matching is known to be a useful technique for matching ontologies that have different names and different structures. A key problem in instance matching of types, however, is scaling the matching algorithm to (a) handle types with a large number of instances, and (b) efficiently match a large number of type pairs. We propose the use of state-of-the-art locality-sensitive hashing (LSH) techniques to vastly improve the scalability of instance matching across multiple types. We show the feasibility of our approach with DBpedia and Freebase, two type systems with hundreds and thousands of types, respectively. We describe how these techniques can be used to estimate containment or equivalence relations between two type systems, and we compare two different LSH techniques for computing instance similarity.
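The core idea — comparing type extensions via MinHash signatures and LSH banding instead of full set intersection — can be sketched in a few lines; this is the generic MinHash/banding scheme, not necessarily the paper's exact variants.

```python
# Generic MinHash + banding sketch for comparing type extensions.

def minhash_signature(items, seeds):
    """MinHash signature: for each seeded hash, keep the minimum over the set."""
    return [min(hash((seed, item)) for item in items) for seed in seeds]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing signature positions estimates |A ∩ B| / |A ∪ B|."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def lsh_bands(sig, bands, rows):
    """Hash each band of the signature; equal band hashes -> candidate pair."""
    return [hash(tuple(sig[i * rows:(i + 1) * rows])) for i in range(bands)]

def candidate_pair(sig_a, sig_b, bands, rows):
    return any(x == y for x, y in zip(lsh_bands(sig_a, bands, rows),
                                      lsh_bands(sig_b, bands, rows)))

# Toy "type extensions": two types whose instance sets overlap by 50 of 150
# items, so the true Jaccard similarity is 50 / 150 = 1/3.
type_a = {f"inst{i}" for i in range(100)}
type_b = {f"inst{i}" for i in range(50, 150)}
seeds = range(128)
sig_a = minhash_signature(type_a, seeds)
sig_b = minhash_signature(type_b, seeds)
print(round(estimated_jaccard(sig_a, sig_b), 2))  # close to 0.33
```

Signatures are computed once per type, so comparing all type pairs costs O(signature length) per pair rather than an intersection over possibly millions of instances; banding further prunes pairs that share no band.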
Orri Erling
Microsoft Research
Tomáš Knap
Andrew Sellers
Thomas Krennwallner
Evelyne Viegas
Shashank Tyagi
Yannis Kalfoglou
Blank Node Matching and RDF/S Comparison Functions
In RDF, a blank node (or anonymous resource, or bnode) is a node in an RDF graph that is neither identified by a URI nor a literal. Several RDF/S Knowledge Bases (KBs) rely heavily on blank nodes, as they are convenient for representing complex attributes or resources whose identity is unknown but whose attributes (either literals or associations with other resources) are known. In this paper we show how we can exploit blank node anonymity to reduce the delta (diff) size when comparing such KBs. The main idea of the proposed method is to build a mapping between the bnodes of the compared KBs that reduces the delta size. We prove that finding the optimal mapping is NP-hard in the general case and polynomial when no bnodes are directly connected. Subsequently we present various polynomial algorithms returning approximate solutions for the general case. To make our method applicable also to large KBs, we present a signature-based mapping algorithm with n log n complexity. Finally, we report experimental results over real and synthetic datasets that demonstrate significant reductions in the sizes of the computed deltas. For the proposed algorithms we also provide comparative results regarding delta reduction, equivalence detection and time efficiency.
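A signature-based bnode mapping of the kind the abstract mentions can be illustrated as follows — pair bnodes across the two KBs that have the same neighbourhood signature, then diff after renaming. The signature definition and greedy pairing here are simplified assumptions, not the paper's algorithm.

```python
from collections import defaultdict

def bnodes(kb):
    """All blank nodes ('_:'-prefixed terms) occurring in a triple set."""
    return {n for s, p, o in kb for n in (s, o) if n.startswith("_:")}

def signature(kb, b):
    """The bnode's incident triples, with bnodes replaced by a placeholder."""
    sig = []
    for s, p, o in kb:
        if s == b or o == b:
            sig.append((s if s != b and not s.startswith("_:") else "*", p,
                        o if o != b and not o.startswith("_:") else "*"))
    return tuple(sorted(sig))

def map_bnodes(kb1, kb2):
    """Greedily pair bnodes across the KBs that share a signature."""
    by_sig = defaultdict(list)
    for b in bnodes(kb2):
        by_sig[signature(kb2, b)].append(b)
    mapping = {}
    for b in bnodes(kb1):
        bucket = by_sig.get(signature(kb1, b))
        if bucket:
            mapping[b] = bucket.pop()
    return mapping

def delta_size(kb1, kb2, mapping):
    """Number of triples differing after renaming kb1's bnodes via the mapping."""
    ren = lambda n: mapping.get(n, n)
    kb1r = {(ren(s), p, ren(o)) for s, p, o in kb1}
    return len(kb1r ^ set(kb2))

kb1 = {("ex:alice", "ex:addr", "_:a1"), ("_:a1", "ex:city", "ex:Athens")}
kb2 = {("ex:alice", "ex:addr", "_:b7"), ("_:b7", "ex:city", "ex:Athens")}
m = map_bnodes(kb1, kb2)
print(delta_size(kb1, kb2, m), delta_size(kb1, kb2, {}))  # 0 4
```

With the identity-agnostic mapping the two KBs diff to an empty delta, whereas treating bnode labels as significant reports four spurious changes — the delta reduction the paper quantifies at scale.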
Cynthia Chang
Walter Bender
Centre for Research and Technology Hellas
Efstratios Kontopoulos
Hiroko Kou
Achille Fokoue
Strabon: A Semantic Geospatial DBMS
We present Strabon, a new RDF store that supports the state-of-the-art semantic geospatial query languages stSPARQL and GeoSPARQL. To illustrate the expressive power offered by these query languages and their implementation in Strabon, we concentrate on the new versions of the data model stRDF and the query language stSPARQL that we have developed ourselves. Like GeoSPARQL, these new versions use OGC standards to represent geometries, where the original versions used linear constraints. We study the performance of Strabon experimentally and show that it scales to very large data volumes and, in most cases, performs better than all other geospatial RDF stores it has been compared with.
Kemafor Anyanwu
Karin Breitman
Chen Cheng
Fausto Giunchiglia
INRIA
In-Use Track
Lunch
Mikko Rinne
Romina Spalazzese
Karl Aberer
Emir Muñoz
schema.org update
Siemens
Ralf Heese
Thomas Hubauer
Anastasios Kementsietsidis
University of Bari
Stefan Schlobach
Claus Stadler
Stefan Mirea
Michael Martin
Bo Fu
Christophe Guéret
Oracle
Ontotext
Towards Licenses Compatibility and Composition in the Web of Data
We propose a general framework for attaching licensing terms to data, in which the compatibility of the licensing terms of the data affected by a query is verified and, if they are compatible, the licenses are combined into a composite license. The framework returns the composite license as the licensing term for the data resulting from the query.
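A toy rendering of the compatibility-then-composition step may help fix the idea; the permission/condition vocabulary and the intersection/union rules below are illustrative assumptions, not the framework's actual formalisation.

```python
# Toy model: a license grants a set of permissions and imposes a set
# of conditions; composition intersects the former and unions the latter.
from dataclasses import dataclass

@dataclass(frozen=True)
class License:
    name: str
    permissions: frozenset  # actions the license grants
    conditions: frozenset   # obligations the license imposes

def compatible(a, b):
    # In this sketch, two licenses are compatible if at least one action
    # is permitted by both, so the composite license is non-vacuous.
    return bool(a.permissions & b.permissions)

def compose(a, b):
    # Composite license over a query result drawing on both sources:
    # only what both allow is permitted, and every obligation carries over.
    return License(f"{a.name}+{b.name}",
                   a.permissions & b.permissions,
                   a.conditions | b.conditions)

cc_by = License("CC-BY",
                frozenset({"reproduce", "distribute", "modify"}),
                frozenset({"attribution"}))
cc_by_sa = License("CC-BY-SA",
                   frozenset({"reproduce", "distribute", "modify"}),
                   frozenset({"attribution", "share-alike"}))
if compatible(cc_by, cc_by_sa):
    combined = compose(cc_by, cc_by_sa)
    print(sorted(combined.conditions))  # both obligations carry over
```

Real license texts need a richer model (prohibitions, jurisdiction, share-alike interactions), which is precisely what makes the compatibility check non-trivial in the framework.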
Prasenjit Mitra
Toshio Uchiyama
Daniel Schwabe
Matthew Horridge
George Garbis
Alcatel-Lucent Bell Labs France
Alex Stolz
Michiel Hildebrand
University of Bremen
TwikiMe! - User profiles that make sense
The use of social media has been increasing rapidly in recent years, and social media such as Twitter have become an important source of information for a variety of people. The public availability of data describing some of these social networks has led to a lot of research in this area; link prediction, user classification and community detection are some of the main research topics related to social networks. In this paper we present a user modeling framework that uses Wikipedia as a frame to model user interests inside a social network. Our fine-grained model of user interests reflects both the areas a user is interested in and the level of expertise a user has in a certain field.
Evan Wei Xiang
Sudeshna Das
Marta Corubolo
Thomas Bosch
Sara Magliacane
Bernie Innocenti
3rd Workshop on the Multilingual Semantic Web
Create-Net
University of Arkansas for Medical Sciences
Eric Charton
Ontology Engineering and Optimization
John Yu
SW Journal Lunch
Anastasia Analyti
Information Extraction
University of Pittsburgh
Derivo GmbH
Austrian Institute of Technology
Yong Yu
Ruben Verborgh
Thomas Lukasiewicz
Concordia University
University of Electro-Communications
Ontology Mapping
Ondrej Svab-Zamazal
Oak Ridge National Laboratory
University of Waikato
National Institute of Information and Communications Technology
3rd Workshop on Ontology Patterns
Scalability and Parallel Processing
David Wood
Paul Alexander
Andreas Hotho
Alasdair J. G. Gray
Royal Military College of Canada
Han Wang
Stefan Dietze
Cross Lingual Semantic Search by Improving Semantic Similarity and Relatedness Measures
Since 2001, the semantic web community has been working hard towards creating standards that will increase the accessibility of the information available on the web. Yahoo! Research recently reported that 30% of all HTML pages contain structured data such as microdata, RDFa, or microformats. Although the multilinguality of the web is a hurdle to information access, the rapid growth of the semantic web enables us to retrieve fine-grained information across the language barrier. In this thesis, we focus first on developing a methodology to perform cross-lingual semantic search over structured data (knowledge bases) by transforming natural language queries into SPARQL, and second on improving semantic similarity and relatedness measures to overcome the semantic gap between the vocabulary in the knowledge base and the terms appearing in a query. The preliminary results are evaluated against the QALD-2 test dataset, achieving an F1 score of 0.46, an average precision of 0.44, and an average recall of 0.48.
Maria-Esther Vidal
Opher Etzion
Andreas Thalhammer
University of Trento
University of Zaragoza
Erik Mannens
Semantic Web Company GmbH
Shu Rong
SAP Labs USA
Daniel Miranker
Ora Lassila
MIT Sloan School of Management
Dezhao Song
NTT
Thomas Schandl
Yasuhiro Fujiwara
Marco Luca Sbodio
Josiane Xavier Parreira
Pieterjan De Potter
Burst the Filter Bubble: Using Semantic Web to Enable Serendipity
Personalization techniques aim to help people deal with the ever-growing amount of information by filtering it according to their interests. However, in avoiding information overload, such techniques often create an over-personalization effect, i.e. users are exposed only to content the systems assume they would like. To break this "personalization bubble" we introduce the notion of serendipity as a performance measure for recommendation algorithms. To this end, we first identify aspects of the user perspective that can determine the level and type of serendipity desired by users. Then, we propose a user model that can accommodate such user requirements and enable serendipitous recommendations. The use case for this work focuses on TV recommender systems; however, the ultimate goal is to explore the transferability of this method to different domains. This paper covers the work done in the first eight months of research and describes the plan for the entire PhD trajectory.
Alpen-Adria-Universität Klagenfurt
Chris Mellish
Bernardo Magnini
Felix Sasaki
Thomas Gottron
CRIM
Jie Bao
Ian Millard
Quality Reasoning in the Semantic Web
Assessing the quality of data published on the Web has been identified as an essential step in selecting reliable information for use in tasks such as decision making. This paper discusses a quality assessment framework based on semantic web technologies and outlines a role for provenance in supporting and documenting such assessments.
University of Hanover
Gabriela Montoya
Timothy Lebo
Poster, Demo, and SWC Minute Madness
Tokyo Institute of Technology
Kavi Mahesh
University of Texas at Austin
Serena Villata
Emanuele Della Valle
University of L'Aquila
Shenghui Wang
Hans Uszkoreit
Harald Sack
Yan Kang
Federated and Stream Query Processing
Lin Clark
University of Maryland, Baltimore County
Steffen Staab
Fabrice Lacroix
Paul Groth
Université libre de Bruxelles
Makoto Nakatsuji
Evaluation of reasoning with ontologies
Wim Peters
Dimitris Plexousakis
Sergio Tessaris
Tom Heath
Andrea Giovanni Nuzzolese
Kristian Slabbekoorn
Ivan Cantador
Oshani Seneviratne
Jit Biswas
Vinay Chaudhri
Search, question answering and entity summarization
Feng Cao
Nigam Shah
Tudor Groza
Amal Zouaq
Ioannis Papoutsis
Wolf Siberski
Stefan Decker
ISI/USC
E.A. Draffan
Gerhard Friedrich
National Observatory of Athens
Mariano Rodriguez-Muro
Mariano Rodriguez-Muro
Mariano Rodriguez-Muro
University of Bologna
University of Bologna
University of Bologna
Damian Gessler
Damian Gessler
Damian Gessler
The Semantic Web and Collective Intelligence
The original vision of the Semantic Web was to encode semantic content on the web in a form with which machines can reason. But in the last few years, we've seen many new Internet-based applications (such as Wikipedia, Linux, and prediction markets) where the key reasoning is done, not by machines, but by large groups of people. This talk will show how a relatively small set of design patterns can help understand a wide variety of these examples. Each design pattern is useful in different conditions, and the patterns can be combined in different ways to create different kinds of collective intelligence. Building on this foundation, the talk will consider how the Semantic Web might contribute to - and benefit from - these more human-intensive forms of collective intelligence.
Nokia
Nenad Stojanovic
Chito Jovellanos
Alexander Ivanyukovich
Claude Bernard University Lyon 1
Dario Cerizza
Milan Dojchinovski
Free University of Berlin
Paulo Costa
Linking Smart Cities Datasets with Human Computation - the case of UrbanMatch
To realize the Smart Cities vision, applications can leverage the large availability of open datasets related to urban environments. Those datasets need to be integrated, but it is often hard to automatically achieve a high-quality interlinkage. Human Computation approaches can be employed to solve such a task where machines are ineffective. We argue that in this case not only people's background knowledge is useful to solve the task, but also people's physical presence and direct experience can be successfully exploited. In this paper we present UrbanMatch, a Game with a Purpose for players in mobility aimed at validating links between points of interest and their photos; we discuss the design choices and we show the high throughput and accuracy achieved in the interlinking task.
Stony Brook University
Themistoklis Herekakis
University of Sheffield
Szymon Klarman
Giovanni Tummarello
Steffen Stadtmüller
Raghava Mutharaju
Provenance for SPARQL queries
Determining trust of data available in the Semantic Web is fundamental for applications and users, in particular for linked open data obtained from SPARQL endpoints. There exist several proposals in the literature to annotate SPARQL query results with values from abstract models, adapting the seminal works on provenance for annotated relational databases. We provide an approach capable of providing provenance information for a large and significant fragment of SPARQL 1.1, including for the first time the major non-monotonic constructs under multiset semantics. The approach is based on the translation of SPARQL into relational queries over annotated relations with values of the most general m-semiring, and in this way also refuting a claim in the literature that the OPTIONAL construct of SPARQL cannot be captured appropriately with the known abstract models.
Wells Fargo
Tackling Climate Change: Unfinished Business from the Last "Winter"
In the 1990s, as the World Wide Web became not only world wide but also dense and ubiquitous, workers in the artificial intelligence community were drawn to the possibility that the Web could provide the foundation for a new kind of AI. Having survived the AI Winter of the 1980s, the opportunities that they saw in the largest, most interconnected computing platform imaginable were obviously compelling. With the subsequent success of the Semantic Web, however, our community seems to have stopped talking about many of the issues that researchers believe led to the AI Winter in the first place: the cognitive challenges in debugging and maintaining complex systems, the drift in the meanings ascribed to symbols, the situated nature of knowledge, the fundamental difficulty of creating robust models. These challenges are still with us; we cannot wish them away with appeals to the open-world assumption or to the law of large numbers. Embracing these challenges will allow us to expand the scope of our science and our practice, and will help to bring us closer to the ultimate vision of the Semantic Web.
Avigdor Gal
Vassilis Christophides
Jay Banerjee
Paolo Bouquet
Coffee Break
Polytechnic University of Bari
Metaweb
Olivier Curé
Ilianna Kollia
Franz, Inc.
Cecile Bothorel
Craig Knoblock
Bryan Thompson
DeFacto - Deep Fact Validation
One of the main tasks when creating and maintaining knowledge bases is to validate facts and provide sources for them in order to ensure correctness and traceability of the provided knowledge. So far, this task is often addressed by human curators in a three-step process: issuing appropriate keyword queries for the statement to check using standard search engines, retrieving potentially relevant documents and screening those documents for relevant content. The drawbacks of this process are manifold. Most importantly, it is very time-consuming as the experts have to carry out several search processes and must often read several documents. In this article, we present DeFacto (Deep Fact Validation) - an algorithm for validating facts by finding trustworthy sources for them on the Web. DeFacto aims to provide an effective way of validating facts by supplying the user with relevant excerpts of webpages as well as useful additional information including a score for the confidence DeFacto has in the correctness of the input fact.
Davy Van Deursen
Davy Van Deursen
Davy Van Deursen
Driving Innovation with Open Data and Interoperability
Data.gov, a flagship open government project from the US government, opens and shares data to improve government efficiency and drive innovation. Sharing such data allows us to make rich comparisons that could never be made before and helps us to better understand the data and support decision making. The adoption of open linked data, vocabularies and ontologies, the work of the W3C, and semantic technologies is helping to drive Data.gov and US data forward. This session will help us to better understand the changing global landscape of data sharing and the role the semantic web is playing in it. This session highlights specific data sharing examples of solving mission problems from NASA, the White House, and many other government agencies and citizen innovators.
ISTC-CNR
Antoine Isaac
Birte Glimm
Ken Barker
Daniele Dell'Aglio
Shonali Krishnaswamy
Hasso Plattner Institute
Coffee Break
Fernando Bobillo
Vienna University of Technology
Gerd Gröner
Christian Bizer
Stephan Grimm
Magnus White
Coffee Break
University of Southampton
Technical University of Madrid
Myunggwon Hwang
Cardiff University
Paulo Pinheiro Da Silva
Iowa State University
Umberto Straccia
Kendall Clark
Mohamed Morsey
Coffee Break
University of Ulm
Patrick Sinclair
Evaluations and Experiments Track
National Institute of Informatics
Konrad Höffner
Lunch
Victor de Boer
Manuel Atencia