Browsing by Author "Diego Collarana"

A neuro-symbolic system over knowledge graphs for link prediction
(IOS Press, 2023) Ariam Rivas; Diego Collarana; María Torrente; María-Esther Vidal
Neuro-Symbolic Artificial Intelligence (AI) focuses on integrating symbolic and sub-symbolic systems to enhance the performance and explainability of predictive models. Symbolic and sub-symbolic approaches differ fundamentally in how they represent data and make use of data features to reach conclusions. Neuro-symbolic systems have recently received significant attention in the scientific community. However, despite efforts in neural-symbolic integration, symbolic processing can still be better exploited, mainly when these hybrid approaches are defined on top of knowledge graphs. This work is built on the statement that knowledge graphs can naturally represent the convergence between data and their contextual meaning (i.e., knowledge). We propose a hybrid system that resorts to symbolic reasoning, expressed as a deductive database, to augment the contextual meaning of entities in a knowledge graph, thereby improving the performance of link prediction implemented using knowledge graph embedding (KGE) models. An entity context is defined as the ego network of the entity in a knowledge graph. Given a link prediction task, the proposed approach deduces new RDF triples in the ego networks of the entities corresponding to the heads and tails of the prediction task on the knowledge graph (KG). Since knowledge graphs may be incomplete and sparse, the facts deduced by the symbolic system not only reduce sparsity but also make explicit meaningful relations among the entities that compose an entity ego network. As a proof of concept, our approach is applied over a KG for lung cancer to predict treatment effectiveness. The empirical results put the deduction power of deductive databases into perspective. They indicate that making explicit deduced relationships in the ego networks empowers all the studied KGE models to generate more accurate links.
ABEL: Artificial Buddy for Effective Learning
(Springer Science+Business Media, 2026) T. Y. Emmy Lai; Ann-Kathrin Bernards; Dena Baghery; Marlena Flüh; Tobias Lang; Héctor Allende-Cid; Diego Collarana
Embedding Knowledge Graphs Attentive to Positional and Centrality Qualities
(Springer Science+Business Media, 2021) Afshin Sadeghi; Diego Collarana; Damien Graux; Jens Lehmann
Formal Concept Analysis for Semantic Compression of Knowledge Graph Versions
(Centre National de la Recherche Scientifique, 2021) Damien Graux; Diego Collarana; Fabrizio Orlandi
Recent years have witnessed an increase in openly available knowledge graphs online. These graphs are often structured according to the W3C semantic web standard RDF. With this availability of information comes the challenge of coping with dataset versions, as information may change over time and thereby deprecate the former knowledge graph. Several solutions have been proposed to deal with data versioning, mainly based on computing data deltas and having an incremental approach to keep track of the version history. In this article, we describe a novel method that relies on aggregating graph versions to obtain one single complete graph. Our solution semantically compresses similar and common edges together to obtain a final graph smaller than the sum of the distinct versioned ones. Technically, our method takes advantage of Formal Concept Analysis (FCA) to match graph elements together. We also describe how this compressed graph can be queried without being unzipped, using standard methods.
GRAFT - Graph Retrieval Augmented Generation Fine-Tuning Approach
(2025) Moritz Busch; Giuliana Defilippis; Philipp Weiß; Christian Beecks; Stefan Decker; Diego Collarana
Knowledge Graph Construction from Materials Science Literature Using Large Language Models and Advanced Data Preprocessing
(2026) Nasrin Mohammadi; Max Dreger; Diego Collarana; Mohammad J. Eslamibidgoli; Kourosh Malek; M. Eikerling
In this work, we present a pipeline for automated knowledge graph construction from materials science literature using large language models (LLMs). The proposed method performs entity and relationship extraction guided by a data model based on the logic of the Elementary Multiperspective Material Ontology (EMMO), structuring the output into a machine-interpretable graph format. The pipeline integrates several key components, including prompt-based extraction, a hierarchical chunking strategy that leverages document structure and section headers, and post-processing steps such as normalization, LLM-assisted deduplication, and alignment of node identifiers. A central focus of this study is the evaluation of different chunking strategies. Specifically, we compare fixed-size splitting with a hierarchical chunking approach that incorporates document structure and header information. Our results show that hierarchical chunking consistently outperforms fixed-size chunking across both entity and relationship extraction tasks, achieving higher precision, recall, and F1 scores through more context-aware segmentation. Extracted entities and relationships are aligned with a curated ground truth dataset through manual verification to ensure semantic correctness. Overall, these findings indicate that LLMs, when combined with domain-specific ontological guidance and well-designed pre- and post-processing, can effectively extract high-quality knowledge graphs from complex materials science literature. This benefits materials scientists and researchers by reducing manual curation effort and accelerating data-driven materials discovery.
QALD-9-ES: A Spanish Dataset for Question Answering Systems
(2023) Javier Soruco; Diego Collarana; Andreas Both; Ricardo Usbeck
Knowledge Graph Question Answering (KGQA) systems enable access to semantic information for any user who can compose a question in natural language. KGQA systems are now a core component of many industrial applications, including chatbots and conversational search applications. Although distinct worldwide cultures speak different languages, the number of languages covered by KGQA systems and their resources is mainly limited to English. To implement KGQA systems worldwide, we need to expand the current KGQA resources to languages other than English. Taking into account the recent popularity that Large-Scale Language Models are receiving, we believe that providing quality resources is key to the development of future pipelines. One of these resources is the datasets used to train and test KGQA systems. Among the few multilingual KGQA datasets available, only one covers Spanish, i.e., QALD-9. We reviewed the Spanish translations in the QALD-9 dataset and confirmed several issues that may affect the KGQA system’s quality. Taking this into account, we created new Spanish translations for this dataset and reviewed them manually with the help of native speakers. This dataset provides newly created, high-quality translations for QALD-9; we call this extension QALD-9-ES. We merged these translations into the QALD-9-plus dataset, which provides trustworthy native translations for QALD-9 in nine languages, intending to create one complete source of high-quality translations. We compared the new translations with the QALD-9 original ones using language-agnostic quantitative text analysis measures and found improvements in the results of the new translations. Finally, we compared both translations with the GERBIL QA benchmark framework, using a KGQA system that supports Spanish. Although the question-answering scores only improved slightly, we believe that improving the quality of the existing translations will result in better KGQA systems and therefore increase the applicability of KGQA w.r.t. the Spanish language domain.
Semantic Intelligence: Graph RAG-Driven Agents for Time Series Analytics
(Springer Science+Business Media, 2025) Alexander Graß; Charles D. Pack; Diego Collarana; Stefan Decker; Christian Beecks
Spatial concept learning and inference on geospatial polygon data
(Elsevier BV, 2022) Patrick Westphal; Tobias Grubenmann; Diego Collarana; Simon Bin; Lorenz Bühmann; Jens Lehmann
Study-Buddy: A Knowledge Graph-Powered Learning Companion for School Students
(Springer Science+Business Media, 2023) Fernanda Martínez; Diego Collarana; Davide Calvaresi; Martin Arispe; Carla Florida; Jean-Paul Calbimonte