Andean Publishing

Browsing by Author "Max Dreger"

    KNOWLEDGE GRAPH CONSTRUCTION FROM MATERIALS SCIENCE LITERATURE USING LARGE LANGUAGE MODELS AND ADVANCED DATA PREPROCESSING
    (2026) Nasrin Mohammadi; Max Dreger; Diego Collarana; Mohammad J. Eslamibidgoli; Kourosh Malek; M. Eikerling
    In this work, we present a pipeline for automated knowledge graph construction from materials science literature using large language models (LLMs). The proposed method performs entity and relationship extraction guided by a data model based on the logic of the Elementary Multiperspective Material Ontology (EMMO), structuring the output into a machine-interpretable graph format. The pipeline integrates several key components, including prompt-based extraction, a hierarchical chunking strategy that leverages document structure and section headers, and post-processing steps such as normalization, LLM-assisted deduplication, and alignment of node identifiers. A central focus of this study is the evaluation of different chunking strategies. Specifically, we compare fixed-size splitting with a hierarchical chunking approach that incorporates document structure and header information. Our results show that hierarchical chunking consistently outperforms fixed-size chunking across both entity and relationship extraction tasks, achieving higher precision, recall, and F1 scores through more context-aware segmentation. Extracted entities and relationships are aligned with a curated ground truth dataset through manual verification to ensure semantic correctness. Overall, these findings indicate that LLMs, when combined with domain-specific ontological guidance and well-designed pre- and post-processing, can effectively extract high-quality knowledge graphs from complex materials science literature. This benefits materials scientists and researchers by reducing manual curation effort and accelerating data-driven materials discovery.
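The abstract contrasts fixed-size splitting with hierarchical chunking but gives no implementation details. The following is only a minimal sketch of the difference, not the authors' pipeline. It assumes the input has already been converted to text with Markdown-style `#` headers; the names `Chunk`, `fixed_size_chunks`, and `hierarchical_chunks`, the 200-character default window, and the sample document are all hypothetical and chosen purely for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    headers: tuple  # path of section headers leading to this chunk
    text: str       # body text of the section


def fixed_size_chunks(text, size=200):
    """Baseline: cut the text into consecutive windows of `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Assumption: headers look like "# Title" or "## Subtitle".
HEADER_RE = re.compile(r"^(#+)\s+(.*)$")


def hierarchical_chunks(text):
    """Split at section headers and attach the header path to each chunk."""
    chunks = []
    path = []  # stack of (level, title) for the currently open sections
    body = []  # body lines collected since the last header

    def flush():
        content = " ".join(line.strip() for line in body if line.strip())
        if content:
            chunks.append(Chunk(tuple(title for _, title in path), content))
        body.clear()

    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Close every open section at the same or a deeper level.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


# Invented sample text, used only to exercise the two functions.
doc = """# Experimental
## Catalyst synthesis
The catalyst was prepared by impregnation.
## Electrochemical testing
Polarization curves were recorded for each cell.
# Results
The cell performance improved after conditioning.
"""

chunks = hierarchical_chunks(doc)
for chunk in chunks:
    print(" > ".join(chunk.headers), "|", chunk.text)
```

In this sketch each hierarchical chunk carries its header path (for example, the section and subsection titles), which is one plausible way to supply the section context that the abstract credits for more context-aware segmentation. The fixed-size baseline, by contrast, can cut a sentence or section anywhere.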

Andean Library © 2026 · Andean Publishing
