52°North awards the Student Innovation Prize to Simeon Wetzel for the successful implementation of his concept for geodata search and discovery in spatial data infrastructures (SDIs).
The latest edition of the 52°North Student Innovation Prize recognizes the proposal and implementation of “LLM-based Assistants for Data Discovery in SDIs” by Simeon Wetzel, a PhD student at TU Dresden. His work presents an innovative approach to overcoming existing challenges in geospatial metadata search architectures.
In his approach, Simeon points out the limitations of traditional metadata search and discovery, which relies on lexical methods implemented through metadata catalogs or portals such as CKAN and GeoNetwork. Such systems often fail to handle vocabulary mismatches and users' unfamiliarity with specific terminology, particularly in interdisciplinary contexts. In addition, metadata quality issues and incomplete representation of datasets frequently hinder the identification of data that meets user requirements.
To address these gaps, the proposed framework draws on recent advances in neural networks and language models. By integrating open-source Large Language Models (LLMs) such as Llama or Mixtral, the framework combines a chatbot interface with a semantic search index tailored to geodata and metadata. This dual approach enables the system to capture user intent more accurately through interactive dialog and to perform semantic searches that analyze both metadata and actual geodata attributes.
For instance, when searching for “hospitals with emergency rooms,” the framework goes beyond matching metadata keywords. It identifies relevant geodata attributes and uses dialog-driven interactions to further refine search criteria, such as requirements for specialty departments. This yields a closer match between user needs and available datasets.
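The vocabulary mismatch described above can be illustrated with a small sketch. This is not Simeon's implementation: the `Dataset` class, the scoring functions, the sample records, and the hand-written `CONCEPTS` table are all hypothetical, and the table merely stands in for the learned language-model representations a real semantic index would use. The sketch only shows why a keyword match on metadata can miss a dataset that a concept-level match over metadata and geodata attributes would find.

```python
import re
from dataclasses import dataclass, field

# Hypothetical concept table: a hand-written stand-in for a learned
# embedding model, mapping surface words to shared concepts.
CONCEPTS = {
    "hospital": "healthcare_facility",
    "hospitals": "healthcare_facility",
    "clinic": "healthcare_facility",
    "clinics": "healthcare_facility",
    "emergency": "emergency_care",
    "trauma": "emergency_care",
}


@dataclass
class Dataset:
    title: str
    keywords: list
    # Column names of the actual geodata, not part of the catalog metadata.
    attributes: list = field(default_factory=list)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(query, ds):
    """Count query words that literally appear in the title or keywords."""
    metadata_tokens = set(tokenize(ds.title + " " + " ".join(ds.keywords)))
    return len(set(tokenize(query)) & metadata_tokens)


def concepts(tokens):
    """Map tokens to the concepts they express, ignoring unknown words."""
    return {CONCEPTS[t] for t in tokens if t in CONCEPTS}


def semantic_score(query, ds):
    """Count concepts shared by the query and the metadata plus attributes."""
    text = " ".join([ds.title, *ds.keywords, *ds.attributes])
    return len(concepts(tokenize(query)) & concepts(tokenize(text)))


# Made-up sample records for illustration only.
clinics = Dataset("Clinics in Saxony", ["health", "medical"],
                  ["name", "emergency_dept", "beds"])
roads = Dataset("Road network", ["transport"], ["road_class", "lanes"])

query = "hospitals with emergency rooms"
best = max([clinics, roads], key=lambda ds: semantic_score(query, ds))
print(best.title, lexical_score(query, best), semantic_score(query, best))
```

In this toy setup the keyword match finds nothing in common between the query and the clinics record, while the concept-level match links “hospitals” to “clinics” and picks up the `emergency_dept` column from the data itself. The dialog component described above, which refines the query interactively, is outside the scope of this sketch.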
Simeon’s proposed framework promises significant benefits to both the geoinformatics and research data communities. By facilitating intuitive and context-aware searches, it lowers the barriers to entry for users from diverse backgrounds and increases the accessibility of high-quality geodata. Moreover, it creates new opportunities for integrating semantic technologies and interactive AI systems into existing geospatial infrastructures, a step toward more intelligent and user-centric research data portals.
Workshop: LLM-based Assistants for Data Discovery in SDIs
52°North recently held a workshop in which Simeon presented his concept, which uses Large Language Models and other AI/ML patterns to automatically derive meaningful metadata from datasets. Participants had the opportunity to examine the technologies and exchange ideas.
Simeon will also give a presentation of his work at the upcoming FOSSGIS 2025 conference in Münster, Germany.