AI Strategy for Earth System Data
The KI:STE developments will facilitate the use of ML and AI methods for spatial data analysis applications.
Artificial Intelligence (AI) methods are rapidly evolving and increasingly being used in the context of environmental data. However, this often occurs in isolated solutions. The environmental and earth system sciences have yet to establish the systematic use of modern AI methods. In particular, there is a discrepancy between the requirements of solid and technically sound environmental data analysis and the applicability of modern AI methods such as Deep Learning for researchers.
The KI:STE project aimed to facilitate and evaluate the use of AI for remote sensing of Earth Observation data for a range of applications. The fields studied in the project ranged from air quality to clouds and radiation, to landslides and natural hazards, and water that drives vegetation, closing the loop with air quality. A key focus was not only to adopt and apply AI concepts to these areas, but also to train several PhD students and build an e-learning platform. This made the algorithms and tools developed more accessible to a wider audience, from scientists to practitioners.
The KI:STE project was completed at the end of 2023. The GeoNode dashboard features were extended to include multi-line plots and sunburst plots. Further improvements were made to the map view to increase its usability for the wide range of geodata used and produced in KI:STE. Our team also implemented an integration of data sources that are not available through interoperability standards. GeoNode was originally developed as a classic server-side solution.
To meet the KI:STE requirement of running in a cloud environment, changes were necessary at several levels, from the software project setup to the service composition. Modifications were made to support Kubernetes-managed environments, and Helm charts were developed. In addition, the project had to be adapted to a fully containerized setup and was extended to support different build pipelines for ease of development. The enhanced GeoNode software was deployed in cooperation with Ambrosys on the combined KI:STE AWS tier. To evaluate the setup, GeoNode was deployed via Helm charts, and datasets such as AQ-Bench, Meteosat and Sentinel scenes were imported and provided.
A first draft of the OGC Connected Systems API (CS API) based on pygeoapi was designed and prototypically implemented in KI:STE. This development was carried out in synergy with several other research projects (MINKE, EMODnet Ingestion III). In the current sample application, the CS API connects to the TOAR database hosted at the Jülich Supercomputing Centre. Although the standard is still in its early stages, stations can already be accessed in standardized formats such as GeoJSON and SensorML and easily visualized in existing tools. As the CS API advances, the TOAR database could be integrated into various clients supporting the CS API without additional implementation effort.
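To illustrate what standardized GeoJSON access to stations can look like from a client's perspective, the following is a minimal Python sketch and not the KI:STE implementation. The base URL, the `systems` resource path, the `f` and `limit` query parameters, and the sample response are assumptions made for illustration only; the actual endpoint and the payload delivered for the TOAR database may differ.

```python
import json
from urllib.parse import urlencode, urljoin

# Hypothetical base URL: the real CS API endpoint is not given in the text.
BASE_URL = "https://example.org/csapi/"


def systems_url(base_url, limit=10):
    """Build a request URL for a 'systems' resource, asking for GeoJSON.

    The resource name and the 'f' / 'limit' parameters are assumptions.
    """
    query = urlencode({"f": "geojson", "limit": limit})
    return urljoin(base_url, "systems") + "?" + query


def station_points(feature_collection):
    """Extract (name, lon, lat) tuples from the Point features of a
    GeoJSON FeatureCollection; features without a Point are skipped."""
    stations = []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue
        lon, lat = geometry["coordinates"][:2]
        name = (feature.get("properties") or {}).get("name", "unknown")
        stations.append((name, lon, lat))
    return stations


# Made-up sample response for illustration, not real TOAR data.
SAMPLE_RESPONSE = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "station-a",
      "geometry": {"type": "Point", "coordinates": [6.41, 50.91]},
      "properties": {"name": "Station A"}
    },
    {
      "type": "Feature",
      "id": "station-b",
      "geometry": null,
      "properties": {"name": "Station B"}
    }
  ]
}
""")

if __name__ == "__main__":
    print(systems_url(BASE_URL, limit=5))
    for name, lon, lat in station_points(SAMPLE_RESPONSE):
        print(name, lon, lat)
```

Because the response is plain GeoJSON, the same few lines of parsing would work in any client or GIS tool that already understands the format, which is the practical benefit of the standardized interface.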
A further goal was to facilitate the use of data provided in the RDI in data analysis workflows. 52°North investigated AI-based tools, such as CodeGPT, that can correct or even generate source code. To this end, we tested CodeGPT on tasks of varying complexity, from the rather simple task of removing whitespace from a character string to complex tasks such as the analysis of OpenStreetMap data, data conversions and the intersection of geometries.
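To give a sense of the range of such tasks, the following Python sketch shows one possible solution for the simplest case (removing whitespace from a string) and a strongly simplified stand-in for a geometry intersection, restricted to axis-aligned bounding boxes. These are illustrative examples written for this text. They are not the prompts used in the evaluation and not output produced by CodeGPT, and the function names are hypothetical.

```python
def remove_whitespace(text):
    """Remove every whitespace character (spaces, tabs, newlines)."""
    # str.split() without arguments splits on any run of whitespace.
    return "".join(text.split())


def bbox_intersection(a, b):
    """Intersect two axis-aligned boxes given as (min_x, min_y, max_x, max_y).

    Returns the overlapping box, or None if the boxes do not overlap
    (boxes that only touch along an edge count as non-overlapping).
    """
    min_x = max(a[0], b[0])
    min_y = max(a[1], b[1])
    max_x = min(a[2], b[2])
    max_y = min(a[3], b[3])
    if min_x >= max_x or min_y >= max_y:
        return None
    return (min_x, min_y, max_x, max_y)


if __name__ == "__main__":
    print(remove_whitespace(" Earth \tSystem\nData "))
    print(bbox_intersection((0, 0, 2, 2), (1, 1, 3, 3)))
```

Real geometry intersections on arbitrary polygons are considerably more involved and are usually delegated to a geometry library, which is part of what makes them a harder test case for code assistants.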
Partners
Forschungszentrum Jülich GmbH, Germany
Jülich Supercomputing Centre (JSC) and Institut für Bio- und Geowissenschaften – Agrosphäre (IBG-3), Germany
Universität zu Köln, Institut für Geophysik und Meteorologie, Germany
Universität Bonn, Institut für Geodäsie und Geoinformatik, Germany
RWTH Aachen, Aachen Institute for Advanced Study in Computational Engineering Science, Germany
Ambrosys GmbH Gesellschaft für Management komplexer Systeme, Potsdam, Germany