Author
Jesús Gorroñogoitia (Atos), Jorge Fernández Fabeiro (Atos), Lucas Pelegrin Caparrós (Atos), Indika Kumara (JADS/UVT), Dragan Radolović (XLAB), Nejc Bat (XLAB)
Kamil Tokmakov (USTUTT), Kalman Meth (IBM), Giovanni Quattrocchi (POLIMI), Paul Mundt (ADPT)
Date

This deliverable reports on the development status, at M24, of the SODALITE Runtime Layer and on the integration of its components with the rest of the SODALITE platform. It is the second of three deliverables in this series, released annually over the project lifetime. It complements D3.1 1 and D4.2 2, and the interested reader is encouraged to read those deliverables to better understand the overall technology stack of the SODALITE platform.

The Runtime Layer offers three main features: (1) the orchestration of the deployment of applications on heterogeneous infrastructures, (2) the collection of runtime monitoring information, and (3) the adaptation of applications for performance improvement.

The main focus of the deliverable is to present the new features that have been incorporated into the Runtime Layer since the previous release at M12 (D5.1 3), covering the innovation they bring, their internal architecture within the Runtime Layer, the main functional aspects they offer, the current status of their development, the analysis of their QA assessment, and the developments planned for the next releases at M30 and M36.

  • The Orchestration Layer: The M24 Runtime Layer release supports the deployment of orchestrated, containerized applications on Cloud infrastructures managed by AWS, OpenStack or Kubernetes, as well as on HPC clusters managed by SLURM or TORQUE/PBS Pro schedulers. Access to the Orchestration Layer is protected by SODALITE IAM Authentication. Orchestration has been extended to support multiplatform, hybrid data management by adopting stream-driven data transfer technology from the RADON project 4, as a result of our mutual collaboration. This IaC data management feature will be extended to support HPC environments in SODALITE.
  • The Monitoring Layer: The layer has been significantly redesigned to support dynamic monitoring of targets on Cloud infrastructures such as OpenStack, on HPC clusters such as those managed by TORQUE/PBS Pro and SLURM schedulers, on the Edge, and on their interconnecting network. Moreover, Monitoring supports the broadcasting of alert notifications to subscribers, such as those in Refactoring, upon the detection of QoS violations.
  • The Refactoring Layer: The Deployment Refactorer was integrated with the SODALITE monitoring infrastructure to support the adaptation of an application's deployment topology in response to monitoring data and alerts. The machine learning (ML) pipeline for building models that predict the performance of the many deployment alternatives of an application has been implemented and evaluated. These prediction models enable the selection of, and switching among, deployment variants at runtime. The policy-based deployment adaptation was improved to support the various event-based adaptation use cases. The dynamic discovery of nodes has been improved to support node (TOSCA) policies. The implementation and evaluation of the Node Manager were completed. The Node Manager provides runtime resource management at three levels: cluster-level smart load balancing, machine-level supervision of resource contention, and container-level control-theoretic vertical scalability.
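
The runtime selection among deployment variants described for the Refactoring Layer can be illustrated with a minimal sketch. It is not the SODALITE implementation: the `Variant` type, the `select_variant` function, the `min_gain` threshold and the toy predictor standing in for a trained ML model are all hypothetical names introduced here for illustration. The sketch picks the variant with the lowest predicted latency and switches only when the predicted gain is large enough to avoid flapping between variants.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Variant:
    """A hypothetical deployment alternative (not a SODALITE type)."""
    name: str
    vcpus: int
    memory_gb: int


def select_variant(
    variants: Sequence[Variant],
    predict_latency_ms: Callable[[Variant], float],
    current: Variant,
    min_gain: float = 0.10,
) -> Variant:
    """Pick the variant with the lowest predicted latency.

    Switch away from `current` only if the best candidate is predicted to be
    at least `min_gain` (as a fraction) faster than the current variant.
    """
    best = min(variants, key=predict_latency_ms)
    current_latency = predict_latency_ms(current)
    if predict_latency_ms(best) <= current_latency * (1.0 - min_gain):
        return best
    return current


def toy_predictor(v: Variant) -> float:
    # Assumption: stand-in for a trained performance model; predicted
    # latency simply falls as allocated resources grow.
    return 1000.0 / (v.vcpus + 0.25 * v.memory_gb)


small = Variant("small", vcpus=2, memory_gb=4)
medium = Variant("medium", vcpus=4, memory_gb=8)
large = Variant("large", vcpus=8, memory_gb=16)

chosen = select_variant([small, medium, large], toy_predictor, current=small)
print(chosen.name)
```

In a real setting the predictor would be replaced by a model trained on monitoring data, and the selection criterion would likely also weigh cost or other QoS goals rather than latency alone.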

Partial integration of the Runtime Layer components has been completed in the M24 release, covering mostly the orchestration drivers and some monitoring and refactoring components.

The next steps towards the release of the final version of the Runtime Layer (M30, M36) focus on the complete integration of the entire Runtime Layer, support for deployment to additional target infrastructures such as OpenFaaS and Google Cloud, the automation of dynamic monitoring configuration upon deployment, the complete implementation of alerting management, the implementation of specialized monitoring dashboards, support for all redeployment adaptation scenarios, and several improvements to the refactoring features.