Using Federated Learning in Youth-GEMs: What have we learned so far?

July 8, 2025

Imagine if researchers could work together to predict mental health challenges early, without ever seeing your private data. This is actually possible, thanks to something called federated learning!

In this article, Esmeralda Ruiz, postdoc at the University of Barcelona and co-lead of WP6, highlights how the integration of a federated learning platform is helping Youth-GEMs reshape the prediction of mental health difficulties in vulnerable youth, while ensuring the highest standards of data privacy and security.

What is a federated platform and why do we need one?

The main goal of WP6 is to predict mental health difficulties in vulnerable youth by leveraging advanced AI techniques while ensuring the highest standards of data privacy and security.

To support this, federated learning is being introduced as a complementary approach to centralised model development. Instead of transferring sensitive data across institutions, federated learning allows models to be trained locally where the data resides, sharing only the learned parameters or results of the analysis. This approach is particularly valuable in contexts where data sharing is restricted by privacy regulations, legal constraints, or institutional policies. It enables collaboration across institutions while ensuring that sensitive mental health data remains securely within its original environment.

Additionally, the use of a federated data platform will facilitate the reuse of mental health datasets by implementing the FAIR data principles – making data Findable, Accessible, Interoperable, and Reusable. This not only improves the efficiency and transparency of research but also strengthens inter-institutional cooperation and creates a more diverse and comprehensive dataset to support robust, privacy-preserving mental health prediction models.

How does federated learning work?

Federated learning is a privacy-preserving machine learning approach that allows multiple organisations to collaborate on training a shared model without exchanging raw data.

Here’s how it works:

A global model is initialized and distributed to all participating sites (e.g., hospitals, clinics, or research centres).
Each site trains the model locally using its own private data, which stays securely on-site.
After training, each site sends back only the model updates to a central server – no personal or raw data is shared.
The central server aggregates these updates from all sites to create a combined global model.
The updated model is then sent back to the sites, where the cycle can continue for further refinement.

This process enables the development of high-quality predictive models while respecting data privacy, complying with legal and ethical standards, and maintaining data security across institutions.
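The cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Youth-GEMs or Fedder implementation: it uses a small logistic regression rather than the project's tree-based models, because averaging parameters is easiest to show with a model that has a fixed weight vector. The data are synthetic, and the function names (local_train, aggregate, make_site) are hypothetical.

```python
import math
import random

def local_train(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of logistic-regression SGD on one site's private data."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            # Gradient of the log loss for one example is (p - y) * x.
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    return w

def aggregate(updates):
    """Average client weights, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

def make_site(rng, n):
    """Synthetic stand-in for one site's private records: (features, label)."""
    data = []
    for _ in range(n):
        x1 = rng.uniform(-1, 1)
        y = 1 if x1 + rng.gauss(0, 0.3) > 0 else 0
        data.append(([1.0, x1], y))  # first feature acts as a bias term
    return data

rng = random.Random(0)
sites = [make_site(rng, 80), make_site(rng, 120)]  # two simulated clients

global_w = [0.0, 0.0]  # step 1: the server initialises the global model
for round_no in range(10):
    # Steps 2-3: each site trains locally and returns only weights and a count.
    updates = [(local_train(global_w, d), len(d)) for d in sites]
    # Steps 4-5: the server combines the updates and redistributes the result.
    global_w = aggregate(updates)
```

Only the weight vectors and sample counts cross the site boundary in this sketch; the lists in sites are never read by aggregate.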

To make this process user-friendly within a secure environment, we use Fedder, a platform created by other EU-funded projects (e.g., Datatools4Heart). We have developed the specific algorithms that the Youth-GEMs project needs to run the simulation on this platform. These include tree-based models with fairness considerations and performance metrics, which guide the aggregation process on the central server. The aim is to deliver equitable, high-performing predictions across all demographic and geographic groups and to address data disparities between nodes.
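The article does not spell out how performance metrics guide the aggregation, so the following is only one possible scheme, shown for intuition. It assumes each client reports a validation score and that the server combines the clients' predicted probabilities in proportion to those scores. The site names, scores, and predictions are made up, and the functions are hypothetical rather than part of Fedder.

```python
def score_weights(scores):
    """Normalise per-client validation scores so they sum to 1."""
    total = sum(scores.values())
    return {client: s / total for client, s in scores.items()}

def combine_predictions(client_probs, weights):
    """Weighted average of each client's predicted probabilities, per sample."""
    n = len(next(iter(client_probs.values())))
    return [
        sum(weights[c] * probs[i] for c, probs in client_probs.items())
        for i in range(n)
    ]

# Hypothetical validation scores reported by two clients (made-up numbers).
scores = {"site_a": 0.6, "site_b": 0.9}
weights = score_weights(scores)

# Hypothetical predicted risks from each client's model for three samples.
client_probs = {"site_a": [0.2, 0.5, 0.8], "site_b": [0.4, 0.5, 0.6]}
combined = combine_predictions(client_probs, weights)
```

A fairness-aware variant could, for example, use each client's worst score across demographic groups instead of its overall score, so that a model that performs poorly for one group contributes less. That is an assumption, not a description of the project's method.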

Participating partners

Each research centre will train the model locally on its own data. In this context, each site is referred to as a client. The University of Barcelona will host the central server, which will collect and aggregate model updates from all clients, then redistribute the combined model. This setup ensures privacy while enabling collaborative learning across institutions. The centres participating in the federated simulation are:

Maastricht University (client)
Hospital Universitario Gregorio Marañón (client)
University of Split, School of Medicine (client)
University of Barcelona (server)

Why do we need federated learning in Youth-GEMS?

Federated learning is key to making collaboration easier and safer across different institutions in Youth-GEMS. It allows each research centre to contribute to building a powerful AI model without sharing sensitive personal data.

This is especially important in mental health research, where privacy and data protection are critical. By keeping data at each local site and only sharing model updates, we reduce the need for complex data transfer agreements and lengthy approval processes. This makes it easier to use existing data and to include new institutions in the future, since data never has to leave its original location.

What are the preliminary results of the simulation?

To test the practical application, a simulation of the federated learning platform has been launched within the Youth-GEMS project. This simulation helps evaluate the technical setup, performance, and privacy-preserving capabilities before scaling up to include more centres.

The simulation is still in its initial stages, and we are integrating the novel Youth-GEMs methods into the Fedder platform (Figure 1). At this stage, the simulation uses two research sites that have not yet been allocated to the final nodes, although we expect that step to be straightforward.

Figure 1. Creation of the Youth-GEMS federated analysis Project through the Fedder platform

Some preliminary results are shown in Figure 2. As expected, the models trained locally at each research centre perform similarly to the model obtained after combining parameters on the server (AUC of 0.69 for the local models versus 0.70 for the federated model). This indicates that the federated setup works as intended even with just two local nodes.

Figure 2. Results of the Youth-GEMs federated simulation using two nodes
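For readers unfamiliar with the metric reported above: AUC (area under the ROC curve) is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, so 0.5 corresponds to chance and 1.0 to perfect ranking. The short sketch below computes it directly from that definition. The labels and scores are toy values, not Youth-GEMs data, and the function is a plain illustration rather than the project's evaluation code.

```python
def auc(labels, scores):
    """Probability that a random positive case outranks a random negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked correctly above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy example: two negative and two positive cases with predicted risks.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc(labels, scores)
```

Computing the same metric on each site's local model and on the aggregated model is what allows a like-for-like comparison such as the one in Figure 2.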

What’s next for the federated platform?

We will finish integrating the Youth-GEMS methods into the Fedder platform and complete the installation at our partner sites to finalise the simulation. This will enable the expansion of the project to other external centres that wish to collaborate with Youth-GEMS in the future, while preserving security and privacy.

This article was written in collaboration with: