Automated Machine Learning in Collaborative Distributed Machine Learning
Type: Bachelor, Master
Date: Immediately
Supervisor: Sascha Rank, Kevin Armbruster, Niclas Kannengießer
In collaborative distributed machine learning (CDML) [2], multiple parties train machine learning (ML) models collaboratively while keeping their training data local. CDML is often used to train neural networks. In federated learning (FL) [3], a prominent CDML concept, a central server distributes a global neural network to a set of clients. The clients train the neural network on their local training data and transmit intermediate results (e.g., gradients) to the central server. The central server aggregates the received intermediate results (e.g., by averaging gradients) and updates the global neural network. This process is repeated over several training rounds to iteratively improve the global neural network.
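The training loop described above can be sketched in a few lines. The following is a minimal, illustrative federated averaging round in Python; the linear model, the noiseless client data, and the learning rate are assumptions made for the sketch, not part of any particular CDML system.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1):
    # One gradient step on a client's local data (mean-squared-error loss).
    grad = 2.0 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def fedavg_round(global_weights, client_data, lr=0.1):
    # Server sends the global weights to all clients; each client trains
    # locally; the server averages the returned weight updates.
    updates = [local_update(global_weights, X, y, lr) for X, y in client_data]
    return np.mean(updates, axis=0)

# Toy setup: three clients, each holding private (noiseless) linear data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = [(X, X @ true_w) for X in (rng.normal(size=(20, 2)) for _ in range(3))]

# Repeated training rounds iteratively improve the global weights.
w = np.zeros(2)
for _ in range(200):
    w = fedavg_round(w, clients)
```

Note that only intermediate results (here, locally updated weights) leave the clients; the raw training data stays local throughout.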
Before a neural network can be trained, various hyperparameters must be set, including the number and types of layers and the learning rate. To find optimal hyperparameter configurations, automated machine learning (AutoML) methods, such as grid search and Bayesian optimization, offer valuable support [1]. AutoML methods are envisioned to automate the development of ML models. However, AutoML methods for conventional (centralized) ML often cannot be applied directly to CDML systems. To be effective, for example, Bayesian optimization requires the complete training data to be available to a single party. This requirement conflicts with CDML, in which parties keep their training data local.
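To make the centralized baseline concrete, here is a minimal sketch of grid search over a single hyperparameter. Ridge regression, the regularization-strength grid, and the toy data are illustrative assumptions; note that the method presumes all training and validation data are available to one party, which is exactly what CDML rules out.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    # Closed-form ridge regression solution.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def val_mse(w, X, y):
    # Validation loss used to score each hyperparameter candidate.
    return float(np.mean((X @ w - y) ** 2))

# Toy data: linear signal with mild noise, split into train/validation.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, 0.5, -2.0])
y = X @ w_true + 0.1 * rng.normal(size=100)
X_tr, y_tr, X_val, y_val = X[:70], y[:70], X[70:], y[70:]

# Grid search: train one model per candidate value, keep the best scorer.
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {a: val_mse(fit_ridge(X_tr, y_tr, a), X_val, y_val) for a in grid}
best_alpha = min(scores, key=scores.get)
```

Bayesian optimization replaces the exhaustive grid with a surrogate model that proposes promising candidates, but it evaluates configurations against the pooled data in the same centralized way.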
Specialized AutoML methods have been developed to enable AutoML in CDML systems [e.g., 4]. However, the performance of such AutoML methods in CDML is unclear, and many open issues inhibit hyperparameter optimization in CDML systems. This gives rise to a variety of interesting research avenues that we would be happy to supervise:
- Comparison of Neural Architecture Search Methods: You will systematically compare different methods for automated neural architecture search (NAS) in a CDML system of your choice, such as federated learning. You will implement the NAS methods and compare them in a benchmark with different datasets across various federated learning settings.
- Human-in-the-Loop: You will investigate how practitioners interact with AutoML tools to combine human expertise with AutoML methods and thereby cope with the challenges in ML development that arise from CDML.
- AutoML in Industrial CDML Systems: You will investigate how practitioners integrate AutoML tools into their workflows in industrial CDML systems by conducting literature analyses or interviews.
If you are interested in CDML, AutoML, and ML development, this thesis is a great opportunity to work on a relevant and impactful topic!
- Practical Impact: Help shape the future of ML development by exploring the effectiveness of different AutoML methods in CDML systems.
- Hands-on Development: You will gain experience in AutoML, ML development, and CDML.
- Cutting-Edge Topic: AutoML and CDML are rapidly evolving---your research will contribute to an emerging field.
Interested in one of these topics, or do you have your own ideas that fall under this umbrella topic? Do not hesitate to reach out to us.
References
[1] Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren. Automated machine learning: methods, systems, challenges. Springer Nature, 2019.
[2] David Jin et al. "Collaborative Distributed Machine Learning". In: ACM Computing Surveys 57.4 (2025), pp. 1–36. ISSN: 0360-0300, 1557-7341. DOI: 10.1145/3704807. URL: dl.acm.org/doi/10.1145/3704807.
[3] Peter Kairouz et al. “Advances and open problems in federated learning”. In: Foundations and trends® in machine learning 14.1–2 (2021), pp. 1–210.
[4] Jianchun Liu et al. “Finch: Enhancing federated learning with hierarchical neural architecture search”. In: IEEE Transactions on Mobile Computing 23.5 (2023), pp. 6012–6026.
Recommended Readings
- Hutter, F., Kotthoff, L., & Vanschoren, J. (2019). Automated machine learning: methods, systems, challenges (p. 219). Springer Nature.
- Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., ... & Zhao, S. (2021). Advances and open problems in federated learning. Foundations and trends® in machine learning, 14(1–2), 1-210.
- Liu, J., Yan, J., Xu, H., Wang, Z., Huang, J., & Xu, Y. (2023). Finch: Enhancing federated learning with hierarchical neural architecture search. IEEE Transactions on Mobile Computing, 23(5), 6012-6026.
- Zhu, H., & Jin, Y. (2021). Real-time federated evolutionary neural architecture search. IEEE Transactions on Evolutionary Computation, 26(2), 364-378.
- Zhu, H., Zhang, H., & Jin, Y. (2021). From federated learning to federated neural architecture search: a survey. Complex & Intelligent Systems, 7(2), 639-657.