Our group has five papers at the 2023 International Conference on Machine Learning (ICML). Congratulations to all co-authors.
- Simon Geisler, Yujia Li, Daniel Mankowitz, Ali Taylan Cemgil, Stephan Günnemann, Cosmin Paduraru
Transformers Meet Directed Graphs
While transformers have become vital for text, images, audio, video, and undirected graphs, their application to directed graphs has been surprisingly underexplored. In this work, we propose two structure- and direction-aware positional encodings for directed graphs based on graph spectral theory and random walks. The added directionality information proves useful for a variety of tasks, including source code understanding and correctness testing of sorting networks. With a 14.7% improvement over the prior state of the art on the Open Graph Benchmark Code2, our positional encodings for directed graphs are poised to become a crucial building block for a range of applications.
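To illustrate why direction matters for a random-walk encoding, here is a minimal sketch in plain Python. It is not the encoding from the paper: the function names, the choice of landing probabilities as features, and the `num_steps` parameter are all assumptions made for this example. Each node gets the probabilities of where a walk started at that node lands after 1..`num_steps` steps, once following the edges and once against them.

```python
def transition_matrix(n, edges):
    """Row-stochastic random-walk matrix of a directed graph with n nodes."""
    adj = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        adj[src][dst] = 1.0
    rows = []
    for row in adj:
        degree = sum(row)
        # Nodes without outgoing edges keep an all-zero row (the walk stops).
        rows.append([v / degree if degree else 0.0 for v in row])
    return rows


def matmul(a, b):
    """Dense matrix product for small square matrices stored as lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def random_walk_encoding(n, edges, num_steps=2):
    """Hypothetical direction-aware node features (illustration only).

    For every step count, a node's feature vector is extended by its
    landing probabilities along the edges and against the edges.
    """
    forward = transition_matrix(n, edges)
    backward = transition_matrix(n, [(dst, src) for src, dst in edges])
    encoding = [[] for _ in range(n)]
    fwd_power, bwd_power = forward, backward
    for _ in range(num_steps):
        for node in range(n):
            encoding[node].extend(fwd_power[node] + bwd_power[node])
        fwd_power = matmul(fwd_power, forward)
        bwd_power = matmul(bwd_power, backward)
    return encoding


# Directed path 0 -> 1 -> 2: the two end nodes receive different features.
path_encoding = random_walk_encoding(3, [(0, 1), (1, 2)], num_steps=1)
```

On the undirected version of the same path, a one-step walk from either end node lands on the middle node, so the two ends would look identical. Walking with and against the edge direction separates the source from the sink.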
- Nicholas Gao, Stephan Günnemann
Generalizing Neural Wave Functions
Neural network-based wave functions have shown great promise in modeling the behavior of molecules. While neural wave functions result in state-of-the-art accuracies, training is expensive and has to be redone for each molecular structure. In this work, we tackle the problem by proposing a method to adapt neural wave functions to arbitrary molecules. We achieve this by identifying two key desiderata a generalized wave function should fulfill. With these desiderata in mind, we construct a neural network solution satisfying both. In both computational chemistry and machine learning, we are the first to demonstrate that a single wave function can solve the Schrödinger equation of molecules with different atoms jointly.
- Marin Biloš, Kashif Rasul, Anderson Schneider, Yuriy Nevmyvaka, Stephan Günnemann
Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion
Time series data is obtained by measuring some real-world system. In many domains, we will find data that follows some underlying continuous process, for example, the temperature or the load of a system over time. Although the values are observed as separate events, we know the temperature always exists and its evolution over time is smooth, not jittery. What we want to do is model the generative process of such data and leverage this model for other crucial time series tasks such as forecasting. As it turns out, capturing the true generative process proves to be difficult, especially with the inherent stochasticity. This work proposes 1) treating time series as continuous functions, where specific points lie on the underlying function, and 2) extending the diffusion framework to functions. We successfully apply this idea in probabilistic forecasting, imputation, and modeling stochastic processes.
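One possible reading of "extending the diffusion framework to functions" is to replace independent per-point noise with noise that is itself a smooth function of time. The sketch below makes that assumption explicit: it draws noise from a Gaussian process with an RBF kernel at irregular timestamps and uses it in a single forward noising step. The kernel choice, the `length_scale` and `alpha` parameters, and all function names are hypothetical and chosen for illustration only.

```python
import math
import random


def rbf_kernel(times, length_scale=1.0, jitter=1e-6):
    """Covariance between timestamps; nearby times get correlated noise."""
    n = len(times)
    return [[math.exp(-((times[i] - times[j]) ** 2) / (2.0 * length_scale ** 2))
             + (jitter if i == j else 0.0)  # jitter keeps the matrix well conditioned
             for j in range(n)] for i in range(n)]


def cholesky(matrix):
    """Lower-triangular factor L with L @ L.T == matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_smooth_noise(times, length_scale=1.0, rng=random):
    """Draw one noise function evaluated at the given (irregular) times."""
    lower = cholesky(rbf_kernel(times, length_scale))
    white = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(lower[i][k] * white[k] for k in range(i + 1))
            for i in range(len(times))]


def forward_noising(values, times, alpha, rng=random):
    """Mix clean observations with time-correlated noise (one diffusion step).

    alpha = 1 keeps the data unchanged; alpha = 0 returns pure noise.
    """
    noise = sample_smooth_noise(times, rng=rng)
    return [math.sqrt(alpha) * x + math.sqrt(1.0 - alpha) * e
            for x, e in zip(values, noise)]


# Irregularly sampled temperature-like readings (made-up numbers).
times = [0.0, 0.3, 1.1, 2.5]
noisy = forward_noising([20.1, 20.4, 21.0, 19.8], times, alpha=0.9,
                        rng=random.Random(0))
```

Because the noise is indexed by the actual timestamps, the same code works for any sampling pattern, which is the practical appeal of treating the series as a function rather than a fixed-length vector.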
- Tom Wollschläger, Nicholas Gao, Bertrand Charpentier, Mohamed Amine Ketata, Stephan Günnemann
Uncertainty Estimation for Molecules: Desiderata and Methods
Graph Neural Networks promise advancements in computational chemistry, yet face challenges when applied to out-of-distribution (OOD) samples. Uncertainty estimation (UE) may aid in such situations by communicating the model’s certainty about its prediction. We identify six key criteria for UE in molecular force fields. A comprehensive review reveals that no existing UE methods meet all these criteria. To bridge this gap, we introduce the Localized Neural Kernel (LNK), a Gaussian Process-based extension to GNNs. The LNK not only satisfies all these criteria but also enhances the model's calibration. It outperforms other methods in out-of-equilibrium detection while preserving high predictive accuracy.
- Arthur Kosmala, Johannes Gasteiger, Nicholas Gao, Stephan Günnemann
Ewald-based Long-Range Message Passing for Molecular Graphs
Graph neural networks have succeeded at modeling complex molecular interactions. This has been made possible by the spatial message passing regime, where messages decay with distance and focus on local interactions. While this has been a useful inductive bias, it also impedes the learning of long-range interactions such as electrostatics. To address this drawback, we propose Ewald message passing: a nonlocal Fourier space scheme which decays in frequency rather than real space. We test the approach with four baseline models and two datasets containing diverse periodic (OC20) and aperiodic structures (OE62). We observe robust improvements in energy mean absolute errors across all models and datasets, averaging 10% on OC20 and 16% on OE62.
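The idea of a message that decays in frequency rather than in distance can be sketched with a one-dimensional toy example. This is not the model from the paper: the scalar features, the hand-picked Gaussian frequency filter, and the names `max_freq` and `width` are assumptions for illustration. All atoms are first summarized in a Fourier-space structure factor, which is then filtered and evaluated back at each atom position.

```python
import cmath
import math


def ewald_style_messages(positions, features, box_length, max_freq=4, width=2.0):
    """Toy nonlocal message passing in Fourier space (1D, scalar features)."""
    # Symmetric set of wave numbers compatible with a periodic box.
    wave_numbers = [2.0 * math.pi * m / box_length
                    for m in range(-max_freq, max_freq + 1)]
    messages = [0.0] * len(positions)
    for k in wave_numbers:
        # Structure factor: one Fourier-space summary of all atoms.
        structure = sum(h * cmath.exp(-1j * k * x)
                        for x, h in zip(positions, features))
        # Hand-picked filter that decays with frequency (assumption).
        weight = math.exp(-(k * k) / (2.0 * width * width))
        for i, x in enumerate(positions):
            # Imaginary parts cancel over the symmetric set of wave numbers.
            messages[i] += (weight * structure * cmath.exp(1j * k * x)).real
    return messages


# Three atoms in a box of length 10; every atom hears from every other one.
messages = ewald_style_messages([0.5, 2.0, 7.25], [1.0, -0.5, 2.0], box_length=10.0)
```

The cost grows with the number of atoms times the number of frequencies instead of the number of atom pairs, and no distance cutoff appears anywhere, which is what makes the scheme nonlocal.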