Combinatorial Structures and Algorithms, Institute of Theoretical Computer Science, Department of Computer Science, ETH Zürich

Seijin Kobayashi

ETH Zürich
Seijin Kobayashi
Institut für Theoretische Informatik
OAT Z 11
Andreasstrasse 5
8050 Zürich

I am a Machine Learning PhD student under the supervision of Prof. Angelika Steger at ETH Zurich's Institute of Theoretical Computer Science. My academic journey began with a bachelor's degree in applied mathematics from École Polytechnique in 2014. In 2016, I obtained my master's degree in computer science from ETH Zurich. I then spent three years working as a Software Engineer at Google Zurich before returning to academia in 2020.
My approach to research thus combines a strong foundation in mathematics and engineering. I am passionate about both theoretical and experimental work, with a focus on unveiling the inductive biases inherent in various deep learning models and algorithms. My ultimate goal is to enhance our understanding of these tools and contribute to their practical improvement, and I am particularly excited about drawing inspiration from human intelligence (as well as its limitations) to guide my research.
My recent interests lie in developing learning algorithms that can systematically generalize in various settings, including supervised learning and reinforcement learning, and in building a mechanistic understanding of the in-context learning capabilities of modern neural network architectures.
In my free time, I enjoy the beauty of Swiss nature and work on mastering handicrafts such as pottery.

Publications

* – equal contributions

Uncovering mesa-optimization algorithms in Transformers
J. von Oswald*, E. Niklasson*, M. Schlegel*, S. Kobayashi, N. Zucchet, N. Scherrer, N. Miller, M. Sandler, B. Agüera y Arcas, M. Vladymyrov, R. Pascanu, J. Sacramento

Gated recurrent neural networks discover attention
N. Zucchet*, S. Kobayashi*, Y. Akram*, J. von Oswald, M. Larcher, A. Steger, J. Sacramento

Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis
A. Meulemans*, S. Schug*, S. Kobayashi*, N. Daw, G. Wayne
37th Conference on Neural Information Processing Systems (NeurIPS), 2023.

Meta-Learning via Classifier(-free) Diffusion Guidance
E. Nava, S. Kobayashi*, Y. Yin, R. K. Katzschmann, B. F. Grewe
Transactions on Machine Learning Research (TMLR), 2023.

The least-control principle for learning at equilibrium
A. Meulemans*, N. Zucchet*, S. Kobayashi*, J. von Oswald, J. Sacramento
36th Conference on Neural Information Processing Systems (NeurIPS), 2022.

Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel
S. Kobayashi*, P. Vilimelis Aceituno, J. von Oswald
36th Conference on Neural Information Processing Systems (NeurIPS), 2022.

Learning where to learn: Gradient sparsity in meta and continual learning
J. von Oswald*, D. Zhao*, S. Kobayashi, S. Schug, M. Caccia, N. Zucchet, J. Sacramento
35th Conference on Neural Information Processing Systems (NeurIPS), 2021.

Posterior meta-replay for continual learning
C. Henning*, M. Cervera*, F. D'Angelo, J. von Oswald, R. Traber, B. Ehret, S. Kobayashi, B. F. Grewe, J. Sacramento
35th Conference on Neural Information Processing Systems (NeurIPS), 2021.

Neural networks with late-phase weights
J. von Oswald*, S. Kobayashi*, A. Meulemans, C. Henning, B. F. Grewe, J. Sacramento
International Conference on Learning Representations (ICLR), 2021.

Workshop papers

* – equal contributions

On the reversed bias-variance tradeoff in deep ensembles
S. Kobayashi*, J. von Oswald*, B. F. Grewe
ICML 2021 Workshop on Uncertainty and Robustness in Deep Learning, 2021.

Meta-learning via hypernetworks
D. Zhao, S. Kobayashi, J. Sacramento, J. von Oswald
NeurIPS Workshop on Meta-Learning, 2020.

See Google Scholar for a complete and up-to-date publication history.