Simon Schug
ETH Zürich
Institut für Theoretische Informatik
OAT Z 11
Andreasstrasse 5
8092 Zürich
Attention as a Hypernetwork
Simon Schug, Seijin Kobayashi, Yassir Akram, João Sacramento, Razvan Pascanu
Preprint
When can transformers compositionally generalize in-context?
Seijin Kobayashi*, Simon Schug*, Yassir Akram*, Florian Redhardt, Johannes von Oswald, Razvan Pascanu, Guillaume Lajoie, João Sacramento
NGSM Workshop at ICML 2024
Discovering modular solutions that generalize compositionally
Simon Schug*, Seijin Kobayashi*, Yassir Akram, Maciej Wołczyk, Alexandra Proca, Johannes von Oswald, Razvan Pascanu, João Sacramento, Angelika Steger
ICLR 2024
Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis
Alexander Meulemans*, Simon Schug*, Seijin Kobayashi*, Nathaniel Daw, Gregory Wayne
NeurIPS 2023
Online learning of long-range dependencies
Nicolas Zucchet*, Robert Meier*, Simon Schug*, Asier Mujika, João Sacramento
NeurIPS 2023
A contrastive rule for meta-learning
Nicolas Zucchet*, Simon Schug*, Johannes von Oswald*, Dominic Zhao, João Sacramento
NeurIPS 2022
Random initialisations performing above chance and how to find them
Frederik Benzing, Simon Schug, Robert Meier, Johannes von Oswald, Yassir Akram, Nicolas Zucchet, Laurence Aitchison, Angelika Steger
OPT2022 Workshop at NeurIPS 2022
Presynaptic stochasticity improves energy efficiency and helps alleviate the stability-plasticity dilemma
Simon Schug*, Frederik Benzing*, Angelika Steger
eLife 10:e69884 (2021)
Learning where to learn: Gradient sparsity in meta and continual learning
Johannes von Oswald*, Dominic Zhao*, Seijin Kobayashi, Simon Schug, Massimo Caccia, Nicolas Zucchet, João Sacramento
NeurIPS 2021
Task-agnostic continual learning via stochastic synapses
Simon Schug*, Frederik Benzing*, Angelika Steger
Workshop on Continual Learning at ICML 2020

* denotes equal contribution