A common critique of machine learning research is that the results of many papers are based on trial-and-error, often representing only minor modifications of existing approaches and without any proper justification of the design decisions. The typical justification for publication is that a model resulted in a good score on some established benchmark, with the validity of these benchmarks being a topic on its own. Even from an engineering perspective this is obviously highly undesirable.
I was therefore delighted by a first reading of the work on linear oscillatory state-space models presented by T. Konstantin Rusch and Daniela Rus at ICLR 2025. This publication combines several desirable qualities:
- It is based on a justified theoretical model
- The approach is strongly inspired by physics and neurobiology
- It exercises mathematical rigour¹
- It includes empirical evidence
- It provides a reference implementation
- The approach could have substantial impact in multiple domains
- It is actually an enjoyable read!
Leaving aside the formal perspective, the model can capture very long-range interactions in multivariate sequences with high accuracy while remaining theoretically well-grounded. This makes the publication especially appealing for domains where explainability is important.
1. I have not verified the claims, but unlike the theoretical contributions in other publications, the proofs appear rather digestible given sufficient time, drawing mainly on calculus and linear algebra.