Repo: https://github.com/StanfordBDHG/OpenTSLM
Foundation models excel at text, images, audio, and video, but lack temporal reasoning capabilities over time-series data streams that run the real world: vitals, prices, telemetry, grid loads, clickstreams, machine logs, business processes.
Time Series Language Models (TSLMs) are open foundation models that support time series as a native modality alongside text, letting users ask questions and receive explanations and recommendations, all in natural language.
The OpenTSLM White Paper released today demonstrates state-of-the-art temporal reasoning performance. Unlike prior approaches, its cross-attention architecture remains computationally viable on long time series.
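To make the idea concrete, here is a minimal sketch of a cross-attention fusion layer in PyTorch: text hidden states query a sequence of encoded time-series tokens, so the language model's own context length stays independent of the series length. All class names, dimensions, and the encoder interface are illustrative assumptions, not OpenTSLM's actual API.

```python
import torch
import torch.nn as nn


class TimeSeriesCrossAttention(nn.Module):
    """Fuse encoded time-series tokens into text hidden states via cross-attention.

    A hedged sketch of the general technique, not the paper's exact design.
    """

    def __init__(self, d_text: int = 768, d_ts: int = 256, n_heads: int = 8):
        super().__init__()
        self.ts_proj = nn.Linear(d_ts, d_text)  # map series tokens into text space
        self.attn = nn.MultiheadAttention(d_text, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_text)

    def forward(self, text_h, ts_tokens, ts_pad_mask=None):
        # text_h:      (B, T_text, d_text) hidden states from the language model
        # ts_tokens:   (B, T_ts, d_ts)     embeddings from a time-series encoder
        # ts_pad_mask: (B, T_ts) bool, True where a series position is padding
        kv = self.ts_proj(ts_tokens)
        attended, _ = self.attn(text_h, kv, kv, key_padding_mask=ts_pad_mask)
        return self.norm(text_h + attended)  # residual: the text stream stays primary


layer = TimeSeriesCrossAttention()
text_h = torch.randn(1, 32, 768)       # 32 prompt tokens
ts_tokens = torch.randn(1, 3000, 256)  # e.g. a 3,000-step signal
print(layer(text_h, ts_tokens).shape)  # torch.Size([1, 32, 768])
```

Because the series enters only as keys and values rather than being inlined into the prompt, attention cost grows linearly with series length, which is one plausible reading of why a cross-attention design stays viable where token-prepending approaches do not.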
The results:
- Sleep staging: 4.4× the accuracy with a model 200× smaller (~880× efficiency gain)
- Activity recognition: ~6× the accuracy with a model 200× smaller (~1,000× efficiency gain)
- ECG interpretation: ~2× the accuracy with a model 200× smaller (~400× efficiency gain), and the first model to process 12-lead ECG signals and text simultaneously, with chain-of-thought reasoning validated by cardiologists
For the first time, foundation models can handle multiple time-series streams of varying lengths concurrently, integrate them with textual context, and produce interpretable explanations verified by domain experts such as clinicians.
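One way streams of different lengths can be batched together is sketched below: pad each series to a common length and carry a mask so the model can ignore the padding. The helper name and shapes are hypothetical, chosen only to illustrate the mechanism.

```python
import torch


def pad_streams(streams: list[torch.Tensor]) -> tuple[torch.Tensor, torch.Tensor]:
    """Stack 1-D series of different lengths into (N, L_max) plus a padding mask.

    Illustrative helper only; not taken from the OpenTSLM repo.
    """
    max_len = max(s.shape[0] for s in streams)
    batch = torch.zeros(len(streams), max_len)
    pad_mask = torch.ones(len(streams), max_len, dtype=torch.bool)  # True = padding
    for i, s in enumerate(streams):
        batch[i, : s.shape[0]] = s
        pad_mask[i, : s.shape[0]] = False
    return batch, pad_mask


# e.g. heart rate at 1 Hz and accelerometer magnitude at 50 Hz, both ~1 minute
hr = torch.randn(60)
accel = torch.randn(3000)
batch, mask = pad_streams([hr, accel])
print(batch.shape, mask.shape)  # torch.Size([2, 3000]) torch.Size([2, 3000])
```

A mask like this is exactly what a cross-attention layer's `key_padding_mask` consumes, which is how variable-length streams and text can be combined in a single forward pass.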
This work is the result of a growing collaboration between researchers from Stanford, ETH Zurich, UIUC, University of St. Gallen, University of Washington, Google, and Amazon.
It points to the next foundation model frontier: temporal intelligence that unlocks proactive healthcare, adaptive robotics, resilient infrastructure, and new forms of human-AI collaboration.