Online Sensorimotor Sequence‑based Learning Using Prediction Trees

Lucas Fournier, in collaboration with Jean‑Charles Quinton¹, Mathieu Lefort² and Frédéric Armetta²

¹ Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, UMR 5224

² Univ. Lyon, UCBL, CNRS, INSA Lyon, LIRIS, UMR 5205

This work was carried out within the scope of the Multimodal deep SensoriMotor Representation learning (MeSMRise) research project (ANR-23-CE23-0021-01, 1 April 2024 to 30 September 2028).

Abstract

Artificial agents now outperform humans on many visual and linguistic benchmarks, yet their abilities remain case‑specific, due in part to the absence of the action‑grounded structure central to human cognition. This gap suggests that perception, action, and prediction must form a tightly coupled feedback loop, and that perception should be reframed not as passive stimulus reception but as an active process emerging from the mastery of sensorimotor contingencies, that is, the systematic relationships between actions and their perceptual consequences.

We therefore ask: Can a simple, online structure capture meaningful sensorimotor regularities and foster autonomous object discovery through prediction alone? To address this question, we explore a minimalist memory mechanism that incrementally builds predictive sequences based on sensorimotor experience. We validate our approach in a Tetris-inspired environment, demonstrating that the model learns a compact and accurate predictive representation of its world from raw interaction.

Introduction

Artificial perception is often trained passively on large datasets, which limits generalisation beyond specific benchmarks. In contrast, biological cognition develops by interacting with the world, learning regularities through action–perception cycles. This work studies whether an online, sequence‑based representation can encode such regularities and support autonomous structure discovery without extrinsic rewards. To this end, we make the following contributions.

Background

Our approach is grounded in established theories of embodied cognition and online learning. We first articulate the core design principles that guide our model, then review the relevant literature that informs these principles.

Design Principles

Three principles guide the design of the model.