Predictive coding is a model of visual processing that posits the brain as a generative model of its input, with prediction error serving as a signal for both learning and attention. In this work, we show how the equivariant capsules learned by a Topographic Variational Autoencoder can be extended to fit within the predictive coding framework by treating the slow rolling of capsule activations as the forward prediction operator. We demonstrate quantitatively that this extension improves sequence modeling compared with both topographic and non-topographic baselines, and that the resulting forward predictions are qualitatively more coherent with the provided partial input transformations.
T. Anderson Keller, Max Welling
Oral presentation at: ICCV 2021 VIPriors Workshop
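
As a rough illustration of the idea in the abstract, "rolling" capsule activations can be read as a cyclic shift of the units within each capsule, so that the shifted code serves as the prediction for the next time step. The sketch below is a minimal NumPy toy under that assumption, not the authors' implementation: the function name `roll_capsules`, the parameters `n_caps`, `cap_dim`, and `shift`, and the flat layout of the latent vector are all hypothetical choices made for this example.

```python
import numpy as np

def roll_capsules(activations, n_caps, cap_dim, shift=1):
    """Cyclically shift the units inside each capsule by `shift` positions.

    Assumed layout (hypothetical): `activations` has shape
    (batch, n_caps * cap_dim), with each capsule stored contiguously.
    """
    caps = activations.reshape(-1, n_caps, cap_dim)
    # The roll acts only along the within-capsule axis, so capsules never mix.
    rolled = np.roll(caps, shift=shift, axis=-1)
    return rolled.reshape(activations.shape)

# Toy latent code at time t: two capsules of four units each.
z_t = np.arange(8, dtype=float).reshape(1, 8)

# Forward prediction for time t+1 under the assumed roll operator.
z_pred = roll_capsules(z_t, n_caps=2, cap_dim=4)

# A prediction error could then be measured against the code actually
# inferred at t+1 (here a made-up target, purely for illustration).
z_next = z_pred + 0.1
prediction_error = np.mean((z_next - z_pred) ** 2)
```

In this reading the prediction operator has no learned parameters: applying it `cap_dim` times returns the original code, which is what makes the shift cyclic.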