Convolutional networks can model the functional modulation of the MEG responses associated with feed-forward processes during visual word recognition.
Journal: eLife
PMID: 40359126
Abstract
Traditional models of reading lack a realistic simulation of the early visual processing stages: they take input in the form of letter banks and predefined line segments, making them unsuitable for modeling early brain responses. We used variations of the VGG-11 convolutional neural network (CNN) to create models of visual word recognition that start at the pixel level and perform the macro-scale computations needed to go from the detection and segmentation of letter shapes to word-form identification across a large vocabulary of 10,000 Finnish words, regardless of letter size, shape, or rotation. The models were evaluated against an existing magnetoencephalography (MEG) study in which participants viewed regular words, pseudowords, noise-embedded words, symbol strings, and consonant strings. The original images used in the study were presented to the models, and the activity in the layers was compared to MEG evoked response amplitudes. Through a few alterations that make the network more biologically plausible, we found a CNN architecture that can correctly simulate the behavior of three prominent responses: the type I (early visual response), the type II (the 'letter string' response), and the N400m. In conclusion, starting a model of reading with convolution-and-pooling steps provides the flexibility and realism crucial for a direct model-to-brain comparison.
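To illustrate the general approach of comparing CNN layer activity to evoked response amplitudes (this is a minimal sketch, not the authors' actual pipeline), the following Python code attaches forward hooks to a stock torchvision VGG-11 and reads out the activity of an early, a middle, and a late layer for a single stimulus image. The specific layer indices, the mean-absolute-activation summary, and the file name stimulus.png are illustrative assumptions; the paper trains its own VGG-11 variants on Finnish words.

# Minimal sketch (illustrative assumptions throughout): probe a VGG-11
# architecture with forward hooks so that each chosen layer's response to a
# stimulus image can be summarized and, in principle, compared against MEG
# evoked response amplitudes.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.vgg11(weights=None)  # architecture only, untrained here
model.eval()

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        # Summarize the layer's response as its mean absolute activation,
        # a simple stand-in for an evoked-response amplitude.
        activations[name] = output.detach().abs().mean().item()
    return hook

# Hypothetical layer picks: an early conv layer (cf. the type I response),
# a mid-level conv layer (cf. the type II), and a late fully connected
# layer (cf. the N400m).
model.features[0].register_forward_hook(save_activation("early_conv"))
model.features[11].register_forward_hook(save_activation("mid_conv"))
model.classifier[0].register_forward_hook(save_activation("late_fc"))

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# "stimulus.png" is a placeholder for one of the study's stimulus images.
img = preprocess(Image.open("stimulus.png").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    model(img)

print(activations)  # per-layer scalar summaries for this stimulus

Run over the full stimulus set (words, pseudowords, noise-embedded words, symbol strings, consonant strings), such per-layer summaries yield one value per condition per layer, which is the shape of data needed for a direct model-to-brain comparison against the type I, type II, and N400m amplitudes.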