Neural network language models as models of human language processing and cognition more generally

Only published papers are included; for preprints, see Papers.
Last Updated: May 2024


In recent years, the availability of massive language corpora, advances in machine learning, and increases in computing power have led to engineering breakthroughs in artificial intelligence, most prominently in language. As a result, efforts have sprouted across labs and domains to leverage AI models as models of human perception, cognition, and motor control by directly relating model representations to human behavior and neural responses. We are actively working in this space, so more to come soon.


Language models capture human neural responses to linguistic input

This paper uses early word-embedding models (like GloVe) to decode linguistic meaning from neural activity. We show that a decoder trained on imaging (fMRI) data collected while participants process individual word meanings can decode semantic vector representations from imaging data collected while participants read sentences about a variety of topics.
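The decoding logic described above can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's actual pipeline: the random arrays stand in for real fMRI responses and pretrained GloVe vectors, and all dimensions and noise levels are made up for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_words, n_voxels, emb_dim = 180, 500, 300

# Synthetic stand-ins: voxel responses to single words and their
# GloVe-style embeddings (real data would come from fMRI recordings
# and pretrained word vectors).
word_embeddings = rng.standard_normal((n_words, emb_dim))
true_map = rng.standard_normal((emb_dim, n_voxels)) * 0.1
word_fmri = word_embeddings @ true_map + rng.standard_normal((n_words, n_voxels)) * 0.5

# Train a decoder mapping voxel patterns to semantic vectors.
decoder = Ridge(alpha=10.0).fit(word_fmri, word_embeddings)

# Apply it to held-out "sentence" responses and identify each sentence
# by matching the decoded vector to candidate embeddings (cosine similarity).
n_sents = 20
sent_embeddings = rng.standard_normal((n_sents, emb_dim))
sent_fmri = sent_embeddings @ true_map + rng.standard_normal((n_sents, n_voxels)) * 0.5
decoded = decoder.predict(sent_fmri)

def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

sims = cosine_sim(decoded, sent_embeddings)
top1 = (sims.argmax(axis=1) == np.arange(n_sents)).mean()
print(f"top-1 identification accuracy: {top1:.2f} (chance = {1 / n_sents:.2f})")
```

The key design point mirrors the paper: the decoder is trained only on single-word data, yet it generalizes to sentence-level responses because both are evaluated in the same semantic vector space.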

 

This paper evaluates more than 40 neural network language models in a model-to-brain encoding framework and shows that representations from unidirectional transformer models, like GPT-2, capture a non-trivial amount of variance in neural responses to sentences. Further, models that perform better on the next-word prediction task also predict human responses better, which suggests that optimizing for predictive representations may be a shared objective of in-silico and biological language systems.
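A model-to-brain encoding analysis of the kind described above can be sketched as below. This is a hedged toy version on synthetic data: in the actual framework the features would be hidden states extracted from a language model such as GPT-2 and the targets would be recorded neural responses to the same sentences; here both are simulated, and the dimensions and regularization strength are arbitrary choices for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n_sentences, n_features, n_voxels = 300, 200, 100

# Synthetic stand-ins for language-model features and neural responses.
model_features = rng.standard_normal((n_sentences, n_features))
weights = rng.standard_normal((n_features, n_voxels)) * 0.05
brain_responses = model_features @ weights + rng.standard_normal((n_sentences, n_voxels))

# Cross-validated encoding model: linearly map model features to each
# voxel's response, then score each voxel by the correlation between
# predicted and held-out observed responses.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = np.zeros(n_voxels)
for train_idx, test_idx in kf.split(model_features):
    reg = Ridge(alpha=100.0).fit(model_features[train_idx], brain_responses[train_idx])
    pred = reg.predict(model_features[test_idx])
    for v in range(n_voxels):
        scores[v] += np.corrcoef(pred[:, v], brain_responses[test_idx, v])[0, 1]
scores /= kf.get_n_splits()

print(f"mean cross-validated voxel correlation: {scores.mean():.2f}")
```

Comparing this cross-validated prediction score across models (and normalizing by an estimate of the data's noise ceiling) is what allows different language models to be ranked on how well they capture neural responses.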

 

This paper builds on Schrimpf et al. (2021) and attempts to isolate the features of model representations that matter most for the model-to-brain match. It finds that word meanings and compositional (sentence-level) meaning matter more than syntactic structure.


Language models capture human neural responses to computer code

This paper explores which properties of computer code are represented by code models and by the brain during mental simulation of program execution. We find a correspondence between their encodings of critical features, including iteration and conditional evaluation, which allows program structure to be decoded from brain activity using code-model representations as a proxy.


Methodological considerations in relating model representations to human neural responses

This paper formulates guidelines for when different mapping models might be most appropriate in research that attempts to relate model representations to human brain responses.


Model interpretability in AI versus neuroscience

This opinion piece discusses some differences in the construal of model interpretability in AI versus neuroscience and talks about ways to leverage the synergies between the fields.
