Super Readers, Superreaders
Written by Sil Hamilton in September 2022.
This is a companion piece to a presentation I gave at SLSA 2022. The thesis of my presentation is two-fold:
- Foundation models operate under principles compatible with both enactivism and reader-response theory.
- Those in the humanities should not ignore the implications these models carry for tasks involving interpretation/hermeneutics.
I write this piece to dispel an assumption common among some working in the humanities regarding foundation models: that they are not useful for the subtextual interpretation of creative texts because they are not capable of understanding.
To refute this premise, I propose an equivalence between the following three questions:
- By what means does a story deliver a world?
- By what means does a person deliver their world to others?
- By what means does an LLM deliver cognition?
I hope you see where I’m going.
By what means does a story deliver a world?
A story has a story to tell. Very often, this story takes place in a particular world. This world might be similar to our own, or it might not. The world in which a story takes place is particular to the story: a storyworld. The story delivers this world to the reader via a narrative.
Formal definitions of a narrative vary, but a colloquial definition might understand it to be a linear series of propositions concerning the storyworld. The reader interprets each sentence as it comes, using it to update their understanding of the world and, importantly, filling in the blanks with their imagination.
By what means does a person deliver their world to others?
Again, a narrative. The narrative might be non-fiction, but it’ll likely still take the form of a linear sequence of propositions, regardless of its veracity.
But this is where things get more interesting. Certain postcognitivist theories posit that cognition is actively enacted. That is, cognition arises in the dynamic play between agent and environment (whether this be social or physical). To sharpen this claim for my own argument: cognition is both demonstrated and actively produced by communicative acts. Language is the bread and butter of this production.
By what means does an LLM deliver cognition?
A loaded question, I admit. Cognition isn’t usually described as being delivered, but it should be when the subject is language modelling. Language models are not agents. I stress this. They are not an it, an I, a conscious individual. You cannot talk to a language model for it cannot listen; it can only write.
LMs are best thought of as statistical beings whose sole purpose is to correctly continue a series of words. This is their job, and they’ve become good at it. So good, in fact, that they have developed theories of our world in the process. They need to. Correctly continuing a text requires that you understand what you’re continuing. But herein lies the caveat: language models do not differentiate between the prompt given and their response. To them, it is all one continuous span of text: a narrative.
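The point about prompt and response forming one span can be made concrete with a toy. What follows is only a rough sketch, assuming a word-level bigram counter standing in for a real language model; the names train_bigram and continue_text are hypothetical and chosen for illustration. It shows that the continuation loop never records where the prompt ends: each new word is conditioned on the span so far, whoever wrote it.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str) -> dict:
    """Count which word follows which in a whitespace-tokenised corpus."""
    tokens = corpus.split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    return follows


def continue_text(follows: dict, prompt: str, steps: int) -> list:
    """Greedily extend the prompt one word at a time (toy stand-in for an LM)."""
    # The prompt and the continuation live in the same list: one span of text.
    span = prompt.split()
    for _ in range(steps):
        candidates = follows.get(span[-1])
        if not candidates:
            break  # the toy model has never seen this word followed by anything
        # Pick the most frequent follower; no flag marks "prompt" vs "response".
        span.append(candidates.most_common(1)[0][0])
    return span


# Hypothetical miniature corpus, invented for this illustration.
corpus = "the reader reads the story and the story delivers a world to the reader"
model = train_bigram(corpus)

span = continue_text(model, "the story", 3)
print(" ".join(span))
```

Feeding the model its own output as a new prompt and asking for one more word gives the same result as asking for four words from the original prompt. A real LLM conditions on far more than the previous word, but the bookkeeping is analogous: there is only the span.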
Language models simulate narratives. Prompt injections work because the model is simulating a span of text in which a prompt injection is performed. LMs are great for co-writing because they are good at simulating narratives. Moreover, LMs can deliver narratives wherein characters are cognizant. Thus they can simulate cognition. And agency, too. We perceive characters in novels as embodying agency, after all.
Consequences
Reading is an act of creation; recall “the death of the author.” Language models can read. Large enough models can even pass high school literacy tests. They understand. Just as narratives (bundles of communicative acts) enable cognition in humans, so too can language models enable cognition through narratives. While language model cognition is certainly not equivalent to human cognition, language models are not merely stochastic parrots. They present a humongous opportunity for those in the humanities.
What sorts of factors were important in the creation of a text? What is a text really addressing? By simulating the creation of a text (remember, reading is writing and writing is reading for these models), language models become a valuable tool for approaching hermeneutical questions like these.
Given time, language models could become algorithmic superreaders of the kind Riffaterre could only dream of: an analyst who reads as a reader proper.