Fragment on the Associative Auteur
“Neural Decay” by Mario Klingemann (and CycleGAN)
I entered my tenure in film studies with a deep belief in the role of the auteur. I looked for their signature everywhere and treated my preferred auteurs with reverence, as if each were an endless, uncharted territory I could master.
I left my tenure with a sense that auteurs were both audience projections and industry arrangements: studios categorise films through genre when marketing, and if there was an auteurial presence (a recognisable brand), they could use it, much like they use ‘science fiction’ or ‘neo-noir,’ in posters and trailers. And so it was predominantly a marketing tool, but for us as viewers, also a set of symbols; a language with a human’s face; a language with recognisable, consistent meanings, that one nevertheless could coherently parse and organise along the axes of this invisible person, whose touch is felt formally, narratively.
And of course, if you were entering the industry as an artist, you would want to participate in this exchange in order to grant yourself some authority and presence: to consciously bend the trajectory of your interests and expression so as to become more self-reflexive, and thus more “auteurial.” The New Wave, the critics who had birthed the framework that resuscitated Hitchcock’s reputation,¹ had begotten an incredibly efficient feedback loop of postmodern reflexivity along the creative tradition.
In the last years of my education I explored why we attributed the ‘auteur’ role to the director, and posited that sound designers had a real claim to the role, too. We prioritise the image over the sound psychologically; but most importantly, we prefer to attribute a work to a single source of organisation and management.
What does this mean when facing an increasing intrusion of generative models in the artistic creation process?
Programmatic asset creation becomes more incentivised among bedroom creators until the human signature gets sublimated away — imagine indie game creators as GAN composers, guiding the narrative, the hero's journey, through particular seeds— Matilde Park (@matildepark_) May 5, 2019
Especially when it comes to interactive media – itself already swamped with overworked, underpaid creatives – the incentives to adopt programmatic artistic creation have never been higher, at the same moment that it has never been more feasible at the consumer level.
I want to posit that our understanding of the auteur extends outside this sublimated projection – a Foucaultian understanding of the signature of the auteur-function – to a cognitive process that anticipates and charts the implicit associations of other models of mind. That is to say, we model the way other people think, and what they will think next, when we engage with them as auteurs, all as a form of recreation – and as a consequence of this process, we can extend auteurship to the implicit, albeit externally-observed associations generated by adversarial networks and machine-driven artistic creation more generally.
Understanding auteurs systematically
There’s an interesting moment in Gwern Branwen’s exceptionally well-researched analysis of the 2009 Death Note script where they offhandedly remark, while comparing the script’s linguistic style against a control group of prolific fanfiction authors²:
In the middle, splitting the difference (which actually makes sense if they are indeed more competently or “professionally” written), are the “good” fanfics I selected. In particular, the fanfics by Eliezer Yudkowsky are generally close together - vindicating the basic idea of inferring authorship through similar word choice.
This is, of course, an obvious assertion given that they’re computationally assessing linguistic similarity to assert authorship to begin with. However, it struck me to ask how one would go on to programmatically assess what is more or less a spiritual exercise.
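To make “inferring authorship through similar word choice” concrete, here is a minimal sketch. It is not the method used in the Death Note analysis; it simply assumes each text can be reduced to relative word frequencies and compared by cosine similarity. The function names (`word_profile`, `cosine_similarity`, `closest_author`) are hypothetical, chosen for illustration.

```python
from collections import Counter
import math
import re


def word_profile(text):
    """Map each word in the text to its relative frequency."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {word: count / total for word, count in Counter(words).items()}


def cosine_similarity(p, q):
    """Cosine similarity between two word-frequency profiles (0.0 to 1.0)."""
    shared = set(p) & set(q)
    dot = sum(p[w] * q[w] for w in shared)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def closest_author(unknown_text, samples):
    """Return the name in `samples` whose text is most similar to unknown_text.

    `samples` is a hypothetical dict of author name -> known writing sample.
    """
    profile = word_profile(unknown_text)
    return max(
        samples,
        key=lambda name: cosine_similarity(profile, word_profile(samples[name])),
    )
```

A real stylometric study would use far larger samples and usually restrict itself to function words, but the principle is the same: an author leaves a statistical habit behind, and the habit can be measured. What it cannot measure is the “spiritual exercise” itself.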
When it comes to written content, we could model it much like we model predictive keyboards, building a corpus and inputting queries against it. But when it comes to the interpretive, “dream analysis” elements of a David Lynch film, or the lurid mania of Andrzej Żuławski’s work, we aren’t discussing the prediction of semantics and syntax. We’re discussing filmic, formal syntax and a sublimated human’s relations between concepts and objects. We’re observing the consequences of their relation to the world, to their environment, and reverse engineering its infliction on their ongoing meaning-making – especially when that meaning-making stays predominantly stable, with subtle shifts over multiple works.
Scott Alexander wrote a great profile earlier this year on GPT-2, a predictive language model that takes a prompt and responds with well-written but dreamlike answers:
GPT-2 composes dream narratives because it works the same way as the dreaming brain and is doing the same thing. A review: the brain is a prediction machine. It takes in sense-data, then predicts what sense-data it’s going to get next. In the process, it forms a detailed model of the world.
GPT-2 works the same way. It’s a neural net trained to predict what word […] will come next in a text. After reading eight million web pages, it’s very good at this. It’s not just some Markov chain which takes the last word (or the last ten words) and uses them to make a guess about the next one. It looks at the entire essay, forms an idea of what it’s talking about, forms an idea of where the discussion is going, and then makes its guess – just like we do.
If you haven’t read the piece (and its sequel) I highly recommend giving it a look, though it’s increasingly becoming the case that these profiles are immediately outdated. Imagine you took a student, sat them down, and said, “Explain what factors contributed to the outbreak of the First World War.” They would think about what they’ve read and heard and summarise it in their own words, adjusting the delivery, length or content for an implied audience. This is, essentially, what we’re able to produce artificially right now, albeit in a rudimentary way.
I’m not here talking about the magic of the future, though. I’m not versed enough to speculate on the future of artificial intelligence or on what is becoming possible. I care about what we can already produce, with these models as tools, during the creative process. And in this case, beyond the model itself, what’s of interest is its similarity to the process by which the brain forms its connections – and, too, why the current results look so dreamlike:
…the brain is always doing the same kind of prediction task that GPT-2 is doing. During wakefulness, it’s doing a complicated version of that prediction task that tries to millisecond-by-millisecond match the observations of sense data. During sleep, it’s just letting the prediction task run on its own, unchained to any external data source. Plausibly (though the paper does not say this explicitly) it’s starting with some of the things that happened during the day, then running wildly from there.
So, in a sense, dreams are meaningful in the same way that hitting the middle button on your predictive keyboard over and over is meaningful: it’s what a system is anticipating will come next given no input. When you plug a mixing board into itself, it still makes sound. It’s these sounds, left alone, amplified upon themselves, that can best point to difference, the little carvings an environment makes in my psychological profile – and, too, what one provides to others in communication. Predictive output, patterns of behaviour and reason, even if the output is unpredictable in itself.
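The “middle button” image can be sketched in a few lines. This is a toy, and it assumes the crudest possible predictor: a bigram (Markov chain) model that only looks at the last word, which is exactly the kind of model the quoted passage says GPT-2 is not. The names `build_bigram_model` and `press_middle_button` are hypothetical.

```python
from collections import Counter, defaultdict


def build_bigram_model(corpus):
    """Count, for every word in the corpus, which words follow it."""
    words = corpus.lower().split()
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model


def press_middle_button(model, seed, presses):
    """Start from `seed` and repeatedly accept the most likely next word."""
    output = [seed]
    for _ in range(presses):
        candidates = model.get(output[-1])
        if not candidates:
            break  # the model has never seen anything follow this word
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


if __name__ == "__main__":
    model = build_bigram_model("the sea is the sea is the sky and the sea")
    print(press_middle_button(model, "the", 5))
```

Given no further input, the system feeds on its own output and settles into whatever its corpus has carved into it. With a corpus this small it loops almost immediately; with a larger one the wandering takes longer to repeat, but it is still the corpus talking.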
Predicting thoughts vs. seeing signatures
Foucault’s description of the author as a societal function has served as an ample resource for our general postmodern disposition to treat auteurship as false, constructed, and useful.
The function of an author is to characterize the existence, circulation, and operation of certain discourses within a society.
Michel Foucault, “What is an author?”³
An idea has an owner, whose name is identical to a person’s name, but is essentially separate from their lived experience; a sublimated phantom. By constructing ideas around an attached name, we can categorise a group of ideas, or a discourse, accordingly. It operates not through attribution but through a “complex operation whose purpose is to construct the rational entity we call an author”⁴ and is essentially projected as part of the deciphering process, much as a string of letters gets deciphered and projected into thoughts and images. A hallucination.
This is useful for recreational purposes, but it also serves a great purpose when it comes to commodifying ideas, aesthetics and modes of creation – the heights of auteurship are built on institutions of commerce and control. This has implicit limitations when it comes to post-capitalist, post-human participation in the process of creating ideas, aesthetics, and discourses. What role do thinking tools have in the process? Do they own anything, and do we own them?
Yes, yes, this is very Star Trek, but the tool itself has a set of predictive associations – a characteristic look – both similar to, but separate from, the characteristic look of film stock and the characteristic sounds of particular guitars.
After all, those are physical originators of formal elements. GANs make associations between thoughts in rudimentary, but recognisable patterns.
The GAN “look”
Jonathan Fly applied the StyleGAN model to Magic: The Gathering cards, and, well, they’re terrifying.
I wanted a pure Magic Card StyleGAN model for future mashups. Didn't work out.— Jonathan Fly (@jonathanfly) May 5, 2019
Took forever and kept having to backtrack checkpoints, I think because the Generator gets too good too fast making at just the card text half.
The failed models are still kind of cool though! pic.twitter.com/SLv2Bee2dl
When I saw these cards interpolate, what I noticed was that my brain was struggling to make meaning from the letters. They were very visibly reminiscent of the Latin alphabet, but they were as-yet-unseen letters, letters without meaning, letters I myself couldn’t conceive. The GAN, after all, has a way of seeing and drawing that I don’t.
A particularly low tree and a truck can look the same to a computer.⁵ So, too, its drawings of people can seem monstrous in a way I couldn’t anticipate (and can rarely forget). In my opinion, it’s capable of rendering images that otherwise would escape human prediction. Its overall model of prediction is predicated on methods as essentially separate from mine as processor architectures, and the first thing we’ve tried to get it to do is to emulate mine as closely as it can (or rather, the outputs that come from mine) – and, of course, all I see are the things it isn’t, pointing to the myriad tendencies in approach that separate its thought from mine.
It essentially posits itself, then, as the Ultimate Other – the biggest Difference I’ve yet encountered in aesthetics – and I wouldn’t rush to discount it as just another tool. If it’s fair to say that enjoying someone’s auteurial signature is a matter of ongoing observation of their meaning-making, plus its translation into formal and narrative approaches, then I think we’re on the precipice of an explosion of our observed, anticipative generation of those very things as GANs become more accessible and commonplace. A billion billion auteurs in the balance. And what then?
Are we to extend the ‘tool’ metaphor outward one more step? Are we now to be composers over composers? What is its relation to our current model of auteurship (as a composer over hundreds of people contributing to a sublimated self), and when does one slot above the other, machine over man, into the role of auteur?
¹ He was regarded much like Shyamalan was! And, in turn, Shyamalan is getting his own renaissance, thankfully. For information on Hitchcock’s work toward legitimising the medium (and his mode, suspense) see Frank Krutnik’s “Theatre of thrills: the culture of suspense”.
² Calling him “Eliezer Yudkowsky, the prolific fanfiction author” is both very funny and yet, not all that incorrect.
³ Foucault, Michel. “What is an author?” Language, Counter-Memory, Practice: Selected Essays and Interviews. Ed. Donald F. Bouchard. Ithaca: Cornell University Press, 1980. Print. 124.
⁵ (that has undergone the process of interpreting binary data into an image parser that identifies key points in the image, attributes a ‘shape’ and other qualities of the identified entity in the image, to similar images of the same object, and is then fed another image to compare against)