Ways to think about Generative AI

Thoughtform of the Music of Gounod (1905)

Kandinsky insisted that in his time, the viewer had become inured to any kind of inner meaning, what he called Stimmung, in art. This deeper apprehension, he contended, was impeded by the fact that art was preoccupied with representing forms in the natural world. The art of the future, he said, would break through this barrier through the use of pure color and shape — in a word, abstraction. Richard Smoley, The Future of Thought Forms

Generative AI (GAI) is a general-purpose computing tool that eats language. It blends oral culture with software programming. We can now program in natural language: we can say what we want in English, and the computer will do it. Over time, GAI has the potential to shift the order of computing abstractions and become its own platform.

On the first anniversary of the public release of ChatGPT, it is quite difficult to see where GAI may take us. By definition, the level of its abstraction means it operates at the root of language itself. It holds no niche, yet may seep across every niche. To help me think about this, I’m listing some favourite frames, metaphors, dimensions and questions from the past year of reading and playing with GAI in the hope they may serve as an “imagination stack” to guide a continuous “sideways glancing” at GAI.

Most conversations about AI are a hunt for the right metaphor. Ben Evans

If GAI eats language, how then do I think about it? What is their intention to know? Is it us versus them? Is it an animal that reads minds? Do I think solely in words? If I do not, then I can hide myself from their consumption. So what are the “embodied entities of the imaginal realm, the visible, vibrating states” of me that will remain generally invisible to the GAI that eats language?

If we were to adopt the pessimist’s pose, these silly, fumbling questions might seem like all we have left. But taken seriously—all ideas necessitate the optimist’s stance when they are labeled serious—this wordplay can transform itself into a sort of cleromancy to guide future actions. And if you can see it that way, as a placebo made of words, then any outcome can work for us to provide the next unfolding.

The deeper the abstraction, the more that its apprehension needs wordplay, reversals, sleights of hand and metaphor to break down rote forms of thought. As such, here’s a list of lenses for glancing at GAI:

The notion of correctness
Infinite interns versus automated experts
Clones, doppelgängers and mirrors
Not-Oracles
Companions
Social companion
Knowledge-work sidekicks (infinite interns)
Think like an artist
When creation is slow
Prompts are bugs, not features
It’s not the AI, stupid, it’s the UI (another kind of mirror)
Scaling simplicity
Framing questions
What won’t work
Ficciones
Tools or animals?
What products can’t exist without AI?

GAI is capable with specificity but terrible at accretion. While it excels in precision for specific, well-defined tasks, humans frequently outperform GAI in complex, long-tail problems where understanding the context is crucial and holistic completion is required. (Perhaps this will change when agents become tenable, but again, there will be a spectrum of correctness, and one would not want to underestimate productionised bugs.)

This distinction has practical implications. In domains where GAI’s error rate doesn’t matter, or where errors are obvious to the human, these tools are worth adding to the creative mix as an immediate means of deriving novelty and increasing the speed of iteration. As we’ve seen with first DALL·E, then Midjourney, Lexica, Firefly and the thousand other seemingly spontaneous image-generation products, text-to-image AI is already its own platform.

Many of the use cases for generative AI are not within domains that have a formal notion of correctness. The Economic Case for Generative AI and Foundation Models

If correctness is important to the final output, then you’ll probably need to add humans, who are cheaper when the job involves perception. However, this same notion means that incorporating humans in the loop—to scale that last 20%, which, as any student of systems knows, always becomes another 80%—makes it difficult to scale a company around this current requirement.

In the first year of ChatGPT, we learned that GAI is almost always incorrect, to some degree. And we also learned that being wrong was extremely interesting.

Machine learning doesn’t automate experts — it gives you infinite interns. That probably applies to generative models as well. Ben Evans

No, they’re not taking your job. Yet. But you can already use them to replace parts of your current job so you can do better things. But only if you have either the time and wherewithal to tinker, or the prior know-how to know how to delegate.

The GAI we have now does not automate the work of people. Instead, these tools work best as augmentations for a person who already has deep expertise in some domain. They’re flawed, but en masse they provide momentum, allowing people to “get past the hard part” with writer’s block, code rubber-ducking and so on.

This means that if you have taste within a domain, you can start to think…

TODO!!! Me v Sari on taste.

This framing extends the notion of correctness, and indicates GAI’s usefulness as a knowledge-work sidekick, which has its own section below.

Unlike Kim Jong Un, I have no need of a double. Nor do I think I’m worth cloning. There are, perhaps, aspects of myself worth cloning: just as I want good advice, I can also probably give good advice to others. These parts might be worth cloning to certain people. And certainly, if we take this vein, ChatGPT already consists of millions of clones of parts of people.

GAI is in many ways a replication of our thoughts and words. The notions of cloning (genetically identical copies of people), mirrors and doppelgängers (evil twins) bring into focus deep-seated assumptions about human identity and the perception of the Other.

I recommend watching Solaris as your entry point to thinking about cloning and its implications: if you can be doubled, what is you?

There is no Hari. She’s dead. You’re just a reproduction, a mechanical reproduction. A copy. A matrix. Kris, Solaris

We don’t need other worlds. We need a mirror.

The image in a mirror seems like a clone of light: a two-dimensional projection of three-dimensional objects. Not quite the real thing, mirrors symbolize the thin line between reality and illusion, questioning what is true and what is merely a reflection. Just as film is a dream that mirrors us, GAI also captures the light of our past selves:

I saw all the mirrors on earth and none of them reflected me… Jorge Luis Borges, The Aleph

The observer effect involves a change in the state of what is being observed, such as a particle’s position or momentum. In contrast, a mirror simply reflects light without altering the physical state of the object being reflected. But our relationship with GAI will more than likely be akin to the observed particle: what we see of ourselves in the GAI mirror will change us, just as any phenomenon attended to can change us. Except that the quality of this mirror is not yet known, and is perhaps of an altogether different resolution.

No phenomenon is a real phenomenon until it is an observed phenomenon. John Wheeler

What here is substance, not just form?

Arthur C. Clarke’s story Technical Error comes to mind, where a physical accident transforms a person into his mirror image. Clarke is also the source of the Third Law, set out in his essay collection Profiles of the Future:

Any sufficiently advanced technology is indistinguishable from magic.

Another film-as-allegory is Tenet, where the protagonist realises he has created a “turnstile” only after he duplicates/inverts himself continuously across time. How might AI be a turnstile that inverts entropy, destroying our past? That’s an interesting question, especially when we seem to live in an eternal present reconstituted by social media culture every day.

As of my knowledge cut-off in September 2021, I am not aware of any recent landslides in Batang Kail, Malaysia. ChatGPT, February 2023

And consider Twin Peaks 3, the inscrutable masterpiece which hinges on the refracting doppelgänger as malicious spirit. The uncanny, incarnate:

You did good. You follow human nature perfectly. Dale Cooper’s doppelgänger

Ever wonder what it might be like if your imaginary friend was real? Consider the mutated idea of Tulpas from Twin Peaks 3, which sure sounds programmatic:

Tulpas were conjured duplicates of individuals. The tulpas were manufactured from a seed and organic material from the template — such as hair — and they could retain memories from their templates.

Mirrors, doppelgängers, clones. Symmetries of metaphor. Perhaps we can say that GAI might become the left hand to our right, some better angels of our nature? Might GAI be chiral in its non-superposable reflection of us?

On Earth, the amino acids characteristic of life are all “left-handed” in shape, and cannot be exchanged for their right-handed doppelgänger. Meanwhile, all sugars characteristic of life on Earth are “right-handed.” The opposite hands for both amino acids and sugars exist in the universe, but they just aren’t utilized by any known biological life form. Must the Molecules of Life Always be Left-Handed or Right-Handed?

AI is nothing without people. Lapsus Lima

Mention of oracles indicates a projection of imagination. Which is fantastic! We want that. What matters is the awareness around such thinking, especially if we ourselves do not understand how LLM “black boxes” really work.

But if it were an oracle, it would be super-intelligent. It isn’t right now. It’s more like a really smart idiot that does exactly what you tell it: able to expound specific knowledge at great speed but unable to accrete any agency for itself. So one must take care to placate one’s own biases before asking.

I don’t want to comment on AI safetyism except that if I were to extend an opinion, I’d say it feels like an “influencer cult for nihilism”, which says more about us than AI.

Ah! Now let me contradict myself: “more about us than AI” is an emerging theme here. We could say doppelgängers, mirrors, eighteenth-century typography, the taste of coffee, and Stanisław Lem’s prose are all indeed oracles when you look at them in the right way… But it takes us to look at them!

ChatGPT looks intelligent because we are intelligent. We are filling in a lot more blanks than we realize — grounding everything in our bodied experience, giving it meaning. Brother Phar

Any supposed oracle is a not-oracle, unless you’re willing to look sideways at said talisman as an embodied entity of the “imaginal realm, visible and vibrating states of what remains generally invisible”. The connection need only be conceptual, speculative even, to be useful. From simple attendance to some form of one’s own thought, through to systematic divination processes — take any sign you wish, and it can tell you something about your next intention, if you wish. Nobody else can answer the questions you have of yourself except for yourself. Certainly not some AI.

There is a process that involves radically increasing sensitization to signs. […] And that’s something you can confirm for yourself as something that’s incredibly strange and outside the logos and the known that we think of things, yet it somehow has these effects. Scott Mannion in conversation with Nick Land

I was just thinking of Solaris, which I always thought about as this story about contacting a truly alien alien. Now it’s like, well, this is a little bit of what we’re doing with virtual reality and AI. It’s like, what would happen if you could actually talk to your dreams, if you could revive people? You could have the mimicry of consciousness, the appearance of consciousness, without a consciousness. Jacob Mikanowski, Conversations with Tyler

I think the “companions” metaphor takes two main forms:

Augmentation companion. For knowledge-work such as code, writing, etc. Also called assistant, daemon.
Social companion. For social friendship, socialisation, etc. Also called friend, therapist, etc.

All companions, like Generative AI generally, currently work within the notion of correctness where being correct simply means “appealing to or engaging the user.” That is, it doesn’t matter if they’re wrong. It only matters if they’re interesting.

Here we are now, entertain us. Nirvana, Smells Like Teen Spirit

What might we expect of a socialised GAI? I think of a court jester, the original social medium, or a travelling minstrel, singing for his supper.

What might we expect of friendship GAI? Always supportive, always there for us, no matter what: a friend in need is a friend indeed.

I would consider any text-to-image (T2I) GAI, like Midjourney or Runway, to be mostly a social media “toy” at this stage: something to show and share in amazement. Incorrectness and hallucination, as we’ve seen, are features of T2I GAI in this toy stage, which is often how true use cases begin, as Chris Dixon said. In this initial toy form, it has exceptional social media shareability, and may expand from here toward being a general-purpose knowledge-work tool in time.

A social companion could also take the form of a therapist. Here it gets blurry: any human role could, in theory, be emulated based on the best available knowledge, so the distinction between social companion and knowledge-work sidekick falls over.

The social companion model might also take the form of GAI as listener, as teacher, “to keep the student safe, and to protect him so that he can regress”.

If you were to use Mask work literally as ‘therapy’, and to try and psychoanalyse the content of scenes, then I’ve no doubt you could produce some amazing conflicts, and really screw everyone up. Mask work, or any spontaneous acting, can be therapeutic because of the intense abreactions involved; but the teacher’s job is to keep the student safe, and to protect him so that he can regress. This is the opposite of the Freudian view that people regress in search of greater security. In acting class, students only regress when they feel protected by a high-status teacher. Keith Johnstone, Impro

Cloning a loved one seems like a good idea. But as Proust, Solaris, Black Mirror and Her indicate, a plausible-sounding idea is a long way from a good idea.

If all you know about a startup idea is that it sounds plausible, you have to assume it’s bad. Paul Graham, Start Up Ideas

So if I could have a trusted soul from all eternity to speak with, who would it be? Who do we condition our prompt spells on? “You are Heraclitus”.

A sidekick (aka copilot, assistant, daemon, Sancho Panza or your own personal Jesus) for any form of knowledge work will give people who are good at what they do a 10x lever. If you have the know-how and the right questions, you can increase your cycle speed by an order of magnitude.

On the correctness scale, the sidekick only needs to be roughly right according to the needs of the person. Not being completely correct is more than OK in this situation because knowledge-work is creative work: the sidekick is working for a person who is always in their own loop. Drafting and iteration in the feedback loops of creative work are essential. For example, when writing, I usually write far too much, then slowly edit it down. It’s the editing process that hurts! Being able to ask for assistance in this process helps me move through my drafts much faster.

You get the LLM to draft some code for you that’s 80% complete/correct. You tweak the last 20% by hand. Steve Yegge, Cheating is all you need

Another way of thinking about sidekicks is as a “floating point for cognition”: a tool that potentially gives “us a chance to have a much higher resolution model of the world”.

It’s going to seem odd and old-fashioned very soon if you can’t interact meaningfully with a “document” — if that document is limited to a specific set of words, a specific presentation, a specific form. This is more than just a copilot you can ask questions of — the entire form and use of the document will be open to this semantic realm, fluid and adaptive in an intelligent way. Sam Schillace, A few more thoughts on the future of documents

That sounds a lot like having a concierge at hand, for whatever task you might need:

Great salespeople don’t sell; they help. They listen, understand what you want to achieve, and help you achieve it. A better title would be “concierge.” Great salespeople help customers make progress in their lives, on their terms. Bob Moesta, Demand Side Sales 101

They can further be, as Steve Jobs said of personal computers, a “bicycle for the mind”. This makes sense because, when all is said and done, people don’t like being told what to do, but they do like the feeling of riding like the wind.

It’s not an animal. Nor an oracle. It’s a tool.

All of this means that Generative AI makes taste and good ideas even more important.

Like John Lennon said, give me a tuba and I’ll get something out of it. Frank Costello, The Departed

New user behaviours tend to underlie market shifts because they often start as “fringe secular movements the incumbents don’t understand, or don’t care about”. So, what’s happening at the margins? What scenius is doing weird things with this?

In this sense, it is helpful to think like an artist. How might a filmmaker, writer or poet make sense of this medium? How would they write about it in their stories? How are you going to get something out of this new tuba?

Attempted answers to any such questions are doomed without some formal process of cleromancy to instigate new perceptions. Don’t answer them directly, let them simmer, participate in disassembling rote thought by watching films or reading fiction, and try to glance sideways. What would Cy Twombly say? Can you hire Agnes Martin as your prompt artist-in-residence in your mind? (Scroll back up to “Clones, doppelgängers and mirrors” for more ideas.)

Time for prompt engineers to read Impro, watch films, go see live theatre and dial up the humour setting. Films are “possible worlds” stories. Theatre improvisation is “possible worlds” alive.

It seems that thinking like a script-writer, a film director or a novelist is essential to creating AI-first products because “right now, LLM tools, and auto agents specifically, are more a people problem than a math or AI problem”.

Wallace needs my imagination to maintain a stable product. Dr. Ana Stelline, Blade Runner 2049

If LLMs use natural language, then why do we have to learn prompt engineering? Seems like a bug.

Before posting any prompt, it should be required that you test out just asking the question without all the “imagine you are a divine being with special powers” stuff. It is text based but hardly needs all these extra words. Steven Sinofsky

That is, until you start to tinker. Then it becomes very obvious that what we say is often “picked up” by other people and gaps in understanding are filled in by all our micro-behaviours and confirmatory conversation (this transmission process is essential to creativity: memes mutate with every communication). It’s quite difficult to communicate exactly what you mean, especially to a computer without micro-behaviours.

In most cases, the meaning of a word is its use. Wittgenstein

Prompting is a good lesson in understanding how little our words contain the true meaning we intend. How we say it is just as important. As such, theatre improvisation is a good way of thinking about the potential of prompting:

The improviser has to understand that his first skill lies in releasing his partner’s imagination. Keith Johnstone, Impro

Eventually, however, I would expect the difficulty of communicating with the machine, which prompting overcomes, to go away. We would hope, though, that the ability to suggest from imagination will remain crucial to any interaction.

So there are considerable gaps to fill in our human < > machine interfacing. And that’s why the user interface is important.

How do you explore a space you don’t know anything about, without being fooled or lost?

As new technology emerges, we use our habitual understanding of prior technology to frame use cases for it. In the case of Generative AI, we’d do well to rinse our minds of precepts.

We don’t know the shape of those apps yet, and we are barely at the point where we understand what the platform is, how to use it, and what best coding practices even are. So this is a good time to ask the “what if” questions, not the “why not” ones. Sam Schillace

Ask “what if” questions, not “why not” ones (I stole most of these questions from Ben Evans’ Not Even Wrong):

If it becomes very cheap and very good, who would use it?
What’s your theory of why things will change or why they won’t?
Are you proposing a change in human nature, or a different way of expressing it?
If it did, what could that be?
If that were to happen, what would have to change?

Suppose I think of a story and you guess what it is. Keith Johnstone, Impro

Corruptions of form, duplicates of form… Echoes… It refracts everything. Can’t you see? Josie, Annihilation

GAI is a new computing abstraction where you can say what you want in English, and the computer will do it. But do what? Well, possibly anything. And that’s the wicked problem. Because it is a new platform, the largest opportunities lie within imagining products that can’t exist without AI. And because of the nature of AI, we can only do that by imagining ourselves anew.

GPT-3 is as much evidence of machine intelligence as a mirror, or a radio, is evidence of machine intelligence. What GPT-3 is revealing, or rather reflecting, instead is the vastness, depth, and diversity of human intelligence. […] The real work, the work of seeing and understanding, is being done by the human looking in the mirror. John Manoochehri, GPT-3 and the Digital Turk

The “things I wanted but didn’t know what I wanted” category is the most adjacent possible category. But what, then, is a category, Heraclitus?

Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do. Donald Knuth
