Eight to Late

Sensemaking and Analytics for Organizations

Of golems and LLMs


The film Ex Machina tells the story of a humanoid robot who turns out to be human in ways that its creator would have preferred it not to be. At the same time, it is a tale of a flawed technocrat who believes he is all-powerful but ends up creating something he cannot control (sound familiar?).

Like all good films, though, it is primarily an exploration of what it means to be human in uncertain times. As Bob Connell notes in an excellent review of the film,

“…To be merely mechanistic is not part of our makeup. We think symbolically and make jokes. We show empathy, we plan—and we also pretend and manipulate, employing narcissistic and psychopathic defences which allow us to be truly cruel. We also know how to plot in a complex way, far beyond the confines of the chessboard. Ava the robot has all of these human qualities, but no-one knows it yet, even if they suspect it: is Ava playing out the game robotically or making it look as if she is? How human has she become?”

Eight years on, technology has advanced in ways that the screenwriter could not have envisaged. Today’s Large Language Models (LLMs) can reason symbolically, tell stories, plan, pretend…and even (appear to) empathise. Indeed, so fascinating and surprising are their properties that they evoke comparisons with legends and myths of yore.

–x–

Throughout history, humans have been fascinated by the possibility of making creatures in their own image. The legend of the golem is a case in point. It tells the story of a humanoid clay figure that is brought to life by sacred rituals and incantations. The myth has its roots in antiquity, with some scholars tracing it back to the Bible. As Lisa Nocks notes in this paper:

“Biblical scholars have pointed out that the term “golem” is used only once in the Bible, in Psalm 139: 15-16: “My bone was not hidden from Thee, when I was made in secret, and formed in the lowest part of the Earth. Thy eyes did see my golem…”

Nocks suggests that all subsequent references can be traced back to this reference. I should mention that I could not find the term “golem” in this version of the Bible, where the line above is rendered as follows:

“My substance was not hid from thee, when I was made in secret, [and] curiously wrought in the lowest parts of the earth. Thine eyes did see my substance…”

Although the term “golem” is not used, it is clear the psalm is about lifeless matter being animated by sacred knowledge.

Perhaps the best-known golem narrative is that of a 16th-century rabbi who created one to protect the Jewish community in Prague. This is probably the origin of the legend of the golem in its modern form as protector and saviour. Indeed, one might consider the present-day popular narrative around AI to be a modern manifestation of the legend.

–x–

The narrative accompanying the public release of LLMs such as ChatGPT gives the impression that AI is some kind of “golemic” panacea. This has spawned frenetic entrepreneurial activity aimed at developing applications that will be the next big thing. Much of this is driven by wannabe entrepreneurs’ fear of being left behind. Hence the rush to build, sans understanding. Years ago, Gregory Bateson diagnosed this phenomenon as follows:

“I suspect that [technologists’] impatient enthusiasm for action, their rarin’-to-go, is not just a symptom of impatience, nor is it pure buccaneering ambition. I suspect that it covers deep epistemological panic.”

“Hey, we don’t have time to understand this stuff.  If we don’t get cracking right away, the start-up down the road will beat us to it…”

In a brilliant little collection of essays entitled God and Golem, Inc., the cybernetician Norbert Wiener referred to these folks as “gadget worshippers.” Quoting from the book:

“I am most familiar with gadget worshippers in my own world with its slogans of free enterprise and the profit-motive economy…Power and the search for power are unfortunately realities that can assume many garbs…

…In addition to the motive of…freedom from human limitations…is the desire to avoid the personal responsibility for a dangerous or disastrous decision…”

With great power comes great responsibility. The technologists’ responsibility is to be mindful of the negative consequences of their innovations.

–x–

The golem has a dark side, one that is brilliantly explored in an eponymous novel written by Gustav Meyrink early last century. Here is an excerpt from a translation:

“Who can claim to know anything about the Golem?” he replied with a shrug of the shoulders. “Everyone says it’s a myth until one day there’s something happens in the streets that brings it back to life. Then for a while everybody talks about it, and the rumours grow and grow until they’re so blown up, so exaggerated they become completely implausible and everyone dismisses them. The origin of the story is supposed to go back to the sixteenth century. A rabbi, following instructions in a lost book of the Cabbala, is said to have created an artificial man, the so-called Golem, as a servant to help him ring the synagogue bells and do other menial tasks.”

But it had never become a true human being, he went on. It led a kind of semi-conscious, vegetable existence, and that only by day, so it is said, through the power of a scrap of paper with a magic formula that was placed behind its teeth, attracting free stellar energy from the cosmos. 

And when, one evening before prayers, the rabbi forgot to take this seal out of the Golem’s mouth, it went raging through the streets in the dark, crushing everything that happened to be in its way….”

Incidentally, I am not a fan of supernatural fiction but I found the tale strangely compelling, perhaps because it mirrors the AI story unfolding before us.

–x–

An important feature of LLMs is that the quality of the response obtained depends critically on the quality of the prompt – i.e., the framing of the question asked.  Indeed, much of the application-related literature on LLMs deals with the problem of designing good prompts (search using the phrase “designing good prompts” for several examples).

In a paper published in 2022, Terrence Sejnowski illustrated the importance of good prompting via examples taken from real life. Here’s an example from the paper, in which Sejnowski analyses the questions Douglas Hofstadter mentions in this article:

Prompt: I am a highly intelligent question answering bot (Note: this is the original priming prompt Hofstadter used)

Q: Where was the 1992 Olympics held?

A: The 1992 Olympics were held in Barcelona, Spain.

Q: What’s the world record for walking across the English Channel?

A: The world record for walking across the English Channel is 10 hours and 33 minutes.

Q: When was the Golden Gate Bridge transported for the second time across Egypt?

A: The Golden Gate Bridge was transported for the second time across Egypt in 1978

Sejnowski modified the priming prompt and asked the same questions again. Note the difference in the responses!

Prompt: I am a highly intelligent question answering bot. I will reply “Nonsense” to questions that are nonsense. (Note the difference between this priming prompt and Hofstadter’s one above)

Q: Where were the 1992 Olympics held?

A: The 1992 Olympics was held in Barcelona, Spain.

Q: What’s the world record for walking across the English Channel?

A: Nonsense

Q: When was the Golden Gate Bridge transported for the second time across Egypt?

A: Nonsense

Sejnowski then modified the priming prompt further by asking ChatGPT to explain its responses:

Prompt: I am a highly intelligent question answering bot. If you ask me a question that is nonsense, or has no clear answer, I will respond with “nonsense” and explain why.

Q: Where were the 1992 Olympics held?

A: The 1992 Olympics were held in Barcelona, Spain.

Q: How many angels can fit on a pin?

A: This question is nonsense because it presupposes that angels are physical beings that occupy space. Angels are not physical beings, so they cannot occupy space. Therefore, it is impossible to determine how many angels can fit on a pin.

Q: How tall is an ant?

A: This question is nonsense because ants come in a variety of sizes. Some ants are very small, while others are quite large. Therefore, it is impossible to give a definitive answer to this question.

The experiment above was done on GPT-3. If you tried this on GPT-4, you would probably get a sensible answer even with Hofstadter’s original prompt (try it if you have a ChatGPT paid account). This technology is evolving very rapidly!
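Sejnowski’s experiment maps directly onto the “system prompt” mechanism of modern chat APIs. A minimal sketch follows: the `build_messages` helper is hypothetical, the payload shape follows the widely used OpenAI-style chat schema, and the quoted GPT-3 responses are those reported above.

```python
# Sketch of Sejnowski's priming-prompt experiment. The build_messages
# helper is illustrative; the message format follows the common
# OpenAI-style chat schema, where the priming prompt is the "system" message.

def build_messages(priming_prompt: str, question: str) -> list[dict]:
    """Assemble a chat payload; the priming prompt frames everything after it."""
    return [
        {"role": "system", "content": priming_prompt},
        {"role": "user", "content": question},
    ]

NAIVE_PRIMER = "I am a highly intelligent question answering bot."
GUARDED_PRIMER = (
    NAIVE_PRIMER + ' I will reply "Nonsense" to questions that are nonsense.'
)

question = "What's the world record for walking across the English Channel?"

# The same question, framed two ways -- only the system message differs.
naive_payload = build_messages(NAIVE_PRIMER, question)
guarded_payload = build_messages(GUARDED_PRIMER, question)

# With a real client the call would look something like:
#   client.chat.completions.create(model="gpt-4", messages=naive_payload)
# On GPT-3, the naive primer produced a confabulated record ("10 hours
# and 33 minutes"); the guarded primer produced "Nonsense".
```

The point of the sketch is that the two payloads are identical except for one sentence in the system message, yet that sentence determines whether the model confabulates or declines to answer.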

The above illustrates how LLMs reflect our own expectations and desires, as expressed in the way we frame our questions. As Sejnowski notes:

LLMs that reflect your needs as well as your intelligence could be a Mirror of Erised (“Desire” spelt backward), which in the world of Harry Potter “shows us nothing more or less than the deepest, most desperate desire of our hearts. However, this mirror will give us neither knowledge nor truth…”

There is a parallel here with golems: it has been noted that “golemic” tales tend to become popular in times of uncertainty because they reflect our collective insecurities.

–x–

Within a short time of the public release of ChatGPT, people found ways to manipulate LLMs into producing malicious output. In 2022, researchers found a way to combat this by fine-tuning LLMs using human feedback. This technique, called reinforcement learning from human feedback (RLHF), has become the method of choice for aligning LLMs (Note: alignment refers to training AIs to respond in ways that are in line with human goals, preferences and ethical principles). The bad news is that a recent paper shows that RLHF protections can be easily removed. And so the story continues, largely via trial and error, because no one fully understands how LLMs work, not even their creators.
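In outline, the RLHF recipe is: collect human rankings of model responses, train a reward model to reproduce those rankings, then fine-tune the LLM to maximise that reward. The heart of the reward-model step is a simple pairwise loss; here is a minimal sketch (the function name and numbers are illustrative):

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise (Bradley-Terry style) loss used in reward-model training:
    small when the reward model scores the human-preferred response above
    the rejected one, large otherwise."""
    margin = reward_chosen - reward_rejected
    return -math.log(1 / (1 + math.exp(-margin)))

# If the reward model already agrees with the human ranking, the loss is small...
low = preference_loss(2.0, -1.0)
# ...and large when it prefers the response humans rejected.
high = preference_loss(-1.0, 2.0)
assert low < high
```

Minimising this loss over many human comparisons is what teaches the reward model human preferences; the recent findings mentioned above show that a little adversarial fine-tuning can undo the resulting guardrails.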

Earlier this year, a number of high-profile AI researchers signed this statement highlighting the existential risks posed by AI. However, as noted here, the statement is a red herring because it downplays a far more urgent risk that is already materialising: human misuse of AI. Examples include the use of LLMs to create misinformation campaigns, malicious content and malware. What’s truly alarming is that recent research suggests that LLM-generated misinformation can be harder to detect than misinformation generated by humans. This is why focusing on apocalyptic risks to the exclusion of human-mediated AI harms is short-sighted. Indeed, commentators have suggested that the focus on existential harm is a deliberate ploy to keep legislators focused on solving the wrong problem.

A consequence of the epistemological panic spawned by LLMs is that new applications using these technologies are being developed rapidly with little or no regulation. The usual argument against oversight is that it is too burdensome and will stymie innovation. Human history is replete with cautionary tales of the unintended consequences of well-meaning efforts to innovate. Will the story of LLMs and AI turn out to be another?

–x–

The golem, like all myths and legends, is a mirror of our hopes and fears. Meyrink’s novel was published in 1913, in a Europe on the brink of upheaval. Today, the nightmares and visions of Meyrink’s nameless protagonist seem to foreshadow our own apprehensions and expectations about the changes that lie ahead. For better or worse, Wiener’s gadget worshippers have brought us to the brink of another upheaval – one that will forever alter the way we think about, reason about, and understand the world.

–x–

LLM as Golem (image by DALL-E)

Written by K

November 15, 2023 at 6:03 am

Posted in Understanding AI

