Today is the Ides of March by the old Roman reckoning. It is, of course, most famous as the day of the year when Julius Caesar was assassinated, but long before that it was a day of special importance on the Roman calendar: the traditional start of the campaigning season, when the winter rains (and snows in high country) were over, and the ground was dry enough for Roman legionaries to march forth and hack Gauls, Etruscans, or Samnites to pieces. This was the Roman national sport before they conquered the whole of Italy and hired gladiators to do their hacking by proxy.
As the start of the season, it seems like a good day for this hack to report on recent doings. I have been fiddling about with various AI writing tools, some useless, some worse than useless, and none as useful as their makers claim. The fact is, large language models – LLMs – are not the ‘intelligence’ they are advertised to be. They can mimic human intelligence to the extent that they are trained on a corpus that includes the writings of humans who had something intelligent to say. When pressed beyond the bounds of their source data, or sometimes even when not pressed, they fall back upon bafflegab, vagueness, and a disturbing tendency to simply make things up.
I have found that the ‘AI’ programs with a chat-based interface are actually handiest for developing complex story scenarios, as they don’t try to make every scene self-contained, and I can choose to direct the story in promising ways as it goes along. For instance, I got one of these LLM tools to send my textual alter ego on a trip to the dangerous borderlands of a Viking kingdom. The program, obligingly serving up the distillation of decades of bilge-literature on that general subject, dropped hints indicating that I was, in fact, on an alternate Earth in the middle of the eleventh century. I struck up acquaintances with connections in the Varangian Guard, and made my way to Constantinople, where the real action was. I hope I may tell you a little of the situation it gave me, because it sheds interesting light on the strengths and weaknesses of these models.
In our world, at the time of my arrival, the Byzantine Empire was at its post-Justinian peak, ruling territories from Croatia to Syria, wealthier and more powerful than any other state west of China at that time. But it was in the early stages of being ruined by some of the most outrageous misgovernment in history, and the Turks were moving westwards out of Persia to take advantage of its growing weakness. Emperor Michael IV, an ailing but well-meaning ruler, had just completed a highly successful campaign to subdue a rebellion in Bulgaria, and being a pious man, celebrated his victory by building numerous churches around the Empire. But he had always been troubled with epilepsy, and by now his health was broken; he was, in the natural course of events, to die of gangrene in the legs within a few months.
It was at this point that the LLM dumped me in Constantinople, to work a ‘Connecticut Yankee’ on the situation.
I shall not bore you with details, except to say that I averted the Emperor’s untimely death, packed off his scheming nephew (in our history, the utterly worthless and incapable Michael V) to a monastery, and set about reversing the decay and bureaucratic drift that was already beginning to weaken Byzantium. The LLM made quite an entertaining pastime of this excursion into alternate history. However, it had particular weaknesses that strikingly revealed its limitations.
Some minor points: The model tended to glitch between present and past-tense narration, and sometimes changed persons as well, though it usually settled on second-person present, which was satisfactory for the purpose. It had a tendency to reuse the same names when introducing new characters, so I had to override it to supply a greater variety of names myself. Zoe, for instance, was quite a common name in eleventh-century Byzantium, but not nearly so common as the model made out; and since the Empress Zoe was a considerable character in the story, it was necessary to change the names of other Zoes that tended to crop up, to avoid confusion.
As the scenario diverged further from our own history, the problems grew more serious. The model kept cribbing from actual history, forgetting who the current Emperor was, and sometimes anachronistically introducing characters who would have been mere children at the time when the story was set. I had to correct it on numerous points; fortunately, it accepted the corrections, and was programmed graciously to apologize for its errors. (Not all such LLM clients are so well-behaved, as I found.)
The model tends to introduce additional complications and characters as a way of maintaining narrative tension, rather than developing the elements already in play to make the existing situation more challenging – which is the approach most human writers would wisely take. It created a tapestry of truly Byzantine intrigue, which is good, but kept losing track of the threads, which is very bad. Occasionally, when taking up a thread that had been neglected for a little while, I had to remind it who all the dramatis personae were and what they had last been up to: apparently it was not clever enough to use its own earlier output as context for subsequent scenes of the same story. This all made the story rather reminiscent of something by George R. R. Martin.
One rather pleasant strength of the model is its ability to recognize allusions and play along with them. One of the major political issues of the time was the growing antagonism between the Catholic and Orthodox churches, which would soon culminate in the Great Schism of 1054. The model was capable of talking quite intelligibly about the Filioque controversy, the claims of papal supremacy, and other relevant issues. I even paid a visit to Rome to treat with the scandalous Pope Benedict IX, who (in this alternate history) was a roisterer, a womanizer, and a cad, but nevertheless an intelligent man who could at least understand the implications of a split in the Church for his own position. (I had just helped the Normans move into Sicily against the Saracens, and arranged for a Norman baron to be crowned king by a Byzantine bishop. The Pope was terrified at the thought that such a king might march on Rome next.) In the course of these ecclesiastical affairs, I dropped various references to Latin and Greek authors, in the original languages when I could conveniently get them; the model had no trouble recognizing my sources, translating them, and playing along smoothly. This was great fun.
But the weaknesses remain paramount. The model simply does not grasp the structure of a plotted story – not even the try/fail cycle, or the triangle of ascending and descending action, or the three-act structure – none of the forms that hack writers use to produce low-grade fiction, let alone the subtler ways of plotting that underlie the best literature. It does, however, grasp Raymond Chandler’s tongue-in-cheek advice to writers: ‘When in doubt have a man come through a door with a gun in his hand.’ Several times it ginned up a Turkish invasion, or a rebellion in the Balkans, or some such external and violent trouble to make things difficult for my protagonist. Of course, that also made things even more difficult for the model, which was already swamped with more story details than it could keep straight.
At times (and this goes straight to the heart of what is wrong with LLMs) the model simply lost sight of the difference between history and pulp fiction. An expedition to Egypt, to overawe the weakling Caliph al-Mustansir and bring him onto the Empire’s side, was derailed into something like a bad horror-movie scenario. It afflicted the Caliph with possession by a conveniently buried Egyptian god, a nasty customer who had to be exorcised by a very precise formula. (It is apparently vital, when dealing with Egyptian death-gods, to spurn them with the left foot. What a one-legged priest is supposed to do in such a case, the surviving documents do not say.) At another point, the model introduced a secret pulp villain with the hackneyed name of ‘The Shadow’ to tempt me into a wild-goose chase for an ancient ‘artefact’ that looked suspiciously like something out of a Dungeons & Dragons game.
The truth is, these models know nothing about reality; they only know words, and they don’t attach referents to those words. They only know, by strong probabilistic inference, that in a large body of training text, word A is frequently found in the vicinity of words B, C, and D; and they are good enough at analysing the grammar of phrases and clauses to cash in on the relevance of A to its neighbours. Naturally, they rely even more heavily than Homer on stock phrases and repeated epithets. I cannot tell you the number of Byzantine Greeks that the model described as having ‘neatly trimmed beards’. I could have made a lethal drinking game of it. A swig of ouzo for every neatly trimmed beard in the story would have made short work of my liver.
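For any reader of a programming bent, the point can be made concrete with a toy. What follows is a deliberately crude sketch in Python – a mere trigram counter, nothing like the neural networks inside a real LLM – but it shows how ‘pick the word that most often follows its neighbours’ hardens into a fondness for stock phrases.

```python
# A toy next-word predictor built from raw co-occurrence counts (a trigram model).
# Deliberately crude: real LLMs use neural networks over enormous corpora, but the
# habit of choosing the likeliest continuation of the words so far is the same,
# and it is how stock phrases come to dominate the output.
from collections import Counter, defaultdict

corpus = (
    "the courtier wore a neatly trimmed beard . "
    "the general wore a neatly trimmed beard . "
    "the eunuch wore a plain robe . "
).split()

# For each pair of consecutive words, count which word follows them.
follows = defaultdict(Counter)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    follows[(a, b)][c] += 1

def continue_text(words, length=4):
    """Extend the text by always choosing the most frequent continuation."""
    words = list(words)
    for _ in range(length):
        options = follows.get((words[-2], words[-1]))
        if not options:
            break  # this pair of words never occurred together in the corpus
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(continue_text(["wore", "a"]))
# -> wore a neatly trimmed beard .
```

Feed such a predictor a corpus in which beards are usually neatly trimmed, and neatly trimmed they will remain, world without end.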
Some of the faults of the models could be remedied by the use of a sort of recursive story bible. The model begins with a scenario provided by a third party – one has to pay extra for access to the innards to write one’s own scenarios – which does, in effect, keep the story on the rails in the early going. By the time I reached Byzantium, I had left the rails far behind, and left the model to its own devices, which doubtless aggravated its sometimes strange behaviours. If it could be made to add a description of each new character, his name and position, as he appears; and if it could do something similar with recurrent locations and the dates of past events; then it would be able to organize a much larger story in a more plausible way, without relying on the user to niggle his way along by reminding it of disremembered details. This is something most human writers would do as a matter of course; but the LLM is concerned only with the surface text, and does not grasp the underlying structures of the story, or even perceive that there are any. All its work is done on the textual level, and none on the level of the elements of fiction as taught in creative-writing classes.
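For the technically minded, a story bible in this sense need be nothing grander than a structured set of notes, kept outside the model and pasted in at the head of each new prompt. The Python below is only my own illustration of the idea – the class names and fields are invented for the purpose, not the machinery of any actual product.

```python
# A minimal 'story bible': structured notes kept outside the model and fed back
# into each prompt, so the model need not remember details it has already produced.
# The names here (StoryBible, Character, etc.) are illustrative, not any product's API.
from dataclasses import dataclass, field

@dataclass
class Character:
    name: str
    position: str        # e.g. 'logothete of the drome'
    last_seen: str       # what the character was last doing

@dataclass
class StoryBible:
    characters: dict[str, Character] = field(default_factory=dict)
    places: dict[str, str] = field(default_factory=dict)         # name -> description
    events: list[tuple[str, str]] = field(default_factory=list)  # (date, what happened)

    def note_character(self, name, position, doing):
        self.characters[name] = Character(name, position, doing)

    def note_event(self, date, what):
        self.events.append((date, what))

    def as_prompt_preamble(self):
        """Render the bible as plain text to prepend to the next prompt."""
        lines = ["Story bible (do not contradict):"]
        for c in self.characters.values():
            lines.append(f"- {c.name}, {c.position}; last seen {c.last_seen}")
        for place, description in self.places.items():
            lines.append(f"- {place}: {description}")
        for date, what in self.events:
            lines.append(f"- {date}: {what}")
        return "\n".join(lines)

bible = StoryBible()
bible.note_character("Zoe", "Empress", "negotiating with the Patriarch")
bible.note_event("AD 1041", "the Bulgarian revolt was put down")
print(bible.as_prompt_preamble())
```

With some such contrivance, the model would never have to remember the Empress’s last scene; the bible would remember it for her.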
This leads me to a point that I have been brooding over for many years; and perhaps it is time that I wrote the monograph that I have long intended to write, but never saw a need for until now. The fact is that stories are not made up of words and sentences, the units that a large language model is equipped to deal with. They are made up of scenes, plot events, characters, and motivations, each of which can be expressed in words in many different ways. This is why a Greek myth can be translated into English, or a novel into a movie, and still remain ‘the same story’. The medium is fungible, so long as the structure is preserved; there are many ways of putting flesh on the same set of bones.
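To gesture, very roughly, at what the bones might look like when written down as data rather than prose – a sketch only, and in Python merely because it is convenient – one might begin thus:

```python
# The 'bones' of a story written down as data, independent of any particular wording.
# The type names are illustrative only; the point is that these structures, not the
# sentences that clothe them, are what survive translation and adaptation.
from dataclasses import dataclass

@dataclass
class Character:
    name: str
    motivation: str          # what the character wants

@dataclass
class PlotEvent:
    agent: Character
    action: str              # what is attempted or done
    outcome: str             # success, failure, or complication

@dataclass
class Scene:
    setting: str
    events: list[PlotEvent]  # in narrative order

# One skeleton, many possible tellings:
orpheus = Character("Orpheus", "to bring Eurydice back from the dead")
descent = Scene(
    setting="the halls of Hades",
    events=[PlotEvent(orpheus, "looks back before reaching the daylight", "failure")],
)
```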
Literary critics have an unwholesome tendency to dissect the flesh in minute detail and ignore the bones; computer programmers try to make do without the bones, because they don’t have the tools to express them symbolically and make them amenable to analysis. In neither case do they touch the experience that a reader has in enjoying a story. I believe we need something like a formal model that at least addresses the existence of the bones, to break down and explain the ways and degrees in which a story can be altered and still remain ‘the same’. I should like to explore that in detail in coming essais, if my 3.6 Loyal Readers will bear with me.
Your thoughts and suggestions are much appreciated.