Malin Hay | Chattiness - The Spotted Cat Magazine

Several people I know now refer to ChatGPT as ‘Chat’. They give it human pronouns (Chat is usually a he) and ask it for restaurant recommendations, holiday schedules and relationship advice. Some go further, automating their office admin and getting it to summarise meetings and write reports. Passing off whole chunks of AI-generated text as your own work appears to be on the rise in the publishing world.

Last year, Hachette bought the rights to Shy Girl, a self-published horror novel by Mia Ballard, and released it in November to good sales. A Reddit post on r/horrorlit in February by a ‘book editor of twelve years’ picked out several passages that set alarm bells ringing:

The bows on my pigtails pull too tight, yanking the skin and stretching my head into something neat, into something pleasing, a quiet violence made beautiful.

My snout dips into the frosting, the sweetness rolling over my tongue, thick and sticky, a flood that chokes but insists on being swallowed. Beneath the pink gloss, the cake falls apart, crumbling into ash that coats my teeth, hollow sweetness that fills me with its nothing . . . His laughter cuts the air, sharp and jagged, a sound too big for the room.

Ballard denies using AI to write the book, blaming a freelance editor, but Hachette pulled it from publication in the UK and US.

The other week, the Commonwealth Short Story Prize winners were announced, and by arrangement the winning entries were published on Granta’s website. The winning story from the Caribbean, Jamir Nazir’s ‘The Serpent in the Grove’, contains sentences like this:

Coffee and cocoa leaned wild on a slope that wanted either rain in teeth or none at all. He knew every root that tripped a foot, the snake-curve of run-off, the brittle crumble after drought. He worked it alone and most days the land worked him back, a quiet quarrel older than his father and his father’s father.

The internet smelled a rat. Nazir, who seems to have few publications to his name, describes himself as an ‘organisational transformation and business expansion’ professional on LinkedIn. His long posts are about geopolitics and the ‘AI arms race’. One of them begins: ‘Let’s be clear: the “Cloud” is a physical, terrestrial liability. And AI is pushing it to its breaking point.’ The Commonwealth Prize, which had praised ‘The Serpent in the Grove’ for its ‘voice of restraint and quiet authority’, said that all the entrants had affirmed their work was their own and that the prize operated on the principle of trust. Granta says it will leave the story on its website until ‘definite evidence comes to light’.

I thought I didn’t use ChatGPT because I was too clever. I thought that not using ChatGPT made me cleverer. It turns out, though, that it made me very bad at spotting when a text was written by or with the assistance of AI. After the uproar over the Commonwealth Prize, I took a New York Times quiz entitled ‘Who’s a better writer: AI or humans?’ I got three out of five correct – barely better than a coin toss. On Wikipedia’s ‘AI or not’ quiz, I got seven out of ten, but that was easier because none of the AI articles had footnotes.

I’m not the only Chat non-user who can’t tell when an LLM wrote something. Experimenters in the US last year showed nine subjects a series of articles, half written by humans and half generated by ChatGPT, Claude and other large language models. Asked to guess which of the texts were human, the four subjects who rarely or never used ChatGPT in their daily lives scored ‘at a similar rate to random chance’, while the five who used chatbots almost every day at work collectively misidentified only one in three hundred texts.

One of the problems with AI use seeping out of business and science writing and into the ‘literary’ world is that literary editors may be the worst equipped to identify AI writing. (It may also be easy to succumb to the pressure to go too far the other way – over-labelling work as AI-generated might be as bad as under-labelling it.) What are the main signs of AI writing? The more familiar tells include overuse of em dashes and the formulation ‘not x, but y’, which ChatGPT has favoured since GPT-3. But none of the passages I quoted above contains either of those things, and they still have a distinct whiff of AI.

Some of the markers seem to be lexical: AIs like talking about sweetness, loudness, quiet, age and beauty. There is a lot of insisting in AI-generated texts, as well as a lot of promising, a lot of permitting and a lot of filling up. Another sign is the overuse of tricolons (‘into something neat, into something pleasing, a quiet violence made beautiful’). And bots often leave out definite articles from phrases where they’re not strictly necessary: ‘Coffee and cocoa leaned wild’, ‘rain in teeth’ or, from later in ‘The Serpent’, ‘Sita became obstacle by existing.’

There is a flatness or evenness to AI-generated texts: Wikipedia’s guide to detecting AI says that LLMs ‘tend to omit specific, unusual, nuanced facts’ and ‘replace them with more generic, positive descriptions’. The strange thing about this evenness is that it isn’t usually couched in neutral language: ‘a flood that chokes but insists on being swallowed’ is violent, but not vivid. Even if you had your face shoved into a cake, it wouldn’t ‘flood’ your mouth. The prosody is so smooth that you feel a lack of pressure despite the description of gross or vile acts; it rings hollow. It fills you with its nothing.

A lot of people are arguing that the wary approach being taken by Granta and the Commonwealth Prize is inadequate, and that editors should be doing more to stop AI writing being published in the first place. This may be true, but what’s the best approach? Perhaps editors could memorise a list of tells to check submissions against, or spend remedial hours on ChatGPT and Claude – AI boot camp – to familiarise themselves with the cadences of LLM-speak.

There are AI tools that claim to detect AI content in writing, such as Pangram, which gives ‘The Serpent in the Grove’ a score of 100 per cent AI-generated. Where other AI detectors base their judgments on the perplexity of a text – basically, the predictability of a sequence of words – Pangram’s founder, Max Spero, says that his tool is based on gathering a large dataset of human-written texts, asking an AI to ‘mirror’ or reproduce them as closely as possible, and then contrasting the resulting texts to determine the patterns that distinguish AI from human writing. Pangram claims to have a 1 in 10,000 false positive rate, and Spero admits that it ‘does occasionally make mistakes’.
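For readers who want to see what ‘perplexity’ means in practice, here is a minimal sketch in Python. It is not how Pangram or any real detector works: it assumes a toy bigram model trained on a dozen words, with add-one smoothing, and the function names (`train_bigram`, `perplexity`) are invented for illustration. The point is only that a word sequence the model finds predictable scores a lower perplexity than one it finds surprising.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count bigrams and their left-hand contexts in a training token list."""
    contexts = Counter(tokens[:-1])             # how often each word is followed by something
    bigrams = Counter(zip(tokens, tokens[1:]))  # how often each adjacent word pair occurs
    vocab_size = len(set(tokens)) + 1           # +1 slot reserved for unseen words
    return contexts, bigrams, vocab_size


def perplexity(tokens, contexts, bigrams, vocab_size):
    """Perplexity of `tokens` under an add-one-smoothed bigram model (toy assumption)."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        raise ValueError("need at least two tokens")
    log_prob = 0.0
    for prev, word in pairs:
        # Smoothed estimate of P(word | prev); unseen pairs get a small non-zero probability.
        p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    # Perplexity is the exponential of the average negative log-probability per word pair.
    return math.exp(-log_prob / len(pairs))


# Hypothetical training text and two test sequences using the same six words.
training = "the cat sat on the mat the cat sat on the rug".split()
model = train_bigram(training)

predictable = "the cat sat on the mat".split()
surprising = "mat the on sat cat the".split()

ppl_predictable = perplexity(predictable, *model)
ppl_surprising = perplexity(surprising, *model)
print(ppl_predictable < ppl_surprising)
```

A perplexity-based detector, on this account, treats unusually low perplexity as a hint that a text was machine-generated; real tools use large language models rather than word-pair counts to do the scoring.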

What about the cases where, say, 40 per cent of an article is AI-generated, or an AI has been used to edit and spell-check the work before submission? Both Shy Girl and ‘The Serpent in the Grove’ have sections which, in part because of their grammar mistakes, look as if they were written by a person. Is this functionally the same as a text written from start to finish by a bot, and should it be treated in the same way? I’m a Luddite who thinks it’s just as bad to use AI for some things as it is to use it for everything, but not everyone agrees. In any case, a text partly informed by AI use is harder to identify than one that was spewed out in ten seconds by Claude (though the people behind Pangram claim to be working on distinguishing more reliably between AI-written, partly AI-written and wholly human work).

The other problem is that, as time goes on and people become more and more reliant on generative AI in their daily lives, at school, university and work, human language is going to become more and more imitative of LLM-speak. Since at least Web 2.0, we’ve been trying to sound less and less distinctive. Influencers on Instagram narrate their day-in-the-life videos with the same affectless, globalised female uber-voice; LinkedIn and Reddit are overrun by bots trained on the slang and writing styles that were always a hallmark of those platforms. (Does anyone in real life use the word ‘friendo’?) Meanwhile, LLMs will get better at dodging the detectors and sounding more ‘real’. At some point, tools such as Pangram and human readers alike may struggle to find any distinction between meaningful human work and meaningless AI slop. And that isn’t just worrying – it’s terrifying.
