The crime of writing well

Posted on: 4 April 2026

I never wrote a first draft at primary school. Straight to the final version, in Italian and in English, and I don't think it had much to do with grammar lessons. It was something else, a facility with language I didn't build so much as discover I already had. The same thing happened when I spoke: something would come out that unsettled people, read as arrogance, generated a low-level suspicion I couldn't quite account for. The practical consequence was that I learned to perform a kind of deliberate incompetence, to sand down the edges, to seem less capable in order to be taken seriously.

At some point I did a short course in rhetoric, specifically to understand how to calibrate register for different audiences. I thought I was correcting a flaw. I sharpened the gift instead.

Years later I understood it was never my problem. It was the problem of every system that measures competence through its relative absence, every system calibrated to the average that breaks down when it encounters the exception.

There is a case I keep returning to. I wrote about it here some months ago, but the earlier piece was about the specific episode; what interests me now is the structure underneath. An Italian researcher submits a paper to an AI safety community. The subject: how to improve the epistemic quality of RAG systems, the retrieval engines that feed information to language models. The work is careful, methodology explicit, limitations declared with an almost pedantic honesty: "these are early results", "what I cannot yet claim is". The kind of thing you call good scientific practice.

Automatically rejected. Reason given: AI-generated content.

He rewrites. Same substance, different title, thirty-two revised versions. Rejected again. The wording this time is almost poetic in its absurdity: "Not obviously not Language Model." The burden of proof has inverted without anyone announcing it. You must now demonstrate that you have not written too well.

I spent the better part of two decades working with people who built systems, first in the creative industries during the digital transition of the nineties, then across sectors different enough that the patterns stop feeling like coincidence. One thing became clear early: every measurement system carries a structural blind spot. Not because the designers were careless. Because the system measures a proxy, and the proxy eventually betrays the thing it stands in for.

AI detectors measure perplexity: how surprising a sequence of words is to a language model, so that predictable text scores low. Language models produce low-perplexity sequences because they optimise for coherence. This works reasonably well on the average population of texts. The problem is that expressive clarity, precise vocabulary and logical structure produce exactly the same signal for entirely opposite reasons: not because an algorithm has optimised, but because a person has worked hard. The low perplexity of competently generated mediocrity and the low perplexity of carefully crafted prose are indistinguishable to the detector. Same effect, incompatible causes.
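
The signal itself is easy to compute. A minimal sketch, assuming GPT-2 through the Hugging Face transformers library (any causal language model would serve); note that the score says nothing about who wrote the text, only how predictable it is:

```python
# Perplexity of a text under GPT-2: exp of the mean negative
# log-likelihood per token. Lower means more predictable.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing the input as labels makes the model return the
        # average cross-entropy loss over the whole sequence.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Careful human prose and generated prose can land on the same
# low number: the detector sees the effect, never the cause.
print(perplexity("These are early results, and the limitations are declared up front."))
```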

Research by Liang and colleagues, published in Patterns, found something more uncomfortable still: popular detection tools misclassify over sixty-one percent of texts written by non-native English speakers as AI-generated, while essays by native English-speaking American students pass almost without incident. A Chinese or Italian researcher writing careful English gets flagged; a domestic undergraduate writing carelessly does not. The system designed to protect academic integrity discriminates by geography, without knowing it and without being accountable for it.

This is where the pattern widens into something more structurally interesting.

The same mechanism turns up across contexts that have nothing obvious in common. Anti-money-laundering systems at banks flag professionals whose complex financial structures and irregular transaction patterns exist for entirely legitimate business reasons, because that is precisely the profile of someone using complex structures for illegitimate ones. Spam filters block carefully written newsletters because quality copywriting resembles fraudulent copywriting far more closely than it resembles hastily dashed-off messages. Credit scoring systems penalise people who have never carried debt, because the absence of credit history is indistinguishable from the absence of creditworthiness. Plagiarism detection software flags those who cite correctly and extensively more readily than those who do not cite at all.

The thread is consistent. The system measures a proxy statistically correlated with the behaviour it wants to intercept. It works on the average distribution. Then it fails on the exception, not the deficient exception but the excellent one, the one that resembles a violation not because it is a violation but because it has pushed a variable in the same direction for radically different reasons.

There is a technical name for this in signal detection theory: the false positive. But the clinical label misses the systemic dimension. These are not random errors distributed across the population. They are errors systematically concentrated on those who do things well. The system selects against virtue in a structurally predictable way.
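
The concentration is easy to demonstrate. Below is a toy simulation with entirely invented numbers (it models no real detector): generated text and carefully crafted prose both sit low on the perplexity scale, average prose sits higher, and a simple threshold cannot tell the first two apart:

```python
# Toy model of where a threshold detector's false positives land.
# All distributions and numbers are invented for illustration.
import random

random.seed(0)

def sample(mean: float, sd: float, n: int = 10_000) -> list[float]:
    return [random.gauss(mean, sd) for _ in range(n)]

# Hypothetical perplexity scores for three populations.
ai_text       = sample(mean=20, sd=5)   # generated: low by optimisation
average_prose = sample(mean=60, sd=15)  # average human writing
crafted_prose = sample(mean=25, sd=6)   # skilled human writing: low by effort

THRESHOLD = 35  # anything below gets flagged as "AI"

def flag_rate(scores: list[float]) -> float:
    return sum(s < THRESHOLD for s in scores) / len(scores)

print(f"generated flagged: {flag_rate(ai_text):.0%}")       # high, as designed
print(f"average flagged:   {flag_rate(average_prose):.0%}") # low: the detector looks accurate
print(f"crafted flagged:   {flag_rate(crafted_prose):.0%}") # high: the error lands on skill
```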

In the nineties I watched something similar play out during the shift to digital distribution in the creative sector. The new distribution infrastructure had verification mechanisms built to catch the irregular practices of physical distribution: unauthorised copies, double accounting, falsified reporting. Those mechanisms worked tolerably well against sloppy operators. They were devastating for those who had built precise reporting systems, because precision generated anomalous patterns relative to the sector average. The system read rigour as irregularity.

The perverse incentive this produces is what troubles me most. If writing clearly gets you rejected, the rational response is to write less clearly. If precise reporting puts you under scrutiny, the rational response is to report approximately. If a complex but legitimate financial structure gets you flagged, the rational response is to simplify it even when complexity serves a purpose. The system does not eliminate incorrect behaviour: it teaches everyone to imitate mediocrity in order to pass unnoticed.

Popper's test for whether a knowledge system functions is to ask what would falsify it. AI detectors are not falsifiable in any useful sense: if the text passes, it is human; if it does not, it is AI; if a human does not pass, it is because they write like AI. The category "human who writes very well" does not exist in the model. It cannot exist, because acknowledging it would mean admitting the metric is wrong.

This is not a technical problem that a better detector solves. It is an epistemological one: you are measuring origin through quality, and quality is not a reliable proxy for origin. You can refine the algorithm as much as you like, but as long as you are measuring the wrong signal you are only relocating where the systematic errors concentrate, not eliminating them.

When a control system fails systematically on a specific category of subjects, what happens is not simply an individual injustice. The category stops doing that thing. Clear writers learn to write less clearly. Rigorous researchers learn to be less honest about the limitations of their work. Virtuous behaviour gets selected out of the system, not by the market, not by peers, but by the control mechanism itself.

Which brings us to a detail that closes the circle rather neatly. Alongside the detectors, a parallel industry of "humanizer" tools has emerged: software that takes text classified as AI and renders it sufficiently imperfect to pass the filter. The business model is transparent. Sell the detector that produces the false positive, then sell the remedy for the false positive you created. Whoever builds the detector pays nothing when it misclassifies your work; they collect quite comfortably when you need to purchase the next product in the range. Taleb would call this absent skin in the game. I would call it a structural incentive not to fix the problem.

A system born to protect the quality of intellectual discourse ends up degrading it systematically. The paradox dissolves the moment you notice that the system was never measuring quality: it was measuring a statistical proxy for quality, calibrated to a mean distribution that did not include the excellent tails.

--

Postscript, which is worth more than the argument above: this piece, written with explicit instructions for an AI system to avoid the patterns typical of AI writing, was classified by a detector as one hundred percent artificial. The detector is correct in the technical sense: the patterns are there, because the patterns of well-constructed prose do not depend on who constructed it. It is wrong in the sense that matters: it knows nothing about origin, only how to measure quality. And it keeps calling quality a crime.