The Accusation Without Evidence Problem in Academic Publishing
I want to write about something that doesn’t get framed clearly enough.
When a manuscript is flagged for AI generation, what happens isn’t usually called an accusation.
It’s called a flag, a concern, a desk decision, a rejection. The language is procedural and soft.
But functionally, what’s happening is an accusation.
Someone is saying, on the basis of a tool’s output, that the author may have done something that violates academic norms. The author then has to respond, defend, or accept the consequence.
Without:
A clear statement of the accusation
Specific evidence
The opportunity to engage with that evidence
A path to appeal
This isn’t how we handle other kinds of accusations in academic publishing.
It’s worth asking why we’ve allowed it to work this way for this one.
What “Accusation” Actually Means Here
I’m using the word deliberately.
If a journal rejects a paper because the methods are unsound, that’s a critique. The author knows what’s being said. They can revise or argue.
If a journal rejects a paper because of “concerns regarding AI-generated content,” that’s a different kind of statement. It’s not just about the work. It’s about the author’s conduct.
The accusation, even when not stated explicitly, is that the author did all of the following:
Used AI to generate the writing
Did not disclose the use
Submitted the work as their own original product
That’s an integrity claim.
Integrity claims, in any other context, come with a process.
How Other Integrity Claims Are Handled
Compare what happens with other integrity issues in academic publishing.
If a paper is suspected of plagiarism:
Specific passages are identified.
The original sources are cited.
The author is given the opportunity to respond.
There’s a documented process for resolving the case.
If data integrity is questioned:
The specific concern is described.
The relevant figures or tables are identified.
The author can provide raw data, methodology, or clarification.
The journal often consults independent experts.
If image manipulation is suspected:
Specific images are flagged.
The manipulation is documented.
Forensic tools are used in a transparent way.
Authors can submit original files for verification.
In each case, the pattern is the same:
Specific claim
Specific evidence
Specific response process
For AI detection:
Vague claim
Opaque evidence
No standard response process
That’s a real gap.
The Evidentiary Standard
When I was first flagged, the rejection letter contained no evidence.
There was no:
Identification of which passages triggered the concern
Statement of the threshold or score
Citation of the tool used
Description of how the output was interpreted
The author guidelines didn’t reference any of these. The journal’s editorial policy didn’t either.
I had no information to respond to.
Compare this to plagiarism. If I’d been accused of plagiarism, the letter would have included:
The specific passage
The source it allegedly came from
The percentage or pattern of overlap
An invitation to respond
Those things exist because, over time, the field developed expectations about what an integrity accusation has to include.
For AI detection, those expectations haven’t developed yet.
They should.
What an Honest Process Would Look Like
If a journal is going to flag a manuscript for AI generation, I’d expect:
A statement that the manuscript was screened, with the tool identified by category if not by name.
A description of what triggered the concern, ideally specific to passages or sections.
The numerical output of the tool, with whatever interpretation the editorial team applied.
An invitation to respond, with a reasonable timeframe.
A documented process for resolving the case.
This isn’t elaborate. It’s the same structure that’s standard for any other integrity question.
What I see, in practice, is:
A flag, often without explanation.
A decision, often without appeal.
A resolution that puts the burden on the author to either accept the rejection or argue against a system they can’t see.
That’s the accusation without evidence problem.
Why This Persists
I don’t think this state of affairs is the result of bad intent.
I think it’s the result of three things.
First, the technology is new, and the field hasn’t yet developed the procedural muscle for handling its outputs.
Second, the volume of submissions is high, and any process that adds steps for editors faces resistance.
Third, the tools themselves are opaque, which makes it hard for editors to provide the specifics they would otherwise be expected to provide.
Each of those is understandable.
None of them justifies the current state.
What Authors Are Doing in Response
I’ve talked to people who’ve been flagged.
Their responses fall into a few patterns.
Some accept the rejection and move on, treating it as one of the many subjective rejections that happen in the system.
Some try to appeal, and have varying success depending on the journal.
Some pre-empt by running their work through detection tools themselves before submission. If the score looks high, they revise, sometimes manually, sometimes with tools that “humanize” AI-sounding text or with one of the various paraphrasing systems that have become available.
A few have stopped submitting to journals where they’ve been flagged. They’ve redirected to journals or venues where they think the screening is less aggressive.
None of these responses is satisfying. They’re all adaptations to a process that isn’t really designed to be appealed.
The Procedural Cost
The cost of treating accusations as routine flags isn’t just borne by the flagged authors.
It’s borne by the integrity system itself.
If the field can’t distinguish between rigorous accusations and casual screenings, then the credibility of accusations more broadly is weakened.
Rigorous accusations of integrity violations are how the field self-corrects. They have to mean something.
When a tool’s output, with no human verification, can produce something that functionally operates as an accusation, then the meaning of accusations as such gets diluted.
That’s a slow corrosion.
It doesn’t show up in any single case.
It shows up across the system, over time, in the form of authors who don’t trust the process and editors who can’t fully defend it.
What I’d Like to See
The simplest version of the ask is procedural:
If you flag, document.
If you flag, explain.
If you flag, allow response.
These aren’t burdensome additions. They’re just what we already require for any other integrity question.
The current asymmetry, where AI flagging operates without the procedural infrastructure that everything else has, is unstable.
It will eventually have to be normalized, in one direction or another.
Either AI flags will be brought up to the same procedural standards as other integrity questions.
Or other integrity questions will be brought down to the looseness of current AI flagging.
I have a strong preference for the first version.
I think most working researchers do.
The question is whether the institutional structures will move that way without being pushed.
Where This Lands
I started thinking of these as accusations partly because of how they feel.
Once you’ve been on the receiving end of one, the procedural framing starts to read as evasive.
A flag is an accusation. A rejection on the basis of a flag is a consequence of that accusation. The lack of explanation is the lack of evidence.
The polite language doesn’t change the underlying transaction.
What we’re doing, currently, is allowing a category of accusation to operate without the procedural standards we apply to every other category.
That’s not sustainable.
It’s also not, in any meaningful sense, fair.
I’d rather we say so out loud.

