Why can text generated by an open-source model not be reliably traced back to it?

Opportunity

Closed-model providers can embed statistical watermarks in generated text at inference time, allowing content to be attributed to a specific model after the fact. Open-source models give users full access to the decoding procedure, so any generation-time watermark can be removed by modifying a few lines of sampling code. Post-hoc watermarking of already-generated text breaks under paraphrase attacks. Embedding markers in model weights survives some attacks but not fine-tuning, which anyone running local weights can apply in an afternoon. As of late 2025, no scheme provides practical, removal-resistant provenance marking for output from open-weights models, and the research community acknowledges the problem remains open.
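
To make the removal argument concrete, here is a minimal sketch of one well-known family of generation-time watermarks: a keyed "green list" logit bias, detected with a z-test. It is an illustration under assumptions, not any provider's actual scheme. The uniform toy "model", the integer vocabulary, `KEY`, and the names `is_green`, `sample`, and `z_score` are all hypothetical.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000      # toy vocabulary: tokens are the integers 0..999 (assumption)
GREEN_FRACTION = 0.5   # share of the vocabulary treated as "green" at each step
KEY = b"demo-key"      # hypothetical secret shared by generator and detector


def is_green(prev_token: int, token: int) -> bool:
    """Keyed pseudo-random green-list membership, seeded by the previous token."""
    payload = KEY + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big")
    return hashlib.sha256(payload).digest()[0] < 256 * GREEN_FRACTION


def sample(length: int, delta: float, rng: random.Random) -> list[int]:
    """Sample from a uniform toy 'model'; delta > 0 adds a bias to green-token logits."""
    tokens = [rng.randrange(VOCAB_SIZE)]
    for _ in range(length):
        prev = tokens[-1]
        # Every logit is 0 in this toy model, so the weight is exp(delta) or exp(0).
        weights = [math.exp(delta) if is_green(prev, t) else 1.0
                   for t in range(VOCAB_SIZE)]
        tokens.append(rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0])
    return tokens


def z_score(tokens: list[int]) -> float:
    """Standard deviations by which the green count exceeds what chance predicts."""
    n = len(tokens) - 1
    green = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    variance = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (green - GREEN_FRACTION * n) / math.sqrt(variance)


if __name__ == "__main__":
    rng = random.Random(0)
    print("watermarked z:", z_score(sample(200, 2.0, rng)))
    print("bias removed z:", z_score(sample(200, 0.0, rng)))
```

In this sketch, calling `sample` with `delta=0.0` is the entire removal attack: the detector's statistic falls back to chance level, and nothing in the weights forces a local user to keep the bias.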

Why it matters

Without watermarking that works for open models, the provenance of AI-generated text can be traced only when whoever runs the generator chooses to cooperate.

How I score the opportunity

The Opportunity Score is my own read, not a measurement: how much it hurts, how often it bites, and how little exists to solve it today. Higher means I think it is more worth building.

Severity: 8/10

How much pain it causes when it shows up.

Frequency: 8/10

How often people actually run into it.

White space: 9/10

How little good tooling exists for it today.
