Why can someone watching my encrypted LLM traffic still infer what I asked?

Opportunity

Whisper Leak, disclosed in late 2025, demonstrated that analyzing packet timing and size patterns in encrypted streaming LLM responses classifies prompt topics with greater than 98% precision across 28 major providers. Some providers including OpenAI and Mistral deployed fixes, but those mitigations address token-length patterns only. A separate attack exploits speculative decoding: the number of tokens accepted per decoding step varies with output content, and that signal leaks through even padded connections because padding does not eliminate the acceptance-rate fluctuation. Proposed defenses such as token batching reduce attack accuracy by 50% but do not eliminate it, and random padding imposes up to 8.7x payload overhead with residual leakage. No provider has shipped a complete mitigation for the speculative decoding variant.

Why it matters

Any user querying a streaming LLM from a network that logs traffic is leaking the topic of their query regardless of TLS encryption, including users who believe they are communicating privately with a medical, legal, or financial assistant.

How I score the opportunity

The Opportunity Score is my own read, not a measurement: how much it hurts, how often it bites, and how little exists to solve it today. Higher means I think it is more worth building.

Severity8/10

How much pain it causes when it shows up.

Frequency8/10

How often people actually run into it.

Whitespace8/10

How little good tooling exists for it today.

Why can someone watching my encrypted LLM traffic still infer what I asked?

How I score the opportunity

More problems worth solving