Worth Solving
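Problems worth building.
A good problem is hard to look away from. It stays in your head until someone builds the fix. This is a list of the open problems I keep coming back to across tech, blockchain, and AI: the ones I think are genuinely worth solving, plus a few I want to build myself.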
Each carries an Opportunity Score, my own read on how much it hurts, how often, and how little exists to solve it. Map them, or read them one by one.
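What does a bank account for an AI agent actually look like?
Agents can now act on their own, but trusting them with money is still scary. There is no standard way to give an agent spending limits, a clean audit trail, and a kill switch that both people and regulators trust. We are cramming agents into cards and wallets built for humans.
Why it matters: Autonomous software will soon move real money, and the accountability layer for it does not exist yet.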
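Why can't I prove I'm solvent without revealing my balance?
Public chains expose every balance forever. Funds and exchanges are asked for proof of reserves, and the usual answer is either a screenshot you have to take on faith or full disclosure that reveals everything. There is no cheap way to prove a single fact about your funds without exposing the rest.
Why it matters: Selective proof is the missing primitive that lets regulated money live on a transparent ledger.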
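Why does every AI app forget me the moment I close the tab?
Your context, preferences, and history are trapped inside the last assistant you used. Switch models or apps and you start from scratch. The platform owns the memory, not you. If the goal is a tool that grows with you over years, this is backwards.
Why it matters: Portable, user-owned memory is what turns a chatbot into a personal advantage.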
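Why is learning a new field still limited by knowing what to ask?
The hard part of learning something new is not access to information but not knowing which questions to ask. A personal model could work out what you are actually trying to do, find the gaps in your knowledge, and build the path across them. Most tools still sit and wait for you to already know what to ask.
Why it matters: This is AI's promise of personal growth made concrete, and almost no one has built it well.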
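Why can't a non-expert verify what an AI told them?
A model answers in the same confident tone whether it is right or making things up. In domains that matter, such as medicine, law, and finance, there is no simple, reliable way for an ordinary person to check a claim against real sources without expert knowledge.
Why it matters: What makes AI safe to rely on is not a bigger model but verification you can trust.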
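Why is moving money between chains still scarier than the early internet?
Bridges remain the most attacked area in crypto, and users bear that risk in full. There is still no way to move value between chains that is safe by default, the way TCP/IP made delivering packets boring and reliable.
Why it matters: Until cross-chain transfers become boring, mainstream money will not trust them.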
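Why does compliance still depend on a PDF and a prayer?
The rules about who can hold what, and where, live in documents and manual checklists. The asset itself carries none of them. Tokenized assets and stablecoins keep learning this the hard way. Compliance should travel with the asset and be checkable in real time, not reconstructed after something goes wrong.
Why it matters: Machine-readable compliance is the real key to moving regulated assets on-chain.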
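Why do we test models on benchmarks and ship them on vibes?
Teams pick a model off a leaderboard and put it straight into production without continuous, cheap, task-specific evaluation. When quality drops, nobody notices until users complain. Most developers have no tool that actually measures whether an AI feature still works.
Why it matters: You cannot operate what you cannot measure, and right now most AI features are not measured at all.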
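Can an agent-run on-chain organization avoid becoming a fraud machine?
Agents are good at executing rules and weak at judgment. An agent-run organization could be transparent and tireless, or it could be a perfectly automated way to drain funds. Nobody has yet shown the guardrails that make the first outcome possible.
Why it matters: If agent-run organizations arrive, the safety patterns need to exist before the capital does.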
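Why is the software we depend on most the worst to use?
Tax portals, hospital systems, government forms. The software that matters most and reaches the most people is often the hardest to use. The incentives that produce good consumer apps barely reach public-interest software.
Why it matters: Raising the bar for essential software would help more people than one more consumer app.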
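How do I prove a photo or a voice is real without a platform vouching for it?
Synthetic media is now good enough to fool anyone, and the only answer on offer is to trust the platform showing it. Provenance should travel with the file and be checkable by anyone, the way a signature proves who signed. The cryptography already exists. The adoption does not.
Why it matters: Trust in what we see and hear online depends on solving this before the fakes win.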
Why is self-custody still a choice between losing your keys and trusting a company?
Hold your own keys and one mistake wipes you out with no recovery. Use a custodian and you are back to trusting a company with your money. Social recovery and account abstraction exist, but almost nobody ships a wallet a normal person can use without a seed phrase or a support line.
Why it matters: Self-custody an ordinary person can actually live with is the gate to everything else in crypto.
Why do AI agents have no memory of their own mistakes?
An agent will make the same error on Tuesday that it made on Monday, because nothing carries the lesson forward. We have memory for facts and almost none for failures. An agent that cannot learn from what went wrong is an intern with amnesia.
Why it matters: Agents will not be trusted with real work until they reliably get better at it over time.
Why is on-chain identity either nothing or your entire life?
On a public chain you are either a random address with no reputation or a wallet that exposes everything you have ever done. There is no middle: a way to prove you are a real, unique person, or that you are allowed to do something, without handing over your whole history.
Why it matters: Useful, privacy-preserving identity is the missing layer between anonymous and surveilled.
Why does tokenizing a real asset still need ten middlemen?
Put a building or a bond on-chain and you still depend on a custodian, a transfer agent, a lawyer, and a registry to make the token mean anything. The on-chain part is easy. The off-chain trust and the legal enforceability are the hard, unglamorous part nobody has made boring yet.
Why it matters: Real-world assets only matter on-chain if the link to the real world holds up in court.
Why can't I audit what a model was actually trained on?
Models absorb the whole internet and then answer with no way to trace where a claim or a behavior came from. For anything regulated, or any dispute over copyright or bias, the training set is a black box. There is no practical way to ask a model what it learned from and get an honest answer.
Why it matters: You cannot govern or fully trust a system whose inputs are invisible.
Why can't a stablecoin pay someone with no internet?
Digital money is meant to reach the people banks never did, but it falls over the moment the connection does. Offline and intermittent payments, settled once a signal returns, are how cash works and how much of the world still lives. Crypto rarely designs for that.
Why it matters: Payments that only work with perfect connectivity are not payments for most of the planet.
Why do I still own none of the data I generate?
Every app you touch keeps the data you produce, and you cannot take it anywhere useful. Portability is a download button that hands you a folder you cannot do anything with. Owning and reusing your own data across services is still mostly a slogan, not a feature.
Why it matters: Data you cannot move is data you do not really own.
Why does a bridge exploit drain everything before any alarm fires?
Cross-chain bridges hold large reserves and process messages across trust boundaries, yet most lack any standardized on-chain rate-limiting. EIP-7265 proposed a circuit-breaker interface in 2023 and Aave's governance forum carried a grant proposal to implement it, but as of mid-2025 no major bridge has shipped a production-ready, interoperable version. When an attacker finds a validator-set or message-verification flaw, the full liquidity pool drains in minutes because nothing caps outflow velocity. SoK papers published in 2025 confirm that delayed withdrawal and automatic pause are the top unimplemented mitigations across the bridge category.
Why it matters: A composable, chain-agnostic circuit breaker would turn any bridge exploit from a total loss into a partial one, changing the risk calculus for the whole interoperability stack.
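To picture what "capping outflow velocity" could mean, here is a minimal sketch of a rolling-window limiter. Everything in it is an assumption for illustration: `OutflowBreaker`, its parameters, and the 10 percent threshold are hypothetical, and this is not the EIP-7265 interface or any bridge's actual design.

```python
from collections import deque


class OutflowBreaker:
    """Toy rolling-window outflow cap (hypothetical, not EIP-7265)."""

    def __init__(self, reserves: int, max_fraction: float, window_s: float):
        self.reserves = reserves          # liquidity currently held
        self.max_fraction = max_fraction  # share of reserves allowed out per window
        self.window_s = window_s          # rolling window length in seconds
        self.paused = False
        self._recent = deque()            # (timestamp, amount) of recent withdrawals

    def withdraw(self, amount: int, now: float) -> bool:
        """Return True if the withdrawal goes through, False if it is blocked."""
        if self.paused:
            return False
        # Forget withdrawals that have aged out of the window.
        while self._recent and now - self._recent[0][0] >= self.window_s:
            self._recent.popleft()
        already_out = sum(a for _, a in self._recent)
        # Measure the cap against reserves before this window's withdrawals.
        cap = self.max_fraction * (self.reserves + already_out)
        if amount > self.reserves or already_out + amount > cap:
            self.paused = True            # trip the breaker; a reset is out of scope here
            return False
        self._recent.append((now, amount))
        self.reserves -= amount
        return True


breaker = OutflowBreaker(reserves=1_000_000, max_fraction=0.10, window_s=3600)
results = [breaker.withdraw(40_000, now=t) for t in (0, 60, 120)]
print(results, breaker.paused, breaker.reserves)  # [True, True, False] True 920000
```

In this toy run the third withdrawal would push the hourly outflow past 10 percent, so the breaker pauses with 92 percent of reserves intact. The hard parts the problem statement points at, such as who may reset the pause and how the cap composes across chains, are deliberately left out.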
How do I audit which agent acted under my identity across a delegation chain?
When an orchestrating AI agent delegates a subtask to a sub-agent, which then calls a third-party API under the original user's OAuth token, the identity chain spans multiple providers and authentication methods with no single audit trail capturing the complete path. MCP added OAuth 2.1 support but the specification has no mechanism for chaining delegated authority across hops or for revoking a mid-chain agent's permission without revoking the entire session. A2A provides agent discovery and request signing but explicitly defers all authorization decisions to other protocols that do not exist yet. Research published in April 2026 identifies recursive delegation accountability as one of five unresolved critical gaps in current agent identity standards. A user who authorizes one agent today has no practical way to inspect, limit, or revoke what downstream agents did on their behalf.
Why it matters: Multi-agent systems are already in production, and the missing primitive is a verifiable, revocable delegation receipt that follows the chain without requiring every hop to share a trust domain.
Why can a poisoned document silently exfiltrate everything my assistant knows about me?
In June 2025, Aim Security disclosed EchoLeak, the first documented zero-click prompt injection that caused real data exfiltration from a production AI system. A single malicious email caused Microsoft Copilot to silently transmit sensitive data with no user interaction. The structural problem is that AI assistants with persistent memory and tool-calling access combine two dangerous properties. They hold accumulated personal context and they can be made to act on instructions embedded in untrusted content. Every new document, email, or webpage the assistant reads is a potential instruction surface. There is no isolation boundary between the memory the user trusts the assistant to hold and the instructions it follows from external content, and current sandboxing proposals address tool calls but not memory read access.
Why it matters: Personal AI memory turns every malicious document into a targeted dossier-theft attack, a new attack class with no mature defense.
Why can I not trust a model's confidence score when it matters most?
Modern language models routinely output high-confidence tokens on wrong answers and low-confidence tokens on correct ones. The gap between stated probability and actual accuracy, called calibration error, has been documented across frontier models in a 2025 survey covering entropy-, logit-, and perturbation-based methods. Production agents that use these scores to decide when to defer or abstain inherit the miscalibration directly, so they either hallucinate forward with false certainty or refuse correct answers unnecessarily. No off-the-shelf primitive gives a calibrated, actionable uncertainty signal cheap enough to run at inference time on every output token in a streaming response.
Why it matters: Calibration is the trust primitive under every agentic decision, and without it every downstream safety threshold rests on sand.
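For reference, calibration error is commonly estimated by grouping predictions into confidence bins and comparing each bin's average confidence with its actual accuracy. The sketch below shows that binned estimator (often called expected calibration error). The function name, the bin count, and the toy data are my own illustrative choices, not taken from the survey mentioned above.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between stated confidence and observed accuracy."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the top bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Toy data: overconfident at 0.95 (half wrong), underconfident at 0.55 (all right).
confidences = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
correct = [True, False, False, True, True, True]
ece = expected_calibration_error(confidences, correct)
print(round(ece, 2))  # 0.45
```

A score of 0 would mean stated confidence matches accuracy in every bin. Note that this is an offline measurement over a labeled set, which is exactly why it does not solve the per-token, streaming case described above.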
Why can I not get a receipt proving my data was actually deleted?
GDPR Article 17 requires companies to erase personal data, and the EDPB's 2025 coordinated enforcement report named the absence of documented internal deletion procedures as the most common compliance failure across EU jurisdictions. When a user submits a deletion request, the company responds with a confirmation email that proves nothing. There is no cryptographic evidence that records were removed from primary databases, backups, or third-party processors. Academic work on verifiable deletion exists, including SGX-backed proofs and quantum certified deletion schemes published in 2024 and 2025, but none of it has been packaged into a practical, deployable primitive that web services can integrate. The gap is not legal willingness but a missing technical tool that bridges the regulation to an auditable outcome.
Why it matters: A deletion receipt that a user can independently verify is the one artifact that turns a legal obligation into a trust relationship, and nothing in widespread deployment provides it today.
How do I catch a hallucination mid-stream before my agent acts on it?
Hallucination detection today happens after the fact. The model outputs a full response, a separate judge model scores it, and a human or downstream check decides what to do. In agentic pipelines with tool calls, web searches, or code execution, the agent may have already acted on a fabricated entity or misattributed fact by the time any check runs. A January 2026 paper on streaming hallucination detection in long chain-of-thought reasoning shows that detecting fabrication mid-generation is feasible using internal representations, but the technique is research grade and requires access to hidden states not available through any public API. The gap is a streaming, API-compatible hallucination sensor that can flag a generation before the agent takes an irreversible action.
Why it matters: In agentic settings, detecting a hallucination after the tool call is too late, and the cost is not a bad answer but a bad action.
Why can I not know if what is running matches what my SBOM declared?
SBOMs are generated at build time and describe what a build claimed to contain. By the time software is deployed and running, dependencies may have drifted, statically linked libraries leave no runtime trace, and there is no standard primitive to verify that a live process matches its declared bill of materials. IBM's 2025 analysis of over 35,000 SBOMs found 7,907 failed to disclose direct dependencies, and ENISA's December 2025 implementation guide calls runtime drift one of the core open gaps. The gap between a signed SBOM and a running container is currently bridged by trust alone.
Why it matters: Regulations in the EU and US now mandate SBOMs, but without runtime attestation they are an audit artifact, not a security control.
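To make the gap concrete, here is a rough sketch of a drift check for one narrow case: Python packages visible to the running interpreter. `sbom_drift` and `observed_python_packages` are hypothetical helpers, the declared and observed versions below are made up, and nothing here can see statically linked libraries, which is the part that stays unsolved.

```python
from importlib import metadata


def observed_python_packages() -> dict:
    """Name -> version for distributions the running interpreter can see."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that lack a name
            found[name.lower()] = dist.version
    return found


def sbom_drift(declared: dict, observed: dict) -> dict:
    """Compare what the SBOM declared with what was observed at runtime."""
    both = set(declared) & set(observed)
    return {
        "missing": sorted(set(declared) - set(observed)),     # declared, not running
        "undeclared": sorted(set(observed) - set(declared)),  # running, never declared
        "version_drift": {
            name: (declared[name], observed[name])
            for name in sorted(both)
            if declared[name] != observed[name]
        },
    }


# Illustrative inputs; a real check would parse an SBOM file for `declared`
# and call observed_python_packages() for `observed`.
declared = {"requests": "2.31.0", "urllib3": "2.0.7", "idna": "3.6"}
observed = {"requests": "2.31.0", "urllib3": "2.2.1", "certifi": "2024.2.2"}
report = sbom_drift(declared, observed)
print(report)
```

Even this simple diff relies on the process reporting on itself honestly, so it narrows the gap without replacing the attestation step the problem calls for.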
How do I verify that an AI agent holding my funds is actually solvent?
Autonomous AI agents are increasingly granted signing authority over crypto wallets to pay for compute, APIs, and on-chain services, but there is no standard way to audit what an agent holds, owes, or has already spent without reading raw chain state across multiple networks. When an agent operates across several chains and several asset types simultaneously, its net position cannot be queried atomically, which means a counterparty accepting payment from an agent has no reliable way to confirm the agent is not already insolvent or double-committed. The financial primitives for human corporate entities, balance sheets, audited reserves, and callable credit lines, have no on-chain equivalents that agent runtimes can expose and that third parties can verify without trusting the agent's own reports. As agent-to-agent commerce grows, the absence of a machine-readable solvency interface creates settlement risk that mirrors the opacity of pre-2008 off-balance-sheet vehicles.
Why it matters: Agent financial accountability is the missing trust primitive that separates speculative agentic commerce from commerce that can carry real economic value.
How do I tell whether a reasoning model's scratchpad actually drove its answer?
Frontier models that emit visible chain-of-thought traces often arrive at an answer before or independently of those steps, then generate plausible-looking reasoning as post-hoc rationalization. Existing faithfulness metrics disagree with each other depending on how the classifier is constructed, which means there is no accepted ground truth for what a faithful trace even looks like. No production tooling flags unfaithful reasoning at inference time or attaches any confidence to whether the trace caused the output. Regulated industries and safety reviews that treat visible reasoning as an explanation of model behavior are relying on something that may be a narrative constructed after the fact.
Why it matters: If a reasoning trace is post-hoc rationalization, every audit, accountability claim, or compliance check built on top of it is invalid.
Why can I not know what my AI workflow will cost before it goes live?
Enterprise AI inference spend jumped 3.2x in 2025 even as per-token prices fell roughly 1,000x, driven by agentic loops, context window inflation, and always-on monitoring agents. A misbehaving agent at $0.06 per call retrying 1,000 times per minute generates $86,400 of spend in a single day. Existing cloud FinOps tools do not apply because inference cost is a function of semantic input length, tool call amplification, and loop depth, none of which are known at planning time. There are no standard tools for pre-production cost estimation of LLM workflows, and CFOs cannot model AI inference as a predictable budget line.
Why it matters: Without a cost model you can trust before shipping, every AI product is a budget lottery rather than a business.
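The runaway-agent figure above is simple arithmetic, and spelling it out shows how fast a fixed budget disappears. The helper names and the $500 budget are illustrative assumptions. Amounts are kept in integer cents to avoid floating-point noise.

```python
MINUTES_PER_DAY = 24 * 60


def runaway_spend_cents(cents_per_call: int, calls_per_minute: int,
                        minutes: int = MINUTES_PER_DAY) -> int:
    """Spend, in cents, if a loop retries at a constant rate with no cap."""
    return cents_per_call * calls_per_minute * minutes


def minutes_until_budget(cents_per_call: int, calls_per_minute: int,
                         budget_cents: int) -> float:
    """How long a fixed budget survives the same retry loop."""
    return budget_cents / (cents_per_call * calls_per_minute)


# $0.06 per call, 1,000 retries per minute, for one day.
spend = runaway_spend_cents(cents_per_call=6, calls_per_minute=1_000)
print(f"${spend / 100:,.0f} per day")  # $86,400 per day

# The same loop against a hypothetical $500 budget.
budget_minutes = minutes_until_budget(6, 1_000, budget_cents=50_000)
print(f"{budget_minutes:.1f} minutes")  # 8.3 minutes
```

This only covers the constant-rate case. The inputs the text calls unknowable at planning time (input length, tool-call amplification, loop depth) are what a real estimator would have to model.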
Why can I not see or delete exactly what my assistant remembers about me?
Every major AI assistant with persistent memory stores facts about users across sessions, but the user-facing interface is a thin list of summaries, not an auditable log. There is no standard way to inspect which specific claim was inferred, when it was written, what triggered it, or whether it has been shared with retrieval pipelines. When a user asks the assistant to forget something, the delete operation is opaque. The underlying vector store may retain embeddings, the conversation log may be subpoenaed, and there is no cryptographic proof that deletion was complete. The IAPP and the EU AI Act both call for auditable memory with callable deletion evidence, but no product ships that today.
Why it matters: Without a verifiable audit trail, user-controlled memory is theater, because users cannot exercise rights they cannot observe.
How do I get cryptographic proof that the remote model I called ran as specified?
Cloud AI APIs return outputs with no verifiable evidence of which model version ran, at what quantization, or what system prompt was prepended upstream. GPU confidential computing on NVIDIA Hopper hardware can attest hardware state, but the attestation evidence never reaches the API caller and the trust chain terminates inside vendor-controlled certificate infrastructure. A June 2026 paper proposes TEE-based verifiable safety benchmarks but no production API exposes a per-call inference receipt to the caller. Any adversarial or regulated context where model identity matters must trust the provider's word.
Why it matters: Without a verifiable inference receipt, every safety, compliance, and alignment claim made about a remote model invocation rests on provider trust alone, which is not sufficient for regulated deployments or autonomous agent stacks.
Why can text generated by an open-source model not be reliably traced back to it?
Closed-model providers can embed statistical watermarks in generated text at inference time, allowing content to be attributed to a specific model after the fact. Open-source models give users full access to the decoding procedure, so any generation-time watermark can be removed by modifying a few lines of sampling code. Post-hoc watermarking of already-generated text breaks under paraphrase attacks. Embedding markers in model weights survives some attacks but not fine-tuning, which anyone running local weights can apply in an afternoon. As of late 2025, no scheme provides practical, removal-resistant provenance marking for output from open-weights models, and the research community acknowledges the problem remains open.
Why it matters: Without watermarking for open models, AI-generated text provenance is only traceable when the generator chooses to cooperate.
Why does every C2PA provenance chain break the moment content hits social media?
C2PA cryptographic manifests are embedded in the file itself and survive storage and direct sharing, but every major social platform, including Instagram, X, LinkedIn, and TikTok, strips those manifests during upload transcoding and re-encoding as of 2026. The result is that a piece of content can be signed by a camera, a newsroom, and a regulatory-compliant AI generator, yet arrive in a feed with zero provenance information attached. The EU AI Act Article 50 and California SB 942 require machine-readable disclosure on AI-generated content, but metadata-only compliance dissolves at the exact distribution point where most people actually see content. No mechanism exists today to either force platforms to preserve manifests or to reconstruct provenance after stripping without a trusted third-party ledger that did not exist at capture time.
Why it matters: C2PA is becoming a regulatory baseline while the primary distribution layer actively destroys its signal, making the standard practically unenforceable where it matters most.
Why do tokenized real-world assets raise capital but never actually trade?
Over 25 billion dollars in tokenized real-world assets sat on-chain as of mid-2026, yet a June 2026 paper covering nine major RWA products found that most show negligible turnover, passive holder bases, and near-zero secondary market activity. Tokenization creates a token that legally represents an asset but does not create a buyer, a market maker, or a clearing convention that traditional exchanges provide. Regulatory fragmentation confines potential buyers to the handful of jurisdictions with clarity, so the addressable liquidity pool for any one token is a tiny fraction of the global investor base. The result is that issuers use blockchain as a fundraising rail and then stop, because the secondary market infrastructure, the custodian connections, and the AMM design for illiquid assets simply do not exist yet.
Why it matters: A credible secondary market primitive for tokenized assets is the missing layer that turns on-chain capital formation into a genuine liquidity improvement.
How do I know the open-weight base model I am fine-tuning has not been poisoned?
Backdoors planted in pre-trained model weights persist through full-parameter fine-tuning, adapter training, and RLHF updates because the trigger patterns survive objective-shifting and partial-freezing strategies. These triggers are invisible to standard behavioral safety tests and benchmark evaluation. Detecting them requires white-box weight analysis that the average fine-tuning practitioner never runs, and major model hubs apply no mandatory scanning before a checkpoint is made publicly downloadable. An organization building a production system on a compromised base model has no signal anything is wrong until the trigger fires in deployment.
Why it matters: The open-weight fine-tuning supply chain has no security gate, and the failure mode is a backdoor that survives every standard check.
How does anyone verify that an agent payment matched what the human actually meant?
When an AI agent executes an on-chain or stablecoin payment, the payee, auditor, and regulator receive no machine-verifiable evidence that the human principal authorized this specific transaction with this specific intent. Existing agent frameworks produce logs, not proofs. The IMF flagged in April 2026 that agentic AI reshaping payments creates a structural accountability gap: if an agent sends value to the wrong address or outside its mandate, there is no way at settlement time to distinguish authorized action from agent overreach. Cryptographically signed user mandates exist as a concept in research but no deployed payment standard requires or verifies them at the moment of settlement.
Why it matters: Programmatic money without verifiable human intent at settlement is unsigned checks at scale, and no auditor or regulator can accept that indefinitely.
Who do I call when my stablecoins are burned and no court ordered it?
The GENIUS Act, signed July 2025, requires stablecoin issuers to freeze, seize, or burn tokens on lawful orders, but what counts as a lawful order is unspecified, the freeze-to-burn pipeline has no mandatory appeal window, and the affected address receives no advance notice. Tether had blacklisted nearly 10,000 addresses holding over $5 billion by early 2026, mostly without judicial warrants. Issuers treat enforcement as a one-way action with no contestation path. The engineering infrastructure for transparent, time-bounded, and reversible on-chain enforcement does not exist anywhere in the ecosystem today.
Why it matters: Trust in programmable money at scale requires a freeze mechanism that is auditable, time-limited, and contestable by the affected party.
Why does critical open source software still depend on one exhausted maintainer?
In November 2025, Kubernetes retired Ingress NGINX, one of its most widely deployed components, not because it was superseded but because the volunteer maintainer team could no longer sustain it. Separately, External Secrets Operator, used in critical enterprise pipelines globally, froze all updates when four of its five maintainers burned out simultaneously. Industry surveys now show 60 percent of open source maintainers work unpaid and 44 percent cite burnout as the reason they left or considered leaving. Funding programs like Open Source Pledge and GitHub Sponsors exist but address money, not the actual bottleneck, which is the review queue. There is no lightweight, automated system that durably transfers working context, test coverage expectations, and threat-model knowledge from an exiting maintainer to a successor, so each departure resets a project close to zero.
Why it matters: The world's software infrastructure runs on components whose continuity depends on individual goodwill, and the tooling to make maintainer succession safe and fast does not exist.
Why do model leaderboard scores collapse when the test set has never been seen in training?
Static benchmarks like MMLU carry contamination rates as high as 45%, and paraphrased or translated versions of test items survive exact-match decontamination while still inflating published scores. A model can top a leaderboard on a contaminated task and fail the same task when it is cleanly rephrased. Dynamic benchmarks that refresh tasks periodically exist but lack standardized design criteria, so results cannot be compared across them or verified as representative of the skill they claim to measure. Every capability and safety claim published on a leaderboard rests on numbers that no independent party can validate as clean.
Why it matters: Trustworthy evaluation is the prerequisite for every downstream safety and deployment decision, and the numbers on which those decisions rest are not currently trustworthy.
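A small sketch shows why exact-match decontamination lets paraphrases through. It flags a training document only when it shares a word 5-gram with the test item. The helper names, the choice of n, and the example strings are assumptions for illustration, not any benchmark's actual pipeline.

```python
import re


def ngrams(text: str, n: int) -> set:
    """Set of lowercase word n-grams, ignoring punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shares_ngram(test_item: str, training_doc: str, n: int = 5) -> bool:
    """Exact-match check: True if any n-gram appears in both texts."""
    return bool(ngrams(test_item, n) & ngrams(training_doc, n))


test_item = "What is the capital city of Australia and when was it founded?"
verbatim_leak = ("Quiz: What is the capital city of Australia "
                 "and when was it founded? Answer: Canberra, 1913.")
paraphrased_leak = ("Which city serves as Australia's capital, "
                    "and in what year was it established? Answer: Canberra, 1913.")

caught = shares_ngram(test_item, verbatim_leak)
missed = shares_ngram(test_item, paraphrased_leak)
print(caught, missed)  # True False
```

Both training documents give away the answer, but only the verbatim copy is caught. Closing that gap needs semantic matching, which is harder to standardize and verify.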
Why can my stablecoin cross an ocean but not reach a local bank account?
Stablecoins can settle cross-border value transfers in seconds, but converting institutional USDC flows into BRL, NGN, MXN, or PHP for payroll, tax payments, or supplier invoices at scale remains fragmented and often unavailable. Most off-ramp providers lack the banking relationships, compliance infrastructure, or API reliability to handle consistent flows above six figures per day in emerging-market corridors. Businesses must stitch together multiple providers with inconsistent KYC standards and settlement windows. The stablecoin rail is fast; the last meter to a local bank account is not.
Why it matters: A reliable, programmable fiat exit layer is what turns stablecoins from a trading instrument into actual business infrastructure.
Why does checking whether my credential is revoked tell the issuer every place I use it?
Every deployed verifiable credential system needs a revocation mechanism. The dominant scheme, W3C Bitstring Status List, requires verifiers to fetch a status endpoint controlled by the issuer at presentation time, so the issuer learns exactly when and where each credential is used. The URL combined with the credential's fixed position in the bitstring is enough to re-identify the holder across verifiers, reversing the privacy that self-sovereign identity was designed to provide. CRSet, a zero-knowledge accumulator approach published in January 2025, solves the theoretical problem but no issuer at any meaningful scale has shipped a revocation scheme that does not leak presentation metadata back to itself.
Why it matters: Revocation that doubles as surveillance defeats the core privacy promise of holder-controlled identity.
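The check itself is tiny, which is part of why the scheme dominates. Below is a sketch of reading one bit out of a status list, assuming the encoding as I read the W3C spec: a GZIP-compressed bitstring, base64url-encoded without padding behind a multibase `u` prefix, with index 0 as the left-most bit. The helper names and the sample index are hypothetical. The network fetch is omitted on purpose, because that request to the issuer's URL is where the metadata leaks.

```python
import base64
import gzip


def set_bit(bits: bytearray, index: int) -> None:
    """Issuer side: mark the credential at `index` as revoked."""
    byte, offset = divmod(index, 8)
    bits[byte] |= 1 << (7 - offset)  # index 0 is the left-most bit


def encode_status_list(bits: bytearray) -> str:
    """Issuer side: compress and encode the list for publication."""
    compressed = gzip.compress(bytes(bits))
    b64 = base64.urlsafe_b64encode(compressed).rstrip(b"=").decode("ascii")
    return "u" + b64  # 'u' marks base64url without padding in multibase


def is_revoked(encoded_list: str, index: int) -> bool:
    """Verifier side: decode the published list and test one position."""
    if not encoded_list.startswith("u"):
        raise ValueError("expected a multibase base64url string")
    b64 = encoded_list[1:]
    b64 += "=" * (-len(b64) % 4)  # restore the padding the decoder expects
    bits = gzip.decompress(base64.urlsafe_b64decode(b64))
    byte, offset = divmod(index, 8)
    return bool((bits[byte] >> (7 - offset)) & 1)


status_bits = bytearray(16 * 1024)  # 131,072 entries, all initially valid
set_bit(status_bits, 94567)
published = encode_status_list(status_bits)
print(is_revoked(published, 94567), is_revoked(published, 94568))  # True False
```

Nothing in the decoding is private or expensive. The privacy cost comes entirely from who serves `published` and when the verifier asks for it.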
Why is there no safe, trustless way to rotate MPC key shares live?
Institutional MPC wallets distribute signing shares across multiple parties so no single server holds a full key, which is a meaningful improvement over single-key custody. However, when a share is suspected compromised, rotating shares without reconstructing the full key in any single location requires a proactive secret sharing refresh protocol that most deployed systems do not support in production. The rotation ceremony typically requires a synchronous online phase across all share-holders, and if one party is unavailable or actively hostile, the ceremony blocks or fails. No open, audited, asynchronous proactive refresh standard exists that bridge teams can adopt without building the cryptography themselves, leaving many custodians running on stale shares they cannot safely rotate.
Why it matters: An asynchronous proactive refresh primitive would let any MPC setup rotate compromised shares under adversarial conditions without ever materializing the full key.
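The algebra behind a proactive refresh is short. Here is a toy sketch over Shamir shares in which every party deals a random polynomial with a zero constant term and all shares absorb the results. It runs in a single process with honest parties, so it shows only why refreshed shares still encode the same key, not the asynchronous, adversarial protocol that is missing. The function names and the field modulus are my own choices.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime, used here as a toy field modulus


def eval_poly(coeffs, x):
    """Evaluate a polynomial (lowest degree first) at x, modulo P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def split(secret, threshold, n):
    """Shamir-share `secret` so that any `threshold` shares recover it."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(threshold - 1)]
    return {x: eval_poly(coeffs, x) for x in range(1, n + 1)}


def refresh(shares, threshold):
    """Each party deals a zero-constant polynomial; shares change, the key does not."""
    new = dict(shares)
    for _dealer in shares:
        zero_poly = [0] + [secrets.randbelow(P) for _ in range(threshold - 1)]
        for x in new:
            new[x] = (new[x] + eval_poly(zero_poly, x)) % P
    return new


def reconstruct(shares):
    """Lagrange interpolation at 0. Used here only to check the algebra."""
    secret = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret


key = 123456789
old = split(key, threshold=2, n=3)
new = refresh(old, threshold=2)
still_valid = reconstruct({1: new[1], 2: new[2]}) == key
stale_mix_fails = reconstruct({1: old[1], 2: new[2]}) != key
print(still_valid, stale_mix_fails)
```

A share stolen before the refresh cannot be combined with shares issued after it, which is the whole point of rotating. A production system would never call `reconstruct` at all. The open problem is doing the dealing step when parties are offline or hostile.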
How do I prove a model was trained on consented data without revealing the dataset?
Decentralized AI networks let anyone contribute compute or data to train a shared model, but there is no mechanism by which a downstream user or regulator can verify that the training corpus excluded poisoned, stolen, or unconsented data without the network revealing what it trained on. Data provenance today is either a signed manifest that contributors self-attest or a centralized audit that defeats the purpose of decentralization. A February 2025 paper on activation inversion attacks showed that training data can be partially reconstructed from gradient signals exchanged during federated training, which means any provenance scheme that requires sharing gradients also leaks data. The 2025 OWASP LLM top-ten explicitly lists supply-chain data poisoning as a category with no standardized mitigation for open, decentralized training runs.
Why it matters: Without verifiable data provenance, every model trained on a public decentralized network is a liability for any downstream application facing regulatory or copyright scrutiny.
Why is there no recovery path when a breach leaks my biometrics?
When a password database leaks, every affected user resets their password and the breach is contained. There is no equivalent reset for biometrics. A leaked fingerprint template or face encoding can be replayed against any future system that accepts that modality, for life. Cancelable biometrics and template protection exist as academic research and a handful of niche enterprise products, but no identity system operating at consumer scale has deployed them. The NYC Health + Hospitals incident in early 2026 left 1.8 million people with permanently compromised fingerprint and palm records and no operational recovery path.
Why it matters: Identity systems built on irrevocable secrets are a single incident away from permanent compromise for every enrolled user.
Why does moving my data across platforms still require trusting the exporter?
The EU Digital Markets Act now mandates data portability for designated gatekeepers, and a May 2026 European Commission factsheet highlighted Apple and Google's cross-OS transfer work as a DMA milestone. Yet the technical reality is that every export format today is a vendor-defined archive, a ZIP of JSON files whose completeness, accuracy, and freshness cannot be independently checked by the receiving party or the user. Interoperability obligations address format and API access but say nothing about attestation. A user migrating from one platform to another cannot know whether the export is complete, whether it reflects state as of the request timestamp, or whether the receiving platform ingested all of it correctly. The portable data transfer protocol work from Google, Apple, and Meta covers transport, not provenance.
Why it matters: Data portability without verifiable completeness is just a different kind of lock-in, because the user still has no way to know what was left behind.
Why does proving my age online require handing my browsing history to a stranger?
Laws in the US, UK, and EU now require websites to verify visitor age, and every production deployment routes that check through a centralized age-verification provider. That provider sees which users visited which sites and accumulates a detailed browsing record tied to real identity. Zero-knowledge proof alternatives exist in research and the EU is embedding one in its EUDI wallet, but the wallet spec will not be finalized before December 2026, covers only EU residents, and no comparable infrastructure exists elsewhere. The practical choice today is between lying about your age and surrendering your browsing history to a company you did not choose.
Why it matters: Privacy-preserving age verification is the missing primitive for an internet that is rapidly becoming age-gated by law.
Why does moving assets across chains still take minutes and carry unknown risk?
Six years after the first cross-chain bridges launched, users still face unpredictable costs, complex failure modes, and security trade-offs that no protocol resolves simultaneously. In June 2025 Force Bridge on the Nervos Network was exploited for over three million dollars, continuing a pattern of bridge hacks that have collectively drained billions since 2021. Most bridges rely on small validator sets or multisigs that represent a single point of failure, and pool imbalances create slippage for large transfers with no recourse. Cross-chain protocols accounted for 57 percent of total interoperability revenue in 2025, but that concentration reflects lock-in, not solved usability, and the triangle of security, speed, and decentralization remains unresolved for any bridge serving real user volumes.
Why it matters: Interoperability is load-bearing infrastructure for a multi-chain world, and each new bridge exploit resets user trust.
Spotted a problem?
If something in tech, crypto, or AI quietly drives you up the wall, send it over. The best ones get added to this board, and a few might turn into something I build.