v5.0.0 — Enforcement boundary hardening

Released: April 4, 2026 · GitHub Release
Major release making QWED’s verification boundary fail-closed, deterministic about what it proves, and harder to bypass under adversarial conditions. Consolidates 98 commits and 20 merged PRs since v4.0.1, including the full enforcement hardening series.

Breaking changes

  • INCONCLUSIVE is a new verification status — Natural-language math responses now return INCONCLUSIVE when verifying LLM-translated expressions, because the translation step is non-deterministic. Downstream consumers must handle this status alongside VERIFIED, ERROR, and BLOCKED.
  • BLOCKED and UNKNOWN are explicit outcomes — These are no longer treated as generic failures. Consumers should map them to distinct UI states or retry logic.
  • ActionContext is mandatory for agent verification — All verify_action requests must include conversation_id and step_number. Requests without them are rejected with QWED-AGENT-CTX-001.
  • security_checks field removed — Exfiltration and MCP poison checks are now server-enforced and unconditional. The security_checks request field and the TypeScript SDK checkExfiltration/checkMcpPoison options no longer exist.
  • /metrics endpoints require admin authentication — GET /metrics and GET /metrics/prometheus now require an authenticated admin or owner role. Update monitoring integrations accordingly.
  • Docker required for stats and consensus verification — The in-process execution fallbacks have been removed. If Docker is unavailable, /verify/stats and /verify/consensus return HTTP 503.
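One way a downstream consumer might handle the expanded status set is sketched below. The status names come from the notes above; the action names ("accept", "review", "reject", "retry") and the policy table are hypothetical and only illustrate mapping each status to a distinct outcome while treating anything unrecognized as a failure.

```python
from enum import Enum


class Status(str, Enum):
    VERIFIED = "VERIFIED"
    INCONCLUSIVE = "INCONCLUSIVE"
    BLOCKED = "BLOCKED"
    UNKNOWN = "UNKNOWN"
    ERROR = "ERROR"


# Hypothetical client-side policy, not part of the QWED SDK.
ACTIONS = {
    Status.VERIFIED: "accept",
    Status.INCONCLUSIVE: "review",  # translated expression checked, query intent not
    Status.BLOCKED: "reject",
    Status.UNKNOWN: "retry",
    Status.ERROR: "retry",
}


def next_action(response: dict) -> str:
    """Map the top-level status field to a client action, failing closed."""
    raw = str(response.get("status", "")).upper()
    try:
        status = Status(raw)
    except ValueError:
        return "reject"  # missing or unrecognized status is never a success
    return ACTIONS[status]
```

The design choice worth keeping, whatever the actual mapping, is that the default branch rejects rather than accepts.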

Security

  • Fail-closed verification boundary — Disabled unsafe in-process execution fallbacks; stats and consensus paths now require the secure Docker sandbox.
  • Logic verifier eval() removed — The raw eval() fallback in logic constraint parsing has been removed. If SafeEvaluator is unavailable, the verifier raises a RuntimeError.
  • Consensus rate limiting — POST /verify/consensus now enforces per-tenant rate limiting, matching other verification endpoints.
  • Consensus fact self-attestation removed — The fact engine no longer participates in automatic consensus engine selection, preventing self-referential verification loops.
  • Redis fail-closed — The sliding window rate limiter now denies requests on Redis errors instead of allowing them.
  • Timing-safe token verification — Agent token comparison uses hmac.compare_digest to prevent timing side-channel attacks.
  • Environment integrity enforcement — The API server runs verify_environment_integrity() at startup before database initialization.
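The Redis fail-closed behavior can be illustrated with a small sliding window limiter. This is a sketch under stated assumptions, not QWED's implementation: InMemoryStore, StoreError, and allow_request are hypothetical names, and the in-memory store stands in for Redis so the example runs without a server.

```python
import time
from collections import deque


class StoreError(Exception):
    """Stands in for a Redis connection or command error."""


class InMemoryStore:
    """Hypothetical stand-in for the Redis-backed hit store."""

    def __init__(self):
        self.hits = {}
        self.broken = False  # flip to True to simulate an outage

    def record_and_count(self, key, now, window):
        if self.broken:
            raise StoreError("store unavailable")
        queue = self.hits.setdefault(key, deque())
        # Drop hits that have fallen out of the sliding window.
        while queue and queue[0] <= now - window:
            queue.popleft()
        queue.append(now)  # denied requests are recorded as hits too
        return len(queue)


def allow_request(store, key, limit, window=60.0, now=None):
    """Return True only when the store answers and the count is within limit."""
    now = time.monotonic() if now is None else now
    try:
        count = store.record_and_count(key, now, window)
    except StoreError:
        return False  # fail closed: a store error denies the request
    return count <= limit
```

A fail-open limiter would return True in the except branch, which lets an attacker bypass limits by degrading the store.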

Trust boundary metadata

API responses now include a trust_boundary object that describes exactly what was verified and what was not. This gives you machine-readable transparency into the verification scope.
{
  "trust_boundary": {
    "query_interpretation_source": "llm_translation",
    "query_semantics_verified": false,
    "verification_scope": "translated_expression_only",
    "deterministic_expression_evaluation": true,
    "formal_proof": false,
    "overall_status": "INCONCLUSIVE"
  }
}
See the POST /verify/natural_language endpoint reference for full field descriptions.
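As an illustration of consuming this object, the sketch below parses the sample payload and reports which aspects were not verified. The field names are taken from the sample; the helper names and the acceptance policy are hypothetical.

```python
import json

# Sample payload copied from the release notes.
SAMPLE = """
{
  "trust_boundary": {
    "query_interpretation_source": "llm_translation",
    "query_semantics_verified": false,
    "verification_scope": "translated_expression_only",
    "deterministic_expression_evaluation": true,
    "formal_proof": false,
    "overall_status": "INCONCLUSIVE"
  }
}
"""

BOOLEAN_FLAGS = (
    "query_semantics_verified",
    "deterministic_expression_evaluation",
    "formal_proof",
)


def unverified_aspects(response):
    """List the boolean trust_boundary fields that are not true."""
    boundary = response.get("trust_boundary") or {}
    return [name for name in BOOLEAN_FLAGS if boundary.get(name) is not True]


def can_auto_accept(response):
    """Hypothetical policy: accept only VERIFIED results with verified semantics."""
    boundary = response.get("trust_boundary") or {}
    return (
        boundary.get("overall_status") == "VERIFIED"
        and boundary.get("query_semantics_verified") is True
    )


response = json.loads(SAMPLE)
print(unverified_aspects(response))
print(can_auto_accept(response))
```

A response with no trust_boundary at all is treated as unverified, which keeps the client consistent with the fail-closed direction of this release.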

Agent hardening

  • Action context mandatory — All verify_action requests must include conversation_id and step_number. Requests without them are rejected with QWED-AGENT-CTX-001.
  • Replay detection — Reusing a (conversation_id, step_number) pair is blocked (QWED-AGENT-LOOP-002).
  • Loop detection — Repeating the same action 3+ consecutive times triggers a denial (QWED-AGENT-LOOP-003).
  • In-flight step reservations — Prevents race conditions when multiple agent calls run concurrently.
  • Budget denial isolation — Budget-exceeded denials do not consume conversation state, so the agent can retry after the budget resets.
  • Stats exception masking — The Stats Engine no longer leaks internal exception details to API callers.
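The context, replay, and loop rules above can be sketched as a small state tracker. The error codes are the ones listed in these notes; ConversationGuard and its check method are hypothetical and only model the decision logic, not in-flight reservations or budgets.

```python
class ConversationGuard:
    """Hypothetical model of the context, replay, and loop checks."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.seen_steps = set()  # (conversation_id, step_number) pairs
        self.last_action = {}    # conversation_id -> (action, consecutive count)

    def check(self, conversation_id, step_number, action):
        """Return None when allowed, otherwise the denial code."""
        if conversation_id is None or step_number is None:
            return "QWED-AGENT-CTX-001"  # action context is mandatory
        key = (conversation_id, step_number)
        if key in self.seen_steps:
            return "QWED-AGENT-LOOP-002"  # replayed step
        previous, count = self.last_action.get(conversation_id, (None, 0))
        count = count + 1 if previous == action else 1
        if count >= self.max_repeats:
            return "QWED-AGENT-LOOP-003"  # same action repeated consecutively
        # State is committed only for allowed actions.
        self.seen_steps.add(key)
        self.last_action[conversation_id] = (action, count)
        return None
```

Committing state only on success mirrors the budget denial isolation idea: a denied call does not consume the step.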

SDK version bumps

  • qwed (PyPI): 4.0.1 → 5.0.0
  • qwed_sdk (Python): 2.1.0-dev → 5.0.0
  • @qwed-ai/sdk (npm): 4.0.1 → 5.0.0
  • TypeScript SDK: Removed security_checks from agent verification helpers; tool_schema remains.

Upgrade

pip install qwed==5.0.0
docker pull qwedai/qwed-verification:5.0.0
npm install @qwed-ai/sdk@5.0.0
If you use POST /verify/natural_language and check the top-level status field, you must now handle INCONCLUSIVE as a valid outcome. Use the trust_boundary object to inspect the detailed verification breakdown. For fully deterministic results, use POST /verify/math directly.
If you use agent verification, all requests must now include conversation_id and step_number in the context. See the agent verification guide for details.

v4.0.5 — Stats exception masking

Released: April 3, 2026 · GitHub PR #118
Prevents the Stats Engine from leaking internal exception details to API callers. Clients now receive a generic error message when stats code generation fails.

Security

  • Stats translation error masking — When stats code generation fails, the API now returns "Internal verification error" instead of the raw exception message. This prevents accidental exposure of file paths, API keys, or other sensitive data that may appear in exception text.
  • Server-side logging preserved — The exception type is still logged server-side for operator debugging, but the log no longer includes the full exception message.
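The masking pattern described above might look like the following. This is a minimal sketch: the generic message is the one quoted in these notes, while run_masked, the logger name, and the response dictionary shape are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("stats_engine")  # hypothetical logger name

GENERIC_ERROR = "Internal verification error"


def run_masked(generate):
    """Run a code-generation callable and mask any failure for the caller."""
    try:
        return {"status": "ok", "result": generate()}
    except Exception as exc:
        # Log only the exception type; the message may contain paths or keys.
        logger.error("stats code generation failed: %s", type(exc).__name__)
        return {"status": "error", "error": GENERIC_ERROR}


def failing_generator():
    raise ValueError("could not read /srv/app/.env (API_KEY=abc123)")


print(run_masked(failing_generator))
```

The caller sees only the generic message, and the operator can still correlate the failure by exception type in the server log.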

v4.0.4 — Fail-closed execution enforcement

Released: April 3, 2026 · GitHub PR #117
Removes unsafe in-process execution fallbacks from statistical and consensus verification. All model-generated Python now runs exclusively in the secure Docker sandbox.

Breaking changes

  • Docker required for stats verification — The Wasm and restricted Python fallback execution paths have been removed from the Stats Engine. If Docker is unavailable, /verify/stats returns HTTP 503 instead of executing code in-process.
  • Consensus Python engine uses Docker — The Python verification engine within consensus verification now runs through SecureCodeExecutor instead of the in-process CodeExecutor. If Docker is unavailable during high or maximum mode, /verify/consensus returns HTTP 503.

Security

  • Fail-closed sandbox gating — Both the Stats Engine and the Consensus Engine refuse to execute model-generated code when the Docker sandbox is not available, preventing any downgrade to in-process execution.
  • Live Docker health checks — SecureCodeExecutor.is_available() now pings the Docker daemon on each request instead of relying on cached startup state, catching mid-operation Docker failures.
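The gating logic can be sketched without Docker by using a fake sandbox. FakeSandbox and execute_generated_code are hypothetical; only the is_available() name and the HTTP 503 outcome come from these notes.

```python
class FakeSandbox:
    """Hypothetical stand-in for a Docker-backed executor."""

    def __init__(self, healthy=True):
        self.healthy = healthy

    def is_available(self):
        # A real executor would ping the Docker daemon here on every call.
        return self.healthy

    def run(self, code):
        return f"ran {len(code)} characters in sandbox"


def execute_generated_code(sandbox, code):
    """Return (http_status, body); never fall back to in-process execution."""
    if not sandbox.is_available():  # checked per request, not cached at startup
        return 503, "secure runtime unavailable"
    return 200, sandbox.run(code)


sandbox = FakeSandbox()
print(execute_generated_code(sandbox, "print(1)"))
sandbox.healthy = False  # simulate the daemon failing mid-operation
print(execute_generated_code(sandbox, "print(1)"))
```

Because availability is evaluated on each call, a daemon that dies after startup produces a 503 instead of a silent downgrade.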

API changes

  • POST /verify/stats now returns HTTP 503 when the secure runtime is unavailable, and HTTP 403 when generated code is blocked by security policy.
  • POST /verify/consensus now returns HTTP 503 when secure execution is required but Docker is unavailable.

v4.0.3 — Server-enforced agent security

Released: April 2, 2026 · GitHub PR #116
Closes critical verification boundary gaps by enforcing agent security checks server-side, adding rate limiting to consensus verification, and removing unsafe fallback paths.

Breaking changes

  • security_checks removed from agent verification — Exfiltration and MCP poison checks are now enforced server-side on every request. The security_checks request field and the TypeScript SDK checkExfiltration/checkMcpPoison options have been removed. tool_schema remains available to trigger MCP poison inspection.

Security

  • Logic verifier fail-closed — The raw eval() fallback in logic constraint parsing has been removed. If SafeEvaluator is unavailable, the verifier raises an error instead of falling back to unrestricted evaluation.
  • Consensus rate limiting — The POST /verify/consensus endpoint is now rate-limited per API key, matching other verification endpoints.
  • Consensus fact self-attestation removed — The Fact engine is no longer automatically selected during consensus verification. It produced self-referential results without external context.

v4.0.2 — Security Cleanup

Released: April 1, 2026 · GitHub PR #114
Hardens expression evaluation, improves error handling across the SDK and consensus engine, and removes unused code flagged by CodeQL.

Security

  • eval() fully eliminated — SymPy and Z3 expression evaluation now uses a custom AST-walking interpreter instead of compile() + eval(). The evaluator validates every AST node against a strict allow-list, then interprets the tree directly in a restricted namespace.
  • CodeGuard integration — Expressions are screened by CodeGuard (when available) as a second defense layer between AST validation and evaluation.
  • Keyword unpacking blocked — Call nodes with **kwargs-style keyword unpacking are now rejected during AST validation.
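To make the approach concrete, here is a minimal AST-walking evaluator built on Python's ast module. It is a sketch of the general technique rather than QWED's evaluator: the allow-list is deliberately tiny, validation happens during the walk instead of as a separate pass, and safe_eval and its tables are hypothetical names.

```python
import ast
import operator

_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}
_FUNCS = {"abs": abs, "max": max, "min": min}  # restricted call namespace


def safe_eval(source, names=None):
    """Evaluate an arithmetic expression without compile() or eval()."""
    tree = ast.parse(source, mode="eval")
    return _walk(tree.body, names or {})


def _walk(node, names):
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.Name):
        if node.id in names:
            return names[node.id]
    elif isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_walk(node.left, names), _walk(node.right, names))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_walk(node.operand, names))
    elif isinstance(node, ast.Call):
        if any(kw.arg is None for kw in node.keywords):
            raise ValueError("keyword unpacking is not allowed")
        if node.keywords:
            raise ValueError("keyword arguments are not allowed")
        if isinstance(node.func, ast.Name) and node.func.id in _FUNCS:
            return _FUNCS[node.func.id](*[_walk(arg, names) for arg in node.args])
    # Anything not explicitly allowed above is rejected.
    raise ValueError(f"disallowed expression element: {type(node).__name__}")
```

Attribute access, subscripts, starred arguments, and unknown names all fall through to the final raise, so new syntax is rejected by default.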

Improvements

  • Exact SymPy arithmetic — Numeric literals are coerced to sympy.Integer / sympy.Float during evaluation, preventing floating-point drift in intermediate math comparisons.
  • Consensus failure recording — Unexpected errors during async engine aggregation are now logged and recorded as an EngineResult entry instead of being silently dropped.
  • Telemetry initialization — Replaced the _initialized flag with @lru_cache for one-time cached initialization.
  • SDK import cycle broken — qwed_sdk.cache no longer imports from qwed_sdk.qwed_local for terminal colors; it uses colorama directly with a plain-text fallback.
  • Integration imports hardened — Optional framework imports (LangChain, CrewAI, LlamaIndex) now default to None explicitly instead of using bare except: pass.
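Two of the patterns above, one-time initialization with functools.lru_cache and optional imports that default to None, can be sketched as follows. The function name, the returned dictionary, and the module name hypothetical_agent_framework are placeholders, not QWED code.

```python
from functools import lru_cache

INIT_CALLS = []  # records how many times the body actually runs


@lru_cache(maxsize=1)
def init_telemetry():
    """Runs once; later calls return the cached result."""
    INIT_CALLS.append("init")
    return {"enabled": True}


# Optional integration: bind the name to None explicitly when the package
# is missing, instead of leaving it undefined behind a bare except.
try:
    import hypothetical_agent_framework as agent_framework
except ImportError:
    agent_framework = None


def framework_available():
    return agent_framework is not None


init_telemetry()
init_telemetry()
print(len(INIT_CALLS), framework_available())
```

Catching ImportError specifically keeps unrelated errors visible, and the explicit None makes feature checks a simple identity test.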

Chores

  • Removed unused locals and globals across SDK, core modules, examples, and scripts.
  • Replaced module-level print() calls in database.py with logger.debug().
  • Added ast.Tuple and ast.List to the safe SymPy node type allow-list.

v4.0.1 — Sentinel Guard Sync 🔄

Released: March 23, 2026 · GitHub Release · PyPI
Patch release aligning the TypeScript SDK, backend API schemas, and security guard integrations introduced in v4.0.0.

🆕 New Endpoints

  • POST /verify/process — Glass-box reasoning process verifier with IRAC structural compliance and custom milestone validation.
  • Agent Security Checks — POST /agents/{id}/verify now accepts security_checks: { exfiltration, mcp_poison } to run ExfiltrationGuard and MCPPoisonGuard before verification.

🔒 Security Fixes

  • Information Disclosure — Removed raw exception messages from /verify/rag error responses; clients receive only INTERNAL_VERIFICATION_ERROR.
  • Symbolic Precision — RAGVerifyRequest.max_drm_rate changed from float | str → str, with a field_validator enforcing Fraction-compatible values.
  • Response Consistency — RAG error responses now return "verified": false matching the success path schema.
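The motivation for a string-only, Fraction-compatible rate is presumably exactness: a float such as 0.1 has no exact binary representation, while the string "0.1" parses to exactly 1/10. The sketch below shows that check with the standard library's fractions module; parse_rate is a hypothetical helper and does not reproduce the actual field_validator.

```python
from fractions import Fraction


def parse_rate(value):
    """Accept only strings that Fraction can parse exactly."""
    if not isinstance(value, str):
        raise TypeError("rate must be a string, for example '0.1' or '1/10'")
    try:
        return Fraction(value)
    except (ValueError, ZeroDivisionError) as exc:
        raise ValueError(f"not a Fraction-compatible value: {value!r}") from exc


print(parse_rate("0.1") == Fraction(1, 10))  # exact decimal string
print(Fraction(0.1) == Fraction(1, 10))      # float input carries binary error
```

Both decimal strings ("0.25") and ratio strings ("1/3") are accepted by Fraction, so callers can express rates that floats cannot represent.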

🛠️ SDK Changes (@qwed-ai/sdk@4.0.1)

  • Added verifyProcess() method for IRAC/milestone validation.
  • verifyRAG() — maxDrmRate type changed from number to string.
  • verifyAgent() — Payload aligned with backend schema, agent IDs URL-encoded.
  • New types: Process, RAG, Security in VerificationType enum.

🧪 Tests

  • test_api_phase17_endpoints.py — covers /verify/process, /verify/rag exception masking, and agent security check blocking.
pip install qwed==4.0.1
docker pull qwedai/qwed-verification:4.0.1
npm install @qwed-ai/sdk@4.0.1

v4.0.0 — Sentinel Edition 🛡️

Released: March 12, 2026 · GitHub Release · PyPI
147 commits since v3.0.1 — the largest update in QWED history.

🔧 v4.0.0 patch — Provider fallbacks, .env loading, and Gemini stability

  • Native Gemini provider — Gemini now runs through a dedicated provider with native support for math, logic, stats, fact, and image verification. All API calls enforce a 30-second timeout and automatically strip Markdown code fences from responses.
  • Centralized .env loading — Environment variables now load in a deterministic priority order: project .env first, then ~/.qwed/.env. This prevents configuration drift between CLI and server contexts.
  • Gemini AST stability — Fixed intermittent JSON parse failures when Gemini returns code-fenced responses during logic translation.
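Stripping a Markdown code fence before JSON parsing can be done with a single regular expression. This is an illustrative sketch, not the provider's code: strip_code_fence is a hypothetical name, and the fence marker is built from single backticks so the example does not contain a literal fence.

```python
import json
import re

FENCE = "`" * 3  # the three-backtick Markdown fence marker

_FENCED = re.compile(
    r"^\s*" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text):
    """Return the fenced body if the whole text is one fenced block."""
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text.strip()


raw = FENCE + 'json\n{"op": "and", "args": ["a", "b"]}\n' + FENCE
parsed = json.loads(strip_code_fence(raw))
print(parsed["op"])
```

Unfenced responses pass through unchanged apart from whitespace trimming, so the same helper can be applied to every response.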

🆕 Agentic Security Guards (Phase 17)

A brand-new guard subsystem for securing AI agent tool chains and RAG pipelines:
  • RAGGuard — Detects prompt injection, data poisoning, and context manipulation in RAG pipelines. IRAC-compliant reporting.
  • ExfiltrationGuard — Prevents data exfiltration through agent tool calls by analyzing output patterns and destination validation.
  • MCP Poison Guard — Detects poisoned or tampered MCP tool definitions before agent execution.
All three guards went through five rounds of security review via CodeRabbit and SonarCloud.

🆕 New Standalone Guards

  • SovereigntyGuard — Enforces data residency policies and local routing rules (GDPR, data localization).
  • ToxicFlowGuard — Stateful detection of toxic tool-chaining patterns across multi-step agent workflows.
  • SelfInitiatedCoTGuard (S-CoT) — Verifies self-initiated Chain-of-Thought logic paths for reasoning integrity.

🆕 Process Determinism

A new class of deterministic verification:
  • ProcessVerifier — IRAC/milestone-based process verification with decimal scoring, budget-aware timeouts, and structured compliance reporting. Ensures AI-driven workflows follow deterministic process steps — not just correct answers, but correct procedures.

🔒 Critical Security Fixes

  • Replaced direct eval() with AST-validated execution (Code Injection Prevention). Further hardened in v4.0.2 with a full AST-walking interpreter.
  • Patched critical sandbox escape and namespace mismatch.
  • Hardened SymPy input parsing against injection.
  • Fixed URL whitespace bypass and protocol wildcard bypass.
  • Resolved CVE-2026-24049 (Critical), CVE-2025-8869, and HTTP request smuggling.
  • Fixed all 19 Snyk Code findings.
  • Secured exception handling across verify_logic, ControlPlane, verify_stats, agent_tool_call.

🐳 Docker Hardening

  • Pinned base image digests with hash-verified requirements
  • Non-root user execution with gosu/runuser
  • Automated Docker Hub publishing on release
  • SBOM generation (SPDX) and Docker Scout scanning
docker pull qwedai/qwed-verification:4.0.0

🔧 CI/CD Infrastructure

  • Sentry SDK — Error tracking and monitoring.
  • CircleCI — Python matrix testing (3.10, 3.11, 3.12).
  • SonarCloud — Code quality and coverage.
  • Snyk — Security scanning with SARIF output.
  • Docker Auto-Publish — Automated image push on every release.

📝 Documentation & Badges

  • OpenSSF Best Practices badge (Silver)
  • Snyk security badge and partner attribution
  • Docker Hub pulls badge and BuildKit badge
  • 11 verification engines across all docs

v3.0.1 — Ironclad Update 🦾

Released: February 4, 2026 · GitHub Release

🛡️ Critical Security Hardening

  • CodeQL Remediation: Resolved 50+ alerts including ReDoS, Clear-text Logging, and Exception Exposure.
  • Workflow Permissions: Enforced permissions: contents: read across all GitHub Actions to adhere to Least Privilege.
  • PII Protection: Implemented robust redact_pii logic in all API endpoints and exception handlers.

📝 Compliance

  • Snyk Attribution: Added Snyk attribution to README and Documentation footer for Partner Program compliance.

🐛 Bug Fixes

  • API Stability: Fixed unhandled exceptions in verify_logic and agent_tool_call endpoints.

v2.4.1 — The Reasoning Engine 🚀

Released: January 20, 2026 · GitHub Release

New Features

  • Optimization Engine (verify_optimization): Added LogicVerifier support for Z3’s Optimize context.
  • Vacuity Checker (check_vacuity): Added logical proof to detect “Vacuous Truths”.
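A vacuous truth arises when an implication holds only because its premise can never be satisfied. The sketch below illustrates the idea by exhaustively enumerating boolean assignments; it swaps Z3 for brute force and does not reflect the actual check_vacuity signature, so is_vacuous and its arguments are hypothetical.

```python
from itertools import product


def is_vacuous(variables, antecedent):
    """Return True when no boolean assignment satisfies the antecedent."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if antecedent(assignment):
            return False  # the premise can hold, so the rule is not vacuous
    return True


# "If a and not a, then grant access" can never fire: vacuously true.
print(is_vacuous(["a"], lambda env: env["a"] and not env["a"]))
# "If a and b, then grant access" has a satisfiable premise.
print(is_vacuous(["a", "b"], lambda env: env["a"] and env["b"]))
```

Enumeration is exponential in the number of variables, which is why a solver is the practical choice beyond toy examples.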

Enterprise Updates

  • Dockerized GitHub Action: The main qwed-verification action now runs in a Docker container.

Fixes & Improvements

  • Updated logic_verifier.py with additive, non-breaking methods.
  • Replaced shell-based action_entrypoint.sh with robust Python handler action_entrypoint.py.