TL;DR: QWED uses deterministic solvers (SymPy, Z3, AST, SQLGlot) wherever possible. When LLM fallback is required, the response is explicitly marked as HEURISTIC.
Engine Classification
Every QWED verification engine falls into one of three categories:

| Category | Engines | Technology | Reproducible? |
|---|---|---|---|
| 100% Symbolic | Math, Logic, Code, SQL, Schema, Taint | SymPy, Z3, AST, SQLGlot | ✅ Yes |
| Hybrid | Fact, Stats | TF-IDF / Sandbox → LLM fallback | ⚠️ Conditional |
| Heuristic | Image, Consensus, Reasoning | VLM / Multi-LLM voting | ❌ No |
Detailed Breakdown
| Engine | Mode | Primary Technology | LLM Fallback? |
|---|---|---|---|
| Math | Symbolic | SymPy (symbolic algebra) | ❌ Never |
| Logic | Symbolic | Z3 Theorem Prover (SMT) | ❌ Never |
| Code | Symbolic | Python AST + Bandit | ❌ Never |
| SQL | Symbolic | SQLGlot AST parser | ❌ Never |
| Schema | Symbolic | Pydantic + SymPy | ❌ Never |
| Taint | Symbolic | Data Flow Analysis (AST) | ❌ Never |
| Fact | Hybrid | TF-IDF similarity | ✅ When TF-IDF confidence < threshold |
| Stats | Hybrid | Wasm/Docker sandbox | ✅ When sandbox execution fails |
| Image | Heuristic | Vision LLM (GPT-4V, Claude) | ✅ Always |
| Consensus | Heuristic | Multi-LLM voting | ✅ Always |
| Reasoning | Heuristic | Chain-of-thought LLM | ✅ Always |
How to Know Which Mode Was Used
Every API response includes a `verification_mode` field.
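A minimal sketch of reading this field from a response. The payload below is illustrative, not the full response schema; only `verification_mode` is the documented field here:

```python
# Hypothetical response payload; field names other than
# `verification_mode` are assumptions for illustration.
response = {
    "status": "VERIFIED",
    "engine": "math",
    "verification_mode": "SYMBOLIC",  # or "HEURISTIC"
}

# Route results by how they were verified.
if response["verification_mode"] == "SYMBOLIC":
    trust = "reproducible"   # proven by a deterministic solver
else:
    trust = "needs-review"   # LLM fallback or voting was involved

print(trust)
```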
verification_mode Values
| Value | Meaning | Trust Level |
|---|---|---|
| `SYMBOLIC` | Result proven by deterministic solver (SymPy/Z3/AST) | 🟢 100% reproducible |
| `HEURISTIC` | Result from LLM fallback or multi-model voting | 🟡 Best-effort, not guaranteed |
Why This Matters
For Regulated Industries
Banks, healthcare, and legal systems require audit trails and reproducibility:

- ✅ `SYMBOLIC` results can be independently verified
- ⚠️ `HEURISTIC` results should be flagged for human review
For Production Systems
Trust boundaries in translated queries
When a natural language query is verified through the math pipeline, the response status is `INCONCLUSIVE`, even when the expression evaluation itself is fully deterministic. This is because the LLM translation step (natural language → expression) is not formally verified.
Each response includes a `trust_boundary` object that makes this explicit.
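A sketch of what such an object might contain. The field names below are assumptions for illustration, not the authoritative schema; see the Math Engine reference for the real field list:

```python
# Illustrative trust_boundary object; field names are assumptions,
# not the official QWED schema.
trust_boundary = {
    "translation_verified": False,  # NL -> expression step is LLM-based, unproven
    "evaluation_verified": True,    # the expression itself was checked symbolically
    "note": "Expression evaluation is deterministic; translation is not.",
}

# An application might accept the answer but record the unproven step.
accept_result = trust_boundary["evaluation_verified"]
log_for_review = not trust_boundary["translation_verified"]
```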
Use the `trust_boundary` fields to decide how to handle the result in your application. For example, you might accept the computed answer while logging that the translation was not formally proven. See Math Engine — Trust boundary for the full field reference.
Design philosophy
“Probabilistic systems should not be trusted with deterministic tasks.”

QWED treats LLMs as untrusted translators:

- The LLM converts natural language → formal specification
- A deterministic solver verifies the formal specification
- If the solver cannot verify, the result is marked `HEURISTIC`
- If the translation itself cannot be formally verified, the result is marked `INCONCLUSIVE` with a `trust_boundary`
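The steps above can be sketched as a small dispatch routine. The function names and injected callables are placeholders, not real QWED APIs:

```python
def verify(natural_language_query, translate, solve):
    """Sketch of the untrusted-translator flow described above.

    `translate` (an LLM) and `solve` (a deterministic solver) are
    injected placeholders, not real QWED functions.
    """
    spec = translate(natural_language_query)  # untrusted LLM step
    proven = solve(spec)                      # deterministic solver step
    if not proven:
        return {"status": "HEURISTIC"}        # solver could not verify
    # The translation step itself is never formally verified, so the
    # strongest claim is INCONCLUSIVE plus an explicit trust boundary.
    return {
        "status": "INCONCLUSIVE",
        "trust_boundary": {"translation_verified": False},
    }

# Toy stand-ins for the LLM translator and the symbolic solver.
result = verify(
    "what is two plus two?",
    translate=lambda q: "2 + 2 == 4",
    solve=lambda spec: eval(spec),
)
```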
Trust boundaries in the natural language pipeline
When a query enters through `POST /verify/natural_language`, an LLM first translates the user’s natural language into a formal expression. Even when the downstream solver (e.g., SymPy) evaluates that expression deterministically, the overall result depends on whether the LLM correctly interpreted the user’s intent, and that step is not deterministic.
To make this distinction explicit, the natural language math pipeline now:
- Returns `INCONCLUSIVE` as the top-level status instead of `VERIFIED`, even when the expression evaluation succeeds.
- Includes a `trust_boundary` object in the response that describes exactly what was and was not proven.
A `VERIFIED` result from the Math Engine’s direct endpoint (`POST /verify/math`) therefore carries stronger guarantees than the same calculation routed through the natural language pipeline. See the Math Engine documentation for the full field reference.
Numerical sampling fallback
The identity verification fallback (numerical sampling at fixed test points) now returns `UNKNOWN` with `confidence: 0.0` instead of `LIKELY_EQUIVALENT` with `confidence: 0.99`. Matching at a handful of sample points does not constitute a formal proof and should not be treated as one.
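As an illustration of why point sampling is not a proof, two different expressions can agree at every sampled point yet differ elsewhere. The test points below are arbitrary, not QWED's actual sampling strategy:

```python
import math

# Two clearly different functions...
f = lambda x: math.sin(x)
g = lambda x: 0.0

# ...that happen to agree at these fixed sample points.
test_points = [0.0, math.pi, 2 * math.pi]
matches = all(abs(f(x) - g(x)) < 1e-9 for x in test_points)

# Matching here proves nothing: f and g disagree at e.g. x = pi/2,
# which is why sampling alone can only ever justify UNKNOWN.
assert matches
assert abs(f(math.pi / 2) - g(math.pi / 2)) > 0.5
```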
Frequently Asked Questions
Q: Can I filter responses by verification_mode?
A: Yes! Use the `require_symbolic` option in your request.
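A sketch of a request body using this option. The endpoint URL and the other field names are illustrative assumptions; only `require_symbolic` is the documented option:

```python
# Hypothetical request body; endpoint and `query` field are
# assumptions for illustration.
payload = {
    "query": "Is x**2 - 1 == (x - 1)*(x + 1)?",
    "require_symbolic": True,  # reject results that would fall back to HEURISTIC
}

# Sent with any HTTP client, e.g. (not executed here):
# requests.post("https://qwed.example/verify/math", json=payload)
```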
Q: Why is Fact verification sometimes HEURISTIC?
A: Fact verification uses TF-IDF (deterministic) to find evidence. If TF-IDF confidence is below the threshold, it falls back to NLI (LLM-based), marking the result as `HEURISTIC`.
Q: Is Consensus useful if it’s HEURISTIC?
A: Yes! Consensus detects disagreement between multiple LLMs. While not deterministic, it catches cases where models are uncertain, making it useful as a safety net rather than a formal verifier.

See Also
- Engines Overview — Full engine documentation
- API Endpoints — Response schema reference
- Whitepaper — Academic justification for neurosymbolic approach