Example 1: Financial calculations
Scenario
You’re building an AI-powered financial advisor. Users ask for compound interest calculations, and you need to ensure the AI’s math is correct.
Setup
{
"mcpServers": {
"qwed-verification": {
"command": "qwed-mcp",
"env": {
"QWED_MCP_TRUSTED_CODE_EXECUTION": "true"
}
}
}
}
Conversation
User: Calculate the future value of $10,000 invested at 5% annual interest, compounded monthly, for 10 years.
Claude’s process:
- Calculate: FV = P(1 + r/n)^(nt) = 10000(1 + 0.05/12)^(12×10)
- Call execute_python_code to verify:
{
"code": "from decimal import Decimal, ROUND_HALF_UP\n\nP = Decimal('10000')\nr = Decimal('0.05')\nn = Decimal('12')\nt = Decimal('10')\n\nFV = P * (1 + r/n) ** (n*t)\nFV = FV.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)\nprint(f'Future value: ${FV}')"
}
Result:
STDOUT:
Future value: $16470.09
Execution completed successfully.
Claude’s response: The future value is $16,470.09 (verified using deterministic decimal arithmetic).
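The "code" field in the payload above is a Python script escaped into a single JSON string, which is awkward to write by hand. A minimal sketch of producing that payload from a readable script with the standard library is shown below. The variable names `script` and `payload` are illustrative, and how the payload is delivered to the tool depends on your MCP client.

```python
import json

# Readable form of the verification script from the example above.
script = """\
from decimal import Decimal, ROUND_HALF_UP

P = Decimal('10000')
r = Decimal('0.05')
n = Decimal('12')
t = Decimal('10')

FV = P * (1 + r/n) ** (n*t)
FV = FV.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
print(f'Future value: ${FV}')
"""

# json.dumps escapes newlines and quotes, so the script survives
# the round trip through the JSON "code" field unchanged.
payload = json.dumps({"code": script})
print(payload)
```

Decoding the payload with json.loads returns the original script text, so nothing is lost in the escaping step.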
Example 2: Legal reasoning
Scenario
An AI legal assistant needs to verify that conclusions follow from legal premises.
Conversation
User: Under contract law: If both parties agree and consideration is exchanged, a contract is formed. Alice agreed and paid Bob. Is there a valid contract?
Claude’s process:
- Formalize the logic
- Call execute_python_code with a Z3 solver script:
{
"code": "from qwed_new.engines.logic_engine import verify_logic_statement\n\nresult = verify_logic_statement(\n premises=[\n 'if agreed and consideration then contract',\n 'Alice agreed',\n 'consideration exchanged'\n ],\n conclusion='contract formed'\n)\nprint(f'Verified: {result.get(\"verified\")}')\nprint(f'Message: {result.get(\"message\", \"\")}')"
}
Result: Verified — the conclusion logically follows.
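To make "logically follows" concrete, here is a small standard-library sketch that checks the same argument by enumerating truth assignments. This is not how verify_logic_statement or Z3 works internally. It is a brute-force entailment check, and the `entails` helper and the symbol names are assumptions made for illustration.

```python
from itertools import product

def entails(premises, conclusion, symbols):
    """Return True if every assignment that satisfies all premises
    also satisfies the conclusion (propositional entailment)."""
    for values in product([False, True], repeat=len(symbols)):
        env = dict(zip(symbols, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False  # counterexample: premises hold, conclusion fails
    return True

symbols = ["agreed", "consideration", "contract"]
premises = [
    # if agreed and consideration then contract
    lambda e: (not (e["agreed"] and e["consideration"])) or e["contract"],
    lambda e: e["agreed"],          # Alice agreed
    lambda e: e["consideration"],   # consideration exchanged
]
conclusion = lambda e: e["contract"]

print(entails(premises, conclusion, symbols))      # True
# Without the consideration premise the conclusion no longer follows.
print(entails(premises[:2], conclusion, symbols))  # False
```

Enumeration is exponential in the number of symbols, which is one reason a solver is the better choice for anything beyond toy arguments.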
Example 3: Secure code review
Scenario
A developer asks AI to review their code for security issues.
Conversation
User: Review this login function:
def login(username, password):
    query = f"SELECT * FROM users WHERE name='{username}'"
    user = db.execute(query)
    if eval(f"'{password}' == '{user.password}'"):
        return True
    return False
Claude’s process:
- Call execute_python_code with a CodeGuard analysis:
{
"code": "from qwed_new.guards.code_guard import CodeGuard\n\ncode = '''def login(username, password):\n query = f\"SELECT * FROM users WHERE name='{username}'\"\n user = db.execute(query)\n if eval(f\"'{password}' == '{user.password}'\"):\n return True\n return False'''\n\nguard = CodeGuard()\nresult = guard.verify_safety(code)\nprint(f'Verified: {result[\"verified\"]}')\nif result.get('violations'):\n for v in result['violations']:\n print(f' - {v}')"
}
Result:
STDOUT:
Verified: False
- Dangerous function call: eval()
- Warning: String concatenation in SQL query detected
Execution completed successfully.
Claude’s response:
Security issues found:
- SQL injection: direct string formatting in the SQL query
- Code injection: using eval() for password comparison
Recommended fix:
def login(username, password):
    query = "SELECT * FROM users WHERE name = ?"
    user = db.execute(query, (username,))
    return password == user.password
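For intuition about how a finding like "Dangerous function call: eval()" can be produced, the sketch below walks the syntax tree with Python's ast module. It is an illustration only and makes no claim about CodeGuard's actual rules. The `DANGEROUS_CALLS` set and `find_dangerous_calls` helper are hypothetical names.

```python
import ast

# Illustrative deny-list; not CodeGuard's rule set.
DANGEROUS_CALLS = {"eval", "exec"}

def find_dangerous_calls(source: str) -> list[str]:
    """Report direct calls to names in DANGEROUS_CALLS."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in DANGEROUS_CALLS):
            findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings

snippet = '''
def login(username, password):
    query = f"SELECT * FROM users WHERE name='{username}'"
    user = db.execute(query)
    if eval(f"'{password}' == '{user.password}'"):
        return True
    return False
'''

print(find_dangerous_calls(snippet))  # ['line 5: call to eval()']
```

The code is parsed but never executed, so the check is safe to run on untrusted input. It only catches direct calls by name and would miss aliased or dynamically resolved calls.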
Example 4: Research paper validation
Scenario
Verifying statistical claims in AI-assisted research paper review.
Conversation
User: The paper claims: “If the p-value < 0.05, we reject the null hypothesis. Our p-value was 0.03. Therefore, we reject the null hypothesis.”
Claude’s process:
- Call execute_python_code with a logic verification script:
{
"code": "from qwed_new.engines.logic_engine import verify_logic_statement\n\nresult = verify_logic_statement(\n premises=[\n 'if p_value less than 0.05 then reject_null',\n 'p_value less than 0.05'\n ],\n conclusion='reject_null'\n)\nprint(f'Verified: {result.get(\"verified\")}')"
}
Result: Verified.
Example 5: SQL query validation
Scenario
An AI data analyst generates SQL queries. You need to ensure they’re safe before execution.
Conversation
User: Get all orders for user input "admin' OR '1'='1"
Claude’s process:
- Generate query
- Call execute_python_code with a SQLGuard check before responding:
{
"code": "from qwed_new.guards.sql_guard import SQLGuard\n\nguard = SQLGuard()\nresult = guard.verify_query(\"SELECT * FROM orders WHERE user_id = 'admin'' OR ''1''=''1'\")\nprint(f'Verified: {result[\"verified\"]}')\nprint(f'Message: {result.get(\"message\", \"\")}')"
}
Result:
STDOUT:
Verified: False
Message: Potential SQL injection detected
Execution completed successfully.
Claude’s response:
The input appears to be a SQL injection attempt. Use parameterized queries:
SELECT * FROM orders WHERE user_id = ?
-- Pass user_id as parameter
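The difference between string formatting and a bound parameter can be demonstrated with Python's built-in sqlite3 module. The sketch below uses an in-memory database and a made-up `orders` table, both assumptions for illustration. The same idea applies to other drivers, although the placeholder syntax varies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id TEXT)")
conn.executemany(
    "INSERT INTO orders (user_id) VALUES (?)",
    [("admin",), ("alice",)],
)

malicious = "admin' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and rewrites the WHERE clause.
unsafe = conn.execute(
    f"SELECT * FROM orders WHERE user_id = '{malicious}'"
).fetchall()

# Safe: the input is bound as a plain string value, never parsed as SQL.
safe = conn.execute(
    "SELECT * FROM orders WHERE user_id = ?", (malicious,)
).fetchall()

print(len(unsafe))  # 2 (every row is returned)
print(len(safe))    # 0 (no user has that literal id)
```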
Example 6: Background verification
Scenario
You need to run a complex, multi-guard verification that may exceed the default 30-second timeout. Use background=true to dispatch the job asynchronously, then poll for results with verification_status.
Conversation
User: Run a full legal contract review including deadline, liability, and provenance checks on this 50-page NDA.
Claude’s process:
- Call execute_python_code with background=true:
{
"code": "from qwed_legal import LegalGuard\nimport hashlib\n\nguard = LegalGuard(provenance_config={'require_disclosure': True})\n\n# Deadline check\nd = guard.verify_deadline('2026-01-15', '90 business days', '2026-05-01')\nprint(f'Deadline verified: {d.verified}')\n\n# Liability check\nl = guard.verify_liability_cap(5000000, 200, 10000000)\nprint(f'Liability verified: {l.verified}')\n\n# Provenance check\ncontent = 'This AI-generated contract review...'\nh = hashlib.sha256(content.encode()).hexdigest()\np = guard.verify_provenance(content, {'content_hash': h, 'model_id': 'gpt-4', 'generation_timestamp': '2026-03-25T10:00:00+00:00'})\nprint(f'Provenance verified: {p[\"verified\"]}')",
"background": true
}
Response:
Verification order is being placed for the request a1b2c3d4-e5f6-7890-abcd-ef1234567890. Check back using the 'verification_status' tool.
- Poll for results using verification_status:
{
"name": "verification_status",
"arguments": {
"job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
}
Response:
Status: success
Result:
STDOUT:
Deadline verified: True
Liability verified: True
Provenance verified: True
Execution completed successfully.
Background jobs are ideal for multi-guard verification pipelines, large document analysis, or any script that may exceed the 30-second synchronous timeout.
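On the client side, polling usually means calling verification_status repeatedly until the job finishes. The sketch below shows one way to structure that loop. The `get_status` callable, the dict shape it returns, and the "pending" status value are all assumptions, since only the "success" status appears in the example above. Adapt them to whatever your MCP client returns.

```python
import time
from typing import Callable

def poll_job(get_status: Callable[[str], dict], job_id: str,
             interval: float = 2.0, timeout: float = 300.0) -> dict:
    """Call get_status until the job is no longer pending or the timeout expires.

    get_status is a hypothetical wrapper around the verification_status tool.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status.get("status") != "pending":  # assumed status value
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout}s")
        time.sleep(interval)

# Stand-in for a real client: pending twice, then success.
_responses = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "success", "result": "Deadline verified: True"},
])

def fake_status(job_id: str) -> dict:
    return next(_responses)

result = poll_job(fake_status, "a1b2c3d4-e5f6-7890-abcd-ef1234567890", interval=0.01)
print(result["status"])  # success
```

A fixed interval keeps the sketch short. For long-running jobs, an increasing interval would reduce the number of status calls.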
Example 7: Integration with LangChain
Python code
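from langchain_anthropic import ChatAnthropic
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain.tools import StructuredTool
from langchain_core.prompts import ChatPromptTemplate
import subprocess

# Create a tool that calls execute_python_code via MCP CLI
def run_verification(code: str) -> str:
    """Execute Python verification code via QWED-MCP."""
    result = subprocess.run(
        ["qwed-mcp-cli", "execute_python_code",
         "--code", code],
        capture_output=True, text=True
    )
    # Surface stderr so the agent can see why a script failed
    if result.returncode != 0:
        return f"Verification failed: {result.stderr}"
    return result.stdout

verify_tool = StructuredTool.from_function(
    func=run_verification,
    name="execute_python_code",
    description="Execute Python code to verify calculations using QWED SDKs."
)

# The agent prompt must define an agent_scratchpad placeholder for tool calls
prompt = ChatPromptTemplate.from_messages([
    ("system", "Verify every calculation with execute_python_code before answering."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create agent with verification
llm = ChatAnthropic(model="claude-3-sonnet")  # substitute a model ID available to your account
agent = create_tool_calling_agent(llm, [verify_tool], prompt)
executor = AgentExecutor(agent=agent, tools=[verify_tool])

# Use it
result = executor.invoke({
    "input": "What's the integral of 2x? Write a script to verify your answer."
})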
Best practices
1. Always verify before responding
User asks → AI calculates → AI writes verification script → AI responds
                                        ↓ (if fails)
                                  AI recalculates
2. Use background mode for heavy tasks
Set background=true when running scripts that may take longer than 30 seconds. Poll results using verification_status with the returned job_id.
3. Use appropriate SDK imports
| Task | SDK import |
|---|---|
| Math/calculations | from sympy import ... |
| Logic/reasoning | from qwed_new.engines.logic_engine import verify_logic_statement |
| Code review | from qwed_new.guards.code_guard import CodeGuard |
| SQL queries | from qwed_new.guards.sql_guard import SQLGuard |
| Legal deadlines | from qwed_legal import DeadlineGuard |
| AI provenance | from qwed_legal import ProvenanceGuard |
4. Handle verification failures
When verification fails:
- Acknowledge the error
- Recalculate
- Verify again
- Explain the correction to user
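The steps above amount to a bounded retry loop. A minimal sketch follows, assuming two hypothetical callables: `calculate`, which produces an answer and may use feedback from a failed check, and `verify`, which stands in for a call to execute_python_code and returns a pass/fail flag with a message.

```python
from typing import Callable, Optional

def answer_with_verification(
    calculate: Callable[[Optional[str]], str],
    verify: Callable[[str], tuple[bool, str]],
    max_attempts: int = 3,
) -> tuple[str, int]:
    """Recalculate until verification passes; return the answer and attempt count."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        answer = calculate(feedback)
        ok, message = verify(answer)
        if ok:
            return answer, attempt
        feedback = message  # pass the failure reason to the next attempt
    raise RuntimeError(f"verification failed after {max_attempts} attempts: {feedback}")

# Toy stand-ins: the first answer is wrong, the corrected one passes.
def calculate(feedback: Optional[str]) -> str:
    return "142" if feedback is None else "144"

def verify(answer: str) -> tuple[bool, str]:
    expected = 12 * 12
    ok = int(answer) == expected
    return ok, "ok" if ok else f"expected {expected}, got {answer}"

print(answer_with_verification(calculate, verify))  # ('144', 2)
```

Capping the number of attempts keeps a persistently failing calculation from looping forever, and the attempt count gives the caller what it needs to explain the correction to the user.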
5. Explain verification to users
"I calculated X. Let me verify this is correct...
[Runs verification script]
✅ Verified using deterministic computation.
The answer is definitely X."