QWED MCP examples for verified tool execution

Example 1: Financial calculations

Scenario

You’re building an AI-powered financial advisor. Users ask for compound interest calculations, and you need to ensure the AI’s math is correct.

Setup

{
  "mcpServers": {
    "qwed-verification": {
      "command": "qwed-mcp",
      "env": {
        "QWED_MCP_TRUSTED_CODE_EXECUTION": "true"
      }
    }
  }
}

Conversation

User: Calculate the future value of $10,000 invested at 5% annual interest, compounded monthly, for 10 years.

Claude’s process:

1. Calculate: FV = P(1 + r/n)^(nt) = 10000(1 + 0.05/12)^(12×10)
2. Call execute_python_code to verify:

{
  "code": "from decimal import Decimal, ROUND_HALF_UP\n\nP = Decimal('10000')\nr = Decimal('0.05')\nn = Decimal('12')\nt = Decimal('10')\n\nFV = P * (1 + r/n) ** (n*t)\nFV = FV.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)\nprint(f'Future value: ${FV}')"
}

Result:

STDOUT:
Future value: $16470.09

Execution completed successfully.

Claude’s response: The future value is $16,470.09 (verified using deterministic decimal arithmetic).
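For readers who want to check the figure outside the tool call, here is a minimal sketch of the same calculation as a standalone function. The name `future_value` and its signature are illustrative choices, not part of QWED; the arithmetic mirrors the script in the request above.

```python
from decimal import Decimal, ROUND_HALF_UP


def future_value(principal: str, annual_rate: str, periods_per_year: int, years: int) -> Decimal:
    """Compound interest FV = P(1 + r/n)^(nt), rounded to cents."""
    p = Decimal(principal)
    r = Decimal(annual_rate)
    n = Decimal(periods_per_year)
    # Decimal supports integer exponents directly, so no float conversion is needed.
    growth = (1 + r / n) ** (periods_per_year * years)
    return (p * growth).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(f"Future value: ${future_value('10000', '0.05', 12, 10)}")
```

Passing the inputs as strings avoids binary floating-point rounding before the values reach `Decimal`.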

Example 2: Legal reasoning

Scenario

An AI legal assistant needs to verify that conclusions follow from legal premises.

Conversation

User: Under contract law: If both parties agree and consideration is exchanged, a contract is formed. Alice agreed and paid Bob. Is there a valid contract?

Claude’s process:

1. Formalize the logic
2. Call execute_python_code with a Z3 solver script:

{
  "code": "from qwed_new.engines.logic_engine import verify_logic_statement\n\nresult = verify_logic_statement(\n    premises=[\n        'if agreed and consideration then contract',\n        'Alice agreed',\n        'consideration exchanged'\n    ],\n    conclusion='contract formed'\n)\nprint(f'Verified: {result.get(\"verified\")}')\nprint(f'Message: {result.get(\"message\", \"\")}')"
}

Result: Verified — the conclusion logically follows.
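To make "logically follows" concrete, the sketch below checks the same argument by brute-force truth table in plain Python. It is an independent illustration, not how the QWED logic engine works internally; the `entails` helper and the variable names are assumptions made for this example.

```python
from itertools import product


def entails(premises, conclusion, variables):
    """True if every assignment that satisfies all premises also satisfies the conclusion."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False  # counterexample: premises hold but conclusion fails
    return True


variables = ["agreed", "consideration", "contract"]
premises = [
    # if agreed and consideration then contract
    lambda e: not (e["agreed"] and e["consideration"]) or e["contract"],
    lambda e: e["agreed"],
    lambda e: e["consideration"],
]

if __name__ == "__main__":
    print(entails(premises, lambda e: e["contract"], variables))
```

Dropping the third premise makes the check fail, because an assignment with no consideration and no contract satisfies the remaining premises.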

Example 3: Secure code review

Scenario

A developer asks an AI assistant to review their code for security issues.

Conversation

User: Review this login function:

def login(username, password):
    query = f"SELECT * FROM users WHERE name='{username}'"
    user = db.execute(query)
    if eval(f"'{password}' == '{user.password}'"):
        return True
    return False

Claude’s process:

Call execute_python_code with a CodeGuard analysis:

{
  "code": "from qwed_new.guards.code_guard import CodeGuard\n\ncode = '''def login(username, password):\n    query = f\"SELECT * FROM users WHERE name='{username}'\"\n    user = db.execute(query)\n    if eval(f\"'{password}' == '{user.password}'\"):\n        return True\n    return False'''\n\nguard = CodeGuard()\nresult = guard.verify_safety(code)\nprint(f'Verified: {result[\"verified\"]}')\nif result.get('violations'):\n    for v in result['violations']:\n        print(f'  - {v}')"
}

Result:

STDOUT:
Verified: False
  - Dangerous function call: eval()
  - Warning: String concatenation in SQL query detected

Execution completed successfully.

Claude’s response: Security issues found:

1. SQL injection: direct string formatting in the SQL query
2. Code injection: using eval() for password comparison

Recommended fix:

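def login(username, password):
    # Bind username as a parameter instead of formatting it into the SQL text
    query = "SELECT * FROM users WHERE name = ?"
    user = db.execute(query, (username,))
    # Unknown user: fail closed instead of raising on user.password
    if user is None:
        return False
    return password == user.password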
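The two findings in this example are the kind of pattern a static check can detect from the syntax tree alone. The following is a rough sketch using Python's standard `ast` module. It is not CodeGuard's implementation, and the `find_issues` helper and its heuristics are assumptions made for illustration.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec"}
SQL_KEYWORDS = ("SELECT", "INSERT", "UPDATE", "DELETE")


def find_issues(source: str) -> list[str]:
    """Flag direct eval/exec calls and f-strings that look like SQL statements."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        # Direct calls such as eval(...) or exec(...)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_CALLS:
                issues.append(f"line {node.lineno}: call to {node.func.id}()")
        # f-strings with interpolated values whose literal text starts like SQL
        if isinstance(node, ast.JoinedStr):
            literal = "".join(
                part.value for part in node.values
                if isinstance(part, ast.Constant) and isinstance(part.value, str)
            )
            interpolated = any(isinstance(part, ast.FormattedValue) for part in node.values)
            if interpolated and literal.lstrip().upper().startswith(SQL_KEYWORDS):
                issues.append(f"line {node.lineno}: SQL built with an f-string")
    return issues


SAMPLE = '''
def login(username, password):
    query = f"SELECT * FROM users WHERE name='{username}'"
    user = db.execute(query)
    if eval(f"'{password}' == '{user.password}'"):
        return True
    return False
'''

if __name__ == "__main__":
    for issue in find_issues(SAMPLE):
        print(issue)
```

The source is only parsed, never executed, so undefined names such as `db` in the sample do not matter.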

Example 4: Research paper validation

Scenario

An AI assistant reviewing a research paper needs to verify that the paper’s statistical conclusions follow from its stated claims.

Conversation

User: The paper claims: “If the p-value < 0.05, we reject the null hypothesis. Our p-value was 0.03. Therefore, we reject the null hypothesis.”

Claude’s process:

Call execute_python_code to check that the conclusion follows from the premises:

{
  "code": "from qwed_new.engines.logic_engine import verify_logic_statement\n\nresult = verify_logic_statement(\n    premises=[\n        'if p_value less than 0.05 then reject_null',\n        'p_value less than 0.05'\n    ],\n    conclusion='reject_null'\n)\nprint(f'Verified: {result.get(\"verified\")}')"
}

Result: Verified.
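The script above takes "p_value less than 0.05" as a premise, which relies on the numeric fact that 0.03 < 0.05. A small sketch of checking that step explicitly follows; the `rule_applies` helper is hypothetical and not part of QWED.

```python
from decimal import Decimal


def rule_applies(p_value: Decimal, alpha: Decimal = Decimal("0.05")) -> bool:
    """The paper's rule fires only when p is strictly below the threshold."""
    return p_value < alpha


if __name__ == "__main__":
    print(f"0.03 < 0.05: {rule_applies(Decimal('0.03'))}")
    # Boundary case: a p-value exactly at the threshold does not satisfy "less than".
    print(f"0.05 < 0.05: {rule_applies(Decimal('0.05'))}")
```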

Example 5: SQL query validation

Scenario

An AI data analyst generates SQL queries. You need to ensure they’re safe before execution.

Conversation

User: Get all orders for user input "admin' OR '1'='1"

Claude’s process:

1. Generate the query
2. Call execute_python_code with a SQLGuard check before responding:

{
  "code": "from qwed_new.guards.sql_guard import SQLGuard\n\nguard = SQLGuard()\nresult = guard.verify_query(\"SELECT * FROM orders WHERE user_id = 'admin'' OR ''1''=''1'\")\nprint(f'Verified: {result[\"verified\"]}')\nprint(f'Message: {result.get(\"message\", \"\")}')"
}

Result:

STDOUT:
Verified: False
Message: Potential SQL injection detected

Execution completed successfully.

Claude’s response: The input appears to be a SQL injection attempt. Use parameterized queries:

SELECT * FROM orders WHERE user_id = ?
-- Pass user_id as parameter
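The difference between the two approaches can be reproduced with Python's built-in `sqlite3` module. This is a minimal sketch: the in-memory `orders` table and its `item` column are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO orders (user_id, item) VALUES (?, ?)",
    [("admin", "laptop"), ("alice", "keyboard")],
)

malicious = "admin' OR '1'='1"

# Unsafe: the input is spliced into the SQL text, so the OR clause becomes part of the query.
unsafe_rows = conn.execute(
    f"SELECT item FROM orders WHERE user_id = '{malicious}'"
).fetchall()

# Safe: the driver binds the whole string as a single value.
safe_rows = conn.execute(
    "SELECT item FROM orders WHERE user_id = ?", (malicious,)
).fetchall()

print(f"unsafe query returned {len(unsafe_rows)} rows")
print(f"parameterized query returned {len(safe_rows)} rows")
```

The formatted query matches every row because `'1'='1'` is always true, while the parameterized query looks for a user whose ID is literally the injected string and finds none.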

Example 6: Background verification

Scenario

You need to run a complex, multi-guard verification that may exceed the default 30-second timeout. Use background=true to dispatch the job asynchronously, then poll for results with verification_status.

Conversation

User: Run a full legal contract review including deadline, liability, and provenance checks on this 50-page NDA.

Claude’s process:

Call execute_python_code with background=true:

{
  "code": "from qwed_legal import LegalGuard\nimport hashlib\n\nguard = LegalGuard(provenance_config={'require_disclosure': True})\n\n# Deadline check\nd = guard.verify_deadline('2026-01-15', '90 business days', '2026-05-01')\nprint(f'Deadline verified: {d.verified}')\n\n# Liability check\nl = guard.verify_liability_cap(5000000, 200, 10000000)\nprint(f'Liability verified: {l.verified}')\n\n# Provenance check\ncontent = 'This AI-generated contract review...'\nh = hashlib.sha256(content.encode()).hexdigest()\np = guard.verify_provenance(content, {'content_hash': h, 'model_id': 'gpt-4', 'generation_timestamp': '2026-03-25T10:00:00+00:00'})\nprint(f'Provenance verified: {p[\"verified\"]}')",
  "background": true
}

Response:

Verification order is being placed for the request a1b2c3d4-e5f6-7890-abcd-ef1234567890. Check back using the 'verification_status' tool.

Poll for results using verification_status:

{
  "name": "verification_status",
  "arguments": {
    "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  }
}

Response:

Status: success

Result:
STDOUT:
Deadline verified: True
Liability verified: True
Provenance verified: True

Execution completed successfully.

Background jobs are ideal for multi-guard verification pipelines, large document analysis, or any script that may exceed the 30-second synchronous timeout.
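On the client side, polling usually needs an interval and an overall time limit. The sketch below shows one way to structure that loop. It assumes a caller-supplied `get_status` function that wraps the verification_status tool and returns a dict with a `status` key; the in-progress status names `pending` and `running` are also assumptions, since only `success` appears in the example above.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job leaves the assumed in-progress states or the timeout elapses."""
    in_progress = {"pending", "running"}  # assumed status names
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status.get("status") not in in_progress:
            return status  # finished, whether it succeeded or failed
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status.get('status')} after {timeout}s")
        sleep(interval)


if __name__ == "__main__":
    # Fake status source standing in for the real tool call.
    fake = iter([{"status": "pending"}, {"status": "running"}, {"status": "success"}])
    print(wait_for_job(lambda job_id: next(fake), "example-job", interval=0))
```

Injecting `get_status` and `sleep` keeps the loop independent of any particular MCP client and makes it easy to test without waiting.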

Example 7: Integration with LangChain

Python code

from langchain_anthropic import ChatAnthropic
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain.tools import StructuredTool
from langchain_core.prompts import ChatPromptTemplate
import subprocess

# Create a tool that calls execute_python_code via MCP CLI
def run_verification(code: str) -> str:
    """Execute Python verification code via QWED-MCP."""
    result = subprocess.run(
        ["qwed-mcp-cli", "execute_python_code",
         "--code", code],
        capture_output=True, text=True
    )
    return result.stdout

verify_tool = StructuredTool.from_function(
    func=run_verification,
    name="execute_python_code",
    description="Execute Python code to verify calculations using QWED SDKs."
)

# Prompt for the agent (example wording); create_tool_calling_agent
# requires an agent_scratchpad placeholder in the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a careful assistant. Verify answers with the available tools."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create agent with verification
llm = ChatAnthropic(model="claude-3-sonnet")
agent = create_tool_calling_agent(llm, [verify_tool], prompt)
executor = AgentExecutor(agent=agent, tools=[verify_tool])

# Use it
result = executor.invoke({
    "input": "What's the integral of 2x? Write a script to verify your answer."
})
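The `run_verification` function above returns only stdout, so a missing CLI, a non-zero exit code, or a hung process would reach the agent as an empty string or an exception. A more defensive wrapper might look like the sketch below. The `run_cli` helper is hypothetical, and the 30-second default is chosen only to mirror the synchronous timeout mentioned in Example 6.

```python
import subprocess
import sys


def run_cli(command: list[str], timeout: float = 30.0) -> str:
    """Run a command and return stdout, or a readable error string on failure."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return f"ERROR: timed out after {timeout} seconds"
    except FileNotFoundError:
        return f"ERROR: command not found: {command[0]}"
    if result.returncode != 0:
        return f"ERROR (exit {result.returncode}): {result.stderr.strip()}"
    return result.stdout


if __name__ == "__main__":
    # The Python interpreter stands in for the real CLI so the sketch runs anywhere.
    print(run_cli([sys.executable, "-c", "print('ok')"]))
```

Returning the error as text rather than raising lets the agent see what went wrong and decide whether to retry.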

Best practices

1. Always verify before responding

User asks → AI calculates → AI writes verification script → AI responds
                             ↓ (if fails)
                         AI recalculates
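The flow in the diagram can be sketched as a bounded retry loop. The `calculate` and `verify` callables are placeholders for the model's calculation step and its verification script; neither they nor the `max_attempts` limit come from QWED.

```python
from typing import Callable, Optional


def answer_with_verification(
    calculate: Callable[[int], str],
    verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first answer that passes verification, or None if all attempts fail."""
    for attempt in range(1, max_attempts + 1):
        answer = calculate(attempt)  # AI calculates (or recalculates on later attempts)
        if verify(answer):           # AI runs its verification script
            return answer            # respond only with a verified answer
    return None


if __name__ == "__main__":
    # Toy stand-ins: the first attempt is wrong, the second is right.
    attempts = {1: "x", 2: "x**2"}
    result = answer_with_verification(lambda n: attempts.get(n, "x**2"), lambda a: a == "x**2")
    print(result)
```

Capping the number of attempts avoids an endless recalculation loop when verification keeps failing; returning `None` leaves it to the caller to tell the user the answer could not be verified.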

2. Use background mode for heavy tasks

Set background=true when running scripts that may take longer than 30 seconds. Poll results using verification_status with the returned job_id.

3. Use appropriate SDK imports

| Task | SDK import |
| --- | --- |
| Math/calculations | `from sympy import ...` |
| Logic/reasoning | `from qwed_new.engines.logic_engine import verify_logic_statement` |
| Code review | `from qwed_new.guards.code_guard import CodeGuard` |
| SQL queries | `from qwed_new.guards.sql_guard import SQLGuard` |
| Legal deadlines | `from qwed_legal import DeadlineGuard` |
| AI provenance | `from qwed_legal import ProvenanceGuard` |

4. Handle verification failures

When verification fails:

1. Acknowledge the error
2. Recalculate
3. Verify again
4. Explain the correction to the user

5. Explain verification to users

"I calculated X. Let me verify this is correct...
[Runs verification script]
✅ Verified using deterministic computation.
The answer is definitely X."
