Verified structured extraction
Pull structured fields out of your sources where every field value is a verbatim
substring of a source, or null. Deterministic, no LLM judge. Unlike schema validators
(which check the shape of JSON), this checks that each field’s value is actually grounded
in a source.
- SDK: mx.verified.extract(params)
- HTTP: POST https://api.maxmodel.com/v1/verified/extract
SDK
const r = await mx.verified.extract({
  model: 'gpt-5.5-pro',
  sources: [
    { id: 'pricing.md', text: 'The Pro plan costs $29/month and includes 50 seats.' },
    { id: 'sla.md', text: 'Uptime SLA is 99.9%. Support responds within 24 hours.' },
  ],
  schema: {
    price: 'the monthly price of the Pro plan',
    sla: 'the uptime SLA percentage',
    ceo: 'the name of the CEO',
  },
})
r.fields.price // { value: '$29/month', source: 'pricing.md', range: [19, 28], grounded: true }
r.fields.ceo // { value: null, grounded: false, reason: 'not_found' } ← not in the sources
r.coverage // 0.67 (grounded fields / total fields)

# Python
r = mx.verified.extract(model="gpt-5.5-pro", sources=[...], schema={
    "price": "the monthly price", "ceo": "the CEO name",
})
r.fields["price"].value # "$29/month"
r.fields["price"].source # "pricing.md"
r.fields["ceo"].grounded # False (reason: "not_found")
r.coverage

Request / response (HTTP)
// POST /v1/verified/extract
{
  "model": "gpt-5.5-pro",
  "mode": "strict",
  "sources": [ { "id": "pricing.md", "text": "..." } ],
  "schema": { "price": "the monthly price", "ceo": "the CEO name" }
}

// Response
{
  "id": "ext_...",
  "model": "gpt-5.5-pro",
  "fields": {
    "price": { "value": "$29/month", "source": "pricing.md", "range": [19, 28], "grounded": true },
    "ceo": { "value": null, "grounded": false, "reason": "not_found" }
  },
  "coverage": 0.5,
  "usage": { "prompt_tokens": 120, "completion_tokens": 30, "total_tokens": 150, "model_calls": 1 }
}

How it works
The model is asked to return each field as an exact substring of a source. Code then
checks that each value appears verbatim in some source (the same character-for-character,
boundary-exact match as verified.create). A value that isn’t found is set to null with
reason: 'not_found', so a hallucinated field can’t slip through with a real-looking
value. range gives the start and end offsets of the match in the cited source, for highlighting.
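
The check described above can be approximated locally. The following is a minimal sketch, not the service's implementation: ground_value and highlight are hypothetical helper names, the "Jane Doe" value stands in for a hallucinated answer, and the sketch uses a plain case-sensitive substring search without modelling whatever extra rules "boundary-exact" implies. It treats range as a start offset plus an exclusive end offset, which is consistent with the [19, 28] shown for '$29/month' in the example.

```python
from typing import Optional


def ground_value(value: Optional[str], sources: list[dict]) -> dict:
    """Keep `value` only if it occurs verbatim in some source text (hypothetical helper)."""
    if value:
        for src in sources:
            start = src["text"].find(value)  # exact, case-sensitive search
            if start != -1:
                return {
                    "value": value,
                    "source": src["id"],
                    "range": [start, start + len(value)],  # end offset is exclusive
                    "grounded": True,
                }
    # Nothing matched: drop the value instead of trusting the model.
    return {"value": None, "grounded": False, "reason": "not_found"}


def highlight(text: str, span: list[int]) -> str:
    """Wrap the matched span in brackets, e.g. for display (hypothetical helper)."""
    start, end = span
    return f"{text[:start]}[{text[start:end]}]{text[end:]}"


sources = [
    {"id": "pricing.md", "text": "The Pro plan costs $29/month and includes 50 seats."},
    {"id": "sla.md", "text": "Uptime SLA is 99.9%. Support responds within 24 hours."},
]

price = ground_value("$29/month", sources)
ceo = ground_value("Jane Doe", sources)  # made-up value, absent from every source

print(price["range"])  # [19, 28]
print(ceo)  # {'value': None, 'grounded': False, 'reason': 'not_found'}
print(highlight(sources[0]["text"], price["range"]))
```

Because the fallback record carries no value at all, a caller cannot accidentally display an ungrounded string; it has to handle null explicitly.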
- schema is { fieldName: description }. Keep descriptions short and concrete.
- mode: strict (default) or lenient (tolerates punctuation/case in the match).
- coverage = grounded fields / total fields.
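
To make the difference between the two modes and the coverage ratio concrete, here is a rough sketch. The exact normalization used by lenient mode is not documented here, so the normalize function below (case-folding, stripping ASCII punctuation, collapsing whitespace) is an assumption for illustration only, and normalize, matches, and coverage are hypothetical helper names rather than part of the SDK.

```python
import string

# Translation table that deletes ASCII punctuation (an assumed normalization rule).
_PUNCT = str.maketrans("", "", string.punctuation)


def normalize(s: str) -> str:
    """Case-fold, drop punctuation, and collapse whitespace."""
    return " ".join(s.casefold().translate(_PUNCT).split())


def matches(value: str, text: str, mode: str = "strict") -> bool:
    """strict: verbatim substring; lenient: substring after normalization."""
    if not value:
        return False
    if mode == "strict":
        return value in text
    return normalize(value) in normalize(text)


def coverage(fields: dict) -> float:
    """Grounded fields divided by total fields; 0.0 for an empty schema."""
    if not fields:
        return 0.0
    grounded = sum(1 for f in fields.values() if f["grounded"])
    return grounded / len(fields)


text = "Uptime SLA is 99.9%. Support responds within 24 hours."
print(matches("uptime sla is 99.9%", text, "strict"))   # False: case differs
print(matches("uptime sla is 99.9%", text, "lenient"))  # True under the assumed rules

fields = {
    "price": {"value": "$29/month", "grounded": True},
    "ceo": {"value": None, "grounded": False, "reason": "not_found"},
}
print(coverage(fields))  # 0.5
```

A low coverage value is a quick signal that the sources do not contain what the schema asks for, or that the field descriptions are too vague to locate a matching span.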