The platform, in one page
Everything Pagentic does, and why it matters.
Pagentic turns messy real-world PDFs — freight invoices, medical EOBs, policy appendixes, contracts — into clean, audited, structured JSON your software can actually trust. One signed multi-tenant API, a portal for humans, and specialized agents tuned to your specific document types.
Document agents
Specialized agents, not a generic extractor
Off-the-shelf document AI treats every document the same. Pagentic doesn't. You build a dedicated agent per document type — an LTL freight invoice agent, a medical EOB agent, a policy-appendix agent — each with its own instructions, JSON schema, and worked example tuned to the quirks of that document.
The payoff: dramatically higher accuracy on the messy, variable, real-world documents that generic tools quietly fail on.
Agent builder
Build a working agent in an afternoon
Upload a handful of sample documents. Pagentic surveys them, drafts an agent (instructions, JSON schema, worked example), and runs it on the spot. You go from "I have ten PDFs" to "I have a working extraction agent" in a single session.
- Auto-drafted schema inferred from your sample docs
- Inline document preview alongside the JSON output
- One-click promotion from draft to versioned, production-ready runtime agent
Iterate interactively
Chat with the agent, version every change
Review extractions side-by-side with the source PDF. Click any field to see where it came from on the page. Disagree with a value? Give the agent feedback in natural language — the next iteration responds. No prompt-engineering tax.
Every iteration is versioned. Compare any two versions. Roll back any time. Pin production integrations to a specific version so iteration in the builder never silently changes what your downstream systems see.
Verifiable outputs
Every field is traceable back to the source
Every extracted leaf field arrives wrapped with value, verbatim (the literal text from the PDF), pages (the 1-based page indices it came from), confidence, and notes. Any number you act on can be audited back to the exact spot in the original document — for compliance, dispute resolution, or just to build trust internally.
Cross-checks
The agent tells you when something is suspicious
Every result includes a _meta.noteschannel where the agent proactively flags inconsistencies it noticed: subtotals that don't add up, OCR misreads preserved verbatim, missing fields, conflicting values, ambiguous interpretations.
That's the difference between "extracted the data" and understanding the document— and it's how you catch integration errors that would otherwise hit downstream systems silently.
Human in the loop
Approval queue with edit-in-place corrections
Toggle requires approval on any agent and every extraction lands in a reviewer queue before the webhook fires. Reviewers see the document next to the extracted JSON, can click any field to fix it directly in the tree or the visual view, and approve or reject with one click.
Use it for sensitive workflows — legal, healthcare, large-dollar approvals — or while you're still building trust with a brand-new agent. Toggle it off when you're ready to go fully automated. Original values are preserved for audit on every manual correction.
Roles & access
Granular permissions per teammate
Three role tiers per tenant:
- Owner — full admin: billing, users, API keys, agent edits.
- Member — can extract documents, review queue items, view results.
- Reviewer — read-only access scoped to the approval queue.
On top of that, any teammate can be restricted to a specific subset of agents. A reviewer who only handles freight invoices won't see medical EOBs in their queue at all — even via direct URL.
Production-grade API
Built for engineers who ship to production
- Signed webhooks — HMAC-SHA256 on every payload, with a six-attempt retry schedule (1min → 5min → 30min → 2h → 12h → dead-letter) and full replay from the portal.
- Idempotency keys — pass
X-Idempotency-Keyon any submission; network glitches stop costing you double extractions. - Versioned agents — pin
agent=my_agent@5so production output is reproducible; retire old versions without breaking existing integrations. - Live + test API keys with per-scope rate limits, narrowest-scope minting, and immediate revocation. Keys are SHA-256 hashed at rest.
- Sync or async — pick the right mode per call. Async is the default and right for almost every production integration.
Multi-tenant by design
Tenant isolation, retention controls, audit trails
Every record — extractions, agents, keys, audit events — is hard-isolated per tenant. You set your own retention window. Every API call lands in an audit log keyed by API key and user. Webhook secrets and API keys are SHA-256 hashed at rest and never recoverable.
Observability
Usage roll-ups and full extraction history
Every successful extraction emits a usage event with page counts, token counts, and the per-page price applied. Roll-ups are visible in the portal, broken down by agent and API key, so you always know what's costing what — and which agents are doing the heavy lifting.
Want to see it on your documents?
Email us a representative sample of the document type you're trying to extract. We'll build a working agent in a sandbox and walk you through the output.