The platform, in one page

Everything Pagentic does, and why it matters.

Pagentic turns messy real-world PDFs — freight invoices, medical EOBs, policy appendixes, contracts — into clean, audited, structured JSON your software can actually trust. One signed multi-tenant API, a portal for humans, and specialized agents tuned to your specific document types.

Document agents

Specialized agents, not a generic extractor

Off-the-shelf document AI treats every document the same. Pagentic doesn't. You build a dedicated agent per document type — an LTL freight invoice agent, a medical EOB agent, a policy-appendix agent — each with its own instructions, JSON schema, and worked example tuned to the quirks of that document.

The payoff: dramatically higher accuracy on the messy, variable, real-world documents that generic tools quietly fail on.

Agent builder

Build a working agent in an afternoon

Upload a handful of sample documents. Pagentic surveys them, drafts an agent (instructions, JSON schema, worked example), and runs it on the spot. You go from "I have ten PDFs" to "I have a working extraction agent" in a single session.

Auto-drafted schema inferred from your sample docs
Inline document preview alongside the JSON output
One-click promotion from draft to versioned, production-ready runtime agent

Iterate interactively

Chat with the agent, version every change

Review extractions side-by-side with the source PDF. Click any field to see where it came from on the page. Disagree with a value? Give the agent feedback in natural language — the next iteration responds. No prompt-engineering tax.

Every iteration is versioned. Compare any two versions. Roll back any time. Pin production integrations to a specific version so iteration in the builder never silently changes what your downstream systems see.

Verifiable outputs

Every field is traceable back to the source

Every extracted leaf field arrives wrapped with value, verbatim (the literal text from the PDF), pages (the 1-based page indices it came from), confidence, and notes. Any number you act on can be audited back to the exact spot in the original document — for compliance, dispute resolution, or just to build trust internally.

Cross-checks

The agent tells you when something is suspicious

Every result includes a _meta.noteschannel where the agent proactively flags inconsistencies it noticed: subtotals that don't add up, OCR misreads preserved verbatim, missing fields, conflicting values, ambiguous interpretations.

That's the difference between "extracted the data" and understanding the document— and it's how you catch integration errors that would otherwise hit downstream systems silently.

Human in the loop

Approval queue with edit-in-place corrections

Toggle requires approval on any agent and every extraction lands in a reviewer queue before the webhook fires. Reviewers see the document next to the extracted JSON, can click any field to fix it directly in the tree or the visual view, and approve or reject with one click.

Use it for sensitive workflows — legal, healthcare, large-dollar approvals — or while you're still building trust with a brand-new agent. Toggle it off when you're ready to go fully automated. Original values are preserved for audit on every manual correction.

Roles & access

Granular permissions per teammate

Three role tiers per tenant:

Owner — full admin: billing, users, API keys, agent edits.
Member — can extract documents, review queue items, view results.
Reviewer — read-only access scoped to the approval queue.

On top of that, any teammate can be restricted to a specific subset of agents. A reviewer who only handles freight invoices won't see medical EOBs in their queue at all — even via direct URL.

Production-grade API

Built for engineers who ship to production

Signed webhooks — HMAC-SHA256 on every payload, with a six-attempt retry schedule (1min → 5min → 30min → 2h → 12h → dead-letter) and full replay from the portal.
Idempotency keys — pass X-Idempotency-Key on any submission; network glitches stop costing you double extractions.
Versioned agents — pin agent=my_agent@5 so production output is reproducible; retire old versions without breaking existing integrations.
Live + test API keys with per-scope rate limits, narrowest-scope minting, and immediate revocation. Keys are SHA-256 hashed at rest.
Sync or async — pick the right mode per call. Async is the default and right for almost every production integration.

Multi-tenant by design

Tenant isolation, retention controls, audit trails

Every record — extractions, agents, keys, audit events — is hard-isolated per tenant. You set your own retention window. Every API call lands in an audit log keyed by API key and user. Webhook secrets and API keys are SHA-256 hashed at rest and never recoverable.

Observability

Usage roll-ups and full extraction history

Every successful extraction emits a usage event with page counts, token counts, and the per-page price applied. Roll-ups are visible in the portal, broken down by agent and API key, so you always know what's costing what — and which agents are doing the heavy lifting.

Want to see it on your documents?

Email us a representative sample of the document type you're trying to extract. We'll build a working agent in a sandbox and walk you through the output.

hello@pagentic.com Read the API docs →