PDF Vector

PDF Extraction API for developers

Extract structured data from PDF documents using AI and JSON Schema. Perfect for invoices, forms, research papers, and more.

  • Schema-Driven Extraction: Define your data structure with JSON Schema and get results formatted to match it
  • AI-Powered Understanding: Advanced AI analyzes documents to extract exactly the data you need, handling variations gracefully
  • Type-Safe Integration: JSON Schema validation ensures a consistent, predictable data structure for your applications

Easy-to-use APIs

Use our simple APIs directly or our TypeScript SDK with just a few lines of code.

PDF Extract

import { PDFVector } from "pdfvector";

const client = new PDFVector({
  apiKey: "pdfvector_xxxxxxx"
});

// From URL
const results = await client.extract({
  url: "https://example.com/invoice.pdf",
  prompt: "Extract all invoice details",
  schema: {
    type: "object",
    properties: {
      invoiceNumber: { type: "string" },
      date: { type: "string" },
      totalAmount: { type: "number" }
    }
  }
});

// From file
import { readFile } from "fs/promises";

// Named differently from the URL example so both snippets can share one module
const fileResults = await client.extract({
  data: await readFile("document.pdf"),
  contentType: "application/pdf",
  prompt: "Extract all invoice details",
  schema: {
    type: "object",
    properties: {
      invoiceNumber: { type: "string" },
      date: { type: "string" },
      totalAmount: { type: "number" }
    }
  }
});

What people are saying

See how PDF Vector is helping teams improve their document processing workflows

Abdo El-Mobayad

Can't recommend PDF Vector enough! It boosts your AI workflow accuracy to 100% while dropping your costs! Especially if you're a T4 Org in the $150/m spend range!

Trent

Gotta give a shoutout to PDF Vector team for helping me set up PDF Vector for a project. They even delivered on a feature request before I purchased. Incredible customer service. 👏

Praneeth Pike

Been implementing RAG and changing a lot of things under the hood for @rabbitholesai, came across PDF Vector and it was a huge time saver. I got a document parsing solution for the rag pipeline within minutes! one less thing to worry about

Structured Data Extraction

Turn unstructured PDFs into structured, validated data. Define your schema once and extract consistently from thousands of documents.

JSON Schema Validation

Define exactly what data you need using JSON Schema. Get type-safe, validated results that match your data structure perfectly.
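To make the idea of schema validation concrete, here is a minimal sketch of checking an extracted object against a flat schema. It is not PDF Vector's validator and it covers only a small subset of JSON Schema (primitive property types, required, additionalProperties); the names FlatObjectSchema, validateFlat, and invoiceSchema are hypothetical and exist only for this illustration.

```typescript
// Illustrative subset of JSON Schema: flat objects with primitive property types only.
type PrimitiveType = "string" | "number" | "boolean";

interface FlatObjectSchema {
  type: "object";
  properties: Record<string, { type: PrimitiveType }>;
  required?: string[];
  additionalProperties?: boolean;
}

// Own-property check, so inherited keys such as "toString" are never matched.
function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns a list of human-readable problems; an empty list means the value conforms.
function validateFlat(schema: FlatObjectSchema, value: unknown): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["value is not an object"];
  }
  const record = value as Record<string, unknown>;
  const errors: string[] = [];

  for (const key of schema.required ?? []) {
    if (!hasOwn(record, key)) errors.push(`missing required property "${key}"`);
  }
  for (const [key, actual] of Object.entries(record)) {
    if (!hasOwn(schema.properties, key)) {
      if (schema.additionalProperties === false) {
        errors.push(`unexpected property "${key}"`);
      }
      continue;
    }
    const expected = schema.properties[key].type;
    if (typeof actual !== expected) {
      errors.push(`property "${key}" should be of type ${expected}`);
    }
  }
  return errors;
}

// Hypothetical invoice schema used only to exercise the sketch.
const invoiceSchema: FlatObjectSchema = {
  type: "object",
  properties: {
    invoiceNumber: { type: "string" },
    totalAmount: { type: "number" },
    "Basic Fee wmView": { type: "string" },
  },
  required: ["invoiceNumber", "totalAmount"],
  additionalProperties: false,
};

const problems = validateFlat(invoiceSchema, {
  invoiceNumber: "123100401",
  totalAmount: "453.53", // wrong type on purpose: a string instead of a number
});
console.log(problems);
```

In a real project a complete JSON Schema validator library would likely be a better choice than a hand-rolled check like this one; the sketch only shows what "validated results" means in practice.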

AI-Powered Extraction

Advanced AI understands document layouts and extracts data intelligently, handling variations and edge cases automatically.

Database-Ready Output

Get clean, structured JSON output ready for your database or application. No post-processing needed.

Complex Document Support

Extract from invoices, forms, reports, contracts, and more. Handles tables, nested data, and complex layouts with ease.

Example Output

Real examples of structured data extracted from PDFs

Output

AI-generated answer to your question

Question

Extract all invoice details

Schema

{
  "type": "object",
  "properties": {
    "invoiceNumber": {
      "type": "string"
    },
    "totalAmount": {
      "type": "number"
    },
    "Basic Fee wmView": {
      "type": "string"
    }
  },
  "required": [
    "invoiceNumber",
    "totalAmount"
  ],
  "additionalProperties": false
}

Answer

{
  "data": {
    "invoiceNumber": "123100401",
    "totalAmount": 453.53,
    "Basic Fee wmView": "130,00 €"
  },
  "pageCount": 3,
  "creditCount": 9
}
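The answer above can be consumed in a type-safe way with a small runtime guard. This is a sketch that assumes the response keeps the shape shown in the example (a data object plus pageCount and creditCount); the names ExtractEnvelope, InvoiceFields, and parseInvoiceEnvelope are hypothetical and are not part of the pdfvector SDK.

```typescript
// Envelope shape taken from the example answer; assumed, not confirmed by API docs.
interface ExtractEnvelope<T> {
  data: T;
  pageCount: number;
  creditCount: number;
}

// Fields from the example schema; "Basic Fee wmView" is not listed as required there.
interface InvoiceFields {
  invoiceNumber: string;
  totalAmount: number;
  "Basic Fee wmView"?: string;
}

// Parses a JSON string and throws if it does not match the expected shape.
function parseInvoiceEnvelope(json: string): ExtractEnvelope<InvoiceFields> {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("response is not an object");
  }
  const envelope = parsed as Record<string, unknown>;
  const pageCount = envelope["pageCount"];
  const creditCount = envelope["creditCount"];
  const data = envelope["data"];
  if (typeof pageCount !== "number" || typeof creditCount !== "number") {
    throw new Error("pageCount and creditCount must be numbers");
  }
  if (typeof data !== "object" || data === null) {
    throw new Error("data must be an object");
  }
  const fields = data as Record<string, unknown>;
  const invoiceNumber = fields["invoiceNumber"];
  const totalAmount = fields["totalAmount"];
  const basicFee = fields["Basic Fee wmView"];
  if (typeof invoiceNumber !== "string" || typeof totalAmount !== "number") {
    throw new Error("invoiceNumber must be a string and totalAmount a number");
  }
  if (basicFee !== undefined && typeof basicFee !== "string") {
    throw new Error('"Basic Fee wmView" must be a string when present');
  }
  return {
    data: {
      invoiceNumber,
      totalAmount,
      ...(basicFee !== undefined ? { "Basic Fee wmView": basicFee } : {}),
    },
    pageCount,
    creditCount,
  };
}

// The example answer from this page, serialized as a JSON string.
const sampleAnswer = JSON.stringify({
  data: {
    invoiceNumber: "123100401",
    totalAmount: 453.53,
    "Basic Fee wmView": "130,00 €",
  },
  pageCount: 3,
  creditCount: 9,
});

const invoice = parseInvoiceEnvelope(sampleAnswer);
console.log(invoice.data.invoiceNumber, invoice.data.totalAmount);
```

Note that the fee field arrives as the string "130,00 €", so any currency arithmetic on it would still need locale-aware parsing on the caller's side.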

One subscription, all APIs

Start for free, then scale as you grow. No hidden fees.

Save one month

Free

$0

Credit Card Required

Perfect for testing and small projects

  • Access to all APIs
  • 100 credits

Basic

$23/month

$275 billed annually

Great for personal projects and small businesses

  • Access to all APIs
  • 3,000 credits

Pro

$89/month

$1067 billed annually

Most popular plan for growing businesses

  • Access to all APIs
  • 100,000 credits

Enterprise

$457/month

$5489 billed annually

For large-scale applications and enterprises

  • Access to all APIs
  • 500,000 credits
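
To compare the plans, it can help to work out the price per credit and a rough page budget. The sketch below uses the prices and credit amounts listed here and assumes that the credit allowance applies per billing month. It also assumes 3 credits per page, a figure inferred only from the single example on this page (9 credits for a 3-page invoice), so actual consumption may differ. All names are hypothetical.

```typescript
interface Plan {
  name: string;
  monthlyUsd: number; // listed monthly price
  credits: number;    // listed credit allowance (assumed to be per month)
}

const plans: Plan[] = [
  { name: "Free", monthlyUsd: 0, credits: 100 },
  { name: "Basic", monthlyUsd: 23, credits: 3000 },
  { name: "Pro", monthlyUsd: 89, credits: 100000 },
  { name: "Enterprise", monthlyUsd: 457, credits: 500000 },
];

// Assumption: inferred from one example (9 credits charged for 3 pages).
const ASSUMED_CREDITS_PER_PAGE = 9 / 3;

// Price of 1,000 credits on a given plan, in USD.
function usdPerThousandCredits(plan: Plan): number {
  return (plan.monthlyUsd / plan.credits) * 1000;
}

// Whole pages that fit in the allowance at the assumed per-page rate.
function pagesPerAllowance(
  plan: Plan,
  creditsPerPage: number = ASSUMED_CREDITS_PER_PAGE,
): number {
  return Math.floor(plan.credits / creditsPerPage);
}

for (const plan of plans) {
  console.log(
    `${plan.name}: $${usdPerThousandCredits(plan).toFixed(2)} per 1,000 credits, ` +
      `about ${pagesPerAllowance(plan)} pages`,
  );
}
```

Passing a different creditsPerPage value lets you redo the estimate once you know how many credits your own documents consume.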

Ready to Extract Data at Scale?

Transform your PDFs into structured data with our powerful extraction API. Define your schema once, extract consistently forever.

No setup fees • Integrate in minutes • Cancel anytime

Frequently asked questions