How It Works
Instead of processing an entire document in a single LLM call, Parallel Extraction:

- Identifies array structures in your JSON schema (e.g., `line_items`, `transactions`, `entries`)
- Extracts unique keys from the document (e.g., product names, invoice numbers, dates) that identify each item
- Segments the document by locating where each key appears
- Processes items concurrently: each array item is extracted in parallel using dedicated LLM calls
- Merges results into the final structured output while preserving order
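The steps above can be sketched in plain Python. This is an illustrative stand-in, not Retab's implementation: `extract_item` is a stub that plays the role of the dedicated per-item LLM call, and segmentation here is simple substring search.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_extract(document: str, keys: list[str], extract_item):
    """Illustrative pipeline: segment a document by key positions,
    extract each segment concurrently, and merge in document order."""
    # Locate where each key appears (the key-discovery step).
    positions = sorted((document.find(k), k) for k in keys if k in document)
    # Segment: each item's text runs from its key to the next key.
    bounds = [p for p, _ in positions] + [len(document)]
    segments = [(k, document[bounds[i]:bounds[i + 1]])
                for i, (_, k) in enumerate(positions)]
    # Extract items concurrently, one dedicated call per segment.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda s: extract_item(*s), segments))
    return results  # pool.map preserves the original document order

# Stub "LLM call" that parses "<name>: qty=<n>" from its segment only.
def extract_item(key, segment):
    qty = int(segment.split("qty=")[1].split()[0])
    return {"product_name": key, "quantity": qty}

doc = "Widget: qty=3 Gadget: qty=7 Sprocket: qty=2"
items = parallel_extract(doc, ["Gadget", "Widget", "Sprocket"], extract_item)
# items holds one dict per line item, in document order
```

Note that each stub call only ever sees its own segment, which is the same property that drives the accuracy gains described below.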
Context Engineering: Why Parallel Extraction is More Accurate
Beyond speed, Parallel Extraction significantly improves extraction accuracy through Context Engineering: the practice of optimizing what context the LLM sees for each extraction task. When processing a 50-page invoice with 200 line items in a single LLM call, the model must:

- Hold the entire document in its context window
- Track hundreds of items simultaneously
- Maintain attention across thousands of tokens
- Avoid confusing similar items that appear pages apart
| Aspect | Standard Extraction | Parallel Extraction |
|---|---|---|
| Context per item | Entire document (all pages) | Only the relevant segment |
| Noise level | High (hundreds of unrelated items) | Minimal (just the target item) |
| Attention dilution | Significant on long documents | None; each call sees only its item's segment |
| Position bias | Later items often less accurate | Equal accuracy for all items |
When to Use Parallel Extraction
Parallel Extraction is ideal when:

- Your schema contains arrays of objects (e.g., `line_items: [{sku, description, quantity, price}]`)
- Documents have many repeating items (10+ items benefit most)
- You need faster turnaround on large documents
- Items can be uniquely identified by a key field
Usage
Enable Parallel Extraction by specifying the `parallel_ocr_keys` parameter in your extraction request.

The `parallel_ocr_keys` parameter is a dictionary mapping:

- Key: the path to the array in your schema (e.g., `"line_items"`, `"transactions"`, `"items.products"`)
- Value: the field within each array item that uniquely identifies it (e.g., `"product_name"`, `"sku"`, `"id"`)
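As an illustration of this mapping (a helper sketch, not part of the Retab SDK), the following checks that each key path resolves to an array in your JSON schema and that the identifier field exists on its items:

```python
def validate_parallel_ocr_keys(schema: dict, parallel_ocr_keys: dict) -> bool:
    """Check that each key path (e.g. "items.products") resolves to an
    array in the JSON schema and that its value names a field on the items."""
    for path, id_field in parallel_ocr_keys.items():
        node = schema
        for part in path.split("."):
            node = node["properties"][part]
            if node.get("type") == "array":
                node = node["items"]  # descend into the array's item schema
        if id_field not in node.get("properties", {}):
            return False
    return True

# A minimal invoice schema for demonstration purposes.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "quantity": {"type": "integer"},
                },
            },
        },
    },
}

validate_parallel_ocr_keys(invoice_schema, {"line_items": "product_name"})  # True
```

A mapping whose value names a field the items don't have (e.g. `{"line_items": "sku"}` against this schema) would return `False`.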
Example Schema
With `parallel_ocr_keys={"line_items": "product_name"}`, Retab will:

- Extract all product names from the document
- Locate each product's position in the document
- Extract each line item's details in parallel
- Merge results and extract constants (`invoice_number`, `date`, `total_amount`) separately
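For this example, the constants and the parallel-extracted items come together in a final merge step. The values below are illustrative, not real output:

```python
# Constants extracted once, separately from the line items.
constants = {
    "invoice_number": "INV-2024-001",
    "date": "2024-05-01",
    "total_amount": 129.90,
}

# Line items extracted in parallel, one dedicated call per product name.
line_items = [
    {"product_name": "Widget", "quantity": 3, "price": 9.99},
    {"product_name": "Gadget", "quantity": 7, "price": 14.27},
]

# Merge step: constants plus the ordered item array form the final output.
result = {**constants, "line_items": line_items}
```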
Supported Document Types
- PDF documents
- Images (JPEG, PNG, etc.)
- Office documents (DOCX, PPTX, ODT, ODP)
- Excel spreadsheets (XLSX, XLS)
Performance Benefits
| Document Size | Standard Extraction | Parallel Extraction |
|---|---|---|
| 5 line items | ~3s | ~3s |
| 20 line items | ~8s | ~4s |
| 50 line items | ~15s | ~5s |
| 100+ items | ~30s+ | ~6s |
Consensus with Parallel Extraction
Parallel Extraction fully supports the `n_consensus` parameter. When enabled, each item is extracted multiple times independently, and the results are compared to improve accuracy. This is particularly useful for:
- High-value documents requiring verification
- Documents with challenging handwriting or scan quality
- Compliance-critical extractions
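Conceptually, consensus amounts to comparing independent runs of the same item and keeping the agreed value per field. A field-level majority vote is one simple way to do this; the hosted implementation may differ:

```python
from collections import Counter

def consensus(extractions: list[dict]) -> dict:
    """Field-level majority vote across n_consensus independent
    extractions of the same item (illustrative sketch only)."""
    fields = extractions[0].keys()
    return {
        f: Counter(e[f] for e in extractions).most_common(1)[0][0]
        for f in fields
    }

# Three independent extractions of one line item; one run misread the quantity.
runs = [
    {"product_name": "Widget", "quantity": 3},
    {"product_name": "Widget", "quantity": 3},
    {"product_name": "Widget", "quantity": 8},
]
consensus(runs)  # -> {'product_name': 'Widget', 'quantity': 3}
```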
Pricing
Parallel Extraction uses the same credit-based pricing as standard extraction. The cost is calculated from the page count and the number of extracted items:

Cost Breakdown
| Component | Cost |
|---|---|
| Key discovery | 1 × model_credits × page_count |
| Per-item extraction | n_items × n_consensus × model_credits |
| Constants extraction | 1 × model_credits × page_count |
Example
For a 10-page invoice with 25 line items using `auto-small` (1.0 credit) and `n_consensus=1`:
- Key discovery: 1 × 1.0 × 10 = 10 credits
- Item extraction: 25 × 1 × 1.0 = 25 credits
- Constants: 1 × 1.0 × 10 = 10 credits
- Total: 45 credits
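The worked example can be reproduced with a small helper that implements the breakdown table above (a sketch of the formula, not official billing code):

```python
def parallel_extraction_cost(pages: int, n_items: int,
                             model_credits: float,
                             n_consensus: int = 1) -> float:
    """Credit cost per the breakdown table: key discovery +
    per-item extraction + constants extraction."""
    key_discovery = 1 * model_credits * pages
    item_extraction = n_items * n_consensus * model_credits
    constants = 1 * model_credits * pages
    return key_discovery + item_extraction + constants

parallel_extraction_cost(pages=10, n_items=25, model_credits=1.0)  # -> 45.0
```

Note that the two page-proportional components mean consensus only multiplies the per-item term: the same invoice with `n_consensus=3` costs 10 + 75 + 10 = 95 credits.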