Extractions are the results of document processing. Each extraction contains the structured data extracted from a document, along with metadata about the extraction process. You can list, filter, and retrieve extractions programmatically.
Retrieve a paginated list of extractions with optional filtering by date, origin, review status, or custom metadata.
from datetime import datetime

from retab import Retab

client = Retab()

# List recent extractions
extractions = client.extractions.list(
    limit=10,
    order="desc"
)

# Filter by metadata
extractions = client.extractions.list(
    metadata={"organization_id": "org_acme_corp"},
    limit=50
)

# Filter by date range
extractions = client.extractions.list(
    from_date=datetime(2024, 1, 1),
    to_date=datetime(2024, 12, 31)
)
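One detail of the date-range call is worth a small sketch. `datetime(2024, 12, 31)` is midnight at the start of December 31, so if the bounds are compared as full timestamps (an assumption; this page does not say how they are compared), extractions created later that day would fall outside the range. The hypothetical helper `day_bounds` below uses only the standard library to build bounds that cover whole calendar days:

```python
from datetime import date, datetime, time
from typing import Tuple


def day_bounds(start: date, end: date) -> Tuple[datetime, datetime]:
    """Return datetimes spanning the start of `start` to the last microsecond of `end`."""
    return datetime.combine(start, time.min), datetime.combine(end, time.max)


# Hypothetical usage: pass these as from_date= and to_date= when listing.
from_date, to_date = day_bounds(date(2024, 1, 1), date(2024, 12, 31))
```

If the service already treats `to_date` as a whole-day bound, the plain `datetime(2024, 12, 31)` form is sufficient and this helper is unnecessary.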
Parameters
limit: Maximum number of extractions to return per page.
order: Sort order by creation date. Either "asc" or "desc".
Cursor for pagination: return extractions before this ID.
Cursor for pagination: return extractions after this ID.
from_date: Return only extractions created on or after this date. Use datetime in Python or Date in JavaScript.
to_date: Return only extractions created on or before this date. Use datetime in Python or Date in JavaScript.
metadata: Filter by custom metadata key-value pairs.
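The cursor parameters suggest the usual pattern of walking through pages until none are left. The sketch below shows that loop in isolation. It runs against an in-memory stand-in (`fake_fetch`, `FAKE_STORE`) rather than the real service, and it assumes that each item exposes an `id` and that a page is a plain list; neither is confirmed by this page.

```python
from typing import Callable, Dict, Iterator, List, Optional

Page = List[Dict[str, str]]


def iter_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, str]]:
    """Yield every item, following an 'after' cursor until a page comes back empty."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        if not page:
            return
        yield from page
        cursor = page[-1]["id"]  # assumed: the last item's ID is the next cursor


# Stand-in for the real API so the sketch runs offline (hypothetical data).
FAKE_STORE: Page = [{"id": f"extr_{i:03d}"} for i in range(7)]


def fake_fetch(after: Optional[str], limit: int = 3) -> Page:
    """Return up to `limit` items that come after the given ID."""
    ids = [item["id"] for item in FAKE_STORE]
    start = 0 if after is None else ids.index(after) + 1
    return FAKE_STORE[start:start + limit]


all_ids = [item["id"] for item in iter_all(fake_fetch)]
```

With the real client, `fetch_page` would wrap a call to `client.extractions.list` that passes the cursor and a `limit`. The exact cursor parameter name and the shape of the returned page should be taken from the API Reference.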
Retrieve a single extraction by its ID.
from retab import Retab
client = Retab()
extraction = client.extractions.get("extr_01G34H8J2K")
print(extraction)
Metadata filtering helps you organize extractions across multiple clients or workflows. Any metadata you attach when running an extraction can later be used as filter keys when listing.
from retab import Retab
client = Retab()
# List all extractions for a specific organization
org_extractions = client.extractions.list(
    metadata={"organization_id": "org_acme_corp"},
    limit=100
)
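To make the filtering idea concrete, here is a local sketch of one plausible matching rule: a record matches when every requested key is present with exactly the requested value. This is an assumption about the semantics, not a description of how the service implements the filter. The `matches_metadata` helper and the sample records are hypothetical.

```python
from typing import Dict, List, Mapping


def matches_metadata(item_metadata: Mapping[str, str], wanted: Mapping[str, str]) -> bool:
    """True when every wanted key is present with an equal value."""
    return all(item_metadata.get(key) == value for key, value in wanted.items())


# Hypothetical records standing in for extractions with attached metadata.
records: List[Dict] = [
    {"id": "extr_a", "metadata": {"organization_id": "org_acme_corp", "workflow": "invoices"}},
    {"id": "extr_b", "metadata": {"organization_id": "org_other"}},
    {"id": "extr_c", "metadata": {"organization_id": "org_acme_corp"}},
]

acme_ids = [
    record["id"]
    for record in records
    if matches_metadata(record["metadata"], {"organization_id": "org_acme_corp"})
]
```

Under this rule, extra keys on a record do not prevent a match, and adding more keys to the filter narrows the result.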
See the API Reference for complete method documentation.