POST /v1/jobs
from retab import Retab

client = Retab()

# Create an async extraction job
job = client.jobs.create(
    endpoint="/v1/documents/extract",
    request={
        "document": {
            "filename": "invoice.pdf",
            "url": "data:application/pdf;base64,JVBERi0xLjQK..."
        },
        "json_schema": {
            "type": "object",
            "properties": {
                "invoice_number": {"type": "string"},
                "total": {"type": "number"}
            },
            "required": ["invoice_number", "total"]
        },
        "model": "retab-small"
    },
    metadata={"batch_id": "batch_001", "source": "api"}
)

print(f"Job ID: {job.id}")
print(f"Status: {job.status}")
{
  "id": "job_V1StGXR8_Z5jdHi6B-myT",
  "object": "job",
  "status": "queued",
  "endpoint": "/v1/documents/extract",
  "request": {
    "document": {
      "filename": "invoice.pdf",
      "url": "data:application/pdf;base64,JVBERi0xLjQK..."
    },
    "json_schema": {
      "type": "object",
      "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"}
      },
      "required": ["invoice_number", "total"]
    },
    "model": "retab-small"
  },
  "response": null,
  "error": null,
  "created_at": 1705420800,
  "started_at": null,
  "completed_at": null,
  "expires_at": 1706025600,
  "organization_id": "org_abc123",
  "metadata": {
    "batch_id": "batch_001",
    "source": "api"
  }
}

Request Parameters

endpoint
string
required
The API endpoint to call asynchronously. Supported values:
  • /v1/documents/extract - Extract structured data
  • /v1/documents/parse - Parse to text/markdown
  • /v1/documents/split - Split documents
  • /v1/documents/classify - Classify documents
  • /v1/schemas/generate - Generate schemas
  • /v1/edit/agent/fill - AI agent form filling
  • /v1/edit/templates/fill - Template filling
  • /v1/edit/templates/generate - Generate form schema
  • /v1/projects/extract - Project extraction (requires project_id in request)
request
object
required
The full request body for the target endpoint. Must match the schema expected by the specified endpoint.
metadata
object
Optional key-value pairs for tracking. Maximum 16 pairs; keys up to 64 characters, values up to 512 characters.
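The constraints above can be checked on the client before a job is submitted. The sketch below is one possible pre-flight check, not part of the Retab SDK: `check_job_arguments` and the constant names are hypothetical, and it assumes metadata values are compared by their string length. The server remains the authority on what is accepted.

```python
# Endpoints listed as supported values for the `endpoint` parameter.
SUPPORTED_ENDPOINTS = frozenset({
    "/v1/documents/extract",
    "/v1/documents/parse",
    "/v1/documents/split",
    "/v1/documents/classify",
    "/v1/schemas/generate",
    "/v1/edit/agent/fill",
    "/v1/edit/templates/fill",
    "/v1/edit/templates/generate",
    "/v1/projects/extract",
})

# Documented metadata limits.
MAX_METADATA_PAIRS = 16
MAX_KEY_CHARS = 64
MAX_VALUE_CHARS = 512


def check_job_arguments(endpoint, request, metadata=None):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []

    if endpoint not in SUPPORTED_ENDPOINTS:
        problems.append(f"unsupported endpoint: {endpoint!r}")

    # Project extraction is documented as requiring project_id in the request.
    if endpoint == "/v1/projects/extract" and "project_id" not in request:
        problems.append("/v1/projects/extract requires project_id in request")

    if metadata is not None:
        if len(metadata) > MAX_METADATA_PAIRS:
            problems.append(f"metadata has {len(metadata)} pairs (max {MAX_METADATA_PAIRS})")
        for key, value in metadata.items():
            if len(key) > MAX_KEY_CHARS:
                problems.append(f"metadata key too long: {key[:20]!r}...")
            # Assumption: values are measured as strings.
            if len(str(value)) > MAX_VALUE_CHARS:
                problems.append(f"metadata value too long for key {key!r}")

    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once, which may be convenient when validating a batch of jobs.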

Response Fields

id
string
Unique identifier for the job, prefixed with job_.
object
string
Always "job".
status
string
Current status: validating, queued, in_progress, completed, failed, cancelled, or expired.
endpoint
string
The target endpoint for this job.
request
object
The original request body submitted.
response
object | null
The response from the target endpoint when status is completed. Contains status_code and body.
error
object | null
Error details when status is failed. Contains code, message, and optional details.
created_at
integer
Unix timestamp when the job was created.
started_at
integer | null
Unix timestamp when processing started.
completed_at
integer | null
Unix timestamp when the job reached a terminal status.
expires_at
integer
Unix timestamp when the job data will expire (7 days after creation).
organization_id
string
The organization that owns this job.
metadata
object | null
User-provided metadata.
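The fields above suggest a simple way to interpret a job once it has been retrieved as a plain dictionary. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and treating `completed`, `failed`, `cancelled`, and `expired` as the terminal statuses is an inference from the status list, not something the reference states explicitly.

```python
# Assumption: these four statuses are the terminal ones.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "expired"})

# expires_at is documented as 7 days after creation.
SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60


def is_terminal(job):
    """Return True if the job will not change status any further."""
    return job["status"] in TERMINAL_STATUSES


def unwrap_job(job):
    """Return the response body of a completed job, or raise otherwise."""
    status = job["status"]
    if status == "completed":
        # response is documented to contain status_code and body.
        return job["response"]["body"]
    if status == "failed":
        # error is documented to contain code, message, and optional details.
        err = job["error"]
        raise RuntimeError(f"{err['code']}: {err['message']}")
    raise RuntimeError(f"job {job['id']} has no result (status: {status})")


def seconds_until_expiry(job, now):
    """Seconds left before the job data expires; negative if already past."""
    return job["expires_at"] - now
```

With the example response shown earlier, `expires_at - created_at` is `1706025600 - 1705420800 = 604800` seconds, which matches `SEVEN_DAYS_SECONDS`.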

Authorizations

Api-Key
string
header
required

Headers

Idempotency-Key
string | null

Query Parameters

access_token
string | null

Body

application/json

Request body for POST /v1/jobs.

endpoint
enum<string>
required
Available options:
/v1/documents/extract,
/v1/documents/parse,
/v1/documents/split,
/v1/documents/classify,
/v1/schemas/generate,
/v1/edit/agent/fill,
/v1/edit/templates/fill,
/v1/edit/templates/generate,
/v1/projects/extract
request
Request · object
required
metadata
Metadata · object

Max 16 pairs; keys ≤64 chars, values ≤512 chars

Response

Successful Response

Core Job object following OpenAI-style specification.

Represents a single asynchronous job that can be polled for status and result retrieval.

endpoint
enum<string>
required
Available options:
/v1/documents/extract,
/v1/documents/parse,
/v1/documents/split,
/v1/documents/classify,
/v1/schemas/generate,
/v1/edit/agent/fill,
/v1/edit/templates/fill,
/v1/edit/templates/generate,
/v1/projects/extract
request
Request · object
required
organization_id
string
required
id
string
object
string
default:job
Allowed value: "job"
status
enum<string>
default:validating
Available options:
validating,
queued,
in_progress,
completed,
failed,
cancelled,
expired
response
JobResponse · object

Response stored when job completes successfully.

error
JobError · object

Error details when job fails.

created_at
integer
started_at
integer | null
completed_at
integer | null
expires_at
integer
metadata
Metadata · object
cloud_task_name
string | null
cancelled
boolean
default:false
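
Because a job is meant to be polled for status and result retrieval, a client typically loops until the job stops changing. This page does not document the retrieval call, so the sketch below takes it as an injected `fetch_job` callable (hypothetical) that returns the job as a dictionary. The set of terminal statuses, the polling interval, and the timeout are all assumptions chosen for illustration.

```python
import time

# Assumption: these four statuses are the terminal ones.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "expired"})


def wait_for_job(fetch_job, job_id, poll_interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is a caller-supplied function returning the job as a dict.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']!r} after {timeout}s")
        sleep(poll_interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays. In production code a fixed interval could be replaced with backoff to reduce request volume for long-running jobs.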