GET /v1/extractions/{extraction_id}/sources
from retab import Retab

client = Retab()

result = client.extractions.sources("extr_01G34H8J2K")
print(result)
{
  "object": "extraction.sources",
  "extraction_id": "extr_01G34H8J2K",
  "document_type": "pdf",
  "file": {
    "id": "file_abc123",
    "filename": "invoice_001.pdf",
    "mime_type": "application/pdf"
  },
  "extraction": {
    "invoice_number": "INV-1032",
    "customer": {
      "name": "Acme Inc."
    },
    "total_amount": 1240.0
  },
  "sources": {
    "invoice_number": {
      "value": "INV-1032",
      "source": {
        "content": "INV-1032",
        "anchor": {
          "kind": "pdf_bbox",
          "page": 1,
          "left": 0.6,
          "top": 0.12,
          "width": 0.25,
          "height": 0.03
        }
      }
    },
    "customer": {
      "name": {
        "value": "Acme Inc.",
        "source": {
          "content": "Acme Inc.",
          "anchor": {
            "kind": "pdf_bbox",
            "page": 1,
            "left": 0.1,
            "top": 0.25,
            "width": 0.3,
            "height": 0.03
          }
        }
      }
    },
    "total_amount": {
      "value": 1240.0,
      "source": {
        "content": "1,240.00",
        "anchor": {
          "kind": "pdf_bbox",
          "page": 1,
          "left": 0.65,
          "top": 0.85,
          "width": 0.2,
          "height": 0.03
        }
      }
    }
  }
}
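
The pdf_bbox anchors in the sample response can be mapped onto a rendered page image. The sketch below assumes that left, top, width and height are fractions of the page size measured from the top-left corner; the values in the sample are consistent with that, but this reference does not state it. The helper name bbox_to_pixels and the page dimensions are hypothetical.

```python
def bbox_to_pixels(anchor, page_width_px, page_height_px):
    """Convert a pdf_bbox anchor to (x0, y0, x1, y1) pixel coordinates.

    Assumption: left/top/width/height are fractions of the page size,
    with the origin at the top-left corner of the page.
    """
    if anchor.get("kind") != "pdf_bbox":
        raise ValueError(f"unsupported anchor kind: {anchor.get('kind')!r}")
    x0 = anchor["left"] * page_width_px
    y0 = anchor["top"] * page_height_px
    x1 = (anchor["left"] + anchor["width"]) * page_width_px
    y1 = (anchor["top"] + anchor["height"]) * page_height_px
    # Round to whole pixels so the box can be used for cropping or drawing.
    return tuple(round(v) for v in (x0, y0, x1, y1))


# Anchor of invoice_number from the sample response, on a hypothetical
# 1000 x 2000 pixel rendering of page 1.
anchor = {"kind": "pdf_bbox", "page": 1,
          "left": 0.6, "top": 0.12, "width": 0.25, "height": 0.03}
box = bbox_to_pixels(anchor, 1000, 2000)
print(box)
```

Other anchor kinds are rejected here instead of guessed at, since only pdf_bbox appears in the sample.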

Authorizations

Api-Key
string
header
required

Path Parameters

extraction_id
string
required
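
For callers not using the SDK, the request can be assembled from the pieces listed above: the GET path, the extraction_id path parameter, and the Api-Key header. This is a minimal sketch using only the standard library. The base URL is a placeholder, not the real API host, and build_sources_request is a hypothetical helper name.

```python
import urllib.request
from urllib.parse import quote


def build_sources_request(base_url, api_key, extraction_id):
    """Build (but do not send) the GET request for an extraction's sources."""
    # Percent-encode the ID so it cannot alter the path structure.
    path = f"/v1/extractions/{quote(extraction_id, safe='')}/sources"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Api-Key": api_key},  # API key is sent as a header
        method="GET",
    )


# Placeholder host and key: substitute the real values for your account.
req = build_sources_request(
    "https://api.example.invalid", "YOUR_API_KEY", "extr_01G34H8J2K"
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; that step is left out so the sketch runs without network access.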

Response

Successful Response

An extraction's output annotated with the source that backs each value.

Returned when fetching the sources for an extraction. The response carries the source file and its detected document_type, the original extraction output, and a parallel sources tree in which each leaf is a {value, source} object that locates the value in the document (a page region for PDFs, a cell for spreadsheets, a text span for plain text, and so on).
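
One way to consume the sources tree is to flatten it into one record per leaf. The sketch below treats any object whose keys are exactly value and source as a leaf, which matches the sample response but would be ambiguous for a schema that itself has fields with those two names. List handling is an assumption, since the sample contains no arrays. The function name iter_source_leaves is hypothetical.

```python
def iter_source_leaves(node, path=()):
    """Yield (dotted_path, value, source) for every {value, source} leaf."""
    if isinstance(node, dict) and set(node) == {"value", "source"}:
        yield ".".join(map(str, path)), node["value"], node["source"]
    elif isinstance(node, dict):
        for key, child in node.items():
            yield from iter_source_leaves(child, path + (key,))
    elif isinstance(node, list):
        # Assumption: arrays keep their shape and hold nested leaves.
        for index, child in enumerate(node):
            yield from iter_source_leaves(child, path + (index,))


# Trimmed copy of the "sources" object from the sample response.
sources = {
    "invoice_number": {
        "value": "INV-1032",
        "source": {"content": "INV-1032",
                   "anchor": {"kind": "pdf_bbox", "page": 1}},
    },
    "customer": {
        "name": {
            "value": "Acme Inc.",
            "source": {"content": "Acme Inc.",
                       "anchor": {"kind": "pdf_bbox", "page": 1}},
        }
    },
}

leaves = list(iter_source_leaves(sources))
for field_path, value, source in leaves:
    print(field_path, value, source["anchor"]["kind"])
```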

extraction_id
string
required

ID of the extraction

document_type
enum<string>
required

Detected document type of the source file

Available options: pdf, image, csv, xlsx, docx, txt

file
FileRef · object
required

File metadata (id, filename, mime_type)

extraction
Extraction · object
required

Original extraction output

sources
Sources · object
required

Same shape as extraction but leaves are {value, source} objects
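
Because sources mirrors extraction, replacing every leaf with its value should reproduce the extraction object. The sketch below uses that as a sanity check on a response. It assumes a leaf is any object whose keys are exactly value and source, and that arrays (not shown in the sample) keep their shape; strip_sources is a hypothetical helper name.

```python
def strip_sources(node):
    """Replace each {value, source} leaf with its value, keeping the shape."""
    if isinstance(node, dict) and set(node) == {"value", "source"}:
        return node["value"]
    if isinstance(node, dict):
        return {key: strip_sources(child) for key, child in node.items()}
    if isinstance(node, list):
        return [strip_sources(child) for child in node]
    return node


# Trimmed copies of "extraction" and "sources" from the sample response.
extraction = {"invoice_number": "INV-1032", "customer": {"name": "Acme Inc."}}
sources = {
    "invoice_number": {"value": "INV-1032",
                       "source": {"content": "INV-1032"}},
    "customer": {"name": {"value": "Acme Inc.",
                          "source": {"content": "Acme Inc."}}},
}

matches = strip_sources(sources) == extraction
print(matches)
```

Note that source.content holds the text as it appears in the document ("1,240.00" in the full sample) while value holds the extracted value (1240.0), so the comparison is made on value, not on content.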

object
string
default: extraction.sources

Allowed value: "extraction.sources"