Skip to main content
Functions allow you to define computed fields in your extraction schema—fields whose values are calculated from other extracted data rather than directly pulled from the document. Use Functions to create totals, apply business rules, validate consistency, and derive new values from your extracted data.

Overview

When extracting structured data from documents, you often need values that aren’t explicitly stated but can be computed from other fields. For example:
  • Line item totals: quantity * unit_price
  • Invoice totals: Sum of all line item totals
  • Reconciliation checks: Verify that computed totals match stated totals
  • Conditional values: Apply different logic based on field values
Functions use a formula-based expression language that supports:
  • Arithmetic operations: +, -, *, /
  • Mathematical functions: ABS, SQRT, LOG, POW
  • Aggregation functions: SUM, AVERAGE, MIN, MAX, COUNT
  • Array operations: FILTER, FLATTEN, LOOKUP, MATCH
  • Logical operations: AND, OR, NOT, IF
  • Comparison operators: =, !=, >, >=, <, <=
  • String operations: CONCAT

Defining Computed Fields

Computed fields are defined in the project section of the dashboard (schema editor/functions).

Expression Syntax

Paths and Field References

Use dot notation to reference fields in your extracted data:
ConceptSyntax ExampleDescription
Field pathorder.totalField total inside object order
Array fieldtransactionsArray of transaction objects
Wildcard elementtransactions.*”Each element” of transactions
Field per elementtransactions.*.amountamount field for every element in transactions

Wildcards for Arrays

When working with arrays, use * to reference each element:
# Sum the 'amount' field from each transaction
SUM(transactions.*.amount)

# Equivalent syntax using field projection
SUM(transactions, "amount")

Arithmetic Operators

All arithmetic operators work on single values (scalars) and arrays (element-wise).
OperatorSyntaxDescriptionExample
Adda + bAdd numbersprice + tax
Subtracta - bSubtract b from atotal - discount
Multiplya * bMultiply numbersquantity * unit_price
Dividea / bDivide a by b (null if b = 0)distance / time
Example: Line Item Total
quantity * unit_price
Example: Discounted Price
price * (1 - discount_rate)
Example: Element-wise Array Operation
# Add a $5 fee to each transaction amount
transactions.*.amount + 5

Mathematical Functions

FunctionSyntaxDescriptionExample
ABSABS(x)Absolute valueABS(-10)10
SQRTSQRT(x)Square root (x must be non-negative)SQRT(25)5
LOGLOG(x) or LOG(x, base)Natural or custom-base logarithmLOG(100, 10)2
POWPOW(base, exponent)Raise base to exponent powerPOW(2, 3)8
Example: Euclidean Distance
SQRT(POW(x2 - x1, 2) + POW(y2 - y1, 2))
Example: Compound Interest
principal * POW(1 + rate, years)

Aggregation Functions

Aggregation functions reduce arrays to single values.
FunctionSyntaxDescriptionExample
SUMSUM(array, "field")Sum all valuesSUM(items, "price")
AVERAGEAVERAGE(array, "field")Arithmetic meanAVERAGE(scores, "points")
MINMIN(array, "field")Smallest valueMIN(items, "price")
MAXMAX(array, "field")Largest valueMAX(items, "rating")
COUNTCOUNT(array, value)Count occurrencesCOUNT(tasks.*.status, "done")
Two equivalent syntaxes:
# Using wildcard path
SUM(items.*.price)

# Using field projection (recommended)
SUM(items, "price")
Example: Invoice Total
SUM(line_items, "total")
Example: Average Order Value
AVERAGE(orders, "amount")
Example: Count Completed Tasks
COUNT(tasks.*.status, "completed")

Array Functions

FunctionSyntaxDescriptionExample
FILTERFILTER(array, "field", comparator, value)Keep elements matching conditionFILTER(items, "active", "=", true)
FLATTENFLATTEN(nestedArray) or FLATTEN(array, "field")Merge nested arrays into oneFLATTEN(groups, "items")
LOOKUPLOOKUP(array, "field", comparator, value, [proj])Find first match, optionally projectLOOKUP(users, "id", "=", 5, "name")
MATCHMATCH(array, "field", comparator, value)Get index of first match (-1 if none)MATCH(items, "status", "=", "pending")

FILTER

Keep only elements that satisfy a condition. Comparators: "=", "!=", ">", ">=", "<", "<=", "contains"
# All products over $100
FILTER(products, "price", ">", 100)

# All active users
FILTER(users, "status", "=", "active")

# Products containing "Pro" in name
FILTER(products, "name", "contains", "Pro")
Example: Sum Only Active Items
SUM(FILTER(items, "active", "=", true), "price")

FLATTEN

Merge nested arrays into a single flat array.
# Flatten array of arrays
FLATTEN([[1, 2], [3, 4]])  # → [1, 2, 3, 4]

# Flatten array field from objects
FLATTEN(orders, "items")
# If orders = [{items: [1,2]}, {items: [3,4]}]
# Result: [1, 2, 3, 4]

LOOKUP

Find the first matching element in an array.
# Get full user object with id = 5
LOOKUP(users, "id", "=", 5)

# Get just the name of user with id = 5
LOOKUP(users, "id", "=", 5, "name")

MATCH

Get the index (position) of the first matching element.
# Index of first pending task (0-based, -1 if not found)
MATCH(tasks, "status", "=", "pending")

Comparison Operators

OperatorSyntaxFunction FormDescriptionExample
Equala = b, a == bEQ(a, b)True if equalstatus = "active"
Not Equala != bTrue if differentrole != "admin"
Greatera > bGT(a, b)True if a > bprice > 100
Greater/Equala >= bGTE(a, b)True if a >= bquantity >= 10
Lessa < bLT(a, b)True if a < bage < 18
Less/Equala <= bLTE(a, b)True if a <= bscore <= 100
Inside array functions, use comparator strings:
FILTER(users, "role", "=", "admin")
FILTER(products, "price", ">", 100)
FILTER(items, "name", "contains", "Pro")

Logical Operators

FunctionSyntaxDescriptionExample
ANDAND(cond1, cond2, ...)True if all conditions trueAND(user.active, user.verified)
OROR(cond1, cond2, ...)True if any condition trueOR(priority = "high", overdue)
NOTNOT(cond) or !condInvert boolean valueNOT(user.active) or !verified
Example: Complex Condition
AND(user.active, user.verified, user.age >= 18)

Conditional Logic (IF)

The IF function returns different values based on a condition. Syntax: IF(condition, value_if_true, value_if_false)
# Label based on price
IF(price > 100, "expensive", "affordable")

# Default for missing field
IF(discount != null, discount, 0)

# Tiered pricing
IF(quantity >= 100, price * 0.8, IF(quantity >= 50, price * 0.9, price))
Array Behavior: When the condition is an array, IF returns an array of results:
# For each item, check if price > 50
IF(items.*.price > 50, "premium", "standard")
# Returns: ["premium", "standard", "premium", ...]

String Functions

FunctionSyntaxDescriptionExample
CONCATCONCAT(part1, part2, ..., partN)Join values into a stringCONCAT("Hello ", name, "!") → “Hello Alice!”
Non-string values are automatically converted. Null values become empty strings.
# Build order reference
CONCAT(order.prefix, "-", order.number)
# e.g., "ORD-12345"

# Format full name
CONCAT(first_name, " ", last_name)

Null Handling

SituationBehavior
Numeric op with nullTreated as 0 (e.g., 5 + null5)
Logical op with nullTreated as false
Missing fieldReturns null; use IF for safe defaults
Safe default pattern:
IF(field != null, field, 0)

Real-World Examples

Example 1: Invoice Reconciliation

Verify that the sum of all property totals matches the check amount, within 1% tolerance.
ABS(SUM(properties, "total_property_value") - check.amount) 
<= 0.01 * check.amount
Breakdown:
  1. SUM(properties, "total_property_value") — Sum all property totals
  2. - check.amount — Subtract the check amount
  3. ABS(...) — Get absolute difference (error)
  4. <= 0.01 * check.amount — Check if error is within 1%

Example 2: Filtered Aggregation

Sum only deposits from a transaction list:
SUM(FILTER(transactions, "type", "=", "deposit"), "amount")
Breakdown:
  1. FILTER(transactions, "type", "=", "deposit") — Get only deposit transactions
  2. SUM(..., "amount") — Sum their amounts

Example 3: Line Item Validation

For each line item, check if owner_value matches computed value (within 1%):
IF(
    line_items.*.owner_value <= 0,
    true,
    ABS(
        line_items.*.property_value * line_items.*.distribution_interest
        - line_items.*.owner_value
    ) <= line_items.*.owner_value * 0.01
)
Breakdown:
  1. Skip validation for non-positive values (taxes, deductions)
  2. Compute expected value: property_value * distribution_interest
  3. Compare to actual owner_value with 1% tolerance

Example 4: Nested Array Aggregation

Sum values across all nested arrays:
SUM(FLATTEN(orders, "items"), "price")
Breakdown:
  1. FLATTEN(orders, "items") — Collect all items from all orders
  2. SUM(..., "price") — Sum all prices

Example 5: Conditional Labeling

Categorize orders by size:
IF(
    total >= 10000,
    "enterprise",
    IF(total >= 1000, "business", "personal")
)

Best Practices

1. Use Field Projection Syntax

Prefer SUM(items, "price") over SUM(items.*.price) for clarity.

2. Handle Nulls Explicitly

# Bad: may fail on null
quantity * unit_price

# Good: safe default
IF(quantity != null, quantity, 0) * IF(unit_price != null, unit_price, 0)

3. Use Tolerance for Comparisons

Floating-point arithmetic can cause tiny differences. Use tolerance checks:
# Bad: exact comparison
computed_total = stated_total

# Good: tolerance comparison
ABS(computed_total - stated_total) <= stated_total * 0.01

4. Build Complex Expressions Incrementally

Define intermediate computed fields for readability:
# In line_item schema:
line_total: "quantity * unit_price"

# In order schema:
subtotal: "SUM(line_items, \"line_total\")"
tax: "subtotal * tax_rate"
total: "subtotal + tax"

5. Use FILTER for Conditional Aggregation

# Sum only active items
SUM(FILTER(items, "active", "=", true), "price")

# Count items by status
COUNT(FILTER(items, "status", "=", "pending"), "_")

Common Patterns

Header vs. Detail Reconciliation

# Check: sum of line items = header total
ABS(SUM(line_items, "amount") - header_total) <= header_total * 0.01

Cross-Reference Validation

# Check: computed value matches stated value
ABS(price * quantity * interest - owner_value) <= owner_value * 0.01

Conditional Processing

# Different calculation based on type
IF(
    type = "percentage",
    base_amount * rate / 100,
    rate
)

Aggregate with Multiple Conditions

# Sum only active, taxable items over $100
SUM(
    FILTER(
        FILTER(items, "active", "=", true),
        "price", ">", 100
    ),
    "price"
)

Quick Reference

Arithmetic

OperationExampleResult
Add5 + 38
Subtract10 - 46
Multiply6 * 742
Divide15 / 35

Math Functions

FunctionExampleResult
ABSABS(-10)10
SQRTSQRT(16)4
POWPOW(2, 3)8
LOGLOG(100, 10)2

Aggregation

FunctionExampleDescription
SUMSUM(items, "price")Total of all prices
AVERAGEAVERAGE(scores, "value")Mean of all values
MINMIN(items, "price")Lowest price
MAXMAX(items, "rating")Highest rating
COUNTCOUNT(tasks.*.status, "done")Count of “done”

Array Functions

FunctionExampleDescription
FILTERFILTER(items, "active", "=", true)Keep active items
FLATTENFLATTEN(orders, "items")Merge all items
LOOKUPLOOKUP(users, "id", "=", 5)Find user by id
MATCHMATCH(items, "status", "=", "pending")Index of first match

Comparators for Array Functions

ComparatorMeaningExample
"="EqualFILTER(users, "role", "=", "admin")
"!="Not equalFILTER(users, "role", "!=", "admin")
">"Greater thanFILTER(items, "price", ">", 100)
">="Greater or equalFILTER(orders, "total", ">=", 1000)
"<"Less thanFILTER(items, "priority", "<", 3)
"<="Less or equalFILTER(scores, "value", "<=", 50)
"contains"Substring matchFILTER(products, "name", "contains", "Pro")

Go Further

  • Extraction - Learn how to extract structured data
  • Reasoning - Add step-by-step reasoning for complex calculations
  • Schema - Design your extraction schemas