Key Concepts

Understanding the core concepts will help you work effectively with the bizsupply API.

Core Entities

Document

A document is any file processed through bizsupply.

Key Properties:

| Property | Description | Example |
| --- | --- | --- |
| `id` | Unique identifier | `01HQZX3K4M2N5P7Q8R9S0T1U2V` |
| `metadata` | Source information | `{"source": "gmail", "sender": "[email protected]"}` |
| `labels` | Classification tags | `["invoice", "utility"]` |
| `data` | Extracted structured data | `{"total": 1500.00, "vendor": "Acme"}` |

Lifecycle:

Created by a Source plugin (ingested from Gmail, etc.)
Labeled by a Classification plugin (tagged as "invoice", etc.)
Data extracted by an Extraction plugin (total, date, vendor, etc.)
Related to other documents by an Aggregation plugin
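
The first three lifecycle steps can be sketched with a plain dataclass. This is a minimal illustration, not the SDK's document model: the `Document` class and the `ingest`, `label`, and `extract` helpers are hypothetical names, and only the four property names (`id`, `metadata`, `labels`, `data`) come from the table above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    """Hypothetical stand-in mirroring the four documented properties."""
    id: str
    metadata: dict[str, Any] = field(default_factory=dict)
    labels: list[str] = field(default_factory=list)
    data: dict[str, Any] = field(default_factory=dict)


def ingest(doc_id: str, source: str) -> Document:
    # Step 1: a Source plugin creates the document.
    return Document(id=doc_id, metadata={"source": source})


def label(doc: Document, tag: str) -> Document:
    # Step 2: a Classification plugin adds a label.
    doc.labels.append(tag)
    return doc


def extract(doc: Document, values: dict[str, Any]) -> Document:
    # Step 3: an Extraction plugin fills in structured data.
    doc.data.update(values)
    return doc


doc = ingest("01HQZX3K4M2N5P7Q8R9S0T1U2V", source="gmail")
doc = label(doc, "invoice")
doc = extract(doc, {"total": 1500.00, "vendor": "Acme"})
print(doc.labels, doc.data)
```

Each stage only adds to the document, so a later stage can rely on what earlier stages wrote.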

Plugin

Plugins are Python classes that process documents. They are the core extensibility mechanism.

Four Plugin Types:

| Type | Base Class | Method | Return Type |
| --- | --- | --- | --- |
| Source | `SourcePlugin` | `fetch()` | `AsyncIterator[DocumentInput]` |
| Classification | `ClassificationPlugin` | `classify()` | `str \| None` |
| Extraction | `ExtractionPlugin` | `extract()` | `ExtractionResult` |
| Aggregation | `BaseBenchmark` | `score()`, `compute()`, `compare()` | `float \| None`, `float`, `bool` |

Plugin Components:

Code (`plugin.py`): A Python class that contains your processing logic, with metadata defined as class attributes. Build it with the `bizsupply-sdk` package (`pip install bizsupply-sdk`).
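
As a rough sketch of what such a class might look like, the example below defines a classification plugin. It deliberately does not import the SDK: `ClassificationPlugin` here is a local stand-in, and the `name` and `version` attributes, the `KEYWORDS` tuple, and the `text` argument to `classify()` are all assumptions made for illustration. Only the method name and its `str | None` return type come from the table above; check the SDK reference for the real signature.

```python
from __future__ import annotations


class ClassificationPlugin:
    """Local stand-in for the SDK base class (assumption, not the real import)."""

    def classify(self, text: str) -> str | None:
        raise NotImplementedError


class InvoiceClassifier(ClassificationPlugin):
    # Metadata as class attributes; these attribute names are hypothetical.
    name = "invoice-classifier"
    version = "0.1.0"

    KEYWORDS = ("invoice", "amount due", "bill to")

    def classify(self, text: str) -> str | None:
        # Return a label when any keyword matches, otherwise None.
        lowered = text.lower()
        if any(keyword in lowered for keyword in self.KEYWORDS):
            return "invoice"
        return None


plugin = InvoiceClassifier()
print(plugin.classify("INVOICE #42, amount due: 1500.00"))
print(plugin.classify("Meeting notes for Tuesday"))
```

Returning `None` rather than a default label leaves the document unlabeled when nothing matches.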

Execution Order:

Plugins always execute in this order: Source → Classification → Extraction → Aggregation
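
One way to picture this ordering is a stable sort by stage. The sketch below is illustrative only: the dict-based plugin records and the `order_plugins` helper are hypothetical, and the lowercase stage names are chosen here for convenience.

```python
# Stage order taken from the text; the dict-based plugin records are hypothetical.
STAGE_ORDER = ["source", "classification", "extraction", "aggregation"]


def order_plugins(plugins: list[dict]) -> list[dict]:
    """Sort plugin records by stage, keeping the original order within a stage."""
    rank = {stage: position for position, stage in enumerate(STAGE_ORDER)}
    return sorted(plugins, key=lambda plugin: rank[plugin["type"]])


plugins = [
    {"name": "Invoice Extractor", "type": "extraction"},
    {"name": "Gmail Source", "type": "source"},
    {"name": "Invoice Classifier", "type": "classification"},
]
ordered = order_plugins(plugins)
print([plugin["name"] for plugin in ordered])
```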

Ontology

An ontology defines what to classify and extract from documents.

Two Parts:

Taxonomy - Hierarchical labels for classification
Fields - Data fields to extract for each label

Example:

name: "Invoice Ontology"
description: "Schema for invoice processing"
taxonomy:
  label: "invoice"
  fields:
    - name: "invoice_total"
      type: "number"
      required: true
    - name: "invoice_date"
      type: "date"
      required: true
    - name: "vendor_name"
      type: "string"
      required: true
  children:
    - label: "utility_invoice"
      fields:
        - name: "utility_type"
          type: "string"
          required: true

Usage:

Classification plugins use the taxonomy to apply appropriate labels
Extraction plugins use the fields to know what data to extract
Multiple ontologies can be combined in a single pipeline
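
To make the taxonomy and field lookup concrete, here is a minimal sketch that mirrors the example ontology as a Python dict, finds the node for a label, and reports which required fields are missing from extracted data. The `find_node` and `missing_required` helpers are hypothetical and not part of the SDK. The sketch checks only a label's own fields; whether child labels inherit fields from their parent is not stated here, so no inheritance is assumed.

```python
from __future__ import annotations

# The example ontology above, written as a Python dict.
ONTOLOGY = {
    "name": "Invoice Ontology",
    "taxonomy": {
        "label": "invoice",
        "fields": [
            {"name": "invoice_total", "type": "number", "required": True},
            {"name": "invoice_date", "type": "date", "required": True},
            {"name": "vendor_name", "type": "string", "required": True},
        ],
        "children": [
            {
                "label": "utility_invoice",
                "fields": [
                    {"name": "utility_type", "type": "string", "required": True},
                ],
            }
        ],
    },
}


def find_node(node: dict, label: str) -> dict | None:
    """Depth-first search of the taxonomy for a node with the given label."""
    if node["label"] == label:
        return node
    for child in node.get("children", []):
        found = find_node(child, label)
        if found is not None:
            return found
    return None


def missing_required(label: str, data: dict) -> list[str]:
    """Names of required fields for `label` that are absent from `data`."""
    node = find_node(ONTOLOGY["taxonomy"], label)
    if node is None:
        raise KeyError(f"unknown label: {label}")
    return [
        spec["name"]
        for spec in node.get("fields", [])
        if spec["required"] and spec["name"] not in data
    ]


print(missing_required("invoice", {"invoice_total": 1500.00, "vendor_name": "Acme"}))
print(missing_required("utility_invoice", {"utility_type": "electricity"}))
```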

Pipeline

A pipeline is a configured workflow combining plugins and ontologies.

Components:

| Component | Description |
| --- | --- |
| `plugin_ids` | Ordered list of plugins to execute |
| `ontology_catalogs_ids` | Ontologies to use for classification and extraction |
| `source_ids` | (Optional) Specific sources to process |

Example:

Pipeline: "Gmail Invoice Processing"
  ├─ Plugins:
  │   1. Gmail Source (ingest emails with attachments)
  │   2. Invoice Classifier (detect and label invoices)
  │   3. Invoice Extractor (extract total, date, vendor)
  └─ Ontologies:
      • Invoice Ontology
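
The same pipeline could be written as a configuration object along these lines. This is a sketch under assumptions: every ID is a made-up placeholder, the `check_pipeline` helper is hypothetical, and treating `plugin_ids` and `ontology_catalogs_ids` as required is inferred only from `source_ids` being marked optional.

```python
import json

# All IDs are made-up placeholders; only the three keys come from the table above.
pipeline = {
    "plugin_ids": [
        "plugin-gmail-source",
        "plugin-invoice-classifier",
        "plugin-invoice-extractor",
    ],
    "ontology_catalogs_ids": ["ontology-invoice"],
    # "source_ids" is optional and omitted here.
}

# Assumption: the two keys not marked optional are required.
REQUIRED_KEYS = {"plugin_ids", "ontology_catalogs_ids"}
OPTIONAL_KEYS = {"source_ids"}


def check_pipeline(config: dict) -> list[str]:
    """Return a list of problems found in a pipeline config; empty means OK."""
    present = set(config)
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - present)]
    problems += [
        f"unknown key: {key}"
        for key in sorted(present - REQUIRED_KEYS - OPTIONAL_KEYS)
    ]
    return problems


print(check_pipeline(pipeline))
print(json.dumps(pipeline, indent=2))
```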

Job

Executing a pipeline creates a job, which tracks the processing.

Job States:

| State | Description |
| --- | --- |
| `pending` | Job created, waiting to start |
| `running` | Currently processing documents |
| `completed` | Finished successfully |
| `failed` | Encountered an error |

Job Information:

Documents processed count
Current plugin being executed
Start/end timestamps
Error details (if failed)
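
Because `completed` and `failed` are the two end states, a client would typically poll until it sees one of them. The sketch below shows that loop in isolation: `wait_for_job` is a hypothetical helper, and `get_state` stands in for whatever call returns the job's current state, simulated here with a fixed sequence instead of a real API request.

```python
import time
from typing import Callable

TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    get_state: Callable[[], str],
    interval: float = 1.0,
    max_polls: int = 60,
) -> str:
    """Poll `get_state` until the job reaches a terminal state or polls run out."""
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError("job did not finish within the polling budget")


# Simulated job that moves pending -> running -> completed.
states = iter(["pending", "running", "running", "completed"])
final_state = wait_for_job(lambda: next(states), interval=0.0)
print(final_state)
```

Capping the number of polls keeps a stuck job from blocking the caller indefinitely.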

Credential

Credentials connect bizsupply to external services.

Supported Types:

OAuth2 (Gmail, Outlook):

{
  "client_id": "your-client-id",
  "client_secret": "your-client-secret",
  "refresh_token": "your-refresh-token"
}

IMAP (Email servers):

{
  "host": "imap.gmail.com",
  "port": 993,
  "username": "[email protected]",
  "password": "your-app-password",
  "use_ssl": true
}

API Key (Custom APIs):

{
  "api_key": "your-api-key",
  "api_url": "https://api.example.com"
}

Credentials are stored securely and never exposed in API responses.
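
Before submitting a credential, it can help to check its shape locally and to mask secrets in your own logs. The following is a minimal sketch: the type keys (`oauth2`, `imap`, `api_key`), the choice of which keys count as secret, and both helper functions are assumptions for illustration, and the field lists are simply copied from the examples above rather than taken from a published schema.

```python
# Keys per credential type, copied from the examples above (assumed all required).
REQUIRED_FIELDS = {
    "oauth2": {"client_id", "client_secret", "refresh_token"},
    "imap": {"host", "port", "username", "password", "use_ssl"},
    "api_key": {"api_key", "api_url"},
}

# Keys treated as secret in this sketch (an assumption).
SECRET_KEYS = {"client_secret", "refresh_token", "password", "api_key"}


def missing_fields(kind: str, payload: dict) -> set[str]:
    """Return the expected keys that are absent from the payload."""
    return REQUIRED_FIELDS[kind] - set(payload)


def redact(payload: dict) -> dict:
    """Copy the payload with secret values masked, e.g. for logging."""
    return {
        key: "***" if key in SECRET_KEYS else value
        for key, value in payload.items()
    }


imap = {
    "host": "imap.gmail.com",
    "port": 993,
    "username": "user@example.com",
    "password": "your-app-password",
}
print(missing_fields("imap", imap))
print(redact(imap))
```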

Relationships

User
  ├─ owns Documents
  ├─ owns Plugins
  ├─ owns Ontologies
  ├─ owns Pipelines
  └─ has Credentials

Pipeline
  ├─ references Plugins (in execution order)
  ├─ references Ontologies
  └─ creates Jobs when executed

Job
  ├─ executes a Pipeline
  ├─ processes Documents
  └─ tracks status and results

Document
  ├─ has labels (from Classification)
  ├─ has data (from Extraction)
  └─ can relate to other Documents

Data Isolation

Every resource belongs to a specific user:

You can only access your own documents, plugins, and pipelines
All API operations are automatically scoped to your user context
Data is completely isolated between users

Next Steps

Install the SDK → pip install bizsupply-sdk
Build a plugin → Create a Plugin
Define extraction schemas → Create an Ontology
Process documents → Process Documents