© Copyright 2025 Parsie. All Rights Reserved.

Getting started with Parsie API

Learn how to get started with Parsie API using our latest endpoints.

Parsie is an Intelligent Document Processing platform that automates the extraction and processing of data from your documents. Use the endpoints below to create tables, upload files to them, manage your table schemas, and retrieve data.

Getting started

  1. Sign up for an account on the Parsie website.
  2. Obtain your API key from the Parsie dashboard.

Authentication

All API requests require an API key. Include your API key in the x-api-key header:

x-api-key: YOUR_API_KEY
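
As a minimal sketch, the header can be attached in Python with the standard library alone. The helper name build_request is hypothetical, and the base URL is the one given in the Base URL section. The request object is only constructed here; nothing is sent until it is passed to urllib.request.urlopen.

```python
import urllib.request

BASE_URL = "https://parsie.pro"  # from the Base URL section of this page


def build_request(path, api_key, method="GET"):
    """Return an unsent request that carries the x-api-key header.

    build_request is a hypothetical helper, not part of any Parsie SDK.
    """
    return urllib.request.Request(
        BASE_URL + path,
        headers={"x-api-key": api_key},
        method=method,
    )


req = build_request("/api/tables", "YOUR_API_KEY")
# Sending it would be: urllib.request.urlopen(req)
```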

Supported File Types

Parsie currently supports the following file types for extraction:

  • PDF documents (.pdf)
  • Microsoft Office documents: Word (.doc, .docx), Excel (.xls, .xlsx)
  • Image files: PNG (.png) and JPEG (.jpg, .jpeg)

Maximum file size: 4MB per upload.
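
A client-side check before uploading could look like the sketch below. The function name check_upload is hypothetical, and treating "4MB" as 4 * 1024 * 1024 bytes is an assumption; the page does not say whether the limit is counted in binary or decimal megabytes, or whether it applies before or after base64 encoding.

```python
import os

ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".png", ".jpg", ".jpeg"}
MAX_FILE_BYTES = 4 * 1024 * 1024  # assumption: "4MB" is read as 4 MiB


def check_upload(file_name, size_in_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_in_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_in_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return problems
```

The server remains the authority on what it accepts; a check like this only avoids obviously doomed uploads.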

API Endpoints

Note: For detailed information about table schema structure and limitations, see the Table Schema Structure section at the end of this document.

Base URL

https://parsie.pro

List Tables

Endpoint:
GET {{baseUrl}}/api/tables

Description:
Retrieves a list of all tables associated with your account.

Headers:

  • x-api-key: YOUR_API_KEY

Response:

[
  {
    "id": "931637d2-ca0a-47be-aae3-aafd262cdaf0",
    "name": "Invoice Table",
    "description": "Table for storing invoice data"
  }
]
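
Most other endpoints take a table ID, so a common next step is to look one up by name in this response. The sketch below works on the already-decoded JSON list; find_table_id is a hypothetical helper, and it assumes an exact, case-sensitive name match is what you want.

```python
def find_table_id(tables, name):
    """Return the id of the first table with this exact name, or None."""
    for table in tables:
        if table.get("name") == name:
            return table["id"]
    return None


tables = [
    {
        "id": "931637d2-ca0a-47be-aae3-aafd262cdaf0",
        "name": "Invoice Table",
        "description": "Table for storing invoice data",
    }
]
table_id = find_table_id(tables, "Invoice Table")
```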

Create Table

Endpoint:
POST {{baseUrl}}/api/tables

Description:
Creates a new table with a defined schema for storing extracted data.

Schema Details: The definition field must follow the schema structure and limitations described in the Table Schema Structure section.

Headers:

  • x-api-key: YOUR_API_KEY

Request Body:

{
  "name": "Invoice Table",
  "description": "Table for storing invoice data",
  "definition": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
      "name": "Invoice Table",
      "description": "Table for storing invoice data",
      "columns": [
        {
          "id": "invoice_number",
          "title": "Invoice Number",
          "type": "string",
          "description": "Unique identifier for the invoice",
          "examples": "INV-100",
          "isList": false,
          "properties": []
        }
      ]
    },
    "required": ["name", "description", "columns"],
    "additionalProperties": false
  }
}

Response:

{
  "id": "931637d2-ca0a-47be-aae3-aafd262cdaf0",
  "name": "Invoice Table",
  "description": "Table for storing invoice data",
  "definition": { ... }
}
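
The request body repeats the table name and description inside definition and requires every column field to be present. One way to avoid typing that by hand is sketched below; make_column and make_create_table_body are hypothetical helpers that simply mirror the example body shown above.

```python
import json


def make_column(column_id, title, column_type, description, examples,
                is_list=False, properties=None):
    """Build one column entry with every field the schema marks as required."""
    return {
        "id": column_id,
        "title": title,
        "type": column_type,
        "description": description,
        "examples": examples,
        "isList": is_list,
        "properties": properties or [],
    }


def make_create_table_body(name, description, columns):
    """Assemble the JSON body for POST /api/tables."""
    return {
        "name": name,
        "description": description,
        "definition": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": {
                "name": name,
                "description": description,
                "columns": columns,
            },
            "required": ["name", "description", "columns"],
            "additionalProperties": False,
        },
    }


body = make_create_table_body(
    "Invoice Table",
    "Table for storing invoice data",
    [make_column("invoice_number", "Invoice Number", "string",
                 "Unique identifier for the invoice", "INV-100")],
)
payload = json.dumps(body)  # string to send as the request body
```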

Delete Table

Endpoint:
DELETE {{baseUrl}}/api/tables

Description:
Deletes an existing table by providing its unique ID.

Headers:

  • x-api-key: YOUR_API_KEY

Request Body:

{
  "id": "284bd9dd-0c10-4e67-aab7-bf9f1af3e5c1"
}

Response:

{
  "message": "Table with id 284bd9dd-0c10-4e67-aab7-bf9f1af3e5c1 has been deleted"
}
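
Unlike most DELETE endpoints, this one takes the table ID in a JSON body rather than in the URL. A sketch of building such a request with the Python standard library follows; build_delete_table_request is a hypothetical helper, and sending a Content-Type header here is an assumption carried over from the curl examples for the POST endpoints.

```python
import json
import urllib.request


def build_delete_table_request(base_url, api_key, table_id):
    """Return an unsent DELETE request whose JSON body names the table."""
    return urllib.request.Request(
        base_url + "/api/tables",
        data=json.dumps({"id": table_id}).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "Content-Type": "application/json",  # assumption, see lead-in
        },
        method="DELETE",
    )


req = build_delete_table_request(
    "https://parsie.pro", "YOUR_API_KEY", "284bd9dd-0c10-4e67-aab7-bf9f1af3e5c1"
)
# Sending it would be: urllib.request.urlopen(req)
```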

Get Table Schema

Endpoint:
GET {{baseUrl}}/api/tables/{tableId}/schema

Description:
Retrieves the schema definition for a specified table. Replace {tableId} with the table's ID.

Headers:

  • x-api-key: YOUR_API_KEY

Response:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": "Invoice Table",
    "description": "Table for storing invoice data",
    "columns": [
      {
        "id": "invoice_number",
        "title": "Invoice Number",
        "type": "string",
        "description": "Unique identifier for the invoice",
        "examples": "INV-100",
        "isList": false,
        "properties": []
      }
    ]
  },
  "required": ["name", "description", "columns"],
  "additionalProperties": false
}

Update Table Schema

Endpoint:
PUT {{baseUrl}}/api/tables/{tableId}/schema

Description:
Updates the schema definition for a specified table. Replace {tableId} with the table's ID.

Schema Details: The definition field must follow the schema structure and limitations described in the Table Schema Structure section.

Headers:

  • x-api-key: YOUR_API_KEY

Request Body:

{
  "definition": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
      "name": "Invoice Table",
      "description": "Table for storing invoice data",
      "columns": [
        {
          "id": "invoice_number",
          "title": "Invoice Number",
          "type": "string",
          "description": "Unique identifier for the invoice",
          "examples": "INV-100",
          "isList": false,
          "properties": []
        }
      ]
    },
    "required": ["name", "description", "columns"],
    "additionalProperties": false
  }
}

Response:

{
  "success": true
}
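
Because the request carries a complete definition, a typical update is "fetch the schema, change it, send it back". The sketch below covers the middle step for adding one column. with_added_column is a hypothetical helper, and it assumes the PUT replaces the whole definition, which the page implies but does not state outright.

```python
import copy


def with_added_column(schema, column):
    """Return a PUT body whose definition is `schema` plus one more column."""
    updated = copy.deepcopy(schema)  # leave the caller's copy untouched
    columns = updated["properties"]["columns"]
    if any(existing["id"] == column["id"] for existing in columns):
        raise ValueError(f"column id already exists: {column['id']}")
    columns.append(column)
    return {"definition": updated}


# `current` stands in for the response of GET /api/tables/{tableId}/schema.
current = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "name": "Invoice Table",
        "description": "Table for storing invoice data",
        "columns": [{"id": "invoice_number", "title": "Invoice Number",
                     "type": "string",
                     "description": "Unique identifier for the invoice",
                     "examples": "INV-100", "isList": False, "properties": []}],
    },
    "required": ["name", "description", "columns"],
    "additionalProperties": False,
}
put_body = with_added_column(current, {
    "id": "invoice_date", "title": "Invoice Date", "type": "string",
    "description": "The date the invoice was issued",
    "examples": "2023-01-15", "isList": False, "properties": [],
})
```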

Get Table Data by Row IDs

Endpoint:
POST {{baseUrl}}/api/tables/{tableId}/data

Description:
Retrieves data for specific rows from a table.

Headers:

  • x-api-key: YOUR_API_KEY

Request Body:

{
  "rowIds": ["929e7f55-f210-4639-b300-0179ad5820bd"]
}

Example (curl):

curl -X POST "{{baseUrl}}/api/tables/{tableId}/data" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "rowIds": ["929e7f55-f210-4639-b300-0179ad5820bd"]
}'

Response (200):

[
  {
    "id": "929e7f55-f210-4639-b300-0179ad5820bd",
    "invoice_number": "INV-100",
    "invoice_date": "2023-01-15",
    // ... additional fields ...
  }
]

Submit Extraction Job

Endpoint: POST {{baseUrl}}/api/tables/{tableId}/extract-data

Description: Submits a document file (base64-encoded) for processing. Returns a job ID for polling and the number of seconds to wait before retrying or polling.

Supported file types: PDF, PNG, JPEG, DOC, DOCX, XLS, XLSX.

Headers:

  • x-api-key: YOUR_API_KEY
  • Content-Type: application/json

Request Body:

{
  "fileName": "sample.pdf",
  "base64String": "JVBERi0xLjcNJeLjz9MNCjI1NCAwIG9...",
  "fileSize": 123456           // optional, size of file in bytes
}

Note: The fileName field should have an appropriate extension (.pdf, .png, .jpg, .jpeg, .doc, .docx, .xls, .xlsx). Maximum allowed file size: 4MB.

Response (202 Accepted):
Headers:

Location: /api/jobs/{jobId}
Retry-After: 2

Body:

{
  "message": "Extraction job submitted successfully",
  "jobId": "abcd1234-5678-90ef-ghij-klmnopqrstuv",
  "retryAfter": 2
}

Example (curl):

curl -X POST "{{baseUrl}}/api/tables/{tableId}/extract-data" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "fileName": "sample.pdf",
  "base64String": "JVBERi0xLjcNJeLjz9MNCjI1NCAwIG9..."
}'
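
The base64String value is the file's raw bytes, base64-encoded. A sketch of producing the request body in Python is shown below; make_extraction_body is a hypothetical helper, and the sample bytes are a stand-in for real file contents.

```python
import base64
import os


def make_extraction_body(file_name, file_bytes):
    """Build the JSON body for POST /api/tables/{tableId}/extract-data."""
    return {
        "fileName": os.path.basename(file_name),  # keep the extension, drop the path
        "base64String": base64.b64encode(file_bytes).decode("ascii"),
        "fileSize": len(file_bytes),  # optional field, size in bytes
    }


# In practice file_bytes would come from open(path, "rb").read().
body = make_extraction_body("docs/sample.pdf", b"%PDF-1.7 example bytes")
```

Base64 output is roughly a third larger than the input, so the JSON request is noticeably bigger than the file itself.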

Poll Extraction Job Status

Endpoint: GET {{baseUrl}}/api/jobs/{jobId}

Description: Retrieves the status of an extraction job. Once the job has completed, the response also includes the result and extra information: the number of pages extracted and extraction feedback with field-level confidence scores and suggestions for schema improvements.

Headers:

  • x-api-key: YOUR_API_KEY

Responses:

  • Pending or Running
    Headers:

    Retry-After: 2
    

    Body:

    {
      "status": "pending" /* or "running" */
    }
    
  • Completed
    Body:

    {
      "status": "completed",
      "result": {
        "id": "535d2c2a-78a9-41ea-9f88-fdaad413baf0",
        /* ... extracted data fields for the row ... */
      },
      "extra": {
         "pagesExtracted": 2,
         "schemaImprovementsSuggestions": {
          "generalFeedback": "General feedback for improving the schema to achieve better accuracy.",
          "fieldFeedback": [
           {
            "fieldPath": "invoice.items[0].name",
            "confidence": 0.5,
            "suggestion": "Suggested improvement for the field in the schema.",
            "ambiguityDetail": "Ambiguous details or difficulty that the model faced in extracting the field."
           },
           {
            "fieldPath": "invoice.items[1].unit_price",
            "confidence": 0.3,
            "suggestion": "Suggested improvement for the field in the schema.",
            "ambiguityDetail": "Ambiguous details or difficulty that the model faced in extracting the field."
           }
          ]
         }
      }
    }
    
  • Failed
    Body:

    {
      "status": "failed",
      "error": "Error message"
    }
    

Example (curl):

curl -X GET "{{baseUrl}}/api/jobs/{jobId}" \
  -H "x-api-key: YOUR_API_KEY"
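
The polling flow above can be sketched as a small loop that keeps asking until the status is "completed" or "failed", waiting for the Retry-After interval in between. The names wait_for_job and low_confidence_fields are hypothetical, the HTTP call is left to a fetch_status function you supply, and the fallback wait of 2 seconds, the cap of 30 polls, and the 0.5 confidence threshold are all assumptions rather than documented values.

```python
import time


def wait_for_job(fetch_status, sleep=time.sleep, max_polls=30):
    """Poll until the job completes or fails.

    fetch_status() must perform GET /api/jobs/{jobId} and return
    (body_dict, retry_after_seconds_or_None).
    """
    for _ in range(max_polls):
        body, retry_after = fetch_status()
        status = body.get("status")
        if status == "completed":
            return body["result"], body.get("extra", {})
        if status == "failed":
            raise RuntimeError(body.get("error", "extraction failed"))
        # "pending" or "running": wait as long as the server asked.
        sleep(retry_after if retry_after is not None else 2)
    raise TimeoutError(f"job still not finished after {max_polls} polls")


def low_confidence_fields(extra, threshold=0.5):
    """List (fieldPath, suggestion) pairs whose confidence is below threshold."""
    suggestions = extra.get("schemaImprovementsSuggestions", {})
    return [
        (item["fieldPath"], item["suggestion"])
        for item in suggestions.get("fieldFeedback", [])
        if item["confidence"] < threshold
    ]
```

Passing fetch_status and sleep in as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise without a network connection.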

Get Extraction Usage

Endpoint:
GET {{baseUrl}}/api/usage

Description:
Retrieves current extraction usage (allocated and used pages) for your account.

Headers:

  • x-api-key: YOUR_API_KEY

Response (200):

{
  "success": true,
  "data": {
    "allocated_pages": 100,
    "used_pages": 20
  }
}

Error Responses:

  • 401 Unauthorized - Invalid or missing API key.
  • 500 Internal Server Error - Failed to get extraction usage.

Example (curl):

curl -X GET "{{baseUrl}}/api/usage" \
  -H "x-api-key: YOUR_API_KEY"
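
A typical use of this response is to check how many pages are left before submitting more documents. The sketch below works on the decoded JSON body; remaining_pages is a hypothetical helper.

```python
def remaining_pages(usage_response):
    """Return pages left in the allocation, never below zero."""
    if not usage_response.get("success"):
        raise ValueError("usage response did not report success")
    data = usage_response["data"]
    return max(data["allocated_pages"] - data["used_pages"], 0)


sample = {"success": True, "data": {"allocated_pages": 100, "used_pages": 20}}
print(remaining_pages(sample))  # prints 80
```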

Error Handling

Errors are returned in the following format:

{
  "error": "Error message description"
}

Common error status codes:

  • 400 - Bad Request (invalid input)
  • 401 - Unauthorized (invalid or missing API key)
  • 404 - Not Found (resource doesn't exist)
  • 500 - Internal Server Error
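
One way to turn these responses into readable messages is sketched below. describe_error is a hypothetical helper; it assumes the body may occasionally not be JSON (for example an error page from a proxy) and falls back to the status label in that case.

```python
import json

STATUS_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    404: "Not Found",
    429: "Too Many Requests",
    500: "Internal Server Error",
}


def describe_error(status_code, raw_body):
    """Turn a status code and response body into one readable message."""
    label = STATUS_MEANINGS.get(status_code, "Unexpected status")
    try:
        parsed = json.loads(raw_body)
        detail = parsed.get("error") if isinstance(parsed, dict) else None
    except (ValueError, TypeError):
        detail = None  # body was missing or not JSON; use the label alone
    if detail:
        return f"{status_code} {label}: {detail}"
    return f"{status_code} {label}"
```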

Rate Limits

The API enforces rate limits based on your user plan:

  • Beta Plan: 5 requests per second

Exceeding the limit returns a 429 Too Many Requests response.
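
To stay under the limit, a client can space its own requests out. The sketch below is a simple fixed-interval throttle, one of several possible approaches (a token bucket would allow short bursts). The class name Throttle is hypothetical, and the default of 5 requests per second comes from the Beta Plan figure above.

```python
import time


class Throttle:
    """Space calls at least 1/rate seconds apart (client-side only)."""

    def __init__(self, rate_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # no request has been made yet

    def wait(self):
        """Block until the next request may be sent, then reserve a slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Call wait() immediately before each request. A throttle like this does not replace handling 429 responses, since other clients using the same API key also count toward the limit.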

Table Schema Structure

When creating or updating tables, you need to define a schema that specifies the structure of your data. The schema follows the JSON Schema Draft-07 format, subject to the limitations listed under Schema Limitations.

Schema Structure

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "The name of the table."
    },
    "description": {
      "type": "string",
      "description": "A description of the table."
    },
    "columns": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": {
            "type": "string",
            "description": "A unique identifier for the column."
          },
          "title": {
            "type": "string",
            "description": "The display name of the column."
          },
          "type": {
            "type": "string",
            "enum": ["string", "number", "boolean", "object"],
            "description": "The data type of the column."
          },
          "description": {
            "type": "string",
            "description": "A brief description of the column."
          },
          "examples": {
            "type": "string",
            "description": "Example values for the column."
          },
          "isList": {
            "type": "boolean",
            "description": "Indicates if the column is a list (array)."
          },
          "properties": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "id": {
                  "type": "string",
                  "description": "A unique identifier for the sub-property."
                },
                "title": {
                  "type": "string",
                  "description": "The display name of the sub-property."
                },
                "type": {
                  "type": "string",
                  "enum": ["string", "number", "boolean"],
                  "description": "The data type of the sub-property."
                },
                "description": {
                  "type": "string",
                  "description": "A brief description of the sub-property."
                },
                "examples": {
                  "type": "string",
                  "description": "Example values for the sub-property."
                }
              },
              "required": ["id", "title", "type", "description", "examples"],
              "additionalProperties": false
            },
            "description": "List of sub-properties if the column type is 'object'."
          }
        },
        "required": ["id", "title", "type", "description", "examples", "isList", "properties"],
        "additionalProperties": false
      },
      "description": "An array of column definitions."
    }
  },
  "required": ["name", "description", "columns"],
  "additionalProperties": false
}

Schema Limitations

Based on our schema structure, please note the following limitations:

  1. Column Types: We support the following column types:

    • string: Text data
    • number: Numeric data
    • boolean: True/false values
    • object: Structured data with sub-properties
  2. Object Columns: For columns of type object:

    • ✅ Can contain sub-properties of type string, number, or boolean
    • ❌ Cannot contain nested objects (sub-properties cannot be of type object)
    • ❌ Cannot contain arrays of objects
  3. Array Support: For any column type:

    • ✅ Can be marked as a list using the isList property
    • ✅ Lists can contain primitive values (string, number, boolean)
    • ✅ Lists can contain objects (but those objects can only have primitive properties or arrays of primitives)
    • ❌ Lists cannot contain nested arrays of objects
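
The first two rules above can be checked locally before a schema is sent. The sketch below is a partial check under that reading of the limitations: it verifies the column type and rejects nested objects, and it does not attempt to cover the list rules. check_column is a hypothetical helper, not the validation the API itself performs.

```python
PRIMITIVE_TYPES = {"string", "number", "boolean"}
COLUMN_TYPES = PRIMITIVE_TYPES | {"object"}


def check_column(column):
    """Return a list of rule violations for one column definition."""
    problems = []
    column_id = column.get("id", "(missing id)")
    column_type = column.get("type")
    if column_type not in COLUMN_TYPES:
        problems.append(f"{column_id}: unsupported column type {column_type!r}")
    for sub in column.get("properties", []):
        # Sub-properties may only be primitives; nested objects are not allowed.
        if sub.get("type") not in PRIMITIVE_TYPES:
            problems.append(
                f"{column_id}.{sub.get('id')}: sub-properties must be "
                "string, number, or boolean"
            )
    return problems


valid = {"id": "customer", "type": "object",
         "properties": [{"id": "name", "type": "string"}]}
nested = {"id": "order", "type": "object",
          "properties": [{"id": "address", "type": "object"}]}
```

Here check_column(valid) returns an empty list, while check_column(nested) reports the nested object under order.address.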

Example Schema

Here's a practical example of a valid schema for an invoice table:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": "Invoice Table",
    "description": "Table for storing invoice data",
    "columns": [
      {
        "id": "invoice_number",
        "title": "Invoice Number",
        "type": "string",
        "description": "Unique identifier for the invoice",
        "examples": "INV-100",
        "isList": false,
        "properties": []
      },
      {
        "id": "invoice_date",
        "title": "Invoice Date",
        "type": "string",
        "description": "The date the invoice was issued",
        "examples": "2023-01-15",
        "isList": false,
        "properties": []
      },
      {
        "id": "customer",
        "title": "Customer Information",
        "type": "object",
        "description": "Customer details",
        "examples": "",
        "isList": false,
        "properties": [
          {
            "id": "name",
            "title": "Customer Name",
            "type": "string",
            "description": "Full name of the customer",
            "examples": "John Doe"
          },
          {
            "id": "email",
            "title": "Email",
            "type": "string",
            "description": "Customer email address",
            "examples": "john@example.com"
          }
        ]
      },
      {
        "id": "items",
        "title": "Items",
        "type": "object",
        "description": "List of items in the invoice",
        "examples": "",
        "isList": true,
        "properties": [
          {
            "id": "name",
            "title": "Item Name",
            "type": "string",
            "description": "Name of the item",
            "examples": "Product A"
          },
          {
            "id": "price",
            "title": "Price",
            "type": "number",
            "description": "Price of the item",
            "examples": "29.99"
          },
          {
            "id": "tags",
            "title": "Tags",
            "type": "string",
            "description": "Tags associated with the item",
            "examples": "electronics, gadgets",
            "isList": true
          }
        ]
      }
    ]
  },
  "required": ["name", "description", "columns"],
  "additionalProperties": false
}