Quick Start

Follow these steps to make your first API call and extract data from a document.

Prerequisites

Before you begin, make sure you have an API key. If you don't have one, see the Setting up section on our Introduction page.

We launched self-serve beta on June 5th, 2025!

Authentication

The JustExtract.it API authenticates requests with API keys. Include your API key in the Authorization header as a Bearer token on every API request.

Authorization: Bearer YOUR_API_KEY

Replace YOUR_API_KEY with your actual API key.
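
For programmatic use, the same header can be attached in code. The following is a minimal Python sketch using only the standard library. The helper name build_request and the environment variable JUSTEXTRACT_API_KEY are assumptions made for illustration, not part of the API; the host comes from the curl examples in this guide.

```python
import os
import urllib.request

API_BASE = "https://api.justextract.it"  # host taken from the curl examples in this guide


def build_request(path, api_key, method="GET", data=None):
    """Return an unsent request that carries the API key as a Bearer token."""
    request = urllib.request.Request(API_BASE + path, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    return request


# Assumption: the key is stored in an environment variable of your choosing.
api_key = os.environ.get("JUSTEXTRACT_API_KEY", "YOUR_API_KEY")
request = build_request("/api/file-url", api_key)
print(request.get_method(), request.full_url)
```

Reading the key from the environment rather than hard-coding it keeps it out of source control.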

Extraction Workflow

The typical workflow for extracting data from a document involves these steps:

Get File Upload URL: Request a secure, pre-signed URL to upload your document.
Upload Your File: Upload your document (e.g., PDF) to the provided pre-signed URL.
Start Extraction: Initiate the data extraction process by providing the URL of the uploaded document.
Check Task Status: Poll the API to check the status of your extraction task and retrieve the results once completed.

Step 1: Get File Upload URL

First, obtain pre-signed URLs for your file. This endpoint provides a put_url for uploading your file securely to our storage, and a get_url which you will use in the extraction step.

Endpoint: GET /api/file-url

The response will include:

put_url: The pre-signed URL to PUT your file to.
get_url: The pre-signed URL to access your file (use this in the /api/extract call).
file_key: A unique identifier for the file in our system.
expires_in: The duration in seconds for which the URLs are valid (typically 3600 seconds / 1 hour).

curl -X GET https://api.justextract.it/api/file-url \
  -H "Authorization: Bearer YOUR_API_KEY"
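
Once the response arrives, the four fields listed above can be read into a small typed structure. This Python sketch assumes the response body is a flat JSON object with exactly those field names. The UploadUrls class, the parse_file_url_response helper, and the sample values are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class UploadUrls:
    put_url: str     # where the file is uploaded (Step 2)
    get_url: str     # passed to /api/extract (Step 3)
    file_key: str    # identifier of the file
    expires_in: int  # validity of the URLs, in seconds


def parse_file_url_response(body):
    """Parse the /api/file-url response, assumed to be a flat JSON object."""
    payload = json.loads(body)
    return UploadUrls(
        put_url=payload["put_url"],
        get_url=payload["get_url"],
        file_key=payload["file_key"],
        expires_in=int(payload["expires_in"]),
    )


# Made-up sample body, for illustration only.
sample = json.dumps({
    "put_url": "https://storage.example/put",
    "get_url": "https://storage.example/get",
    "file_key": "abc123",
    "expires_in": 3600,
})
urls = parse_file_url_response(sample)
print(urls.get_url, urls.expires_in)
```

Because the URLs expire, it is safest to upload and start the extraction soon after requesting them.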

Step 2: Upload Your File

Once you have the put_url from Step 1, upload your document using an HTTP PUT request. Ensure the Content-Type header matches your file type (e.g., application/pdf for PDF files).

# Replace YOUR_PUT_URL with the put_url from Step 1
# Replace PATH_TO_YOUR_FILE.pdf with the actual file path
curl -X PUT YOUR_PUT_URL \
  -H "Content-Type: application/pdf" \
  --upload-file PATH_TO_YOUR_FILE.pdf
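
The same upload can be sketched in Python. The helper below is hypothetical; it guesses the Content-Type from the file name with the standard mimetypes module and, like the curl example, sends no Authorization header to the pre-signed URL. The fallback to application/octet-stream is an assumption, not documented API behavior.

```python
import mimetypes
import urllib.request


def build_upload_request(put_url, filename, data):
    """Return an unsent PUT request whose Content-Type matches the file type."""
    content_type, _ = mimetypes.guess_type(filename)
    request = urllib.request.Request(put_url, data=data, method="PUT")
    # Assumption: fall back to a generic binary type when the extension is unknown.
    request.add_header("Content-Type", content_type or "application/octet-stream")
    return request


# Hypothetical usage; replace the placeholders with your own values.
# with open("PATH_TO_YOUR_FILE.pdf", "rb") as f:
#     request = build_upload_request("YOUR_PUT_URL", "PATH_TO_YOUR_FILE.pdf", f.read())
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```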

Note: Keep your original files in a secure location of your own. We may delete uploaded files after 72 to 96 hours, depending on your activity and usage.

Step 3: Start Extraction

After successfully uploading your file, you can initiate the extraction process. Send a POST request to the /api/extract endpoint with the get_url (obtained in Step 1) of your uploaded document.

Endpoint: POST /api/extract

Request Body:

url (string, required): The URL of the document to process (this should be the get_url from Step 1, or your own publicly accessible URL if you are not using our pre-signed upload).
filters (array, optional): An array of filter objects that refine the extraction. Each filter object contains its own parameters (e.g., "keywords": [...] or "pages": [...]) plus a boolean "include" field. The API infers the filter type. See the Development page for details on filters. For a basic extraction, pass an empty array [].

The API will respond with a task_id, which you will use to track the progress of the extraction.

# Replace YOUR_GET_URL with the get_url from Step 1
# The empty filters array requests a basic extraction with no filters
curl -X POST https://api.justextract.it/api/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "YOUR_GET_URL",
  "filters": []
}'
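
Building the request body in code avoids hand-written JSON, which cannot contain comments. The Python sketch below serializes the two documented fields, url and filters. The helper name and the page numbers in the example filter are illustrative assumptions; see the Development page for the actual filter options.

```python
import json
import urllib.request


def build_extract_request(api_key, document_url, filters=None):
    """Return an unsent POST /api/extract request with a JSON body."""
    body = json.dumps({"url": document_url, "filters": filters or []}).encode("utf-8")
    request = urllib.request.Request(
        "https://api.justextract.it/api/extract", data=body, method="POST"
    )
    request.add_header("Authorization", f"Bearer {api_key}")
    request.add_header("Content-Type", "application/json")
    return request


# Illustrative filter: the page numbers are made up for this example.
request = build_extract_request(
    "YOUR_API_KEY",
    "https://storage.example/get",
    filters=[{"pages": [1, 2], "include": True}],
)
print(request.data.decode("utf-8"))
```

Calling the helper without a filters argument sends an empty array, which matches the basic extraction shown in the curl example.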

Step 4: Check Task Status

Use the task_id obtained in Step 3 to poll the /api/task/{task_id} endpoint. This will provide the current status of your extraction job.

Endpoint: GET /api/task/{task_id}

Possible statuses include:

PENDING: The task is queued and waiting to be processed.
PROCESSING: The task is actively being processed.
SUCCESS: The task completed successfully. The response will include the extracted data.
FAILURE: The task failed. The response may include error details.

You should poll this endpoint periodically until the status is SUCCESS or FAILURE.

# Replace YOUR_TASK_ID with the task_id from Step 3
curl -X GET https://api.justextract.it/api/task/YOUR_TASK_ID \
  -H "Authorization: Bearer YOUR_API_KEY"
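
A polling loop might look like the following Python sketch. It assumes the task response is a JSON object with a "status" field holding one of the values listed above; the field name, the two-second interval, and the attempt limit are assumptions rather than documented behavior. fetch_task is a hypothetical callable that performs the GET request and returns the decoded response.

```python
import time

TERMINAL_STATUSES = {"SUCCESS", "FAILURE"}


def wait_for_task(fetch_task, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_task() until it reports a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        task = fetch_task()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before polling again
    raise TimeoutError(f"task not finished after {max_attempts} polls")


# Demonstration with canned responses instead of real API calls.
responses = iter([{"status": "PENDING"}, {"status": "PROCESSING"}, {"status": "SUCCESS"}])
result = wait_for_task(lambda: next(responses), sleep=lambda seconds: None)
print(result["status"])  # SUCCESS
```

Capping the number of attempts keeps a stuck task from blocking the caller indefinitely.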

Putting It All Together

By following these four steps (obtaining an upload URL, uploading your file, initiating extraction, and checking the task status), you can integrate JustExtract.it into your applications to automate document data extraction.
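
As a rough illustration, Steps 1 to 3 can be chained in a single Python function. This is a sketch under stated assumptions, not an official client: it assumes every response is a flat JSON object with the field names used in this guide, and the names send and start_extraction are hypothetical. The transport is passed in as a parameter so it can be swapped out, for example in tests.

```python
import json
import urllib.request

API_BASE = "https://api.justextract.it"  # host taken from the curl examples in this guide


def send(request):
    """Send a request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


def start_extraction(api_key, file_bytes, content_type="application/pdf", send=send):
    """Run Steps 1 to 3 and return the task_id to poll in Step 4."""
    auth = {"Authorization": f"Bearer {api_key}"}

    # Step 1: request the pre-signed URLs.
    urls = json.loads(send(urllib.request.Request(API_BASE + "/api/file-url", headers=auth)))

    # Step 2: upload the file to put_url (no Authorization header, as in the curl example).
    send(urllib.request.Request(
        urls["put_url"], data=file_bytes, method="PUT",
        headers={"Content-Type": content_type},
    ))

    # Step 3: start a basic extraction with get_url and no filters.
    body = json.dumps({"url": urls["get_url"], "filters": []}).encode("utf-8")
    task = json.loads(send(urllib.request.Request(
        API_BASE + "/api/extract", data=body, method="POST",
        headers={**auth, "Content-Type": "application/json"},
    )))
    return task["task_id"]
```

The returned task_id can then be polled as described in Step 4 until the status is SUCCESS or FAILURE.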

For more advanced use cases, such as applying specific filters to your documents during extraction, please refer to our Development page.
