POST /v1/pdf/find/table

This function finds tables in documents using an AI-powered table detection engine.

This endpoint locates tables in an input PDF document and returns JSON with:

  • An array of table objects.

  • X, Y, Width, and Height coordinates for every table found.

  • Rect param for every table, which you can reuse with pdf/convert/to/json, pdf/convert/to/csv, and other endpoints to extract only the selected table.

  • PageIndex, the index of the page that contains the table. The very first page is 0.

  • Columns array with the set of X coordinates for every column inside the table that was found.

To extract a table as CSV, JSON, or XML, call the pdf/convert/to/csv, pdf/convert/to/json2, or pdf/convert/to/xml endpoint respectively, and pass that table's rect output value as the rect parameter.
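
As a rough illustration of the step above, the sketch below turns a pdf/find/table response (requested with inline set to true) into one request body per table for a follow-up extraction call. The helper name build_extract_payloads is hypothetical, and passing pages and inline to the conversion endpoints is an assumption; only url and rect come from the description above.

```python
def build_extract_payloads(find_response, source_url):
    """Return one request body per detected table (hypothetical helper)."""
    if find_response.get("error"):
        raise ValueError("pdf/find/table reported an error")
    payloads = []
    for table in find_response["body"]["tables"]:
        payloads.append({
            "url": source_url,
            # Reuse the rect string exactly as the find endpoint returned it.
            "rect": table["rect"],
            # Assumption: the conversion endpoint accepts a pages string too.
            "pages": str(table["PageIndex"]),
            "inline": True,
        })
    return payloads
```

Each returned dictionary could then be posted as the JSON body of pdf/convert/to/csv, pdf/convert/to/json2, or pdf/convert/to/xml.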


Attributes

Attributes are case-sensitive and must be sent inside the JSON body of the POST request, for example: { "url": "https://example.com/file1.pdf" }
| Attribute | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | URL to the source file. |
| callback | string | No | - | The callback URL (or Webhook) used to receive the POST data. See Webhooks & Callbacks. Only applicable when async is set to true. |
| httpusername | string | No | - | HTTP auth user name, if required to access the source URL. |
| httppassword | string | No | - | HTTP auth password, if required to access the source URL. |
| pages | string | No | all pages | Page indices to process, as comma-separated values or ranges (e.g. "0, 1, 2-" or "1, 2, 3-7"). The first page has index 0. Use "!" before a number for inverted page numbers (e.g. "!0" for the last page). If not specified, all pages are processed. The value must be a string. |
| inline | boolean | No | false | Set to true to return results inside the response. Otherwise, the endpoint returns a URL to the generated output file. |
| password | string | No | - | Password for the PDF file. |
| async | boolean | No | false | Set to true to run long processes in the background. The API then returns a jobId that you can use with the Background Job Check endpoint. See also Webhooks & Callbacks. |
| profiles | object | No | - | See Profiles for more information. |
| profiles.DataEncryptionAlgorithm | string | No | - | Encryption algorithm used for data encryption. Available algorithms: AES128, AES192, AES256. See User-Controlled Encryption for more information. |
| profiles.DataEncryptionKey | string | No | - | Encryption key used for data encryption. See User-Controlled Encryption for more information. |
| profiles.DataEncryptionIV | string | No | - | Encryption IV used for data encryption. See User-Controlled Encryption for more information. |
| profiles.DataDecryptionAlgorithm | string | No | - | Decryption algorithm used for data decryption. Available algorithms: AES128, AES192, AES256. See User-Controlled Encryption for more information. |
| profiles.DataDecryptionKey | string | No | - | Decryption key used for data decryption. See User-Controlled Encryption for more information. |
| profiles.DataDecryptionIV | string | No | - | Decryption IV used for data decryption. See User-Controlled Encryption for more information. |
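
To make the pages syntax concrete, here is a small client-side sketch that expands a pages string into explicit page indices. It only illustrates how the documented examples read and is not the service's own parser. The function name expand_pages is hypothetical, reading "!n" as "n pages back from the last page" is an assumption extrapolated from the "!0" example, and dropping out-of-range indices is a choice made for this sketch.

```python
def expand_pages(spec, page_count):
    """Expand a pages string into a sorted list of 0-based page indices."""
    def resolve(token):
        token = token.strip()
        if token.startswith("!"):
            # Assumption: "!n" counts back from the last page ("!0" is the last page).
            return page_count - 1 - int(token[1:])
        return int(token)

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, _, end = part.partition("-")
            first = resolve(start)
            # An open-ended range such as "2-" runs to the last page.
            last = resolve(end) if end.strip() else page_count - 1
            pages.update(range(first, last + 1))
        else:
            pages.add(resolve(part))
    # Sketch choice: silently drop indices that fall outside the document.
    return sorted(p for p in pages if 0 <= p < page_count)
```

For a five-page document, expand_pages("0, 1, 2-", 5) yields [0, 1, 2, 3, 4] and expand_pages("!0", 5) yields [4].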

Find only bordered tables

You can limit search to bordered tables only by enabling the legacy table search mode with the following profiles config:

{
  "profiles": "{ 'Mode': 'Legacy', 'ColumnDetectionMode': 'BorderedTables', 'DetectionMinNumberOfRows': 1, 'DetectionMinNumberOfColumns': 1, 'DetectionMaxNumberOfInvalidSubsequentRowsAllowed': 0, 'DetectionMinNumberOfLineBreaksBetweenTables': 0, 'EnhanceTableBorders': false }"
}
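
The profiles value above is a string that itself contains a single-quoted, JSON-like object, which is easy to get wrong by hand. The sketch below builds it from a dictionary. The function name bordered_tables_body is hypothetical, and the quote swap is only a convenience to mirror the format shown above; it assumes no key or value contains a quote character.

```python
import json

def bordered_tables_body(pdf_url):
    """Build a request body that enables the legacy bordered-table search."""
    profile = {
        "Mode": "Legacy",
        "ColumnDetectionMode": "BorderedTables",
        "DetectionMinNumberOfRows": 1,
        "DetectionMinNumberOfColumns": 1,
        "DetectionMaxNumberOfInvalidSubsequentRowsAllowed": 0,
        "DetectionMinNumberOfLineBreaksBetweenTables": 0,
        "EnhanceTableBorders": False,
    }
    # json.dumps emits double quotes; swap them for single quotes to match
    # the config shown above. Safe only because nothing here contains quotes.
    profiles = json.dumps(profile).replace('"', "'")
    return {"url": pdf_url, "profiles": profiles}
```

Serializing the returned dictionary with json.dumps gives a body in which profiles is a single-line string.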

Query parameters

No query parameters accepted.

Responses

| Parameter | Type | Description |
| --- | --- | --- |
| body | object | Response body. |
| pageCount | integer | Number of pages in the PDF document. |
| error | boolean | Indicates whether an error occurred (false means success). |
| status | string | Status code of the request (200, 404, 500, etc.). For more information, see Response Codes. |
| name | string | Name of the output file. |
| credits | integer | Number of credits consumed by the request. |
| remainingCredits | integer | Number of credits remaining in the account. |
| duration | integer | Time taken for the operation in milliseconds. |

Example Payload

To see the request size limits, please refer to the Request Size Limits page.
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
  "async": false,
  "inline": true,
  "password": ""
}

Example Response

To see the main response codes, please refer to the Response Codes page.
{
  "body": {
    "tables": [
      {
        "PageIndex": 0,
        "X": 36,
        "Y": 34.4400024,
        "Width": 523.44,
        "Height": 160.82,
        "Columns": [
          357.675
        ],
        "rect": "36, 34.4400024, 523.44, 160.82"
      },
      {
        "PageIndex": 0,
        "X": 36,
        "Y": 316.249969,
        "Width": 523.44,
        "Height": 120.620026,
        "Columns": [
          157.117,
          340.68,
          475.84
        ],
        "rect": "36, 316.249969, 523.44, 120.620026"
      }
    ]
  },
  "pageCount": 1,
  "error": false,
  "status": 200,
  "name": "sample.json",
  "remainingCredits": 98892697,
  "credits": 21
}
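
A minimal sketch of reading a response shaped like the example above: it parses each rect string into four numbers, checks that they agree with the X, Y, Width, and Height fields, and groups the tables by PageIndex. The function names parse_rect and tables_by_page are hypothetical, and the consistency check simply reflects what the example response shows.

```python
from collections import defaultdict

def parse_rect(rect):
    """Split a rect string such as "36, 34.44, 523.44, 160.82" into four floats."""
    x, y, width, height = (float(value) for value in rect.split(","))
    return x, y, width, height

def tables_by_page(response):
    """Group detected tables by PageIndex as (x, y, width, height) tuples."""
    if response.get("error"):
        raise ValueError(f"request failed with status {response.get('status')}")
    grouped = defaultdict(list)
    for table in response["body"]["tables"]:
        rect = parse_rect(table["rect"])
        expected = (table["X"], table["Y"], table["Width"], table["Height"])
        # In the example response, rect repeats X, Y, Width, Height in that order.
        if rect != expected:
            raise ValueError(f"rect {rect} does not match {expected}")
        grouped[table["PageIndex"]].append(rect)
    return dict(grouped)
```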

Code Samples

curl --location --request POST 'https://api.pdf.co/v1/pdf/find/table' \
--header 'x-api-key: *******************' \
--header 'Content-Type: application/json' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
"async": false,
"inline": true,
"password": ""
}'
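
For readers who prefer Python, the curl call above could be sketched with only the standard library as follows. The function names build_request and find_tables are hypothetical, the API key is a placeholder you must supply, and the booleans follow the types given in the attributes table.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/find/table"

def build_request(api_key, pdf_url, password=""):
    """Prepare the POST request without sending it."""
    body = {"url": pdf_url, "async": False, "inline": True, "password": password}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

def find_tables(api_key, pdf_url):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(api_key, pdf_url)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before any network call is made.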