POST /v1/pdf/documentparser

Attributes

Attributes are case-sensitive and must be sent in the JSON body of the POST request, for example: { "url": "https://example.com/file1.pdf" }
| Attribute | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | URL to the source file. See url attribute. |
| callback | string | No | - | The callback URL (or webhook) used to receive the POST data. See Webhooks & Callbacks. Only applicable when async is set to true. |
| httpusername | string | No | - | HTTP auth user name, if required to access the source URL. |
| httppassword | string | No | - | HTTP auth password, if required to access the source URL. |
| templateId | integer | No | - | ID of the document parser template to use. View and manage your templates at Document Parser. |
| template | string | No | - | The raw document parser template to use directly. See Template. |
| password | string | No | - | Password for the PDF file. |
| inline | boolean | No | false | Set to true to include the results directly in the response, in addition to providing a URL to the generated output file. Applies only when async mode is enabled. |
| async | boolean | No | false | Set async to true to run long processes in the background. The API then returns a jobId, which you can use with the Background Job Check endpoint. Also see Webhooks & Callbacks. |
| pages | string | No | all pages | Page indices to process, as comma-separated values or ranges (e.g. "0, 1, 2-" or "1, 2, 3-7"). The first page has index 0. Use "!" before a number for inverted page numbers (e.g. "!0" for the last page). If not specified, all pages are processed. The input must be a string. |
| name | string | No | - | File name for the generated output. The input must be a string. |
| expiration | integer | No | 60 | Expiration time for the output link, in minutes. After this duration, any generated output files are automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration depends on your current subscription plan. To store permanent input files (e.g. re-usable images, PDF templates, documents), consider using PDF.co Built-In Files Storage. |
| outputFormat | string | No | JSON | The format of the output file: JSON, CSV, or XML. |
| profiles | object | No | - | See Profiles for more information. |
| profiles.DataEncryptionAlgorithm | string | No | - | Controls the encryption algorithm used for data encryption. See User-Controlled Encryption for more information. The available algorithms are: AES128, AES192, AES256. |
| profiles.DataEncryptionKey | string | No | - | Controls the encryption key used for data encryption. See User-Controlled Encryption for more information. |
| profiles.DataEncryptionIV | string | No | - | Controls the encryption IV used for data encryption. See User-Controlled Encryption for more information. |
| profiles.DataDecryptionAlgorithm | string | No | - | Controls the decryption algorithm used for data decryption. See User-Controlled Encryption for more information. The available algorithms are: AES128, AES192, AES256. |
| profiles.DataDecryptionKey | string | No | - | Controls the decryption key used for data decryption. See User-Controlled Encryption for more information. |
| profiles.DataDecryptionIV | string | No | - | Controls the decryption IV used for data decryption. See User-Controlled Encryption for more information. |
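
The pages syntax is easier to reason about with a small client-side helper. The sketch below expands a pages string into zero-based indices for a document of known length. It is only one reading of the description above, not the service's actual parser: the function name expand_pages is hypothetical, and the treatment of "!1" as the second-to-last page is an assumption extrapolated from the "!0" example.

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a `pages` string into sorted, zero-based page indices.

    Assumed semantics (inferred from the attribute description):
      "3"    -> page index 3
      "2-5"  -> indices 2 through 5 inclusive
      "2-"   -> index 2 through the last page
      "!0"   -> last page; "!1" -> second to last (assumption)
      ""     -> all pages (the documented default)
    """
    if not spec.strip():
        return list(range(page_count))

    def resolve(token: str) -> int:
        token = token.strip()
        if token.startswith("!"):
            # Inverted index: count back from the last page.
            return page_count - 1 - int(token[1:])
        return int(token)

    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = resolve(start_text)
            # An open-ended range such as "2-" runs to the last page.
            end = resolve(end_text) if end_text.strip() else page_count - 1
        else:
            start = end = resolve(part)
        # Silently drop indices that fall outside the document.
        selected.update(i for i in range(start, end + 1) if 0 <= i < page_count)
    return sorted(selected)
```

For a five-page document, expand_pages("0, 1, 2-", 5) selects every page, while expand_pages("!0", 5) selects only index 4.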

Query parameters

No query parameters accepted.

Responses

| Parameter | Type | Description |
| --- | --- | --- |
| pageCount | integer | Number of pages in the PDF document. |
| error | boolean | Indicates whether an error occurred (false means success). |
| status | string | Status code of the request (200, 404, 500, etc.). For more information, see Response Codes. |
| credits | integer | Number of credits consumed by the request. |
| remainingCredits | integer | Number of credits remaining in the account. |
| duration | integer | Time taken for the operation, in milliseconds. |
| body | object | - |
| body.objects | array[object] | - |
| body.elapsed | float | Processing time in seconds. |
| body.templateName | string | Name of the parsing template used. |
| body.templateVersion | string | Version of the parsing template. |
| body.timestamp | string | Timestamp when the parsing occurred. |

Example Payload

For request size limits, see Request Size Limits.
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
  "outputFormat": "JSON",
  "templateId": 1,
  "async": false,
  "inline": true,
  "password": "",
  "profiles": ""
}

Example Response

To see the main response codes, please refer to the Response Codes page.
{
  "body": {
    "objects": [
      {
        "name": "companyName",
        "objectType": "field",
        "value": "Amazon Web Services, Inc",
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "name": "companyName2",
        "objectType": "field",
        "value": "Amazon Web Services, Inc",
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "name": "invoiceId",
        "objectType": "field",
        "value": "123456789",
        "pageIndex": 0,
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "name": "dateIssued",
        "objectType": "field",
        "value": "2018-04-03T00:00:00",
        "pageIndex": 0,
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "name": "dateDue",
        "objectType": "field",
        "value": "2018-04-03T00:00:00",
        "pageIndex": 0,
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "name": "bankAccount",
        "objectType": "field",
        "value": "123456789012",
        "pageIndex": 0,
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "name": "total",
        "objectType": "field",
        "value": 6.58,
        "pageIndex": 0,
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "name": "subTotal",
        "objectType": "field",
        "value": ""
      },
      {
        "name": "tax",
        "objectType": "field",
        "value": 1.01,
        "pageIndex": 0,
        "rectangle": [
          0,
          0,
          0,
          0
        ]
      },
      {
        "objectType": "table",
        "name": "table",
        "rows": []
      }
    ],
    "templateName": "Generic Invoice [en]",
    "templateVersion": "4",
    "timestamp": "2020-08-21T19:23:31"
  },
  "pageCount": 1,
  "error": false,
  "status": 200,
  "name": "sample-invoice.json",
  "remainingCredits": 60803
}
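
Client code usually wants the parsed fields as a simple name-to-value mapping rather than the full object list. The following is a minimal sketch of that step, assuming the response has already been decoded from JSON into a dict shaped like the example above. The helper name extract_fields is hypothetical, and the sample dict is a trimmed copy of the example response.

```python
def extract_fields(response: dict) -> dict:
    """Map each parsed field name to its value, skipping table objects."""
    if response.get("error"):
        status = response.get("status")
        raise RuntimeError(f"document parser reported an error (status {status})")
    objects = response.get("body", {}).get("objects", [])
    return {
        obj["name"]: obj.get("value")
        for obj in objects
        if obj.get("objectType") == "field"
    }


# Trimmed copy of the example response, for illustration only.
sample = {
    "body": {
        "objects": [
            {"name": "invoiceId", "objectType": "field", "value": "123456789", "pageIndex": 0},
            {"name": "total", "objectType": "field", "value": 6.58, "pageIndex": 0},
            {"name": "subTotal", "objectType": "field", "value": ""},
            {"objectType": "table", "name": "table", "rows": []},
        ]
    },
    "error": False,
    "status": 200,
}

fields = extract_fields(sample)
```

Fields the template could not locate, such as subTotal in the example, come back with an empty string and no pageIndex or rectangle, so callers may want to treat an empty value as missing.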

Code Samples

curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: {{x-api-key}}' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "JSON",
"templateId": 1,
"async": false,
"inline": true,
"password": "",
"profiles": ""
}'
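
The same request can be sketched in Python using only the standard library. This is an illustrative equivalent of the curl command above, not an official client: the helper names build_request and parse_document are hypothetical, "YOUR_API_KEY" is a placeholder, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/documentparser"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request with the payload as a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def parse_document(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request does not touch the network; only parse_document does.
request = build_request(
    "YOUR_API_KEY",  # placeholder, replace with a real key
    {
        "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
        "outputFormat": "JSON",
        "templateId": 1,
        "async": False,
        "inline": True,
    },
)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any credits are spent.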