File Upload#

You can upload files to PDF.co as temporary files. Temporary files are stored for 1 hour by default and are then removed automatically.

To store files permanently (for example, PDF templates or images you want to reuse), use PDF.co Built-In Files Storage instead.

You can also use third-party cloud services:

  • Dropbox: use a public link to a file in Dropbox.

  • Google Drive: use a link to a file that is shared as "anyone with the link".

  • Google Docs/Sheets/Slides: use a link to a document that is shared as "anyone with the link".

  • Any publicly accessible URL from any cloud service or web source that provides a direct link to the file.

Note

IMPORTANT NOTE FOR GOOGLE DRIVE/DOCS users: free Google Drive/Docs accounts limit the number of requests to their files. If you use a link to a file or document from Google Drive or Google Docs, make sure you send no more than 5-10 requests per minute. Otherwise Google Drive returns no file or an error page.
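One way to stay under that limit is to space out requests on the client side. The sketch below is only an illustration: the class name `MinIntervalThrottle` is hypothetical, and the default of 5 requests per minute is simply the lower end of the range quoted above. The clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time


class MinIntervalThrottle:
    """Enforces a minimum delay between consecutive calls to wait()."""

    def __init__(self, max_per_minute=5, clock=time.monotonic, sleep=time.sleep):
        # 5 requests per minute means at least 12 seconds between requests.
        self.interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until `interval` seconds have passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `wait()` immediately before each request that uses a Google Drive or Google Docs link.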

Temporary Files Upload#

You can upload temporary files up to 2 GB in size. To process files this large, call the data extraction and tools endpoints with async=true and poll /job/check for the status of the background jobs you create.
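The polling side of that pattern could look roughly like the sketch below. It makes several assumptions that this page does not confirm: `check_status` is a hypothetical caller-supplied function that queries /job/check and returns the job's status string, and the value "working" for an unfinished job is an assumed status name.

```python
import time


def wait_for_job(check_status, poll_seconds=3.0, max_polls=100, sleep=time.sleep):
    """Poll check_status() until it reports something other than "working".

    check_status: hypothetical callable that asks /job/check for the job
    status and returns it as a string. "working" is an assumed status value.
    """
    for _ in range(max_polls):
        status = check_status()
        if status != "working":
            return status  # finished, successfully or not; caller decides
        sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")
```

Bounding the number of polls keeps a stuck job from blocking the caller forever.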

Steps to Upload File#

  1. First, call /file/upload/get-presigned-url. It returns a link for uploading (presignedUrl) and a final link (url).

  2. Now send your file to the presignedUrl link using the PUT method within the next 30 minutes.

  3. Once finished, use url to access the file you have just uploaded.

Note: all uploaded files are considered to be temporary files and are automatically permanently removed after 1 hour.
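The three steps above can be sketched with the Python standard library as follows. This is an illustration under stated assumptions, not an official client: the function names are hypothetical, the `opener` parameter exists only so the network call can be swapped out, and the whole file is read into memory, which is not suitable for very large files.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.pdf.co/v1"
CONTENT_TYPE = "application/octet-stream"


def presigned_url_request(api_key, name):
    """Step 1: build the GET request that returns presignedUrl and url."""
    query = urllib.parse.urlencode({"name": name, "contentType": CONTENT_TYPE})
    return urllib.request.Request(
        f"{API_BASE}/file/upload/get-presigned-url?{query}",
        headers={"x-api-key": api_key},
        method="GET",
    )


def put_request(api_key, presigned_url, data):
    """Step 2: build the PUT request that sends the file bytes."""
    return urllib.request.Request(
        presigned_url,
        data=data,
        headers={"x-api-key": api_key, "Content-Type": CONTENT_TYPE},
        method="PUT",
    )


def upload_temporary_file(api_key, path, name, opener=urllib.request.urlopen):
    """Run steps 1-3 and return the final `url` of the uploaded file."""
    with opener(presigned_url_request(api_key, name)) as resp:
        info = json.loads(resp.read())
    if info.get("error"):
        raise RuntimeError(info.get("message", "get-presigned-url failed"))
    # Reads the whole file into memory; acceptable for a sketch only.
    with open(path, "rb") as fh:
        data = fh.read()
    with opener(put_request(api_key, info["presignedUrl"], data)):
        pass
    # Step 3: this link is what other API methods receive as input.
    return info["url"]
```

A call such as `upload_temporary_file("YOUR_API_KEY", "./sample.pdf", "sample.pdf")` would return the temporary file link.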

Response Codes#

| Code | Description |
|------|-------------|
| 200 | The request has succeeded. |
| 400 | Bad input parameters. |
| 401 | Unauthorized. |
| 403 | Not enough credits. |
| 405 | Timeout error. Use /file/upload/url for small files (<500kb). For faster and stable file uploads, use /file/upload/get-presigned-url and the PUT workflow with /presignedUrl --data-binary 'sample.pdf'. |

/file/upload/get-presigned-url#

This method generates the links used to upload a local file. Send your file to the presignedUrl from the response using PUT. Once the upload completes, use the url link to access the uploaded file.

With this method you can upload files up to 2 GB in size. To process files this large, call the data extraction and tools endpoints with async=true and poll /job/check for the status of the background jobs you create.

  • Method: GET

  • Endpoint: /v1/file/upload/get-presigned-url

Query parameters#

| Parameter name | Description | Example |
|----------------|-------------|---------|
| name | File name for the generated output. Must be a string. | test.pdf |
| encrypt | Whether to encrypt the file. | false |
| contentType | The content type of the uploaded file. | application/pdf |

CURL#

curl --location --request GET 'https://api.pdf.co/v1/file/upload/get-presigned-url?name=test.pdf&encrypt=true' \
--header 'x-api-key: YOUR_API_KEY'

Response#

{
    "presignedUrl": "https://pdf-temp-files.s3.us-west-2.amazonaws.com/A1VGV42YE0NWXMKEB4BUIWNYGKXEWTND/test.pdf?X-Amz-Expires=900&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIZJDPLX6D7EHVCKA/20220913/us-west-2/s3/aws4_request&X-Amz-Date=20220913T074159Z&X-Amz-SignedHeaders=content-type;host&X-Amz-Signature=53f326afde5bcfb3b2714ee8cb5322795bf10a03feb7dab3764e6ca63c017f43",
    "url": "https://pdf-temp-files.s3.us-west-2.amazonaws.com/A1VGV42YE0NWXMKEB4BUIWNYGKXEWTND/test.pdf?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzEBgaDLZTUxFLOwF9iiGk%2FyKCATiLp%2FRn9nPmt%2Fey9PcilcRMXtLl0TS6IFNOpk%2BKtSF%2B%2BEVcbNFThw4c1KVx21RQxT5zf7csSEESGov1Xd4uDhF0xGoVkXff9saXGVUtgKrYgPKhUfv5KEO7gz3E0t%2FqCPZJn2KGs1yMbUkohzeIrEd0NH8EVvqfxrfCcW0ZANiG2iMoh8eAmQYyKLjRMfg02ZJPTgoFPQmfMyYt0FacTg4RhkP3PeD9mrWLefDXCwcYkkI%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHFHQYL4OV/20220913/us-west-2/s3/aws4_request&X-Amz-Date=20220913T074159Z&X-Amz-SignedHeaders=host&X-Amz-Signature=9b1a90f36635459bb40f09b0fc6fe3eba185ba3cfdb0a8ef1096ac9efa9b6299",
    "error": false,
    "status": 200,
    "name": "test.pdf",
    "credits": 7,
    "duration": 0,
    "remainingCredits": 98191146
}

See Response Codes.

Code samples#

// Node.js
var https = require("https");

const API_KEY = "**********************************************";

function getPresignedUrl(apiKey) {
  return new Promise((resolve, reject) => {
      // Prepare request to `Get Presigned URL` API endpoint
      let queryPath = `/v1/file/upload/get-presigned-url?name=test.pdf`;
      let reqOptions = {
          host: "api.pdf.co",
          path: encodeURI(queryPath),
          headers: { "x-api-key": apiKey }
      };
      // Send request
      https.get(reqOptions, (response) => {
          // The body may arrive in several chunks, so collect them before parsing
          let body = "";
          response.on("data", (chunk) => { body += chunk; });
          response.on("end", () => {
              let data = JSON.parse(body);
              if (data.error == false) {
                  console.log("presignedUrl: " + data.presignedUrl);
                  // Return the upload link and the final file link
                  resolve([data.presignedUrl, data.url]);
              }
              else {
                  // Service reported error
                  reject(new Error("getPresignedUrl(): " + data.message));
              }
          });
       })
       .on("error", (e) => {
            // Request error
            reject(e);
       });
  });
}

getPresignedUrl(API_KEY)
  .then(([presignedUrl, url]) => console.log("url: " + url))
  .catch((e) => console.log(e.message));
# Python
import os
import requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "*************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

fileName = "test.pdf"

url = "{}/file/upload/get-presigned-url?contenttype=application/octet-stream&name={}".format(
    BASE_URL, os.path.basename(fileName))

# Execute request and get response as JSON
response = requests.get(url, headers={"x-api-key": API_KEY})
if response.status_code == 200:
    result = response.json()

    if result["error"] == False:
        # URL to use for file upload
        uploadUrl = result["presignedUrl"]
        # URL for future reference
        uploadedFileUrl = result["url"]
<?
  $apiKey = "***************";
  $url = "https://api.pdf.co/v1/file/upload/get-presigned-url" .
      "?name=" . urlencode($_FILES["file"]["name"]) .
      "&contenttype=application/octet-stream";

  // Create request
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  // Execute request
  $result = curl_exec($curl);
?>

/presignedUrl --data-binary 'sample.pdf'#

With this method you can upload files up to 100 MB in size. To process these files, call the data extraction and tools endpoints with async=true and poll /job/check for the status of the background jobs you create.

Important

The presigned URL must be retrieved from /file/upload/get-presigned-url for the PUT operation to succeed.

Content-Type header#

When sending the PUT request, add a Content-Type header with a value that matches the input file type.

For example:

| File Extension | Content-Type Value |
|----------------|--------------------|
| .txt .csv .xml .json | text/plain |
| .pdf | application/pdf |
| .msg .eml | application/vnd.ms-outlook |
| .doc | application/msword |

Note

If you are not sure, use application/octet-stream. It works for most file types.
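The extension-to-header mapping above, including the application/octet-stream fallback, can be expressed as a small lookup. The names `CONTENT_TYPES` and `content_type_for` are hypothetical; the values are taken from the table in this section.

```python
import os

# Values copied from the Content-Type table in this section.
CONTENT_TYPES = {
    ".txt": "text/plain",
    ".csv": "text/plain",
    ".xml": "text/plain",
    ".json": "text/plain",
    ".pdf": "application/pdf",
    ".msg": "application/vnd.ms-outlook",
    ".eml": "application/vnd.ms-outlook",
    ".doc": "application/msword",
}


def content_type_for(filename):
    """Return the Content-Type for a file name, defaulting to octet-stream."""
    ext = os.path.splitext(filename)[1].lower()
    return CONTENT_TYPES.get(ext, "application/octet-stream")
```

For example, `content_type_for("Report.PDF")` gives `application/pdf`, while an extension that is not listed falls back to `application/octet-stream`.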

All uploaded files are treated as temporary files and are automatically permanently removed after 1 hour. If you have a file that you want to reuse over and over, please upload it to PDF.co Built-In Files Storage and get its filetoken:// link that you may reuse inside PDF.co API.

CURL#

curl --location --request PUT '<insert presignedUrl here>' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/octet-stream' \
--data-binary '@./sample.pdf'

Response#

{
    "presignedUrl": "https://pdf-temp-files.s3-us-west-2.amazonaws.com/0c72bf56341142ba83c8f98b47f14d62/test.pdf?X-Amz-Expires=900&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIZJDPLX6D7EHVCKA/20200302/us-west-2/s3/aws4_request&X-Amz-Date=20200302T143951Z&X-Amz-SignedHeaders=host&X-Amz-Signature=8650913644b6425ba8d52b78634698e5fc8970157d971a96f0279a64f4ba87fc",
    "url": "https://pdf-temp-files.s3-us-west-2.amazonaws.com/0c72bf56341142ba83c8f98b47f14d62/test.pdf?X-Amz-Expires=3600&x-amz-security-token=FwoGZXIvYXdzEGgaDA9KaTOXRjkCdCqSTCKBAW9tReCLk1fVTZBH9exl9VIbP8Gfp1pE9hg6et94IBpNamOaBJ6%2B9Vsa5zxfiddlgA%2BxQ4tpd9gprFAxMzjN7UtjU%2B2gf%2FKbUKc2lfV18D2wXKd1FEhC6kkGJVL5UaoFONG%2Fw2jXfLxe3nCfquMEDo12XzcqIQtNFWXjKPWBkQEvmii4tfTyBTIot4Na%2BAUqkLshH0R7HVKlEBV8btqa0ctBjwzwpWkoU%2BF%2BCtnm8Lm4Eg%3D%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHEGHTOA4W/20200302/us-west-2/s3/aws4_request&X-Amz-Date=20200302T143951Z&X-Amz-SignedHeaders=host;x-amz-security-token&X-Amz-Signature=243419ac4a9a315eebc2db72df0817de6a261a684482bbc897f0e7bb5d202bb9",
    "error": false,
    "status": 200,
    "name": "test.pdf",
    "remainingCredits": 98145
}

See Response Codes.

Code samples#

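// Node.js
var fs = require("fs");
var request = require("request"); // third-party `request` package

function uploadFile(apiKey, localFile, uploadFileUrl) {
  return new Promise((resolve, reject) => {
      fs.readFile(localFile, (err, data) => {
          if (err) {
              // Could not read the local file
              return reject(err);
          }
          request({
              method: "PUT",
              url: uploadFileUrl,
              body: data,
              headers: {
                  "Content-Type": "application/octet-stream",
                  "x-api-key": apiKey
              }
          }, (err, res, body) => {
              if (!err) {
                  resolve();
              }
              else {
                  console.log("uploadFile() request error: " + err);
                  reject(err);
              }
          });
      });
  });
}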
# Python
import requests

# API_KEY and fileName are defined as in the get-presigned-url sample
uploadFileUrl = "file URL retrieved from /file/upload/get-presigned-url"

with open(fileName, 'rb') as file:
    requests.put(uploadFileUrl, data=file, headers={"x-api-key": API_KEY, "content-type": "application/octet-stream"})
<?
  $apiKey = "***************";
  $uploadFileUrl = "file URL retrieved from /file/upload/get-presigned-url";

  $localFile = $_FILES["fileInput"]["tmp_name"];
  // Open in binary mode so the file bytes are sent unchanged
  $fileHandle = fopen($localFile, "rb");

  // Create request
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_URL, $uploadFileUrl);
  curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "content-type: application/octet-stream"));
  curl_setopt($curl, CURLOPT_PUT, true);
  curl_setopt($curl, CURLOPT_INFILE, $fileHandle);
  curl_setopt($curl, CURLOPT_INFILESIZE, filesize($localFile));

  // Execute request
  curl_exec($curl);

  fclose($fileHandle);
?>


/file/upload#

Uploads a small (up to 100KB) local file as a temporary file in PDF.co storage. Note: temporary files are automatically permanently removed after 1 hour.

  • Method: POST

  • Endpoint: /v1/file/upload

CURL#

curl --location --request POST 'https://api.pdf.co/v1/file/upload' \
--header 'x-api-key: *******************' \
--form 'file=@"/path/to/file"'

Response#

{
    "url": "https://pdf-temp-files.s3.amazonaws.com/1a4a92ac805c41c28ef75a24e0f35ba5/sample.pdf",
    "error": false,
    "status": 200,
    "name": "sample.pdf",
    "remainingCredits": 98145
}

See Response Codes.


/file/upload/base64#

Creates a temporary file using base64 source data. You may use this temporary file URL with other API methods. Temporary files are automatically permanently removed after 1 hour.

  • Method: POST

  • Endpoint: /v1/file/upload/base64

CURL#

curl --location --request POST 'https://api.pdf.co/v1/file/upload/base64' \
--header 'x-api-key: *******************' \
--form 'file=""'

Response#

{
    "url": "https://pdf-temp-files.s3.amazonaws.com/7588d614c9ad41eb98ec317a02abda63/uploadfile.txt",
    "error": false,
    "status": 200,
    "remainingCredits": 77769
}

See Response Codes.
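Preparing the `file` form field shown in the curl example could look like the sketch below. Two points are assumptions: this page does not say whether the service expects a plain base64 string or a prefixed form, so the sketch sends plain base64, and the 4 megabyte request size limit mentioned in the footnotes is read here as 4 * 1024 * 1024 bytes. The function name is hypothetical.

```python
import base64

# Assumed reading of the "4 megabytes" request size limit from the footnotes.
MAX_REQUEST_BYTES = 4 * 1024 * 1024


def base64_form_field(data):
    """Return the `file` form field for /file/upload/base64 as a dict."""
    encoded = base64.b64encode(data).decode("ascii")
    # Base64 grows the payload by about one third, so check the encoded size.
    if len(encoded) > MAX_REQUEST_BYTES:
        raise ValueError("encoded payload exceeds the request size limit")
    return {"file": encoded}
```

Because of the size growth, files larger than roughly 3 MB are better uploaded through the presigned URL workflow.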


/file/upload/url#

Downloads a file from a source URL and uploads it as a temporary file. Temporary files are automatically and permanently removed after 1 hour.

  • Method: POST

  • Endpoint: /v1/file/upload/url

CURL#

curl --location --request POST 'https://api.pdf.co/v1/file/upload/url' \
--header 'x-api-key: *******************' \
--form 'name="sample.pdf"' \
--form 'url="https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/sample.pdf"'

Response#

{
    "url": "https://pdf-temp-files.s3.amazonaws.com/1a4a92ac805c41c28ef75a24e0f35ba5/sample.pdf",
    "error": false,
    "status": 200,
    "name": "sample.pdf",
    "remainingCredits": 98145
}

See Response Codes.

Code Samples#

// Node.js
var https = require("https");

const API_KEY = "*************************************";


function upload(apiKey, fileName) {
  return new Promise((resolve, reject) => {
      // Prepare request to `file/upload/url` API endpoint.
      // The source URL is a query value, so it must be fully percent-encoded.
      let queryPath = `/v1/file/upload/url?url=${encodeURIComponent(fileName)}`;
      let reqOptions = {
          host: "api.pdf.co",
          path: queryPath,
          headers: { "x-api-key": apiKey }
      };
      // Send request
      https.get(reqOptions, (response) => {
          // The body may arrive in several chunks, so collect them before parsing
          let body = "";
          response.on("data", (chunk) => { body += chunk; });
          response.on("end", () => {
              let data = JSON.parse(body);
              if (data.status == 200) {
                  console.log("temp url: " + data.url);
                  console.log("remainingCredits: " + data.remainingCredits);
                  resolve([data.remainingCredits]);
              }
              else {
                  // Service reported error
                  reject(new Error("upload(): " + data.message));
              }
          });
       })
       .on("error", (e) => {
            // Request error
            reject(e);
       });
  });
}

upload(API_KEY, "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf")
  .catch((e) => console.log("error: " + e.message));
# Python
import requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "*************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

fileName = "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf"

url = "{}/file/upload/url".format(BASE_URL)

# Execute request and get response as JSON.
# Passing the source URL via `params` lets requests percent-encode it.
response = requests.get(url, params={"url": fileName}, headers={"x-api-key": API_KEY})
if response.status_code == 200:
    result = response.json()

    if result["status"] == 200:
        temp_url = result["url"]
        remainingCredits = result["remainingCredits"]
<?
  $apiKey = "***************";
  $fileName = "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf";
  $url = "https://api.pdf.co/v1/file/upload/url?url=" . urlencode($fileName);

  // Create request
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  // Execute request
  $result = curl_exec($curl);
?>


/file/upload/url#

Downloads a file from a source URL and uploads it as a temporary file. Temporary files are automatically and permanently removed after 1 hour.

  • Method: GET

  • Endpoint: /v1/file/upload/url

CURL#

curl --location --request GET 'https://api.pdf.co/v1/file/upload/url?url=https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/sample.pdf' \
--header 'x-api-key: ******************'

Response#

{
    "url": "https://pdf-temp-files.s3.amazonaws.com/97415d1c45a04b29ac42c8dc01883316/sample.pdf",
    "error": false,
    "status": 200,
    "name": "sample.pdf",
    "remainingCredits": 77765
}

See Response Codes.


/file/delete#

Deletes a temporary file that was uploaded by you or generated by the API.

Note

All temporary files are removed automatically after 1 hour. You can call /file/delete to remove temporary files explicitly as soon as you no longer need them.

  • Method: POST

  • Endpoint: /v1/file/delete

Attributes#

| Attribute | Description | Required |
|-----------|-------------|----------|
| url | URL of the previously uploaded temporary file or output file that was generated by the API method. | yes |

Query parameters#

No query parameters accepted.

Payload [3]#

{
    "url": "https://pdf-temp-files.s3.amazonaws.com/b5c1e67d98ab438292ff1fea0c7cdc9d/sample.pdf"
}

Response#

{
    "error": false,
    "status": 200,
    "remainingCredits": 9999986
}

See Response Codes.

CURL#

curl --location --request POST 'https://api.pdf.co/v1/file/delete' \
--header 'x-api-key: *******************' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://pdf-temp-files.s3.amazonaws.com/b5c1e67d98ab438292ff1fea0c7cdc9d/sample.pdf"
}'

Code samples#

// Node.js
var https = require("https");

const API_KEY = "*************************************";


function deleteFile(apiKey, fileUrl) {
  return new Promise((resolve, reject) => {
      // `file/delete` takes a POST request with a JSON body
      let payload = JSON.stringify({ url: fileUrl });
      let reqOptions = {
          host: "api.pdf.co",
          method: "POST",
          path: "/v1/file/delete",
          headers: {
              "x-api-key": apiKey,
              "Content-Type": "application/json",
              "Content-Length": Buffer.byteLength(payload)
          }
      };
      // Send request
      let req = https.request(reqOptions, (response) => {
          // Collect the whole body before parsing it
          let body = "";
          response.on("data", (chunk) => { body += chunk; });
          response.on("end", () => {
              let data = JSON.parse(body);
              if (data.status == 200) {
                  console.log("remainingCredits: " + data.remainingCredits);
                  resolve([data.remainingCredits]);
              }
              else {
                  // Service reported error
                  reject(new Error("deleteFile(): " + data.message));
              }
          });
      });
      req.on("error", (e) => {
          // Request error
          reject(e);
      });
      req.write(payload);
      req.end();
  });
}

deleteFile(API_KEY, "https://pdf-temp-files.s3.amazonaws.com/b5c1e67d98ab438292ff1fea0c7cdc9d/sample.pdf")
  .catch((e) => console.log("error: " + e.message));
# Python
import requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "*************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

fileName = "https://pdf-temp-files.s3.amazonaws.com/b5c1e67d98ab438292ff1fea0c7cdc9d/sample.pdf"

url = "{}/file/delete".format(BASE_URL)

# Execute POST request with a JSON body and get response as JSON
response = requests.post(url, json={"url": fileName}, headers={"x-api-key": API_KEY})
if response.status_code == 200:
    result = response.json()

    if result["status"] == 200:
        remainingCredits = result["remainingCredits"]
<?
  $apiKey = "***************";
  $fileName = "https://pdf-temp-files.s3.amazonaws.com/b5c1e67d98ab438292ff1fea0c7cdc9d/sample.pdf";
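  $url = "https://api.pdf.co/v1/file/delete";

  // Create POST request with a JSON body
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-Type: application/json"));
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_POST, true);
  curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode(array("url" => $fileName)));
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);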
  // Execute request
  $result = curl_exec($curl);
?>


/file/hash#

Calculates and returns the MD5 hash of the file at the given URL. It is commonly used to check whether a source document has changed, because even a small change to the file produces a different hash string.

  • Method: POST

  • Endpoint: /v1/file/hash

Attributes#

| Attribute | Description | Required |
|-----------|-------------|----------|
| url | URL to the source file. [1] | yes |

Query parameters#

No query parameters accepted.

Payload [3]#

{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf"
}

Response#

{
    "hash": "d942e5becdcb0386598cce15e9e56deb1ca9d893b8578a88eca4a62f02c4000b",
    "remainingCredits": 98143
}

See Response Codes.
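The change-detection use case described for this endpoint comes down to comparing the `hash` field of a new response with a value stored earlier. A minimal sketch, assuming the response has already been parsed into a dictionary; the function name `has_changed` is hypothetical.

```python
def has_changed(previous_hash, hash_response):
    """Compare a stored hash with the `hash` field of a /file/hash response.

    previous_hash: the hash saved from an earlier call, or None if the file
    has not been seen before (treated as changed).
    """
    current = hash_response["hash"]
    if previous_hash is None:
        return True
    # Hex digests are compared case-insensitively.
    return current.lower() != previous_hash.lower()
```

Store the returned hash after each check so that the next call has a baseline to compare against.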

CURL#

curl --location --request POST 'https://api.pdf.co/v1/file/hash' \
--header 'x-api-key: *******************' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/sample.pdf"
}'

Code samples#

// Node.js
var https = require("https");

const API_KEY = "*************************************";


function getHash(apiKey, fileUrl) {
  return new Promise((resolve, reject) => {
      // `file/hash` takes a POST request with a JSON body
      let payload = JSON.stringify({ url: fileUrl });
      let reqOptions = {
          host: "api.pdf.co",
          method: "POST",
          path: "/v1/file/hash",
          headers: {
              "x-api-key": apiKey,
              "Content-Type": "application/json",
              "Content-Length": Buffer.byteLength(payload)
          }
      };
      // Send request
      let req = https.request(reqOptions, (response) => {
          // Collect the whole body before parsing it
          let body = "";
          response.on("data", (chunk) => { body += chunk; });
          response.on("end", () => {
              let data = JSON.parse(body);
              if (data.hash) {
                  console.log("hash: " + data.hash);
                  console.log("remainingCredits: " + data.remainingCredits);
                  resolve([data.hash, data.remainingCredits]);
              }
              else {
                  // Service reported error
                  reject(new Error("getHash(): service reported an error"));
              }
          });
      });
      req.on("error", (e) => {
          // Request error
          reject(e);
      });
      req.write(payload);
      req.end();
  });
}

getHash(API_KEY, "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf")
  .catch((e) => console.log("error: " + e.message));
# Python
import requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "*************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

fileName = "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf"

url = "{}/file/hash".format(BASE_URL)

# Execute POST request with a JSON body and get response as JSON
response = requests.post(url, json={"url": fileName}, headers={"x-api-key": API_KEY})
if response.status_code == 200:
    result = response.json()

    if result.get("hash"):
        fileHash = result["hash"]
        remainingCredits = result["remainingCredits"]
<?
  $apiKey = "***************";
  $fileName = "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf";
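  $url = "https://api.pdf.co/v1/file/hash";

  // Create POST request with a JSON body
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-Type: application/json"));
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_POST, true);
  curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode(array("url" => $fileName)));
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);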
  // Execute request
  $result = curl_exec($curl);
?>

Footnotes

1

Supports publicly accessible links from any source, including Google Drive, Dropbox, and PDF.co Built-In Files Storage. To upload files via the API, check out the File Upload section. Note: If you experience intermittent Access Denied or Too Many Requests errors, please try adding cache: to enable built-in URL caching (e.g., cache:https://example.com/file1.pdf). For data security, you have the option to encrypt output files and decrypt input files. Learn more about user-controlled data encryption.

2

The main response codes are as follows:

| Code | Description |
|------|-------------|
| 200 | Success |
| 400 | Bad request. Typically happens because of bad input parameters, or because the input URLs can't be reached, possibly due to access restrictions like needing a login or password. |
| 401 | Unauthorized |
| 402 | Not enough credits |
| 445 | Timeout error. To process large documents or files please use asynchronous mode (set the async parameter to true) and then check status using the /job/check endpoint. If a file contains many pages then specify a page range using the pages parameter. The number of pages of the document can be obtained using the /pdf/info endpoint. |

3

PDF.co request size: API requests larger than 4 megabytes are not supported. Please ensure that your requests do not exceed this limit.