PDF Split#

Note

When splitting a document the pages parameter controls which pages to split out into individual documents. The page limit should not exceed the number of pages in the document - for example, you cannot split a 100 page document into 200 individual documents, however you can split it into 100 individual documents.

Important

The pages parameter is 1-based, meaning the first page is 1 and not 0.

/pdf/split/#

Split a PDF into multiple PDF files using page indexes or page ranges.

Method: POST
Endpoint: /v1/pdf/split/

Attributes#

Note

Attributes are case-sensitive and should be inside JSON for POST request, for example:

{
    "url": "https://example.com/file1.pdf"
}

Attribute	Description	Required
`url`	URL to the source file. 1	yes
`httpusername`	HTTP auth user name if required to access source `url`.	no
`httppassword`	HTTP auth password if required to access source `url`.	no
`pages`	Comma-separated indices of pages (or page ranges) that you want to use. The first-page index is `1`. For example, if you have a 7-page document that you want to be split into 3 separate PDFs with a different number of pages it would go like this: `1, 2, 2-` or `1, 2, 3-7` which will result in 1 PDF with page one, 1 PDF with page two and 1 PDF with the rest of the pages. You can also use inverted page numbers adding `!` before the number. E.g. `!1` means “the last page”, `2-!1` means “from the second to the penultimate page”, and `!1-` - “last one pages”. Also, you can use a single asterisk (`*`) character as the range to split every page of the document into separate PDF files. The input must be in string format.	no
`password`	Password of PDF file, the input must be in string format.	no
`async`	Set `async` to `true` for long processes to run in the background, API will then return a `jobId` which you can use with the Background Job Check endpoint to check the status of the process and retrieve the output while you can proceed with other tasks.	no
`inline`	Set to `true` to return results inside the response. Otherwise, the endpoint will return a link to the output file generated.	no
`name`	File name for the generated output, the input must be in string format.	no
`expiration`	Set the expiration time for the output link in minutes (default is `60` i.e 60 minutes or 1 hour), After this specified duration, any generated output file(s) will be automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration varies based on your current subscription plan. To store permanent input files (e.g. re-usable images, pdf templates, documents) consider using PDF.co Built-In Files Storage.	no
`profiles`	Use this parameter to set additional configurations for fine-tuning and extra options. Explore the Profiles section for more.	no

Query parameters#

No query parameters accepted.

Payload 3 #

{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/sample.pdf",
    "pages": "1-2,3-",
    "inline": true,
    "name": "result.pdf",
    "async": false
}

Response 2 #

{
    "urls": [
        "https://pdf-temp-files.s3.amazonaws.com/1e9a7f2c46834160903276716424382b/result_page1-2.pdf",
        "https://pdf-temp-files.s3.amazonaws.com/c976b9f89a2e460786a3d5c0deeeef67/result_page3-4.pdf"
    ],
    "pageCount": 4,
    "error": false,
    "status": 200,
    "name": "result.pdf",
    "remainingCredits": 98441
}

CURL#

curl --location --request POST 'https://api.pdf.co/v1/pdf/split' \
--header 'Content-Type: application/json' \
--header 'x-api-key: *******************' \
--data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/sample.pdf",
    "pages": "1-2,3-",
    "inline": true,
    "name": "result.pdf",
    "async": false
}'

/pdf/split2/#

Split PDF into multiple PDF files by text search (support regular expressions) or by barcode.

Method: POST
Endpoint: /v1/pdf/split2/

Attributes#

Note

Attributes are case-sensitive and should be inside JSON for POST request, for example:

{
    "url": "https://example.com/file1.pdf"
}

Important

When using regular expressions in JSON payloads, ensure that backslashes are properly escaped. For example, a single backslash \ should be written as \\.

Attribute	Description	Required
`url`	URL to the source file. 1	yes
`httpusername`	HTTP auth user name if required to access source `url`.	no
`httppassword`	HTTP auth password if required to access source `url`.	no
`pages`	Comma-separated indices of pages (or page ranges) that you want to use. The first-page index is always 0. For example, if you have a 7-page document that you want to be split into 3 separate PDFs but a different number of pages it would go like this: 0, 1, 2- or 1, 2, 3-7 which will result in 1 PDF with page one, 1 PDF with page two and one PDF with the rest of the pages. You can also use inverted page numbers adding `!` before the number. E.g. `!0` means “the last page”, `1-!1` means “from the second to the penultimate page”, and `!1-` - “last two pages”. Also, you can use a single asterisk (`*`) character as the range to split the document into separate pages. The input must be in string format.	no
`searchString`	Text to search for on pages. Must be a string. To search for a barcode use the following macros string: `[[barcode:<barcodeTypesSeparatedByComma> <barcodeValue>]]`. To search for barcode type without analyzing its value, use this notation instead: `[[barcode:<barcodeTypesSeparatedByComma>]]`. Example #1, split by QR code: `"searchString": "[[barcode:qrcode]]"`. Example #2, split by QR code with value: `"searchString": "[[barcode:qrcode pdfco]]"`. Example #3, split by QR code with value search with regex: `"searchString": "[[barcode:qrcode /pdf\.co/]]"`. Example #4, split by QR code or datamatrix with value search with regex: `"searchString": "[[barcode:qrcode,datamatrix /pdf\.co/]]"`.	yes
`excludeKeyPages`	Set to `true` if you want to exclude pages where the `searchString` text was found.	no
`regexSearch`	Set to `true` to enable regular expressions for the search string.	no
`caseSensitive`	Set to `true` to enable case-sensitive search.	no
`lang`	Set the language for OCR (text from image) to use for scanned PDF, PNG, and JPG documents input when extracting text. The default is `eng`. Other languages are also supported: `deu`, `spa`, `chi_sim`, `jpn`, and many others, see Language Support. You can also use 2 languages simultaneously like this: `eng+deu` or `jpn+kor` (any combination).	no
`inline`	Set to `true` to return results inside the response. Otherwise, the endpoint will return a link to the output file generated.	no
`async`	Set `async` to `true` for long processes to run in the background, API will then return a `jobId` which you can use with the Background Job Check endpoint to check the status of the process and retrieve the output while you can proceed with other tasks.	no
`name`	File name for the generated output, the input must be in string format.	no
`expiration`	Set the expiration time for the output link in minutes (default is `60` i.e 60 minutes or 1 hour), After this specified duration, any generated output file(s) will be automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration varies based on your current subscription plan. To store permanent input files (e.g. re-usable images, pdf templates, documents) consider using PDF.co Built-In Files Storage.	no
`profiles`	Use this parameter to set additional configurations for fine-tuning and extra options. Explore the Profiles section for more.	no

Query parameters#

No query parameters accepted.

Payload 3 #

{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/split_by_barcode.pdf",
    "searchString": "[[barcode:qrcode,datamatrix /pdf\\.co/]]",
    "excludeKeyPages": true,
    "regexSearch": false,
    "caseSensitive": false,
    "inline": true,
    "name": "output-split-by-barcode",
    "async": false
}

Response 2 #

{
    "urls": [
        "https://pdf-temp-files.s3.us-west-2.amazonaws.com/A2WX2GR0PX4818EIKW96VR3BZTK5FWT2/output-split-by-barcode_page1.pdf?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzEK3%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDH1Gv1Q88EtgGpfAYiKCAaQTLV5ot8KMblEXIEFzeznT8mOeGKylp0uktJk2Se8SK5r3nfQTJKa8JqJE0GcW9vOtcBPPqHcPZXf2iQkvSk3yvFJv6cDj8%2B6kck0Eadz4BOXz0ljrE1Vt%2BX2gItx86Fd8rldFG3TL7u99FKiuc1rN9OaBRJpPHL12fVP2gjuVUUIomqShmQYyKHbhGDuLKoCWq%2BdLkggz2eTJna6w9eWR7QMvpIJxc8sBGFT1WEm%2FsyA%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHORHIVCFW/20220919/us-west-2/s3/aws4_request&X-Amz-Date=20220919T114402Z&X-Amz-SignedHeaders=host&X-Amz-Signature=8241ad05ecb5555cbbd4998b5c334104f2849bf4177384e86fbb5cc5d7e81ce8",
        "https://pdf-temp-files.s3.us-west-2.amazonaws.com/B6Z9J274GZ5BK5QYK547ST4T5WF61LNQ/output-split-by-barcode_page3-5.pdf?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzEK3%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDH1Gv1Q88EtgGpfAYiKCAaQTLV5ot8KMblEXIEFzeznT8mOeGKylp0uktJk2Se8SK5r3nfQTJKa8JqJE0GcW9vOtcBPPqHcPZXf2iQkvSk3yvFJv6cDj8%2B6kck0Eadz4BOXz0ljrE1Vt%2BX2gItx86Fd8rldFG3TL7u99FKiuc1rN9OaBRJpPHL12fVP2gjuVUUIomqShmQYyKHbhGDuLKoCWq%2BdLkggz2eTJna6w9eWR7QMvpIJxc8sBGFT1WEm%2FsyA%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHORHIVCFW/20220919/us-west-2/s3/aws4_request&X-Amz-Date=20220919T114402Z&X-Amz-SignedHeaders=host&X-Amz-Signature=94764cfb37819f2a4885ba064dd1ae20f38f42d6bc6c1a208010637fca74a591",
        "https://pdf-temp-files.s3.us-west-2.amazonaws.com/XT5TD1BDBFDNKX0LM6N5GLFLOAF1UC0Y/output-split-by-barcode_page7-9.pdf?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzEK3%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDH1Gv1Q88EtgGpfAYiKCAaQTLV5ot8KMblEXIEFzeznT8mOeGKylp0uktJk2Se8SK5r3nfQTJKa8JqJE0GcW9vOtcBPPqHcPZXf2iQkvSk3yvFJv6cDj8%2B6kck0Eadz4BOXz0ljrE1Vt%2BX2gItx86Fd8rldFG3TL7u99FKiuc1rN9OaBRJpPHL12fVP2gjuVUUIomqShmQYyKHbhGDuLKoCWq%2BdLkggz2eTJna6w9eWR7QMvpIJxc8sBGFT1WEm%2FsyA%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHORHIVCFW/20220919/us-west-2/s3/aws4_request&X-Amz-Date=20220919T114402Z&X-Amz-SignedHeaders=host&X-Amz-Signature=0a7c90a05fd159659451d29273284fbf422d34bd204c07fbc9abdf7a36a84294"
    ],
    "pageCount": 10,
    "error": false,
    "status": 200,
    "name": "output-split-by-barcode.pdf",
    "credits": 350,
    "duration": 4456,
    "remainingCredits": 98221710
}

CURL#

curl --location --request POST 'https://api.pdf.co/v1/pdf/split2' \
--header 'Content-Type: application/json' \
--header 'x-api-key: *******************' \
--data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/split_by_barcode.pdf",
    "searchString": "[[barcode:qrcode,datamatrix /pdf\\.co/]]",
    "excludeKeyPages": true,
    "regexSearch": false,
    "caseSensitive": false,
    "inline": true,
    "name": "output-split-by-barcode",
    "async": false
}'

Code samples#

/pdf/split/

JavaScript / Node.js

var https = require("https");
var path = require("path");
var fs = require("fs");

// `request` module is required for file upload.
// Use "npm install request" command to install.
var request = require("request");

// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const API_KEY = "***********************************";


// Source PDF file to split
const SourceFile = "./sample.pdf";
// Comma-separated list of page numbers (or ranges) to process. Example: '1,3-5,7-'.
const Pages = "1-2,3-";


// 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.
getPresignedUrl(API_KEY, SourceFile)
    .then(([uploadUrl, uploadedFileUrl]) => {
        // 2. UPLOAD THE FILE TO CLOUD.
        uploadFile(API_KEY, SourceFile, uploadUrl)
            .then(() => {
                // 3. SPLIT UPLOADED PDF
                splitPdf(API_KEY, uploadedFileUrl, Pages);
            })
            .catch(e => {
                console.log(e);
            });
    })
    .catch(e => {
        console.log(e);
    });


function getPresignedUrl(apiKey, localFile) {
    return new Promise(resolve => {
        // Prepare request to `Get Presigned URL` API endpoint
        let queryPath = `/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=${path.basename(SourceFile)}`;
        let reqOptions = {
            host: "api.pdf.co",
            path: encodeURI(queryPath),
            headers: { "x-api-key": API_KEY }
        };
        // Send request
        https.get(reqOptions, (response) => {
            response.on("data", (d) => {
                let data = JSON.parse(d);
                if (data.error == false) {
                    // Return presigned url we received
                    resolve([data.presignedUrl, data.url]);
                }
                else {
                    // Service reported error
                    console.log("getPresignedUrl(): " + data.message);
                }
            });
        })
            .on("error", (e) => {
                // Request error
                console.log("getPresignedUrl(): " + e);
            });
    });
}

function uploadFile(apiKey, localFile, uploadUrl) {
    return new Promise(resolve => {
        fs.readFile(SourceFile, (err, data) => {
            request({
                method: "PUT",
                url: uploadUrl,
                body: data,
                headers: {
                    "Content-Type": "application/octet-stream"
                }
            }, (err, res, body) => {
                if (!err) {
                    resolve();
                }
                else {
                    console.log("uploadFile() request error: " + err);
                }
            });
        });
    });
}

function splitPdf(apiKey, uploadedFileUrl, pages) {
    // Prepare request to `Make Searchable PDF` API endpoint
    var queryPath = `/v1/pdf/split`;

    // JSON payload for api request
    var jsonPayload = JSON.stringify({
        pages: pages, url: uploadedFileUrl, async: true
    });

    var reqOptions = {
        host: "api.pdf.co",
        method: "POST",
        path: queryPath,
        headers: {
            "x-api-key": apiKey,
            "Content-Type": "application/json",
            "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
        }
    };
    // Send request
    var postRequest = https.request(reqOptions, (response) => {
        response.on("data", (d) => {
            response.setEncoding("utf8");
            // Parse JSON response
            let data = JSON.parse(d);
            if (data.error == false) {
                console.log(`Job #${data.jobId} has been created!`);
                checkIfJobIsCompleted(data.jobId, data.url);
            }
            else {
                // Service reported error
                console.log("splitPdf(): " + data.message);
            }
        });
    })
        .on("error", (e) => {
            // Request error
            console.log("splitPdf(): " + e);
        });
    // Write request data
    postRequest.write(jsonPayload);
    postRequest.end();
}

function checkIfJobIsCompleted(jobId, resultFileUrlJson) {
    let queryPath = `/v1/job/check`;

    // JSON payload for api request
    let jsonPayload = JSON.stringify({
        jobid: jobId
    });

    let reqOptions = {
        host: "api.pdf.co",
        path: queryPath,
        method: "POST",
        headers: {
            "x-api-key": API_KEY,
            "Content-Type": "application/json",
            "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
        }
    };

    // Send request
    var postRequest = https.request(reqOptions, (response) => {
        response.on("data", (d) => {
            response.setEncoding("utf8");

            // Parse JSON response
            let data = JSON.parse(d);
            console.log(`Checking Job #${jobId}, Status: ${data.status}, Time: ${new Date().toLocaleString()}`);

            if (data.status == "working") {
                // Check again after 3 seconds
                setTimeout(function () { checkIfJobIsCompleted(jobId, resultFileUrlJson) }, 3000);
            }
            else if (data.status == "success") {

                request({ method: 'GET', uri: resultFileUrlJson, gzip: true },
                    function (error, response, body) {

                        // Parse JSON response
                        let respJsonFileArray = JSON.parse(body);
                        let part = 1;

                        respJsonFileArray.forEach((url) => {
                            var localFileName = `./part${part}.pdf`;
                            var file = fs.createWriteStream(localFileName);
                            https.get(url, (response2) => {
                                response2.pipe(file)
                                    .on("close", () => {
                                        console.log(`Generated PDF file saved as "${localFileName} file."`);
                                    });
                            });
                            part++;
                        }, this);

                    });
            }
            else {
                console.log(`Operation ended with status: "${data.status}".`);
            }
        })
    });

    // Write request data
    postRequest.write(jsonPayload);
    postRequest.end();
}

Python

import os
import requests # pip install requests
import time
import datetime

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "******************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

# Source PDF file
SourceFile = ".\\sample.pdf"
# Comma-separated list of page numbers (or ranges) to process. Example: '1,3-5,7-'.
Pages = "1-2,3-"
# (!) Make asynchronous job
Async = True

def main(args = None):
    uploadedFileUrl = uploadFile(SourceFile)
    if (uploadedFileUrl != None):
        splitPDF(uploadedFileUrl)


def splitPDF(uploadedFileUrl):
    """Split PDF using PDF.co Web API"""

    # Prepare requests params as JSON
    # See documentation: https://developer.pdf.co/
    parameters = {}
    parameters["async"] = Async
    parameters["pages"] = Pages
    parameters["url"] = uploadedFileUrl

    # Prepare URL for 'Split PDF' API request
    url = "{}/pdf/split".format(BASE_URL)

    # Execute request and get response as JSON
    response = requests.post(url, data=parameters, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()

        if json["error"] == False:
            # Asynchronous job ID
            jobId = json["jobId"]
            #  URL of the result file
            resultFilePlaceholder = json["url"]

            # Check the job status in a loop.
            # If you don't want to pause the main thread you can rework the code
            # to use a separate thread for the status checking and completion.
            while True:
                status = checkJobStatus(jobId) # Possible statuses: "working", "failed", "aborted", "success".

                # Display timestamp and status (for demo purposes)
                print(datetime.datetime.now().strftime("%H:%M.%S") + ": " + status)

                if status == "success":

                    resJsonImgFiles = requests.get(resultFilePlaceholder)

                    # Download generated PNG files
                    part = 1

                    for resultFileUrl in resJsonImgFiles.json():
                        # Download Result File
                        r = requests.get(resultFileUrl, stream=True)

                        localFileUrl = f"Page{part}.pdf"

                        if r.status_code == 200:
                            with open(localFileUrl, 'wb') as file:
                                for chunk in r:
                                    file.write(chunk)
                            print(f"Result file saved as \"{localFileUrl}\" file.")
                        else:
                            print(f"Request error: {response.status_code} {response.reason}")

                        part = part + 1

                    break
                elif status == "working":
                    # Pause for a few seconds
                    time.sleep(3)
                else:
                    print(status)
                    break
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")


def checkJobStatus(jobId):
    """Checks server job status"""

    url = f"{BASE_URL}/job/check?jobid={jobId}"

    response = requests.get(url, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()
        return json["status"]
    else:
        print(f"Request error: {response.status_code} {response.reason}")

    return None


def uploadFile(fileName):
    """Uploads file to the cloud"""

    # 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.

    # Prepare URL for 'Get Presigned URL' API request
    url = "{}/file/upload/get-presigned-url?contenttype=application/octet-stream&name={}".format(
        BASE_URL, os.path.basename(fileName))

    # Execute request and get response as JSON
    response = requests.get(url, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()

        if json["error"] == False:
            # URL to use for file upload
            uploadUrl = json["presignedUrl"]
            # URL for future reference
            uploadedFileUrl = json["url"]

            # 2. UPLOAD FILE TO CLOUD.
            with open(fileName, 'rb') as file:
                requests.put(uploadUrl, data=file, headers={ "x-api-key": API_KEY, "content-type": "application/octet-stream" })

            return uploadedFileUrl
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")

    return None


if __name__ == '__main__':
    main()

using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace PDFcoApiExample
{
  class Program
  {
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    const String API_KEY = "***********************************";

    // Source PDF file to split
    const string SourceFile = @".\sample.pdf";
    // Comma-separated list of page numbers (or ranges) to process. Example: '1,3-5,7-'.
    const string Pages = "1-2,3-";


    static void Main(string[] args)
    {
      // Create standard .NET web client instance
      WebClient webClient = new WebClient();

      // Set API Key
      webClient.Headers.Add("x-api-key", API_KEY);

      // 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
      // * If you already have a direct file URL, skip to the step 3.

      // Prepare URL for `Get Presigned URL` API call
      string query = Uri.EscapeUriString(string.Format(
        "https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name={0}",
        Path.GetFileName(SourceFile)));

      try
      {
        // Execute request
        string response = webClient.DownloadString(query);

        // Parse JSON response
        JObject json = JObject.Parse(response);

        if (json["error"].ToObject<bool>() == false)
        {
          // Get URL to use for the file upload
          string uploadUrl = json["presignedUrl"].ToString();
          string uploadedFileUrl = json["url"].ToString();

          // 2. UPLOAD THE FILE TO CLOUD.

          webClient.Headers.Add("content-type", "application/octet-stream");
          webClient.UploadFile(uploadUrl, "PUT", SourceFile); // You can use UploadData() instead if your file is byte[] or Stream
          webClient.Headers.Remove("content-type");

          // 3. SPLIT UPLOADED PDF

          // URL for `Split PDF` API call
          var url = "https://api.pdf.co/v1/pdf/split";

          // Prepare requests params as JSON
          Dictionary<string, object> parameters = new Dictionary<string, object>();
          parameters.Add("pages", Pages);
          parameters.Add("url", uploadedFileUrl);

          // Convert dictionary of params to JSON
          string jsonPayload = JsonConvert.SerializeObject(parameters);

          // Execute POST request with JSON payload
          response = webClient.UploadString(url, jsonPayload);

          // Parse JSON response
          json = JObject.Parse(response);

          if (json["error"].ToObject<bool>() == false)
          {
            // Download generated PDF files
            int part = 1;
            foreach (JToken token in json["urls"])
            {
              string resultFileUrl = token.ToString();
              string localFileName = String.Format(@".\part{0}.pdf", part);

              webClient.DownloadFile(resultFileUrl, localFileName);

              Console.WriteLine("Downloaded \"{0}\".", localFileName);
              part++;
            }
          }
          else
          {
            Console.WriteLine(json["message"].ToString());
          }
        }
        else
        {
          Console.WriteLine(json["message"].ToString());
        }
      }
      catch (WebException e)
      {
        Console.WriteLine(e.ToString());
      }

      webClient.Dispose();


      Console.WriteLine();
      Console.WriteLine("Press any key...");
      Console.ReadKey();
    }
  }
}

Java

package com.company;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import okhttp3.*;

import java.io.*;
import java.net.*;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main
{
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    final static String API_KEY = "***********************************";

    // Source PDF file to split
    final static Path SourceFile = Paths.get(".\\sample.pdf");
    // Comma-separated list of page numbers (or ranges) to process. Example: '1,3-5,7-'.
    final static String Pages = "1-2,3-";


    public static void main(String[] args) throws IOException
    {
        // Create HTTP client instance
        OkHttpClient webClient = new OkHttpClient();

        // 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
        // * If you already have a direct file URL, skip to the step 3.

        // Prepare URL for `Get Presigned URL` API call
        String query = String.format(
                "https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=%s",
                SourceFile.getFileName());

        // Prepare request
        Request request = new Request.Builder()
                .url(query)
                .addHeader("x-api-key", API_KEY) // (!) Set API Key
                .build();
        // Execute request
        Response response = webClient.newCall(request).execute();

        if (response.code() == 200)
        {
            // Parse JSON response
            JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

            boolean error = json.get("error").getAsBoolean();
            if (!error)
            {
                // Get URL to use for the file upload
                String uploadUrl = json.get("presignedUrl").getAsString();
                // Get URL of uploaded file to use with later API calls
                String uploadedFileUrl = json.get("url").getAsString();

                // 2. UPLOAD THE FILE TO CLOUD.

                if (uploadFile(webClient, API_KEY, uploadUrl, SourceFile))
                {
                    // 3. SPLIT UPLOADED PDF

                    SplitPdf(webClient, API_KEY, Pages, uploadedFileUrl);
                }
            }
            else
            {
                // Display service reported error
                System.out.println(json.get("message").getAsString());
            }
        }
        else
        {
            // Display request error
            System.out.println(response.code() + " " + response.message());
        }
    }

    public static void SplitPdf(OkHttpClient webClient, String apiKey, String pages, String uploadedFileUrl) throws IOException
    {
        // Prepare URL for `Split PDF` API call
        String query = "https://api.pdf.co/v1/pdf/split";

        // Make correctly escaped (encoded) URL
        URL url = null;
        try
        {
            url = new URI(null, query, null).toURL();
        }
        catch (URISyntaxException e)
        {
            e.printStackTrace();
        }

        // Create JSON payload
    String jsonPayload = String.format("{\"pages\": \"%s\", \"url\": \"%s\"}",
                pages,
                uploadedFileUrl);

        // Prepare request body
        RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonPayload);

        // Prepare request
        Request request = new Request.Builder()
            .url(url)
            .addHeader("x-api-key", API_KEY) // (!) Set API Key
            .addHeader("Content-Type", "application/json")
            .post(body)
            .build();

        // Execute request
        Response response = webClient.newCall(request).execute();


        if (response.code() == 200)
        {
            // Parse JSON response
            JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

            boolean error = json.get("error").getAsBoolean();
            if (!error)
            {
                // Download generated PDF files
                JsonArray urls = json.get("urls").getAsJsonArray();

                int part = 1;
                for (JsonElement element: urls)
                {
                    String resultFileUrl = element.getAsString();
                    String localFileName = String.format(".\\part%s.pdf", part);

                    downloadFile(webClient, resultFileUrl, Paths.get(localFileName).toFile());

                    System.out.println(String.format("Splitted part saved as \"%s\".", localFileName));
                    part++;
                }
            }
            else
            {
                // Display service reported error
                System.out.println(json.get("message").getAsString());
            }
        }
        else
        {
            // Display request error
            System.out.println(response.code() + " " + response.message());
        }
    }

    public static boolean uploadFile(OkHttpClient webClient, String apiKey, String url, Path sourceFile) throws IOException
    {
        // Prepare request body
        RequestBody body = RequestBody.create(MediaType.parse("application/octet-stream"), sourceFile.toFile());

        // Prepare request
        Request request = new Request.Builder()
                .url(url)
                .addHeader("x-api-key", apiKey) // (!) Set API Key
                .addHeader("content-type", "application/octet-stream")
                .put(body)
                .build();

        // Execute request
        Response response = webClient.newCall(request).execute();

        return (response.code() == 200);
    }

    public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
    {
        // Prepare request
        Request request = new Request.Builder()
                .url(url)
                .build();
        // Execute request
        Response response = webClient.newCall(request).execute();

        byte[] fileBytes = response.body().bytes();

        // Save downloaded bytes to file
        OutputStream output = new FileOutputStream(destinationFile);
        output.write(fileBytes);
        output.flush();
        output.close();

        response.close();
    }
}

PHP

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>PDF Splitting Results</title>
</head>
<body>

<?php
// Note: If you have input files large than 200kb we highly recommend to check "async" mode example.

// Get submitted form data
$apiKey = $_POST["apiKey"]; // The authentication key (API Key). Get your own by registering at https://app.pdf.co
$pages = $_POST["pages"];


// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have a direct PDF file link, go to the step 3.

// Create URL
$url = "https://api.pdf.co/v1/file/upload/get-presigned-url" .
    "?name=" . urlencode($_FILES["file"]["name"]) .
    "&contenttype=application/octet-stream";

// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Execute request
$result = curl_exec($curl);

if (curl_errno($curl) == 0)
{
    $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

    if ($status_code == 200)
    {
        $json = json_decode($result, true);

        // Get URL to use for the file upload
        $uploadFileUrl = $json["presignedUrl"];
        // Get URL of uploaded file to use with later API calls
        $accessFileUrl = $json["url"];

        // 2. UPLOAD THE FILE TO CLOUD.

        $localFile = $_FILES["file"]["tmp_name"];
        $fileHandle = fopen($localFile, "r");

        curl_setopt($curl, CURLOPT_URL, $uploadFileUrl);
        curl_setopt($curl, CURLOPT_HTTPHEADER, array("content-type: application/octet-stream"));
        curl_setopt($curl, CURLOPT_PUT, true);
        curl_setopt($curl, CURLOPT_INFILE, $fileHandle);
        curl_setopt($curl, CURLOPT_INFILESIZE, filesize($localFile));

        // Execute request
        curl_exec($curl);

        fclose($fileHandle);

        if (curl_errno($curl))
        {
            // Display request error
            echo "Error: " . curl_error($curl);
        }
        else
        {
            $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

            if ($status_code == 200)
            {
                // 3. SPLIT UPLOADED PDF DOCUMENT
                SplitPdf($apiKey, $accessFileUrl, $pages);
            }
            else
            {
                // Display service reported error
                echo "<p>Status code: " . $status_code . "</p>";
                echo "<p>" . $result . "</p>";
            }
        }
    }
    else
    {
        // Display service reported error
        echo "<p>Status code: " . $status_code . "</p>";
        echo "<p>" . $result . "</p>";
    }

    curl_close($curl);
}
else
{
    // Display CURL error
    echo "Error: " . curl_error($curl);
}


function SplitPdf($apiKey, $fileUrl, $pages)
{
    // Prepare URL for `Split PDF` API call
    $url = "https://api.pdf.co/v1/pdf/split";

    // Prepare requests params
    $parameters = array();
    $parameters["name"] = "part.pdf";
    $parameters["url"] = $fileUrl;
    $parameters["pages"] = $pages;

    // Create Json payload
    $data = json_encode($parameters);

    // Create request
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

    // Execute request
    $result = curl_exec($curl);

    if (curl_errno($curl) == 0)
    {
        $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if ($status_code == 200)
        {
            $json = json_decode($result, true);

            if (!isset($json["error"]) || $json["error"] == false)
            {
                // Display links to splitted parts
                $resultFiles = $json["urls"];
                foreach ($resultFiles as &$resultFileUrl)
                    echo "<p><a href=" . $resultFileUrl . ">" . $resultFileUrl . "</p>";
            }
            else
            {
                // Display service reported error
                echo "<p>Error: " . $json["message"] . "</p>";
            }
        }
        else
        {
            // Display request error
            echo "<p>Status code: " . $status_code . "</p>";
            echo "<p>" . $result . "</p>";
        }
    }
    else
    {
        // Display request error
        echo "Error: " . curl_error($curl);
    }

    // Cleanup
    curl_close($curl);
}

?>

</body>
</html>

/pdf/split2/

JavaScript / Node.js

var https = require("https");
var path = require("path");
var fs = require("fs");

// `request` module is required for file upload.
// Use "npm install request" command to install.
var request = require("request");

// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const API_KEY = "***********************************";


// Source PDF file to split
const SourceFile = "./sample.pdf";
// Split Search String
const SplitText = "invoice number";


// 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.
getPresignedUrl(API_KEY, SourceFile)
    .then(([uploadUrl, uploadedFileUrl]) => {
        // 2. UPLOAD THE FILE TO CLOUD.
        uploadFile(API_KEY, SourceFile, uploadUrl)
            .then(() => {
                // 3. SPLIT UPLOADED PDF
                splitPdf(API_KEY, uploadedFileUrl, SplitText);
            })
            .catch(e => {
                console.log(e);
            });
    })
    .catch(e => {
        console.log(e);
    });


function getPresignedUrl(apiKey, localFile) {
    return new Promise(resolve => {
        // Prepare request to `Get Presigned URL` API endpoint
        let queryPath = `/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=${path.basename(SourceFile)}`;
        let reqOptions = {
            host: "api.pdf.co",
            path: encodeURI(queryPath),
            headers: { "x-api-key": API_KEY }
        };
        // Send request
        https.get(reqOptions, (response) => {
            response.on("data", (d) => {
                let data = JSON.parse(d);
                if (data.error == false) {
                    // Return presigned url we received
                    resolve([data.presignedUrl, data.url]);
                }
                else {
                    // Service reported error
                    console.log("getPresignedUrl(): " + data.message);
                }
            });
        })
            .on("error", (e) => {
                // Request error
                console.log("getPresignedUrl(): " + e);
            });
    });
}

function uploadFile(apiKey, localFile, uploadUrl) {
    return new Promise(resolve => {
        fs.readFile(SourceFile, (err, data) => {
            request({
                method: "PUT",
                url: uploadUrl,
                body: data,
                headers: {
                    "Content-Type": "application/octet-stream"
                }
            }, (err, res, body) => {
                if (!err) {
                    resolve();
                }
                else {
                    console.log("uploadFile() request error: " + err);
                }
            });
        });
    });
}

function splitPdf(apiKey, uploadedFileUrl, splitText) {
    // Prepare request to `Split PDF By Text` API endpoint
    var queryPath = `/v1/pdf/split2`;

    // JSON payload for api request
    var jsonPayload = JSON.stringify({
        searchString: splitText, url: uploadedFileUrl, async: true
    });

    var reqOptions = {
        host: "api.pdf.co",
        method: "POST",
        path: queryPath,
        headers: {
            "x-api-key": apiKey,
            "Content-Type": "application/json",
            "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
        }
    };
    // Send request
    var postRequest = https.request(reqOptions, (response) => {
        response.on("data", (d) => {
            response.setEncoding("utf8");
            // Parse JSON response
            let data = JSON.parse(d);
            if (data.error == false) {
                console.log(`Job #${data.jobId} has been created!`);
                checkIfJobIsCompleted(data.jobId, data.url);
            }
            else {
                // Service reported error
                console.log("splitPdf(): " + data.message);
            }
        });
    })
        .on("error", (e) => {
            // Request error
            console.log("splitPdf(): " + e);
        });
    // Write request data
    postRequest.write(jsonPayload);
    postRequest.end();
}

function checkIfJobIsCompleted(jobId, resultFileUrlJson) {
    let queryPath = `/v1/job/check`;

    // JSON payload for api request
    let jsonPayload = JSON.stringify({
        jobid: jobId
    });

    let reqOptions = {
        host: "api.pdf.co",
        path: queryPath,
        method: "POST",
        headers: {
            "x-api-key": API_KEY,
            "Content-Type": "application/json",
            "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
        }
    };

    // Send request
    var postRequest = https.request(reqOptions, (response) => {
        response.on("data", (d) => {
            response.setEncoding("utf8");

            // Parse JSON response
            let data = JSON.parse(d);
            console.log(`Checking Job #${jobId}, Status: ${data.status}, Time: ${new Date().toLocaleString()}`);

            if (data.status == "working") {
                // Check again after 3 seconds
                setTimeout(function () { checkIfJobIsCompleted(jobId, resultFileUrlJson) }, 3000);
            }
            else if (data.status == "success") {

                request({ method: 'GET', uri: resultFileUrlJson, gzip: true },
                    function (error, response, body) {

                        // Parse JSON response
                        let respJsonFileArray = JSON.parse(body);
                        let part = 1;

                        respJsonFileArray.forEach((url) => {
                            var localFileName = `./part${part}.pdf`;
                            var file = fs.createWriteStream(localFileName);
                            https.get(url, (response2) => {
                                response2.pipe(file)
                                    .on("close", () => {
                                        console.log(`Generated PDF file saved as "${localFileName} file."`);
                                    });
                            });
                            part++;
                        }, this);

                    });
            }
            else {
                console.log(`Operation ended with status: "${data.status}".`);
            }
        })
    });

    // Write request data
    postRequest.write(jsonPayload);
    postRequest.end();
}

Python

import os
import requests # pip install requests
import time
import datetime

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "******************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

# Source PDF file
SourceFile = ".\\sample.pdf"
# Split by Text
SplitText = "invoice number"
# (!) Make asynchronous job
Async = True

def main(args = None):
    uploadedFileUrl = uploadFile(SourceFile)
    if (uploadedFileUrl != None):
        splitPDF(uploadedFileUrl)


def splitPDF(uploadedFileUrl):
    """Split PDF using PDF.co Web API"""

    # Prepare requests params as JSON
    # See documentation: https://developer.pdf.co/
    parameters = {}
    parameters["async"] = Async
    parameters["searchString"] = SplitText
    parameters["url"] = uploadedFileUrl

    # Prepare URL for 'Split PDF By Text' API request
    url = "{}/pdf/split2".format(BASE_URL)

    # Execute request and get response as JSON
    response = requests.post(url, data=parameters, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()

        if json["error"] == False:
            # Asynchronous job ID
            jobId = json["jobId"]
            #  URL of the result file
            resultFilePlaceholder = json["url"]

            # Check the job status in a loop.
            # If you don't want to pause the main thread you can rework the code
            # to use a separate thread for the status checking and completion.
            while True:
                status = checkJobStatus(jobId) # Possible statuses: "working", "failed", "aborted", "success".

                # Display timestamp and status (for demo purposes)
                print(datetime.datetime.now().strftime("%H:%M.%S") + ": " + status)

                if status == "success":

                    resJsonImgFiles = requests.get(resultFilePlaceholder)

                    # Download generated files
                    part = 1

                    for resultFileUrl in resJsonImgFiles.json():
                        # Download Result File
                        r = requests.get(resultFileUrl, stream=True)

                        localFileUrl = f"Page{part}.pdf"

                        if r.status_code == 200:
                            with open(localFileUrl, 'wb') as file:
                                for chunk in r:
                                    file.write(chunk)
                            print(f"Result file saved as \"{localFileUrl}\" file.")
                        else:
                            print(f"Request error: {response.status_code} {response.reason}")

                        part = part + 1

                    break
                elif status == "working":
                    # Pause for a few seconds
                    time.sleep(3)
                else:
                    print(status)
                    break
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")


def checkJobStatus(jobId):
    """Checks server job status"""

    url = f"{BASE_URL}/job/check?jobid={jobId}"

    response = requests.get(url, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()
        return json["status"]
    else:
        print(f"Request error: {response.status_code} {response.reason}")

    return None


def uploadFile(fileName):
    """Uploads file to the cloud"""

    # 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.

    # Prepare URL for 'Get Presigned URL' API request
    url = "{}/file/upload/get-presigned-url?contenttype=application/octet-stream&name={}".format(
        BASE_URL, os.path.basename(fileName))

    # Execute request and get response as JSON
    response = requests.get(url, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()

        if json["error"] == False:
            # URL to use for file upload
            uploadUrl = json["presignedUrl"]
            # URL for future reference
            uploadedFileUrl = json["url"]

            # 2. UPLOAD FILE TO CLOUD.
            with open(fileName, 'rb') as file:
                requests.put(uploadUrl, data=file, headers={ "x-api-key": API_KEY, "content-type": "application/octet-stream" })

            return uploadedFileUrl
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")

    return None


if __name__ == '__main__':
    main()

using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace PDFcoApiExample
{
  class Program
  {
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    const String API_KEY = "***********************************";

    // Source PDF file to split
    const string SourceFile = @".\sample.pdf";

    static void Main(string[] args)
    {
      // Create standard .NET web client instance
      WebClient webClient = new WebClient();

      // Set API Key
      webClient.Headers.Add("x-api-key", API_KEY);

      // 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
      // * If you already have a direct file URL, skip to the step 3.

      // Prepare URL for `Get Presigned URL` API call
      string query = Uri.EscapeUriString(string.Format(
        "https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name={0}",
        Path.GetFileName(SourceFile)));

      try
      {
        // Execute request
        string response = webClient.DownloadString(query);

        // Parse JSON response
        JObject json = JObject.Parse(response);

        if (json["error"].ToObject<bool>() == false)
        {
          // Get URL to use for the file upload
          string uploadUrl = json["presignedUrl"].ToString();
          string uploadedFileUrl = json["url"].ToString();

          // 2. UPLOAD THE FILE TO CLOUD.

          webClient.Headers.Add("content-type", "application/octet-stream");
          webClient.UploadFile(uploadUrl, "PUT", SourceFile); // You can use UploadData() instead if your file is byte[] or Stream
          webClient.Headers.Remove("content-type");

          // 3. SPLIT UPLOADED PDF By Text

          // URL for `Split PDF By Text` API call
          var url = "https://api.pdf.co/v1/pdf/split2";

          // Prepare requests params as JSON
          Dictionary<string, object> parameters = new Dictionary<string, object>();
          parameters.Add("searchString", "invoice number");
          parameters.Add("url", uploadedFileUrl);

          // Convert dictionary of params to JSON
          string jsonPayload = JsonConvert.SerializeObject(parameters);

          // Execute POST request with JSON payload
          response = webClient.UploadString(url, jsonPayload);

          // Parse JSON response
          json = JObject.Parse(response);

          if (json["error"].ToObject<bool>() == false)
          {
            // Download generated PDF files
            int part = 1;
            foreach (JToken token in json["urls"])
            {
              string resultFileUrl = token.ToString();
              string localFileName = String.Format(@".\part{0}.pdf", part);

              webClient.DownloadFile(resultFileUrl, localFileName);

              Console.WriteLine("Downloaded \"{0}\".", localFileName);
              part++;
            }
          }
          else
          {
            Console.WriteLine(json["message"].ToString());
          }
        }
        else
        {
          Console.WriteLine(json["message"].ToString());
        }
      }
      catch (WebException e)
      {
        Console.WriteLine(e.ToString());
      }

      webClient.Dispose();


      Console.WriteLine();
      Console.WriteLine("Press any key...");
      Console.ReadKey();
    }
  }
}

Java

package com.company;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import okhttp3.*;

import java.io.*;
import java.net.*;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main
{
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    final static String API_KEY = "***********************************";

    // Source PDF file to split
    final static Path SourceFile = Paths.get(".\\sample.pdf");

    // Split By Text
    final static String SplitText = "invoice number";

    public static void main(String[] args) throws IOException
    {
        // Create HTTP client instance
        OkHttpClient webClient = new OkHttpClient();

        // 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
        // * If you already have a direct file URL, skip to the step 3.

        // Prepare URL for `Get Presigned URL` API call
        String query = String.format(
                "https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=%s",
                SourceFile.getFileName());

        // Prepare request
        Request request = new Request.Builder()
                .url(query)
                .addHeader("x-api-key", API_KEY) // (!) Set API Key
                .build();
        // Execute request
        Response response = webClient.newCall(request).execute();

        if (response.code() == 200)
        {
            // Parse JSON response
            JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

            boolean error = json.get("error").getAsBoolean();
            if (!error)
            {
                // Get URL to use for the file upload
                String uploadUrl = json.get("presignedUrl").getAsString();
                // Get URL of uploaded file to use with later API calls
                String uploadedFileUrl = json.get("url").getAsString();

                // 2. UPLOAD THE FILE TO CLOUD.

                if (uploadFile(webClient, API_KEY, uploadUrl, SourceFile))
                {
                    // 3. SPLIT UPLOADED PDF

                    SplitPdf(webClient, API_KEY, SplitText, uploadedFileUrl);
                }
            }
            else
            {
                // Display service reported error
                System.out.println(json.get("message").getAsString());
            }
        }
        else
        {
            // Display request error
            System.out.println(response.code() + " " + response.message());
        }
    }

    public static void SplitPdf(OkHttpClient webClient, String apiKey, String splitText, String uploadedFileUrl) throws IOException
    {
        // Prepare URL for `Split PDF By Text` API call
        String query = "https://api.pdf.co/v1/pdf/split2";

        // Make correctly escaped (encoded) URL
        URL url = null;
        try
        {
            url = new URI(null, query, null).toURL();
        }
        catch (URISyntaxException e)
        {
            e.printStackTrace();
        }

        // Create JSON payload
    String jsonPayload = String.format("{\"searchString\": \"%s\", \"url\": \"%s\"}",
                splitText,
                uploadedFileUrl);

        // Prepare request body
        RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonPayload);

        // Prepare request
        Request request = new Request.Builder()
            .url(url)
            .addHeader("x-api-key", API_KEY) // (!) Set API Key
            .addHeader("Content-Type", "application/json")
            .post(body)
            .build();

        // Execute request
        Response response = webClient.newCall(request).execute();


        if (response.code() == 200)
        {
            // Parse JSON response
            JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

            boolean error = json.get("error").getAsBoolean();
            if (!error)
            {
                // Download generated PDF files
                JsonArray urls = json.get("urls").getAsJsonArray();

                int part = 1;
                for (JsonElement element: urls)
                {
                    String resultFileUrl = element.getAsString();
                    String localFileName = String.format(".\\part%s.pdf", part);

                    downloadFile(webClient, resultFileUrl, Paths.get(localFileName).toFile());

                    System.out.println(String.format("Splitted part saved as \"%s\".", localFileName));
                    part++;
                }
            }
            else
            {
                // Display service reported error
                System.out.println(json.get("message").getAsString());
            }
        }
        else
        {
            // Display request error
            System.out.println(response.code() + " " + response.message());
        }
    }

    public static boolean uploadFile(OkHttpClient webClient, String apiKey, String url, Path sourceFile) throws IOException
    {
        // Prepare request body
        RequestBody body = RequestBody.create(MediaType.parse("application/octet-stream"), sourceFile.toFile());

        // Prepare request
        Request request = new Request.Builder()
                .url(url)
                .addHeader("x-api-key", apiKey) // (!) Set API Key
                .addHeader("content-type", "application/octet-stream")
                .put(body)
                .build();

        // Execute request
        Response response = webClient.newCall(request).execute();

        return (response.code() == 200);
    }

    public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
    {
        // Prepare request
        Request request = new Request.Builder()
                .url(url)
                .build();
        // Execute request
        Response response = webClient.newCall(request).execute();

        byte[] fileBytes = response.body().bytes();

        // Save downloaded bytes to file
        OutputStream output = new FileOutputStream(destinationFile);
        output.write(fileBytes);
        output.flush();
        output.close();

        response.close();
    }
}

PHP

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>PDF Splitting Results</title>
</head>
<body>

<?php
// Note: If you have input files large than 200kb we highly recommend to check "async" mode example.

// Get submitted form data
$apiKey = $_POST["apiKey"]; // The authentication key (API Key). Get your own by registering at https://app.pdf.co
$splitText = $_POST["splitText"];


// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have a direct PDF file link, go to the step 3.

// Create URL
$url = "https://api.pdf.co/v1/file/upload/get-presigned-url" .
    "?name=" . urlencode($_FILES["file"]["name"]) .
    "&contenttype=application/octet-stream";

// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Execute request
$result = curl_exec($curl);

if (curl_errno($curl) == 0)
{
    $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

    if ($status_code == 200)
    {
        $json = json_decode($result, true);

        // Get URL to use for the file upload
        $uploadFileUrl = $json["presignedUrl"];
        // Get URL of uploaded file to use with later API calls
        $accessFileUrl = $json["url"];

        // 2. UPLOAD THE FILE TO CLOUD.

        $localFile = $_FILES["file"]["tmp_name"];
        $fileHandle = fopen($localFile, "r");

        curl_setopt($curl, CURLOPT_URL, $uploadFileUrl);
        curl_setopt($curl, CURLOPT_HTTPHEADER, array("content-type: application/octet-stream"));
        curl_setopt($curl, CURLOPT_PUT, true);
        curl_setopt($curl, CURLOPT_INFILE, $fileHandle);
        curl_setopt($curl, CURLOPT_INFILESIZE, filesize($localFile));

        // Execute request
        curl_exec($curl);

        fclose($fileHandle);

        if (curl_errno($curl))
        {
            // Display request error
            echo "Error: " . curl_error($curl);
        }
        else
        {
            $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

            if ($status_code == 200)
            {
                // 3. SPLIT UPLOADED PDF DOCUMENT
                SplitPdf($apiKey, $accessFileUrl, $splitText);
            }
            else
            {
                // Display service reported error
                echo "<p>Status code: " . $status_code . "</p>";
                echo "<p>" . $result . "</p>";
            }
        }
    }
    else
    {
        // Display service reported error
        echo "<p>Status code: " . $status_code . "</p>";
        echo "<p>" . $result . "</p>";
    }

    curl_close($curl);
}
else
{
    // Display CURL error
    echo "Error: " . curl_error($curl);
}


function SplitPdf($apiKey, $fileUrl, $splitText)
{
    // Prepare URL for `Split PDF By Text` API call
    $url = "https://api.pdf.co/v1/pdf/split2";

    // Prepare requests params
    $parameters = array();
    $parameters["name"] = "part.pdf";
    $parameters["url"] = $fileUrl;
    $parameters["searchString"] = $splitText;

    // Create Json payload
    $data = json_encode($parameters);

    // Create request
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

    // Execute request
    $result = curl_exec($curl);

    if (curl_errno($curl) == 0)
    {
        $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if ($status_code == 200)
        {
            $json = json_decode($result, true);

            if (!isset($json["error"]) || $json["error"] == false)
            {
                // Display links to splitted parts
                $resultFiles = $json["urls"];
                foreach ($resultFiles as &$resultFileUrl)
                    echo "<p><a href=" . $resultFileUrl . ">" . $resultFileUrl . "</p>";
            }
            else
            {
                // Display service reported error
                echo "<p>Error: " . $json["message"] . "</p>";
            }
        }
        else
        {
            // Display request error
            echo "<p>Status code: " . $status_code . "</p>";
            echo "<p>" . $result . "</p>";
        }
    }
    else
    {
        // Display request error
        echo "Error: " . curl_error($curl);
    }

    // Cleanup
    curl_close($curl);
}

?>

</body>
</html>

On Github#

Footnotes

1(1,2)

Supports publicly accessible links from any source, including Google Drive, Dropbox, and PDF.co Built-In Files Storage. To upload files via the API, check out the File Upload section. Note: If you experience intermittent Access Denied or Too Many Requests errors, please try adding cache: to enable built-in URL caching (e.g., cache:https://example.com/file1.pdf). For data security, you have the option to encrypt output files and decrypt input files. Learn more about user-controlled data encryption.

2(1,2)

Main response codes as follows:

Code	Description
`200`	Success
`400`	Bad request. Typically happens because of bad input parameters, or because the input URLs can’t be reached, possibly due to access restrictions like needing a login or password.
`401`	Unauthorized
`402`	Not enough credits
`445`	Timeout error. To process large documents or files please use asynchronous mode (set the `async` parameter to `true`) and then check status using the /job/check endpoint. If a file contains many pages then specify a page range using the `pages` parameter. The number of pages of the document can be obtained using the /pdf/info endpoint.

Note

For more see the complete list of available response codes.

3(1,2)

PDF.co Request size: API requests do not support request sizes of more than 4 megabytes in size. Please ensure that request sizes do not exceed this limit.

Was this page helpful?

PDF Split#

Available Methods#

/pdf/split/#

Attributes#

Query parameters#

Payload 3#

Response 2#

CURL#

/pdf/split2/#

Attributes#

Query parameters#

Payload 3#

Response 2#

CURL#

Code samples#

On Github#

Are you a human?

Payload 3 #

Response 2 #

Payload 3 #

Response 2 #