PDF Search and Delete Text#

Available Methods#

/pdf/edit/delete-text#

Search and remove text in a PDF.

  • Method: POST

  • Endpoint: /v1/pdf/edit/delete-text

Attributes#

Note

Attributes are case-sensitive and should be inside JSON for POST request, for example:

{
    "url": "https://example.com/file1.pdf"
}

Attribute

Description

Required

url

URL to the source file. 1

yes

httpusername

HTTP auth user name if required to access source url.

no

httppassword

HTTP auth password if required to access source url.

no

searchStrings[]

The array of strings to search.

yes

replacementLimit

Limit the number of searches & replacements for every item. The default value is 0 which means unlimited searches and replacements so every found occurrence will be replaced.

no

caseSensitive

Set to false to use case-insensitive search.

no

regex

Set to true to use regular expression for search string(s).

no

name

File name for the generated output, the input must be in string format.

no

expiration

Set the expiration time for the output link in minutes (default is 60 i.e 60 minutes or 1 hour), After this specified duration, any generated output file(s) will be automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration varies based on your current subscription plan. To store permanent input files (e.g. re-usable images, pdf templates, documents) consider using PDF.co Built-In Files Storage.

no

pages

Specify page indices as comma-separated values or ranges to process (e.g. "0, 1, 2-" or "1, 2, 3-7"). The first-page index is 0, Use "!" before a number for inverted page numbers (e.g. "!0" for the last page). If not specified, the default configuration processes all pages. The input must be in string format.

no

password

Password of PDF file, the input must be in string format.

no

async

Set async to true for long processes to run in the background, API will then return a jobId which you can use with the Background Job Check endpoint to check the status of the process and retrieve the output while you can proceed with other tasks.

no

profiles

Use this parameter to set additional configurations for fine-tuning and extra options. Explore the Profiles section for more.

no

Note

To apply a block out color to the text which was deleted use the profiles attribute with, e.g.:

"profiles": "{'UsePatch': true, 'PatchColor': '#000000', 'RemoveTextUnderPatch': true}"

Query parameters#

No query parameters accepted.

Payload 3#

{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
    "name": "pdfWithTextDeleted",
    "caseSensitive": "false",
    "searchString": "Invoice",
    "replacementLimit": 0,
    "async": false
}

Response 2#

{
    "url": "https://pdf-temp-files.s3.us-west-2.amazonaws.com/ZOSEQZFNVCYLD5N5CJFVIYQKBVLR8OKD/pdfWithTextDeleted.pdf?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzECYaDKOO4WmO5C5shyOYYSKCAVsAo6VkB5HQjTBd9dMlJujQdEkPfNdPeLfq2mF54s2ESZBmIAJ5UgDUo3J9R475CCS4M3nuuo%2FSJwRy5gNiJdb1ZY0uCtP87x83nH%2B%2BSDu5JK%2F%2BEOrd3MREt8KE3BsQOrv%2FKMdnK%2BT5nJ2x2hC87vHue%2FudY7%2FWX54vx4tfFobEyhEozLbPnwYyKOdEsYYWH7e8tm7XV4UeKxCoKMaXSEPvOod80hR62qXnEI42fOsON3M%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHLUVIAIPX/20230220/us-west-2/s3/aws4_request&X-Amz-Date=20230220T205521Z&X-Amz-SignedHeaders=host&X-Amz-Signature=9f79c1a30d4f373e495e735e908375dad2ae6dcafcee761a477748c2b8298605",
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "pdfWithTextDeleted.pdf",
    "credits": 21,
    "duration": 189,
    "remainingCredits": 96235635
}

CURL#

curl --location --request POST 'https://api.pdf.co/v1/pdf/edit/delete-text' \
--header 'Content-Type: application/json' \
--header 'x-api-key: *******************' \
--data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
    "name": "pdfWithTextDeleted",
    "caseSensitive": "false",
    "searchString": "Invoice",
    "replacementLimit": 0,
    "async": false
}'


Code samples#

var https = require("https");
var path = require("path");
var fs = require("fs");


// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const API_KEY = "***********************************";


// Direct URL of source PDF file.
// You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
const SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
// PDF document password. Leave empty for unprotected documents.
const Password = "";
// Destination PDF file name
const DestinationFile = "./result.pdf";


// Prepare request to `Delete Text from PDF` API endpoint
var queryPath = `/v1/pdf/edit/delete-text`;
// JSON payload for api request
var jsonPayload = JSON.stringify({
    name: path.basename(destinationFile), password: password, url: SourceFileUrl, searchString: 'conspicuous'
});

var reqOptions = {
    host: "api.pdf.co",
    method: "POST",
    path: queryPath,
    headers: {
        "x-api-key": API_KEY,
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
    }
};

// Send request
var postRequest = https.request(reqOptions, (response) => {
    response.on("data", (d) => {
        // Parse JSON response
        var data = JSON.parse(d);
        if (data.error == false) {
            // Download PDF file
            var file = fs.createWriteStream(DestinationFile);
            https.get(data.url, (response2) => {
                response2.pipe(file)
                    .on("close", () => {
                        console.log(`Generated PDF file saved as "${DestinationFile}" file.`);
                    });
            });
        }
        else {
            // Service reported error
            console.log(data.message);
        }
    });
}).on("error", (e) => {
    // Request error
    console.log(e);
});

// Write request data
postRequest.write(jsonPayload);
postRequest.end();
import os
import requests # pip install requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "******************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

# Direct URL of source PDF file.
# You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
SourceFileURL = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf"
# PDF document password. Leave empty for unprotected documents.
Password = ""
# Destination PDF file name
DestinationFile = ".\\result.pdf"

def main(args = None):
    deleteTextFromPdf(SourceFileURL, DestinationFile)


def deleteTextFromPdf(uploadedFileUrl, destinationFile):
    """Delete Text from PDF using PDF.co Web API"""

    # Prepare requests params as JSON
    parameters = {}
    parameters["name"] = os.path.basename(destinationFile)
    parameters["password"] = Password
    parameters["url"] = uploadedFileUrl
    parameters["searchString"] = "conspicuous"

    # Prepare URL for 'Delete Text from PDF' API request
    url = "{}/pdf/edit/delete-text".format(BASE_URL)

    # Execute request and get response as JSON
    response = requests.post(url, data=parameters, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()

        if json["error"] == False:
            #  Get URL of result file
            resultFileUrl = json["url"]
            # Download result file
            r = requests.get(resultFileUrl, stream=True)
            if (r.status_code == 200):
                with open(destinationFile, 'wb') as file:
                    for chunk in r:
                        file.write(chunk)
                print(f"Result file saved as \"{destinationFile}\" file.")
            else:
                print(f"Request error: {response.status_code} {response.reason}")
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")

if __name__ == '__main__':
    main()
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace PDFcoApiExample
{
  class Program
  {
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    const String API_KEY = "***********************************";

    // Direct URL of source PDF file.
        // You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
    const string SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
    // PDF document password. Leave empty for unprotected documents.
    const string Password = "";
    // Destination PDF file name
    const string DestinationFile = @".\result.pdf";

    static void Main(string[] args)
    {
      // Create standard .NET web client instance
      WebClient webClient = new WebClient();

      // Set API Key
      webClient.Headers.Add("x-api-key", API_KEY);

      // Prepare requests params as JSON
      Dictionary<string, string> parameters = new Dictionary<string, string>();
      parameters.Add("name", Path.GetFileName(DestinationFile));
      parameters.Add("password", Password);
      parameters.Add("url", SourceFileUrl);
      parameters.Add("searchString", "conspicuous");

      // Convert dictionary of params to JSON
      string jsonPayload = JsonConvert.SerializeObject(parameters);

      // URL of `Delete Text from PDF` API call
      string url = "https://api.pdf.co/v1/pdf/edit/delete-text";

      try
      {
        // Execute POST request with JSON payload
        string response = webClient.UploadString(url, jsonPayload);

        // Parse JSON response
        JObject json = JObject.Parse(response);

        if (json["error"].ToObject<bool>() == false)
        {
          // Get URL of generated PDF file
          string resultFileUrl = json["url"].ToString();

          // Download PDF file
          webClient.DownloadFile(resultFileUrl, DestinationFile);

          Console.WriteLine("Generated PDF file saved as \"{0}\" file.", DestinationFile);
        }
        else
        {
          Console.WriteLine(json["message"].ToString());
        }
      }
      catch (WebException e)
      {
        Console.WriteLine(e.ToString());
      }

      webClient.Dispose();


      Console.WriteLine();
      Console.WriteLine("Press any key...");
      Console.ReadKey();
    }
  }
}
package com.company;

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import okhttp3.*;

import java.io.*;
import java.net.*;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main
{
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    final static String API_KEY = "**********************************";

    // Direct URL of source PDF file.
    // You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
    final static String SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
    // PDF document password. Leave empty for unprotected documents.
    final static String Password = "";
    // Destination PDF file name
    final static Path DestinationFile = Paths.get(".\\result.pdf");


    public static void main(String[] args) throws IOException
    {
        // Create HTTP client instance
        OkHttpClient webClient = new OkHttpClient();

        // Prepare URL for `Delete Text from PDF` API call
        String query = "https://api.pdf.co/v1/pdf/edit/delete-text";

        // Make correctly escaped (encoded) URL
        URL url = null;
        try
        {
            url = new URI(null, query, null).toURL();
        }
        catch (URISyntaxException e)
        {
            e.printStackTrace();
        }

        // Create JSON payload
        String jsonPayload = String.format("{\"name\": \"%s\", \"password\": \"%s\", \"url\": \"%s\", \"searchString\": \"conspicuous\"}",
                DestinationFile.getFileName(),
                Password,
                SourceFileUrl);

        // Prepare request body
        RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonPayload);

        // Prepare request
        Request request = new Request.Builder()
            .url(url)
            .addHeader("x-api-key", API_KEY) // (!) Set API Key
            .addHeader("Content-Type", "application/json")
            .post(body)
            .build();

        // Execute request
        Response response = webClient.newCall(request).execute();

        if (response.code() == 200)
        {
            // Parse JSON response
            JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

            boolean error = json.get("error").getAsBoolean();
            if (!error)
            {
                // Get URL of generated PDF file
                String resultFileUrl = json.get("url").getAsString();

                // Download PDF file
                downloadFile(webClient, resultFileUrl, DestinationFile.toFile());

                System.out.printf("Generated PDF file saved as \"%s\" file.", DestinationFile.toString());
            }
            else
            {
                // Display service reported error
                System.out.println(json.get("message").getAsString());
            }
        }
        else
        {
            // Display request error
            System.out.println(response.code() + " " + response.message());
        }
    }

    public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
    {
        // Prepare request
        Request request = new Request.Builder()
                .url(url)
                .build();
        // Execute request
        Response response = webClient.newCall(request).execute();

        byte[] fileBytes = response.body().bytes();

        // Save downloaded bytes to file
        OutputStream output = new FileOutputStream(destinationFile);
        output.write(fileBytes);
        output.flush();
        output.close();

        response.close();
    }
}
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Cloud API asynchronous "Delete Text from PDF" job example (allows to avoid timeout errors).</title>
</head>
<body>

<?php

// Cloud API asynchronous "Delete Text from PDF" job example.

// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
$apiKey = "***********************************";

// Direct URL of source PDF file. Check another example if you need to upload a local file to the cloud.
// You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
$sourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
// PDF document password. Leave empty for unprotected documents.
$password = "";

// Prepare URL for `Delete Text from PDF` API call
$url = "https://api.pdf.co/v1/pdf/edit/delete-text";

// Prepare requests params
$parameters = array();
$parameters["password"] = $password;
$parameters["url"] = $sourceFileUrl;
$parameters["searchString"] = "conspicuous";
$parameters["async"] = true; // (!) Make asynchronous job

// Create Json payload
$data = json_encode($parameters);

// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

// Execute request
$result = curl_exec($curl);

if (curl_errno($curl) == 0)
{
    $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

    if ($status_code == 200)
    {
        $json = json_decode($result, true);

        if (!isset($json["error"]) || $json["error"] == false)
        {
            // URL of generated PDF file that will available after the job completion
            $resultFileUrl = $json["url"];
            // Asynchronous job ID
            $jobId = $json["jobId"];

            // Check the job status in a loop
            do
            {
                $status = CheckJobStatus($jobId, $apiKey); // Possible statuses: "working", "failed", "aborted", "success".

                // Display timestamp and status (for demo purposes)
                echo "<p>" . date(DATE_RFC2822) . ": " . $status . "</p>";

                if ($status == "success")
                {
                    // Display link to the file with conversion results
                    echo "<div><h2>Conversion Result:</h2><a href='" . $resultFileUrl . "' target='_blank'>" . $resultFileUrl . "</a></div>";
                    break;
                }
                else if ($status == "working")
                {
                    // Pause for a few seconds
                    sleep(3);
                }
                else
                {
                    echo $status . "<br/>";
                    break;
                }
            }
            while (true);
        }
        else
        {
            // Display service reported error
            echo "<p>Error: " . $json["message"] . "</p>";
        }
    }
    else
    {
        // Display request error
        echo "<p>Status code: " . $status_code . "</p>";
        echo "<p>" . $result . "</p>";
    }
}
else
{
    // Display CURL error
    echo "Error: " . curl_error($curl);
}

// Cleanup
curl_close($curl);


function CheckJobStatus($jobId, $apiKey)
{
    $status = null;

  // Create URL
  $url = "https://api.pdf.co/v1/job/check";

  // Prepare requests params
  $parameters = array();
  $parameters["jobid"] = $jobId;

  // Create Json payload
  $data = json_encode($parameters);

  // Create request
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_POST, true);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

    // Execute request
    $result = curl_exec($curl);

    if (curl_errno($curl) == 0)
    {
        $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if ($status_code == 200)
        {
            $json = json_decode($result, true);

            if (!isset($json["error"]) || $json["error"] == false)
            {
                $status = $json["status"];
            }
            else
            {
                // Display service reported error
                echo "<p>Error: " . $json["message"] . "</p>";
            }
        }
        else
        {
            // Display request error
            echo "<p>Status code: " . $status_code . "</p>";
            echo "<p>" . $result . "</p>";
        }
    }
    else
    {
        // Display CURL error
        echo "Error: " . curl_error($curl);
    }

    // Cleanup
    curl_close($curl);

    return $status;
}

?>

</body>
</html>

On Github#

Footnotes

1

Supports publicly accessible links from any source, including Google Drive, Dropbox, and PDF.co Built-In Files Storage. To upload files via the API, check out the File Upload section. Note: If you experience intermittent Access Denied or Too Many Requests errors, please try adding cache: to enable built-in URL caching (e.g., cache:https://example.com/file1.pdf). For data security, you have the option to encrypt output files and decrypt input files. Learn more about user-controlled data encryption.

2

Main response codes as follows:

Code

Description

200

Success

400

Bad request. Typically happens because of bad input parameters, or because the input URLs can’t be reached, possibly due to access restrictions like needing a login or password.

401

Unauthorized

402

Not enough credits

445

Timeout error. To process large documents or files please use asynchronous mode (set the async parameter to true) and then check status using the /job/check endpoint. If a file contains many pages then specify a page range using the pages parameter. The number of pages of the document can be obtained using the /pdf/info endpoint.

3

PDF.co Request size: API requests do not support request sizes of more than 4 megabytes in size. Please ensure that request sizes do not exceed this limit.