PDF Search and Delete Text#

Available Methods#

/pdf/edit/delete-text

/pdf/edit/delete-text#

Search and remove text in a PDF.

Method: POST
Endpoint: /v1/pdf/edit/delete-text

Attributes#

Note

Attributes are case-sensitive and must be passed in the JSON body of the POST request, for example:

{
    "url": "https://example.com/file1.pdf"
}

Important

When using regular expressions in JSON payloads, ensure that backslashes are properly escaped. For example, a single backslash \ should be written as \\.
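
As a minimal sketch of the escaping rule above, Python's standard `json` module doubles the backslashes automatically when it serializes a payload. The pattern `\d{4}` and the variable names are illustrative assumptions; `searchString` and `regex` are the attribute names used in the sample payload and table on this page.

```python
import json

# Illustrative regular expression: four consecutive digits.
# A raw string keeps the single backslash intact in Python source.
pattern = r"\d{4}"

payload = {
    "url": "https://example.com/file1.pdf",
    "searchString": pattern,
    "regex": True,
}

# json.dumps escapes the backslash, so the serialized body contains \\d{4}.
body = json.dumps(payload)
print(body)

# Decoding the body restores the original single-backslash pattern.
decoded_pattern = json.loads(body)["searchString"]
```

If you write the JSON by hand instead (as in the CURL example), the doubling has to be done manually.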

Attribute	Description	Required
`url`	URL to the source file. 1	yes
`httpusername`	HTTP auth user name if required to access source `url`.	no
`httppassword`	HTTP auth password if required to access source `url`.	no
`searchStrings[]`	The array of strings to search.	yes
`replacementLimit`	Limit on the number of searches and replacements for each item. The default value is `0`, which means unlimited, so every occurrence found is replaced.	no
`caseSensitive`	Set to `false` to use case-insensitive search.	no
`regex`	Set to `true` to use regular expression for search string(s).	no
`name`	File name for the generated output. The value must be a string.	no
`expiration`	Expiration time for the output link, in minutes (default is `60`, i.e. 1 hour). After this duration, generated output files are automatically deleted from PDF.co Temporary Files Storage. The maximum duration depends on your current subscription plan. To store input files permanently (e.g. reusable images, PDF templates, documents), consider using PDF.co Built-In Files Storage.	no
`pages`	Page indices to process, as comma-separated values or ranges (e.g. `"0, 1, 2-"` or `"1, 2, 3-7"`). The first page has index `0`. Use `"!"` before a number for inverted page numbers (e.g. `"!0"` for the last page). If not specified, all pages are processed. The value must be a string.	no
`password`	Password of the PDF file. The value must be a string.	no
`async`	Set `async` to `true` to run long processes in the background. The API then returns a `jobId`, which you can use with the Background Job Check endpoint to check the status of the process and retrieve the output while you proceed with other tasks.	no
`profiles`	Use this parameter to set additional configurations for fine-tuning and extra options. Explore the Profiles section for more.	no
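
The `pages` syntax described in the table can be illustrated with a small Python sketch. The function `expand_pages` is hypothetical and only mirrors the rules stated above; it is not PDF.co's implementation, and treating `"!1"` as the second-to-last page is an assumption extrapolated from the `"!0"` example.

```python
from typing import List


def expand_pages(spec: str, page_count: int) -> List[int]:
    """Expand a page specification into zero-based page indices."""

    def resolve(token: str) -> int:
        token = token.strip()
        if token.startswith("!"):
            # Inverted index: "!0" is the last page (assumed: "!1" is the one before it).
            return page_count - 1 - int(token[1:])
        return int(token)

    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = resolve(start_text)
            # An open-ended range such as "2-" runs to the last page.
            end = resolve(end_text) if end_text.strip() else page_count - 1
            pages.extend(range(start, end + 1))
        else:
            pages.append(resolve(part))

    # Keep only indices that exist in the document, without duplicates.
    result: List[int] = []
    for page in pages:
        if 0 <= page < page_count and page not in result:
            result.append(page)
    return result


print(expand_pages("0, 1, 2-", 5))
print(expand_pages("!0", 5))
```

A helper like this can be useful for checking a page specification locally before sending it; the API itself expects the unexpanded string.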

Note

To cover the deleted text with a block-out color, use the `profiles` attribute, e.g.:

"profiles": "{'UsePatch': true, 'PatchColor': '#000000', 'RemoveTextUnderPatch': true}"
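
Because `profiles` is a string that itself contains a JSON-like object, it can be easier to build it from a dictionary than to type it by hand. The sketch below is one possible way to do that in Python; `build_profiles` is a hypothetical helper, and it assumes that no option value contains a quote character.

```python
import json


def build_profiles(options: dict) -> str:
    """Render options in the single-quoted form shown in the note above."""
    # json.dumps writes True as true; the double quotes are then swapped
    # for single quotes so the string can sit inside a JSON payload as-is.
    return json.dumps(options).replace('"', "'")


profiles = build_profiles({
    "UsePatch": True,
    "PatchColor": "#000000",
    "RemoveTextUnderPatch": True,
})

payload = {
    "url": "https://example.com/file1.pdf",
    "searchString": "Invoice",
    "profiles": profiles,
}
print(json.dumps(payload, indent=4))
```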

Query parameters#

No query parameters accepted.

Payload 3 #

{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
    "name": "pdfWithTextDeleted",
    "caseSensitive": "false",
    "searchString": "Invoice",
    "replacementLimit": 0,
    "async": false
}

Response 2 #

{
    "url": "https://pdf-temp-files.s3.us-west-2.amazonaws.com/ZOSEQZFNVCYLD5N5CJFVIYQKBVLR8OKD/pdfWithTextDeleted.pdf?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzECYaDKOO4WmO5C5shyOYYSKCAVsAo6VkB5HQjTBd9dMlJujQdEkPfNdPeLfq2mF54s2ESZBmIAJ5UgDUo3J9R475CCS4M3nuuo%2FSJwRy5gNiJdb1ZY0uCtP87x83nH%2B%2BSDu5JK%2F%2BEOrd3MREt8KE3BsQOrv%2FKMdnK%2BT5nJ2x2hC87vHue%2FudY7%2FWX54vx4tfFobEyhEozLbPnwYyKOdEsYYWH7e8tm7XV4UeKxCoKMaXSEPvOod80hR62qXnEI42fOsON3M%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHLUVIAIPX/20230220/us-west-2/s3/aws4_request&X-Amz-Date=20230220T205521Z&X-Amz-SignedHeaders=host&X-Amz-Signature=9f79c1a30d4f373e495e735e908375dad2ae6dcafcee761a477748c2b8298605",
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "pdfWithTextDeleted.pdf",
    "credits": 21,
    "duration": 189,
    "remainingCredits": 96235635
}
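
A response like the one above can be handled with a few lines of Python. This is a minimal sketch: `result_url` is a hypothetical helper name, and the `message` field is taken from the code samples on this page, which read it when `error` is true.

```python
def result_url(response: dict) -> str:
    """Return the output file URL, or raise if the service reported an error."""
    if response.get("error"):
        # The code samples on this page read "message" on failure.
        raise RuntimeError(response.get("message", "unknown error"))
    return response["url"]


# Shortened stand-in for the sample response shown above.
sample = {
    "url": "https://example.com/pdfWithTextDeleted.pdf",
    "pageCount": 1,
    "error": False,
    "status": 200,
    "name": "pdfWithTextDeleted.pdf",
}
print(result_url(sample))
```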

CURL#

curl --location --request POST 'https://api.pdf.co/v1/pdf/edit/delete-text' \
--header 'Content-Type: application/json' \
--header 'x-api-key: *******************' \
--data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
    "name": "pdfWithTextDeleted",
    "caseSensitive": "false",
    "searchString": "Invoice",
    "replacementLimit": 0,
    "async": false
}'

Code samples#

JavaScript / Node.js

var https = require("https");
var path = require("path");
var fs = require("fs");


// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const API_KEY = "***********************************";


// Direct URL of source PDF file.
// You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
const SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
// PDF document password. Leave empty for unprotected documents.
const Password = "";
// Destination PDF file name
const DestinationFile = "./result.pdf";


// Prepare request to `Delete Text from PDF` API endpoint
var queryPath = `/v1/pdf/edit/delete-text`;
// JSON payload for api request
var jsonPayload = JSON.stringify({
    name: path.basename(DestinationFile), password: Password, url: SourceFileUrl, searchString: 'conspicuous'
});

var reqOptions = {
    host: "api.pdf.co",
    method: "POST",
    path: queryPath,
    headers: {
        "x-api-key": API_KEY,
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
    }
};

// Send request
var postRequest = https.request(reqOptions, (response) => {
    // Collect the response body, which may arrive in several chunks
    var body = "";
    response.on("data", (d) => {
        body += d;
    });
    response.on("end", () => {
        // Parse JSON response
        var data = JSON.parse(body);
        if (data.error == false) {
            // Download PDF file
            var file = fs.createWriteStream(DestinationFile);
            https.get(data.url, (response2) => {
                response2.pipe(file)
                    .on("close", () => {
                        console.log(`Generated PDF file saved as "${DestinationFile}" file.`);
                    });
            });
        }
        else {
            // Service reported error
            console.log(data.message);
        }
    });
}).on("error", (e) => {
    // Request error
    console.log(e);
});

// Write request data
postRequest.write(jsonPayload);
postRequest.end();

Python

import os
import requests # pip install requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "******************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

# Direct URL of source PDF file.
# You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
SourceFileURL = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf"
# PDF document password. Leave empty for unprotected documents.
Password = ""
# Destination PDF file name
DestinationFile = ".\\result.pdf"

def main(args = None):
    deleteTextFromPdf(SourceFileURL, DestinationFile)


def deleteTextFromPdf(uploadedFileUrl, destinationFile):
    """Delete Text from PDF using PDF.co Web API"""

    # Prepare requests params as JSON
    parameters = {}
    parameters["name"] = os.path.basename(destinationFile)
    parameters["password"] = Password
    parameters["url"] = uploadedFileUrl
    parameters["searchString"] = "conspicuous"

    # Prepare URL for 'Delete Text from PDF' API request
    url = "{}/pdf/edit/delete-text".format(BASE_URL)

    # Execute request and get response as JSON
    response = requests.post(url, json=parameters, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()

        if json["error"] == False:
            #  Get URL of result file
            resultFileUrl = json["url"]
            # Download result file
            r = requests.get(resultFileUrl, stream=True)
            if (r.status_code == 200):
                with open(destinationFile, 'wb') as file:
                    for chunk in r:
                        file.write(chunk)
                print(f"Result file saved as \"{destinationFile}\" file.")
            else:
                print(f"Request error: {r.status_code} {r.reason}")
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")

if __name__ == '__main__':
    main()

C#

using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace PDFcoApiExample
{
  class Program
  {
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    const String API_KEY = "***********************************";

    // Direct URL of source PDF file.
    // You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
    const string SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
    // PDF document password. Leave empty for unprotected documents.
    const string Password = "";
    // Destination PDF file name
    const string DestinationFile = @".\result.pdf";

    static void Main(string[] args)
    {
      // Create standard .NET web client instance
      WebClient webClient = new WebClient();

      // Set API Key
      webClient.Headers.Add("x-api-key", API_KEY);

      // Prepare requests params as JSON
      Dictionary<string, string> parameters = new Dictionary<string, string>();
      parameters.Add("name", Path.GetFileName(DestinationFile));
      parameters.Add("password", Password);
      parameters.Add("url", SourceFileUrl);
      parameters.Add("searchString", "conspicuous");

      // Convert dictionary of params to JSON
      string jsonPayload = JsonConvert.SerializeObject(parameters);

      // URL of `Delete Text from PDF` API call
      string url = "https://api.pdf.co/v1/pdf/edit/delete-text";

      try
      {
        // Execute POST request with JSON payload
        string response = webClient.UploadString(url, jsonPayload);

        // Parse JSON response
        JObject json = JObject.Parse(response);

        if (json["error"].ToObject<bool>() == false)
        {
          // Get URL of generated PDF file
          string resultFileUrl = json["url"].ToString();

          // Download PDF file
          webClient.DownloadFile(resultFileUrl, DestinationFile);

          Console.WriteLine("Generated PDF file saved as \"{0}\" file.", DestinationFile);
        }
        else
        {
          Console.WriteLine(json["message"].ToString());
        }
      }
      catch (WebException e)
      {
        Console.WriteLine(e.ToString());
      }

      webClient.Dispose();


      Console.WriteLine();
      Console.WriteLine("Press any key...");
      Console.ReadKey();
    }
  }
}

Java

package com.company;

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import okhttp3.*;

import java.io.*;
import java.net.*;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main
{
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    final static String API_KEY = "**********************************";

    // Direct URL of source PDF file.
    // You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
    final static String SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
    // PDF document password. Leave empty for unprotected documents.
    final static String Password = "";
    // Destination PDF file name
    final static Path DestinationFile = Paths.get(".\\result.pdf");


    public static void main(String[] args) throws IOException
    {
        // Create HTTP client instance
        OkHttpClient webClient = new OkHttpClient();

        // Prepare URL for `Delete Text from PDF` API call
        String query = "https://api.pdf.co/v1/pdf/edit/delete-text";

        // Make correctly escaped (encoded) URL
        URL url = null;
        try
        {
            url = new URI(null, query, null).toURL();
        }
        catch (URISyntaxException e)
        {
            e.printStackTrace();
        }

        // Create JSON payload
        String jsonPayload = String.format("{\"name\": \"%s\", \"password\": \"%s\", \"url\": \"%s\", \"searchString\": \"conspicuous\"}",
                DestinationFile.getFileName(),
                Password,
                SourceFileUrl);

        // Prepare request body
        RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonPayload);

        // Prepare request
        Request request = new Request.Builder()
            .url(url)
            .addHeader("x-api-key", API_KEY) // (!) Set API Key
            .addHeader("Content-Type", "application/json")
            .post(body)
            .build();

        // Execute request
        Response response = webClient.newCall(request).execute();

        if (response.code() == 200)
        {
            // Parse JSON response
            JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

            boolean error = json.get("error").getAsBoolean();
            if (!error)
            {
                // Get URL of generated PDF file
                String resultFileUrl = json.get("url").getAsString();

                // Download PDF file
                downloadFile(webClient, resultFileUrl, DestinationFile.toFile());

                System.out.printf("Generated PDF file saved as \"%s\" file.", DestinationFile.toString());
            }
            else
            {
                // Display service reported error
                System.out.println(json.get("message").getAsString());
            }
        }
        else
        {
            // Display request error
            System.out.println(response.code() + " " + response.message());
        }
    }

    public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
    {
        // Prepare request
        Request request = new Request.Builder()
                .url(url)
                .build();
        // Execute request
        Response response = webClient.newCall(request).execute();

        byte[] fileBytes = response.body().bytes();

        // Save downloaded bytes to file
        OutputStream output = new FileOutputStream(destinationFile);
        output.write(fileBytes);
        output.flush();
        output.close();

        response.close();
    }
}

PHP

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Cloud API asynchronous "Delete Text from PDF" job example (helps avoid timeout errors).</title>
</head>
<body>

<?php

// Cloud API asynchronous "Delete Text from PDF" job example.

// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
$apiKey = "***********************************";

// Direct URL of source PDF file. Check another example if you need to upload a local file to the cloud.
// You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
$sourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
// PDF document password. Leave empty for unprotected documents.
$password = "";

// Prepare URL for `Delete Text from PDF` API call
$url = "https://api.pdf.co/v1/pdf/edit/delete-text";

// Prepare requests params
$parameters = array();
$parameters["password"] = $password;
$parameters["url"] = $sourceFileUrl;
$parameters["searchString"] = "conspicuous";
$parameters["async"] = true; // (!) Make asynchronous job

// Create Json payload
$data = json_encode($parameters);

// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

// Execute request
$result = curl_exec($curl);

if (curl_errno($curl) == 0)
{
    $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

    if ($status_code == 200)
    {
        $json = json_decode($result, true);

        if (!isset($json["error"]) || $json["error"] == false)
        {
            // URL of generated PDF file that will be available after the job completion
            $resultFileUrl = $json["url"];
            // Asynchronous job ID
            $jobId = $json["jobId"];

            // Check the job status in a loop
            do
            {
                $status = CheckJobStatus($jobId, $apiKey); // Possible statuses: "working", "failed", "aborted", "success".

                // Display timestamp and status (for demo purposes)
                echo "<p>" . date(DATE_RFC2822) . ": " . $status . "</p>";

                if ($status == "success")
                {
                    // Display link to the file with conversion results
                    echo "<div><h2>Conversion Result:</h2><a href='" . $resultFileUrl . "' target='_blank'>" . $resultFileUrl . "</a></div>";
                    break;
                }
                else if ($status == "working")
                {
                    // Pause for a few seconds
                    sleep(3);
                }
                else
                {
                    echo $status . "<br/>";
                    break;
                }
            }
            while (true);
        }
        else
        {
            // Display service reported error
            echo "<p>Error: " . $json["message"] . "</p>";
        }
    }
    else
    {
        // Display request error
        echo "<p>Status code: " . $status_code . "</p>";
        echo "<p>" . $result . "</p>";
    }
}
else
{
    // Display CURL error
    echo "Error: " . curl_error($curl);
}

// Cleanup
curl_close($curl);


function CheckJobStatus($jobId, $apiKey)
{
    $status = null;

    // Create URL
    $url = "https://api.pdf.co/v1/job/check";

    // Prepare requests params
    $parameters = array();
    $parameters["jobid"] = $jobId;

    // Create Json payload
    $data = json_encode($parameters);

    // Create request
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

    // Execute request
    $result = curl_exec($curl);

    if (curl_errno($curl) == 0)
    {
        $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if ($status_code == 200)
        {
            $json = json_decode($result, true);

            if (!isset($json["error"]) || $json["error"] == false)
            {
                $status = $json["status"];
            }
            else
            {
                // Display service reported error
                echo "<p>Error: " . $json["message"] . "</p>";
            }
        }
        else
        {
            // Display request error
            echo "<p>Status code: " . $status_code . "</p>";
            echo "<p>" . $result . "</p>";
        }
    }
    else
    {
        // Display CURL error
        echo "Error: " . curl_error($curl);
    }

    // Cleanup
    curl_close($curl);

    return $status;
}

?>

</body>
</html>

Footnotes

1

Supports publicly accessible links from any source, including Google Drive, Dropbox, and PDF.co Built-In Files Storage. To upload files via the API, check out the File Upload section. Note: If you experience intermittent Access Denied or Too Many Requests errors, please try adding cache: to enable built-in URL caching (e.g., cache:https://example.com/file1.pdf). For data security, you have the option to encrypt output files and decrypt input files. Learn more about user-controlled data encryption.
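
The `cache:` prefix mentioned above can be applied with a one-line helper. This Python sketch is only an illustration; `with_url_cache` is a hypothetical name, not part of any PDF.co client library.

```python
def with_url_cache(url: str) -> str:
    """Prefix a source URL with "cache:" unless it already has the prefix."""
    return url if url.startswith("cache:") else "cache:" + url


cached = with_url_cache("https://example.com/file1.pdf")
print(cached)
```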

2

The main response codes are as follows:

Code	Description
`200`	Success
`400`	Bad request. Typically happens because of bad input parameters, or because the input URLs can’t be reached, possibly due to access restrictions like needing a login or password.
`401`	Unauthorized
`402`	Not enough credits
`445`	Timeout error. To process large documents or files please use asynchronous mode (set the `async` parameter to `true`) and then check status using the /job/check endpoint. If a file contains many pages then specify a page range using the `pages` parameter. The number of pages of the document can be obtained using the /pdf/info endpoint.
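
The asynchronous flow recommended for timeouts can be sketched in Python as a polling loop. The statuses `"working"`, `"failed"`, `"aborted"` and `"success"` come from the PHP sample on this page; `wait_for_job`, its parameters, and the stand-in checker are assumptions made for illustration. A real `check_status` callable would POST `{"jobid": ...}` to `/v1/job/check` with the `x-api-key` header and return the `status` field, as the PHP sample does.

```python
import time
from typing import Callable


def wait_for_job(check_status: Callable[[str], str], job_id: str,
                 interval_seconds: float = 3.0, max_checks: int = 100) -> str:
    """Poll until the job leaves the "working" state; return the final status."""
    for _ in range(max_checks):
        status = check_status(job_id)
        if status != "working":
            # "success", "failed" or "aborted": nothing more to wait for.
            return status
        time.sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} is still working after {max_checks} checks")


# Demonstration with a stand-in checker that succeeds on the third call.
responses = iter(["working", "working", "success"])
final_status = wait_for_job(lambda job_id: next(responses), "demo-job",
                            interval_seconds=0)
print(final_status)
```

Passing the status check in as a callable keeps the loop independent of any particular HTTP client and makes the give-up condition explicit.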

Note

For more see the complete list of available response codes.

3

PDF.co request size: API requests larger than 4 megabytes are not supported. Please ensure that your requests do not exceed this limit.
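
A request body can be measured before it is sent. The Python sketch below is an estimate only: it assumes that "4 megabytes" means 4 × 1024 × 1024 bytes and that the HTTP client serializes the payload the same way `json.dumps` does. The names `MAX_REQUEST_BYTES`, `request_size` and `fits_request_limit` are hypothetical.

```python
import json

# Assumption: "4 megabytes" is read as 4 MiB.
MAX_REQUEST_BYTES = 4 * 1024 * 1024


def request_size(payload: dict) -> int:
    """Size in bytes of the payload once serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_request_limit(payload: dict) -> bool:
    """True if the serialized payload stays within the assumed limit."""
    return request_size(payload) <= MAX_REQUEST_BYTES


small = {"url": "https://example.com/file1.pdf", "searchString": "Invoice"}
print(request_size(small), fits_request_limit(small))
```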
