PDF Make Text Searchable or Unsearchable#

Available Methods#

/pdf/makesearchable
/pdf/makeunsearchable

/pdf/makesearchable#

This method converts scanned PDF documents (where pages are fully or partially made from scanned images) or image files into a text-searchable PDF. It runs OCR and adds an invisible text layer on top of your document that can be used for text search, text indexing, etc.

Method: POST
Endpoint: /v1/pdf/makesearchable

Attributes#

Note

Attributes are case-sensitive and must be passed in the JSON body of the POST request, for example:

{
    "url": "https://example.com/file1.pdf"
}

Attribute	Description	Required
`url`	URL to the source file. 1 If the URL points to an image file (`jpg`, `png`, `tif`), the image is converted into a text-searchable PDF.	yes
`httpusername`	HTTP auth user name if required to access source `url`.	no
`httppassword`	HTTP auth password if required to access source `url`.	no
`lang`	OCR language used to extract text from scanned PDF, PNG, and JPG input. The default is `eng`. Many other languages are supported, for example `deu`, `spa`, `chi_sim`, and `jpn`; see Language Support. You can also use two languages simultaneously, such as `eng+deu` or `jpn+kor` (any combination).	no
`pages`	Comma-separated indices of pages (or page ranges) that you want to use. The first page index is always 0. For example, to split a 7-page document into 3 PDFs with different page counts, use `0, 1, 2-`: one PDF with the first page, one with the second page, and one with the remaining pages. You can also use inverted page numbers by adding `!` before the number: `!0` means "the last page", `1-!1` means "from the second to the penultimate page", and `!1-` means "the last two pages". You can also use a single asterisk (`*`) as the range to split the document into separate pages. The input must be a string.	no
`password`	Password for the PDF file. The input must be a string.	no
`async`	Set `async` to `true` to run long processes in the background. The API then returns a `jobId`, which you can use with the Background Job Check endpoint to check the status of the process and retrieve the output while you proceed with other tasks.	no
`name`	File name for the generated output. The input must be a string.	no
`expiration`	Expiration time for the output link, in minutes (default is `60`, i.e. 1 hour). After this duration, any generated output files are automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration depends on your current subscription plan. To store input files permanently (e.g. reusable images, PDF templates, documents), consider using PDF.co Built-In Files Storage.	no
`profiles`	Use this parameter to set additional configurations for fine-tuning and extra options. Explore the Profiles section for more.	no
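
The `lang` and `pages` values in the table above are plain strings, so it can help to build and check them in code before sending a request. The following is a minimal sketch, not part of the API: `join_languages`, `resolve_index`, and `resolve_pages` are hypothetical helper names, and the resolver only illustrates the index rules as they are described in the table (0-based indices, open ranges such as `2-`, and inverted indices such as `!1`). It does not handle the `*` range or an empty string.

```python
import re


def join_languages(*codes: str) -> str:
    """Combine OCR language codes into one `lang` value, e.g. "eng+deu"."""
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def resolve_index(token: str, page_count: int) -> int:
    """Turn "3" into 3 and an inverted index such as "!1" into page_count - 2."""
    if token.startswith("!"):
        return page_count - 1 - int(token[1:])
    return int(token)


def resolve_pages(spec: str, page_count: int) -> list[int]:
    """Expand a `pages` string into the 0-based page indices it selects."""
    selected: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # A token is a single index ("4", "!0") or a range ("1-!1", "2-").
        match = re.fullmatch(r"(!?\d+)(?:(-)(!?\d+)?)?", part)
        if match is None:
            raise ValueError(f"unsupported page token: {part!r}")
        start = resolve_index(match.group(1), page_count)
        if match.group(2) is None:      # single page
            end = start
        elif match.group(3) is None:    # open range: runs to the last page
            end = page_count - 1
        else:                           # closed range
            end = resolve_index(match.group(3), page_count)
        selected.extend(range(start, end + 1))
    return selected
```

For a 7-page document, `resolve_pages("!1-", 7)` gives the last two indices, 5 and 6, which matches the "last two pages" reading in the table.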

Query parameters#

No query parameters accepted.

Payload#

{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-make-searchable/sample.pdf",
    "lang": "eng",
    "pages": "",
    "name": "result.pdf",
    "password": "",
    "async": "false",
    "profiles": ""
}

Response 2 #

{
    "url": "https://pdf-temp-files.s3.amazonaws.com/a0d52f35504e47148d1771fce875db7b/result.pdf",
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "result.pdf",
    "remainingCredits": 99033681,
    "credits": 35
}
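
A caller will usually want to check `error` and `status` before using the output `url`. The sketch below shows one way to do that with the fields from the sample response above. `PdfCoError` and `extract_output_url` are hypothetical names, and the shape of a failure response (`error` set to `true` with a non-200 `status`) is an assumption based on the sample, not a documented contract.

```python
import json


class PdfCoError(RuntimeError):
    """Hypothetical exception for a response that reports a failure."""


def extract_output_url(body: str) -> str:
    """Return the output file URL from a JSON response body.

    Assumes a failed call sets "error" to true or "status" to
    something other than 200, mirroring the sample response.
    """
    data = json.loads(body)
    if data.get("error") or data.get("status") != 200:
        raise PdfCoError(f"request failed with status {data.get('status')}")
    return data["url"]
```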

CURL#

curl --location --request POST 'https://api.pdf.co/v1/pdf/makesearchable' \
--header 'x-api-key: ' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-make-searchable/sample.pdf",
    "lang": "eng",
    "pages": "",
    "name": "result.pdf",
    "password": "",
    "async": "false",
    "profiles": ""
}'
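
The same call can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl command above: the endpoint, the `x-api-key` header, and the JSON attributes come from this page, while `build_makesearchable_request` and `send` are hypothetical helper names. Building the request is kept separate from sending it so the request can be inspected without any network access.

```python
import json
import urllib.request

API_BASE = "https://api.pdf.co/v1"


def build_makesearchable_request(api_key, source_url, lang="eng", name="result.pdf"):
    """Build (but do not send) a POST request for /pdf/makesearchable."""
    payload = {"url": source_url, "lang": lang, "name": name}
    return urllib.request.Request(
        f"{API_BASE}/pdf/makesearchable",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `send(build_makesearchable_request(api_key, "https://example.com/file1.pdf"))`, where `api_key` is your own key.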

Language Support#

Code	Description
`afr`	Afrikaans
`amh`	Amharic
`ara`	Arabic
`asm`	Assamese
`aze`	Azerbaijani
`aze_cyrl`	Azerbaijani - Cyrillic
`bel`	Belarusian
`ben`	Bengali
`bod`	Tibetan
`bos`	Bosnian
`bul`	Bulgarian
`cat`	Catalan; Valencian
`ceb`	Cebuano
`ces`	Czech
`chi_sim`	Chinese - Simplified
`chi_tra`	Chinese - Traditional
`chr`	Cherokee
`cym`	Welsh
`dan`	Danish
`deu`	German
`dzo`	Dzongkha
`ell`	Greek, Modern (1453-)
`eng`	English
`enm`	English, Middle (1100-1500)
`epo`	Esperanto
`est`	Estonian
`eus`	Basque
`fas`	Persian
`fin`	Finnish
`fra`	French
`frk`	Frankish
`frm`	French, Middle (ca. 1400-1600)
`gle`	Irish
`glg`	Galician
`grc`	Greek, Ancient (-1453)
`guj`	Gujarati
`hat`	Haitian; Haitian Creole
`heb`	Hebrew
`hin`	Hindi
`hrv`	Croatian
`hun`	Hungarian
`iku`	Inuktitut
`ind`	Indonesian
`isl`	Icelandic
`ita`	Italian
`ita_old`	Italian - Old
`jav`	Javanese
`jpn`	Japanese
`kan`	Kannada
`kat`	Georgian
`kat_old`	Georgian - Old
`kaz`	Kazakh
`khm`	Central Khmer
`kir`	Kirghiz; Kyrgyz
`kor`	Korean
`kur`	Kurdish
`lao`	Lao
`lat`	Latin
`lav`	Latvian
`lit`	Lithuanian
`mal`	Malayalam
`mar`	Marathi
`mkd`	Macedonian
`mlt`	Maltese
`msa`	Malay
`mya`	Burmese
`nep`	Nepali
`nld`	Dutch; Flemish
`nor`	Norwegian
`ori`	Oriya
`pan`	Panjabi; Punjabi
`pol`	Polish
`por`	Portuguese
`pus`	Pushto; Pashto
`ron`	Romanian; Moldavian; Moldovan
`rus`	Russian
`san`	Sanskrit
`sin`	Sinhala; Sinhalese
`slk`	Slovak
`slv`	Slovenian
`spa`	Spanish; Castilian
`spa_old`	Spanish; Castilian - Old
`sqi`	Albanian
`srp`	Serbian
`srp_latn`	Serbian - Latin
`swa`	Swahili
`swe`	Swedish
`syr`	Syriac
`tam`	Tamil
`tel`	Telugu
`tgk`	Tajik
`tgl`	Tagalog
`tha`	Thai
`tir`	Tigrinya
`tur`	Turkish
`uig`	Uighur; Uyghur
`ukr`	Ukrainian
`urd`	Urdu
`uzb`	Uzbek
`uzb_cyrl`	Uzbek - Cyrillic
`vie`	Vietnamese
`yid`	Yiddish

/pdf/makeunsearchable#

This method converts PDF files into a “text unsearchable” version by converting your PDF into a “scanned” PDF file which is effectively a flat image.

Method: POST
Endpoint: /v1/pdf/makeunsearchable

Attributes#

Note

Attributes are case-sensitive and must be passed in the JSON body of the POST request, for example:

{
    "url": "https://example.com/file1.pdf"
}

Attribute	Description	Required
`url`	URL to the source file. 1	yes
`httpusername`	HTTP auth user name if required to access source `url`.	no
`httppassword`	HTTP auth password if required to access source `url`.	no
`pages`	Comma-separated indices of pages (or page ranges) that you want to use. The first page index is always 0. For example, to split a 7-page document into 3 PDFs with different page counts, use `0, 1, 2-`: one PDF with the first page, one with the second page, and one with the remaining pages. You can also use inverted page numbers by adding `!` before the number: `!0` means "the last page", `1-!1` means "from the second to the penultimate page", and `!1-` means "the last two pages". You can also use a single asterisk (`*`) as the range to split the document into separate pages. The input must be a string.	no
`password`	Password for the PDF file. The input must be a string.	no
`async`	Set `async` to `true` to run long processes in the background. The API then returns a `jobId`, which you can use with the Background Job Check endpoint to check the status of the process and retrieve the output while you proceed with other tasks.	no
`name`	File name for the generated output. The input must be a string.	no
`expiration`	Expiration time for the output link, in minutes (default is `60`, i.e. 1 hour). After this duration, any generated output files are automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration depends on your current subscription plan. To store input files permanently (e.g. reusable images, PDF templates, documents), consider using PDF.co Built-In Files Storage.	no
`profiles`	Use this parameter to set additional configurations for fine-tuning and extra options. Explore the Profiles section for more.	no

Query parameters#

No query parameters accepted.

Payload#

{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
    "pages": "",
    "name": "result.pdf",
    "password": "",
    "async": "false",
    "profiles": ""
}

Response 2 #

{
    "url": "https://pdf-temp-files.s3.amazonaws.com/6b755238963a472abf67fd5e7ffafd79/result.pdf",
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "result.pdf",
    "remainingCredits": 327244,
    "credits": 35
}

CURL#

curl --location --request POST 'https://api.pdf.co/v1/pdf/makeunsearchable' \
--header 'x-api-key: ' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
    "pages": "",
    "name": "result.pdf",
    "password": "",
    "async": "false",
    "profiles": ""
}'
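
Only `url` is marked as required in the attributes table, so a client does not have to send empty strings for the optional attributes. The sketch below builds a JSON body for `/pdf/makeunsearchable` that leaves out unset options. `makeunsearchable_payload` is a hypothetical helper, and sending `async` as the string `"true"` simply mirrors the string form used in the sample payload; treat both as assumptions rather than documented behavior.

```python
import json


def makeunsearchable_payload(url, pages="", password="", name="", run_async=False):
    """Return a JSON body containing `url` plus only the options that are set."""
    payload = {"url": url}
    optional = {"pages": pages, "password": password, "name": name}
    # Skip attributes left at their empty default.
    payload.update({key: value for key, value in optional.items() if value})
    if run_async:
        # String form, as in the sample payload on this page.
        payload["async"] = "true"
    return json.dumps(payload)
```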

Code samples#

Footnotes

1

Supports links from Google Drive, Dropbox, and PDF.co Built-In Files Storage. To upload files via the API, check out the File Upload section. Note: If you experience intermittent Too Many Requests or Access Denied errors, try adding the `cache:` prefix to the URL to enable built-in URL caching (e.g. `cache:https://example.com/file1.pdf`). For data security, you have the option to encrypt output files and decrypt input files. Learn more about user-controlled data encryption.

2

The main response codes are as follows:

Code	Description
`200`	Success
`400`	Bad request. Typically happens because of bad input parameters, or because the input URLs can’t be reached, possibly due to access restrictions like needing a login or password.
`401`	Unauthorized
`402`	Not enough credits
`445`	Timeout error. To process large documents or files please use asynchronous mode (set the `async` parameter to `true`) and then check status using the /job/check endpoint. If a file contains many pages then specify a page range using the `pages` parameter. The number of pages of the document can be obtained using the /pdf/info endpoint.

Note

For more, see the complete list of available response codes.
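
The `445` entry recommends asynchronous mode followed by status checks against /job/check. A polling loop for that could be sketched as follows. The request to /job/check is deliberately left to a caller-supplied function (`check_job`, a hypothetical name) because its request and response format is not described on this page. The loop also assumes the check response carries a `status` field whose value is `"working"` while the job is still running, which is an assumption to verify against the Background Job Check documentation.

```python
import time
from typing import Callable


def wait_for_job(check_job: Callable[[], dict], interval: float = 3.0, max_checks: int = 20) -> dict:
    """Call `check_job` until the job leaves the assumed "working" state.

    `check_job` should perform one status check and return the decoded
    JSON response as a dict.
    """
    for _ in range(max_checks):
        result = check_job()
        if result.get("status") != "working":
            return result
        time.sleep(interval)
    raise TimeoutError("job still running after the last check")
```

Keeping the HTTP call outside the loop makes the retry logic easy to test with a stub and easy to adapt once the exact job check format is known.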
