Extraction
Editing
- PDF Add
- PDF Search Text and Replace
- PDF Search and Delete Text
PDF Conversion
- PDF to CSV
- PDF to JSON
- PDF to Text
- PDF to Excel
- PDF to XML
- PDF to HTML
- PDF to Image
- PDF from Document
- PDF from Image
- PDF from URL
- PDF from HTML
- PDF from Email
Excel Conversion
- Convert from Excel
PDF Merging & Splitting
- Merge
- PDF Split
Forms
- PDF Info Reader
Find & Search
- PDF Find
- PDF Change Text Searchable
Document, File & System
- PDF Compress
- PDF Optimize (DEPRECATED)
- PDF Info Reader
- Background & Job Check
- Get Account Balance Info
- Document Classifier
- Email
- File Upload
- PDF Add/Remove Password
Pages
- PDF Delete Pages
- PDF Rotate
Barcodes
- Barcode
Document Parser
Parse Document
This API method extracts data from documents based on a document parser extraction template. With this API method, you can extract data from custom areas by searching form fields, tables, multiple pages, and more.
POST /v1/pdf/documentparser
Please refer to the Document Parser Template Editor and PDF.co Document Parser: Template Creation Guide for more information.
Attributes
Attributes are case-sensitive and should be inside JSON for POST request. for example:
{ "url": "https://example.com/file1.pdf" }
Attribute | Type | Required | Default | Description |
---|---|---|---|---|
url | string | Yes | - | URL to the source file url attribute |
callback | string | No | - | The callback URL (or Webhook) used to receive the POST data. see Webhooks & Callbacks. This is only applicable when async is set to true . |
httpusername | string | No | - | HTTP auth user name if required to access source URL. |
httppassword | string | No | - | HTTP auth password if required to access source URL. |
templateId | integer | No | - | Set ID of document parser template to be used. View and manage your templates at Document Parser |
template | string | No | - | The raw format of the document parser template to be used directly. see Template |
password | string | No | - | Password for the PDF file. |
inline | boolean | No | false | Set to true to include the results directly in the response, in addition to providing a URL to the generated output file. Applies only when async mode is enabled. |
async | boolean | No | false | Set async to true for long processes to run in the background, API will then return a jobId which you can use with the Background Job Check endpoint. Also see Webhooks & Callbacks |
pages | string | No | all pages | Specify page indices as comma-separated values or ranges to process (e.g. “0, 1, 2-” or “1, 2, 3-7”). The first-page index is 0. Use ”!” before a number for inverted page numbers (e.g. “!0” for the last page). If not specified, the default configuration processes all pages. The input must be in string format. |
name | string | No | - | File name for the generated output, the input must be in string format. |
expiration | integer | No | 60 | Set the expiration time for the output link in minutes. After this specified duration, any generated output file(s) will be automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration varies based on your current subscription plan. To store permanent input files (e.g. re-usable images, pdf templates, documents) consider using PDF.co Built-In Files Storage. |
outputFormat | string | No | JSON | The format of the output file. The output format can be JSON , CSV , or XML . |
profiles | object | No | - | See Profiles for more information. |
DataEncryptionAlgorithm | string | No | - | Controls the encryption algorithm used for data encryption. See User-Controlled Encryption for more information. The available algorithms are: AES128 , AES192 , AES256 . |
DataEncryptionKey | string | No | - | Controls the encryption key used for data encryption. See User-Controlled Encryption for more information. |
DataEncryptionIV | string | No | - | Controls the encryption IV used for data encryption. See User-Controlled Encryption for more information. |
DataDecryptionAlgorithm | string | No | - | Controls the decryption algorithm used for data decryption. See User-Controlled Encryption for more information. The available algorithms are: AES128 , AES192 , AES256 . |
DataDecryptionKey | string | No | - | Controls the decryption key used for data decryption. See User-Controlled Encryption for more information. |
DataDecryptionIV | string | No | - | Controls the decryption IV used for data decryption. See User-Controlled Encryption for more information. |
Query parameters
No query parameters accepted.
Responses
Parameter | Type | Description |
---|---|---|
pageCount | integer | Number of pages in the PDF document. |
error | boolean | Indicates whether an error occurred (false means success) |
status | string | Status code of the request (200, 404, 500, etc.). For more information, see Response Codes. |
credits | integer | Number of credits consumed by the request |
remainingCredits | integer | Number of credits remaining in the account |
duration | integer | Time taken for the operation in milliseconds |
body | object | No |
objects | array[object] | |
elapsed | float | Processing time in seconds |
templateName | string | Name of the parsing template used |
templateVersion | string | Version of the parsing template |
timestamp | string | Timestamp when the parsing occurred |
Example
Payload
To see the request size limits, please refer to the Request Size Limits.
{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "JSON",
"templateId": "1",
"async": false,
"inline": "true",
"password": "",
"profiles": ""
}
Example
Response
To see the main response codes, please refer to the Response Codes page.
{
"body": {
"objects": [
{
"name": "companyName",
"objectType": "field",
"value": "Amazon Web Services, Inc",
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "companyName2",
"objectType": "field",
"value": "Amazon Web Services, Inc",
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "invoiceId",
"objectType": "field",
"value": "123456789",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "dateIssued",
"objectType": "field",
"value": "2018-04-03T00:00:00",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "dateDue",
"objectType": "field",
"value": "2018-04-03T00:00:00",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "bankAccount",
"objectType": "field",
"value": "123456789012",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "total",
"objectType": "field",
"value": 6.58,
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "subTotal",
"objectType": "field",
"value": ""
},
{
"name": "tax",
"objectType": "field",
"value": 1.01,
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"objectType": "table",
"name": "table",
"rows": []
}
],
"templateName": "Generic Invoice [en]",
"templateVersion": "4",
"timestamp": "2020-08-21T19:23:31"
},
"pageCount": 1,
"error": false,
"status": 200,
"name": "sample-invoice.json",
"remainingCredits": 60803
}
Code Samples
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: {{x-api-key}}' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "JSON",
"templateId": "1",
"async": false,
"inline": "true",
"password": "",
"profiles": ""
}'
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: {{x-api-key}}' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "JSON",
"templateId": "1",
"async": false,
"inline": "true",
"password": "",
"profiles": ""
}'
package com.company;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import com.google.gson.JsonPrimitive;
import okhttp3.*;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class Main
{
// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
final static String API_KEY = "***********************************";
public static void main(String[] args) throws IOException
{
// Source PDF file
final Path SourceFile = Paths.get(".\\MultiPageTable.pdf");
// PDF document password. Leave empty for unprotected documents.
final String Password = "";
// Destination JSON file name
final Path DestinationFile = Paths.get(".\\result.json");
// Template text. Use Document Parser (https://pdf.co/document-parser, https://app.pdf.co/document-parser)
// to create templates.
// Read template from file:
String templateText = new String(Files.readAllBytes(Paths.get(".\\MultiPageTable-template1.yml")), StandardCharsets.UTF_8);
// Create HTTP client instance
OkHttpClient webClient = new OkHttpClient();
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have a direct file URL, skip to the step 3.
// Prepare URL for `Get Presigned URL` API call
String query = String.format(
"https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=%s",
SourceFile.getFileName());
// Prepare request
Request request = new Request.Builder()
.url(query)
.addHeader("x-api-key", API_KEY) // (!) Set API Key
.build();
// Execute request
Response response = webClient.newCall(request).execute();
if (response.code() == 200)
{
// Parse JSON response
JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();
boolean error = json.get("error").getAsBoolean();
if (!error)
{
// Get URL to use for the file upload
String uploadUrl = json.get("presignedUrl").getAsString();
// Get URL of uploaded file to use with later API calls
String uploadedFileUrl = json.get("url").getAsString();
// 2. UPLOAD THE FILE TO CLOUD.
if (uploadFile(webClient, API_KEY, uploadUrl, SourceFile))
{
// 3. PARSE UPLOADED PDF DOCUMENT
ParseDocument(webClient, API_KEY, DestinationFile, Password, uploadedFileUrl, templateText);
}
}
else
{
// Display service reported error
System.out.println(json.get("message").getAsString());
}
}
else
{
// Display request error
System.out.println(response.code() + " " + response.message());
}
}
public static void ParseDocument(OkHttpClient webClient, String apiKey, Path destinationFile,
String password, String uploadedFileUrl, String templateText) throws IOException
{
// Prepare POST request body in JSON format
JsonObject jsonBody = new JsonObject();
jsonBody.add("url", new JsonPrimitive(uploadedFileUrl));
jsonBody.add("template", new JsonPrimitive(templateText));
RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonBody.toString());
// Prepare request to `Document Parser` API
Request request = new Request.Builder()
.url("https://api.pdf.co/v1/pdf/documentparser")
.addHeader("x-api-key", API_KEY) // (!) Set API Key
.addHeader("Content-Type", "application/json")
.post(body)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
if (response.code() == 200)
{
// Parse JSON response
JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();
boolean error = json.get("error").getAsBoolean();
if (!error)
{
// Get URL of generated JSON file
String resultFileUrl = json.get("url").getAsString();
// Download JSON file
downloadFile(webClient, resultFileUrl, destinationFile.toFile());
System.out.printf("Generated JSON file saved as \"%s\" file.", destinationFile.toString());
}
else
{
// Display service reported error
System.out.println(json.get("message").getAsString());
}
}
else
{
// Display request error
System.out.println(response.code() + " " + response.message());
}
}
public static boolean uploadFile(OkHttpClient webClient, String apiKey, String url, Path sourceFile) throws IOException
{
// Prepare request body
RequestBody body = RequestBody.create(MediaType.parse("application/octet-stream"), sourceFile.toFile());
// Prepare request
Request request = new Request.Builder()
.url(url)
.addHeader("x-api-key", apiKey) // (!) Set API Key
.addHeader("content-type", "application/octet-stream")
.put(body)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
return (response.code() == 200);
}
public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
{
// Prepare request
Request request = new Request.Builder()
.url(url)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
byte[] fileBytes = response.body().bytes();
// Save downloaded bytes to file
OutputStream output = new FileOutputStream(destinationFile);
output.write(fileBytes);
output.flush();
output.close();
response.close();
}
}
import os
import requests # pip install requests
# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "*************************************"
# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"
# Source PDF file
SourceFile = ".\\MultiPageTable.pdf"
# Destination JSON file name
DestinationFile = ".\\result.json"
// Template text. Use Document Parser (https://pdf.co/document-parser, https://app.pdf.co/document-parser)
# to create templates.
# Read template from file:
file_read = open(".\\MultiPageTable-template1.yml", mode='r', encoding="utf-8",errors="ignore")
Template = file_read.read()
file_read.close()
def main(args = None):
uploadedFileUrl = uploadFile(SourceFile)
if (uploadedFileUrl != None):
PerformDocumentParser(uploadedFileUrl, Template, DestinationFile)
def PerformDocumentParser(uploadedFileUrl, template, destinationFile):
# Content
data = {
'url': uploadedFileUrl,
'template': template
}
# Prepare URL for 'Document Parser' API request
url = "{}/pdf/documentparser".format(BASE_URL)
# Execute request and get response as JSON
response = requests.post(url, data= data, headers={ "x-api-key": API_KEY })
if (response.status_code == 200):
json = response.json()
if json["error"] == False:
# Get URL of result file
resultFileUrl = json["url"]
# Download result file
r = requests.get(resultFileUrl, stream=True)
if (r.status_code == 200):
with open(destinationFile, 'wb') as file:
for chunk in r:
file.write(chunk)
print(f"Result file saved as \"{destinationFile}\" file.")
else:
print(f"Request error: {response.status_code} {response.reason}")
else:
# Show service reported error
print(json["message"])
else:
print(f"Request error: {response.status_code} {response.reason}")
def uploadFile(fileName):
"""Uploads file to the cloud"""
# 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.
# Prepare URL for 'Get Presigned URL' API request
url = "{}/file/upload/get-presigned-url?contenttype=application/octet-stream&name={}".format(
BASE_URL, os.path.basename(fileName))
# Execute request and get response as JSON
response = requests.get(url, headers={"x-api-key": API_KEY})
if (response.status_code == 200):
json = response.json()
if json["error"] == False:
# URL to use for file upload
uploadUrl = json["presignedUrl"]
# URL for future reference
uploadedFileUrl = json["url"]
# 2. UPLOAD FILE TO CLOUD.
with open(fileName, 'rb') as file:
requests.put(uploadUrl, data=file,
headers={"x-api-key": API_KEY, "content-type": "application/octet-stream"})
return uploadedFileUrl
else:
# Show service reported error
print(json["message"])
else:
print(f"Request error: {response.status_code} {response.reason}")
return None
if __name__ == '__main__':
main()
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
namespace PDFcoApiExample
{
class Program
{
// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const String API_KEY = "***********************************";
// Source PDF file
const string SourceFile = @".\MultiPageTable.pdf";
// PDF document password. Leave empty for unprotected documents.
const string Password = "";
// Destination TXT file name
const string DestinationFile = @".\result.json";
static void Main(string[] args)
{
// Template text. Use Document Parser (https://pdf.co/document-parser, https://app.pdf.co/document-parser)
// to create templates.
// Read template from file:
String templateText = File.ReadAllText(@".\MultiPageTable-template1.yml");
// Create standard .NET web client instance
WebClient webClient = new WebClient();
// Set API Key
webClient.Headers.Add("x-api-key", API_KEY);
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have a direct file URL, skip to the step 3.
// Prepare URL for `Get Presigned URL` API call
string query = Uri.EscapeUriString(string.Format(
"https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name={0}",
Path.GetFileName(SourceFile)));
try
{
// Execute request
string response = webClient.DownloadString(query);
// Parse JSON response
JObject json = JObject.Parse(response);
if (json["error"].ToObject<bool>() == false)
{
// Get URL to use for the file upload
string uploadUrl = json["presignedUrl"].ToString();
string uploadedFileUrl = json["url"].ToString();
// 2. UPLOAD THE FILE TO CLOUD.
webClient.Headers.Add("content-type", "application/octet-stream");
webClient.UploadFile(uploadUrl, "PUT", SourceFile); // You can use UploadData() instead if your file is byte[] or Stream
webClient.Headers.Remove("content-type");
// 3. PARSE UPLOADED PDF DOCUMENT
// URL of `Document Parser` API call
string url = "https://api.pdf.co/v1/pdf/documentparser";
Dictionary<string, string> requestBody = new Dictionary<string, string>();
requestBody.Add("template", templateText);
requestBody.Add("name", Path.GetFileName(DestinationFile));
requestBody.Add("url", uploadedFileUrl);
// Convert dictionary of params to JSON
string jsonPayload = JsonConvert.SerializeObject(requestBody);
// Execute request
response = webClient.UploadString(url, "POST", jsonPayload);
// Parse response
json = JObject.Parse(response);
if (json["error"].ToObject<bool>() == false)
{
// Get URL of generated JSON file
string resultFileUrl = json["url"].ToString();
// Download JSON file
webClient.DownloadFile(resultFileUrl, DestinationFile);
Console.WriteLine("Generated JSON file saved as \"{0}\" file.", DestinationFile);
}
else
{
Console.WriteLine(json["message"].ToString());
}
}
else
{
Console.WriteLine(json["message"].ToString());
}
}
catch (WebException e)
{
Console.WriteLine(e.ToString());
}
webClient.Dispose();
Console.WriteLine();
Console.WriteLine("Press any key...");
Console.ReadKey();
}
}
}
package com.company;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import com.google.gson.JsonPrimitive;
import okhttp3.*;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class Main
{
// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
final static String API_KEY = "***********************************";
public static void main(String[] args) throws IOException
{
// Source PDF file
final Path SourceFile = Paths.get(".\\MultiPageTable.pdf");
// PDF document password. Leave empty for unprotected documents.
final String Password = "";
// Destination JSON file name
final Path DestinationFile = Paths.get(".\\result.json");
// Template text. Use Document Parser (https://pdf.co/document-parser, https://app.pdf.co/document-parser)
// to create templates.
// Read template from file:
String templateText = new String(Files.readAllBytes(Paths.get(".\\MultiPageTable-template1.yml")), StandardCharsets.UTF_8);
// Create HTTP client instance
OkHttpClient webClient = new OkHttpClient();
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have a direct file URL, skip to the step 3.
// Prepare URL for `Get Presigned URL` API call
String query = String.format(
"https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=%s",
SourceFile.getFileName());
// Prepare request
Request request = new Request.Builder()
.url(query)
.addHeader("x-api-key", API_KEY) // (!) Set API Key
.build();
// Execute request
Response response = webClient.newCall(request).execute();
if (response.code() == 200)
{
// Parse JSON response
JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();
boolean error = json.get("error").getAsBoolean();
if (!error)
{
// Get URL to use for the file upload
String uploadUrl = json.get("presignedUrl").getAsString();
// Get URL of uploaded file to use with later API calls
String uploadedFileUrl = json.get("url").getAsString();
// 2. UPLOAD THE FILE TO CLOUD.
if (uploadFile(webClient, API_KEY, uploadUrl, SourceFile))
{
// 3. PARSE UPLOADED PDF DOCUMENT
ParseDocument(webClient, API_KEY, DestinationFile, Password, uploadedFileUrl, templateText);
}
}
else
{
// Display service reported error
System.out.println(json.get("message").getAsString());
}
}
else
{
// Display request error
System.out.println(response.code() + " " + response.message());
}
}
public static void ParseDocument(OkHttpClient webClient, String apiKey, Path destinationFile,
String password, String uploadedFileUrl, String templateText) throws IOException
{
// Prepare POST request body in JSON format
JsonObject jsonBody = new JsonObject();
jsonBody.add("url", new JsonPrimitive(uploadedFileUrl));
jsonBody.add("template", new JsonPrimitive(templateText));
RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonBody.toString());
// Prepare request to `Document Parser` API
Request request = new Request.Builder()
.url("https://api.pdf.co/v1/pdf/documentparser")
.addHeader("x-api-key", API_KEY) // (!) Set API Key
.addHeader("Content-Type", "application/json")
.post(body)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
if (response.code() == 200)
{
// Parse JSON response
JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();
boolean error = json.get("error").getAsBoolean();
if (!error)
{
// Get URL of generated JSON file
String resultFileUrl = json.get("url").getAsString();
// Download JSON file
downloadFile(webClient, resultFileUrl, destinationFile.toFile());
System.out.printf("Generated JSON file saved as \"%s\" file.", destinationFile.toString());
}
else
{
// Display service reported error
System.out.println(json.get("message").getAsString());
}
}
else
{
// Display request error
System.out.println(response.code() + " " + response.message());
}
}
public static boolean uploadFile(OkHttpClient webClient, String apiKey, String url, Path sourceFile) throws IOException
{
// Prepare request body
RequestBody body = RequestBody.create(MediaType.parse("application/octet-stream"), sourceFile.toFile());
// Prepare request
Request request = new Request.Builder()
.url(url)
.addHeader("x-api-key", apiKey) // (!) Set API Key
.addHeader("content-type", "application/octet-stream")
.put(body)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
return (response.code() == 200);
}
public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
{
// Prepare request
Request request = new Request.Builder()
.url(url)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
byte[] fileBytes = response.body().bytes();
// Save downloaded bytes to file
OutputStream output = new FileOutputStream(destinationFile);
output.write(fileBytes);
output.flush();
output.close();
response.close();
}
}
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Document Parse Results</title>
</head>
<body>
<?php
// Get submitted form data
$apiKey = $_POST["apiKey"]; // The authentication key (API Key). Get your own by registering at https://app.pdf.co
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have the direct PDF file link, go to the step 3.
// Create URL
$url = "https://api.pdf.co/v1/file/upload/get-presigned-url" .
"?contenttype=application/octet-stream";
// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Execute request
$result = curl_exec($curl);
if (curl_errno($curl) == 0)
{
$status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($status_code == 200)
{
$json = json_decode($result, true);
// Get URL to use for the file upload
$uploadFileUrl = $json["presignedUrl"];
// Get URL of uploaded file to use with later API calls
$uploadedFileUrl = $json["url"];
// 2. UPLOAD THE FILE TO CLOUD.
$localFile = $_FILES["fileInput"]["tmp_name"];
$fileHandle = fopen($localFile, "r");
curl_setopt($curl, CURLOPT_URL, $uploadFileUrl);
curl_setopt($curl, CURLOPT_HTTPHEADER, array("content-type: application/octet-stream"));
curl_setopt($curl, CURLOPT_PUT, true);
curl_setopt($curl, CURLOPT_INFILE, $fileHandle);
curl_setopt($curl, CURLOPT_INFILESIZE, filesize($localFile));
// Execute request
curl_exec($curl);
fclose($fileHandle);
if (curl_errno($curl) == 0)
{
$status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($status_code == 200)
{
// Read all template texts
$templateText = file_get_contents($_FILES["fileTemplate"]["tmp_name"]);
// 3. PARSE UPLOADED PDF DOCUMENT
ParseDocument($apiKey, $uploadedFileUrl, $templateText);
}
else
{
// Display request error
echo "<p>Status code: " . $status_code . "</p>";
echo "<p>" . $result . "</p>";
}
}
else
{
// Display CURL error
echo "Error: " . curl_error($curl);
}
}
else
{
// Display service reported error
echo "<p>Status code: " . $status_code . "</p>";
echo "<p>" . $result . "</p>";
}
curl_close($curl);
}
else
{
// Display CURL error
echo "Error: " . curl_error($curl);
}
function ParseDocument($apiKey, $uploadedFileUrl, $templateText)
{
// (!) Make asynchronous job
$async = TRUE;
// Prepare URL for Document parser API call.
// See documentation: https://developer.pdf.co/api/document-parser
$url = "https://api.pdf.co/v1/pdf/documentparser";
// Prepare requests params
$parameters = array();
$parameters["url"] = $uploadedFileUrl;
$parameters["template"] = $templateText;
$parameters["async"] = $async;
// Create Json payload
$data = json_encode($parameters);
// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
// Execute request
$result = curl_exec($curl);
echo $result . "<br/>";
if (curl_errno($curl) == 0)
{
$status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($status_code == 200)
{
$json = json_decode($result, true);
if (!isset($json["error"]) || $json["error"] == false)
{
// URL of generated JSON file that will available after the job completion
$resultFileUrl = $json["url"];
// Asynchronous job ID
$jobId = $json["jobId"];
// Check the job status in a loop
do
{
$status = CheckJobStatus($jobId, $apiKey); // Possible statuses: "working", "failed", "aborted", "success".
// Display timestamp and status (for demo purposes)
echo "<p>" . date(DATE_RFC2822) . ": " . $status . "</p>";
if ($status == "success")
{
// Display link to JSON file with information about parsed fields
echo "<div><h2>Parsing Result:</h2><a href='" . $resultFileUrl . "' target='_blank'>" . $resultFileUrl . "</a></div>";
break;
}
else if ($status == "working")
{
// Pause for a few seconds
sleep(3);
}
else
{
echo $status . "<br/>";
break;
}
}
while (true);
}
else
{
// Display service reported error
echo "<p>Error: " . $json["message"] . "</p>";
}
}
else
{
// Display request error
echo "<p>Status code: " . $status_code . "</p>";
echo "<p>" . $result . "</p>";
}
}
else
{
// Display CURL error
echo "Error: " . curl_error($curl);
}
}
function CheckJobStatus($jobId, $apiKey)
{
$status = null;
// Create URL
$url = "https://api.pdf.co/v1/job/check";
// Prepare requests params
$parameters = array();
$parameters["jobid"] = $jobId;
// Create Json payload
$data = json_encode($parameters);
// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
// Execute request
$result = curl_exec($curl);
if (curl_errno($curl) == 0)
{
$status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($status_code == 200)
{
$json = json_decode($result, true);
if (!isset($json["error"]) || $json["error"] == false)
{
$status = $json["status"];
}
else
{
// Display service reported error
echo "<p>Error: " . $json["message"] . "</p>";
}
}
else
{
// Display request error
echo "<p>Status code: " . $status_code . "</p>";
echo "<p>" . $result . "</p>";
}
}
else
{
// Display CURL error
echo "Error: " . curl_error($curl);
}
// Cleanup
curl_close($curl);
return $status;
}
?>
</body>
</html>
Was this page helpful?
Assistant
Responses are generated using AI and may contain mistakes.