Analyze Single Page Document

The Analyze Single Page Document activity uses the Amazon Textract AnalyzeDocument API to synchronously analyze a local document (DocumentPath) or a document stored in an S3 bucket (Bucket, DocumentName, and Version). If your document includes a table, you can indicate whether the first row contains column headers (DiscoverColumnHeaders) and whether empty rows should be ignored (IgnoreEmptyRows).

After analyzing the document, the activity returns the document properties in a PageDetail object (Page) that you can use as input variables in other activities outside of the Amazon Textract Activities Package.

How It Works

The following steps and message sequence diagram are an example of how the activity works from design time (i.e., the activity dependencies and input/output properties) to run time.

Complete the Setup steps.
Add the Amazon Scope activity to your project.
Add the Analyze Single Page Document activity inside the Amazon Scope activity.
Enter values for the Local Path or the S3 Storage input properties.
Create and enter a PageDetail variable for your Output (Page) property.
Run the activity.
- Your input properties are sent to the AnalyzeDocument API.
- The API returns the PageDetail value (Page) to your output property variable.

Properties

The values for the following properties are specified when adding this activity to your project in UiPath Studio.

Common

DisplayName

The display name of the activity.

Attributes	Details
Type	`String`
Required	Yes
Default value	Analyze Single Page Document
Allowed values	Enter a `String` or `String` variable.
Notes	N/A

Local Path

DocumentPath

The local location of the file that you want to analyze.

Attributes	Details
Type	`String`
Required	Yes (if S3 Storage properties are empty)
Default value	Empty
Allowed values	Enter a `String` or `String` variable.
Notes	Supported document formats: PNG and JPEG (PDF is not supported in synchronous calls).
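
Because only PNG and JPEG are accepted for local files, it can help to check a file before running the activity. The following is a minimal sketch, not part of the activity: the function name `detect_supported_format` is hypothetical, and it inspects the standard PNG and JPEG file signatures rather than trusting the file extension.

```python
from typing import Optional

# Standard file signatures (magic bytes) for the two supported formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_supported_format(data: bytes) -> Optional[str]:
    """Return "PNG" or "JPEG" if the bytes start with a known signature, else None."""
    if data.startswith(PNG_SIGNATURE):
        return "PNG"
    if data.startswith(JPEG_SIGNATURE):
        return "JPEG"
    return None
```

Reading the first few bytes of the file at DocumentPath and passing them to this function is enough; a result of `None` suggests the file is in an unsupported format such as PDF.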

Options

AnalysisType

Specifies the types of analysis to perform. Use Tables to return information about the tables that are detected in the input document and Forms to return detected form data.

Attributes	Details
Type	enum
Required	No
Default value	All
Allowed values	All, Tables, Forms
Notes	N/A
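
As an illustration of how the allowed values might relate to the AnalyzeDocument request, the sketch below maps each AnalysisType value to the API's `FeatureTypes` list. `TABLES` and `FORMS` are feature types accepted by AnalyzeDocument; the assumption that All requests both of them, and the helper name `feature_types_for`, are not confirmed by this page.

```python
from typing import Dict, List

# Assumed mapping from the activity's AnalysisType values to FeatureTypes.
FEATURE_TYPES: Dict[str, List[str]] = {
    "All": ["TABLES", "FORMS"],
    "Tables": ["TABLES"],
    "Forms": ["FORMS"],
}


def feature_types_for(analysis_type: str = "All") -> List[str]:
    """Return the FeatureTypes list for an AnalysisType value (hypothetical helper)."""
    try:
        # Return a copy so callers cannot modify the shared mapping.
        return list(FEATURE_TYPES[analysis_type])
    except KeyError:
        raise ValueError(
            f"Unknown AnalysisType {analysis_type!r}; expected All, Tables, or Forms."
        ) from None
```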

DiscoverColumnHeaders

Indicates whether the tables in the document include column headers.

Attributes	Details
Type	Checkbox
Required	No
Default value	Not Selected
Allowed values	Selected or Not Selected
Notes	N/A

IgnoreEmptyRows

Indicates whether empty rows in the document tables should be ignored when analyzing the document.

Attributes	Details
Type	Checkbox
Required	No
Default value	Not Selected
Allowed values	Selected or Not Selected
Notes	N/A
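
The combined effect of DiscoverColumnHeaders and IgnoreEmptyRows can be pictured with a small sketch that treats a detected table as a list of rows of cell text. This is an assumption about the behavior, not the activity's actual code: the function name `apply_table_options`, the order in which the two options are applied, and the definition of an empty row (every cell blank or whitespace) are all hypothetical.

```python
from typing import List, Optional, Tuple


def apply_table_options(
    rows: List[List[str]],
    discover_column_headers: bool = False,
    ignore_empty_rows: bool = False,
) -> Tuple[Optional[List[str]], List[List[str]]]:
    """Return (headers, data_rows) after applying the two table options."""
    if ignore_empty_rows:
        # Assumption: a row is empty when every cell is blank or whitespace.
        rows = [row for row in rows if any(cell.strip() for cell in row)]

    headers: Optional[List[str]] = None
    if discover_column_headers and rows:
        # Assumption: the first remaining row holds the column headers.
        headers, rows = rows[0], rows[1:]

    return headers, rows
```

For example, given the rows `[["Name", "Qty"], ["", ""], ["Pen", "2"]]` with both options selected, this sketch returns `["Name", "Qty"]` as the headers and `[["Pen", "2"]]` as the data rows.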

S3 Storage

Bucket

The name of the Amazon S3 bucket where the document is stored.

Attributes	Details
Type	`String`
Required	Yes (if the Local Path property is empty)
Default value	Empty
Allowed values	Enter a `String` or `String` variable.
Notes	The AWS Region for the S3 bucket that contains the document must match the Region that you selected in the Amazon Scope activity. For Amazon Textract to process a file in an S3 bucket, the user must have permission to access the S3 bucket; for more information, see step 6 in the Create IAM User section of the Setup guide.

DocumentName

The case-sensitive name of the file in the specified Bucket that you want to analyze.

Attributes	Details
Type	`String`
Required	Yes (if the Local Path property is empty)
Default value	Empty
Allowed values	Enter a `String` or `String` variable.
Notes	Supported document formats: PNG, JPEG, and PDF.

Version

If the bucket has versioning enabled, you can specify the object version.

Attributes	Details
Type	`String`
Required	No
Default value	Empty
Allowed values	Enter a `String` or `String` variable.
Notes	N/A

Misc

Private

If selected, the values of variables and arguments are no longer logged at Verbose level.

Attributes	Details
Type	Checkbox
Required	No
Default value	Not Selected
Allowed values	Selected or Not Selected
Notes	N/A

Output

Page

The properties extracted from the specified document.

Attributes	Details
Type	`PageDetail`
Required	No (recommended if you plan to use the output data in subsequent activities)
Default value	Empty
Allowed values	Enter a `PageDetail` variable.
Notes	See Page Detail for the description of the PageDetail object and its properties.