# Analyze multipage document

The **Analyze Multipage Document** activity uses the Amazon Textract [StartDocumentAnalysis](https://docs.aws.amazon.com/textract/latest/dg/API_StartDocumentAnalysis.html) and [GetDocumentAnalysis](https://docs.aws.amazon.com/textract/latest/dg/API_GetDocumentAnalysis.html) APIs to analyze a multi-page document stored in an S3 bucket (**Bucket**, **DocumentName**, and **Version**). If your document includes a table, you have the option to indicate if the first row contains column headers (**DiscoverColumnHeaders**) and/or ignore empty rows (**IgnoreEmptyRows**).

After analyzing the document, the activity returns the document properties in a `PageDetailCollection` object (**Pages**) that you can use as input variables in other activities outside of the Amazon Textract Activities Package.

The **Analyze Multipage Document** activity essentially combines the [Start Document Analysis](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-start-document-analysis#start-document-analysis), [Get Document Analysis Status](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-get-document-analysis-status#get-document-analysis-status), and [Get Document Analysis](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-get-document-analysis#get-document-analysis) activities into a single activity.

:::important
In previous versions of this activity, the **Pages** output parameter returned a `PageDetail[]` object. In version 2.0, this changed to a `PageDetailCollection` so that the activity can return the `RawJson` property for the method call, which was not possible with an array.
:::

## How it works

The following steps and message sequence diagram are an example of how the activity works from design time (that is, the activity dependencies and input/output properties) to run time.

1. Complete the [Setup](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-setup#setup) steps.
2. Add the [Amazon Scope](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-scope#amazon-scope) activity to your project.
3. Add the **Analyze Multipage Document** activity inside the **Amazon Scope** activity.
4. Enter values for the [S3 Storage](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-analyze-multipage-document#s3-storage) input properties.
5. Create and enter a `PageDetailCollection` variable for your [Output](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-analyze-multipage-document#output) property.
6. Run the activity.
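   * Your input properties are sent to the [StartDocumentAnalysis](https://docs.aws.amazon.com/textract/latest/dg/API_StartDocumentAnalysis.html) API, and the activity then polls the [GetDocumentAnalysis](https://docs.aws.amazon.com/textract/latest/dg/API_GetDocumentAnalysis.html) API until the job finishes.
   * The results are returned as a `PageDetailCollection` value in your output property variable.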

     ![Analyze Multipage Document message sequence diagram](https://dev-assets.cms.uipath.com/assets/images/marketplace/marketplace-docs-image-34467-8f25d7fa-15a3967c.webp)
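
The run-time part of the steps above can be sketched in Python. This is only an illustration of the start, wait, and poll pattern against the Textract asynchronous APIs, not the activity's actual implementation. It assumes a `client` object that exposes `start_document_analysis` and `get_document_analysis` the way a boto3 Textract client does; the function name `analyze_multipage_document` and its parameters are hypothetical.

```python
import time


def analyze_multipage_document(client, bucket, document_name, version=None,
                               feature_types=("TABLES", "FORMS"),
                               initial_delay_ms=15000,
                               status_check_interval_ms=10000,
                               sleep=time.sleep):
    """Start an asynchronous analysis job, poll until it finishes, return all blocks."""
    # Describe where the document lives in S3; Version is optional.
    s3_object = {"Bucket": bucket, "Name": document_name}
    if version:
        s3_object["Version"] = version

    # Kick off the asynchronous job and remember its identifier.
    job = client.start_document_analysis(
        DocumentLocation={"S3Object": s3_object},
        FeatureTypes=list(feature_types),
    )
    job_id = job["JobId"]

    # Wait once up front, then poll at a fixed interval while the job runs.
    sleep(initial_delay_ms / 1000)
    while True:
        result = client.get_document_analysis(JobId=job_id)
        if result["JobStatus"] != "IN_PROGRESS":
            break
        sleep(status_check_interval_ms / 1000)

    if result["JobStatus"] == "FAILED":
        raise RuntimeError(result.get("StatusMessage", "Textract analysis failed"))

    # Large results are paginated; follow NextToken until it runs out.
    blocks = list(result.get("Blocks", []))
    next_token = result.get("NextToken")
    while next_token:
        page = client.get_document_analysis(JobId=job_id, NextToken=next_token)
        blocks.extend(page.get("Blocks", []))
        next_token = page.get("NextToken")
    return blocks
```

Passing `sleep` as a parameter is a design choice for the sketch: it lets you substitute a no-op when experimenting, so the polling logic can be exercised without real delays.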

## Properties

The values for the following properties are specified when adding this activity to your project in UiPath Studio.

![Analyze Multipage Document properties panel in UiPath Studio](https://dev-assets.cms.uipath.com/assets/images/marketplace/marketplace-docs-image-36588-036008fd-92f714d3.webp)

### Common

#### DisplayName

The display name of the activity.

| Attributes | Details |
| --- | --- |
| **Type** | `String` |
| **Required** | Yes |
| **Default value** | *Analyze Multipage Document* |
| **Allowed values** | Enter a `String` or `String` variable. |
| **Notes** | N/A |

### Input

Unlike the [Get Document Analysis Status](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-get-document-analysis-status#get-document-analysis-status) activity, which requires an external delay mechanism to poll the service for status changes, the **Analyze Multipage Document** activity includes the following optional input properties to set an initial status check delay (**InitialDelay**) and a status check interval (**StatusCheckInterval**).

#### InitialDelay

The amount of time to wait before the activity calls the Amazon Textract GetDocumentAnalysis API to retrieve the JobStatus value.

| Attributes | Details |
| --- | --- |
| **Type** | `Int32` (milliseconds) |
| **Required** | No |
| **Default value** | *15000* (not shown) |
| **Allowed values** | Enter an `Int32` or `Int32` variable. |
| **Notes** | Enter your value in milliseconds (e.g., *30000* for 30 seconds); your value must be greater than or equal to *15000*. When analyzing a large document, it's recommended that you enter the estimated time it takes for the Amazon Textract service to complete its analysis. For example, if your document takes up to 2 minutes to analyze, enter *120000* as your value and use the **StatusCheckInterval** property to indicate how often you want to check for an updated status if the job doesn't complete within the 2-minute estimate. |

#### StatusCheckInterval

The amount of time to wait between calls to the Amazon Textract GetDocumentAnalysis API to retrieve the JobStatus value.

| Attributes | Details |
| --- | --- |
| **Type** | `Int32` (milliseconds) |
| **Required** | No |
| **Default value** | *10000* (not shown) |
| **Allowed values** | Enter an `Int32` or `Int32` variable. |
| **Notes** | Enter your value in milliseconds (e.g., *15000* for 15 seconds); your value must be greater than or equal to *10000*. The objective of this property is to help manage the number of calls that your activity makes to the Amazon Textract API. |
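
To see how **InitialDelay** and **StatusCheckInterval** together determine the number of status calls, the following sketch computes when the checks would happen for a job of a given duration. It assumes a simple model in which the first check happens after the initial delay and each later check happens one interval after the previous one; the function `status_check_times` and its `job_duration_ms` parameter are hypothetical and exist only for this illustration.

```python
MIN_INITIAL_DELAY_MS = 15000          # documented minimum for InitialDelay
MIN_STATUS_CHECK_INTERVAL_MS = 10000  # documented minimum for StatusCheckInterval


def status_check_times(initial_delay_ms=MIN_INITIAL_DELAY_MS,
                       status_check_interval_ms=MIN_STATUS_CHECK_INTERVAL_MS,
                       job_duration_ms=0):
    """Return the elapsed times (ms) of each status check until the job is done."""
    if initial_delay_ms < MIN_INITIAL_DELAY_MS:
        raise ValueError(f"InitialDelay must be at least {MIN_INITIAL_DELAY_MS} ms")
    if status_check_interval_ms < MIN_STATUS_CHECK_INTERVAL_MS:
        raise ValueError(
            f"StatusCheckInterval must be at least {MIN_STATUS_CHECK_INTERVAL_MS} ms"
        )

    # The first check happens after the initial delay; keep checking at the
    # chosen interval until a check lands at or after the job's completion.
    checks = [initial_delay_ms]
    while checks[-1] < job_duration_ms:
        checks.append(checks[-1] + status_check_interval_ms)
    return checks
```

Under this model, the default values check a 40-second job at 15, 25, 35, and 45 seconds (four calls), whereas an initial delay of *120000* checks a job that finishes within 2 minutes only once.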

### Options

#### AnalysisType

Specifies the types of analysis to perform. Use Tables to return information about the tables that are detected in the input document and Forms to return detected form data.

| Attributes | Details |
| --- | --- |
| **Type** | enum |
| **Required** | No |
| **Default value** | All |
| **Allowed values** | All, Tables, Forms |
| **Notes** | N/A |
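
The Textract `StartDocumentAnalysis` API takes a `FeatureTypes` list whose values include `TABLES` and `FORMS`. The sketch below shows one plausible way the **AnalysisType** values could map onto that list. The mapping itself, in particular that *All* means both features, is an assumption based on the property description, and the names `FEATURE_TYPES_BY_ANALYSIS_TYPE` and `feature_types_for` are hypothetical.

```python
# Assumed mapping from the activity's AnalysisType values to Textract FeatureTypes.
FEATURE_TYPES_BY_ANALYSIS_TYPE = {
    "All": ["TABLES", "FORMS"],
    "Tables": ["TABLES"],
    "Forms": ["FORMS"],
}


def feature_types_for(analysis_type="All"):
    """Return the FeatureTypes list for an AnalysisType value."""
    try:
        # Return a copy so callers cannot mutate the shared mapping.
        return list(FEATURE_TYPES_BY_ANALYSIS_TYPE[analysis_type])
    except KeyError:
        allowed = ", ".join(FEATURE_TYPES_BY_ANALYSIS_TYPE)
        raise ValueError(f"AnalysisType must be one of: {allowed}") from None
```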

#### DiscoverColumnHeaders

Indicates whether the tables in the document include column headers.

| Attributes | Details |
| --- | --- |
| **Type** | Checkbox |
| **Required** | No |
| **Default value** | Not Selected |
| **Allowed values** | Selected or Not Selected |
| **Notes** | N/A |

#### IgnoreEmptyRows

Indicates whether empty rows in the document tables should be ignored when analyzing the document.

| Attributes | Details |
| --- | --- |
| **Type** | Checkbox |
| **Required** | No |
| **Default value** | Not Selected |
| **Allowed values** | Selected or Not Selected |
| **Notes** | N/A |
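
The intended effect of **DiscoverColumnHeaders** and **IgnoreEmptyRows** can be pictured with a small example. The sketch below is not the activity's implementation, which this page does not describe. It assumes a table has already been extracted as a list of rows of cell text, and the function `shape_table` is hypothetical.

```python
def shape_table(rows, discover_column_headers=False, ignore_empty_rows=False):
    """Apply header discovery and empty-row filtering to a table of cell strings."""
    if ignore_empty_rows:
        # A row counts as empty when every cell is blank or whitespace.
        rows = [row for row in rows if any(cell.strip() for cell in row)]

    if discover_column_headers and rows:
        # Treat the first remaining row as column names for the rows below it.
        headers, data = rows[0], rows[1:]
        return [dict(zip(headers, row)) for row in data]

    return [list(row) for row in rows]
```

For instance, with both options enabled, the rows `["Name", "Qty"]`, `["", ""]`, and `["Widget", "3"]` reduce to a single record that maps `Name` to `Widget` and `Qty` to `3`.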

### S3 Storage

#### Bucket

The name of the S3 bucket where the document is stored.

| Attributes | Details |
| --- | --- |
| **Type** | `String` |
| **Required** | Yes |
| **Default value** | Empty |
| **Allowed values** | Enter a `String` or `String` variable. |
| **Notes** | The AWS Region for the S3 bucket that contains the document must match the **Region** that you selected in the **Amazon Scope** activity.  For Amazon Textract to process a file in an S3 bucket, the user must have permission to access the S3 bucket; for more information, see **step 6** in the [Create IAM User](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-setup#setup) section of the **Setup** guide. |

#### DocumentName

The case-sensitive name of the file in the specified **Bucket** that you want to analyze.

| Attributes | Details |
| --- | --- |
| **Type** | `String` |
| **Required** | Yes |
| **Default value** | Empty |
| **Allowed values** | Enter a `String` or `String` variable. |
| **Notes** | Supported document formats: PNG, JPEG, and PDF. |

#### Version

If the bucket has versioning enabled, you can specify the object version.

| Attributes | Details |
| --- | --- |
| **Type** | `String` |
| **Required** | No |
| **Default value** | Empty |
| **Allowed values** | Enter a `String` or `String` variable. |
| **Notes** | N/A |

### Misc

#### Private

If selected, the values of variables and arguments are no longer logged at Verbose level.

| Attributes | Details |
| --- | --- |
| **Type** | Checkbox |
| **Required** | No |
| **Default value** | Not Selected |
| **Allowed values** | Selected or Not Selected |
| **Notes** | N/A |

### Output

#### Pages

The properties extracted from the specified document, returned as a collection.

| Attributes | Details |
| --- | --- |
| **Type** | `PageDetailCollection` |
| **Required** | No (recommended if you plan to use the output data in subsequent activities) |
| **Default value** | Empty |
| **Allowed values** | Enter a `PageDetailCollection` variable |
| **Notes** | Each object in the collection represents the results for one individual page. This is a change from previous versions, which returned a `PageDetail[]` object. See [Page Detail](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-page-detail#the-page-detail-object) for the description of the `PageDetail` object and its properties. |
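
If you export the `RawJson` value for processing outside UiPath, a per-page grouping similar to the one the collection provides can be approximated as follows. This sketch assumes the raw JSON has the shape of a Textract `GetDocumentAnalysis` response, that is, a top-level `Blocks` list in which each block carries a `Page` number; whether `RawJson` matches that shape exactly is not confirmed on this page. The function `blocks_by_page` is hypothetical.

```python
import json
from collections import defaultdict


def blocks_by_page(raw_json):
    """Group Textract blocks from a raw JSON response by their page number."""
    response = json.loads(raw_json)
    pages = defaultdict(list)
    for block in response.get("Blocks", []):
        # Fall back to page 1 for blocks that carry no explicit page number.
        pages[block.get("Page", 1)].append(block)
    # Return a plain dict ordered by page number.
    return dict(sorted(pages.items()))
```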

## Example

The images in this section show an example of the activity dependency relationship and input/output property values.

For step-by-step instructions and examples, see the [Quickstart](https://docs.uipath.com/activities/other/latest/legacy-integrations/amazon-textract-quickstarts#quickstarts) guides.

![Analyze Multipage Document activity dependency and input/output property values](https://dev-assets.cms.uipath.com/assets/images/marketplace/marketplace-docs-image-35064-a99d956c-6affd902.webp)

![Analyze Multipage Document output example in UiPath Studio](https://dev-assets.cms.uipath.com/assets/images/marketplace/marketplace-docs-image-36244-bde8391e-b89866eb.webp)
