Unstructured and complex documents user guide
The Autopilot for taxonomy generation feature speeds up the setup of Unstructured and Complex Documents projects by generating a starting taxonomy from your sample documents and a use case description.
Autopilot analyzes the sample documents, interprets your use case description, and recommends a structured taxonomy tailored to them. This reduces the manual effort typically involved in defining large numbers of fields. Once you receive the recommended taxonomy, you can create a project with it and adjust it as needed.
1. Upload your sample documents
Upload one or two representative documents that contain examples of all the data you want to extract. Representative means documents that reflect the layout, structure, and field types you want the model to learn from.
The uploaded documents can total at most 100 pages combined. The model does not use any pages beyond this limit.
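Before uploading, it can help to check the combined page count yourself. The following is a minimal sketch, not part of the product: the function name `check_page_budget` and its return keys are hypothetical, and the page counts are assumed to come from whatever tool you normally use to inspect your documents.

```python
PAGE_LIMIT = 100   # combined page limit stated in this guide
MAX_DOCUMENTS = 2  # the guide asks for one or two sample documents


def check_page_budget(page_counts, limit=PAGE_LIMIT):
    """Summarize whether the sample documents fit the combined page limit.

    page_counts: number of pages in each sample document (hypothetical input).
    """
    if not 1 <= len(page_counts) <= MAX_DOCUMENTS:
        raise ValueError("Provide one or two sample documents.")
    if any(count < 1 for count in page_counts):
        raise ValueError("Each document needs at least one page.")
    total = sum(page_counts)
    return {
        "total_pages": total,
        "within_limit": total <= limit,
        # Pages past the limit are not used by the model, per this guide.
        "pages_not_used": max(0, total - limit),
    }


print(check_page_budget([60, 55]))
```

For example, two documents of 60 and 55 pages total 115 pages, so 15 pages would not be used. In that case, consider trimming the samples to the pages that contain the data you want to extract.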
2. Describe what you want to extract
Provide a clear, specific description of your goal in natural language.
| Example description | Recommendation |
|---|---|
| Create a taxonomy for this document. | This is not recommended because it is too broad and may generate an excessively detailed taxonomy. |
| Extract key brokerage account details, including account owner, account number, market value, and account type. | This is recommended because it is focused and efficient. |
You can also add explicit constraints to your description, for example:
- Do not include fields other than those mentioned above.
- For the employee name, extract only the first name.
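If you prepare descriptions for several projects, a small helper can keep them focused in the way recommended above. This is only an illustrative sketch: `build_description` and `join_fields` are hypothetical names, and the resulting text is simply pasted into the description box. Autopilot does not require any particular format.

```python
def join_fields(fields):
    """Join field names into a readable English list."""
    if len(fields) <= 2:
        return " and ".join(fields)
    return ", ".join(fields[:-1]) + ", and " + fields[-1]


def build_description(subject, fields, constraints=()):
    """Compose a focused use case description from explicit fields and constraints."""
    if not fields:
        raise ValueError("List at least one field so the description stays focused.")
    sentences = [f"Extract key {subject}, including {join_fields(fields)}."]
    sentences.extend(constraints)
    return " ".join(sentences)


description = build_description(
    "brokerage account details",
    ["account owner", "account number", "market value", "account type"],
    ["Do not include fields other than those mentioned above."],
)
print(description)
```

Naming the fields explicitly and appending constraints as separate sentences mirrors the recommended example in the table, and avoids the overly broad prompt shown in the first row.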
3. Generate the taxonomy
The system analyzes your input and proposes a taxonomy with field groups, individual fields, and extraction instructions.
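To picture the shape of a proposed taxonomy, it can be modeled as field groups that contain fields, each with its own extraction instructions. The sketch below is an assumption for illustration only: the class names, the example group, and the instruction text are hypothetical and do not represent the product's actual schema or output.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyField:
    name: str
    instructions: str  # extraction instructions for this field


@dataclass
class FieldGroup:
    name: str
    fields: list[TaxonomyField] = field(default_factory=list)


@dataclass
class Taxonomy:
    groups: list[FieldGroup] = field(default_factory=list)

    def rows(self):
        """Yield (field group, field name, instructions) tuples, one per field."""
        for group in self.groups:
            for item in group.fields:
                yield (group.name, item.name, item.instructions)


# Hypothetical example based on the brokerage description used in this guide.
example = Taxonomy(groups=[
    FieldGroup("Account details", [
        TaxonomyField("Account owner", "Extract the full name of the account owner."),
        TaxonomyField("Account number", "Extract the account number as printed."),
    ]),
])

for row in example.rows():
    print(row)
```

Flattening the groups into rows makes it easier to review each field alongside its instructions before you create the project.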
4. Iterate and finalize
If needed, you can rephrase or expand your input prompt to generate a new recommended taxonomy. You can also easily edit the taxonomy in the project once you create it.
To use the Autopilot for taxonomy generation feature, proceed as follows:
- In the Unstructured and Complex Documents capability, select Create project.
- In the Create project pop-up window that appears, fill in and configure the following:
  - Project name - Enter a name for your project.
  - Use Autopilot to generate project taxonomy - Use the toggle to enable or disable the feature.
  - Tell us about your document extraction project - Describe what you want to extract.
  - Drop your documents here - Upload one or two documents, but make sure these do not exceed a total of 100 pages.
- Select Generate with Autopilot when you are ready to generate the taxonomy.
- On the new page, you can view the generated taxonomy under the columns Field group and field name and Instructions.
Note: Autopilot can make mistakes, so double-check the taxonomy. If needed, update your instructions and regenerate the taxonomy, or make corrections within the project once it is created.
- At this point, you can also update the current taxonomy by adding more files through the plus button, editing the description, or both, and then selecting Generate.
Note: If you want to reset everything, select the Clear button.
- When you are happy with the generated taxonomy, select Continue to create a project with the taxonomy pre-populated and your context documents uploaded.
The following limitations apply:
- The combined limit of documents uploaded is 100 pages. The model does not use any pages beyond this limit.
- You cannot edit the generated taxonomy directly. You either need to update your context and regenerate a new taxonomy, or you need to create your project with the recommended taxonomy and then edit it through the Manage Taxonomy page or Document Annotation view.