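Generative features
Generative AI is a form of AI technology that uses machine learning (ML) models to create and generate new content, data, or information.
Large language models (LLMs) are key to most generative AI tasks. These are ML models trained on large amounts of text data and designed to generate human-like text. LLMs can also understand and respond to prompts by completing sentences or paragraphs in a human-like manner.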
Generative annotation is applied primarily during the automatic annotation of documents in the Build step, where generative models accelerate taxonomy design and help train models efficiently.
Pre-annotation in Document Understanding is done using a combination of generative and specialized models, based on the document type's schema. The schema clearly defines the fields you want to extract from a particular document type.
To get a deeper understanding of how Generative Annotation works and how you can use it efficiently in your projects, check the Annotate documents page.
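As a rough illustration of what "a schema that defines the fields to extract" can look like, here is a minimal Python sketch. The `SchemaField` class, the `INVOICE_SCHEMA` fields, and the `missing_fields` helper are all hypothetical names chosen for this example; they do not reflect the actual schema format used by Document Understanding.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class SchemaField:
    """One field of a document type's schema (hypothetical representation)."""
    name: str
    data_type: str  # e.g. "date", "number", "text"


# Hypothetical schema for an invoice document type.
INVOICE_SCHEMA = (
    SchemaField("date", "date"),
    SchemaField("billed_amount", "number"),
    SchemaField("company_name", "text"),
)


def missing_fields(schema: Sequence[SchemaField], annotations: Dict[str, str]) -> List[str]:
    """Return the schema fields that pre-annotation left empty or absent."""
    return [field.name for field in schema if not annotations.get(field.name)]
```

A helper like `missing_fields` shows why the schema matters: it is the reference against which pre-annotated values can be checked, so any field without a value can be flagged for manual annotation.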
Generative extraction is a Document Understanding™ feature that uses generative AI models. These models are configured through activities and are used primarily at runtime for data extraction.
Generative extraction is capable of deciphering and extracting specific information from unstructured or semi-structured documents. For instance, it can scan through an invoice and accurately retrieve details such as the date, billed amount, and company name. This enables fast, efficient, and highly accurate information gathering from various types of documents.
- Document Understanding activities package:
  - Extract Document Data, Prompt parameter after choosing the Generative extractor.
- Document Understanding ML activities package:
- IntelligentOCR activities package:
  - Data Extraction Scope, ApplyAutoValidation parameter.
You can also use Document Understanding APIs to leverage generative extraction features.
Generative classification uses AI models to automatically classify documents immediately after they are uploaded.
This automatic classification process leverages ML models to 'read' the content of a document, understand its context, and consequently classify it into predefined categories. This way, the system can handle and organize multiple types of documents efficiently.
By accurately classifying unstructured or semi-structured documents, generative classification streamlines the document processing workflow, saves time, and improves overall document management.
You can also use Document Understanding APIs to leverage generative classification features.
Generative validation is a Document Understanding feature used during the validation process. It is applied primarily after the extraction step, to cross-check low-confidence extractions made by specialized models.
When an ML model's confidence score for a document extraction is low, generative validation is used to cross-check the output. This validation process involves both the specialized and generative ML models working together to ensure accuracy.
If both models yield the same output, human validation can be bypassed, which shortens the document validation step. Because a secondary generative model cross-verifies the output, the process also gives a higher level of accuracy than relying on the specialized model alone.
- Document Understanding activities package:
  - Extract Document Data, Auto-validation parameter
- IntelligentOCR activities package:
  - Data Extraction Scope, ApplyAutoValidation and AutoValidationConfidenceThreshold parameters
You can also use Document Understanding APIs to leverage generative validation features.
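The cross-check described for generative validation can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the product's implementation: `FieldResult`, `can_skip_human_review`, the default threshold of 0.8, and the whitespace- and case-insensitive comparison are all assumptions made for this example. It also assumes that fields at or above the threshold are accepted without a cross-check, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldResult:
    """Output of the specialized model for one field (hypothetical shape)."""
    name: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case before comparing (an assumption)."""
    return " ".join(value.split()).casefold()


def can_skip_human_review(
    specialized: FieldResult,
    generative_value: Optional[str],
    threshold: float = 0.8,  # plays the role of a confidence threshold setting
) -> bool:
    """Decide whether a field can bypass human validation."""
    # Assumption: confident specialized results need no cross-check.
    if specialized.confidence >= threshold:
        return True
    # Low confidence and no generative answer: a person has to review it.
    if generative_value is None:
        return False
    # Low confidence: skip review only if both models agree on the value.
    return normalize(specialized.value) == normalize(generative_value)
```

For example, a low-confidence `FieldResult("vendor", "Acme  Corp", 0.40)` paired with a generative value of `"acme corp"` would skip review under these assumptions, while a generative value of `"Acme Ltd"` would send the field to human validation.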