Consume the model via the Document Understanding API

Access IXP unstructured and complex documents projects through the Document Understanding Framework API, using either the tag-based or the extractorId-based extraction endpoints.

IXP unstructured and complex documents projects are accessible through the same Document Understanding Framework API as other projects. IXP projects appear in Discovery with ProjectType: "IXP" and support extraction through both tag-based endpoints and extractorId-based endpoints.

Prerequisites

Before calling the Document Understanding or IXP APIs, you need an external application registered in Automation Cloud. This provides the App ID and App Secret used for OAuth authentication.

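Create an external application

Navigate to Orchestrator at the tenant level.
Select Manage access, then Manage accounts and groups.
In the UiPath Administration header, select External Applications.
Select Add application.
Enter an Application name (for example, DU API Client).
Select Confidential application, which is required to obtain an app secret.
Under Resources, select Add scopes.

From the Resource dropdown, select Document Understanding.
Switch to the Application scope(s) tab.
Check the scopes you need:
- Du.Digitization.Api: digitize documents
- Du.Classification.Api: classify documents
- Du.Extraction.Api: extract data
- Du.Validation.Api: create validation tasks
- Du.DataDeletion.Api: delete document data
Select Save.

Select Add to create the registration.

Note:

The app secret is shown only once, in a pop-up displayed right after creation, and cannot be recovered afterwards. Copy it immediately. You can generate a new secret later from the edit screen.

The App ID is always visible on the External Applications page.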

Get an access token

Use the App ID and App Secret to request an OAuth token with the client credentials flow.

curl -X POST 'https://cloud.uipath.com/identity_/connect/token' \
  -d 'grant_type=client_credentials' \
  -d 'client_id=<APP_ID>' \
  -d 'client_secret=<APP_SECRET>' \
  -d 'scope=Du.Digitization.Api Du.Extraction.Api'

Response:

{
  "access_token": "eyJh...CRaKrg",
  "expires_in": 3600,
  "token_type": "Bearer",
  "scope": "Du.Digitization.Api Du.Extraction.Api"
}

The token expires after 1 hour. Send it as Authorization: Bearer <token> in all subsequent API calls.
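The token request above can also be issued from code. The following is a minimal Python sketch using only the standard library, not an official client. The function names (build_token_request, parse_token_response, fetch_token) are hypothetical; the URL, form fields, and response keys are taken from the curl example and sample response above.

```python
import json
import time
import urllib.parse
import urllib.request

TOKEN_URL = "https://cloud.uipath.com/identity_/connect/token"


def build_token_request(app_id: str, app_secret: str, scopes: list[str]) -> urllib.request.Request:
    """Build the client-credentials POST shown in the curl example."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": app_id,
        "client_secret": app_secret,
        "scope": " ".join(scopes),  # scopes are space-separated
    }).encode("ascii")
    return urllib.request.Request(TOKEN_URL, data=form, method="POST")


def parse_token_response(body: str, now: float) -> dict:
    """Return the Authorization header value and an absolute expiry time."""
    payload = json.loads(body)
    return {
        "authorization": f"{payload['token_type']} {payload['access_token']}",
        "expires_at": now + payload["expires_in"],
    }


def fetch_token(app_id: str, app_secret: str, scopes: list[str]) -> dict:
    """Send the request. Requires network access and real credentials."""
    request = build_token_request(app_id, app_secret, scopes)
    with urllib.request.urlopen(request) as response:
        return parse_token_response(response.read().decode("utf-8"), time.time())
```

Storing expires_at lets a caller request a fresh token shortly before the one-hour lifetime runs out instead of waiting for a rejected call.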

Note:

If you lose the app secret, go to Admin > External Applications, edit the app, and select Generate new under App Secret. Then update all integrations with the new secret.

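Key differences

The following table summarizes the key differences between Document Understanding projects and IXP projects.

	Document Understanding (classic or modern)	IXP
projectType	`Classic` or `Modern`	`IXP`
Classification	Yes	No (extraction only)
Extraction routing	`tag` + `documentTypeId` (recommended) or `extractorId`	`tag` + `documentTypeId` or `extractorId` (`gpt_ixp_[version]`)
Versioning	Extractors/classifiers	Tags (Staging, Production)
Extraction models	Specialized or generative	Generative AI only (GPT-4o, Gemini)
Schema definition	In the project, or via prompts	Defined in the IXP UI (taxonomy)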

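IXP workflow

Discover the project and its tags.
Digitize and extract (in parallel).
Validate (optional).

Note:

IXP handles extraction only, so there is no classification step.

Parallel digitization and extraction (IXP only)

For IXP projects, you can skip polling for the digitization result and start extraction immediately after submitting the document for digitization. The backend runs both operations in parallel: digitization and IXP extraction progress at the same time, and the final extraction result is returned only after both operations have completed.

This is an IXP-specific optimization and does not work with Document Understanding classic or modern projects. For those projects, you must wait for digitization to complete before calling extraction.

Optimized flow: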

# 1. Start digitization (fire and forget — do not poll for result).
POST /projects/{projectId}/digitization/start
# → returns { "documentId": "..." }
# 2. Immediately start extraction with the documentId (no need to wait).
POST /projects/{projectId}/{tag}/document-types/{documentTypeId}/extraction/start
# → returns { "operationId": "..." }
# 3. Poll extraction result only — it waits for both digitization and extraction.
GET /projects/{projectId}/{tag}/document-types/{documentTypeId}/extraction/result/{operationId}

This flow eliminates the idle time between digitization and extraction, which reduces total latency.
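The three calls in the optimized flow can be sketched in Python as follows. This is an illustration under stated assumptions, not an official client: the post and get callables are hypothetical transport functions supplied by the caller (they are expected to add the base URL, api-version, authorization header, and file upload encoding), and the status field with its Succeeded and Failed values follows the polling guidance given elsewhere on this page. Any other status is treated as still in progress.

```python
import time
from typing import Callable


def extract_without_waiting(
    post: Callable[[str, dict], dict],
    get: Callable[[str], dict],
    project_id: str,
    tag: str,
    document_type_id: str,
    file_path: str,
    poll_seconds: float = 2.0,
    max_polls: int = 100,
) -> dict:
    """Run the IXP-only flow: start digitization, start extraction, poll extraction."""
    # 1. Start digitization and keep only the documentId (no polling here).
    document_id = post(
        f"/projects/{project_id}/digitization/start", {"file": file_path}
    )["documentId"]

    # 2. Start extraction right away with that documentId.
    route = f"/projects/{project_id}/{tag}/document-types/{document_type_id}/extraction"
    operation_id = post(f"{route}/start", {"documentId": document_id})["operationId"]

    # 3. Poll the extraction result only; it covers both operations.
    for _ in range(max_polls):
        result = get(f"{route}/result/{operation_id}")
        if result.get("status") in ("Succeeded", "Failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"extraction {operation_id} did not finish after {max_polls} polls")
```

Passing the transport in as callables keeps the ordering logic separate from HTTP details and makes it possible to exercise the flow with stubs before pointing it at a real tenant.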

Step 1: Discover IXP projects

# List all projects — filter for type "IXP"
curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

From the response, note the id of the IXP project.
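Filtering the project list for IXP entries could look like the sketch below. It assumes you have already decoded the response into a list of project objects; the envelope around that list is not shown on this page, so unwrapping it is left to the caller. The function name ixp_projects is hypothetical, and the id, name, and type keys follow the discovery response sample shown later on this page.

```python
def ixp_projects(projects: list[dict]) -> list[tuple[str, str]]:
    """Return (id, name) pairs for every project whose type is "IXP"."""
    return [
        (project["id"], project.get("name", ""))
        for project in projects
        if project.get("type") == "IXP"
    ]
```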

Get tags (published versions)

Tags correspond to published model versions marked as Staging or Production in the IXP user interface. Each tag includes its associated extractors and document types. To get the tags, run the following command:

curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/tags?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

Get document types

To get the document types, run the following command:

curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/document-types?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

Step 2: Digitize the document

As with Document Understanding, upload the file to get a documentId.

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/digitization/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@document.pdf;type=application/pdf'

This returns { "documentId": "..." }.

Step 3: Extract

IXP extraction supports the following routing approaches:

Tag-based: routes by tag and documentTypeId. This is recommended for production or staging workflows.
Extractor ID-based: routes by extractorId, in the format gpt_ixp_[version] (for example, gpt_ixp_67). This works the same way as for Document Understanding classic or modern projects.
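The two routing approaches differ only in the path segment between the project and the extraction endpoint. The sketch below builds either URL from the path patterns used in the curl examples on this page. The function name extraction_url and its parameters are hypothetical, and the org, tenant, and project values are placeholders you supply.

```python
from typing import Optional
from urllib.parse import quote


def extraction_url(
    org: str,
    tenant: str,
    project_id: str,
    *,
    tag: Optional[str] = None,
    document_type_id: Optional[str] = None,
    extractor_id: Optional[str] = None,
    asynchronous: bool = False,
) -> str:
    """Build a tag-based or extractor ID-based extraction URL."""
    if extractor_id is not None:
        if tag is not None or document_type_id is not None:
            raise ValueError("pass either tag and document_type_id, or extractor_id")
        # Extractor ID-based routing, for example gpt_ixp_67.
        route = f"extractors/{quote(extractor_id, safe='')}"
    elif tag is not None and document_type_id is not None:
        # Tag-based routing.
        route = f"{quote(tag, safe='')}/document-types/{quote(document_type_id, safe='')}"
    else:
        raise ValueError("tag-based routing needs both tag and document_type_id")

    # The asynchronous variant appends /start to the same path.
    endpoint = "extraction/start" if asynchronous else "extraction"
    return (
        f"https://cloud.uipath.com/{org}/{tenant}/du_/api/framework"
        f"/projects/{project_id}/{route}/{endpoint}?api-version=1"
    )
```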

Tag-based extraction

Use the tag-based path with the documentTypeId from Discovery.

Synchronous (up to 5 pages)

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/extraction?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'

Asynchronous (multi-page)

Start:

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/extraction/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'

This returns { "operationId": "..." }. Poll for the result:

curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/extraction/result/<operationId>?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

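Poll until status is Succeeded or Failed.

Extractor ID-based extraction

Use the same extractor-based endpoints as for Document Understanding classic or modern projects. IXP extractor IDs follow the format gpt_ixp_[version], as shown in the discovery response. Synchronous (up to 5 pages):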

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/extractors/<ExtractorId>/extraction?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'

Asynchronous (multi-page):

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/extractors/<ExtractorId>/extraction/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'

Step 4: Validate (optional)

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/validation/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{
    "documentId": "<documentId>",
    "actionTitle": "Review IXP extraction",
    "actionPriority": "Medium",
    "actionCatalog": "default_du_actions",
    "actionFolder": "Shared",
    "storageBucketName": "du_storage_bucket",
    "storageBucketDirectoryPath": "du_storage_bucket",
    "extractionResult": { }
  }'

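IXP extraction response structure

API v1 or v1.1

In v1 and v1.1, IXP field groups are mapped to FieldType: "Table" in the response, and individual fields appear as table columns. All values are represented as text (strings), regardless of the original IXP data type.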

{
  "extractionResult": {
    "DocumentId": "...",
    "ResultsDocument": {
      "DocumentTypeId": "00000000-0000-0000-0000-000000000000",
      "DocumentTypeName": "Default",
      "Fields": [
        {
          "FieldId": "Fleet member transaction details",
          "FieldName": "Fleet member transaction details",
          "FieldType": "Table",
          "Values": []
        }
      ],
      "Tables": [
        {
          "FieldId": "Fleet member transaction details",
          "FieldName": "Fleet member transaction details",
          "Values": [
            {
              "Cells": [
                { "FieldId": "Fleet Code", "Value": "FL-7892", "Confidence": 0.95 },
                { "FieldId": "Fuel type", "Value": "Diesel", "Confidence": 0.97 }
              ]
            }
          ]
        }
      ]
    }
  }
}

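Key structural differences from Document Understanding (v1 or v1.1):

All fields belong to a field group and appear in the response as the Table type.
Even single-value fields are wrapped in the table row structure.
The Tables array contains the actual cell values.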
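Because the actual values sit in the Tables array, client code typically has to unwrap rows and cells. The following sketch assumes the response has exactly the shape of the v1 sample above (Tables, then Values as rows, then Cells with FieldId and Value); real responses may carry additional properties that this sketch ignores. The function name v1_table_rows is hypothetical.

```python
def v1_table_rows(response: dict) -> dict[str, list[dict[str, str]]]:
    """Map each field group (table) FieldId to a list of rows of {column: value}."""
    tables = response["extractionResult"]["ResultsDocument"].get("Tables", [])
    return {
        table["FieldId"]: [
            # Every value is a string in v1 and v1.1, whatever the IXP data type.
            {cell["FieldId"]: cell["Value"] for cell in row.get("Cells", [])}
            for row in table.get("Values", [])
        ]
        for table in tables
    }
```

A single-value field therefore comes back as a one-row, one-column table, which callers need to unwrap themselves.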

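API v2

In v2, IXP field groups are mapped to FieldType: "FieldGroup" instead of Table, which is an accurate mapping of the IXP field group concept. Each field retains its actual IXP data type, such as Text, Number, Date, or MonetaryQuantity, instead of everything being represented as a string.

For more information, see Migrating from API v1 to v2.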

{
  "extractionResult": {
    "ResultsDocument": {
      "Fields": [
        {
          "FieldId": "Default.Seller",
          "FieldName": "Seller",
          "FieldType": "FieldGroup",
          "IsMissing": false,
          "DataSource": "Automatic",
          "Values": [
            {
              "Components": [
                {
                  "FieldId": "Default.Seller.Name",
                  "FieldName": "Name",
                  "FieldType": "Text",
                  "Values": [
                    {
                      "Value": "John Doe",
                      "Confidence": 0.9999834
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  }
}

Key differences from v1:

FieldType: "FieldGroup" replaces FieldType: "Table".
The Tables array is removed. Field groups are returned directly in Fields.
Individual fields retain their IXP data types instead of all being strings.
Field IDs use dot notation (for example, Default.Seller.Name).
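With field groups nested directly under Fields, a small recursive walk is enough to collect leaf values by their dotted FieldId. The sketch below assumes the shape of the v2 sample above (Values entries either carry Components or a Value); the function name v2_field_values is hypothetical, and type-specific value properties beyond Value are not handled.

```python
def v2_field_values(response: dict) -> dict[str, list]:
    """Collect leaf values keyed by dotted FieldId, descending into Components."""
    found: dict[str, list] = {}

    def walk(field: dict) -> None:
        for value in field.get("Values", []):
            if "Components" in value:
                # A field group occurrence: recurse into its member fields.
                for component in value["Components"]:
                    walk(component)
            else:
                found.setdefault(field["FieldId"], []).append(value["Value"])

    for field in response["extractionResult"]["ResultsDocument"]["Fields"]:
        walk(field)
    return found
```

Values are kept in a list per FieldId because a field group can occur more than once, in which case the same dotted ID appears in each occurrence.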

IXP discovery response structure

IXP projects expose version management through tags and projectVersion.

{
  "id": "044fedbc-40a6-8078-8f06-02a0d362ab44",
  "name": "Transcom Invoices - Andras",
  "type": "IXP",
  "properties": ["SupportsTags", "SupportsVersions"],
  "extractors": [
    {
      "id": "gpt_ixp_67",
      "documentTypeId": "00000000-0000-0000-0000-000000000000",
      "projectVersion": 67
    }
  ],
  "projectVersions": [
    { "version": 67, "tag": "live", "deployed": true }
  ],
  "classifiers": []
}

The tag name (for example, live) maps to the Production or Staging label in the IXP user interface.
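To go from a tag to the extractor behind it, the version number links projectVersions to extractors. A minimal sketch, assuming the project object has the shape of the discovery sample above; the function name extractor_for_tag is hypothetical, and the deployed flag is deliberately not interpreted because its exact semantics are not described here.

```python
def extractor_for_tag(project: dict, tag: str) -> dict:
    """Return the extractor entry whose projectVersion carries the given tag."""
    version = next(
        (entry["version"] for entry in project.get("projectVersions", []) if entry.get("tag") == tag),
        None,
    )
    if version is None:
        raise LookupError(f"no project version is tagged {tag!r}")
    for extractor in project.get("extractors", []):
        if extractor.get("projectVersion") == version:
            return extractor
    raise LookupError(f"no extractor found for project version {version}")
```

For the sample above, looking up the tag live yields the extractor with id gpt_ixp_67 and the default documentTypeId, which covers both routing styles.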

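Keep the following in mind when calling the IXP extraction endpoints:

No prompts required: unlike the generative AI extractors and classifiers in Document Understanding, the IXP extraction schema is predefined in the taxonomy of the IXP project. Do not pass prompts in the API call.
Tag = model version: use the tag that corresponds to the Production or Staging version you want to call.
DocumentTypeId: IXP projects typically use a single default document type (00000000-0000-0000-0000-000000000000).
Page limits: up to 50 pages per call for GPT-4o, and up to 500 pages per call for Gemini.
Usage metering: IXP extraction is billed as follows, depending on your pricing plan. Failed requests do not consume units.
- Flex plan: 1 AI unit per page, or 0.8 AI units per page if the pages were already classified upstream (for example, by a Document Understanding modern project).
- Unified Pricing: 0.2 Platform Units per page.
Data retention: digitization data is kept for 7 days, and extraction data for 24 hours.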
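The page limits and per-page rates listed above lend themselves to a quick pre-flight estimate. The sketch below only restates that arithmetic; the function names are hypothetical, the model labels are used as plain dictionary keys, and any rounding or minimum charges applied by actual billing are not covered.

```python
from decimal import Decimal

# Per-call page limits as listed above.
PAGE_LIMITS = {"GPT-4o": 50, "Gemini": 500}


def fits_page_limit(model: str, pages: int) -> bool:
    """Check a document against the per-call page limit of the model."""
    return pages <= PAGE_LIMITS[model]


def flex_ai_units(pages: int, classified_upstream: bool = False) -> Decimal:
    """Flex plan: 1 AI unit per page, or 0.8 if already classified upstream."""
    rate = Decimal("0.8") if classified_upstream else Decimal("1")
    return rate * pages


def unified_platform_units(pages: int) -> Decimal:
    """Unified Pricing: 0.2 Platform Units per page."""
    return Decimal("0.2") * pages
```

Decimal is used instead of float so that fractional rates such as 0.2 multiply out exactly.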

Note:

Document Understanding and IXP licenses can be used together. For more information, see Metering and charging logic (Flex plan) and IXP Flex pricing plan.
