activities

latest

false

重要 :

请注意，此内容已使用机器翻译进行了部分本地化。新发布内容的本地化可能需要 1-2 周的时间才能完成。

用户界面自动化活动

表格数据提取

“表格提取”是 Studio 新式体验的一部分，使您能够使用“用户界面自动化”活动包自动从应用程序中提取结构化数据，并将其保存为 DataTable 对象，然后可以在自动化流程中进一步使用。

此流程可以使用 Studio 中的“表格提取录制器” 来完成，如果您当前的项目中安装了“用户界面自动化v21.4”或更高版本的包，并且您已选择“新式体验” ，则可以从功能区访问该录制器。

在工作流中使用“提取表格数据”活动时，也会使用相同的向导。

使用表格提取记录器

如果您在项目中选择了“新式体验”，并安装了用户界面自动化活动包，则可以在 Studio 的功能区中找到“表格提取”记录器。

Clicking the Table Extraction button in the Ribbon opens up the Table Extraction wizard.

此向导可让您轻松配置“提取表格数据”活动的整套功能。

要在可用的 UI 框架（ Default 、 UIAutomation或Active Accessibility ）之间切换，您可以从下拉菜单中选择一个选项或按F4 。

此外， “信息”部分将指导您完成成功提取任何结构化数据所需的所有步骤。该部分可以折叠，以显示有关您当前所在步骤的更多信息。

要开始提取数据，只需单击“添加数据”按钮。这将开始指定一系列相似元素的过程，这些元素可用于标识要创建的表格。这将启动“指明”流程，该流程会高亮显示您当前正在使用的应用程序中检测到的所有元素。通过选择按钮，则可以提取所提取数据的 URL 和图像来源。这些内容将作为新列添加到最终表格中。

As you can see above, after clicking a column header, the wizard prompts you with a message, asking whether you want to extract all of the available columns, which are automatically identified. Selecting Yes scrapes the entire table.

如果您选择的元素（最低共同祖先）仅与第一列中的一个元素更接近（最低共同祖先），则系统会自动将其视为新列的第一个元素。

If the table spans multiple pages, you can simply click Next Button and select the next page navigation button or link.

可以单独编辑或删除每一列，使您能够根据需要自定义最终表格。

Once you have selected all the data you want, simply clicking the Save and return to Studio button automatically closes the wizard and saves everything you have done in your workflow.

编辑数据提取数据

You can resume editing an already scraped table by using the Edit extract data option in the contextual menu in the body of the Extract Table Data activity. Using this option reopens the wizard with all of the configurations performed earlier and enables you to pick up where you left off.

编辑列

单击要编辑的列旁边的齿轮图标，将打开“列设置” 窗口。

Here, you can edit the Column Name. This can be done by simply using the text box and specifying the name you want for the column in the final table.

The Parse data as drop-down menu enables you to select between the three main types of data you can use for the columns, Text, Number, and Date & Time.

The Sample text box displays a sample of a value in the column being parsed as the data type you chose in the Parse data as drop-down.

文本

The Sort drop-down menu specifies whether you want to sort the data in the column or not. By default, None is selected, meaning the data is not sorted in any way. If you want to sort the data in the column alphabetically, you can do so by selecting Ascending or Descending, depending on the method you prefer.

数字

Selecting Number in the Parse data as drop-down displays other, number-specific options.

The Sort drop-down menu specifies whether you want to sort the data in the column or not. By default, None is selected, meaning the data is not sorted in any way. If you want to sort the data in the column alphanumerically, you can do so by selecting Ascending or Descending, depending on the method you prefer.

The Decimal separator specifies the symbol you want to use for decimal separation in your final table. By default, this symbol is ..

The Thousands separator specifies the symbol you want to use for thousand separation in your final table. By default, this symbol is ,.

备注：

When scraping numbers, they are parsed according to the selected options, and separators and other symbols (e.g. $) are removed.

日期与时间

Selecting Date & Time in the Parse data as drop-down displays other options, specific to date and time formats.

If the column you are editing does not match the format that is specified, the Column Settings window lets you know in the Sample section.

The Sort drop-down menu specifies whether you want to sort the data in the column or not. By default, None is selected, meaning the data is not sorted in any way. If you want to sort the data in the column by date, you can do so by selecting Ascending or Descending, depending on the method you prefer.

The Data parse format drop-down enables you to select from a multitude of date and time formats that are supported.

备注：

When selecting dates, they are formatted according to the format that is selected in your operating system. The parsing format selected in the wizard is just to identify the data you are scraping.

设置部分

The Settings section lets you choose if you want to limit the extraction of elements in the table. By default, this option is set to No limit, which does not limit the extraction in any way, scraping the entire visible table.

The Max rows option limits the scraping according to the number of rows that is mentioned in the field to the right. By default, this is set to 1000 rows.

The Max pages option limits the scraping according to the number of pages that is mentioned in the field to the right. By default, this is set to 100 pages.

预览部分

The Preview section specifies how many columns and rows are identified for the table you have indicated. Also, by clicking the eye button, you can see a preview of the extracted table.

在离线模式下编辑时禁用预览。

提取元数据

The Extract metadata property contains an XML definition of the path identifying the data to be extracted for each column. The path is built starting from the data extraction target (defined by your selector) to the column elements. The path uses attributes such as tag, idx, and text.

示例：

<extract> 
<!—columns data identified by a path > 
<column exact='1' name=’Description’ attr='text'> 
<webctrl tag='div' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='span' idx='1' /> 
</column> 
<column exact='1' name=’Currency’ attr='text'> 
<webctrl tag='div' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='span' idx='2' /> 
</column> 
</extract>
<extract> 
<!—columns data identified by a path > 
<column exact='1' name=’Description’ attr='text'> 
<webctrl tag='div' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='span' idx='1' /> 
</column> 
<column exact='1' name=’Currency’ attr='text'> 
<webctrl tag='div' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='span' idx='2' /> 
</column> 
</extract>

当 tag、 idx和 text 属性不足以识别用户指定的示例数据时，系统会生成 CSS 选取器，而不是路径。此选取器使用示例元素的公共类。

示例：

<extract> 
<!—column data identified by a path > 
<column exact='1' name='Description' attr='text'> 
<webctrl tag='li' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='a' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='h3' idx='1' /> 
</column> 
<!—column data identified by a css-selector > 
<column css-selector='.currency-value' name='Currency' attr='text' /> 
</extract>
<extract> 
<!—column data identified by a path > 
<column exact='1' name='Description' attr='text'> 
<webctrl tag='li' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='a' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='h3' idx='1' /> 
</column> 
<!—column data identified by a css-selector > 
<column css-selector='.currency-value' name='Currency' attr='text' /> 
</extract>

For the Description column, tag and index attributes are used to identify the column data.

For the Currency column, the elements are identified via the CSS-selector which contains the common class of the samples.

（可选）CSS 选取器（如果可用）也可用于“说明”：

<extract> 
<!—columns data identified by css-selectors > 
<column css-selector='.product-title ' name='Description' attr='text' /> 
<column css-selector='.currency-value' name='Currency' attr='text' /> 
</extract>
<extract> 
<!—columns data identified by css-selectors > 
<column css-selector='.product-title ' name='Description' attr='text' /> 
<column css-selector='.currency-value' name='Currency' attr='text' /> 
</extract>

行定义使用与列相同的标识方法，用于提取相关数据。行包含每列中的一个元素。

示例：

<extract> 
<! -- row definition - ->  
<row exact='1'> 
<webctrl tag='li' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='a' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
</row> 
<column exact='1' name='Description' attr='text'> 
<webctrl tag='li' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='a' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='h3' idx='1' /> 
</column> 
<column css-selector='.currency-value' name='Column' attr='text' /> 
</extract>
<extract> 
<! -- row definition - ->  
<row exact='1'> 
<webctrl tag='li' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='a' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
</row> 
<column exact='1' name='Description' attr='text'> 
<webctrl tag='li' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='a' idx='1' /> 
<webctrl tag='div' idx='2' /> 
<webctrl tag='div' idx='1' /> 
<webctrl tag='h3' idx='1' /> 
</column> 
<column css-selector='.currency-value' name='Column' attr='text' /> 
</extract>

表格设置

This property contains an XML definition of the column settings, as they were defined in the scraping wizard. Column properties like Name or Format can be changed directly in this XML definition and will be used at runtime when building the output data table.

示例：

<table xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xsd='http://www.w3.org/2001/XMLSchema' Type='Structured'> 
<Column xsi:type='DataColumn' ReferenceName='Column0' Name=’Description'> 
<Format xsi:type='TextColumnFormat' /> 
</Column> 
<Column xsi:type='DataColumn' ReferenceName='Column2' Name=’Currency'> 
<Format xsi:type='TextColumnFormat' /> 
</Column> 
</Table>
<table xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xsd='http://www.w3.org/2001/XMLSchema' Type='Structured'> 
<Column xsi:type='DataColumn' ReferenceName='Column0' Name=’Description'> 
<Format xsi:type='TextColumnFormat' /> 
</Column> 
<Column xsi:type='DataColumn' ReferenceName='Column2' Name=’Currency'> 
<Format xsi:type='TextColumnFormat' /> 
</Column> 
</Table>

在此页面上

使用表格提取记录器
编辑数据提取数据
编辑列
设置部分
预览部分
提取元数据
表格设置

此页面有帮助吗？

前一个用户界面元素提取

下一个Linux 机器人

用户界面自动化活动

使用表格提取记录器​

编辑数据提取数据​

编辑列​

文本​

数字​

日期与时间​

设置部分​

预览部分​

提取元数据​

表格设置​