
Communications Mining Developer Guide

Last updated Dec 20, 2024

Self-hosted EWS integration

The EWS appliance is delivered as a Docker image. The sections below explain how to configure and deploy the appliance.

Configuration

The appliance expects a JSON config file to be present. This section explains the contents of the file. Refer to the Deployment section for instructions on how to make the config file available to the appliance.

With OAuth 2.0

You can authenticate with either a client secret or a client certificate.

The appliance obtains tokens using the OAuth 2.0 client credentials flow.

With client secret

{
  "ews_endpoint": "https://outlook.office365.com/EWS/Exchange.asmx",
  "auth_type": "oauth2",
  "auth_oauth_authority": "https://login.microsoftonline.com/<tenant_id>/",
  "auth_oauth_client_id": "<client_id>",
  "auth_oauth_client_secret": "<client_secret>",
  "access_type": "impersonation",
  "mailboxes": {
    "abc@example.com": {
      "bucket": {
        "owner": "project-name",
        "name": "bucket-name"
      },
      "start_from": "bucket",
      "start_timestamp": "2020-01-01T00:00:00+00:00"
    },
    "xyz@example.com": {
      "bucket": {
        "owner": "project-name",
        "name": "bucket-name"
      },
      "start_from": "bucket",
      "start_timestamp": "2020-01-01T00:00:00+00:00"
    }
  }
}

With client certificate

{
  "ews_endpoint": "https://outlook.office365.com/EWS/Exchange.asmx",
  "auth_type": "oauth2",
  "auth_oauth_authority": "https://login.microsoftonline.com/<tenant_id>/",
  "auth_oauth_client_id": "<client_id>",
  "auth_oauth_client_credential_private_key": "<private_key>",
  "auth_oauth_client_credential_thumbprint": "<thumbprint>",
  "access_type": "impersonation",
  "mailboxes": {
    "abc@example.com": {
      "bucket": {
        "owner": "project-name",
        "name": "bucket-name"
      },
      "start_from": "bucket",
      "start_timestamp": "2020-01-01T00:00:00+00:00"
    },
    "xyz@example.com": {
      "bucket": {
        "owner": "project-name",
        "name": "bucket-name"
      },
      "start_from": "bucket",
      "start_timestamp": "2020-01-01T00:00:00+00:00"
    }
  }
}
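
The two OAuth 2.0 variants above differ only in which credential keys are present. As a rough pre-deployment check, the Python sketch below classifies a config by the keys it contains. `check_oauth_credentials` is a hypothetical helper, not part of the appliance, and it assumes that the two credential styles are not meant to be combined in one file.

```python
SECRET_KEYS = {"auth_oauth_client_secret"}
CERT_KEYS = {
    "auth_oauth_client_credential_private_key",
    "auth_oauth_client_credential_thumbprint",
}


def check_oauth_credentials(config: dict) -> str:
    """Return "secret" or "certificate" for an oauth2 config, or raise ValueError."""
    if config.get("auth_type") != "oauth2":
        raise ValueError("auth_type is not 'oauth2'")
    # Both example configs carry the authority and the client ID.
    for key in ("auth_oauth_authority", "auth_oauth_client_id"):
        if not config.get(key):
            raise ValueError(f"missing {key}")
    has_secret = SECRET_KEYS.issubset(config)
    has_cert = CERT_KEYS.issubset(config)
    if has_secret and has_cert:
        # Assumption: a config should use one credential style, not both.
        raise ValueError("both client secret and client certificate are set")
    if has_secret:
        return "secret"
    if has_cert:
        return "certificate"
    raise ValueError("no client secret or certificate key pair found")
```

Loading the file with `json.load` and passing the result to this function before deployment would catch a missing thumbprint or a forgotten secret early.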

With NTLM

{
  "host": "https://exchange-server.example.com",
  "port": 443,
  "auth_type": "ntlm",
  "auth_user": "ews-service-user@example.com",
  "access_type": "delegate",
  "mailboxes": {
    "abc@example.com": {
      "bucket": {
        "owner": "project-name",
        "name": "bucket-name"
      },
      "start_from": "bucket",
      "start_timestamp": "2020-01-01T00:00:00+00:00"
    },
    "xyz@example.com": {
      "bucket": {
        "owner": "project-name",
        "name": "bucket-name"
      },
      "start_from": "bucket",
      "start_timestamp": "2020-01-01T00:00:00+00:00"
    }
  }
}

First, replace the dummy values in host, port, and auth_user with their real values, and change access_type if required. See the configuration reference for a description of these parameters and their allowed values.
Then, provide the service user password to the appliance through the REINFER_EWS_AUTH_PASS environment variable (see the Deployment section). You can set the following environment variables to override values in the config:

NAME                      DESCRIPTION
REINFER_EWS_AUTH_USER     Exchange server user
REINFER_EWS_AUTH_PASS     Exchange server password
REINFER_EWS_ACCESS_TYPE   Access type: "delegate" or "impersonation"
REINFER_EWS_HOST          Exchange server host
REINFER_EWS_PORT          Exchange server port
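
To make the override behaviour concrete, here is a minimal Python sketch of how environment variables could take precedence over config file values. It illustrates the documented precedence only and is not the appliance's actual implementation; `apply_env_overrides` is a hypothetical name. The mapping from variables to config keys follows the configuration reference.

```python
import os

# Environment variable -> config key, per the configuration reference.
ENV_OVERRIDES = {
    "REINFER_EWS_AUTH_USER": "auth_user",
    "REINFER_EWS_AUTH_PASS": "auth_password",
    "REINFER_EWS_ACCESS_TYPE": "access_type",
    "REINFER_EWS_HOST": "host",
    "REINFER_EWS_PORT": "port",
}


def apply_env_overrides(config: dict, environ=None) -> dict:
    """Return a copy of config in which set environment variables win."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for variable, key in ENV_OVERRIDES.items():
        if variable in environ:
            value = environ[variable]
            # Environment values are strings; the port is numeric in the config.
            merged[key] = int(value) if key == "port" else value
    return merged
```

Passing an explicit `environ` dictionary makes the function easy to try out without touching the real process environment.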

Mailbox configuration

You can specify one or more mailboxes in your configuration. For each mailbox, you have to provide the mailbox address and specify the following parameters:

NAME              DESCRIPTION
bucket.owner      Project of the bucket in which the mailbox should be synced.
bucket.name       Name of the bucket in which the mailbox should be synced.
start_from        Whether to start from last synced time ("bucket") or ignore last synced time and always start from start_timestamp ("config"). Should be set to "bucket" for normal operation, but "config" can be useful in some cases when debugging.
start_timestamp   Timestamp from which to start syncing email. If not set, all emails will be synced.

The configuration uses the default values for a number of settings such as polling frequency or batch size. To customize your configuration further, refer to the configuration reference.
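
Before deploying, it can help to check the mailboxes section of the config against the table above. The Python sketch below is one way to do that; `validate_mailboxes` is a hypothetical helper, and it assumes that bucket.owner, bucket.name, and start_from are required while start_timestamp is optional and written in ISO 8601 form, as in the examples.

```python
from datetime import datetime


def validate_mailboxes(mailboxes: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for address, settings in mailboxes.items():
        bucket = settings.get("bucket", {})
        for field in ("owner", "name"):
            if not bucket.get(field):
                problems.append(f"{address}: bucket.{field} is missing")
        if settings.get("start_from") not in ("bucket", "config"):
            problems.append(f'{address}: start_from must be "bucket" or "config"')
        timestamp = settings.get("start_timestamp")
        if timestamp is not None:
            try:
                datetime.fromisoformat(timestamp)
            except (TypeError, ValueError):
                problems.append(f"{address}: start_timestamp is not an ISO 8601 timestamp")
    return problems
```

For a config loaded with `json.load`, calling `validate_mailboxes(config["mailboxes"])` returns an empty list when every mailbox entry has the expected shape.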

Buckets

The Exchange integration syncs raw email data into Communications Mining buckets. Like other Communications Mining resources, a bucket is created in a project, which allows you to control access to the bucket. To read from a bucket, upload to a bucket, or manage buckets, a user needs the respective permissions in the project that contains the bucket.

Deployment

You can deploy the EWS appliance either with Kubernetes or with Docker.

Deploying with Kubernetes allows you to run multiple instances of the EWS appliance, with each instance handling a subset of mailboxes to be synced.

With Kubernetes

Kubernetes is a popular way to run and manage containerized applications. This section shows you how to deploy the EWS appliance using Kubernetes. It assumes that you have basic familiarity with Kubernetes and have kubectl installed. If you need help getting started, refer to the Kubernetes documentation.

To deploy to Kubernetes, you need to create a YAML file describing your application. To start, copy the example below into a file, for example reinfer-ews.yaml, which is the name used in the commands later in this section.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: reinfer-ews-appliance
  labels:
    app: reinfer-ews-appliance
spec:
  podManagementPolicy: Parallel
  replicas: 1
  selector:
    matchLabels:
      app: reinfer-ews-appliance
  serviceName: reinfer-ews-appliance
  template:
    metadata:
      labels:
        app: reinfer-ews-appliance
      name: reinfer-ews-appliance
    spec:
      containers:
        - args:
            - "reinfer-ews"
            - "--bind"
            - "0.0.0.0:8000"
            - "--reinfer-api-endpoint"
            - "https://<mydomain>.reinfer.io/api/"
            - "--shard-name"
            - "$(POD_NAME)"
            # This value should match `spec.replicas` above
            - "--total-shards"
            - "1"
          env:
            - name: REINFER_EWS_CONFIG
              value: "/mnt/config/example_ews_config"
            - name: REINFER_API_TOKEN
              valueFrom:
                secretKeyRef:
                  key: reinfer-api-token
                  name: reinfer-credentials
            - name: REINFER_EWS_AUTH_PASS
              valueFrom:
                secretKeyRef:
                  key: ews-auth-pass
                  name: reinfer-credentials
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          image: "your.private.registry.com/reinfer/ews-appliance:TAG"
          name: reinfer-ews-appliance
          resources:
            requests:
              cpu: 0.05
              memory: 128Mi
          volumeMounts:
            - mountPath: /mnt/config
              name: config-vol
      volumes:
        - configMap:
            name: ews-config
            items:
              - key: example_ews_config
                path: example_ews_config
          name: config-vol

Before you can deploy the appliance using this YAML file, there are a few additional steps you need to perform.

First, replace <mydomain>.reinfer.io with your tenant API endpoint.
Second, to avoid storing credentials as cleartext in the YAML file, the REINFER_API_TOKEN and REINFER_EWS_AUTH_PASS environment variables are populated from Kubernetes secrets. Create the secrets like so:
kubectl create secret generic reinfer-credentials \
  --from-literal=reinfer-api-token=<REINFER_TOKEN> \
  --from-literal=ews-auth-pass=<MSEXCHANGE_PASSWORD>

Finally, since we would like to load the appliance config from a local file, we need to mount that file into the pod. We do this by storing the data in a Kubernetes ConfigMap and mounting the ConfigMap as a volume. Create the ConfigMap like so:

kubectl create configmap ews-config \
  --from-file=example_ews_config=your-ews-config.json

Note:

As an alternative to storing the config file locally, you can upload it to Communications Mining and let the EWS appliance fetch it via the Communications Mining API, as described in the Store configuration in Communications Mining section. If both local and remote config files are specified, the appliance will use the local config file.

You can now create your StatefulSet and check that everything is running:

kubectl apply -f reinfer-ews.yaml
kubectl get sts

With Docker

Alternatively, you can run the EWS appliance in Docker. The command below will start the appliance with the same parameters that are used in the Kubernetes section.

EWS_CONFIG_DIR=
REINFER_API_TOKEN=
TAG=

sudo docker run \
  -v $EWS_CONFIG_DIR:/mnt/config \
  --env REINFER_EWS_CONFIG=/mnt/config/your_ews_config.json \
  --env REINFER_API_TOKEN=$REINFER_API_TOKEN \
  eu.gcr.io/reinfer-gcr/ews:$TAG \
  --reinfer-api-endpoint https://<mydomain>.reinfer.io/api/ \
  &> ews_$(date -Iseconds).log
  • Replace <mydomain>.reinfer.io with your tenant API endpoint.
  • Replace your_ews_config.json with the name of your EWS config JSON file.

The appliance runs continuously, syncing emails into the Communications Mining platform. If stopped and started again, it picks up from the last stored bucket sync state.
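
The resume behaviour ties back to the start_from setting described under Mailbox configuration. The Python sketch below shows one reading of that rule. It is an illustration, not the appliance's code: `sync_start` is a hypothetical name, and the fallback to start_timestamp when no sync state has been stored yet is an assumption.

```python
from datetime import datetime
from typing import Optional


def sync_start(
    start_from: str,
    start_timestamp: Optional[datetime],
    last_synced: Optional[datetime],
) -> Optional[datetime]:
    """Pick the time from which syncing resumes; None means "sync all emails"."""
    if start_from == "config":
        # Ignore any stored state and always start from the configured timestamp.
        return start_timestamp
    if start_from == "bucket":
        # Resume from the stored state. Assumption: on a first run with no
        # stored state, the configured start_timestamp is used instead.
        return last_synced if last_synced is not None else start_timestamp
    raise ValueError(f"unknown start_from value: {start_from!r}")
```

Under this reading, "config" re-reads mail from the same point on every restart, which is why the table recommends "bucket" for normal operation.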

With Docker (local storage)

The EWS appliance can save extracted emails locally instead of pushing them into the Communications Mining platform.

EWS_LOCAL_DIR=
CONFIG_OWNER=
CONFIG_KEY=
TAG=

sudo docker run \
  -v $EWS_LOCAL_DIR:/mnt/ews \
  eu.gcr.io/reinfer-gcr/ews:$TAG \
  --local-files-prefix /mnt/ews \
  --remote-config-owner $CONFIG_OWNER --remote-config-key $CONFIG_KEY \
  &> ews_$(date -Iseconds).log
  • The appliance expects to find the config in $EWS_LOCAL_DIR/config/$CONFIG_OWNER/$CONFIG_KEY.json. You can alternatively provide the path to the config by setting the $REINFER_EWS_CONFIG environment variable as shown in the previous example.
  • The appliance will save the sync state to $EWS_LOCAL_DIR/state. If stopped and started again, it will pick up from the last stored sync state.
  • The appliance will save data to $EWS_LOCAL_DIR/data.
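
The directory layout described in the bullets above can be summarised in a few lines of Python. This is only a convenience sketch for locating files on the host; `local_layout` is a hypothetical helper, and the paths are the ones stated above.

```python
from pathlib import PurePosixPath


def local_layout(local_dir: str, config_owner: str, config_key: str) -> dict:
    """Return the host paths used for the config file, sync state, and data."""
    root = PurePosixPath(local_dir)
    return {
        "config": root / "config" / config_owner / f"{config_key}.json",
        "state": root / "state",
        "data": root / "data",
    }
```

For example, with EWS_LOCAL_DIR set to /srv/ews, CONFIG_OWNER set to project-name, and CONFIG_KEY set to ews, the config file is expected at /srv/ews/config/project-name/ews.json.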

With Docker (Azure Blob Storage)

The EWS appliance can save extracted emails to Azure Blob Storage instead of pushing them into the Communications Mining platform.

EWS_CONFIG_DIR=
AZ_STORAGE_ACCOUNT_NAME=
AZ_CONTAINER_NAME=
TAG=

sudo docker run \
  -v $EWS_CONFIG_DIR:/mnt/config \
  --env REINFER_EWS_CONFIG=/mnt/config/your_ews_config.json \
  eu.gcr.io/reinfer-gcr/ews:$TAG \
  --private-file-prefix az://$AZ_STORAGE_ACCOUNT_NAME/$AZ_CONTAINER_NAME \
  &> ews_$(date -Iseconds).log
  • You should provide the path to the config by setting the $REINFER_EWS_CONFIG environment variable.
  • The appliance authenticates against Azure Blob Storage using one of the DefaultAzureCredential methods. Use whichever method suits your environment. Whichever method you use, grant the "Storage Blob Data Contributor" role to the appliance.
  • The appliance will save the sync state to az://$AZ_STORAGE_ACCOUNT_NAME/$AZ_CONTAINER_NAME/state. If stopped and started again, it will pick up from the last stored sync state.
  • The appliance will save data to az://$AZ_STORAGE_ACCOUNT_NAME/$AZ_CONTAINER_NAME/data.

Store configuration in Communications Mining

Instead of providing a local config file to the appliance, as described in the Deployment section, you can manage the config file in Communications Mining. Note that if both local and remote config files are specified, the appliance uses the local config file.

First, upload your JSON config file to Communications Mining:

curl -H "Authorization: Bearer $REINFER_TOKEN" \
  -H "Content-Type: multipart/form-data" \
  -F 'file=@your-ews-config.json' \
  -XPUT https://<mydomain>.reinfer.io/api/v1/appliance-configs/<project-name>/<config-name>

To see the current config:

curl -H "Authorization: Bearer $REINFER_TOKEN" \
  -XGET https://<mydomain>.reinfer.io/api/v1/appliance-configs/<project-name>/<config-name>

Then, in the Kubernetes YAML file, set the --remote-config-owner parameter to the project name, and the --remote-config-key parameter to the config name.
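
If you prefer to read back the stored config from a script rather than with curl, the GET request shown earlier in this section could be built in Python as sketched below. The URL layout and the bearer token header mirror the curl example; `build_get_config_request` is a hypothetical helper, and the base URL passed to it stands in for your tenant API endpoint.

```python
import urllib.request


def build_get_config_request(
    api_base: str, project: str, config_name: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) the GET request for a stored appliance config."""
    url = f"{api_base.rstrip('/')}/v1/appliance-configs/{project}/{config_name}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Keeping construction separate from sending makes it possible to inspect the URL and headers first.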

Reference

Application parameters

See the table below for a list of available application parameters. You can learn more about running the EWS appliance in the Deployment section.

PARAMETER                           DESCRIPTION
--reinfer-api-endpoint              Endpoint to connect to the Reinfer API. Mutually exclusive with --local-files-prefix.
--local-files-prefix                Path to store synced emails and bucket sync state. Mutually exclusive with --reinfer-api-endpoint and REINFER_API_TOKEN.
--remote-config-owner               Project that owns the remote EWS appliance config file.
--remote-config-key                 Name of the remote EWS appliance config file.
--debug-level                       Debug level. 0 = No debug, 1 = Service debug, 2 = Full debug. Default: 1.
--shard-name                        Shard name, for example ews-N, from which the shard number is extracted. When running in Kubernetes, you can set it to the pod name.
--total-shards                      The total number of instances in the appliance cluster. When running in Kubernetes, must be set to the same value as the number of instances in the StatefulSet.
--restart-on-unrecoverable-errors   If enabled, unrecoverable failures will result in the entire service being restarted without crashing.
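
To illustrate how --shard-name and --total-shards relate, the Python sketch below extracts the shard number from a name ending in -N and checks it against the shard count. How the appliance actually assigns mailboxes to shards is not documented here, so `mailboxes_for_shard` uses a purely hypothetical round-robin rule; both function names are illustrative.

```python
import re


def shard_index(shard_name: str, total_shards: int) -> int:
    """Extract N from a name such as "ews-N" and check it fits the shard count."""
    match = re.search(r"-(\d+)$", shard_name)
    if match is None:
        raise ValueError(f"no trailing shard number in {shard_name!r}")
    index = int(match.group(1))
    if not 0 <= index < total_shards:
        raise ValueError(f"shard {index} is outside 0..{total_shards - 1}")
    return index


def mailboxes_for_shard(mailboxes, index: int, total_shards: int) -> list:
    """Hypothetical rule: sort the addresses and deal them out round-robin."""
    ordered = sorted(mailboxes)
    return [address for position, address in enumerate(ordered)
            if position % total_shards == index]
```

A StatefulSet names its pods with a trailing ordinal, so with the example manifest the single pod is reinfer-ews-appliance-0 and `shard_index` returns 0. The range check shows why --total-shards has to match the number of replicas: a pod whose ordinal is not below the shard count has no valid shard.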

Configuration parameters

See the table below for a list of available configuration parameters. You can learn more about writing the EWS appliance configuration file in the Configuration section.

NAME                     DESCRIPTION
host                     Exchange server host. Can be overridden by the REINFER_EWS_HOST environment variable.
port                     Exchange server port. Default: 80. Can be overridden by the REINFER_EWS_PORT environment variable.
auth_type                Authentication type: "ntlm" or "oauth2", as used in the examples in the Configuration section.
auth_user                Exchange server user. Can be overridden by the REINFER_EWS_AUTH_USER environment variable.
auth_password            Exchange server password. Can be overridden by the REINFER_EWS_AUTH_PASS environment variable.
access_type              Access type: "delegate" or "impersonation". Default: "delegate". Can be overridden by the REINFER_EWS_ACCESS_TYPE environment variable.
ews_ssl_verify           If set to "false", will not verify certificates. Default: "true".
poll_frequency           How long to wait between batches, in seconds. Default: 15.
poll_message_sleep       How long to wait between individual emails in a batch, in seconds. Default: 0.1.
max_concurrent_uploads   Number of concurrent uploads to Communications Mining, between 0 and 32. Default: 8.
emails_per_folder        Max number of emails to fetch from each folder per batch, between 1 and 100,000. Default: 2520. This setting allows the appliance to make progress on all folders evenly in case there is a very large folder.
reinfer_batch_size       How many emails to fetch per batch, between 1 and 1000. Default: 80.
mailboxes                List of mailboxes to fetch. See the Mailbox configuration section for an explanation of how to configure the mailboxes.
audit_email              If you have configured the appliance with a remote config, Communications Mining will send an email to this address whenever the config is updated. Default: None.
ews_ssl_ciphers          Make EWS appliance use specific ciphers. The ciphers should be a string in the OpenSSL cipher list format. Default: None.
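
Several parameters in the table have a documented range and default. The Python sketch below checks a loaded config against those ranges. `check_limits` is a hypothetical helper, and treating the bounds as inclusive is an assumption; the bounds and defaults themselves are copied from the table.

```python
# name -> (minimum, maximum, default), as listed in the table above.
LIMITS = {
    "max_concurrent_uploads": (0, 32, 8),
    "emails_per_folder": (1, 100_000, 2520),
    "reinfer_batch_size": (1, 1000, 80),
}


def check_limits(config: dict) -> list:
    """Return a message for every tuning parameter outside its documented range."""
    problems = []
    for name, (low, high, default) in LIMITS.items():
        # A parameter that is not set falls back to its documented default.
        value = config.get(name, default)
        if not isinstance(value, int) or not low <= value <= high:
            problems.append(f"{name}={value!r} is outside {low}..{high}")
    return problems
```

An empty list means every tuning parameter that is set falls within its documented range.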
