Introduction

eccenca DataIntegration APIs can be used to control, initiate and set up all tasks and activities related to the Build step (such as datasets, transformations, linking tasks etc.).

Media Types

The default media type of most responses is application/json. Other possible response media types can be reached by changing the Accept header of the request.

Possible values of this HTTP header field are API dependent and listed as part of the specific HTTP method.
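
As a minimal sketch of selecting a response media type, the following Python snippet builds (but does not send) a request with an explicit Accept header. The base URL is a hypothetical placeholder and the helper name build_get is an assumption made for illustration; neither comes from the API itself.

```python
import urllib.request

# Hypothetical base URL; replace with the address of your DataIntegration instance.
BASE_URL = "http://localhost/dataintegration"


def build_get(path: str, accept: str = "application/json") -> urllib.request.Request:
    """Build (but do not send) a GET request that asks for the given media type."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Accept": accept},
        method="GET",
    )


# Ask for the JSON representation of the project list.
request = build_get("/workspace/projects")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Which Accept values a given endpoint supports is listed with the respective HTTP method.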

Depending on the specific API, eccenca DataIntegration works with the following application media types, which correspond to the following specification documents:

Media Type                           Specification Document
application/x-www-form-urlencoded    HTML 4.01 Specification, Forms
application/json                     The JavaScript Object Notation (JSON) Data Interchange Format
application/xml                      XML Media Types
application/n-triples                RDF 1.1 N-Triples - A line-based syntax for an RDF graph
application/problem+json             Problem Details for HTTP APIs

Security Schemes

The default security scheme is OAuth 2.0.
However, this can be changed in the configuration.

Logout (/logout)

Full URL: /logout

Valid HTTP methods are:

POST

Logs the user out and redirects them to the loggedOut page.

Response

The expected response:

  • HTTP Status Code 303:

Version (/version)

Full URL: /version

Valid HTTP methods are:

GET

The version of the eccenca DataIntegration application.

Response

The expected response:

  • HTTP Status Code 200:

    • text/plain:

      • example:

        1.0.0

Logged Out Page (/loggedOut)

Full URL: /loggedOut

Valid HTTP methods are:

GET

An HTML page informing the user that they are logged out.

Response

The expected response:

  • HTTP Status Code 200:

    • text/html:

Health (/health)

Full URL: /health

Valid HTTP methods are:

GET

Returns configuration and health information about the components DataIntegration depends on, such as DataPlatform and Spark. The status property values are either ‘UP’ or ‘DOWN’. If any sub-component is down, the parent component is also marked as down. Components that are not configured are hidden.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
            "details": {
                "sparkHealthIndicator": {
                    "details": {
                        "appName": "eccenca DataIntegration Spark executor",
                        "sparkContextStarts": true,
                        "startTime": 1529565769538
                    },
                    "status": "UP"
                }
            },
            "status": "UP"
        }
        
  • HTTP Status Code 503:

    • application/json:

      • example:

        {
            "status": "DOWN",
            "details": {
                "dataPlatformHealthIndicator": {
                    "details": {
                        "authorizationUrl": "http://docker.local/dataplatform/oauth/authorize",
                        "dataPlatformHealthCheck": {
                            "errorMessage": "Could not connect to the DataPlatform: Connection refused: docker.local/127.0.0.1:80",
                            "healthEndpoint": "http://docker.local/dataplatform/actuator/health"
                        },
                        "dataPlatformUrl": "http://docker.local/dataplatform",
                        "oAuthEnabled": true,
                        "tokenUrl": "http://docker.local/dataplatform/oauth/token"
                    },
                    "status": "DOWN"
                },
                "sparkHealthIndicator": {
                    "details": {
                        "appName": "eccenca DataIntegration Spark executor",
                        "sparkContextStarts": true,
                        "startTime": 1529565583387
                    },
                    "status": "UP"
                }
            }
        }
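
The nesting of status values can be evaluated on the client side. The following Python sketch walks a health response and collects every component reported as DOWN. The function name down_components and the abridged sample are assumptions for illustration; the sample only keeps a few fields of the 503 example above.

```python
import json


def down_components(health: dict, path: str = "") -> list:
    """Return the names of all (nested) components whose status is 'DOWN'."""
    found = []
    for name, component in health.get("details", {}).items():
        # Only entries that are objects with a status are treated as components.
        if isinstance(component, dict) and "status" in component:
            full_name = f"{path}/{name}" if path else name
            if component["status"] == "DOWN":
                found.append(full_name)
            found.extend(down_components(component, full_name))
    return found


# Abridged version of the 503 example response.
sample = json.loads("""
{
  "status": "DOWN",
  "details": {
    "dataPlatformHealthIndicator": {
      "details": {"oAuthEnabled": true},
      "status": "DOWN"
    },
    "sparkHealthIndicator": {
      "details": {"sparkContextStarts": true},
      "status": "UP"
    }
  }
}
""")

print(sample["status"], down_components(sample))
```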
        

Core API (/core)

Offers access to basic functionality of the runtime.

All Plugins (/plugins)

Full URL: /core/plugins

Lists all available plugins. The returned JSON format stays as close to JSON Schema as possible.

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • addMarkdownDocumentation:

    If true, Markdown documentation will be added for plugins if available.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "pluginId1": {
            "title": "human-readable plugin label",
            "description": "human-readable plugin description.",
            "markdownDocumentation": "Documentation:\n\n* Optional\n* more detailed\n* Markdown documentation",
            "type": "object",
            "properties": {
              "parameterName1": {
                "title": "parameter label",
                "description": "parameter description",
                "type": "string",
                "value": "",
                "advanced": false,
                "autoCompletion" : {
                  "allowOnlyAutoCompletedValues" : true,
                  "autoCompleteValueWithLabels" : true,
                  "autoCompletionDependsOnParameters" : ["otherParamName"]
                }
              }
            },
            "required": []
          },
          "pluginId2": {
            "title": "human-readable plugin label",
            "description": "human-readable plugin description.",
            "type": "object",
            "properties": {},
            "required": []
          }
        }
        

All Plugins of a Type (/plugins/{pluginType})

Full URL: /core/plugins/{pluginType}

Lists all available plugins that implement the given plugin type. The returned JSON format stays as close to JSON Schema as possible.

URI parameters:

  • pluginType, required:

    • type: (string)

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • addMarkdownDocumentation:

    If true, Markdown documentation will be added for plugins if available.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "pluginId1": {
            "title": "human-readable plugin label",
            "description": "human-readable plugin description.",
            "markdownDocumentation": "Documentation:\n\n* Optional\n* more detailed\n* Markdown documentation",
            "type": "object",
            "properties": {
              "parameterName1": {
                "title": "parameter label",
                "description": "parameter description",
                "type": "string",
                "value": "",
                "advanced": false,
                "autoCompletion" : {
                  "allowOnlyAutoCompletedValues" : true,
                  "autoCompleteValueWithLabels" : true,
                  "autoCompletionDependsOnParameters" : ["otherParamName"]
                }
              }
            },
            "required": []
          },
          "pluginId2": {
            "title": "human-readable plugin label",
            "description": "human-readable plugin description.",
            "type": "object",
            "properties": {},
            "required": []
          }
        }
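
The two plugin listings can be requested and summarised with a few lines of Python. This is a sketch under assumptions: the helper names plugins_path and parameter_summary are made up for illustration, "somePluginType" is a hypothetical plugin type, and the sample is an abridged copy of the example response.

```python
from urllib.parse import quote, urlencode


def plugins_path(plugin_type=None, add_markdown=False) -> str:
    """Build the path for the plugin listing, optionally restricted to one plugin type."""
    path = "/core/plugins"
    if plugin_type is not None:
        path += "/" + quote(plugin_type, safe="")
    if add_markdown:
        path += "?" + urlencode({"addMarkdownDocumentation": "true"})
    return path


def parameter_summary(plugins: dict) -> dict:
    """Map each plugin id to the sorted names of its parameters."""
    return {pid: sorted(spec.get("properties", {})) for pid, spec in plugins.items()}


# Abridged version of the example response.
sample = {
    "pluginId1": {
        "title": "human-readable plugin label",
        "type": "object",
        "properties": {"parameterName1": {"type": "string", "advanced": False}},
        "required": [],
    },
    "pluginId2": {"title": "human-readable plugin label", "type": "object", "properties": {}, "required": []},
}

print(plugins_path("somePluginType", add_markdown=True))
print(parameter_summary(sample))
```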
        

Workspace API (/workspace)

The Workspace API provides read / write access to the workspace.

Projects (/projects)

Full URL: /workspace/projects

This is the resource collection of projects in the workspace.

Valid HTTP methods are:

GET

Get a list with all projects.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
          {
            "name": "example_project",
            "tasks": {
              "dataset": [
                "dataset1",
                "dataset2"
              ],
              "transform": [
                "transformation1"
              ],
              "linking": [],
              "workflow": [
                "workflow1"
              ],
              "custom": []
            }
          },
          {
            "name": "lending",
            "tasks": {
              "dataset": [
                "cmem",
                "links",
                "loans_csv",
                "output_csv",
                "unemployment_csv"
              ],
              "transform": [
                "generateOutput",
                "transform_loans",
                "transform_loans_csv",
                "transform_unemployment"
              ],
              "linking": [
                "link_loans_unemployment",
                "linkingtest"
              ],
              "workflow": [
                "workflow1"
              ],
              "custom": []
            }
          }
        ]
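
A response of this shape is easy to summarise. The following Python sketch counts the tasks per task type for every project; the function name task_counts is an assumption, and the sample is an abridged copy of the example above.

```python
def task_counts(projects: list) -> dict:
    """Map each project name to the number of tasks per task type."""
    return {
        project["name"]: {kind: len(ids) for kind, ids in project["tasks"].items()}
        for project in projects
    }


# Abridged version of the example response.
sample = [
    {
        "name": "example_project",
        "tasks": {
            "dataset": ["dataset1", "dataset2"],
            "transform": ["transformation1"],
            "linking": [],
            "workflow": ["workflow1"],
            "custom": [],
        },
    }
]

print(task_counts(sample))
```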

Project (/{project})

Full URL: /workspace/projects/{project}

URI parameters:

  • project, required:

    • type: (string)

Valid HTTP methods are:

GET

Get a single project.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "name": "projectID",
          "tasks": {
            "dataset": [
              "dataset1",
              "dataset2"
            ],
            "transform": [
              "transform1",
              "transform2"
            ],
            "linking": [ ],
            "workflow": [ ],
            "custom": [ ]
          }
        }
        

PUT

Create a new empty project.

Response

The expected response:

  • HTTP Status Code 201:

    • application/json:

      • example:

        {"name": "project name"}
        
  • HTTP Status Code 409:

    • application/json:

      • example:

        {"error":{"message":"Resource already exists. Creation failed."}}
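
A client will typically distinguish the two documented outcomes of the PUT call. The sketch below builds the request without sending it and maps the status codes to a short description. The base URL is a hypothetical placeholder and both helper names are assumptions for illustration.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical base URL; replace with the address of your DataIntegration instance.
BASE_URL = "http://localhost/dataintegration"


def create_project_request(project_id: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates an empty project."""
    url = f"{BASE_URL}/workspace/projects/{quote(project_id, safe='')}"
    return urllib.request.Request(url, method="PUT")


def describe_create_result(status: int) -> str:
    """Translate the documented status codes into a short description."""
    if status == 201:
        return "created"
    if status == 409:
        return "already exists"
    return f"unexpected status {status}"


request = create_project_request("example_project")
print(request.get_method(), request.full_url)
print(describe_create_result(409))
```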
        

DELETE

Delete a project.

Response

The expected response:

  • HTTP Status Code 200:

Clone Project (/clone)

Full URL: /workspace/projects/{project}/clone

Clones a project.

Valid HTTP methods are:

POST

Query Parameters

This method accepts the following query parameters:

  • newProject, required:

    The name of the cloned project.

    • type: (string)

Import Project (/import/{importPlugin})

Full URL: /workspace/projects/{project}/import/{importPlugin}

Imports a project from the file sent with the request. The importPlugin path parameter must be one of the ids returned from the marshallingPlugins endpoint.

URI parameters:

  • importPlugin, required:

    • type: (string)

Valid HTTP methods are:

POST

Export Project (/export/{exportPlugin})

Full URL: /workspace/projects/{project}/export/{exportPlugin}

Exports the project with the specified marshalling plugin. The exportPlugin path parameter must be one of the ids returned from the marshallingPlugins endpoint.

URI parameters:

  • exportPlugin, required:

    • type: (string)

Valid HTTP methods are:

GET

Project Resources (/resources)

Full URL: /workspace/projects/{project}/resources

Lists all resources available to a specific project. Resources of a project are, for example, files used as input or output.

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • searchText:

    If defined, the resources are filtered by the search text, which is matched against the resource names.

    • type: (string)
  • limit:

    Limits the number of resources returned by this endpoint.

    • type: (integer)
  • offset:

    The offset in the result list. Offset and limit allow paging over the results.

    • type: (integer)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [{
          "name": "source.nt",
          "lastModified": "2020-01-09T12:17:12Z",
          "size": 3836164
        }, {
          "name": "target.nt",
          "lastModified": "2020-01-09T12:17:12Z",
          "size": 1288984
        }, {
          "name": "subsource.nt",
          "lastModified": "2020-01-09T12:17:12Z",
          "size": 3836164
        }]
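
Paging with limit and offset can be sketched as follows. The helper names resources_path and page_paths are assumptions for illustration; the sketch treats offset as a position in the result list, as described above, and only builds request paths without sending anything.

```python
from urllib.parse import quote, urlencode


def resources_path(project: str, search_text=None, limit=None, offset=None) -> str:
    """Build the path and query string for listing the resources of a project."""
    params = {}
    if search_text is not None:
        params["searchText"] = search_text
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    path = f"/workspace/projects/{quote(project, safe='')}/resources"
    return path + ("?" + urlencode(params) if params else "")


def page_paths(project: str, page_size: int, pages: int) -> list:
    """Build the paths for the first `pages` pages of `page_size` resources each."""
    return [
        resources_path(project, limit=page_size, offset=index * page_size)
        for index in range(pages)
    ]


print(resources_path("movies", search_text="source", limit=10, offset=20))
print(page_paths("movies", page_size=2, pages=2))
```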
        

Project Resource (/{resourceName})

Full URL: /workspace/projects/{project}/resources/{resourceName}

Allows retrieving, creating and updating project resources.

URI parameters:

  • resourceName, required:

    • type: (string)

Valid HTTP methods are:

GET

Retrieves the contents of a resource.

Response

The expected response:

  • HTTP Status Code 200:

PUT

Adds a file from the local file system to the project. There are three options to upload files:

  1. Providing a local resource using the file form parameter.
  2. Providing a remote resource using the resource-url form parameter. The provided resource will be downloaded and added to the project.
  3. Providing the file as body payload. Supplying no body will create an empty resource.

The options are exclusive, i.e., only one option can be used per request.
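
The third option, sending the file as the body payload, could look like the following Python sketch. It builds the PUT request without sending it. The base URL is a hypothetical placeholder and the helper name upload_body_request is an assumption for illustration.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical base URL; replace with the address of your DataIntegration instance.
BASE_URL = "http://localhost/dataintegration"


def upload_body_request(project: str, resource_name: str, data: bytes,
                        content_type: str = "application/octet-stream") -> urllib.request.Request:
    """Build (but do not send) a PUT request that uploads `data` as the request body."""
    url = (
        f"{BASE_URL}/workspace/projects/{quote(project, safe='')}"
        f"/resources/{quote(resource_name, safe='')}"
    )
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": content_type},
        method="PUT",
    )


# Upload a small text file; passing b"" would correspond to creating an empty resource.
request = upload_body_request("movies", "notes.txt", b"The text to be uploaded.", "text/plain")
print(request.get_method(), request.full_url, len(request.data))
```

Because the options are exclusive, a request built this way should not additionally carry the file or resource-url form parameters.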

Body

This method accepts the following body payloads:

  • multipart/form-data:

    • form parameters:

      • file:

        The file to be uploaded.

        • type: (file)
      • resource-url:

        A URL from which the file should be retrieved.

        • type: (string)
  • application/octet-stream:

    • example:

      The raw bytes to be uploaded.
  • text/plain:

    • example:

      The text to be uploaded.
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 204:

  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

DELETE

Deletes the resource if it exists. The request also succeeds if the resource does not exist.

Response

The expected response:

  • HTTP Status Code 204:

Project Resource Meta Data (/metadata)

Full URL: /workspace/projects/{project}/resources/{resourceName}/metadata

Retrieves the properties of a specific resource.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "name": "source.nt",
          "relativePath": "source.nt",
          "absolutePath": "/var/dataintegration/workspace/movies/resources/source.nt",
          "size": 3836164,
          "modified":"2020-01-13T14:34:03Z"
        }
        

Project resource usage (/usage)

Full URL: /workspace/projects/{project}/resources/{resourceName}/usage

Returns a list of datasets that are using the specified resource.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        ["dataset 1", "dataset 2"]
        

Project Tasks (/tasks)

Full URL: /workspace/projects/{project}/tasks

Valid HTTP methods are:

POST

Adds a new task to the project. If the ‘id’ parameter is omitted in the request, an ID will be generated from the label – which is then required.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "id": "myTransform",
        "metadata": {
          "label": "task label",
          "description": "task description"
        },
        "data": {
          "taskType": "Transform",
          "selection": {
            "inputId": "DBpedia",
            "typeUri": "http://dbpedia.org/ontology/Film",
            "restriction": ""
          },
          "outputs": [],
          "targetVocabularies": []
        }
      }
      
Response

The expected response:

  • HTTP Status Code 201:

    • The task has been created successfully.
  • Location:

    • type: (string)

    • example:

      /workspace/projects/project/tasks/03c320c1-501e-4740-9113-75fe192c43dd_tasklabel
  • HTTP Status Code 400:

    • The provided task specification is invalid.
  • HTTP Status Code 409:

    • A task with the given identifier already exists.
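
The rule that a label is required whenever the id is omitted can be enforced on the client before sending the request. The following Python sketch assembles the request body; the helper name task_payload is an assumption for illustration, and the task data is taken from the example above.

```python
import json


def task_payload(data: dict, label=None, task_id=None, description=None) -> dict:
    """Assemble the JSON body for adding a task to a project."""
    if task_id is None and not label:
        # Without an id, the ID is generated from the label, so a label must be present.
        raise ValueError("a label is required when no id is given")
    metadata = {}
    if label:
        metadata["label"] = label
    if description is not None:
        metadata["description"] = description
    payload = {"metadata": metadata, "data": data}
    if task_id is not None:
        payload["id"] = task_id
    return payload


transform_data = {
    "taskType": "Transform",
    "selection": {
        "inputId": "DBpedia",
        "typeUri": "http://dbpedia.org/ontology/Film",
        "restriction": "",
    },
    "outputs": [],
    "targetVocabularies": [],
}

# No id given: the ID will be generated from the label.
body = task_payload(transform_data, label="task label", description="task description")
print(json.dumps(body, indent=2))
```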

Project Task (/{task})

Full URL: /workspace/projects/{project}/tasks/{task}

URI parameters:

  • task, required:

    • type: (string)

Valid HTTP methods are:

PUT

Adds or updates a task.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "id": "myTransform",
        "metadata": {
          "label": "task label",
          "description": "task description"
        },
        "data": {
          "taskType": "Transform",
          "selection": {
            "inputId": "DBpedia",
            "typeUri": "http://dbpedia.org/ontology/Film",
            "restriction": ""
          },
          "outputs": [],
          "targetVocabularies": []
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The task has been added or updated successfully.
  • HTTP Status Code 400:

    • The provided task specification is invalid.

PATCH

Updates selected properties of a task. Only the sent JSON paths will be updated, i.e., the provided JSON is deep merged into the existing task JSON.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "metadata": {
          "description": "task description"
          // "label" will be left unchanged
        }
        // All other properties will be left unchanged
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The task has been updated successfully.
  • HTTP Status Code 400:

    • The provided task specification is invalid.
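
The deep-merge behaviour of PATCH can be illustrated with a small Python function. This is only a sketch of the general technique, not the server's implementation: how the server treats arrays or null values is not stated here, so the sketch simply replaces every non-object value wholesale. The function name deep_merge is an assumption.

```python
import copy


def deep_merge(existing: dict, patch: dict) -> dict:
    """Return a copy of `existing` with `patch` merged in, recursing into nested objects."""
    merged = copy.deepcopy(existing)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them property by property.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Anything else (strings, numbers, lists, ...) replaces the old value.
            merged[key] = copy.deepcopy(value)
    return merged


existing_task = {
    "id": "myTransform",
    "metadata": {"label": "task label", "description": "old description"},
    "data": {"taskType": "Transform", "outputs": []},
}
patch = {"metadata": {"description": "task description"}}

print(deep_merge(existing_task, patch))
```

Only metadata.description changes in the result; the label and all other properties keep their previous values.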

GET

Retrieves a task from the project.

Query Parameters

This method accepts the following query parameters:

  • withLabels:

    If true, all parameter values will be reified in a new object that has an optional label property. A label is added for all auto-completable parameters that have the ‘autoCompleteValueWithLabels’ property set to true. This guarantees that a user always sees the label of such values. For object type parameters that have the ‘visibleInDialog’ flag set to true, this reification is done on all levels. For object type parameters that should not be shown in UI dialogs, this is still done for the first level of the task itself, but not deeper. These values should never be set or updated by a normal UI dialog anyway and should be ignored by a task dialog.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        
        without labels:
        
          {
            "id": "transform",
            "metadata": {
              "label": "task label",
              "description": "task description"
            },
            "data": {
              "taskType": "Transform",
              "parameters": {
                "selection": {
                  "inputId": "DBpedia",
                  "typeUri": "http://dbpedia.org/ontology/Film",
                  "restriction": ""
                },
                "mappingRule": {
                  "type": "root",
                  "id": "root",
                  "rules": {
                    "uriRule": null,
                    "typeRules": [],
                    "propertyRules": []
                  }
                },
                "outputs": [],
                "targetVocabularies": []
              }
            }
          }
        
        with labels:
        
          {
              "data": {
                  "parameters": {
                      "mappingRule": {
                          "value": {
                              "id": "root",
                              "rules": { ... },
                              "type": "root"
                          }
                      },
                      "output": {
                          "value": ""
                      },
                      "selection": {
                          "value": {
                              "inputId": {
                                  "label": "Some label",
                                  "value": "datasetresource_1499719467735_loans_csv"
                              },
                              "restriction": {
                                  "value": ""
                              },
                              "typeUri": {
                                  "value": ""
                              }
                          }
                      },
                      "targetVocabularies": {
                          "value": []
                      }
                  },
                  "taskType": "Transform"
              },
              "id": "transform_datasetresource_1499719467735_loans_csv",
              "metadata": {
                  "label": "",
                  "modified": "2020-04-07T11:05:59.574Z"
              },
              "project": "cmem",
              "taskType": "Transform"
          }
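
A client that requested labels but needs the plain parameter values again can strip the reification. The Python sketch below does this under an assumption that is not guaranteed by the API: any object whose only keys are value and, optionally, label is treated as a reified value. A genuine parameter object with exactly that shape would be unwrapped too. The function name unwrap is made up for illustration.

```python
def unwrap(node):
    """Recursively replace {"value": ..., "label": ...} wrappers with their plain value."""
    if isinstance(node, dict):
        if "value" in node and set(node) <= {"value", "label"}:
            return unwrap(node["value"])
        return {key: unwrap(value) for key, value in node.items()}
    if isinstance(node, list):
        return [unwrap(item) for item in node]
    return node


# Abridged "parameters" object from the example with labels.
with_labels = {
    "selection": {
        "value": {
            "inputId": {
                "label": "Some label",
                "value": "datasetresource_1499719467735_loans_csv",
            },
            "restriction": {"value": ""},
            "typeUri": {"value": ""},
        }
    },
    "targetVocabularies": {"value": []},
}

print(unwrap(with_labels))
```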
        

DELETE

Deletes a task.

Query Parameters

This method accepts the following query parameters:

  • removeDependentTasks, required:

    If true, all tasks that directly or indirectly reference this task are removed as well.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • If the task has been deleted or there is no task with that identifier.

Task Meta Data (/metadata)

Full URL: /workspace/projects/{project}/tasks/{task}/metadata

The metadata of a task. Includes user metadata, such as the task label as well as technical metadata, such as the referenced tasks.

Valid HTTP methods are:

GET

Retrieves all metadata of a task.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          // user-defined metadata
          "label": "Task Label",
          "description": "Task Description",
          "modified": "2018-05-24T14:45:42.637Z",
        
          // general task properties
          "project": "MyProject",
          "id": "MyTask",
          "taskType": "Dataset",
        
          // input and output schemata
          "schemata": {
            "input": [{
              "paths": [
                "",
                ""
              ]
            }],
            "output": {
              "paths": [
                "targetUri",
                "confidence"
              ]
            }
          },
        
          // relations to other tasks
          "relations": {
            "inputTasks": [],
            "outputTasks": [],
            "referencedTasks": ["DBpedia", "linkedmdb"],
            "dependentTasksDirect": ["workflow"],
            "dependentTasksAll": ["workflow"]
          }
        }
        

PUT

Updates the user metadata of a task.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "label": "Task Label",
        "description": "Task Description"
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          // user-defined metadata
          ...
        
          // general task properties
          ...
        
          // input and output schemata
          ...
        
          // relations to other tasks
          ...
        }
        

Clone Task (/clone)

Full URL: /workspace/projects/{project}/tasks/{task}/clone

Clones a task.

Valid HTTP methods are:

POST

Query Parameters

This method accepts the following query parameters:

  • newTask, required:

    The name of the cloned task.

    • type: (string)

Copy Task to Another Project (/copy)

Full URL: /workspace/projects/{project}/tasks/{task}/copy

Copies a task to another project. All tasks that the copied task references (directly or indirectly) are copied as well. Referenced resources are copied only if the target project uses a different resource path than the source project. Using the dryRun attribute, a copy operation can be simulated, i.e., the response listing the tasks to be copied and overwritten can be checked first.

Valid HTTP methods are:

POST

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "targetProject": "targetProjectId",
        "dryRun": true
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "copiedTasks": [ "taskId1", "taskId2", "taskId3" ],
          "overwrittenTasks": [ "taskId1", "taskId2" ]
        }
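
A typical use of dryRun is to inspect what a copy would do before running it for real. The Python sketch below builds the request body and splits the response into newly created and overwritten tasks. It assumes, as in the example response, that overwritten tasks are also listed among the copied tasks; the helper names are made up for illustration.

```python
def copy_payload(target_project: str, dry_run: bool = True) -> dict:
    """Build the JSON body for copying a task to another project."""
    return {"targetProject": target_project, "dryRun": dry_run}


def summarize_copy(response: dict) -> dict:
    """Split the copied tasks into tasks that are new in the target and overwritten ones."""
    overwritten = set(response.get("overwrittenTasks", []))
    new_tasks = [task for task in response.get("copiedTasks", []) if task not in overwritten]
    return {"new": new_tasks, "overwritten": sorted(overwritten)}


# Example response from above.
dry_run_response = {
    "copiedTasks": ["taskId1", "taskId2", "taskId3"],
    "overwrittenTasks": ["taskId1", "taskId2"],
}

print(copy_payload("targetProjectId"))
print(summarize_copy(dry_run_response))
```

If the summary is acceptable, the same request can be repeated with dryRun set to false.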
        

Controlling Activities (/activities)

Full URL: /workspace/projects/{project}/tasks/{task}/activities

The Activity API provides endpoints to manage activities, for example to start and stop them.

Start Activity Non-Blocking (/start)

Full URL: /workspace/projects/{project}/tasks/{task}/activities/{activity}/start

Valid HTTP methods are:

POST

Starts the activity. The call returns immediately without waiting for the activity to complete.

Body

This method accepts the following body payloads:

  • application/x-www-form-urlencoded: Optionally updates configuration parameters before starting the activity.
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Start Activity Blocking (/startBlocking)

Full URL: /workspace/projects/{project}/tasks/{task}/activities/{activity}/startBlocking

Valid HTTP methods are:

POST

Starts the activity and returns after it has completed.

Body

This method accepts the following body payloads:

  • application/x-www-form-urlencoded: Optionally updates configuration parameters before starting the activity.
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
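
Both start endpoints accept the same optional form-encoded body, so one helper can serve them. The Python sketch below builds the POST request without sending it. The base URL is a hypothetical placeholder, the helper name start_activity_request is an assumption, and the configuration key reuses the placeholder naming of the configuration example rather than a real parameter.

```python
import urllib.request
from urllib.parse import quote, urlencode

# Hypothetical base URL; replace with the address of your DataIntegration instance.
BASE_URL = "http://localhost/dataintegration"


def start_activity_request(project: str, task: str, activity: str,
                           config=None, blocking: bool = False) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an activity."""
    action = "startBlocking" if blocking else "start"
    url = (
        f"{BASE_URL}/workspace/projects/{quote(project, safe='')}"
        f"/tasks/{quote(task, safe='')}/activities/{quote(activity, safe='')}/{action}"
    )
    # Configuration parameters, if any, are sent as a form-encoded body.
    body = urlencode(config or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = start_activity_request(
    "example_project", "transformation1", "transform",
    config={"configKey1": "config value 1"},
)
print(request.get_method(), request.full_url, request.data)
```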
        

Cancel Activity (/cancel)

Full URL: /workspace/projects/{project}/tasks/{task}/activities/{activity}/cancel

Valid HTTP methods are:

POST

Cancels the activity.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Activity Configuration (/config)

Full URL: /workspace/projects/{project}/tasks/{task}/activities/{activity}/config

The configuration resource of an activity contains configuration parameters specific to this activity.

Valid HTTP methods are:

POST

Configure the activity.

Body

This method accepts the following body payloads:

  • application/x-www-form-urlencoded: Updated configuration parameters.
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

GET

Get the configuration parameter settings of the activity.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.

    • application/json:

      • example:

        {"configKey1": "config value 1", "configKey2": "config value 2"}
        
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Activity Status (/status)

Full URL: /workspace/projects/{project}/tasks/{task}/activities/{activity}/status

Returns the status of the activity. An activity may have the following status names:

  • Idle if the activity has not been started yet.
  • Waiting if the activity has been started and is waiting to be executed.
  • Running if the activity is currently being executed.
  • Canceling if the activity has been requested to stop but has not stopped yet.
  • Finished if the activity has finished execution, either successfully or with a failure.

Once an activity has been started using the API, it transitions to the Waiting status and the isRunning field switches to true. It remains in Waiting until execution starts, at which point it transitions to the Running status. If a user cancels the activity during execution, it transitions to Canceling and remains there until execution actually stops. When the execution finishes, the activity transitions to the Finished status and isRunning switches to false. If the execution failed, failed is set to true once the Finished status has been reached.

While running, the progress is tracked by the progress field (0 to 100 percent).
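
As a rough illustration of the lifecycle described above, the following Python sketch condenses a status object into a one-line summary. The function name `describe_status` is hypothetical. The sample payload reuses the field names of the response example for this endpoint with an assumed `statusName` of `Running`. The sketch also assumes that `Canceling` still reports `isRunning` as true, which the description implies but does not state outright.

```python
import json


def describe_status(status: dict) -> str:
    """Condense an activity status object into a short summary (hypothetical helper)."""
    name = status["statusName"]
    if name == "Finished":
        # The failed flag is only meaningful once Finished has been reached.
        return "failed" if status.get("failed") else "succeeded"
    if status.get("isRunning"):
        # Assumption: Waiting, Running and Canceling all report isRunning = true.
        return f"{name} ({status.get('progress', 0.0):.1f}%)"
    return name  # Idle: the activity has not been started yet


sample = json.loads("""
{
  "project": "project name",
  "task": "transformation1",
  "activity": "transform",
  "statusName": "Running",
  "isRunning": true,
  "progress": 85.2,
  "message": "...",
  "failed": false,
  "lastUpdateTime": 1503998693958,
  "startTime": 1503998693001
}
""")

print(describe_status(sample))  # prints "Running (85.2%)"
```

A client that polls this endpoint could stop as soon as the summary is `succeeded` or `failed`.
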

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.

    • application/json:

      • example:

        {
          "project": "project name",
          "task": "transformation1",
          "activity": "transform",
          "statusName": "...",
          "isRunning": true,
          "progress": 85.2,
          "message": "...",
          "failed": false,
          "lastUpdateTime": 1503998693958,
          "startTime": 1503998693001
        }
        
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Activity Value (/value)

Full URL: /workspace/projects/{project}/tasks/{task}/activities/{activity}/value

Retrieves the current value of this activity, if any.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • The serialized value. The type is determined via content negotiation. Defaults to XML.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 406:

    • No serializer is registered for the requested format.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
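
Content negotiation for the value endpoint above might be sketched as follows. This is only an illustration: `BASE_URL` and the helper names are hypothetical, and the request object is built without being sent. The `Accept` header selects the serialization, and a 406 response means that no serializer is registered for the requested format.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:9000"  # hypothetical deployment URL


def build_value_request(project, task, activity, accept="application/xml"):
    """Build (but do not send) a GET request for the current activity value."""
    path = "/workspace/projects/{}/tasks/{}/activities/{}/value".format(
        *(quote(part, safe="") for part in (project, task, activity))
    )
    # The Accept header drives content negotiation; the server defaults to XML.
    return Request(BASE_URL + path, headers={"Accept": accept}, method="GET")


def explain_value_status(code):
    """Map the documented status codes of this endpoint to a short hint."""
    hints = {
        200: "serialized value returned",
        406: "no serializer registered for the requested format",
    }
    return hints.get(code, "see the general error responses")


request = build_value_request("myProject", "transformation1", "transform",
                              accept="application/json")
print(request.full_url)
print(request.get_header("Accept"))
```
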
        

Controlling Activities (/activities)

Full URL: /workspace/projects/{project}/activities

The Activity API provides endpoints to manage activities, such as starting and stopping them.

Start Activity Non-Blocking (/start)

Full URL: /workspace/projects/{project}/activities/{activity}/start

Valid HTTP methods are:

POST

Starts the activity. The call returns immediately without waiting for the activity to complete.

Body

This method accepts the following body payloads:

  • application/x-www-form-urlencoded: Optionally updates configuration parameters, before starting the activity.
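
A minimal sketch of how such a start request could be assembled in Python is shown here. `BASE_URL`, the helper name and the bearer token handling are assumptions for illustration (OAuth 2.0 is the default security scheme, but it can be reconfigured). The request is only constructed, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9000"  # hypothetical deployment URL


def build_start_request(project, activity, config=None, token=None):
    """Build (but do not send) a POST request that starts a project activity."""
    path = (
        f"/workspace/projects/{quote(project, safe='')}"
        f"/activities/{quote(activity, safe='')}/start"
    )
    # Optional configuration parameters travel as a form-encoded body.
    body = urlencode(config or {}).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    if token:
        # Assumption: an OAuth 2.0 bearer token is used for authentication.
        headers["Authorization"] = f"Bearer {token}"
    return Request(BASE_URL + path, data=body, headers=headers, method="POST")


request = build_start_request("myProject", "myActivity",
                              {"configKey1": "config value 1"})
print(request.get_method(), request.full_url)
print(request.data)
```

Passing the object to `urllib.request.urlopen` would send it. Because the call is non-blocking, the activity status endpoint is the place to check for completion.
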
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Start Activity Blocking (/startBlocking)

Full URL: /workspace/projects/{project}/activities/{activity}/startBlocking

Valid HTTP methods are:

POST

Starts the activity and returns after it has completed.

Body

This method accepts the following body payloads:

  • application/x-www-form-urlencoded: Optionally updates configuration parameters, before starting the activity.
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Cancel Activity (/cancel)

Full URL: /workspace/projects/{project}/activities/{activity}/cancel

Valid HTTP methods are:

POST

Cancels the activity.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Activity Configuration (/config)

Full URL: /workspace/projects/{project}/activities/{activity}/config

The configuration resource of an activity contains configuration parameters specific to this activity.

Valid HTTP methods are:

POST

Configure the activity.

Body

This method accepts the following body payloads:

  • application/x-www-form-urlencoded: Updated configuration parameters.
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

GET

Get the configuration parameter settings of the activity.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.

    • application/json:

      • example:

        {"configKey1": "config value 1", "configKey2": "config value 2"}
        
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Activity Status (/status)

Full URL: /workspace/projects/{project}/activities/{activity}/status

Returns the status of the activity. An activity may have the following status names:

  • Idle if the activity has not been started yet.
  • Waiting if the activity has been started and is waiting to be executed.
  • Running if the activity is currently being executed.
  • Canceling if the activity has been requested to stop but has not stopped yet.
  • Finished if the activity has finished execution, either successfully or with a failure.

Once an activity has been started using the API, it transitions to the Waiting status and the isRunning field switches to true. It remains in Waiting until execution starts, at which point it transitions to the Running status. If a user cancels the activity during execution, it transitions to Canceling and remains there until execution actually stops. When the execution finishes, the activity transitions to the Finished status and isRunning switches to false. If the execution failed, failed is set to true once the Finished status has been reached.

While running, the progress is tracked by the progress field (0 to 100 percent).

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.

    • application/json:

      • example:

        {
          "project": "project name",
          "task": "transformation1",
          "activity": "transform",
          "statusName": "...",
          "isRunning": true,
          "progress": 85.2,
          "message": "...",
          "failed": false,
          "lastUpdateTime": 1503998693958,
          "startTime": 1503998693001
        }
        
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Activity Value (/value)

Full URL: /workspace/projects/{project}/activities/{activity}/value

Retrieves the current value of this activity, if any.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • The serialized value. The type is determined via content negotiation. Defaults to XML.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 406:

    • No serializer is registered for the requested format.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Managing Datasets (/datasets)

Full URL: /workspace/projects/{project}/datasets

The Dataset API allows the management of datasets.

Dataset (/{dataset})

Full URL: /workspace/projects/{project}/datasets/{dataset}

A named dataset.

URI parameters:

  • dataset, required:

    • type: (string)

Valid HTTP methods are:

GET

Get the dataset configuration parameters.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.

    • application/xml:

      • example:

        
          
          
        
        
    • application/json:

      • example:

        {
          "id": "DatasetName",
          "data": {
            "type": "file",
            "parameters": {
              "file": "dataset.nt",
              "format": "N-TRIPLE"
            }
          }
        }
        
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

PUT

Updates or creates a dataset.

Query Parameters

This method accepts the following query parameters:

  • autoConfigure:

    If true, the dataset parameters will be auto-configured. Only works with dataset plugins that support auto-configuration, e.g., CSV.

    • type: (string)
Body

This method accepts the following body payloads:

  • application/xml:

    • example:

      
        
        
      
  • application/json:

    • example:

      {
        "id": "DatasetName",
        "data": {
          "type": "file",
          "parameters": {
            "file": "dataset.nt",
            "format": "N-TRIPLE"
          }
        }
      }
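
For illustration, a PUT request carrying the JSON payload above could be put together along these lines. This is a sketch under assumptions: `BASE_URL` and `build_put_dataset_request` are hypothetical names, and the request is built without being sent.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9000"  # hypothetical deployment URL


def build_put_dataset_request(project, dataset_id, plugin_type, parameters,
                              auto_configure=False):
    """Build (but do not send) a PUT request that creates or updates a dataset."""
    url = (
        f"{BASE_URL}/workspace/projects/{quote(project, safe='')}"
        f"/datasets/{quote(dataset_id, safe='')}"
    )
    if auto_configure:
        # Only meaningful for dataset plugins that support auto-configuration.
        url += "?" + urlencode({"autoConfigure": "true"})
    payload = {
        "id": dataset_id,
        "data": {"type": plugin_type, "parameters": parameters},
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_put_dataset_request(
    "myProject", "DatasetName", "file",
    {"file": "dataset.nt", "format": "N-TRIPLE"},
    auto_configure=True,
)
print(request.get_method(), request.full_url)
```

Serializing with `json.dumps` has the side benefit of never emitting trailing commas, which a strict JSON parser would reject.
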
      
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 204:

  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

DELETE

Removes a dataset from the project.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 204:

    • If the dataset has been deleted or there is no dataset with that identifier.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Auto-Configured Dataset (/autoConfigured)

Full URL: /workspace/projects/{project}/datasets/{dataset}/autoConfigured

An auto-configured version of the dataset.

Valid HTTP methods are:

GET

Get the auto-configured dataset configuration parameters.

Response

The expected response:

  • HTTP Status Code 200:

    • application/xml:

      • example:

        
          
          
        
        
    • application/json:

      • example:

        {
          "id": "DatasetName",
          "data": {
            "type": "file",
            "parameters": {
              "file": "dataset.nt",
              "format": "N-TRIPLE"
            }
          }
        }
        
  • HTTP Status Code 501:

    • text/plain:

      • example:

        The dataset type does not support auto-configuration.
        

Dataset Types (/types)

Full URL: /workspace/projects/{project}/datasets/{dataset}/types

The types of a dataset can be classes of an ontology or, in the case of a CSV file, a single type.

Valid HTTP methods are:

GET

Get a list of entity types of this dataset.

Query Parameters

This method accepts the following query parameters:

  • textQuery:

    An optional multi-word text query to filter the types by.

    • type: (string)
  • limit:

    Maximum number of types to return. If not specified, all types are returned.

    • type: (integer)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        ["", ""]
        
  • HTTP Status Code 404:

Dataset Mapping Coverage (/mappingCoverage)

Full URL: /workspace/projects/{project}/datasets/{dataset}/mappingCoverage

Returns the mapping coverage of this dataset. The mapping coverage is derived from all transformations in the same project that have this dataset as input. It has three categories: fully mapped, partially mapped and unmapped. A source path is fully mapped if it consists only of forward paths, with no backward paths or filters. If there are filters or backward paths, the path can at most be considered partially mapped, although in reality several partial mappings could fully cover a path. The algorithm cannot detect this kind of combined coverage. A path is unmapped if it is not used as value input in any mapping.

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • type:

    This optional parameter specifies which coverage types to return, as a comma-separated string. Allowed values are ‘fullyMapped’, ‘partiallyMapped’ and ‘unmapped’. By default, all types are returned.

    • type: (string)

    • example:

      partiallyMapped,unmapped
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
            {
                "covered": true,
                "fully": true,
                "path": "target:label"
            },
            {
                "covered": true,
                "fully": false,
                "path": "target:zipCodeArea"
            },
            {
                "covered": false,
                "fully": false,
                "path": "target:issueDate"
            }
        ]
        
  • HTTP Status Code 404:

  • HTTP Status Code 500:

    • text/plain:

      • example:

        The type of data source 'someDataSource' does not support mapping coverage.
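
The coverage entries returned by the mapping coverage endpoint can be grouped into the three categories with a few lines of Python. This sketch assumes that the `covered` and `fully` flags of the response example map onto the category names used by the `type` query parameter in the obvious way. That mapping is an inference from the example, not something the endpoint description states explicitly.

```python
def coverage_category(entry):
    """Derive the coverage category of one response entry (assumed mapping)."""
    if not entry["covered"]:
        return "unmapped"
    return "fullyMapped" if entry["fully"] else "partiallyMapped"


def group_by_category(entries):
    """Collect the paths of all entries under their coverage category."""
    groups = {"fullyMapped": [], "partiallyMapped": [], "unmapped": []}
    for entry in entries:
        groups[coverage_category(entry)].append(entry["path"])
    return groups


# Entries taken from the response example of the endpoint.
coverage = [
    {"covered": True, "fully": True, "path": "target:label"},
    {"covered": True, "fully": False, "path": "target:zipCodeArea"},
    {"covered": False, "fully": False, "path": "target:issueDate"},
]

for category, paths in group_by_category(coverage).items():
    print(category, paths)
```
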
        

Dataset Source Path Mapping Coverage (/values)

Full URL: /workspace/projects/{project}/datasets/{dataset}/mappingCoverage/values

Returns mapping value coverage details for a specific dataset path. This is mostly relevant for partially mapped paths. It takes a specific path as input and returns the number of values found at this path, the number of values actually used by all mappings of the project, and the values that are not covered.

Valid HTTP methods are:

POST

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {"dataSourcePath": "/Person/Properties/Property/Value"}
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
            "coveredValues": 2,
            "missedValues": [
                {
                    "nodeId": "953217152",
                    "value": "V2"
                }
            ],
            "overallValues": 3
        }
        
  • HTTP Status Code 404:

  • HTTP Status Code 500:

    • text/plain:

      • example:

        The type of data source 'someDataSource' does not support mapping value coverage.

Activities (/activities)

Full URL: /workspace/activities

General operations on activities in the workspace.

Request updates on the status of one or many activities. (/updates)

Full URL: /workspace/activities/updates

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • project:

    The name of the project. If empty or not provided, activities from all projects are considered.

    • type: (string)
  • task:

    The name of the task. If empty or not provided, activities from all tasks are considered.

    • type: (string)
  • activity:

    The name of the activity. If empty or not provided, updates from all activities that match the task and project are returned.

    • type: (string)
  • timestamp:

    Only return status updates that happened after this timestamp. Provided in milliseconds since midnight, January 1, 1970 UTC. If not provided or 0, the statuses of all matching activities are returned.

    • type: (integer)
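
The timestamp parameter lends itself to incremental polling: each request asks only for updates newer than the latest one already seen. The following sketch shows one way to build the URL and advance the timestamp. The helper names and the base URL are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_updates_url(base_url, project=None, task=None, activity=None,
                      timestamp=0):
    """Build the URL of the updates endpoint, leaving out empty filters."""
    candidates = {"project": project, "task": task, "activity": activity}
    params = {key: value for key, value in candidates.items() if value}
    if timestamp:
        # 0 (or no timestamp) returns the statuses of all matching activities.
        params["timestamp"] = timestamp
    query = urlencode(params)
    return f"{base_url}/workspace/activities/updates" + (f"?{query}" if query else "")


def next_timestamp(updates, previous=0):
    """Return the newest lastUpdateTime seen so far, for the next poll."""
    return max([previous] + [update["lastUpdateTime"] for update in updates])


# One update as it might be returned by a first, unfiltered poll.
first_batch = [{"activity": "FirstActivityName", "lastUpdateTime": 1560427593854}]
since = next_timestamp(first_batch)
print(build_updates_url("http://localhost:9000", project="p1", timestamp=since))
```
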
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
          {
            "project": "ProjectOfThisActivity",
            "task": "TaskOfThisActivity",
            "activity": "FirstActivityName",
            "isRunning": true,
            "statusName": "Running",
            "progress": 50,
            "failed": false,
            "message": "Status message to be shown to the user",
            "startTime": 1560427590357,
            "lastUpdateTime": 1560427593854
          },
          ...
        ]
        

Open a WebSocket to receive updates on the status of one or many activities. (/updatesWebSocket)

Full URL: /workspace/activities/updatesWebSocket

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • project:

    The name of the project. If empty or not provided, activities from all projects are considered.

    • type: (string)
  • task:

    The name of the task. If empty or not provided, activities from all tasks are considered.

    • type: (string)
  • activity:

    The name of the activity. If empty or not provided, updates from all activities that match the task and project are returned.

    • type: (string)
  • timestamp:

    Only return status updates that happened after this timestamp. Provided in milliseconds since midnight, January 1, 1970 UTC. If not provided or 0, the statuses of all matching activities are returned.

    • type: (integer)
Response

The expected response:

  • HTTP Status Code 200:

    • A WebSocket that receives a status JSON object whenever a status has been updated. Format is the same as for the /updates endpoint.

Activity Log (/log)

Full URL: /workspace/activities/log

Valid HTTP methods are:

GET

Retrieves the activities log (last 100 entries). The logging is not started until the first call to this method, i.e., it is meant to be polled.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
          {
            "activity": "Types cache accessories",
            "level": "INFO", // One of "SEVERE", "WARNING", "INFO", "CONFIG", "FINE", "FINER", "FINEST"
            "message": "Started",
            "timestamp": 1460714719136
          },
          {
            "activity": "Types cache accessories",
            "level": "INFO",
            "message": "Cache read from accessories_cache.xml",
            "timestamp": 1460714719148
          },
          ...
        ]
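
Since the log is meant to be polled, a client will often want to keep only entries at or above a given level. The sketch below assumes that the level names in the example comment are listed from most to least severe, which matches the ordering of the `java.util.logging` levels of the same names. The helper name is hypothetical.

```python
# Level names from the example above, ordered from most to least severe.
LEVELS = ["SEVERE", "WARNING", "INFO", "CONFIG", "FINE", "FINER", "FINEST"]


def at_least(entries, minimum="INFO"):
    """Keep the log entries whose level is at least as severe as `minimum`."""
    threshold = LEVELS.index(minimum)
    return [entry for entry in entries if LEVELS.index(entry["level"]) <= threshold]


log = [
    {"activity": "Types cache accessories", "level": "INFO",
     "message": "Started", "timestamp": 1460714719136},
    {"activity": "Types cache accessories", "level": "FINE",
     "message": "Hypothetical low-priority entry", "timestamp": 1460714719140},
]

for entry in at_least(log, "INFO"):
    print(entry["timestamp"], entry["level"], entry["message"])
```
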
        

Reload Workspace (/reload)

Full URL: /workspace/reload

Reloads the workspace from the backend.

Valid HTTP methods are:

POST

Trigger the reload. The request blocks until the reload has finished.

Reload Workspace Prefixes (/reloadPrefixes)

Full URL: /workspace/reloadPrefixes

Reloads the workspace prefixes from registered or all vocabularies from the backend.

Valid HTTP methods are:

POST

Trigger the reload. The request blocks until the reload has finished.

Update global vocabulary cache (/updateGlobalVocabularyCache)

Full URL: /workspace/updateGlobalVocabularyCache

Updates a specific vocabulary of the global vocabulary cache.

Valid HTTP methods are:

POST

This request is non-blocking. It can take a while for the cache to be up to date.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "iri": "http://xmlns.com/foaf/0.1/"
      }
      
Response

The expected response:

  • HTTP Status Code 204:

  • HTTP Status Code 400:

Import Workspace (/import/{importPlugin})

Full URL: /workspace/import/{importPlugin}

Imports the entire workspace from the file sent with the request. Before importing, all existing projects are removed from the workspace. The importPlugin path parameter must be one of the ids returned from the marshallingPlugins endpoint.

URI parameters:

  • importPlugin, required:

    • type: (string)

Valid HTTP methods are:

POST

Export Workspace (/export/{exportPlugin})

Full URL: /workspace/export/{exportPlugin}

Exports the entire workspace with the specified marshalling plugin, where exportPlugin must be one of the ids returned from the marshallingPlugins endpoint.

URI parameters:

  • exportPlugin, required:

    • type: (string)

Valid HTTP methods are:

GET

Marshalling Plugins (/marshallingPlugins)

Full URL: /workspace/marshallingPlugins

Returns a list of supported workspace/project import/export plugins, e.g. RDF, XML.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
          {
            "id": "rdfTurtle",
            "label": "RDF Turtle",
            "description": "RDF Turtle meta data without resource files.",
            "fileExtension": "ttl",
            "mediaType": "text/turtle"
          },
          {
            "id": "xmlZip",
            "label": "XML/ZIP file",
            "description": "ZIP archive, which includes XML meta data and resource files.",
            "fileExtension": "zip",
            "mediaType": "application/zip"
          }
        ]
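
Because the import and export endpoints only accept ids from this list, a client may want to validate the id before building the URL. A minimal sketch, with a hypothetical helper name and base URL, and with the plugin list copied from the example above:

```python
from urllib.parse import quote


def build_export_target(base_url, plugins, plugin_id):
    """Return the export URL, a suggested file name and the media type."""
    by_id = {plugin["id"]: plugin for plugin in plugins}
    if plugin_id not in by_id:
        raise ValueError(f"unknown marshalling plugin: {plugin_id!r}")
    plugin = by_id[plugin_id]
    url = f"{base_url}/workspace/export/{quote(plugin_id, safe='')}"
    # The file name is only a suggestion derived from the plugin's extension.
    return url, f"workspace.{plugin['fileExtension']}", plugin["mediaType"]


plugins = [
    {"id": "rdfTurtle", "label": "RDF Turtle", "fileExtension": "ttl",
     "mediaType": "text/turtle"},
    {"id": "xmlZip", "label": "XML/ZIP file", "fileExtension": "zip",
     "mediaType": "application/zip"},
]

url, filename, media_type = build_export_target(
    "http://localhost:9000", plugins, "xmlZip")
print(url, filename, media_type)
```
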
        

Search Tasks (/searchTasks)

Full URL: /workspace/searchTasks

Lists all tasks that fulfill a set of filters. All JSON fields sent in the request are optional. The request example reflects the default values that are chosen when a field is missing in the request.

Valid HTTP methods are:

POST

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        // Restrict search to a specific project.
        "project": null,
        // Only return tasks that match a search term.
        // Currently, the search covers the ID, the label, the description and the task properties.
        "searchTerm": null,
        // The format options specify which parts are to be included in the response.
        "formatOptions": {
          "includeMetaData": true, // Include the task meta data.
          "includeTaskData": true, // Include the task data.
          "includeTaskProperties": false, // Include a list of properties as key-value pairs to be displayed to the user.
          "includeRelations": false, // Include relations to other tasks.
          "includeSchemata": false // Include the input and output schemata of each task.
        }
      }
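
The request example above carries explanatory comments, which strict JSON does not allow, so an actual request body has to be plain JSON. One possible way to produce it in Python, starting from the documented defaults and overriding individual fields, is sketched here. The helper name is hypothetical.

```python
import copy
import json

# Default values as listed in the request example.
DEFAULT_SEARCH_REQUEST = {
    "project": None,
    "searchTerm": None,
    "formatOptions": {
        "includeMetaData": True,
        "includeTaskData": True,
        "includeTaskProperties": False,
        "includeRelations": False,
        "includeSchemata": False,
    },
}


def build_search_body(project=None, search_term=None, **format_options):
    """Serialize a searchTasks request body, rejecting unknown format options."""
    unknown = set(format_options) - set(DEFAULT_SEARCH_REQUEST["formatOptions"])
    if unknown:
        raise ValueError(f"unknown format options: {sorted(unknown)}")
    body = copy.deepcopy(DEFAULT_SEARCH_REQUEST)
    body["project"] = project
    body["searchTerm"] = search_term
    body["formatOptions"].update(format_options)
    return json.dumps(body)


payload = build_search_body(project="myProject", search_term="person",
                            includeRelations=True)
print(payload)
```

Since all fields are optional, sending only the fields that differ from the defaults would work equally well. Spelling them out simply makes the request self-describing.
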
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
          {
            "id": "task identifier (unique inside the project)",
            "project": "project this task belongs to (if any)",
            "metadata": {
              "label": "task label",
              "description": "task description",
              "modified": "2018-03-08T11:04:59.156Z"
            },
            "taskType": "task type, e.g., Dataset",
            "data": {
              // Task data that fully describes the task. Actual structure depends on the task type.
            },
            "properties": [
              // A list of key-value pairs to be displayed to the user.
              {
                "key": "some key",
                "value": "some value"
              },
              ...
            ],
            "relations": {
              "inputTasks": [], // Identifiers of all tasks from which this task is reading data.
              "outputTasks": [], // Identifiers of all tasks to which this task is writing data.
              "referencedTasks": [], // Identifiers of all tasks that are directly referenced by this task. Includes input and output tasks.
              "dependentTasksDirect": [], // Identifiers of all tasks that directly reference this task.
              "dependentTasksAll": [] // Identifiers of all tasks that directly or indirectly reference this task.
            },
            "schemata": {
              "input": ..., // The schemata of the input data of this task.
              "output": ... // The schemata of the output data of this task.
            }
          },
          ...
        ]
        

Global workspace activities (/globalWorkspaceActivities)

Full URL: /workspace/globalWorkspaceActivities

Activities that are global to the workspace and not connected to any project or task.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
            {
                "instances": [
                    {
                        "id": "GlobalVocabularyCache"
                    }
                ],
                "name": "GlobalVocabularyCache"
            }
        ]
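
As a minimal sketch, a response of this shape can be flattened into pairs of activity name and instance id. The function name and the sample payload are illustrative assumptions; only the `name`, `instances` and `id` fields are taken from the example above.

```python
import json


def list_activity_instances(payload: str) -> list:
    """Return (activity name, instance id) pairs from a globalWorkspaceActivities response."""
    activities = json.loads(payload)
    return [
        (activity["name"], instance["id"])
        for activity in activities
        # An activity without instances simply contributes no pairs.
        for instance in activity.get("instances", [])
    ]


# Sample payload copied from the documented example response.
sample = '[{"instances": [{"id": "GlobalVocabularyCache"}], "name": "GlobalVocabularyCache"}]'
print(list_activity_instances(sample))
```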
        

Linking Tasks API (/linking)

The Linking Task API provides endpoints related to linking tasks.

Linking Task (/tasks/{project}/{linkingTask})

Full URL: /linking/tasks/{project}/{linkingTask}

URI parameters:

  • project, required:

    • type: (string)
  • linkingTask, required:

    • type: (string)

Valid HTTP methods are:

GET

Retrieves the linking task in XML.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.

    • application/xml:

  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

PUT

Updates or creates a linking task.

Query Parameters

This method accepts the following query parameters:

  • source:

    The id of the source dataset.

    • type: (string)
  • target:

    The id of the target dataset.

    • type: (string)
  • sourceType:

    The URI of the type of entities to be selected from the source dataset.

    • type: (string)
  • targetType:

    The URI of the type of entities to be selected from the target dataset.

    • type: (string)
  • sourceRestriction:

    An additional restriction on the source entities.

    • type: (string)
  • targetRestriction:

    An additional restriction on the target entities.

    • type: (string)
  • output:

    The id of the output dataset.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
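
The query parameters of the PUT method can be assembled as in the following sketch. It only builds the request object and does not send it. The host, project, task and dataset identifiers are hypothetical, and authentication (OAuth 2.0 by default) is left out.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://di.example.org"  # hypothetical DataIntegration host

# Query parameters documented for PUT /linking/tasks/{project}/{linkingTask}.
ALLOWED_PARAMS = {
    "source", "target", "sourceType", "targetType",
    "sourceRestriction", "targetRestriction", "output",
}


def linking_task_put_request(project: str, task: str, **params: str) -> Request:
    """Build (but do not send) a PUT request that creates or updates a linking task."""
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"Unsupported query parameters: {sorted(unknown)}")
    path = f"/linking/tasks/{quote(project, safe='')}/{quote(task, safe='')}"
    query = urlencode(params)  # percent-encodes URIs such as the entity type
    url = f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"
    return Request(url, method="PUT")


req = linking_task_put_request(
    "movies", "linkMovies",  # hypothetical project and task ids
    source="dbpedia", target="linkedmdb",
    sourceType="http://dbpedia.org/ontology/Film",
)
print(req.get_method(), req.full_url)
```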
        

DELETE

Deletes a linking task.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Evaluates the current linking rule on all reference links. (/referenceLinksEvaluated)

Full URL: /linking/tasks/{project}/{linkingTask}/referenceLinksEvaluated

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "positive": [
            {
              "source": "http://dbpedia.org/resource/The_River_%281938_film%29",
              "target": "http://data.linkedmdb.org/resource/film/208",
              "confidence": 1,
              "ruleValues": {
                "operatorId": "compareTitles",
                "score": 1,
                "sourceValue": {
                  "operatorId": "movieTitle1",
                  "values": [
                    "The River"
                  ],
                  "error": null
                },
                "targetValue": {
                  "operatorId": "movieTitle2",
                  "values": [
                    "The River"
                  ],
                  "error": null
                }
              }
            },
            ...
          ],
          "negative": [
            ...
          ]
        }
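
One possible use of this response is to spot positive reference links that the current rule scores poorly. The sketch below assumes the parsed JSON has the shape shown above. The threshold of 0.5 and the sample URNs are arbitrary illustrative choices, not values defined by the API.

```python
def low_confidence_positives(evaluation: dict, threshold: float = 0.5) -> list:
    """Return (source, target) pairs of positive reference links scoring below threshold."""
    return [
        (link["source"], link["target"])
        for link in evaluation.get("positive", [])
        if link["confidence"] < threshold
    ]


# Hypothetical, shortened evaluation result with the documented field names.
evaluation = {
    "positive": [
        {"source": "urn:example:a1", "target": "urn:example:b1", "confidence": 1},
        {"source": "urn:example:a2", "target": "urn:example:b2", "confidence": -0.2},
    ],
    "negative": [],
}
print(low_confidence_positives(evaluation))
```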
        

Linking Task Execution with Payload (/postLinkDatasource)

Full URL: /linking/tasks/{project}/{linkingTask}/postLinkDatasource

Valid HTTP methods are:

POST

Execute a specific link specification against input data from the POST body.

Body

This method accepts the following body payloads:

  • application/xml:

    • example:

      
        
        
          
            
               
              
            
          
          
            
               
              
            
          
        
        
          <https://www.example.com/resource/123456> <http://xmlns.com/foaf/0.1/name> "John Doe" .
        
        
          <https://www.example2.com/resource/abcdef> <http://xmlns.com/foaf/0.1/name> "Doe, John" .
        
      
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/n-triples:

      • example:

           .
        

Evaluate a linking task based on a linkage rule that is provided with the request. (/evaluateLinkageRule)

Full URL: /linking/tasks/{project}/{linkingTask}/evaluateLinkageRule

Valid HTTP methods are:

POST

Executes a linking task based on the linkage rule that comes with the POST request.

Query Parameters

This method accepts the following query parameters:

  • linkLimit:

    The maximum number of unique links that should be returned from the evaluation.

    • type: (integer)- default value: 1000
  • timeoutInMs:

    The maximum time in milliseconds the matching stage of the linking execution is allowed to run. This timeout does not affect the loading stage.

    • type: (integer)- default value: 30000
  • includeReferenceLinks:

    When true, this will return an evaluation of the reference links in addition to freshly matched links.

    • type: (boolean)
Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
          "filter": {
              "limit": null,
              "unambiguous": null
          },
          "linkType": "http://www.w3.org/2002/07/owl#sameAs",
          "operator": {
              "id": "unnamed_3",
              "indexing": true,
              "metric": "equality",
              "parameters": {},
              "required": false,
              "sourceInput": {
                  "id": "unnamed_1",
                  "path": "group",
                  "type": "pathInput"
              },
              "targetInput": {
                  "id": "unnamed_2",
                  "path": "group",
                  "type": "pathInput"
              },
              "threshold": 0,
              "type": "Comparison",
              "weight": 1
          }
      }
      
  • application/xml:

    • example:

      
        
          
          
        
      
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
            {
                "confidence": 1,
                "ruleValues": {
                    "operatorId": "unnamed_6",
                    "score": 1,
                    "sourceValue": {
                        "error": null,
                        "operatorId": "unnamed_4",
                        "values": [
                            "group 1"
                        ]
                    },
                    "targetValue": {
                        "error": null,
                        "operatorId": "unnamed_5",
                        "values": [
                            "group 1"
                        ]
                    }
                },
                "source": "urn:instance:simplecsv#1",
                "target": "urn:instance:simplecsv#1"
            },
            {
              ...
            }
        ]
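
A request for this endpoint could be prepared as sketched below. The request is only constructed, not sent. The host and the project and task ids are hypothetical, authentication is omitted, and sending the boolean as the string `true` or `false` is an assumption about how the server parses it.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://di.example.org"  # hypothetical DataIntegration host


def evaluate_linkage_rule_request(project, task, rule, link_limit=1000,
                                  timeout_ms=30000, include_reference_links=False):
    """Build (but do not send) a POST request for the evaluateLinkageRule endpoint."""
    query = urlencode({
        "linkLimit": link_limit,
        "timeoutInMs": timeout_ms,
        # Assumption: booleans are passed as lowercase strings.
        "includeReferenceLinks": str(include_reference_links).lower(),
    })
    url = (f"{BASE_URL}/linking/tasks/{quote(project, safe='')}/"
           f"{quote(task, safe='')}/evaluateLinkageRule?{query}")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return Request(url, data=json.dumps(rule).encode("utf-8"),
                   headers=headers, method="POST")


# Shortened version of the documented JSON linkage rule.
rule = {
    "linkType": "http://www.w3.org/2002/07/owl#sameAs",
    "operator": {"id": "unnamed_3", "type": "Comparison", "metric": "equality",
                 "threshold": 0, "weight": 1},
}
req = evaluate_linkage_rule_request("cmem", "linkGroups", rule, link_limit=10)
print(req.get_method(), req.full_url)
```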

Transform Task API (/transform)

The Transform Task API provides endpoints related to transformation tasks and rules.

Transform Task (/tasks/{project}/{transformationTask})

Full URL: /transform/tasks/{project}/{transformationTask}

URI parameters:

  • project, required:

    • type: (string)
  • transformationTask, required:

    • type: (string)

Valid HTTP methods are:

GET

Retrieves the transform task.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.

    • application/xml:

    • application/json:

  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

PUT

Updates or creates a transform task.

Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

DELETE

Deletes a transform task.

Query Parameters

This method accepts the following query parameters:

  • removeDependentTasks:

    If true, transform and linking tasks that directly reference this task are removed as well.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • The request completed successfully.
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

All Mapping Rules (/rules)

Full URL: /transform/tasks/{project}/{transformationTask}/rules

Valid HTTP methods are:

GET

Get all mapping rules of the transformation task. If no accept header is defined, XML is returned.

Response

The expected response:

  • HTTP Status Code 200:

    • The result in XML or JSON.

    • application/xml:

      • example:

        
          
            
              
            
            
              
            
          
          
            
              
              
            
            
              
            
          
        
    • application/json:

      • example:

        {
          "type" : "root",
          "id" : "root",
          "rules" : {
            "uriRule" : {
              "type" : "uri",
              "id" : "uri",
              "pattern" : "http://example.org/{PersonID}"
            },
            "typeRules" : [ {
              "type" : "type",
              "id" : "explicitlyDefinedId",
              "typeUri" : "target:Person"
            } ],
            "propertyRules" : [ {
              "type" : "direct",
              "id" : "directRule",
              "sourcePath" : "/source:name",
              "mappingTarget" : {
                "uri" : "target:name",
                "valueType" : {
                  "nodeType" : "StringValueType"
                },
                "isBackwardProperty": false
              }
            }, {
              "type" : "object",
              "id" : "objectRule",
              "sourcePath" : "/source:address",
              "mappingTarget" : {
                "uri" : "target:address",
                "valueType" : {
                  "nodeType" : "UriValueType"
                },
                "isBackwardProperty": false
              },
              "rules" : {
                "uriRule" : null,
                "typeRules" : [ ],
                "propertyRules" : [ ]
              }
            } ]
          }
        }
        
  • HTTP Status Code 404:

  • HTTP Status Code 406:
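
The JSON representation is a tree: each rule may carry a `rules` object with a `uriRule`, `typeRules` and `propertyRules`. As a rough sketch, the identifiers of all rules can be collected by walking that tree. The function name is illustrative and the sample is a shortened copy of the JSON example above.

```python
def collect_rule_ids(rule: dict) -> list:
    """Return the ids of a rule and all of its nested child rules, depth first."""
    ids = [rule["id"]] if "id" in rule else []
    children = rule.get("rules") or {}  # value rules have no "rules" object
    uri_rule = children.get("uriRule")
    if uri_rule:  # may be null for object rules
        ids += collect_rule_ids(uri_rule)
    for key in ("typeRules", "propertyRules"):
        for child in children.get(key, []):
            ids += collect_rule_ids(child)
    return ids


root = {
    "type": "root", "id": "root",
    "rules": {
        "uriRule": {"type": "uri", "id": "uri", "pattern": "http://example.org/{PersonID}"},
        "typeRules": [{"type": "type", "id": "explicitlyDefinedId", "typeUri": "target:Person"}],
        "propertyRules": [
            {"type": "direct", "id": "directRule", "sourcePath": "/source:name"},
            {"type": "object", "id": "objectRule", "sourcePath": "/source:address",
             "rules": {"uriRule": None, "typeRules": [], "propertyRules": []}},
        ],
    },
}
print(collect_rule_ids(root))
```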

PUT

Updates all rules of a transform specification. As with GET, both XML and JSON are supported. The format for PUT is exactly the same as the result that is returned by a GET request.

Body

This method accepts the following body payloads:

  • application/xml:

  • application/json:

Response

The expected response:

  • HTTP Status Code 200:

    • The rules were successfully updated. There is no response body.
  • HTTP Status Code 400:

    • If the provided rule serialization is invalid.
  • HTTP Status Code 404:

    • If no rule with the given identifier could be found.

Mapping Rule (/rule/{rule})

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}

URI parameters:

  • rule, required:

    • type: (string)

Valid HTTP methods are:

GET

Fetch the JSON or XML representation of a single transformation rule. The rule path parameter is determined by the ‘id’ parameter of the rule.

Response

The expected response:

  • HTTP Status Code 200:

    • application/xml:

      • example:

        
          
            
          
          
            
          
        
    • application/json:

      • example:

        {
          "type" : "object",
          "id" : "objectRule",
          "sourcePath" : "/source:address",
          "mappingTarget" : {
            "uri" : "target:address",
            "valueType" : {
              "nodeType" : "UriValueType"
            },
            "isBackwardProperty": false
          },
          "rules" : {
            "uriRule" : null,
            "typeRules" : [ ],
            "propertyRules" : [ ]
          }
        }
        
  • HTTP Status Code 404:

    • If no rule with the given identifier could be found.

PUT

Updates a rule or parts of a rule. The XML and JSON formats are the same as returned by the corresponding GET endpoint. For JSON payloads, the caller may send a fragment that only specifies the parts of the rule that should be updated. The parts that are not sent in the request will remain unchanged.

Body

This method accepts the following body payloads:

  • application/xml:

  • application/json:

    • example:

      {
        "rules": {
          "uriRule": {
            "type": "uri",
            "pattern": "http://example.org/{PersonID}"
          },
          "typeRules": [
            {
              "type": "type",
              "id": "explicitlyDefinedId",
              "typeUri": "target:Person"
            }
          ]
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The rule has been updated successfully. The complete rule is returned.

    • application/json:

      • example:

        {
          "type" : "root",
          "id" : "root",
          "rules" : {
            "uriRule" : {
              "type" : "uri",
              "id" : "uri",
              "pattern" : "http://example.org/{PersonID}"
            },
            "typeRules" : [ {
              "type" : "type",
              "id" : "explicitlyDefinedId",
              "typeUri" : "target:Person"
            } ],
            "propertyRules" : [ ]
          }
        }
        
  • HTTP Status Code 400:

    • If the provided rule serialization is invalid.
  • HTTP Status Code 404:

    • If no rule with the given identifier could be found.
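
A partial update with a JSON fragment might be prepared as in this sketch, which changes only the URI pattern of a rule. The request is built but not sent. The host and the project and task ids are hypothetical, and authentication is omitted.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://di.example.org"  # hypothetical DataIntegration host


def rule_update_request(project: str, task: str, rule_id: str, fragment: dict) -> Request:
    """Build (but do not send) a PUT request carrying a JSON rule fragment."""
    segments = "/".join(quote(part, safe="") for part in (project, task))
    url = f"{BASE_URL}/transform/tasks/{segments}/rule/{quote(rule_id, safe='')}"
    return Request(url, data=json.dumps(fragment).encode("utf-8"),
                   headers={"Content-Type": "application/json"}, method="PUT")


# Only the URI rule is sent; per the description above, other parts stay unchanged.
fragment = {"rules": {"uriRule": {"type": "uri", "pattern": "http://example.org/{PersonID}"}}}
req = rule_update_request("cmem", "personMapping", "root", fragment)
print(req.get_method(), req.full_url)
```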

DELETE

Deletes the rule that is identified by the given id.

Response

The expected response:

  • HTTP Status Code 200:

    • The rule has been removed successfully.
  • HTTP Status Code 404:

    • If no rule with the given identifier could be found.

Mapping Rule Children (/rules)

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}/rules

Valid HTTP methods are:

POST

Appends a new child rule to an object mapping rule.

Query Parameters

This method accepts the following query parameters:

  • afterRuleId:

    Optional parameter that specifies after which existing rule the new rule should be inserted.

    • type: (string)
Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "type": "direct",
        "sourcePath": "/source:name",
        "mappingTarget": {
          "uri": "target:name",
          "valueType": {
            "nodeType": "StringValueType"
          }
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The rule has been updated successfully. The appended rule is returned. In case the caller did not specify an identifier for the appended rule, the result will contain the generated identifier.

    • application/json:

      • example:

        {
          "type" : "direct",
          "id" : "directRule",
          "sourcePath" : "/source:name",
          "mappingTarget" : {
            "uri" : "target:name",
            "valueType" : {
              "nodeType" : "StringValueType"
            }
          }
        }
        
  • HTTP Status Code 400:

Copy Rule (/copyFrom)

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}/rules/copyFrom

Copies a rule from the source transformation task specified by the query parameters and inserts it into the given target transform task specified by the path parameters.

Valid HTTP methods are:

POST

Copy the source rule from the source project and task and insert it in the given location of the target mapping task.

Query Parameters

This method accepts the following query parameters:

  • sourceProject, required:

    The ID of the source project in the workspace that contains the source transform task from which a rule should be copied.

    • type: (string)
  • sourceTask, required:

    The ID of the source task the rule should be copied from.

    • type: (string)
  • sourceRule, required:

    The ID of the source rule that should be copied to the target transform task.

    • type: (string)
  • afterRuleId:

    Optional parameter that specifies after which existing rule the new rule should be inserted.

    • type: (string)

Reorder Mapping Child Rules (/reorder)

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}/rules/reorder

Valid HTTP methods are:

POST

Reorders all child rules of an object mapping.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      [
        "objectRule",
        "directRule"
      ]
      
Response

The expected response:

  • HTTP Status Code 200:

    • The rules have been successfully reordered. The new ordered list of rules is returned.

    • application/json:

      • example:

        [
          "objectRule",
          "directRule"
        ]
        
  • HTTP Status Code 400:
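
Because the endpoint reorders all child rules, a client may want to check that the new order is a permutation of the existing child rule ids before posting it. The sketch below does that check locally. Whether the server rejects incomplete lists is not stated here, so the check is a precaution under that assumption.

```python
import json


def reorder_payload(current_ids: list, new_order: list) -> str:
    """Return the JSON body for /reorder after checking it permutes the current ids."""
    if sorted(current_ids) != sorted(new_order):
        raise ValueError("new_order must contain exactly the existing child rule ids")
    return json.dumps(new_order)


# Rule ids taken from the documented example.
body = reorder_payload(["directRule", "objectRule"], ["objectRule", "directRule"])
print(body)
```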

Mapping Rule Source Paths (/completions/sourcePaths)

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}/completions/sourcePaths

Valid HTTP methods are:

GET

Returns all source paths that match the given term.

Query Parameters

This method accepts the following query parameters:

  • term:

    The search term. Will also return non-exact matches (e.g., naMe == name) and matches from labels.

    • type: (string)
  • maxResults:

    The maximum number of results. Defaults to 30.

    • type: (integer)- default value: 30
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [ {
          "value" : "source:age", // The value to be inserted into the textbox, if the user selects this suggestion
          "label" : "age", // Human-readable label. Never null, will be generated if not available
          "description" : null, // May be null, if not available
          "category" : "Source Paths" // Results should be grouped in categories
        }, {
          "value" : "source:name",
          "label" : "name",
          "description" : "Some description",
          "category" : "Source Paths"
        }, {
          "value" : "foaf:",
          "label" : "foaf:",
          "description" : null,
          "category" : "Prefixes"
        },...
        ]
        
  • HTTP Status Code 404:
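
The example notes that results should be grouped in categories. A minimal sketch of such grouping on the client side follows. The function name is illustrative and the sample reuses the values from the example response.

```python
from collections import defaultdict


def group_by_category(suggestions: list) -> dict:
    """Group completion values by their category, keeping the response order."""
    grouped = defaultdict(list)
    for suggestion in suggestions:
        grouped[suggestion["category"]].append(suggestion["value"])
    return dict(grouped)


suggestions = [
    {"value": "source:age", "label": "age", "description": None, "category": "Source Paths"},
    {"value": "source:name", "label": "name", "description": "Some description",
     "category": "Source Paths"},
    {"value": "foaf:", "label": "foaf:", "description": None, "category": "Prefixes"},
]
print(group_by_category(suggestions))
```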

Mapping Rule Target Types (/completions/targetTypes)

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}/completions/targetTypes

Valid HTTP methods are:

GET

Returns all types from the target vocabulary that match the given term.

Query Parameters

This method accepts the following query parameters:

  • term:

    The search term. Will also return non-exact matches (e.g., naMe == name) and matches from labels.

    • type: (string)
  • maxResults:

    The maximum number of results. Defaults to 30.

    • type: (integer)- default value: 30
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [ {
          "value" : "source:age", // The value to be inserted into the textbox, if the user selects this suggestion
          "label" : "age", // Human-readable label. Never null, will be generated if not available
          "description" : null, // May be null, if not available
          "category" : "Source Paths" // Results should be grouped in categories
        }, {
          "value" : "source:name",
          "label" : "name",
          "description" : "Some description",
          "category" : "Source Paths"
        }, {
          "value" : "foaf:",
          "label" : "foaf:",
          "description" : null,
          "category" : "Prefixes"
        },...
        ]
        
  • HTTP Status Code 404:

Mapping Rule Target Properties (/completions/targetProperties)

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}/completions/targetProperties

Valid HTTP methods are:

GET

Returns all properties from the target vocabulary that match the given term.

Query Parameters

This method accepts the following query parameters:

  • term:

    The search term. Will also return non-exact matches (e.g., naMe == name) and matches from labels.

    • type: (string)
  • maxResults:

    The maximum number of results. Defaults to 30.

    • type: (integer)- default value: 30
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [ {
          "value" : "source:age", // The value to be inserted into the textbox, if the user selects this suggestion
          "label" : "age", // Human-readable label. Never null, will be generated if not available
          "description" : null, // May be null, if not available
          "category" : "Source Paths" // Results should be grouped in categories
        }, {
          "value" : "source:name",
          "label" : "name",
          "description" : "Some description",
          "category" : "Source Paths"
        }, {
          "value" : "foaf:",
          "label" : "foaf:",
          "description" : null,
          "category" : "Prefixes"
        },...
        ]
        
  • HTTP Status Code 404:

Mapping Rule Value Source Paths (/valueSourcePaths)

Full URL: /transform/tasks/{project}/{transformationTask}/rule/{rule}/valueSourcePaths

Valid HTTP methods are:

GET

Fetch all value source paths relative to the corresponding rule. The format is in the Silk path language.

Query Parameters

This method accepts the following query parameters:

  • maxDepth:

    Limit the depth of the source paths. For example a value of 1 would only return value source paths with exactly one path operator.

    • type: (integer)- default value: no default
  • unusedOnly:

    If this is set to true, only source paths that are not used in any rule so far are returned. Considered rules for filtering are only value rules and complex mappings with a single source path.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • A list of source paths serialized with prefixed URIs.

    • application/json:

      • example:

        ["ID","Properties/Property","Name","Events/@count","Events/Birth","Events/Death","Properties/Property/Key","Properties/Property/Value"]
        

Execute Transform Task with Payload (/transformInput)

Full URL: /transform/tasks/{project}/{transformationTask}/transformInput

Valid HTTP methods are:

POST

Execute a specific transformation task against input data from the POST body.

Body

This method accepts the following body payloads:

  • application/xml:

    • example:

      
        
          
             
               
               
              
            
          
        
        
          {
            ...
          }
        
      
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/n-triples:

      • example:

          "John Doe"@en .
        

Mapping Rule Transformation Examples (/peak/{rule})

Full URL: /transform/tasks/{project}/{transformationTask}/peak/{rule}

URI parameters:

  • rule, required:

    • type: (string)

Valid HTTP methods are:

POST

Get transformation examples for the selected transformation rule. The input task of the transformation task has to be a Dataset task. Also the Dataset task must support this feature.

Query Parameters

This method accepts the following query parameters:

  • limit:

    The maximum number of transformed example entities.

    • type: (integer)- default value: 3
  • maxTryEntities:

    The maximum number of example entities to try to transform before giving up.

    • type: (integer)- default value: 553
Response

The expected response:

  • HTTP Status Code 200:

    • The result JSON consists of the actual values (all source values and all transformed values for each example entity), the source paths of the mapping, and a status object. The sourceValues array contains one string array per input path. Besides ‘success’, there are two other status ids: ‘empty’ and ‘empty with exceptions’. In both cases, the status message gives more details.

    • application/json:

      • example:

        {
          "results": [
            {
              "sourceValues": [
                [
                  "Olaf",
                  "Ralf"
                ],
                [
                  "M\u00fcller",
                  "Schmidt"
                ]
              ],
              "transformedValues": [
                " Olaf  M%C3%BCller",
                " Olaf  Schmidt",
                " Ralf  M%C3%BCller",
                " Ralf  Schmidt"
              ]
            }
          ],
          "sourcePaths": [
            [
              "/"
            ],
            [
              "/"
            ]
          ],
          "status": {
            "id": "success",
            "msg": ""
          }
        }
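
Since the status object distinguishes ‘success’ from the two empty cases, a client will usually check it before reading the results. The following sketch assumes the parsed JSON has the shape shown above. The function name, the error type and the sample message are illustrative choices.

```python
def transformed_examples(peak_response: dict) -> list:
    """Return the transformed values per example entity, or raise if none were produced."""
    status = peak_response["status"]
    if status["id"] != "success":
        # 'empty' and 'empty with exceptions' carry details in the message.
        raise RuntimeError(f"No examples ({status['id']}): {status['msg']}")
    return [result["transformedValues"] for result in peak_response["results"]]


ok = {
    "results": [{"sourceValues": [["Olaf"], ["Schmidt"]],
                 "transformedValues": [" Olaf  Schmidt"]}],
    "sourcePaths": [["/"], ["/"]],
    "status": {"id": "success", "msg": ""},
}
print(transformed_examples(ok))
```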
        

Mapping Rule from Request Transformation Examples (/childRule)

Full URL: /transform/tasks/{project}/{transformationTask}/peak/{rule}/childRule

Valid HTTP methods are:

POST

Get transformation examples for the transformation rule that is attached in the body of this request. The input task of the transformation task has to be a Dataset task. Also the Dataset task must support this feature.

Query Parameters

This method accepts the following query parameters:

  • limit:

    The maximum number of transformed example entities.

    • type: (integer)- default value: 3
  • maxTryEntities:

    The maximum number of example entities to try to transform before giving up.

    • type: (integer)- default value: 553
Response

The expected response:

  • HTTP Status Code 200:

    • The result JSON consists of the actual values (all source values and all transformed values for each example entity), the source paths of the mapping, and a status object. The sourceValues array contains one string array per input path. Besides ‘success’, there are two other status ids: ‘empty’ and ‘empty with exceptions’. In both cases, the status message gives more details.

    • application/json:

      • example:

        {
          "results": [
            {
              "sourceValues": [
                [
                  "Olaf",
                  "Ralf"
                ],
                [
                  "M\u00fcller",
                  "Schmidt"
                ]
              ],
              "transformedValues": [
                " Olaf  M%C3%BCller",
                " Olaf  Schmidt",
                " Ralf  M%C3%BCller",
                " Ralf  Schmidt"
              ]
            }
          ],
          "sourcePaths": [
            [
              "/"
            ],
            [
              "/"
            ]
          ],
          "status": {
            "id": "success",
            "msg": ""
          }
        }
        

Target Vocabulary Type (/type)

Full URL: /transform/tasks/{project}/{transformationTask}/targetVocabulary/type

Valid HTTP methods are:

GET

Retrieves information about a type from the target vocabularies.

Query Parameters

This method accepts the following query parameters:

  • uri, required:

    The URI of the type. May be a prefixed name.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "genericInfo" : {
            "uri" : "foaf:Person",
            "label" : "Person",
            "description" : "A person."
          },
          "parentClasses" : [ "http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing", "foaf:Agent" ]
        }
        
  • HTTP Status Code 404:

    • If no type with the given URI could be found in any of the target vocabularies.

Target Vocabulary Property (/property)

Full URL: /transform/tasks/{project}/{transformationTask}/targetVocabulary/property

Valid HTTP methods are:

GET

Retrieves information about a property from the target vocabularies.

Query Parameters

This method accepts the following query parameters:

  • uri, required:

    The URI of the property. May be a prefixed name.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "genericInfo" : {
            "uri" : "foaf:name",
            "label" : "name",
            "description" : "A name for some thing."
          },
          "domain" : "owl:Thing",
          "range" : "rdfs:Literal"
        }
        
  • HTTP Status Code 404:

    • If no property with the given URI could be found in any of the target vocabularies.

Target Vocabulary Type or Property (/typeOrProperty)

Full URL: /transform/tasks/{project}/{transformationTask}/targetVocabulary/typeOrProperty

Valid HTTP methods are:

GET

Retrieves information about a type or a property from the target vocabularies. This endpoint can be used if it is not known whether the given URI represents a type or a property. Otherwise, the /type and /property endpoints should be preferred.

Query Parameters

This method accepts the following query parameters:

  • uri, required:

    The URI of the type or property. May be a prefixed name.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "genericInfo" : {
            "uri" : "foaf:name",
            "label" : "name",
            "description" : "A name for some thing."
          },
          "domain" : "owl:Thing",
          "range" : "rdfs:Literal"
        }
        
  • HTTP Status Code 404:

    • If no type or property with the given URI could be found in any of the target vocabularies.
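
Choosing between the three lookup endpoints could be sketched as follows. This is only an illustration: the helper name target_vocabulary_url, the base URL and the project and task identifiers are assumptions, and authentication is left out.

```python
from urllib.parse import quote, urlencode


def target_vocabulary_url(base_url, project, task, uri, kind=None):
    """Build the lookup URL for a type, a property, or an unknown kind.

    kind is 'type' or 'property'; any other value falls back to the
    /typeOrProperty endpoint.
    """
    endpoint = kind if kind in ("type", "property") else "typeOrProperty"
    path = "/transform/tasks/{}/{}/targetVocabulary/{}".format(
        quote(project, safe=""), quote(task, safe=""), endpoint
    )
    # The required 'uri' query parameter may be a prefixed name.
    return base_url.rstrip("/") + path + "?" + urlencode({"uri": uri})


# Hypothetical host, project and task names.
type_url = target_vocabulary_url(
    "http://localhost:8080", "myProject", "myTransform", "foaf:Person", "type"
)
unknown_url = target_vocabulary_url(
    "http://localhost:8080", "myProject", "myTransform", "foaf:name"
)
print(type_url)
print(unknown_url)
```

A 404 response from any of these URLs means that the URI was not found in any of the target vocabularies.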

Target Vocabulary Type Suggestions (/typeSuggestions)

Full URL: /transform/tasks/{project}/{transformationTask}/targetVocabulary/typeSuggestions

Valid HTTP methods are:

GET

Get auto-complete suggestions for target types usually coming from the vocabulary cache or from the matching candidate cache.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
            {
                "category": "MatchingCandidateCache",
                "description": null,
                "isCompletion": true,
                "label": "Loan",
                "value": "https://vocab.eccenca.com/testTarget/Loan"
            }
        ]
        
  • HTTP Status Code 404:

Target Vocabulary Properties by Class (/propertiesByClass)

Full URL: /transform/tasks/{project}/{transformationTask}/targetVocabulary/propertiesByClass

Valid HTTP methods are:

GET

Get all properties that the given class or any of its parent classes is the domain of in the corresponding target vocabulary.

Query Parameters

This method accepts the following query parameters:

  • classUri, required:

    The URI of the class from the target vocabulary.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
            {
                "domain": "https://vocab.eccenca.com/testTarget/Loan",
                "genericInfo": {
                    "URI": "https://vocab.eccenca.com/testTarget/zipCode",
                    "label": "zip code"
                }
            },
            {
                "domain": "https://vocab.eccenca.com/testTarget/Loan",
                "genericInfo": {
                    "URI": "https://vocab.eccenca.com/testTarget/volume",
                    "label": "volume"
                }
            }
        ]
        
  • HTTP Status Code 404:

Target Vocabulary Object Properties by Class (/relationsOfClass)

Full URL: /transform/tasks/{project}/{transformationTask}/targetVocabulary/relationsOfClass

Valid HTTP methods are:

GET

Get all direct relations of a class or one of its parent classes to other classes from the vocabulary.

Query Parameters

This method accepts the following query parameters:

  • classUri, required:

    The URI of the class from the target vocabulary.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
            "backwardRelations": [
                {
                    "property": {
                        "domain": "https://vocab.eccenca.com/testTarget/Person",
                        "genericInfo": {
                            "URI": "https://vocab.eccenca.com/testTarget/hasLoan",
                            "label": "hasLoan"
                        },
                        "range": "https://vocab.eccenca.com/testTarget/Loan"
                    },
                    "targetClass": {
                        "genericInfo": {
                            "URI": "https://vocab.eccenca.com/testTarget/Loan",
                            "description": "Loans of customers",
                            "label": "Loan"
                        },
                        "parentClasses": []
                    }
                }
            ],
            "forwardRelations": [
                {
                    "property": {
                        "domain": "https://vocab.eccenca.com/testTarget/Loan",
                        "genericInfo": {
                            "URI": "https://vocab.eccenca.com/testTarget/lendTo",
                            "label": "lend to"
                        },
                        "range": "https://vocab.eccenca.com/testTarget/Person"
                    },
                    "targetClass": {
                        "genericInfo": {
                            "URI": "https://vocab.eccenca.com/testTarget/Loan",
                            "description": "Loans of customers",
                            "label": "Loan"
                        },
                        "parentClasses": []
                    }
                }
            ]
        }
        
  • HTTP Status Code 404:
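
The forward and backward relations of such a response can be flattened into one list for display. A minimal sketch, assuming the response has already been parsed from JSON; the function name list_relations is hypothetical.

```python
def list_relations(response):
    """Return (direction, property URI, domain, range) tuples."""
    rows = []
    for direction, key in (
        ("forward", "forwardRelations"),
        ("backward", "backwardRelations"),
    ):
        for relation in response.get(key, []):
            prop = relation["property"]
            rows.append(
                (
                    direction,
                    prop["genericInfo"]["URI"],
                    prop.get("domain"),
                    prop.get("range"),
                )
            )
    return rows


ns = "https://vocab.eccenca.com/testTarget/"
# Shortened version of the example response shown above.
example = {
    "backwardRelations": [
        {
            "property": {
                "domain": ns + "Person",
                "genericInfo": {"URI": ns + "hasLoan", "label": "hasLoan"},
                "range": ns + "Loan",
            }
        }
    ],
    "forwardRelations": [
        {
            "property": {
                "domain": ns + "Loan",
                "genericInfo": {"URI": ns + "lendTo", "label": "lend to"},
                "range": ns + "Person",
            }
        }
    ],
}

relations = list_relations(example)
for row in relations:
    print(row)
```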

Transform Task Output (/downloadOutput)

Full URL: /transform/tasks/{project}/{transformationTask}/downloadOutput

Valid HTTP methods are:

GET

Downloads the contents of the first output dataset of the transform task. Note that this does not execute the transform, but assumes that the transform has been executed already. The output dataset must be file-based.

Response

The expected response:

  • HTTP Status Code 200:

  • HTTP Status Code 400:

    • If the output could not be downloaded.

Workflow API (/workflow)

The Workflow API provides read, write and execute access to the workflows. As input data source types all file resource based dataset plugins are allowed, e.g. csv, xml, json, file. As data sinks only those dataset plugins are allowed that have a writable file resource, e.g. csv and file.

Project Workflows (/workflows/{project})

Full URL: /workflow/workflows/{project}

List of project workflow IDs

URI parameters:

  • project, required:

    • type: (string)

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        ["workflow_1","workflow_2"]
        
  • HTTP Status Code 404:

Workflow Task (/{task})

Full URL: /workflow/workflows/{project}/{task}

REST representation of the workflow task.

URI parameters:

  • task, required:

    • type: (string)

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/xml:

      • example:

        
        
        
        
        
  • HTTP Status Code 404:

PUT

DELETE

Execute Workflow with Request Payload (/executeOnPayload)

Full URL: /workflow/workflows/{project}/{task}/executeOnPayload

Execute a variable workflow that gets the inputs for variable data sources with the HTTP request and returns all results of variable data sinks with the HTTP response. This endpoint will block until the workflow execution finished. Use executeOnPayloadAsynchronous for non-blocking execution.

Valid HTTP methods are:

POST

At the moment the file parameter follows the convention that the file name must be the dataset name plus the string “_file_resource”, e.g. dataset name “dOutput” with file parameter value “dOutput_file_resource”. This convention is not needed for data sources. It is possible to reference project resource files. In order to use existing resources and not provide them via the REST request, no resource element with the same name should be in the XML payload. Then, if the project resource with the value given for the file parameter exists, this is used instead.

Body

This method accepts the following body payloads:

  • application/xml:

    • example:

      
        
          
            
              
              
            
          
          
            
              
              
            
          
        
        
          
            
              
              
            
          
        
        
          <https://www.fuhsen.net/resource/xing/person/123456_abcdef> <http://www.w3.org/2000/01/rdf-schema#label>
          "Max Mustermann" .
        
        
          <https://www.fuhsen.net/resource/xing/person/654321_abcdef> <http://www.w3.org/2000/01/rdf-schema#label>
          "mustermann, max" .
        
      
      
  • application/json:

    • example:

      {
        "DataSources": [
          {
            "id": "inputDataset",
            "data": {
              "taskType": "Dataset",
              "type": "json",
              "parameters": {
                "file": "test_file_resource"
              }
            }
          }
        ],
        "Sinks": [
          {
            "id": "outputDataset",
            "data": {
              "taskType": "Dataset",
              "type": "file",
              "parameters": {
                "file": "outputResource",
                "format": "N-Triples"
              }
            }
          }
        ],
        "Resources": {
          "test_file_resource": [
            {"id":"1"},
            {"id":"2" }
          ]
        }
      }
Response

The expected response:

  • HTTP Status Code 200:

    • application/xml:

      • example:

        
          
            <https://www.fuhsen.net/resource/xing/person/123456_abcdef> <http://www.w3.org/2002/07/owl#sameAs>
            <https://www.fuhsen.net/resource/xing/person/654321_abcdef> .
          
        
        
    • application/json:

      • example:

        [
          {
            "sinkId": "outputDataset",
            "textContent": "  \"1\"^^ .\n  \"2\"^^ ."
          }
        ]
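
A JSON request for this endpoint could be prepared as in the following sketch. It only builds the request object and does not send it. The base URL, the project and workflow names and the helper names are assumptions, and authentication (OAuth 2.0 by default) is omitted.

```python
import json
import urllib.request


def execute_on_payload_request(base_url, project, task, body):
    """Prepare (but do not send) a blocking executeOnPayload request."""
    url = "{}/workflow/workflows/{}/{}/executeOnPayload".format(
        base_url.rstrip("/"), project, task
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def sink_contents(response_json):
    """Map each sink id of a JSON response to its text content."""
    return {entry["sinkId"]: entry["textContent"] for entry in response_json}


# Same structure as the JSON body example above.
body = {
    "DataSources": [
        {
            "id": "inputDataset",
            "data": {
                "taskType": "Dataset",
                "type": "json",
                "parameters": {"file": "test_file_resource"},
            },
        }
    ],
    "Sinks": [
        {
            "id": "outputDataset",
            "data": {
                "taskType": "Dataset",
                "type": "file",
                "parameters": {"file": "outputResource", "format": "N-Triples"},
            },
        }
    ],
    "Resources": {"test_file_resource": [{"id": "1"}, {"id": "2"}]},
}

request = execute_on_payload_request(
    "http://localhost:8080", "myProject", "myWorkflow", body
)
print(request.get_method(), request.full_url)
```

Since the endpoint blocks until the workflow has finished, a client would typically send such a request with a generous timeout.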

Execute Workflow with Request Payload Asynchronously (/executeOnPayloadAsynchronous)

Full URL: /workflow/workflows/{project}/{task}/executeOnPayloadAsynchronous

Execute a variable workflow that gets the inputs for variable data sources with the HTTP request and returns all results of variable data sinks with the HTTP response. This endpoint starts the workflow execution in the background and returns the identifier of the started background activity. Use the activity API to query for its execution status and result, e.g., /workspace/projects/{projectId}/tasks/{taskId}/activities/{BackgroundActivityID}/{value or status}. After having consumed the result value, a well-behaved client should remove the execution instance via the /execution/{executionId} endpoint.

Valid HTTP methods are:

POST

At the moment the file parameter follows the convention that the file name must be the dataset name plus the string “_file_resource”, e.g. dataset name “dOutput” with file parameter value “dOutput_file_resource”. This convention is not needed for data sources. It is possible to reference project resource files. In order to use existing resources and not provide them via the REST request, no resource element with the same name should be in the XML payload. Then, if the project resource with the value given for the file parameter exists, this is used instead.

Body

This method accepts the following body payloads:

  • application/xml:

    • example:

      
        
          
            
              
              
            
          
          
            
              
              
            
          
        
        
          
            
              
              
            
          
        
        
          <https://www.fuhsen.net/resource/xing/person/123456_abcdef> <http://www.w3.org/2000/01/rdf-schema#label>
          "Max Mustermann" .
        
        
          <https://www.fuhsen.net/resource/xing/person/654321_abcdef> <http://www.w3.org/2000/01/rdf-schema#label>
          "mustermann, max" .
        
      
      
  • application/json:

    • example:

      {
        "DataSources": [
          {
            "id": "inputDataset",
            "data": {
              "taskType": "Dataset",
              "type": "json",
              "parameters": {
                "file": "test_file_resource"
              }
            }
          }
        ],
        "Sinks": [
          {
            "id": "outputDataset",
            "data": {
              "taskType": "Dataset",
              "type": "file",
              "parameters": {
                "file": "outputResource",
                "format": "N-Triples"
              }
            }
          }
        ],
        "Resources": {
          "test_file_resource": [
            {"id":"1"},
            {"id":"2" }
          ]
        }
      }
Response

The expected response:

  • HTTP Status Code 201:

  • Location:

    • type: (string)

    • example:

      /workflow/workflows/projectName/workflowName/execution/ExecuteWorkflowWithPayload14
    • application/json:

      • example:

        {
          "activityId": "BackgroundActivityID"
        }
        

Workflow Execution Instance (/execution/{executionId})

Full URL: /workflow/workflows/{project}/{task}/execution/{executionId}

A workflow execution instance which was started via the /executeOnPayloadAsynchronous endpoint.

URI parameters:

  • executionId, required:

    • type: (string)

Valid HTTP methods are:

DELETE

Remove the workflow execution instance. Since only a limited number of executions are kept at any time, a well-behaved client should remove the execution once it has consumed the result.

Response

The expected response:

  • HTTP Status Code 200:
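
The life cycle of an asynchronous execution (start, query the activity, remove the execution instance) can be sketched with a few URL helpers. All function names are hypothetical, and the example assumes that the last segment of the Location header value is the executionId, as the path pattern of this endpoint suggests.

```python
from urllib.parse import quote


def handle_async_start(status_code, headers, body):
    """Extract the activity id and the Location header from a 201 response."""
    if status_code != 201:
        raise RuntimeError("workflow execution was not started: %s" % status_code)
    return body["activityId"], headers["Location"]


def activity_urls(project, task, activity_id):
    """Paths of the activity API for querying status and result value."""
    base = "/workspace/projects/{}/tasks/{}/activities/{}".format(
        quote(project, safe=""), quote(task, safe=""), quote(activity_id, safe="")
    )
    return {"status": base + "/status", "value": base + "/value"}


def execution_path(project, task, execution_id):
    """Path to send DELETE to once the result has been consumed."""
    return "/workflow/workflows/{}/{}/execution/{}".format(
        quote(project, safe=""), quote(task, safe=""), quote(execution_id, safe="")
    )


# Values taken from the response example of /executeOnPayloadAsynchronous.
activity_id, location = handle_async_start(
    201,
    {
        "Location": "/workflow/workflows/projectName/workflowName"
        "/execution/ExecuteWorkflowWithPayload14"
    },
    {"activityId": "BackgroundActivityID"},
)
urls = activity_urls("projectName", "workflowName", activity_id)
delete_path = execution_path(
    "projectName", "workflowName", location.rsplit("/", 1)[-1]
)
print(urls["status"], urls["value"], delete_path)
```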

Workflow Editor (/editor/{project}/{task})

Full URL: /workflow/editor/{project}/{task}

The workflow editor user interface.

URI parameters:

  • project, required:

    • type: (string)
  • task, required:

    • type: (string)

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • text/html:
  • HTTP Status Code 404:

Workflow Extensions API (/workflowExt)

Workflow Extensions are services that are not dealing with the typical management of workflows, but offer additional features like workflow generation.

Generate Merge Workflow (/generate)

Full URL: /workflowExt/workflows/{project}/generate

Generates a workflow automatically, based on the selected tasks and configuration, that merges the output of several tasks into a single dataset. The request data structure contains the selected tasks that should be merged, usually transform tasks. The workflowId of the workflow must be specified and must be an unused identifier in the project. The pivot task that all other tasks’ outputs are merged into is defined by startTask. Only tasks that are directly linked to the pivot task can be merged. joinTasks contains pairs of tasks and defines the linking direction, the first entry being the source task and the second being the target task of the linking. For each pair a matching linking task will be retrieved and used to join the two tasks. outputDatasetId specifies the project dataset that the merge result of the generated workflow should be written to. For each selected task an optional list of selected properties can be defined that should be represented in the output. If the list is missing, all properties from this task will be selected. An optional targetPropertyOpt may rewrite the URI of the target property.

Valid HTTP methods are:

POST

The selectedProperties property of each entry in the selectedTasks array is optional. When it is left out, all properties are selected. The targetPropertyOpt property, which rewrites the URI of the target property, is also optional.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "workflowId":"WORKFLOW_ID",
        "startTask":"START_TASK_ID",
        "selectedTasks":[
          {"taskId":"START_TASK_ID",
            "selectedProperties":[
              {"propertyUri":"http://someOntology.com/ont/prop/state"},
              {"propertyUri":"http://someOntology.com/ont/prop/issueDate",
                "targetPropertyOpt":"http://otherOntology.com/prop/issue_date"},
              {"propertyUri":"http://someOntology.com/ont/prop/volume"}
          ]},
          {"taskId":"MERGE_TASK_ID"}
        ],
        "joinTasks":[{"sourceTask":"START_TASK_ID","targetTask":"MERGE_TASK_ID"}],
        "outputDatasetId":"OUTPUT_DATASET_ID"
      }
Response

The expected response:

  • HTTP Status Code 201:

    • The workflow was generated.
  • Location, required:

    Absolute path of the generated workflow.

    • type: (string)

    • example:

      /workflow/workflows/project_id/workflow_id
  • HTTP Status Code 500:

    • Selected property does not exist.

    • application/json:

      • example:

        {"message":"Selected task transform_loans has no output property http://data.eccenca.com/lendingclub/doesNotExist!"}
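
The request body described above could be assembled as in the following sketch. The helper name merge_workflow_body is hypothetical, and deriving the selectedTasks list from the pivot task and the join pairs is a convenience of this example, not a requirement of the API.

```python
def merge_workflow_body(workflow_id, start_task, join_tasks, output_dataset_id,
                        selected_properties=None):
    """Build the JSON body for the /generate endpoint.

    join_tasks: list of (sourceTask, targetTask) pairs.
    selected_properties: optional dict mapping a task id to a list of
    (propertyUri, targetPropertyOpt or None) pairs; tasks without an
    entry get no selectedProperties, i.e. all properties are selected.
    """
    selected_properties = selected_properties or {}
    # Collect every task mentioned, pivot task first, in first-seen order.
    task_ids = [start_task]
    for source, target in join_tasks:
        for task_id in (source, target):
            if task_id not in task_ids:
                task_ids.append(task_id)
    selected_tasks = []
    for task_id in task_ids:
        entry = {"taskId": task_id}
        if task_id in selected_properties:
            props = []
            for property_uri, target_property in selected_properties[task_id]:
                prop = {"propertyUri": property_uri}
                if target_property is not None:
                    prop["targetPropertyOpt"] = target_property
                props.append(prop)
            entry["selectedProperties"] = props
        selected_tasks.append(entry)
    return {
        "workflowId": workflow_id,
        "startTask": start_task,
        "selectedTasks": selected_tasks,
        "joinTasks": [{"sourceTask": s, "targetTask": t} for s, t in join_tasks],
        "outputDatasetId": output_dataset_id,
    }


generate_body = merge_workflow_body(
    "WORKFLOW_ID",
    "START_TASK_ID",
    [("START_TASK_ID", "MERGE_TASK_ID")],
    "OUTPUT_DATASET_ID",
    {
        "START_TASK_ID": [
            ("http://someOntology.com/ont/prop/state", None),
            ("http://someOntology.com/ont/prop/issueDate",
             "http://otherOntology.com/prop/issue_date"),
        ]
    },
)
print(generate_body)
```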

Generated Workflow (/generatedWorkflow/{workflowTask})

Full URL: /workflowExt/workflows/{project}/generatedWorkflow/{workflowTask}

This endpoint does not only affect the corresponding workflow task, but also all tasks that were generated with this workflow during the /generate REST endpoint call.

URI parameters:

  • workflowTask, required:

    • type: (string)

Valid HTTP methods are:

DELETE

Delete a generated workflow and all its generated tasks, e.g. Merge and projection transform tasks.

Response

The expected response:

  • HTTP Status Code 200:

  • HTTP Status Code 404:

Resource Preview API (/resources)

The Resource Preview API provides a content preview of a given resource. It serves two purposes. The first is to act as a kind of file browser backend: for a given project and resource, the service searches in the defined resource repository and tries to determine the type or content type of the resource in order to create a preview from the first 1000 bytes (text, xml, …) or the first 100 records (orc, avro, csv, databases, …). The second purpose is to provide a content preview for already known resources (i.e. all registered datasets and not only files). In this case the datasetInfo must be set in the preview request and is used to instantiate a Dataset subclass that is used directly to generate the preview. See also the readme.md file in the resource-preview plugin folder.

Resource Preview (/preview)

Full URL: /resources/preview

Get a content preview of the specified resource in the specified project. For structured resources, the previewContent.attributes property is an array of attribute identifiers. Usually they are either absolute IRIs of the form ‘http://someIRI’ or attribute names like ‘first_name’. For data sources that also return backward paths, e.g. RDF, a ‘\’ may be prepended to the URI or attribute name.

Valid HTTP methods are:

POST

Returns a resource preview of either a full dataset description or of a resource file. Either the ‘datasetInfo’ or ‘resource’ property must be defined in the request payload.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "resource":"resource name",    // If datasetInfo is given this will be ignored, else a preview for that resource is generated.
        "project":"project name",      // required
        "datasetInfo": {               // optional for "config-less" resource preview, non-optional for a preview of datasets without resource, e.g. Database, Triplestore, Spark-View
            "id": "task-identifier",
            "type": "csv",
            "parameters": {
                "arraySeparator": "",
                "separator": ",",
                "prefix": "",
                "uri": "",
                "ignoreBadLines": "false",
                "quote": "\"",
                "properties": "URI,date,state,unemployRate,volume",
                "regexFilter": "",
                "charset": "UTF-8",
                "file": "test.csv",
                "linesToSkip": "1",
                "maxCharsPerColumn": "4096"
            }
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {   // structured resources
            "datasetInfo": {
                "id": "task-identifier",
                "type": "csv",
                "parameters": {
                    "arraySeparator": "",
                    "separator": ",",
                    "prefix": "",
                    "uri": "",
                    "ignoreBadLines": "false",
                    "quote": "\"",
                    "properties": "URI,date,state,unemployRate,volume",
                    "regexFilter": "",
                    "charset": "UTF-8",
                    "file": "test.csv",
                    "linesToSkip": "1",
                    "maxCharsPerColumn": "4096"
                }
            },
            "previewType": "structured",
            "previewContent": {
                "attributes":["attribute1", "type", "label", "xxx"],
                "values": [
                    ["v","z", "x", "y"],
                    ["1", "2", "3", "4"]
                ]
            }
        }
        {   // un- or semi-structured resources
            "datasetInfo": {
                "id": "task-identifier",
                "type": "csv",
                "parameters": {
                    "prefix": "",
                    "uri": "",
                    "charset": "UTF-8",
                    "file": "test.xml",
                    "linesToSkip": "1",
                    "maxCharsPerColumn": "4096"
                }
            },
            "previewType": "unstructured",
            "previewContent": {
                "size": 1000,
                "text": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ... xxxxxx"
            }
        }
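
Building the request payload and distinguishing the two kinds of preview responses could look like the following sketch. The helper names are hypothetical. The payload builder mirrors the rule that either ‘datasetInfo’ or ‘resource’ must be given and that ‘resource’ is ignored when ‘datasetInfo’ is present.

```python
def resource_preview_payload(project, resource=None, dataset_info=None):
    """Build the JSON body for POST /resources/preview."""
    if resource is None and dataset_info is None:
        raise ValueError("either 'datasetInfo' or 'resource' must be given")
    payload = {"project": project}
    if dataset_info is not None:
        # A given datasetInfo takes precedence, so 'resource' is left out.
        payload["datasetInfo"] = dataset_info
    else:
        payload["resource"] = resource
    return payload


def describe_preview(response):
    """Return a one-line description of a preview response."""
    content = response.get("previewContent", {})
    if response.get("previewType") == "structured":
        return "structured: {} attributes, {} rows".format(
            len(content.get("attributes", [])), len(content.get("values", []))
        )
    return "unstructured: {} bytes of text".format(content.get("size", 0))


file_payload = resource_preview_payload("project name", resource="test.csv")
structured = describe_preview(
    {
        "previewType": "structured",
        "previewContent": {
            "attributes": ["attribute1", "type", "label", "xxx"],
            "values": [["v", "z", "x", "y"], ["1", "2", "3", "4"]],
        },
    }
)
unstructured = describe_preview(
    {"previewType": "unstructured", "previewContent": {"size": 1000, "text": "xxx"}}
)
print(file_payload, structured, unstructured)
```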
        

eccenca Dataset Preview (/datasets)

The Dataset Preview API provides a content preview of a given dataset. The dataset must already be registered. If a preview of unregistered data is needed, the Resource Preview API can be used. The main difference from the Resource Preview API is the simpler usage, with only the dataset name and project name as input. As opposed to the Resource Preview, the only type of preview is a structured preview, since the datasets are already registered and their parameters and parsing instructions are known. Returns 404 for non-registered datasets and an empty previewContent for datasets that are registered but empty or not available for reading. See also the readme.md file in the dataset-preview plugin folder.

Dataset Preview (/preview)

Full URL: /datasets/preview

Get a content preview of the specified dataset in the specified project. For structured resources, the previewContent.attributes property is an array of attribute identifiers. Usually they are either absolute IRIs of the form ‘http://someIRI’ or attribute names like ‘first_name’. For data sources that also return backward paths, e.g. RDF, a ‘\’ may be prepended to the URI or attribute name.

Valid HTTP methods are:

POST

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "project": "project name",
        "dataset": "dataset name",
        "typeUri": "type URI (optional)"
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {   "datasetInfo": {
                "id": "task-identifier",
                "type": "csv",
                "parameters": {
                    "arraySeparator": "",
                    "separator": ",",
                    "prefix": "",
                    "uri": "",
                    "ignoreBadLines": "false",
                    "quote": "\"",
                    "properties": "URI,date,state,unemployRate,volume",
                    "regexFilter": "",
                    "charset": "UTF-8",
                    "file": "test.csv",
                    "linesToSkip": "1",
                    "maxCharsPerColumn": "4096"
                }
            },
            "previewType": "structured",
            "previewContent": {
                "attributes":["attribute1", "type", "label", "xxx"],
                "values": [
                    ["v","z", "x", "y"],
                    ["1", "2", "3", "4"]
                ]
            }
        }
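
A structured preview pairs each row in previewContent.values with the names in previewContent.attributes. A minimal sketch of turning it into records; the function name preview_rows is an assumption.

```python
def preview_rows(response):
    """Convert a structured preview into a list of attribute/value dicts.

    An empty or missing previewContent (dataset registered but empty or
    not readable) yields an empty list.
    """
    content = response.get("previewContent") or {}
    attributes = content.get("attributes", [])
    return [dict(zip(attributes, row)) for row in content.get("values", [])]


rows = preview_rows(
    {
        "previewType": "structured",
        "previewContent": {
            "attributes": ["attribute1", "type", "label", "xxx"],
            "values": [["v", "z", "x", "y"], ["1", "2", "3", "4"]],
        },
    }
)
print(rows)
```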
        

Ontology Matching API (/ontologyMatching)

The Ontology Matching API provides endpoints related to ontology matching between datasets and vocabularies and vice versa.

Match Dataset to Vocabulary Class (/matchVocabularyClassDataset)

Full URL: /ontologyMatching/matchVocabularyClassDataset

Matches properties of classes from a target vocabulary of a transformation task to one or more property candidates from the source dataset schema. This is therefore a target-vocabulary-focused matching operation against the source dataset. It would be used, for example, if the caller wants to restrict the part of the vocabulary that should be matched and to provide several candidates for each target property. For the given class URIs, only those target properties are considered in the matching process that have one of the classes or one of its parent classes defined as domain.

Valid HTTP methods are:

POST

The request payload contains the name of the project in which both the dataset and the transformation task are located. Furthermore, the target classes from one of the target vocabularies specified in the transformation task must be defined. The number of candidates (nrCandidates) is the maximum number of property suggestions from the dataset made per vocabulary property. The dataTypePropertiesOnly parameter is optional and defaults to true. It means that only properties that have no range of any of the classes from the vocabulary are returned. Note that this is not the same as being an owl:DatatypeProperty. The uriPrefix parameter is optional and is used during the schema extraction of the dataset to be matched. This is usually not important since the source path is returned and not a source property URI. However, this may be important if additional schema data is attached to the extracted schema data. The ruleId parameter is optional and defaults to the root rule. This should be set if the matching should be run in the context of a nested mapping rule and thus a possibly changed source path/type. The matchFromDataset parameter specifies the source of the matching task, i.e. for each source 0 to nrCandidates matches should be found. If set to true then the dataset is the source, else the vocabulary. The candidates returned in the response are always structured such that the source entity is the key and target entities are represented by an array of candidates. The ‘schemaExtractionTimeLimit’ specifies the maximum amount of time in milliseconds to spend on the schema extraction. The ‘schemaEntityLimit’ is the maximum number of schema elements that will be extracted from the dataset for the matching.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "projectName": "$project",
        "transformTaskName": "$sourceTransform",
        "targetClassUris": ["https://vocab.eccenca.com/testTarget/SubLoan"],
        "dataTypePropertiesOnly": true,
        "nrCandidates": 3,
        "uriPrefix": "http://eccenca.com/dataset1/prop/",
        "ruleId": "root",
        "schemaExtractionTimeLimit": 10000,
        "schemaEntityLimit": 100000,
        "matchFromDataset": true
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The matching has finished without errors. Per target property there are 1 to nrCandidates candidates sorted by decreasing confidence. The ‘type’ property suggests whether this should become a value/data type mapping rule or an object rule. It can have the values ‘object’ and ‘value’. These values are also used in the rule generator endpoint. ‘uri’ is the source path of the data source and is only historically called URI. Its possible values are specified by the Silk path language.

    • application/json:

      • example:

        {
            "https://vocab.eccenca.com/testTarget/volume": [
                {
                    "confidence": 0.07339449541284403,
                    "uri": "/volume",
                    "type": "value"
                }
            ],
            "https://vocab.eccenca.com/testTarget/address": [
                {
                    "confidence": 0.04128440366972476,
                    "uri": "/address",
                    "type": "object"
                }
            ]
        }
        
  • HTTP Status Code 400:

    • The request is invalid.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 404:

  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
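
A client that only needs one suggestion per key of a successful matching response can pick the candidate with the highest confidence. The following sketch assumes the 200 response has been parsed from JSON; the function name best_candidates and the min_confidence threshold are illustrative and not part of the API.

```python
def best_candidates(matches, min_confidence=0.0):
    """Keep the most confident candidate for every key of the response."""
    best = {}
    for key, candidates in matches.items():
        if not candidates:
            continue
        # Candidates are documented as sorted by decreasing confidence;
        # max() keeps the sketch correct even if that order is lost.
        top = max(candidates, key=lambda c: c["confidence"])
        if top["confidence"] >= min_confidence:
            best[key] = top
    return best


# The success response example shown above.
matches = {
    "https://vocab.eccenca.com/testTarget/volume": [
        {"confidence": 0.07339449541284403, "uri": "/volume", "type": "value"}
    ],
    "https://vocab.eccenca.com/testTarget/address": [
        {"confidence": 0.04128440366972476, "uri": "/address", "type": "object"}
    ],
}

selected = best_candidates(matches, min_confidence=0.05)
print(selected)
```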
        

Generate Child Rules (/{projectName}/{transformTaskId}/rule/{parentRuleId})

Full URL: /ontologyMatching/rulesGenerator/{projectName}/{transformTaskId}/rule/{parentRuleId}

Given correspondences, e.g. from the matching endpoint, it generates transformation rules based on on-the-fly generated schema and profiling information in the context of the parent rule. The rules are not added to the parent rule with this call by default, instead it returns the serialization of the rules. These serialized rules can then be added via subsequent REST calls for appending or replacing a rule.

URI parameters:

  • projectName, required:

    • type: (string)
  • transformTaskId, required:

    • type: (string)
  • parentRuleId, required:

    • type: (string)

Valid HTTP methods are:

POST

The sampleLimit parameter defines how many entities should be considered during the schema extraction and profiling. The sourcePath conforms to the Silk path language and is thus compatible with the correspondences returned by the ontology matching endpoint. The target property is a URI. The optional uriPrefix parameter specifies the base URI that should be used to generate a URI if no target property has been specified for the correspondence. The optional ‘type’ property defines whether the generated rule should be an object mapping rule or a value rule, i.e. a direct or complex mapping rule. Possible values are ‘value’ or ‘object’. If ‘addRules’ is set to true, all generated rules are added to the parent rule.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
          "correspondences": [
              {
                  "sourcePath": "Key",
                  "targetProperty": "http://someNamespace.com/prop/key",
                  "type": "value"
              },
              {
                  "sourcePath": "Value"
              }
          ],
          "uriPrefix": "http://www.eccenca.com/inst/xy/",
          "sampleLimit": 50,
          "addRules": false
      }
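
The parameters described above can be assembled programmatically. The following Python sketch is only an illustration: the helper name `build_rule_generator_body` is hypothetical and not part of DataIntegration, and it checks nothing beyond the documented values of the optional ‘type’ property.

```python
import json

# Documented values of the optional 'type' property of a correspondence.
ALLOWED_RULE_TYPES = {"value", "object"}


def build_rule_generator_body(correspondences, sample_limit, uri_prefix=None, add_rules=False):
    """Return the JSON request body as a string (hypothetical helper)."""
    for correspondence in correspondences:
        rule_type = correspondence.get("type")
        if rule_type is not None and rule_type not in ALLOWED_RULE_TYPES:
            raise ValueError(f"unsupported rule type: {rule_type!r}")
    body = {
        "correspondences": list(correspondences),
        "sampleLimit": sample_limit,
        "addRules": add_rules,
    }
    if uri_prefix is not None:  # uriPrefix is optional
        body["uriPrefix"] = uri_prefix
    return json.dumps(body)
```

With `add_rules` left at `False`, the generated rules are only returned and have to be attached to the parent rule in a separate call.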
      
Response

The expected response:

  • HTTP Status Code 200:

    • The matching has finished without errors.

    • application/json:

      • example:

        [
            {
                "id": "Key",
                "mappingTarget": {
                    "isBackwardProperty": false,
                    "uri": "",
                    "valueType": {
                        "nodeType": "IntegerValueType"
                    }
                },
                "metadata": {
                    "description": "",
                    "label": ""
                },
                "operator": {
                    "function": "IntegerParser",
                    "id": "normalize",
                    "inputs": [
                        {
                            "id": "Key",
                            "path": "/Key",
                            "type": "pathInput"
                        }
                    ],
                    "parameters": {},
                    "type": "transformInput"
                },
                "sourcePaths": [
                    "Key"
                ],
                "type": "complex"
            },
            {
                "id": "Value",
                "mappingTarget": {
                    "isBackwardProperty": false,
                    "uri": "",
                    "valueType": {
                        "nodeType": "AutoDetectValueType"
                    }
                },
                "metadata": {
                    "description": "",
                    "label": ""
                },
                "sourcePath": "/Value",
                "type": "direct"
            }
        ]
        
  • HTTP Status Code 400:

    • This happens when the parent rule cannot have children attached to it.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        
  • HTTP Status Code 401:

    • The user is not authenticated.

    • text/plain:

      • example:

        Unauthorized user.
  • HTTP Status Code 403:

    • Not authorized to access DataIntegration or this specific resource.

    • text/plain:

      • example:

        You are not authorized to use eccenca DataIntegration. Please contact an administrator.
  • HTTP Status Code 404:

  • HTTP Status Code 500:

    • The request could not be processed due to an internal error.

    • application/json:

      • example:

        {
          "title": "The error type",
          "detail": "Human-readable error message"
        }
        

Profiling API (/profiling)

Profiling API

Validate Primary Key (/validatePrimaryKey)

Full URL: /profiling/validatePrimaryKey

Validates if all values in a primary key column are unique. The path properties are in Silk path syntax.

Valid HTTP methods are:

POST

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "project": "cmem",
        "primaryKey": {
          "dataset": "persons",
          "typeUri": "foaf:Person",
          "path": "foaf:sha1"
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The validation completed.

    • application/json:

      • example:

        {
          "isValid": false,
          "message": "Validation found 1 values that are not unique. Examples: DuplicatedValue",
          "duplicateKeyCount": 1,
          "duplicateKeyExamples": [
            "DuplicatedValue"
          ]
        }
        
  • HTTP Status Code 400:

    • The user input is invalid.
  • HTTP Status Code 500:

    • An internal issue prevented the validation from completing.
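
A client will typically turn the 200 response of the primary key validation into a short report. The Python sketch below shows one way to do that; the function name is hypothetical, and it assumes the response has already been parsed into a dictionary with the properties shown in the example response.

```python
def summarize_primary_key_check(response):
    """Return a one-line summary of a validatePrimaryKey response (hypothetical helper)."""
    if response["isValid"]:
        return "primary key values are unique"
    count = response.get("duplicateKeyCount", 0)
    examples = ", ".join(str(value) for value in response.get("duplicateKeyExamples", []))
    return f"{count} duplicate value(s), e.g. {examples}"
```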

Validate Foreign Key (/validateForeignKey)

Full URL: /profiling/validateForeignKey

Validates if all foreign keys in a given column correspond to primary keys in another table. The path properties are in Silk path syntax.

Valid HTTP methods are:

POST

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "project": "cmem",
        "primaryKey": {
          "dataset": "persons",
          "typeUri": "foaf:Person",
          "path": "foaf:sha1"
        },
        "foreignKey": {
          "dataset": "adresses",
          "typeUri": "addressTable",
          "path": "personId"
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The validation completed.

    • application/json:

      • example:

        {
          "isValid": true,
          "message": "All foreign keys correspond to existing primary keys.",
          "totalForeignKeyCount": 3,
          "invalidForeignKeyCount": 0,
          "invalidKeyExamples": []
        }
        
  • HTTP Status Code 400:

    • The user input is invalid.
  • HTTP Status Code 500:

    • An internal issue prevented the validation from completing.
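
The request body of the foreign key validation pairs two key descriptions of the same shape. The following Python sketch builds such a body and computes the share of invalid keys from a response. Both helper names are hypothetical, and treating ‘dataset’, ‘typeUri’ and ‘path’ as mandatory is an assumption based on the example payload only.

```python
REQUIRED_KEY_FIELDS = {"dataset", "typeUri", "path"}  # assumption, taken from the example payload


def foreign_key_body(project, primary_key, foreign_key):
    """Build the validateForeignKey request body as a dictionary (hypothetical helper)."""
    for name, key in (("primaryKey", primary_key), ("foreignKey", foreign_key)):
        missing = REQUIRED_KEY_FIELDS - key.keys()
        if missing:
            raise ValueError(f"{name} is missing {sorted(missing)}")
    return {"project": project, "primaryKey": dict(primary_key), "foreignKey": dict(foreign_key)}


def invalid_key_share(response):
    """Fraction of foreign keys without a matching primary key, 0.0 if there are none."""
    total = response["totalForeignKeyCount"]
    return response["invalidForeignKeyCount"] / total if total else 0.0
```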

/profileType/{project}/{datasetTask}

Full URL: /profiling/profileType/{project}/{datasetTask}

Run data type detection and value profiling on a specific type of a dataset.

URI parameters:

  • project, required:

    • type: (string)
  • datasetTask, required:

    • type: (string)

Valid HTTP methods are:

POST

The parameters are similar to those in the call to the profiling activity. The optional dataset URI and URI prefix parameters should be the same as in the activity request, if set. The dataset URI specifies the dataset resource for the RDF serialization and the URI prefix is used for constructing RDF resources for the dataset schema classes and paths. If not set, proper values are generated. The source type specifies the type in Silk path syntax. The entity sample limit object must be defined, but the specific limits are optional. The default values are 200 for data type detection and unlimited for profiling. Unless the profiling should run over the whole dataset, the profiling limit should be set.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "sourceType": "/persons/phoneNumbers",
        "datasetUri": "http://dataset",
        "uriPrefix": "http://eccenca.com/test/dataset/",
        "entitySampleLimits": {
          "dataTypeDetection": 200,
          "profiling": 10000
        }
      }
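
The rule that the entity sample limit object must always be present while its individual limits are optional can be captured in a small builder. This Python sketch is illustrative only; `profile_type_body` is a hypothetical name, and the default values are left to the server rather than filled in by the client.

```python
def profile_type_body(source_type, data_type_detection=None, profiling=None,
                      dataset_uri=None, uri_prefix=None):
    """Build the profileType request body as a dictionary (hypothetical helper)."""
    limits = {}  # always sent, even if it stays empty
    if data_type_detection is not None:
        limits["dataTypeDetection"] = data_type_detection
    if profiling is not None:
        limits["profiling"] = profiling
    body = {"sourceType": source_type, "entitySampleLimits": limits}
    if dataset_uri is not None:
        body["datasetUri"] = dataset_uri
    if uri_prefix is not None:
        body["uriPrefix"] = uri_prefix
    return body
```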
      
Response

The expected response:

  • HTTP Status Code 200:

  • HTTP Status Code 404:

/schemaClass/{projectId}/{datasetTaskId}

Full URL: /profiling/schemaClass/{projectId}/{datasetTaskId}

Returns schema and optional data type and profiling information for a specific type of a dataset.

URI parameters:

  • projectId, required:

    • type: (string)
  • datasetTaskId, required:

    • type: (string)

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • typePath, required:

    The source type path the schema class information is requested for, in normalized form.

    • type: (string)
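
Because the typePath value usually contains characters such as ‘/’, it needs to be URL-encoded when it is passed as a query parameter. The following Python sketch builds the request URL; the function name and the base URL used in the usage note are assumptions, and the sketch does not attempt to normalize the path itself.

```python
from urllib.parse import quote, urlencode


def schema_class_url(base_url, project_id, dataset_task_id, type_path, example_values=False):
    """Build the URL of the schemaClass endpoint (hypothetical helper)."""
    path = "/profiling/schemaClass/{}/{}".format(
        quote(project_id, safe=""), quote(dataset_task_id, safe="")
    )
    if example_values:
        path += "/exampleValues"
    # urlencode percent-encodes the type path, including any slashes.
    return f"{base_url.rstrip('/')}{path}?{urlencode({'typePath': type_path})}"
```

For a hypothetical deployment at `http://localhost:8080/dataintegration`, a call with project `cmem`, dataset `persons` and type path `/persons/phoneNumbers` yields a URL ending in `?typePath=%2Fpersons%2FphoneNumbers`.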
Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
          "idx": 1,
          "paths": [
              {
                  "dataTypeDiscoveryResult": {
                      "confidence": 1,
                      "parserPlugin": {
                          "pluginId": "IntegerParser",
                          "pluginParameters": {
                              "commaAsDecimalPoint": "false",
                              "thousandSeparator": "false"
                          }
                      },
                      "schemaDatatype": "IntegerDataType$"
                  },
                  "idx": 0,
                  "path": "id",
                  "profilingInfo": {
                      "datatypeProfile": {
                          "count": 2,
                          "distribution": [
                              1,
                              1
                          ],
                          "max": 1,
                          "mean": 1,
                          "min": 0,
                          "profileClassId": "Long profile",
                          "samples": [
                              1,
                              1
                          ],
                          "uniqueCount": 2
                      },
                      "entitiesSampled": 2,
                      "entitySampleLimit": 10000,
                      "profilingInfoUri": "http://example.namespace.prefix/7khKJlIzIjLg/id/profilingInfo"
                  },
                  "uri": "http://example.namespace.prefix/7khKJlIzIjLg/id"
              },
              {
                  "dataTypeDiscoveryResult": {
                      "confidence": 1,
                      "parserPlugin": {
                          "pluginId": "StringParser",
                          "pluginParameters": {}
                      },
                      "schemaDatatype": "StringDataType$"
                  },
                  "idx": 1,
                  "path": "name",
                  "profilingInfo": {
                      "datatypeProfile": {
                          "count": 2,
                          "distribution": [
                              1,
                              1
                          ],
                          "max": "Max",
                          "maxStrLength": 4,
                          "meanStrLength": 4,
                          "min": "John",
                          "minStringLength": 3,
                          "profileClassId": "String profile",
                          "regexPatterns": {
                              "^\\p{L}+$": {
                                  "count": 2,
                                  "regex": "^\\p{L}{3,4}$"
                              }
                          },
                          "regexThreshold": 10,
                          "samples": [
                              "Max",
                              "Max"
                          ],
                          "uniqueCount": 2
                      },
                      "entitiesSampled": 2,
                      "entitySampleLimit": 10000,
                      "profilingInfoUri": "http://example.namespace.prefix/7khKJlIzIjLg/name/profilingInfo"
                  },
                  "uri": "http://example.namespace.prefix/7khKJlIzIjLg/name"
              }
          ],
          "sourceType": "persons",
          "uri": "http://example.namespace.prefix/EjNJqMl0Q7Mx/persons"
      }
      

/schemaClass/{projectId}/{datasetTaskId}/exampleValues

Full URL: /profiling/schemaClass/{projectId}/{datasetTaskId}/exampleValues

Returns example values for each source path of a specific source class.

URI parameters:

  • projectId, required:

    • type: (string)
  • datasetTaskId, required:

    • type: (string)

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • typePath, required:

    The source type path the sample values are requested for, in normalized serialization form.

    • type: (string)
Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
          "id": [
              1,
              0
          ],
          "name": [
              "Max",
              "John"
          ],
          "phoneNumbers": [
              "urn:instance:r271dcda6-bfd2-4be0-96fa-4fac0f9f3f9b#-1145177022",
              "urn:instance:r271dcda6-bfd2-4be0-96fa-4fac0f9f3f9b#-97873630",
              "urn:instance:r271dcda6-bfd2-4be0-96fa-4fac0f9f3f9b#-103599268"
          ]
      }
      

Script API (/scripts)

Provides functions for scripts that can be executed as part of a workflow.

Script Task Auto-Completion (/completions)

Full URL: /scripts/projects/{project}/tasks/{task}/completions

Returns auto completions for a script. The completions are based on the most recent script execution, i.e., the script needs to be executed in the workflow prior to requesting the completions.

Valid HTTP methods are:

POST

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "line": "inputs.",
        "column": 7
      }
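
In the example above the column equals the length of the line, which suggests that the column is the cursor position counted in characters from the start of the line. Under that assumption, a request body for a cursor at the end of the line can be built as in this Python sketch (the helper name is hypothetical):

```python
def completion_body(line, column=None):
    """Build the completions request body; the cursor defaults to the end of the line."""
    if column is None:
        column = len(line)
    if not 0 <= column <= len(line):
        raise ValueError("column must lie within the line")
    return {"line": line, "column": column}
```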
      
Response

The expected response:

  • HTTP Status Code 200:

    • The completions returned by the compiler.

    • application/json:

      • example:

        {
          "candidates": ["apply", "applyOrElse", "head", "headOption", ...],
          "cursor": 7
        }
        

(/api)

Artifact Search (/searchItems)

Full URL: /api/workspace/searchItems

Searches all DataIntegration artifacts using text search and filter facets.

Valid HTTP methods are:

POST

If the optional project parameter is defined, only artifacts from that project are fetched. If the ‘itemType’ parameter is defined, then only artifacts of this type are fetched. Valid values are: Project, Dataset, Transformation, Linking, Workflow, Task. The ‘textQuery’ parameter is a conjunctive multi word query. The single words can be scattered over different artifact properties, e.g. one in label and one in description. The ‘offset’ and ‘limit’ parameters allow for paging through the result list. The limit will default to 10 if it is not provided. It can be disabled by setting it to ‘0’, which will return all results. The optional sort parameter allows for sorting the result list by a specific artifact property, e.g. label, creation date, update date. The ‘facets’ parameter defines what facets are set to which values. The ‘keyword’ facet allows multiple values to be set.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "project": "cmem",
        "itemType": "Dataset",
        "textQuery": "production database",
        "offset": 0,
        "limit": 20,
        "sortBy": "label",
        "sortOrder": "ASC",
        "facets": [
          {
            "facetId": "datasetType",
            "type": "keyword",
            "keywordIds": ["csv", "eccencaDataPlatform"]
          }
        ]
      }
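
Paging with ‘offset’ and ‘limit’ can be wrapped in a generator that requests one page after the other. The Python sketch below is transport-agnostic: `post` is a hypothetical callable supplied by the caller that sends the JSON body to the given path and returns the parsed response. The sketch also assumes that the ‘total’ property of the response is the overall number of matches.

```python
def iter_search_results(post, query, page_size=20):
    """Yield all search results page by page (hypothetical helper)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive; a limit of 0 disables paging")
    offset = 0
    while True:
        body = dict(query, offset=offset, limit=page_size)
        page = post("/api/workspace/searchItems", body)
        results = page["results"]
        yield from results
        offset += len(results)
        # Stop on an empty page or once all reported matches were seen.
        if not results or offset >= page["total"]:
            break
```

Injecting the `post` callable keeps authentication and HTTP client choice out of the paging logic.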
      
Response

The expected response:

  • HTTP Status Code 200:

    • The response contains the result list as well as the list of potential facets for the currently selected task type.

    • application/json:

      • example:

        {
          "total": 110,
          "results": [
            { "type": "Project",
              "id": "cmem",
              "label": "CMEM",
              "description": "CMEM project",
              "itemLinks": []
            },
            { "type": "Dataset",
              "project": "cmem",
              "projectLabel": "CMEM the project",
              "pluginId": "csv",
              "id": "customers",
              "label": "Customers",
              "description": "Customers dataset"
            },
            { "type": "Task",
              "pluginId": "script",
              "projectLabel": "CMEM the project",
              "project": "cmem",
              "id": "processX",
              "label": "Process X",
              "description": "Process X via script",
              "itemLinks": []
            },
            {
              "description": "",
              "id": "transform_a_to_b",
              "itemLinks": [
                  {
                      "label": "Mapping editor",
                      "path": "/transform/pmd/transform_a_to_b/editor"
                  },
                  {
                      "label": "Transform evaluation",
                      "path": "/transform/pmd/transform_a_to_b/evaluate"
                  },
                  {
                      "label": "Transform execution",
                      "path": "/transform/pmd/transform_a_to_b/execute"
                  }
              ],
              "label": "Transform A to B",
              "projectId": "cmem",
              "type": "Transform"
            }
          ],
          "sortByProperties": [
              {
                  "id": "label",
                  "label": "Label"
              }
          ],
          "facets": [
            {
              "id": "tag",
              "label": "Tag",
              "description": "A user supplied tag for custom categorization.",
              "type": "keyword",
              "values": [{"id": "test", "label": "Test", "count": 2}, {"id": "public", "label": "Public", "count": 3}, {"id": "private", "label": "Private", "count": 4}]
            },
            {
              "id": "datasetType",
              "label": "Dataset type",
              "description": "The concrete type of a dataset, which comprises its format and other characteristics.",
              "type": "keyword",
              "values": [
                {
                    "count": 43,
                    "id": "eccencaDataPlatform",
                    "label": "Knowledge Graph"
                },
                {
                    "count": 17,
                    "id": "csv",
                    "label": "CSV"
                }
            ...
          ]
        }
        

Item type (/searchConfig/types)

Full URL: /api/workspace/searchConfig/types

The item types that a user can restrict the search to. The selected type will also influence the available facets.

Valid HTTP methods are:

GET

The id of the source dataset.

Query Parameters

This method accepts the following query parameters:

  • projectId:

    Optional parameter that fetches the types for a specific project. This will only display types that contain at least one item.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "label": "Type",
          "values": [{"id": "project", "label": "Project"}, {"id": "dataset", "label": "Dataset"}, {"id": "transformation", "label": "Transformation"}, {"id": "linking", "label": "Linking"}, {"id": "workflow", "label": "Workflow"}, {"id": "task", "label": "Task"}]
        }
        

Plugin parameter auto-completion (/pluginParameterAutoCompletion)

Full URL: /api/workspace/pluginParameterAutoCompletion

Auto-completion endpoint for plugin parameter values.

Valid HTTP methods are:

POST

The ‘pluginId’ and ‘parameterId’ reference the parameter of a plugin; their values can be read, e.g., from the /plugins endpoint. The ‘projectId’ provides the project context for parameters that hold project-specific values, e.g. task references. The ‘dependsOnParameterValues’ parameter contains the values of all other parameters this auto-completion depends on. For example, if a plugin has the parameters ‘project’ and ‘projectTask’, the ‘projectTask’ parameter may depend on ‘project’, because the auto-completion of project tasks can only be performed once the project is known. The list of parameters a parameter depends on is returned in the plugin parameter description. The ‘textQuery’ parameter is a conjunctive multi-word query matching against the possible results. The ‘offset’ and ‘limit’ parameters allow for paging through the result list.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "pluginId": "Scheduler",
        "parameterId": "task",
        "projectId": "cmem",
        "dependsOnParameterValues": ["value the auto-completion depends on"],
        "textQuery": "sched",
        "offset": 0,
        "limit": 10
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The response contains the result list of matching auto-completion results. The ‘label’ property is optional and may not be defined, even for parameters that are supposed to have labels. In this case the ‘value’ should be taken as label.

    • application/json:

      • example:

        [
          {
            "label": "Scheduled workflow 1",
            "value": "workflow1"
          },
          {
            "value": "workflow2"
          }
        ]
        
  • HTTP Status Code 404:
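
Since the ‘label’ property may be missing, a client has to fall back to the ‘value’ when rendering results. A minimal Python sketch of that fallback, with a hypothetical function name:

```python
def display_labels(completions):
    """Return the text to show for each auto-completion result."""
    # Use the label when present and non-empty, otherwise the value.
    return [item.get("label") or item["value"] for item in completions]
```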

Init frontend (/initFrontend)

Full URL: /api/workspace/initFrontend

Returns information that is necessary for the frontend initialization or otherwise needed from the beginning on.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • The emptyWorkspace parameter signals whether the workspace is empty or contains at least one project. The initialLanguage parameter returns the initial language (either ‘de’ or ‘en’) that has been extracted from the Accept-Language HTTP header sent by the browser. The dmBaseUrl is optional and returns the base URL, if configured in the DI config via parameter eccencaDataManager.baseUrl. The dmModuleLinks are only available if the DM base URL is defined. These are configured links to DM modules.

    • application/json:

      • example:

        {
          "emptyWorkspace":true,
          "initialLanguage":"en",
          "dmBaseUrl": "http://docker.local",
          "dmModuleLinks": [
                  {
                      "defaultLabel": "Exploration",
                      "icon": "application-explore",
                      "path": "explore"
                  },
                  {
                      "defaultLabel": "Vocabulary Management",
                      "icon": "application-vocabularies",
                      "path": "vocab"
                  },
                  {
                      "defaultLabel": "Queries",
                      "icon": "application-queries",
                      "path": "query"
                  }
              ]
        }
        

Task activities status (/taskActivitiesStatus)

Full URL: /api/workspace/taskActivitiesStatus

Returns status information of a set of task activities. By default all task activities are returned.

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • projectId:

    If defined only task activities of a specific project are considered.

    • type: (string)
  • statusFilter:

    If defined only task activities with a specific status are returned. Valid values are “Idle”, “Not executed”, “Finished”, “Cancelled”, “Failed”, “Successful”, “Canceling”, “Running” and “Waiting”. States “Idle” and “Not executed” are synonyms and “Idle” is kept only for backwards compatibility. State “Finished” is a union of following sub-states “Cancelled”, “Failed” and “Successful”. “Waiting” is the state of an activity being scheduled, but still waiting in queue for being executed.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
            {
                "activity": "TypesCache",
                "cancelled": false,
                "concreteStatus": "Successful",
                "exceptionMessage": null,
                "failed": false,
                "isRunning": false,
                "lastUpdateTime": 1595861385992,
                "message": "Finished in 46ms",
                "progress": 100,
                "project": "singleProject",
                "runtime": 46,
                "startTime": 1595861385946,
                "statusName": "Finished",
                "task": "d57c393f-8f3f-48ba-ba13-8e815e04d557_CSVdataset"
            },
            {
                "activity": "ExecuteTransform",
                "concreteStatus": "Not executed",
                "failed": false,
                "isRunning": false,
                "lastUpdateTime": 1595861385941,
                "message": "Idle",
                "progress": null,
                "project": "singleProject",
                "startTime": null,
                "statusName": "Idle",
                "task": "a0d18ae0-085b-4a06-aee1-b4c19bd00eac_failTransform"
            },
            {
                "activity": "ExecuteLocalWorkflow",
                "cancelled": false,
                "concreteStatus": "Failed",
                "exceptionMessage": "Exception during execution of workflow operator a0d18ae0-085b-4a06-aee1-b4c19bd00eac_failTransform. Cause: No input given to transform specification executor a0d18ae0-085b-4a06-aee1-b4c19bd00eac_failTransform!",
                "failed": true,
                "isRunning": false,
                "lastUpdateTime": 1595861468748,
                "message": "Failed after 135ms: Exception during execution of workflow operator a0d18ae0-085b-4a06-aee1-b4c19bd00eac_failTransform. Cause: No input given to transform specification executor a0d18ae0-085b-4a06-aee1-b4c19bd00eac_failTransform!",
                "progress": 100,
                "project": "singleProject",
                "runtime": 135,
                "startTime": 1595861468613,
                "statusName": "Finished",
                "task": "e7dc14e5-b45b-4dc5-9933-bbc2750630f5_failedWorkflow"
            }
        ]
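
The relationship between the status values of the statusFilter parameter can also be mirrored on the client, for example to filter an unfiltered response locally. The Python sketch below assumes that the ‘concreteStatus’ property carries the specific state, as it does in the example response; the function name is hypothetical.

```python
FINISHED_STATES = {"Cancelled", "Failed", "Successful"}
VALID_FILTERS = FINISHED_STATES | {
    "Idle", "Not executed", "Finished", "Canceling", "Running", "Waiting",
}


def matches_status_filter(activity, status_filter):
    """Check whether an activity status entry matches a statusFilter value."""
    if status_filter not in VALID_FILTERS:
        raise ValueError(f"unknown status filter: {status_filter!r}")
    concrete = activity["concreteStatus"]
    if status_filter == "Finished":  # union of the three sub-states
        return concrete in FINISHED_STATES
    if status_filter in ("Idle", "Not executed"):  # synonyms
        return concrete in ("Idle", "Not executed")
    return concrete == status_filter
```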
        

Project import resources (/projectImport)

Full URL: /api/workspace/projectImport

Project import resources are used for a multi-step project import procedure that consists of four steps: 1. the project file upload, 2. the validation of the uploaded file, 3. the asynchronous execution of the project import and 4. the status check of the running project import execution.

Valid HTTP methods are:

POST

Uploads the project file of the project to be imported.

Body

This method accepts the following body payloads:

  • multipart/form-data:

    • form parameters:

      • file:

        The file to be uploaded.

        • type: (file)
  • application/octet-stream:

    • example:

      The raw bytes to be uploaded.
  • text/plain:

    • example:

      The text to be uploaded.
Response

The expected response:

  • HTTP Status Code 201:

    • A project import resource with the returned ID has been created.

    • application/json:

      • example:

        {
            "projectImportId": "di-projectImport5196140007678722748"
        }
        

Project import resource (/:projectImportId)

Full URL: /api/workspace/projectImport/:projectImportId

The project import resource that was created by uploading a project file.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • Details for the uploaded project file. The label, description and projectId properties were extracted from the uploaded file. The projectAlreadyExists property states that there already exists a project with the exact same ID. In that case special flags can be set for the subsequent request that starts the project import execution. The marshallerId is the detected project file format, in the example it is a project XML ZIP archive.

    • application/json:

      • example:

        {
            "description": "Config project description",
            "label": "Config Project",
            "marshallerId": "xmlZip",
            "projectAlreadyExists": true,
            "projectId": "configProject"
        }
        
  • HTTP Status Code 404:

DELETE

Response

The expected response:

  • HTTP Status Code 201:

    • The project import resource and the uploaded files have been deleted. The delete request is idempotent.

POST

Query Parameters

This method accepts the following query parameters:

  • generateNewId:

    When enabled this will always generate a new ID for this project based on the project label. This is one strategy if a project with the original ID already exists.

    • type: (boolean)
  • overwriteExisting:

    When enabled this will overwrite an existing project with the same ID. Enabling this option will NOT override the generateNewId option.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 201:

    • The project import has been executed. The status of the project import can be requested via the status endpoint.
  • HTTP Status Code 404:

  • HTTP Status Code 409:

    • Returned if a project with the same ID already exists and neither generateNewId nor overwriteExisting is enabled. Also returned if the uploaded temporary project file has been deleted because it reached its max age.

Project import execution status (/status)

Full URL: /api/workspace/projectImport/:projectImportId/status

When the project import execution has been started, this will return the status of the project execution.

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • timeout:

    The timeout in milliseconds after which this call should return if the execution has not yet finished. This allows for long-polling the result.

    • type: (integer)- default value: 20000
Response

The expected response:

  • HTTP Status Code 200:

    • The status of the project execution. There are 3 types of responses.
  1. The execution is still in progress, i.e. no ‘success’ property is defined.
  2. The ‘success’ property is defined and set to true, which means that the import has been successful.
  3. The ‘success’ property is defined and set to false, which means that the import has failed. The ‘failureMessage’ property gives the reason for the failure.

    • application/json (in progress):

      • example:

        {
            "projectId": "1e813497-0c75-48cf-a857-2ddc3f94fe26_ConfigProject",
            "importStarted": 1600950697304
        }
        
    • application/json (finished successfully):

      • example:

        {
            "projectId": "1e813497-0c75-48cf-a857-2ddc3f94fe26_ConfigProject",
            "importStarted": 1600950697304,
            "importEnded": 1600950697497,
            "success": true
        }
        
    • application/json (finished failed):

      • example:

        {
            "projectId": "1e813497-0c75-48cf-a857-2ddc3f94fe26_ConfigProject",
            "importStarted": 1600950697304,
            "importEnded": 1600950697497,
            "success": false,
            "failureMessage": "Exception during..."
        }
        
  • HTTP Status Code 404:

    • The execution has not been started, yet, or the project import ID is not known.
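
The three response types can be told apart by the presence and value of the ‘success’ property. The following Python sketch classifies a status response and polls until the import has finished. `fetch_status` is a hypothetical callable provided by the caller that performs the GET request (ideally with the timeout parameter for long-polling) and returns the parsed JSON.

```python
def classify_import_status(status):
    """Map a status response to one of the three documented cases."""
    if "success" not in status:
        return "in progress"
    if status["success"]:
        return "succeeded"
    return "failed: " + status.get("failureMessage", "no failure message given")


def wait_for_import(fetch_status, max_polls=30):
    """Call fetch_status() until the import has finished or max_polls is reached."""
    for _ in range(max_polls):
        status = fetch_status()
        if "success" in status:
            return status
    raise TimeoutError("project import did not finish in time")
```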

Projects (/)

Full URL: /api/workspace/projects/

Projects in the workspace.

Valid HTTP methods are:

POST

Create a new project by specifying a label and an optional description.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "metaData": {
          "label": "Project label",
          "description": "Project description"
        }
      }
      
Response

The expected response:

  • HTTP Status Code 201:

    • The project has been added. The URI of the new project is returned, which includes the automatically generated project ID.
  • Location:

    • type: (string)

    • example:

      /api/workspace/projects/projectx42
    • application/json:
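
Since the project ID is generated by the server, a client has to read it from the Location header of the 201 response. The Python sketch below builds the request body and extracts the ID as the last path segment of the returned URI. Both function names are hypothetical.

```python
from urllib.parse import urlparse


def new_project_body(label, description=None):
    """Build the request body for creating a project; the description is optional."""
    meta_data = {"label": label}
    if description is not None:
        meta_data["description"] = description
    return {"metaData": meta_data}


def project_id_from_location(location):
    """Return the last path segment of the Location header value."""
    path = urlparse(location).path.rstrip("/")
    return path.rsplit("/", 1)[-1]
```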

Project meta data (/metaData)

Full URL: /api/workspace/projects/{projectId}/metaData

Project meta data such as label and description.

Valid HTTP methods are:

GET

Returns the meta data of a project.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "label": "Some label",
          "description": "Some description",
          "modified": "2020-05-05T11:55:54.157Z"
        }
        

PUT

Update the meta data of the project, i.e. the label and description.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "label": "New label",
        "description": "New description"
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "label": "New label",
          "description": "New description",
          "modified":"2020-04-29T13:51:00.349Z"
        }
        
  • HTTP Status Code 404:

Project prefixes (/prefixes)

Full URL: /api/workspace/projects/{projectId}/prefixes

Project namespace prefix definitions that map from a prefix name to a URI prefix.

Valid HTTP methods are:

GET

Fetch all project prefixes.

Response

The expected response:

  • application/json:

    • example:

      {
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "foaf": "http://xmlns.com/foaf/0.1/",
        "customPrefix": "http://customPrefix.cc/"
      }
      

Project prefix (/:prefixName)

Full URL: /api/workspace/projects/{projectId}/prefixes/:prefixName

A single project prefix definition

Valid HTTP methods are:

PUT

Create or update the prefix URI for a specific prefix name.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      "http://custom.prefix/"
      
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
          "foaf": "http://xmlns.com/foaf/0.1/",
          "customPrefix": "http://custom.prefix/"
        }
        
  • HTTP Status Code 404:

    • Project not found

DELETE

Delete the prefix definition.

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
          "foaf": "http://xmlns.com/foaf/0.1/"
        }
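
As a sketch of the PUT call for a single prefix: the request body is a bare JSON string, not an object. The base URL and the helper name below are hypothetical; the path layout and the body follow the examples in this section.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL of a DataIntegration instance; adjust to your deployment.
BASE_URL = "https://di.example.org"


def build_put_prefix_request(project_id, prefix_name, prefix_uri):
    """Build (but do not send) the PUT request that sets one prefix URI."""
    path = "/api/workspace/projects/{}/prefixes/{}".format(
        urllib.parse.quote(project_id, safe=""),
        urllib.parse.quote(prefix_name, safe=""),
    )
    return urllib.request.Request(
        BASE_URL + path,
        # json.dumps on a str produces a quoted JSON string, as in the example body.
        data=json.dumps(prefix_uri).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_put_prefix_request("cmem", "customPrefix", "http://custom.prefix/")
```

A DELETE request for the same prefix would use the same URL with `method="DELETE"` and no body.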
        

Project resource search (/resourceSearch)

Full URL: /api/workspace/projects/{projectId}/resourceSearch

Allows searching for project resources by text query and paging through the results.

Valid HTTP methods are:

POST

The ‘limit’ and ‘offset’ parameters allow paging through the results. Results are always lexicographically sorted. The ‘searchText’ parameter allows a case-insensitive multi word search over the resource names. All parameters are optional.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "searchText": "loans csv",
        "limit": 5,
        "offset": 5
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • A list of project resource search results that match the search request.

    • application/json:

      • example:

        [
          {
            "lastModified": "2020-01-09T12:17:12Z",
            "name": "Some_Loans_124432.csv",
            "size": 105560722
          }
        ]
        
  • HTTP Status Code 404:
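
Paging through the search results can be sketched as generating one request body per page. The helper name is hypothetical, and the sketch assumes that ‘offset’ counts results rather than pages, which is consistent with the example body above (limit 5, offset 5 for the second page).

```python
def search_bodies(search_text, page_size, pages):
    """Yield one resourceSearch request body per page."""
    for page in range(pages):
        yield {
            "searchText": search_text,
            "limit": page_size,
            # Assumption: the offset is expressed in results, not in pages.
            "offset": page * page_size,
        }


bodies = list(search_bodies("loans csv", 5, 3))
```

A client would POST each body in turn and stop as soon as a page returns fewer than `page_size` results.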

Project tasks loading error report (/failedTasksReport)

Full URL: /api/workspace/projects/{projectId}/failedTasksReport

Get a detailed loading error report for all tasks that could not be loaded in a project.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
          {
            "taskId": "transformsourcex",
            "errorSummary": "Loading failed: Transform source X",
            "taskLabel": "Transform source X",
            "taskDescription": "Transforms source X to ...",
            "errorMessage": "Loading of task 'Transform source X' failed because input 'some_missing_input' could not be found.",
            "stacktrace": "...  ..."
          }
          ...
        ]
        
    • text/markdown:

      • example:

        # Project task loading error report
        
        In project 'cmem' 2 tasks could not be loaded.
        
        ## Task 1: Transform source X
        
        * Task ID: transformsourcex
        * Error summary: Loading failed: Transform source X,
        * Task label: Transform source X
        * Task description: Transforms source X to ...
        * Error message: Loading of task 'Transform source X' failed because input 'some_missing_input' could not be found.
        * Stacktrace:

        ```
        SUPER LONG JVM STACKTRACE
        ```

  • HTTP Status Code 404:

Project task loading error report (/{taskId})

Full URL: /api/workspace/projects/{projectId}/failedTasksReport/{taskId}

Get a detailed loading error report for a specific project task that could not be loaded in the project.

URI parameters:

  • taskId, required:

    • type: (string)

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "taskId": "transformsourcex",
          "errorSummary": "Loading failed: Transform source X",
          "taskLabel": "Transform source X",
          "taskDescription": "Transforms source X to ...",
          "errorMessage": "Loading of task 'Transform source X' failed because input 'some_missing_input' could not be found.",
          "stacktrace": "...  ..."
        }
        
    • text/markdown:

      • example:

        # Project task loading error report
        
        
        Task 'Transform source X' in project 'cmem' has failed loading.
        
        ## Details
        
        * Task ID: transformsourcex
        * Error summary: Loading failed: Transform source X,
        * Task label: Transform source X
        * Task description: Transforms source X to ...
        * Error message: Loading of task 'Transform source X' failed because input 'some_missing_input' could not be found.
        * Stacktrace:
        SUPER LONG JVM STACKTRACE
  • HTTP Status Code 404:
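
Both report endpoints offer application/json and text/markdown, so the representation is selected with the Accept header. The sketch below builds such a request; the base URL and the helper name are hypothetical.

```python
import urllib.parse
import urllib.request

# Hypothetical base URL of a DataIntegration instance; adjust to your deployment.
BASE_URL = "https://di.example.org"


def build_failed_tasks_report_request(project_id, task_id=None, markdown=False):
    """Build (but do not send) a GET request for the loading error report."""
    path = "/api/workspace/projects/{}/failedTasksReport".format(
        urllib.parse.quote(project_id, safe="")
    )
    if task_id is not None:
        # Report for a single task instead of all failed tasks of the project.
        path += "/" + urllib.parse.quote(task_id, safe="")
    accept = "text/markdown" if markdown else "application/json"
    return urllib.request.Request(
        BASE_URL + path, headers={"Accept": accept}, method="GET"
    )


request = build_failed_tasks_report_request("cmem", "transformsourcex", markdown=True)
```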

Clone project (/clone)

Full URL: /api/workspace/projects/{projectId}/clone

Clones an existing project.

Valid HTTP methods are:

POST

The request body contains the meta data of the project to be created. The label is required and must not be empty. The description is optional.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "metaData": {
          "label": "New project",
          "description": "Optional description"
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The generated ID and the link to the project details page.

    • application/json:

      • example:

        {
          "id": "200a2458-8cd5-4ca1-8047-b2578aa03d24_Newtask",
          "detailsPage": "/workbench/projects/cmem/transform/200a2458-8cd5-4ca1-8047-b2578aa03d24_Newproject"
        }
        
  • HTTP Status Code 404:

Full URL: /api/workspace/projects/{projectId}/tasks/{taskId}/relatedItems

Fetches all directly related items of a project task. Related items are all project tasks that either are directly referenced by the task itself or reference the task. Also any task from a workflow that is directly connected to this task, i.e. either input or output, is part of the result list.

Valid HTTP methods are:

GET

Query Parameters

This method accepts the following query parameters:

  • textQuery:

    An optional (multi word) text query to filter the list of related items. Each word in the query has to match at least one sub-string from the label or the type property of a related item.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
            "total": 2,
            "items": [
                {
                    "id": "testCsv",
                    "itemLinks": [
                        {
                            "label": "Dataset details page",
                            "path": "/workspaceNew/projects/testTasks/dataset/testCsv"
                        }
                    ],
                    "label": "test Csv",
                    "type": "Dataset"
                },
                {
                    "id": "workflow",
                    "itemLinks": [
                        {
                            "label": "Workflow details page",
                            "path": "/workspaceNew/projects/testTasks/workflow/workflow"
                        },
                        {
                            "label": "Workflow editor",
                            "path": "/workflow/editor/project/workflow"
                        }
                    ],
                    "label": "workflow",
                    "type": "Workflow"
                }
            ]
        }
        
  • HTTP Status Code 404:
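
The textQuery rule described above (every word must match a sub-string of the label or the type) can be illustrated with a small client-side sketch. This is not the server implementation; the function name is hypothetical, and case-insensitive matching is an assumption that the description does not state for this endpoint.

```python
def matches_text_query(text_query, item):
    """Return True if every query word occurs in the item's label or type."""
    # Assumption: matching ignores case.
    fields = [item.get("label", "").lower(), item.get("type", "").lower()]
    words = text_query.lower().split()
    return all(any(word in field for field in fields) for word in words)


items = [
    {"id": "testCsv", "label": "test Csv", "type": "Dataset"},
    {"id": "workflow", "label": "workflow", "type": "Workflow"},
]
matching_ids = [item["id"] for item in items if matches_text_query("csv data", item)]
```

Here "csv" matches the label of the first item and "data" matches its type, so only `testCsv` is selected. An empty query matches every item.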

Full URL: /api/workspace/projects/{projectId}/tasks/{taskId}/links

All relevant links of this project task.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        [
            {
                "label": "Transform details page",
                "path": "/workbench/projects/cmem/transform/someTransformTask"
            },
            {
                "label": "Mapping editor",
                "path": "/transform/cmem/someTransformTask/editor"
            },
            {
                "label": "Transform evaluation",
                "path": "/transform/cmem/someTransformTask/evaluate"
            },
            {
                "label": "Transform execution",
                "path": "/transform/cmem/someTransformTask/execute"
            }
        ]
        

Item info (/itemInfo)

Full URL: /api/workspace/projects/{projectId}/tasks/{taskId}/itemInfo

Frontend relevant information about a project task item, e.g. the item type of a task.

Valid HTTP methods are:

GET

Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
            "itemType": {
                "id": "linking",
                "label": "Linking"
            }
        }
        

Clone project task (/clone)

Full URL: /api/workspace/projects/{projectId}/tasks/{taskId}/clone

Clones an existing project task.

Valid HTTP methods are:

POST

The request body contains the meta data of the newly created, cloned task. The label is required and must not be empty. The description is optional.

Body

This method accepts the following body payloads:

  • application/json:

    • example:

      {
        "metaData": {
          "label": "New task",
          "description": "Optional description"
        }
      }
      
Response

The expected response:

  • HTTP Status Code 200:

    • The generated ID and the link to the task details page.

    • application/json:

      • example:

        {
          "id": "200a2458-8cd5-4ca1-8047-b2578aa03d24_Newtask",
          "detailsPage": "/workbench/projects/cmem/transform/200a2458-8cd5-4ca1-8047-b2578aa03d24_Newtask"
        }
        
  • HTTP Status Code 404:
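
Because the label of a clone is required and must not be empty, a client may want to validate it before sending the request. The sketch below builds the JSON body used by the clone endpoints; the helper name is hypothetical, and rejecting whitespace-only labels is an assumption that is stricter than what the description states.

```python
import json


def build_clone_body(label, description=None):
    """Return the JSON body for a clone request, validating the label first."""
    # Assumption: a label consisting only of whitespace is treated as empty.
    if not label or not label.strip():
        raise ValueError("label is required and must not be empty")
    meta_data = {"label": label}
    if description:
        meta_data["description"] = description
    return json.dumps({"metaData": meta_data})


body = build_clone_body("New task", "Optional description")
```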

Task plugins (/taskPlugins)

Full URL: /api/core/taskPlugins

A list of plugins that can be created as workspace tasks, e.g. datasets, transform tasks etc. The result of this endpoint only contains meta data of the plugin, i.e. title, description and categories. To fetch the schema details of a specific plugin use the /plugin endpoint.

Valid HTTP methods are:

GET

Returns the list of task plugins.

Query Parameters

This method accepts the following query parameters:

  • addMarkdownDocumentation:

    If true, Markdown documentation will be added for plugins if available.

    • type: (boolean)
  • textQuery:

    An optional (multi word) text query to filter the list of plugins.

    • type: (string)
  • category:

    An optional category. This will only return plugins from the same category.

    • type: (string)
Response

The expected response:

  • HTTP Status Code 200:

    • application/json:

      • example:

        {
          "multiCsv" : {
            "title" : "Multi CSV ZIP",
            "categories" : [ "file" ],
            "description" : "Reads from or writes to multiple CSV files from/to a single ZIP file.",
            "markdownDocumentation": "# Some markdown documentations"
          },
          "csv" : {
            "title" : "CSV",
            "categories" : [ "file" ],
            "description" : "Read from or write to an CSV file."
          },
          ...
        }
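
A sketch of how the optional query parameters might be assembled into a request URL. The base URL and the function name are hypothetical, and serializing the boolean as `true`/`false` is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical base URL of a DataIntegration instance; adjust to your deployment.
BASE_URL = "https://di.example.org"


def task_plugins_url(text_query=None, category=None, add_markdown_documentation=None):
    """Return the taskPlugins URL with only the parameters that were given."""
    params = {}
    if text_query is not None:
        params["textQuery"] = text_query
    if category is not None:
        params["category"] = category
    if add_markdown_documentation is not None:
        # Assumption: booleans are passed as the strings "true" and "false".
        params["addMarkdownDocumentation"] = "true" if add_markdown_documentation else "false"
    query = urlencode(params)
    return BASE_URL + "/api/core/taskPlugins" + ("?" + query if query else "")


url = task_plugins_url(text_query="multi csv", category="file")
```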
        

Plugin description (/plugins/{pluginId})

Full URL: /api/core/plugins/{pluginId}

The plugin description of a specific plugin, including meta data and JSON schema.

URI parameters:

  • pluginId, required:

    • type: (string)

Valid HTTP methods are:

GET

Returns the description of the specified plugin.

Query Parameters

This method accepts the following query parameters:

  • addMarkdownDocumentation:

    If true, Markdown documentation will be added for plugins if available.

    • type: (boolean)
  • pretty:

    If true, JSON output will be pretty printed.

    • type: (boolean)
Response

The expected response:

  • HTTP Status Code 200:

    • Contains the typical meta data of a plugin such as title, categories and description. The ‘taskType’ property is optional and specifies the task type that a task-related plugin (e.g. workflow, dataset) belongs to. The task type must be specified when creating tasks via the generic /tasks endpoint.

      The JSON schema part of the plugin parameters is described in the ‘properties’ object. Besides title and description, each parameter has a JSON type, which can currently only be “string” or “object”. The ‘parameterType’ property specifies the internal data type. For “object” types it can be ignored; for “string” types it hints at which kind of UI widget is appropriate and which kind of validation could be applied. The ‘value’ property gives the default value that is used when the parameter is not specified. The ‘advanced’ property marks the parameter as advanced and acts as a hint that the UI should handle it differently. If the ‘visibleInDialog’ property is set to false, the parameter should not be set from a creation or update dialog. Such a parameter is usually complex and is modified in special editors, e.g. the mapping editor.

      The ‘pluginId’ property specifies the ID of the plugin and is also set for all plugin parameters that are plugins themselves. The plugin ID is needed, e.g., for the auto-completion of parameter values. A parameter can have an ‘autoCompletion’ property that specifies how a parameter value can or should be auto-completed. If ‘allowOnlyAutoCompletedValues’ is set to true, the UI must make sure that only values from the auto-completion are considered valid. If ‘autoCompleteValueWithLabels’ is set to true, the auto-completion values might have a label in addition to the actual value; only the label should then be presented to the user. The ‘autoCompletionDependsOnParameters’ array lists the parameters of the same object whose values a specific parameter depends on. These values must be sent in the auto-completion request in the same order.

    • application/json:

      • example:

        {
          "title" : "Transform",
          "categories" : [ "Transform" ],
          "description" : "A transform task defines a mapping from a source structure to a target structure.",
          "taskType" : "Transform",
          "type" : "object",
          "pluginId": "transform",
          "properties" : {
            "selection" : {
              "title" : "Input task",
              "description" : "The source from which data will be transformed when executed as a single task outside of a workflow.",
              "type" : "object",
              "parameterType" : "objectParameter",
              "value" : null,
              "advanced" : false,
              "visibleInDialog" : true,
              "pluginId" : "datasetSelectionParameter",
              "properties" : {
                "inputId" : {
                  "title" : "Dataset",
                  "description" : "The dataset to select.",
                  "type" : "string",
                  "parameterType" : "identifier",
                  "value" : null,
                  "advanced" : false,
                  "visibleInDialog" : true,
                  "autoCompletion" : {
                    "allowOnlyAutoCompletedValues" : true,
                    "autoCompleteValueWithLabels" : true,
                    "autoCompletionDependsOnParameters" : [ ]
                  }
                },
                "typeUri" : {
                  "title" : "Type",
                  "description" : "The type of the dataset. If left empty, the default type will be selected.",
                  "type" : "string",
                  "parameterType" : "uri",
                  "value" : null,
                  "advanced" : false,
                  "visibleInDialog" : true,
                  "autoCompletion" : {
                    "allowOnlyAutoCompletedValues" : false,
                    "autoCompleteValueWithLabels" : false,
                    "autoCompletionDependsOnParameters" : [ ]
                  }
                },
                "restriction" : {
                  "title" : "Restriction",
                  "description" : "Additional restrictions on the enumerated entities. If this is an RDF source, use SPARQL patterns that include the variable ?a to identify the enumerated entities, e.g. ?a foaf:knows ",
                  "type" : "string",
                  "parameterType" : "restriction",
                  "value" : "",
                  "advanced" : false,
                  "visibleInDialog" : true
                }
              }
            },
            "mappingRule" : {
              "title" : "mapping rule",
              "description" : "",
              "type" : "object",
              "parameterType" : "objectParameter",
              "value" : {
                "type" : "root",
                "id" : "root",
                "rules" : {
                  "uriRule" : null,
                  "typeRules" : [ ],
                  "propertyRules" : [ ]
                },
                "metadata" : {
                  "label" : "Root Mapping"
                }
              },
              "advanced" : false,
              "visibleInDialog" : false
            },
            "output" : {
              "title" : "Output dataset",
              "description" : "An optional dataset where the transformation results should be written to when executed as single task outside of a workflow.",
              "type" : "string",
              "parameterType" : "option[identifier]",
              "value" : "",
              "advanced" : false,
              "visibleInDialog" : true,
              "autoCompletion" : {
                "allowOnlyAutoCompletedValues" : true,
                "autoCompleteValueWithLabels" : true,
                "autoCompletionDependsOnParameters" : [ ]
              }
            },
            ... SNIP ...
          },
          "required" : [ "selection" ]
        }
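
To illustrate how a client could use the ‘properties’ object, the following sketch walks a plugin description and lists the parameters that may be shown in a creation or update dialog. The function name is hypothetical, the `plugin` dictionary is an abridged copy of the example above, and treating a missing ‘visibleInDialog’ as true is an assumption.

```python
def dialog_parameters(plugin, prefix=""):
    """Yield (path, parameterType, default value) for dialog-visible parameters."""
    for name, param in plugin.get("properties", {}).items():
        # Assumption: a parameter without 'visibleInDialog' is visible.
        if not param.get("visibleInDialog", True):
            continue
        path = prefix + name
        if param.get("type") == "object" and "properties" in param:
            # Nested object parameter: descend into its own parameters.
            yield from dialog_parameters(param, path + ".")
        else:
            yield path, param.get("parameterType"), param.get("value")


# Abridged version of the 'transform' plugin description shown above.
plugin = {
    "pluginId": "transform",
    "properties": {
        "selection": {
            "type": "object",
            "visibleInDialog": True,
            "properties": {
                "inputId": {
                    "type": "string",
                    "parameterType": "identifier",
                    "value": None,
                    "visibleInDialog": True,
                },
            },
        },
        "mappingRule": {
            "type": "object",
            "parameterType": "objectParameter",
            "visibleInDialog": False,
        },
        "output": {
            "type": "string",
            "parameterType": "option[identifier]",
            "value": "",
            "visibleInDialog": True,
        },
    },
}

parameters = list(dialog_parameters(plugin))
```

The ‘mappingRule’ parameter is skipped because its ‘visibleInDialog’ property is false, while the nested ‘inputId’ parameter is reported under the path `selection.inputId`.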

Activity Reference

Project Activities

The following activities are available for each project.

Dataset matcher

Generates matches between schema paths and datasets based on the schema discovery and profiling information of the datasets.

| Parameter | Type | Description | Example |
| datasetUri | String | If set, run dataset matching only for this particular dataset. | |

The identifier for this plugin is DatasetMatcher.

It can be found in the package com.eccenca.di.datamatching.

Task Activities

The following activities are available for different types of tasks.

Custom

Execute REST Task

Executes the REST task.

This plugin does not require any parameters. The identifier for this plugin is ExecuteRestTask.

It can be found in the package com.eccenca.di.workflow.operators.rest.

Dataset

Dataset profiler

Generates profiling data of a dataset, e.g. data types, statistics etc.

| Parameter | Type | Description | Example |
| datasetUri | String | Optional URI of the dataset resource that should be profiled. If not specified, a URI will be generated. | |
| uriPrefix | String | Optional URI prefix that is prepended to every generated URI, e.g. property URIs for every schema path. If not specified, a URI prefix will be generated. | |
| entitySampleLimit | String | How many entities should be sampled for the profiling. If left blank, all entities will be considered. | |
| timeLimit | String | The time in milliseconds that the schema extraction step and the profiling step may each take. Leave blank for unlimited time. | |
| classProfilingLimit | int | The maximum number of classes that are profiled from the extracted schema. | |
| schemaEntityLimit | int | The maximum number of overall schema entities (types, properties/attributes) that will be extracted. | |
| executionType | String | The execution type to be used: SPARK, LEGACY. The legacy execution uses large in-memory maps and takes longer! | |

The identifier for this plugin is DatasetProfiler.

It can be found in the package com.eccenca.di.profiling.

SQL endpoint status

Shows the SQL endpoint status.

This plugin does not require any parameters. The identifier for this plugin is SqlEndpointStatus.

It can be found in the package com.eccenca.di.sql.endpoint.activity.

Types cache

Holds the most frequent types in a dataset.

This plugin does not require any parameters. The identifier for this plugin is TypesCache.

It can be found in the package org.silkframework.workspace.activity.dataset.

LinkSpecification

Active learning

Executes an active learning iteration.

| Parameter | Type | Description | Example |
| fixedRandomSeed | boolean | No description | |

The identifier for this plugin is ActiveLearning.

It can be found in the package org.silkframework.learning.active.

Evaluate linking

Evaluates the linking task by generating links.

| Parameter | Type | Description | Example |
| includeReferenceLinks | boolean | Do not generate a link for which there is a negative reference link while always generating positive reference links. | |
| useFileCache | boolean | Use a file cache. This avoids memory overflows for big files. | |
| partitionSize | int | The number of entities in a single partition in the cache. | |
| generateLinksWithEntities | boolean | Generate detailed information about the matched entities. If set to false, the generated links won’t be shown in the Workbench. | |
| writeOutputs | boolean | Write the generated links to the configured output of this task. | |
| linkLimit | int | If defined, the execution will stop after the configured number of links is reached. This is just a hint and the execution may produce slightly fewer or more links. | |
| timeout | int | Timeout in seconds after which the matching task of an evaluation should be aborted. Set to 0 or negative to disable the timeout. | |

The identifier for this plugin is EvaluateLinking.

It can be found in the package org.silkframework.workspace.activity.linking.
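
As an illustration of the parameter table above, a client could check a set of Evaluate linking parameters against the listed types before passing them on. This is only a sketch: the helper names are hypothetical, mapping `boolean` and `int` to Python's `bool` and `int` is an assumption, and how the parameters are transmitted to the activity is not covered here.

```python
# Parameter names and types as listed in the table above.
EVALUATE_LINKING_PARAMETER_TYPES = {
    "includeReferenceLinks": bool,
    "useFileCache": bool,
    "partitionSize": int,
    "generateLinksWithEntities": bool,
    "writeOutputs": bool,
    "linkLimit": int,
    "timeout": int,
}


def check_parameters(params):
    """Return a list of problems found in the given parameter dictionary."""
    errors = []
    for name, value in params.items():
        expected = EVALUATE_LINKING_PARAMETER_TYPES.get(name)
        if expected is None:
            errors.append("unknown parameter: " + name)
        elif type(value) is not expected:
            # Exact type check, because bool is a subclass of int in Python.
            errors.append(
                "{}: expected {}, got {}".format(
                    name, expected.__name__, type(value).__name__
                )
            )
    return errors


def timeout_enabled(timeout):
    """A timeout of 0 or a negative value disables the timeout."""
    return timeout > 0


problems = check_parameters({"useFileCache": True, "linkLimit": "10", "foo": 1})
```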

Execute linking

Executes the linking task using the configured execution.

This plugin does not require any parameters. The identifier for this plugin is ExecuteLinking.

It can be found in the package org.silkframework.workspace.activity.linking.

Linking paths cache

Holds the most frequent paths for the selected entities.

This plugin does not require any parameters. The identifier for this plugin is LinkingPathsCache.

It can be found in the package org.silkframework.workspace.activity.linking.

Reference entities cache

For each reference link, the reference entities cache holds all values of the linked entities.

This plugin does not require any parameters. The identifier for this plugin is ReferenceEntitiesCache.

It can be found in the package org.silkframework.workspace.activity.linking.

Supervised learning

Executes the supervised learning.

This plugin does not require any parameters. The identifier for this plugin is SupervisedLearning.

It can be found in the package org.silkframework.learning.active.

Scheduler

Activate

Executes the scheduler.

This plugin does not require any parameters. The identifier for this plugin is ExecuteScheduler.

It can be found in the package com.eccenca.di.scheduler.

ScriptTask

Execute Script

Executes the script.

This plugin does not require any parameters. The identifier for this plugin is ExecuteScript.

It can be found in the package com.eccenca.di.scripttask.

TransformSpecification

Execute transform

Executes the transformation.

| Parameter | Type | Description | Example |
| limit | IntOptionParameter | Limits the maximum number of entities that are transformed. | |

The identifier for this plugin is ExecuteTransform.

It can be found in the package org.silkframework.workspace.activity.transform.

Transform paths cache

Holds the most frequent paths for the selected entities.

This plugin does not require any parameters. The identifier for this plugin is TransformPathsCache.

It can be found in the package org.silkframework.workspace.activity.transform.

Target vocabulary cache

Holds the target vocabularies.

This plugin does not require any parameters. The identifier for this plugin is VocabularyCache.

It can be found in the package org.silkframework.workspace.activity.transform.

Workflow

Execute locally

Executes the workflow locally.

This plugin does not require any parameters. The identifier for this plugin is ExecuteLocalWorkflow.

It can be found in the package org.silkframework.workspace.activity.workflow.

WorkflowExecution

Generate Spark assembly

Generates project and Spark assembly artifacts and deploys them using the specified configuration settings: type, artifact and options such as the destination in case of a simple copy.

| Parameter | Type | Description | Example |
| executeStaging | boolean | Execute staging phase | |
| executeTransform | boolean | Execute transform phase | |
| executeLoading | boolean | Execute loading phase | |

The identifier for this plugin is DeploySparkWorkflow.

It can be found in the package com.eccenca.di.spark.

Default execution

Executes a workflow with the executor defined in the configuration.

This plugin does not require any parameters. The identifier for this plugin is ExecuteDefaultWorkflow.

It can be found in the package com.eccenca.di.spark.

Execute operator

Executes a workflow with an executor that uses Apache Spark. Depending on the Spark configuration, it can still run on a single local machine or on a cluster.

| Parameter | Type | Description | Example |
| operator | TaskReference | The workflow to execute. | |

The identifier for this plugin is ExecuteSparkOperator.

It can be found in the package com.eccenca.di.spark.

Execute on Spark

Executes a workflow with an executor that uses Apache Spark. Depending on the Spark configuration, it can still run on a single local machine or on a cluster.

This plugin does not require any parameters. The identifier for this plugin is ExecuteSparkWorkflow.

It can be found in the package com.eccenca.di.spark.

Execute with payload

Executes a workflow with custom payload.

| Parameter | Type | Description | Example |
| configuration | MultilineStringParameter | No description | |
| configurationType | String | No description | |

The identifier for this plugin is ExecuteWorkflowWithPayload.

It can be found in the package org.silkframework.workbench.workflow.

Generate view

Generates and shares a view on a workflow executed by the Spark executor. Executes a workflow on Spark and generates a SparkSQL temporary table instead of serializing the result. The table can be accessed via JDBC.

| Parameter | Type | Description | Example |
| caching | boolean | Optional parameter that enables caching (default=false). | |
| userDefinedName | String | Optional View name that is used when a view on a non virtual is generated (default = [TASK-ID]_generated_view). | |

The identifier for this plugin is GenerateSparkView.

It can be found in the package com.eccenca.di.sql.virtual.