Corporate Memory 20.10 is the third release in 2020.

The highlights of this release are:


With this release of Corporate Memory the DataIntegration and DataManager configurations have to be adapted according to the migration notes below. In addition to that, cmemc has a change default behaviour.

This release delivers the following component versions:

  • eccenca DataPlatform v20.10.1
  • eccenca DataIntegration v20.10
  • eccenca DataManager v20.10.1
  • eccenca Corporate Memory Control (cmemc) v20.10
  • eccenca Corporate Memory PowerBI Connector v20.10

More detailed release notes for these versions are listed below.

eccenca DataIntegration v20.10.1

This version of eccenca DataIntegration adds the following new features:

  • Improvements to new Workspace UI:
    • New Workspace UI allows to export projects with and without file resources.
    • Basic support for multiple languages in the New Workspace UI. Initially English and German are supported and plugins are not translated yet.
    • Multi-step Project import in new workspace UI.

    • Multi-step, asynchronous project import REST API.

    • Profiling UI component to start dataset profiling and show profiling information in the dataset preview.

    • Navigation menu in new workspace UI.
    • In link tables, clicking on an entity redirects to the corresponding resource in DataManager, if the entity is coming from an RDF dataset.
  • New/improved operators:
    • New transform operator to retrieve lat/long of a location from a specified API in order to normalize location data.
    • New operator to scale similarity values in linking rules by a specified factor.
    • Email operator improvements:

      • multiple recipients in TO, CC and BCC

      • CC and BCC recipients

      • Timeout parameter

      • SSL support

  • Improvements to datasets
    • CSV Dataset supports UTF-8-BOM encoding for writing CSVs that open correctly in Excel.
    • Support for #id and #text paths in JSON sources.
  • API improvements
    • Task activities API that allows to fetch a list of task activities with optional project and status filter.
    • Profiling data is available via the API.
  • Global vocabulary cache that holds all installed vocabularies from the DataPlatform.

    • REST endpoint to trigger cache updates.

In addition to that, these changes are shipped:

  • Vocabulary caches are not persisted between reboots and workspace reloads
  • Disable geo location data type detector by default via plugin.blacklist parameter
  • Item search API returns plugin IDs where available
  • Expose some Amazon S3 client configuration. Can be changed in the Dataintegration configuration now
  • Improvements to Spark execution engine
    • Entities are stored in DataFrames instead of RDDs
    • Performance improvements
    • Bugfixes
  • Check for usages of resources in all tasks, before deleting them. This was checked only in datasets before
  • File management improvements
    • Allow multi file uploads
    • Ask to replace existing files
    • Allow to delete uploaded files in upload dialog
    • When deleting files check for usages of resources in all items, before deleting them, e.g. transform tasks. This was checked only in datasets before
    • When deleting files that are in use, link the dependent items
    • Upload modal does not close when clicking outside of the modal
    • If the limit parameter of the itemSearch API is set to 0, it will now return all search results instead of none
    • Frontend initialisation endpoint returns initial language preference and configured DM base URL

Finally, the following performance and stability issues were solved:

  • Regression: the output of a transformation is lost after reloading
  • Added warning to the CSV datasets 'maxCharsPerColumn' parameter to make it clear that it affects the heap size
  • Fixed reading of JSON files that contain Unicode byte order marks (BOMs)
  • Workflow not interrupted on invalid XML from Triple-store
  • Fixed generating paths for JSON files that contain keys with special characters, such as spaces. Those will be encoded now
  • Project's rdfs:label uses project ID instead of label
  • Generate consistent URIs for object mappings on JSON files
  • Caches have not been written if the XML workspace provider was used
  • Do not recreate caches on every run
  • In link tables, the header shows the task labels instead of the task ids
  • Fixed search field in link tables (did not work with characters that need to be URL encoded)
  • Meta data description does not maintain whitespace formatting in XML serialisation
  • New workspace UI has invalid favicon
  • Creating a new project with description does not store the description in the new workspace UI
  • XML Dataset: Values that include HTML entities are not retrieved
  • Support for MS Internet Explorer 11 in new workspace
  • Logout action not working. Should perform a global logout
  • Deleting S3 backed resources broken due to a slash added to filenames
  • Update PostgreSQL driver to v42.2.14 because of security vulnerability

eccenca DataManager v20.10.1

This version of eccenca DataManager adds the following new features:

  • General
  • Shacline
    • Add support for 'sh:languageIn' (as multiple values) in literal properties
  • Resource Tables
    • Allow Lucene syntax in the search field of any resource table (Query Syntax)
      • This search will be applied to the label(s) configured in proxy.labelProperties (cf. DataPlatform); by default the search will only be applied to the first column, the labels of the selected resource

In addition to that, these changes are shipped:

  • Shacline

    • Use the new resource/shaped API to generate / save shacl forms.

    • Rendering empty fields on every change

    • Add class triple to save only if class is a string.

    • Prevent labels to be cloned on adding a new block.

    • Nested Table query now defines default graph (CMEM-2939)

    • {graph} can now be used as a placeholder in RFC6570 URI Template string (CMEM-2983)

  • ResourceTable

    • Download data does not retain column order (CMEM-2895)

    • Add pagination/limit on config file

    • Lock Drag and Drop while adding columns to prevent collision

    • Update default pagination limit to 25 and default pagination interval to 5, 10, 25, 100, 500, 1000

  • General

    • use new backend API to retrieve labels.

    • use new backend API to retrieve facets (possible columns)

    • DEPRECATE titleHelper configuration parameters

    • BREAKING remove support for Internet Explorer 11.

    • Disable Datasets module, moved to Data Integration

    • Disable Build module, moved to Data Integration

  • ResourceSelect

    • Wait until click on it to load values.

  • Explore

    • Cyclic references on Tabs content crash the app

    • modules.explore.navigation.topQuery changed in order to list configured graph classes (shui:managedClasses, CMEM-2983)

    • Update Navigation pagination limit to 15

    • Load ResourceTable pagination limit from config file

In addition to that, multiple performance and stability issues were solved.

eccenca DataPlatform v20.10

This version of eccenca DataPlatform adds the following new features:

  • Custom endpoint
    • Create custom json endpoints by defining a query for retrieving the data and a template for transforming the result.

  • Concise Boundary Description retrieval depth is adjustable.
  • New submodule :src:it for integration tests
  • Statement Annotations/Metadata
    • APIs for providing access and managing existing relations

  • Additional APIs
    • Explore Facets ("/api/explore/facets"): Lists the properties of a class or query.
    • Graph List ("/api/graphs/list"): Returns a list of graphs readable by the current user, optionally including OWL imports.
    • Graph List Detailed ("/api/graphs/list-detailed"): Like the previous one, but adding details of triples, classes and instances counts.
    • Added openapi.server.urls env variable in order to define custom baseUrl to be used in
    • Added resource shaping to the backend, this includes
      • Resource ("/api/resources") api for getting information about individual resources
      • Shape ("/api/shapes") api for applying shape information onto the graph
      • Statement Level Metadata ("/api/statementmetadata/") management for adding statement annotations.
    • Added Caching to internal handling of prefixes, vocabularies and shapes lists. Caches are invalidated by updates.
    • Added Showcase ("/api/admin/showcase") endpoint, which inserts a scalable test dataset into the configured endpoint.

In addition to that, these changes are shipped:

  • Changes in ReportsService related to CMEM-2944.

In addition to that, multiple performance and stability issues were solved.

eccenca Corporate Memory Control (cmemc) v20.10

This version of cmemc adds the following new features:

  • A dataset command group, enabling users to create, delete and update datasets as well as upload and download dataset file resources.
  • A vocabulary command group, enabling users to manage vocabularies similar to the vocabulary catalog.
  • The query execute command has some new options for limit, offset distinct and timeout settings.

In addition to that, these changes are shipped:

  • Added:
    • The workflow status command has a --project option
  • Changed:
    • The graph import command outputs a replace/add status message per graph.
    • Much faster workflow status retrieval by using a new activity API
    • The dataset export command default file template changed to {{date}}-{{connection}}-{{id}}.project
    • The query execute command now uses POST instead of GET requests for SPARQL queries
  • Fixed:
    • The graph import --replace command  does not re-replace a same graph with a different file anymore.
    • The completion of --filename-template resulted in files with ` chars.
    • The python version is disabled in completion mode.

eccenca Corporate Memory PowerBI Connector (v20.10)

This release of our PowerBI Connector does not introduce new features or relevant changes. We provided a tutorial on how to use this component: Consuming Graphs in Power BI

Migration Notes

DataIntegration

  • XML serialization for meta data elements is not forward compatible, i.e. projects exported with this version cannot be imported in older DataIntegration versions.

  • The logout URL needs to be set to make sure that DataIntegration also triggers a logout inside the Keycloak instance:
oauth.logoutRedirectUrl = ${DEPLOY_BASE_URL}"/auth/realms/cmem/protocol/openid-connect/logout?redirect_uri="${DEPLOY_BASE_URL}
CODE

DataManager

  • The graphInfo flag in the explore module is now enabled by default.
  • Due to the introduction of the new DataIntegration workspace these changes need to be applied:
    • The modules build as well as datasets are disabled now by default.
    • The module explore is the default first entry point (startsWith).
    • This section needs to be added to each workspace configuration:

          DIWorkspace:
            enable: true
            url: /dataintegration/workbench
      CODE

cmemc

  • If your automation scripts rely on the created file name of the project export command, you need to change your scripts and set the old export name explicitly with "-t {{id}}".