Introduction

This beginner-level tutorial shows how to build a Knowledge Graph from input data in a comma-separated values file (.csv), an Excel file (.xlsx) or a database table (JDBC).

The workflow consists of the following steps, which are described in detail below:

  1. Registration of the target vocabulary
  2. Uploading of the data file / Connecting to the JDBC endpoint
  3. (Re-)Viewing of the data table
  4. Creation of a (target) graph
  5. Creation of the transformation rules
  6. Evaluation of the results of the transformation rules
  7. Execution of the transformation to populate the target graph

Sample Material

The following material is used in this tutorial:

  • Sample vocabulary which describes the data in the CSV files: products_vocabulary.nt
  • Sample CSV file: services.csv

    ServiceID    | ServiceName             | Products                                      | ProductManager             | Price
    Y704-9764759 | Product Analysis        | O491-3823912, I965-1821441, Z655-3173353, ... | Lambert.Faust@company.org  | 748,40 EUR
    I241-8776317 | Component Confabulation | Z249-1364492, L557-1467804, C721-7900144, ... | Corinna.Ludwig@company.org | 1082,00 EUR
    ...



  • Sample Excel file: products.xlsx

    ProductID    | ProductName        | Height | Width | Depth | Weigth | ProductManager              | Price
    I241-8776317 | Strain Compensator | 12     | 68    | 15    | 8      | Baldwin.Dirksen@company.org | 0,50 EUR
    D215-3449390 | Gauge Crystal      | 77     | 58    | 19    | 15     | Wanja.Hoffmann@company.org  | 2,00 EUR
    ...
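
To get a feel for the sample data before loading it, the following Python sketch reads one services row, splits the multi-valued Products column and parses the price. The inline sample is only a stand-in for services.csv: the comma delimiter and the quoting of fields are assumptions, the product list is cut to the three IDs shown above, and parse_price and read_services are hypothetical helper names.

```python
import csv
import io
from decimal import Decimal

# Inline stand-in for services.csv. Delimiter and quoting are assumptions,
# and the product list is shortened to the three IDs shown in the sample.
SAMPLE = '''ServiceID,ServiceName,Products,ProductManager,Price
Y704-9764759,Product Analysis,"O491-3823912, I965-1821441, Z655-3173353",Lambert.Faust@company.org,"748,40 EUR"
'''


def parse_price(text):
    """Split a value like '748,40 EUR' into a Decimal amount and a currency code."""
    amount, currency = text.rsplit(" ", 1)
    # The sample uses a decimal comma, so convert it before parsing.
    return Decimal(amount.replace(",", ".")), currency


def read_services(handle):
    """Yield one dict per row, with Products as a list and Price as a tuple."""
    for row in csv.DictReader(handle):
        row["Products"] = [p.strip() for p in row["Products"].split(",")]
        row["Price"] = parse_price(row["Price"])
        yield row


rows = list(read_services(io.StringIO(SAMPLE)))
print(rows[0]["ServiceID"], rows[0]["Products"], rows[0]["Price"])
```

A check like this makes it visible early that Products holds several IDs per row and that Price mixes a number with a currency, which is worth knowing before writing mapping rules.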






Register the vocabulary

The vocabulary contains the classes and properties needed to map the data into the new structure in the Knowledge Graph.
  1. Press the circular blue + button in the lower right of the Vocabs tab in Corporate Memory.
  2. Define a Name, a Graph URI and a Description of the vocabulary.
    In this example we will use:
    • Name: Product Vocabulary
    • Graph URI: http://ld.company.org/prod-vocab/
    • Description: Example vocabulary modeled to describe relations between products and services.

Upload the data file / Connect to the JDBC endpoint

  1. Open a new browser tab and log in to the Build / Data Integration area of Corporate Memory: http://your.corporate.memory/dataintegration/.
  2. Press the Resources button of your workspace to select the file to be uploaded.
  3. Press the Browse... button, select the file and press the UPLOAD button. When the upload is done, press CANCEL to close the dialog.
  4. Press the Add button in the Dataset category of your workspace and select the type CSV (file) or Excel (file).

    Define a Name for the dataset and select the previously uploaded resource file. All other parameters can keep the default values.

    In this example we will use, for the CSV file:

    • Name: Services_CSV
    • File: 156539415020_services.csv

    and for the Excel file:

    • Name: Products_XLSX
    • File: 1565639227448_products.xlsx

To connect to a JDBC endpoint instead of uploading a file:

  1. Open a new browser tab and log in to the Build / Data Integration area of Corporate Memory: http://your.corporate.memory/dataintegration/.
  2. Press the Add button in the Dataset category of your workspace and select the JDBC endpoint (remote) type.
  3. Define a Name for the dataset, specify the JDBC Driver connection URL and the table name, and enter the username and password used to connect to the database.
    In this example we will use:

    • Name: Services_ServiceDB
    • JDBC Driver Connection URL: jdbc:mysql://mysql:3306/ServicesDB
    • table: Services
    • username: root
    • password: ****


    The general form of the JDBC connection string is:

    jdbc:<vendor>://<hostname>:<portNumber>/<databaseName> 

    Default JDBC connection strings for popular Relational Database Management Systems:

    Vendor               | Default JDBC Connection String                               | Default Port
    Microsoft SQL Server | jdbc:sqlserver://<hostname>:1433;databaseName=<databaseName> | 1433
    PostgreSQL           | jdbc:postgresql://<hostname>:5432/<databaseName>             | 5432
    MySQL                | jdbc:mysql://<hostname>:3306/<databaseName>                  | 3306
    MariaDB              | jdbc:mariadb://<hostname>:3306/<databaseName>                | 3306
    IBM DB2*             | jdbc:db2://<hostname>:50000/<databaseName>                   | 50000
    Oracle*              | jdbc:oracle:thin:@//<hostname>:1521/<serviceName>            | 1521

    *The IBM DB2 and Oracle JDBC drivers are not part of Corporate Memory by default, but can be added.

    Instead of selecting a table you can also specify a custom SQL query in the source query field.
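
As a small illustration of the general form above, the following Python sketch assembles a connection string from its parts. The jdbc_url helper and the DEFAULT_PORTS mapping are hypothetical names introduced for this example. Only vendors whose URL follows the general form are covered; Microsoft SQL Server and Oracle use vendor-specific URL syntax and are deliberately left out.

```python
# Default ports for vendors whose URL follows the general
# jdbc:<vendor>://<hostname>:<portNumber>/<databaseName> form.
DEFAULT_PORTS = {
    "postgresql": 5432,
    "mysql": 3306,
    "mariadb": 3306,
    "db2": 50000,
}


def jdbc_url(vendor, hostname, database, port=None):
    """Build a JDBC connection string, using the vendor's default port if none is given."""
    if vendor not in DEFAULT_PORTS:
        raise ValueError(f"no general-form URL known for vendor {vendor!r}")
    if port is None:
        port = DEFAULT_PORTS[vendor]
    return f"jdbc:{vendor}://{hostname}:{port}/{database}"


# Reproduces the connection URL used in this tutorial.
print(jdbc_url("mysql", "mysql", "ServicesDB"))
```

With the tutorial's values (vendor mysql, host mysql, database ServicesDB) this yields the same string as entered in the dataset dialog.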

(Re-)View your Data Table

To validate that the input data is correct, you can preview the data table in Corporate Memory.

  1. Press the Open button of the dataset whose data you want to view.
  2. Select the TABLEVIEW tab.
  3. In the TABLEVIEW tab, you can view a couple of rows to check that your data is correctly accessible.

Create a Knowledge Graph

  1. Press the Add button in the Datasets category of your workspace and select the type Knowledge Graph (embedded).
  2. Define a Name for the Knowledge Graph and provide a graph URI. All other parameters can keep the default values.
    In this example we will use:
    • Name: Service_Knowledge_Graph
    • graph: http://ld.company.org/prod-instances/

Create a Transformation

The transformation defines how an input dataset (e.g. CSV) will be transformed into an output dataset (e.g. Knowledge Graph).

  1. Press the Add button in the Transform Tasks category of your workspace.
  2. Define the Name, the Source Dataset, the Output Dataset and the needed Target Vocabularies of your Transformation Task.
    In this example we will use:
    • Name: Create_Service_Triples
    • Select the previously created dataset as the Source Dataset: Services_CSV
    • Select the previously created dataset as the Output Dataset: Service_Knowledge_Graph
    • Provide the graph (URI) that holds the registered vocabulary as the Target Vocabularies: http://ld.company.org/prod-vocab/
  3. Press the Open button of the created transformation.
  4. Click on the icon in the right main area to expand the menu.
  5. Press the EDIT button to create a base mapping.
  6. Define the Target entity type from the vocabulary, the URI pattern and a label for the mapping.

    In this example we will use:


    • Target entity type: Service

    • URI pattern: http://ld.company.org/prod-inst/{ServiceID}

      • where http://ld.company.org/prod-inst/ is a common prefix for the instances in this use case,
      • and {ServiceID} is a placeholder that will resolve to the column of that name
    • An optional Label: Service


      Example RDF triple in our Knowledge Graph based on the mapping definition:

      <http://ld.company.org/prod-inst/Y704-9764759> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ld.company.org/prod-vocab/Service> 
  7. Evaluate your mapping by pressing the button in the Examples of target data property to see at most three generated base URIs.
  8. We have now created the Service entities in the Knowledge Graph. As a next step, we will add the name of each Service entity. Press the circular blue button on the lower right and select Add value mapping.
  9. Define the Target property, the Data type, the Value path (column name) and a Label for your value mapping.

    In this example we will use:

    • Target Property: name

    • Data type: StringValueType
    • Value path: ServiceName
      • which corresponds to the column of that name
    • An optional Label: service name
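
To make the effect of the two mapping rules more concrete, the following Python sketch mimics them outside of Corporate Memory: it expands the URI pattern for one row and emits one type triple and one name triple in N-Triples syntax. This is only an illustration, not how Corporate Memory implements mappings. The helper names expand and map_row are hypothetical, the property URI http://ld.company.org/prod-vocab/name is an assumption derived from the vocabulary's graph URI, and the sketch does not URL-encode column values.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOCAB = "http://ld.company.org/prod-vocab/"
URI_PATTERN = "http://ld.company.org/prod-inst/{ServiceID}"


def expand(pattern, row):
    """Replace each {column} placeholder with the row's value for that column."""
    return re.sub(r"\{(\w+)\}", lambda m: row[m.group(1)], pattern)


def map_row(row):
    """Return the type triple and the name triple for one input row."""
    subject = expand(URI_PATTERN, row)
    # Escape backslashes and quotes so the literal stays valid N-Triples.
    name = row["ServiceName"].replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"<{subject}> <{RDF_TYPE}> <{VOCAB}Service> .",
        # Assumed property URI; check the registered vocabulary for the real one.
        f'<{subject}> <{VOCAB}name> "{name}" .',
    ]


row = {"ServiceID": "Y704-9764759", "ServiceName": "Product Analysis"}
for triple in map_row(row):
    print(triple)
```

The first printed line corresponds to the example triple of the base mapping; the second shows what the value mapping adds for the same entity.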

Evaluate a Transformation

Visit the EVALUATE tab of your transformation to view a list of generated entities. Click one of the generated entities to see more details.

Execute a Transformation to build a Knowledge Graph

  1. Go into the mapping and visit the EXECUTE tab.
  2. Press the button and validate the results. In this example, 9 Service triples were created in our Knowledge Graph based on the mapping.
  3. Finally, you can use the DataManager EXPLORE module to (re-)view the created Knowledge Graph: http://ld.company.org/prod-instances/
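
If the resulting graph is available as N-Triples text, the number of created Service entities can also be cross-checked with a few lines of Python. The sketch below assumes such an export exists (how to obtain it is not covered here), uses an inline two-entity sample in its place, and introduces the hypothetical helper count_services. The name property URI in the sample is an assumption.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SERVICE = "http://ld.company.org/prod-vocab/Service"

# Matches N-Triples lines whose subject, predicate and object are all IRIs.
IRI_TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")


def count_services(ntriples_text):
    """Count distinct subjects that are typed as Service."""
    subjects = set()
    for line in ntriples_text.splitlines():
        match = IRI_TRIPLE.match(line.strip())
        if match and match.group(2) == RDF_TYPE and match.group(3) == SERVICE:
            subjects.add(match.group(1))
    return len(subjects)


# Inline stand-in for an export of http://ld.company.org/prod-instances/.
EXPORT = """\
<http://ld.company.org/prod-inst/Y704-9764759> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ld.company.org/prod-vocab/Service> .
<http://ld.company.org/prod-inst/Y704-9764759> <http://ld.company.org/prod-vocab/name> "Product Analysis" .
<http://ld.company.org/prod-inst/I241-8776317> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ld.company.org/prod-vocab/Service> .
"""
print(count_services(EXPORT))
```

Comparing this count with the number of rows in the source dataset is a quick way to spot rows that did not produce an entity.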