This tutorial shows how you can build a Knowledge Graph based on input data from a Web API.

Example

Our example is based on the GitHub API (v3).

API Request: Retrieve a list of entities

The following HTTP GET request retrieves all repositories of the GitHub organization named vocol.

curl https://api.github.com/orgs/vocol/repos
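The same request can also be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names org_repos_url and fetch_json are illustrative and not part of Corporate Memory or the GitHub API. Unauthenticated requests to the GitHub API are rate limited, so this is meant for experimentation only.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def org_repos_url(org: str) -> str:
    """Build the URL that lists the repositories of an organization."""
    return f"{API_ROOT}/orgs/{org}/repos"


def fetch_json(url: str):
    """Send an HTTP GET request and decode the JSON response body."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github.v3+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (performs a real network request, so it is left commented out):
#   repos = fetch_json(org_repos_url("vocol"))
#   print([repo["name"] for repo in repos])
print(org_repos_url("vocol"))
```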

API Response

The JSON response includes the data for all repositories (mobivoc, vocol, ...). It is stored in the file repos.json.

[
    {
        ...
        "id": 22646219,
        "name": "mobivoc",
        ...
    },
    {
        ...
        "id": 22646629,
        "name": "vocol",
        ...
    },
    {
        ...
        "id": 30964669,
        "name": "scor",
        ...
    },
    ...
]
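To get a feeling for the structure of this response, the repository names can be read from it with a few lines of Python. This is only a sketch: the embedded sample contains just the two fields shown above (all elided fields are left out), and the function name repository_names is made up for this example.

```python
import json

# Excerpt of repos.json; only the fields shown above, all others are elided.
REPOS_JSON = """
[
    {"id": 22646219, "name": "mobivoc"},
    {"id": 22646629, "name": "vocol"},
    {"id": 30964669, "name": "scor"}
]
"""


def repository_names(raw: str) -> list:
    """Return the value of the "name" field of every repository object."""
    return [repo["name"] for repo in json.loads(raw)]


names = repository_names(REPOS_JSON)
print(names)
```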

Register a Web API in Corporate Memory

  1. Press the Add button in the Others category of your workspace and select the type REST operator (multi input)



  2. Define a Name and the URL for the Web API. Every other field can keep the default settings.
    Example: https://api.github.com/orgs/vocol/repos

Parse the Response

As we are only interested in the body of the HTTP response, we have to parse it first in Corporate Memory.

  1. Press the Add button in the Others category of your workspace and select the type JSON Parser operator



  2. Define a Name and the input path. Every other field can keep the default settings.
    Example: body

Add an Input Dataset

To create a mapping within Corporate Memory, we first have to register an example response from the API (repos.json). Based on the schema of this response, we can then define the mappings step by step.

  1. Press the button Resources next to your project.



  2. Upload the API response file repos.json to your project.



  3. Press the button Add in the Datasets directory of your project.



  4. Create a first Dataset of the type JSON (file). Select the previously uploaded file repos.json. Define a Name (e.g. Repose_JSON). Every other field can keep the default settings.

Create a Knowledge Graph

The Knowledge Graph is our output dataset, into which all RDF triples generated by our transformations are written.

  1. Press the button Add in the Datasets directory of your project.



  2. Create a Dataset of the type Knowledge Graph (embedded). Define a Name and a named graph URI.

    Example: http://www.eccenca.com/github_graph/

Adding Transformations

We have to create two transformations: First, select the repository names within the JSON response of the first API request. Second, transform the JSON issue data into RDF triples.

  1. Press the button Add in the Transform Tasks directory.




  2. Create a new Transformation and define a Name.


Second, retrieve all issues for each repository (mobivoc, scor, ...).

curl https://api.github.com/repos/vocol/mobivoc/issues
curl https://api.github.com/repos/vocol/scor/issues
curl https://api.github.com/repos/vocol/shopfloor/issues
...
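The requests above all follow one URL pattern, with only the repository name changing. A small Python sketch makes this loop explicit; the function name issues_url is chosen for illustration, and the list of names stands in for the values selected from the first response.

```python
def issues_url(org: str, repo: str) -> str:
    """Build the issues URL for one repository of an organization."""
    return f"https://api.github.com/repos/{org}/{repo}/issues"


# Repository names as they would be selected from the first API response.
repo_names = ["mobivoc", "scor", "shopfloor"]

urls = [issues_url("vocol", name) for name in repo_names]
for url in urls:
    print(url)
```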


The merged JSON response contains the issues of all repositories (e.g. mobivoc, MTConnect, ...).

[
    [
        {
            ...
            "number": 58,
            "title": "Fix namespace declarations at schema.mobivoc.org (mostly images)",
            "repository_url": "https://api.github.com/repos/vocol/mobivoc",
            ...
        },
        {
            ...
            "number": 53,
            "title": "Add offers to civic structures",
            "repository_url": "https://api.github.com/repos/vocol/mobivoc",
            ...
        }
        ...
    ],
    [
        {
            "number": 3,
            "title": "Determine Licence",
            "repository_url": "https://api.github.com/repos/vocol/MTConnect",
            ...
        }
    ]
    ...
]
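Note that the merged response is a list of lists: one inner list per repository. The following Python sketch walks this nested structure and groups the issue numbers by repository. It uses only the fields visible in the excerpt above, and the function name issues_by_repository is hypothetical.

```python
from collections import defaultdict

# Excerpt of the merged response; elided fields and entries are left out.
merged = [
    [
        {
            "number": 58,
            "title": "Fix namespace declarations at schema.mobivoc.org (mostly images)",
            "repository_url": "https://api.github.com/repos/vocol/mobivoc",
        },
        {
            "number": 53,
            "title": "Add offers to civic structures",
            "repository_url": "https://api.github.com/repos/vocol/mobivoc",
        },
    ],
    [
        {
            "number": 3,
            "title": "Determine Licence",
            "repository_url": "https://api.github.com/repos/vocol/MTConnect",
        },
    ],
]


def issues_by_repository(nested: list) -> dict:
    """Map each repository name to the numbers of its issues."""
    grouped = defaultdict(list)
    for issues_of_one_repo in nested:
        for issue in issues_of_one_repo:
            # The repository name is the last path segment of repository_url.
            repo = issue["repository_url"].rsplit("/", 1)[-1]
            grouped[repo].append(issue["number"])
    return dict(grouped)


grouped = issues_by_repository(merged)
print(grouped)
```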


Define the external Web API

First, we register the two API requests and an operator that parses their response bodies.

  1. Press the Add button in the Others category of your workspace and select the type REST operator (multi input)



  2. Define a Name and the URL for the Web API. Every other field can keep the default settings.
    Example: https://api.github.com/orgs/vocol/repos



  3. Create a second REST operator (multi input), which will be used for the looping API requests that retrieve the issues of each repository. Define a Name. The URL field can be left empty, as the URL is only known at runtime and will be defined later in the workflow. Every other field can keep the default settings.



  4. Finally, add an operator of the type JSON Parser to parse the body of the HTTP response. Define a Name and enter body in the input path field. Every other field can keep the default settings.
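Conceptually, the parser step takes the text found under the input path body and parses it as JSON. The sketch below mimics this idea in Python. The envelope dictionary with the keys status and body is an assumption made for illustration; it is not the actual internal format used by the REST operator, and parse_path is a made-up helper.

```python
import json

# Hypothetical envelope: the response body arrives as an unparsed JSON string.
response = {
    "status": 200,
    "body": '[{"id": 22646219, "name": "mobivoc"}]',
}


def parse_path(envelope: dict, input_path: str):
    """Pick the field named by input_path and parse its text as JSON."""
    return json.loads(envelope[input_path])


parsed = parse_path(response, "body")
print(parsed[0]["name"])
```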



Adding Datasets

We add three datasets: two JSON files, which represent the responses of the API requests, and one Knowledge Graph, into which the final result is written. The two JSON response files are required for the transformations we create in the next step.

  1. Press the button Resources next to your project.



    1. Upload the example API response files issues.json and repos.json to your project.



  2. Press the button Add in the Datasets directory of your project.



    1. Create a first Dataset of the type JSON (file). Select the previously uploaded file repos.json. Define a Name. Every other field can keep the default settings.



    2. Create a second Dataset of the type JSON (file). Select the previously uploaded file issues.json. Define a Name. Every other field can keep the default settings.



    3. Create a third Dataset of the type Knowledge Graph (embedded). Define a Name and a named graph URI. In our example, we use the URI http://www.eccenca.com/github_graph/. Every other field can keep the default settings.

Adding Transformations

We have to create two transformations: First, select the repository names within the JSON response of the first API request. Second, transform the JSON issue data into RDF triples.

  1. Create a new transformation.
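To illustrate what the second transformation is meant to produce, here is a rough Python sketch that turns one issue object into RDF triples in N-Triples syntax. It is not how Corporate Memory implements transformations. The vocabulary namespace http://example.org/vocab/, the property names title and repository, and the pattern used for the subject IRI are all assumptions made for this example; only the graph URI is taken from the tutorial.

```python
GRAPH = "http://www.eccenca.com/github_graph/"
# Hypothetical vocabulary namespace, for illustration only.
VOCAB = "http://example.org/vocab/"


def literal(value: str) -> str:
    """Format a string as an N-Triples literal, escaping \\ and \"."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def issue_to_triples(issue: dict) -> list:
    """Create two triples (title and repository) for one issue object."""
    repo = issue["repository_url"].rsplit("/", 1)[-1]
    # Assumed IRI pattern: <graph URI><repository>/issues/<number>
    subject = f"<{GRAPH}{repo}/issues/{issue['number']}>"
    return [
        f"{subject} <{VOCAB}title> {literal(issue['title'])} .",
        f"{subject} <{VOCAB}repository> <{issue['repository_url']}> .",
    ]


issue = {
    "number": 3,
    "title": "Determine Licence",
    "repository_url": "https://api.github.com/repos/vocol/MTConnect",
}

triples = issue_to_triples(issue)
for triple in triples:
    print(triple)
```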