Unpivot¤

1. Introduction¤

1.1 What is Unpivot?¤

The Unpivot operator transforms a single, wide table into a long, attribute–value table.

It takes several pivot columns (e.g. 2014, 2015, 2016) and converts them into multiple rows, each row containing:

the original non-pivot columns (e.g. product),
an attribute column indicating which pivot column the value came from,
a value column containing the cell value itself.

In other words:

Per input row, Unpivot takes the “list” of values in the pivot columns and turns it into a series of key–value entries, one per output row.

1.2 When to use Unpivot¤

Unpivot is useful when:

there are many similar columns (e.g. one column per year, month, category, metric variant),
a uniform, long table is preferred, where each row is (dimension, value) instead of “one row with many semi-redundant columns”,
all such columns should be processed in the same way downstream and a single attribute/value pair is preferable to many separate columns.

2. Conceptual Overview¤

2.1 Wide → Long¤

Input (wide):

product	2014	2015	2016
Car	100	200	300

Output (long):

product	attribute	value
Car	2014	100
Car	2015	200
Car	2016	300

Before: 1 row with 3 pivot columns.
After: 3 rows, each with attribute and value.

2.2 Pivot vs Non-pivot Columns¤

Pivot columns: the columns to unpivot (e.g. 2014, 2015, 2016).
Non-pivot columns: all other columns (e.g. product) that are carried over and repeated in each unpivoted row.

2.3 Row Explosion¤

For each input row:

the operator iterates over all pivot columns,
for each pivot column:
- it looks at the values in that cell (possibly multiple),
- for each value, it generates one output row.

Thus:

Number of output rows per input row
= sum over all pivot columns of (number of values in that cell).

Empty cells for a pivot column produce no output row for that column.

2.4 List → Map Mental Model¤

Per input row:

the values in the pivot columns form a conceptual map:

2014 → 100
2015 → 200
2016 → 300

Unpivot turns this map into a sequence of key–value entries, stored as multiple rows:

(attribute = 2014, value = 100)
(attribute = 2015, value = 200)
(attribute = 2016, value = 300)

The “map” is not stored as a single field; it is represented by these separate rows.

3. Parameters¤

3.1 `firstPivotProperty` (required)¤

Description
The path/name of the first pivot column in the range.

Usage

Used when pivotColumns is empty.
Defines the start index of the pivot range in the input schema.

Example

firstPivotProperty = "2014"

3.2 `lastPivotProperty` (optional)¤

Description
The path/name of the last pivot column in the range.

Usage

If left empty, all columns from firstPivotProperty to the last column in the schema are used as pivots.
If set, the pivot range is inclusive from firstPivotProperty to lastPivotProperty.

Examples

lastPivotProperty = "2016"  # Use 2014, 2015, 2016
lastPivotProperty = ""      # Use 2014 up to the last column in the schema

3.3 `pivotColumns` (optional)¤

Description
A comma-separated list of pivot column names/paths.

Usage

If non-empty, this list overrides the automatic range defined by firstPivotProperty and lastPivotProperty.
Suitable when:
- pivot columns are not contiguous in the schema, or
- full control over which columns are used and in which order is desired.

Example

pivotColumns = "2014,2015,2016"

Precedence

If pivotColumns is non-empty → the operator uses this list and ignores the first/last range for pivot selection.
If pivotColumns is empty → the operator uses the range defined by firstPivotProperty / lastPivotProperty.

3.4 `attributeProperty` (optional)¤

Description
The path/name of the output column used to hold the attribute.

Default

attributeProperty = "attribute"

Semantics

Contains an identifier of the pivot column for this row.
In the current implementation, this is the pivot column’s property URI as a string.

3.5 `valueProperty` (optional)¤

Description
The path/name of the output column used to hold the value.

Default

valueProperty = "value"

Semantics

Contains the cell value from the pivot column for the entity.
Stored as a string value.

3.6 Parameter Precedence Overview¤

If pivotColumns is non-empty:
- pivot columns are taken exactly from this list,
- firstPivotProperty and lastPivotProperty are ignored for pivot selection.
If pivotColumns is empty:
- pivot columns are inferred from the range:
  - firstPivotProperty → start, and
  - lastPivotProperty → inclusive end, or last schema column if empty.

4. Behaviour in Detail¤

4.1 Pivot Column Selection Logic¤

Given the input schema:

If pivotColumns is empty:
- find index of firstPivotProperty in the input schema,
- if lastPivotProperty is empty:
  - pivot indices = from firstPivotProperty index to the last column index,
- if lastPivotProperty is set:
  - find index of lastPivotProperty;
  - pivot indices = inclusive range from first to last.
If pivotColumns is non-empty:
- split by comma, trim each entry,
- for each name/path, look up its index in the input schema,
- those indices become the pivot columns.

All remaining columns (not in pivotIndices) are the non-pivot columns.

4.2 Output Schema Construction¤

The output schema:

keeps the same type URI as the input, and
defines columns in this order:
1. all non-pivot columns, in their original order, and
2. one attribute column (attributeProperty, string), and
3. one value column (valueProperty, string).

Thus the structure is:

[ non-pivot columns …, attributeProperty, valueProperty ]

4.3 Entity / Row Transformation¤

For each input entity (row):

Iterate over each pivot column index.
For each pivot column:
- evaluate the values in that cell (possibly multiple).
For each value:
- create one output entity with:
  - non-pivot column values copied from the input,
  - attributeProperty = property URI of the pivot column (as string),
  - valueProperty = that cell value (as string).

If a pivot cell is empty (no value):

no output entity is created for that (row, pivot) combination.

4.4 URI Rewriting¤

Each output entity gets a new URI that encodes the original entity plus the pivot column.

Conceptually:

outputUri = originalUri + "/" + encoded(pivotColumnIdentifier)

Example:

original URI: product1
pivot column: 2014
output URI: product1/2014

This makes each unpivoted row identify the original entity and which column it came from.

4.5 Multi-valued Cells¤

If a pivot column can have multiple values for a given entity:

each value becomes a separate output row.

Example (conceptual):

input row: product = Car, 2015 = [200, 250]
output rows (for 2015):

product	attribute	value
Car	2015	200
Car	2015	250

5. Examples¤

5.1 Simple Range-based Example¤

Input schema

Columns:

product, 2014, 2015, 2016

Input data

uri	product	2014	2015	2016
product1	Car	100	200	300
product2	Beer		200	300

(Here product2 has no value for 2014.)

Configuration

firstPivotProperty = "2014"
lastPivotProperty  = "2016"
pivotColumns       = ""         # use range
attributeProperty  = "attribute"
valueProperty      = "value"

Output schema

Columns:

product, attribute, value

Output data

uri	product	attribute	value
product1/2014	Car	2014	100
product1/2015	Car	2015	200
product1/2016	Car	2016	300
product2/2015	Beer	2015	200
product2/2016	Beer	2016	300

Notes:

product2/2014 is missing, because the 2014 cell for product2 was empty.
Non-pivot column product is copied into every output row.
attribute holds the pivot column identifier, value holds the cell value.

5.2 Open-ended Last Pivot¤

Same input as above.

Configuration

firstPivotProperty = "2014"
lastPivotProperty  = ""         # empty → to last schema column
pivotColumns       = ""

Since lastPivotProperty is empty:

the pivot range becomes 2014 → last column (2016),
the result is identical to the previous example.

5.3 Explicit `pivotColumns` List¤

Same input as above.

Configuration

firstPivotProperty = "2014"                  # ignored for pivot selection
lastPivotProperty  = "2016"                  # ignored if pivotColumns is set
pivotColumns       = "2014,2015,2016"
attributeProperty  = "attribute"
valueProperty      = "value"

Since pivotColumns is non-empty:

the operator uses exactly 2014, 2015, 2016 as pivots,
the output is again identical to the range-based examples.

This shows the three configurations are equivalent for this simple case.

6. Usage Patterns & Recommendations¤

6.1 Range vs Explicit List¤

Use range (firstPivotProperty / lastPivotProperty) when:
- pivot columns form a contiguous block in the schema,
- “from here to there” behaviour is desired,
- or “from here to the last column” is desired by leaving lastPivotProperty empty.
Use pivotColumns when:
- pivot columns are not contiguous,
- a subset of them should be used, or
- explicit control over which columns and in which order is needed.

6.2 Naming `attributeProperty` and `valueProperty`¤

Domain-specific names for these columns can often improve clarity, such as:
- year / amount,
- dimension / measure,
- metric / value.
The defaults attribute / value are generic and suitable where no specific naming is required.

6.3 Things to Watch Out For¤

If firstPivotProperty or lastPivotProperty do not match schema columns, no pivots will be found.
If pivotColumns is set incorrectly (typos, wrong names), the corresponding columns will not be unpivoted.
Empty cells in pivot columns do not produce rows: no value implies no key–value pair.
The attribute column stores the pivot column’s property URI, not an arbitrary, independent label.

7. Technical Notes (for Power Users)¤

The attribute column stores the pivot column’s property URI as a string.
The value column is typed as string, regardless of the original type; downstream components may need to re-interpret or cast values.
Output URIs follow the pattern:

<original URI>/<pivot-column-identifier>

(internally URL-encoded where needed).

This makes it easy to trace each unpivoted row back to:

the original entity, and
the pivot column it came from.

8. One-Paragraph Mental Model¤

Unpivot is a single-input operator that takes a group of “pivot” columns in a wide table and converts them into an attribute–value representation. For each input row and each selected pivot column (and each value in that cell), it creates one output row: non-pivot columns are copied, an attribute column records which pivot column this row represents (via its property URI), and a value column stores the cell’s value. URIs are rewritten to reflect the original entity plus the pivot column, and pivot columns can be defined either by a range (first/last) or by an explicit list.

9. Relation to the Pivot Operator¤

The Pivot and Unpivot operators are structurally complementary:

Unpivot takes a wide table with many pivot columns and turns it into a long table with attribute–value pairs (one row per (original row, pivot column, value)).
Pivot takes a tall table with group columns, a pivot column, and a value column and produces a wide summary table (one row per group, one column per pivot value, aggregated).

They are often used together in ETL pipelines:

Unpivot a wide table to normalize structure,
transform or filter the attribute–value data,
optionally pivot the result back into a wide summary form using aggregation.

Parameter¤

First pivot property¤

The name of the first pivot column in the range.

ID: firstPivotProperty
Datatype: string
Default Value: None

Last pivot property¤

the name of the last pivot column in the range. If left empty, all columns starting with the first pivot column are used.

ID: lastPivotProperty
Datatype: string
Default Value: None

Attribute property¤

The URI of the output column used to hold the attribute.

ID: attributeProperty
Datatype: string
Default Value: attribute

Value property¤

The URI of the output column used to hold the value.

ID: valueProperty
Datatype: string
Default Value: value

Pivot columns¤

Comma separated list of pivot column names. This property will override all inferred columns of the first two arguments.

ID: pivotColumns
Datatype: string
Default Value: None

Advanced Parameter¤

None

Unpivot¤

1. Introduction¤

1.1 What is Unpivot?¤

1.2 When to use Unpivot¤

2. Conceptual Overview¤

2.1 Wide → Long¤

2.2 Pivot vs Non-pivot Columns¤

2.3 Row Explosion¤

2.4 List → Map Mental Model¤

3. Parameters¤

3.1 firstPivotProperty (required)¤

3.2 lastPivotProperty (optional)¤

3.3 pivotColumns (optional)¤

3.4 attributeProperty (optional)¤

3.5 valueProperty (optional)¤

3.6 Parameter Precedence Overview¤

4. Behaviour in Detail¤

4.1 Pivot Column Selection Logic¤

4.2 Output Schema Construction¤

4.3 Entity / Row Transformation¤

4.4 URI Rewriting¤

4.5 Multi-valued Cells¤

5. Examples¤

5.1 Simple Range-based Example¤

5.2 Open-ended Last Pivot¤

5.3 Explicit pivotColumns List¤

6. Usage Patterns & Recommendations¤

6.1 Range vs Explicit List¤

6.2 Naming attributeProperty and valueProperty¤

6.3 Things to Watch Out For¤

7. Technical Notes (for Power Users)¤

8. One-Paragraph Mental Model¤

9. Relation to the Pivot Operator¤

Parameter¤

First pivot property¤

Last pivot property¤

Attribute property¤

Value property¤

Pivot columns¤

Advanced Parameter¤

Comments

3.1 `firstPivotProperty` (required)¤

3.2 `lastPivotProperty` (optional)¤

3.3 `pivotColumns` (optional)¤

3.4 `attributeProperty` (optional)¤

3.5 `valueProperty` (optional)¤

5.3 Explicit `pivotColumns` List¤

6.2 Naming `attributeProperty` and `valueProperty`¤