Parse geo location¤
What does this plugin do?¤
This plugin parses and normalizes geolocations.
What is Geolocation?¤
The term “geolocation” means two things. Usually, it is used primarily as a noun referring to the process or technique of determining the physical location of a person or an item, e.g. by means of GPS or IP addresses. On the other hand – and this is the meaning of the word that we‘ll use – it is also used to describe the actual location that is identified.
In short: For our purposes, the term “geolocation” won’t be the act of geolocating, but the outcome of this act: the geolocation, the physical or actual location.
This plugin takes thus a given geolocation and normalizes it into a standard-conforming geolocation. This is the same principle as in the parsing and normalization of geocoordinates, but with the difference that, now, the input isn’t numerical, but textual. A geolocation is, in our case, one of the following input possibilities: A continent, a country, a state or a city.
How does this plugin work?¤
This plugin uses lookup-tables as a mechanism for parsing and normalizing geolocations. Rather than converting the input via a prescribed formula, as in the case of geocoordinates, a tabular dataset provides the standard output for the given input. Which tabular dataset is used, depends on the parameter values.
Parameter values¤
The parameter parseTypeId tells the plugin which parser to use, as well as which normalization logic to apply to the parsed result.
The parseTypeId must be one of the following:
continentcountrynumericCountrystatecity
Description of parameter values¤
Continent¤
The continent parameter refers to a continent code. Although these contintnet codes are not standardized, the de-facto standard is the following table:
| Continent | Code | Notes |
|---|---|---|
| Africa | AF | Alpha-2 style |
| Antarctica | AN | |
| Asia | AS | |
| Europe | EU | |
| North America | NA | |
| Oceania | OC | We don’t use the alternative UA |
| South America | SA |
Country and Numeric Country¤
Both country and numericCountry refer to the ISO 3166-1 alpha-3 country code, such as AND for Andorra, as can be found in the list of officially assigned ISO 3166-1 alpha-3 codes.
State¤
For the state parse type ID, the behavior of the plugin depends on the value of the second parameter, fullStateName. This parameter controls whether the plugin returns the state name (if set to true) or the state code.
The underlying lookup table is the GeoNames database, a widely used geospatial dataset. Specifically, the plugin uses its country-specific dump. In this dataset, each row represents a geographical feature: a city, village, administrative region, mountain, river, etc.
For concreteness, an exemplary row in this dataset is the following:
3039162 Sant Julià de Lòria Sant Julia de Loria Parochia de Sant Julia de Loria,Parochia de Sant Julià de Lòria,Parroquia de Sant Julia de Loria,Parroquia de Sant Julià de Lòria,San Dzhulija de Lorija,San Julian de Loria,San Julián de Loria,San Zhulija de Lorija,San-Zhulia-de-Lorija,Sant Chulija de Lorija,Sant Julia de Loria,Sant Julia de Loria koezoesseg,Sant Julià de Loria,Sant Julià de Lòria,Sant Julià de Lòria közösség,Sant Zhulija de Lorija,Sant-Zhulija-de-Lorija,sanjulliadelolia,sant jwlya dr lwrya,sant jwlya dy lwrya,santa juliya di loriya palli,sheng hu li ya-de luo li ya,sn gwlyh dh lwryh,snt hwlya dh lwrya,Сан Джулия де Лория,Сан Жулија де Лорија,Сан-Жулиа-де-Лория,Сант Жулија де Лорија,Сант-Жулія-де-Лорія,Սանթ Ժուլիա դե Լորիա,סן גוליה דה לוריה,سانت جوليا دي لوريا,سانت جولیا در لوریا,سنت حولیا ده لوریا,संत जूलिया डी लोरिया पल्ली,სანტ-ჟულია-დე-ლორია,サン・ジュリア・デ・ロリア教区,圣胡利娅-德洛里亚,산줄리아데로리아 42.46247 1.48247 A ADM1 AD 06 9448 1143 Europe/Andorra 2014-07-30
Each row in the dataset, such as the previous one, contains the following information:
| Column | Description | Example (Sant Julià de Lòria, Andorra) |
|---|---|---|
| geonameid | Unique identifier in the GeoNames database | 3039162 |
| name | Official name of the feature | Sant Julià de Lòria |
| asciiname | ASCII-only version of the name | Sant Julia de Loria |
| alternatenames | Comma-separated list of alternate names in different languages | Parochia de Sant Julia de Loria, Parochia de Sant Julià de Lòria, Parroquia de Sant Julia de Loria, … |
| latitude | Latitude coordinate | 42.46247 |
| longitude | Longitude coordinate | 1.48247 |
| feature class | Broad type of feature | A |
| feature code | More detailed classification | ADM1 |
| country code | ISO 3166-1 alpha-2 country code | AD |
| cc2 | Secondary country codes | (empty) |
| admin1 code | First-level subdivision code | 06 |
| admin2 code | Second-level subdivision code | (empty) |
| admin3 code | Third-level subdivision | (empty) |
| admin4 code | Fourth-level subdivision | (empty) |
| population | Population of the feature | 9448 |
| elevation | Elevation in meters | (empty) |
| dem | Digital elevation model value | 1143 |
| timezone | IANA timezone ID | Europe/Andorra |
| modification date | Date of last update in GeoNames | 2014-07-30 |
In our case, the state code corresponds to the admin1 code of the GeoNames country-specific dataset (11th column), whereas the name corresponds to the name column (2nd column).
City¤
As a last level of geospatial precision, we have the city. Currently, this does nothing, really. The plugin returns the same input as output.
Examples¤
Notation: List of values are represented via square brackets. Example: [first, second] represents a list of two values “first” and “second”.
Example 1:
-
Parameters
- parseTypeId:
continent
- parseTypeId:
-
Input values:
[Africa]
-
Returns:
[AF]
Example 2:
-
Parameters
- parseTypeId:
continent
- parseTypeId:
-
Input values:
[Europe]
-
Returns:
[EU]
Example 3:
-
Parameters
- parseTypeId:
continent
- parseTypeId:
-
Input values:
[Oceania]
-
Returns:
[OC]
Example 4:
-
Parameters
- parseTypeId:
continent
- parseTypeId:
-
Input values:
[AF]
-
Returns:
[AF]
Example 5:
-
Parameters
- parseTypeId:
continent
- parseTypeId:
-
Input values:
[EU]
-
Returns:
[EU]
Example 6:
-
Parameters
- parseTypeId:
continent
- parseTypeId:
-
Input values:
[OC]
-
Returns:
[OC]
Example 7:
-
Parameters
- parseTypeId:
country
- parseTypeId:
-
Input values:
[Andorra, AND]
-
Returns:
[AND, AND]
Example 8:
-
Parameters
- parseTypeId:
country
- parseTypeId:
-
Input values:
[Germany, DEU]
-
Returns:
[DEU, DEU]
Example 9:
-
Parameters
- parseTypeId:
country
- parseTypeId:
-
Input values:
[United States of America]
-
Returns:
[United States of America]
Example 10:
-
Parameters
- parseTypeId:
country
- parseTypeId:
-
Input values:
[USA]
-
Returns:
[USA]
Example 11:
-
Parameters
- parseTypeId:
state - fullStateName:
false
- parseTypeId:
-
Input values:
[Sant Julià de Lòria]
-
Returns:
[06]
Example 12:
-
Parameters
- parseTypeId:
state - fullStateName:
true
- parseTypeId:
-
Input values:
[Sant Julià de Lòria]
-
Returns:
[Sant Julià de Lòria]
Example 13:
-
Parameters
- parseTypeId:
state - fullStateName:
false
- parseTypeId:
-
Input values:
[nonExistentStateHere, asdf]
-
Returns:
[nonExistentStateHere, asdf]
Example 14:
-
Parameters
- parseTypeId:
city
- parseTypeId:
-
Input values:
[Leipzig, London, Paris, Barcelona]
-
Returns:
[Leipzig, London, Paris, Barcelona]
Parameter¤
Parse type id¤
What type of location should be parsed.
- ID:
parseTypeId - Datatype:
enumeration - Default Value:
None
Full state name¤
Set to true if the full state name should be output instead of the 2-letter code.
- ID:
fullStateName - Datatype:
boolean - Default Value:
true
Advanced Parameter¤
None