Soundex¤
This transformer plugin implements the Soundex phonetic algorithm for indexing names by their English sounds.
Description¤
This plugin is a linguistic transformer plugin. Specifically, it provides an implementation of the Soundex algorithm, also known as Soundex Indexing System.
Soundex is the simplest —and oldest— of the phonetic algorithms. This plugin includes both the original Soundex and a variation thereof known as “Refined Soundex”.
Soundex¤
The (plain) Soundex algorithm encodes or indexes an English word such as a (sur)name into the pattern
initial letter + three digits
, where the three digits represent given consonant groups. This mapping is the
following:
b, f, p, v → 1
c, g, j, k, q, s, x, z → 2
d, t → 3
l → 4
m, n → 5
r → 6
Notice that, in order to use the classical Soundex algorithm, the plugin parameter refined
needs to be set to false
.
Refined Soundex¤
The Refined Soundex algorithm is an improvement of the Soundex algorithm, without the limitations of dropping the
vocals and restricting the output to a four-digit encoding of the input. This variation of the Soundex algorithm can be
used by setting the plugin parameter refined
to true
(default). Its mapping is the following:
a, e, h, i, o, u, w, y → 0
b, p → 1
f, v → 2
c, k, s → 3
g, j → 4
q, x, z → 5
d, t → 6
l → 7
m, n → 8
r → 9
Examples¤
Soundex¤
We can get an idea of the output of the Soundex algorithm using an online Soundex Converter such as https://www.mainegenealogy.net/soundex_converter.asp.
robert
andrupert
lead to the same Soundex index:r163
.euler
leads toe460
,gauss
isg200
andhilbert
corresponds toh416
.
Refined Soundex¤
braz
andbroz
lead to the same Refined Soundex index:b1905
.caren
,carren
,coram
,corran
,curreen
andcurwen
are all encoded withc30908
.hairs
,hark
,hars
,hayers
,heers
andhiers
are all mapped toh093
.- All sorts of variations of
lambard
, such aslambart
,lambert
,lambird
orlampaert
, lead tol7081096
.
Related plugins¤
Other phonetic algorithms usually associated with Soundex are the variations or improvements implemented by the
NYSIIS
and Metaphone
algorithms. The corresponding linguistic transformer plugins
are named accordingly.
Parameter¤
Refined¤
No description
- ID:
refined
- Datatype:
boolean
- Default Value:
true
Advanced Parameter¤
None