Computationally measuring similarity of terms with 6 algorithms

March 21, 2021 Benjamin Willenbring

There many methods of determining similarity and difference between terms. None are simpler to implement than the Levenshtein edit distance – but in many ways, this algorithm is grossly insufficient, because it doesn’t take into consideration a word’s meaning or sense (at all!). For accuracy, Wu-Palmer is the all-around best.

6 Algorithms

Wu-Palmer – returns a score denoting how similar two word senses are, based on the depth of the two senses in the taxonomy and that of their least common subsumer. It weights the edges based on distance in the hierarchy. For example, jumping from inanimate to animate is a larger distance than jumping from say Felid to Canid. In some sense we can think of it as sort of edit distance, assigning type changing operations a higher cost the higher they are in the hierarchy.
Levenshtein – measures the minimum number of single character edits required to change one word into another. This is simply counting the number of string transformations needed to get from string a to string b. It does not take into consideration meaning.
Path Similarity – a score denoting how similar two word senses are, based on the shortest path that connects the senses in the is-a (hypernym/hypnoym) taxonomy. Information Content can only be computed for nouns and verbs in WordNet, since these are the only parts of speech where concepts are organized in hierarchies
Jiang-Cornath similarity – based on Resnik’s similarity; considers the information content of lowest common subsumer (lcs) and the two compared concepts.
Leacock-Chordorow similarity – uses path similarity to compute the shortest number of edges from one word sense to another word sense, assuming a hierarchical structure.
Lin similarity – based on Resnik’s similarity; considers the information content of lowest common subsumer (lcs) and the two compared concepts.

Words similar to “yell”

Using all 6 algorithms (on verbs), you can see that Wu-Palmer performs the best: scream is closest to yell – with a similarity score of 1; whereas whisper is the furthest, with a similarity score of .22. Note however: Wu-Palmer will not work with adjectives.

See this content in the original post

The nltk python code

If you’d care to play around with this, here’s the python code.

See this content in the original post

Other direct and derivational methods of gauging similarity and difference

Concordance maps
Entailments
Keywords in Context (aka: kwic)
Hypernyms
Holonyms
Markov chains
Meronyms
n-grams
Pertainyms
Word collocations