Metrics


Levenshtein Distance

  • The Levenshtein distance is one way to measure how similar two strings are: the smaller the distance, the more similar the strings.
  • It is the minimum number of single-character edits needed to convert one string into the other, where each edit is a character being
    • inserted
    • deleted
    • replaced
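
The counting described above can be sketched in plain Python with the usual dynamic-programming table, kept one row at a time. This is only an illustrative sketch, not the library's implementation; the function name `levenshtein_distance` is made up for this example.

```python
def levenshtein_distance(source: str, target: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and replacements needed to turn source into target."""
    # previous[j] = distance between the processed prefix of source
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into '' takes i deletions
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # replace s_char (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance('Rievenstein', 'Levenshtein'))
# 3
```

Here the three edits are: replace 'R' with 'L', delete 'i', and insert 'h'.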
# Requires the python-Levenshtein package
import Levenshtein

str1 = 'Rievenstein'
str2 = 'Levenshtein'

print(Levenshtein.distance(str1, str2))
# 3

# To see the individual edit operations as (operation, source index, destination index)
print(Levenshtein.editops(str1, str2))
# [('delete', 0, 0), ('replace', 1, 0), ('insert', 7, 6)]

# To get a similarity ratio between 0.0 (nothing in common) and 1.0 (identical)
print(Levenshtein.ratio(str1, str2))
# 0.8181818181818182
print(Levenshtein.ratio(str1, str1))
# 1.0
print(Levenshtein.ratio(str1, ''))
# 0.0

# To pick a representative ("median") string for a list of strings, use median.
print(Levenshtein.median([
    'Rievenstein',
    'Levenshtein',
    'Revenshtein',
    'Lievenstein',
    'Levenshtain',
    'Levennshtein',
]))
# Levenshtein
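
The ratio values shown above are consistent with the formula (len1 + len2 - d) / (len1 + len2), where d is an edit distance in which a replacement counts as two edits (one deletion plus one insertion). That reading is an assumption about the library's internals, and the helper names below are hypothetical; the sketch only shows that the formula reproduces the printed numbers.

```python
def weighted_distance(source: str, target: str) -> int:
    """Edit distance where insert and delete cost 1 and replace costs 2."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 2  # assumed replacement cost
            current.append(min(
                previous[j] + 1,         # delete
                current[j - 1] + 1,      # insert
                previous[j - 1] + cost,  # keep or replace
            ))
        previous = current
    return previous[-1]


def similarity_ratio(source: str, target: str) -> float:
    """Similarity in [0.0, 1.0] under the assumed formula."""
    total = len(source) + len(target)
    if total == 0:
        return 1.0  # two empty strings are treated as identical
    return (total - weighted_distance(source, target)) / total


print(similarity_ratio('Rievenstein', 'Levenshtein'))
# 0.8181818181818182
print(similarity_ratio('Rievenstein', 'Rievenstein'))
# 1.0
print(similarity_ratio('Rievenstein', ''))
# 0.0
```

For the first pair the weighted distance is 4 and the combined length is 22, giving 18 / 22.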


Last modified March 10, 2021