Various methods for fuzzy matching of strings in Python, including:
.
- String similarity: Gives a measure of string similarity between 0 and 100.
- Partial string similarity: Inconsistent substrings are a common problem
when string matching. To get around it, use a "best partial" heuristic
when two strings are of noticeably different lengths.
- Token sort: This approach involves tokenizing the string in question,
sorting the tokens alphabetically, and then joining them back into a
string.
- Token set: A slightly more flexible approach. Tokenize both strings, but
instead of immediately sorting and comparing, split the tokens into two
groups: intersection and remainder.
Installed Size: 56.3 kB
Architectures: all