Probabilistic matching algorithms for socio-demographic data

Does anyone have studies comparing Damerau-Levenshtein and Jaro-Winkler? The only one I have so far is Waruru, 2019: ‘Where No Universal Health Care Identifier Exists: Comparison and Determination of the Utility of Score-Based Persons Matching Algorithms Using Demographic Data’.
I’d like to see others, if anyone has them.

Thanks

Found a good link on this topic: Jaro Winkler vs Levenshtein Distance | Medium

I personally found that Levenshtein distance also worked very well on names. The best choice depends on the type of data you are comparing, so it would help to know which specific socio-demographic variables you are using.
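To make the comparison concrete, here is a minimal pure-Python sketch of the two measures discussed in this thread. It is only an illustration: the function names (`osa_distance`, `osa_similarity`, `jaro`, `jaro_winkler`) are my own and not from any library, the Damerau-Levenshtein variant shown is the restricted "optimal string alignment" form, and the Jaro-Winkler prefix weight of 0.1 with a 4-character prefix cap is the commonly used default rather than anything tuned for a particular dataset.

```python
def osa_distance(s, t):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    d = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(len(s) + 1):
        d[i][0] = i
    for j in range(len(t) + 1):
        d[0][j] = j
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            # Adjacent transposition, the step plain Levenshtein lacks.
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(s)][len(t)]


def osa_similarity(s, t):
    """Scale the distance to [0, 1] so it is comparable with Jaro-Winkler."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - osa_distance(s, t) / longest


def jaro(s, t):
    """Jaro similarity in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit = [False] * len(s)
    t_hit = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_seq = [c for c, hit in zip(s, s_hit) if hit]
    t_seq = [c for c, hit in zip(t, t_hit) if hit]
    # Matched characters that appear in a different order, counted in pairs.
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s, t, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


if __name__ == "__main__":
    # Hypothetical name pairs, chosen only to show how the scores differ.
    for a, b in [("MARTHA", "MARHTA"), ("JONES", "JOHNSON"), ("SMITH", "SMYTHE")]:
        print(a, b, round(osa_similarity(a, b), 3), round(jaro_winkler(a, b), 3))
```

Running both scores over a sample of your own records (names, dates of birth, villages, and so on) is probably the quickest way to see which measure separates true matches from non-matches better for each variable. Jaro-Winkler rewards a shared prefix, which tends to suit personal names, while edit distance treats every position equally.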

Thank you very much, Toan