JavaWuzzy is a Java port of FuzzyWuzzy, a library for computing how closely two strings match.
GitHub repositories:
[1] JavaWuzzy: https://github.com/xdrop/fuzzywuzzy
[2] FuzzyWuzzy: https://github.com/seatgeek/fuzzywuzzy

Adding it to a Maven project

<dependency>
    <groupId>me.xdrop</groupId>
    <artifactId>fuzzywuzzy</artifactId>
    <version>1.1.10</version>
</dependency>

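Basic usage

Each method returns an integer score from 0 to 100; a higher score means a closer match.

[1] FuzzySearch.ratio(String s1, String s2)
Full match of the two strings; sensitive to word order.

[2] FuzzySearch.partialRatio(String s1, String s2)
Search match (partial match): scores the best-matching part of the longer string against the shorter one; sensitive to word order.

[3] FuzzySearch.tokenSortRatio(String s1, String s2)
Sorts the words first, then performs a full match; not sensitive to word order (the score stays high after the words are rearranged).

[4] FuzzySearch.tokenSortPartialRatio(String s1, String s2)
Sorts the words first, then performs a search match (partial match); not sensitive to word order.

[5] FuzzySearch.tokenSetRatio(String s1, String s2)
Reduces each string to its set of words (duplicates removed), then performs a full match; not sensitive to word order. The score is 100 when every word of one string also appears in the other.

[6] FuzzySearch.tokenSetPartialRatio(String s1, String s2)
Reduces each string to its set of words, then performs a search match (partial match); not sensitive to word order.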

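[7] FuzzySearch.weightedRatio(String s1, String s2)
Uses a different algorithm: it computes several of the ratios above, scales each by a weight that depends on the relative lengths of the two strings, and returns the best result. Because the token-based ratios are among the candidates, rearranged words are penalized only mildly.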
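
To make the differences between these ratios concrete, here is a minimal sketch in plain Java. It is not JavaWuzzy's source code: the class name RatioSketch and all of its methods are hypothetical, and the scoring formula (twice the longest common subsequence divided by the combined length) and the sliding-window partial match are simplifying assumptions chosen to illustrate the ideas.

```java
import java.util.Arrays;
import java.util.TreeSet;

class RatioSketch {

    // Length of the longest common subsequence of a and b (two-row DP).
    static int lcs(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                cur[j] = a.charAt(i - 1) == b.charAt(j - 1)
                        ? prev[j - 1] + 1
                        : Math.max(prev[j], cur[j - 1]);
            }
            int[] tmp = prev; prev = cur; cur = tmp; // finished row becomes prev
        }
        return prev[b.length()];
    }

    // Full match, order-sensitive: 100 * 2 * LCS / (len(a) + len(b)), rounded.
    // Assumption of this sketch: two empty strings count as identical.
    static int ratio(String a, String b) {
        int total = a.length() + b.length();
        return total == 0 ? 100 : (int) Math.round(200.0 * lcs(a, b) / total);
    }

    // Partial match: best ratio of the shorter string against every
    // window of the same length inside the longer string.
    static int partialRatio(String a, String b) {
        String shorter = a.length() <= b.length() ? a : b;
        String longer = a.length() <= b.length() ? b : a;
        int best = 0;
        for (int i = 0; i + shorter.length() <= longer.length(); i++) {
            String window = longer.substring(i, i + shorter.length());
            best = Math.max(best, ratio(shorter, window));
        }
        return best;
    }

    // Lower-case, split on whitespace, sort the words, join them again.
    static String sortTokens(String s) {
        String[] tokens = s.trim().toLowerCase().split("\\s+");
        Arrays.sort(tokens);
        return String.join(" ", tokens);
    }

    // Sort the words first, then do a full match: word order no longer matters.
    static int tokenSortRatio(String a, String b) {
        return ratio(sortTokens(a), sortTokens(b));
    }

    // Compare the shared words with each string's leftover words and keep
    // the best score; a string whose words are a subset of the other's gets 100.
    static int tokenSetRatio(String a, String b) {
        TreeSet<String> ta = new TreeSet<>(Arrays.asList(sortTokens(a).split(" ")));
        TreeSet<String> tb = new TreeSet<>(Arrays.asList(sortTokens(b).split(" ")));
        TreeSet<String> common = new TreeSet<>(ta);
        common.retainAll(tb);
        ta.removeAll(common);
        tb.removeAll(common);
        String base = String.join(" ", common);
        String withA = (base + " " + String.join(" ", ta)).trim();
        String withB = (base + " " + String.join(" ", tb)).trim();
        return Math.max(ratio(withA, withB),
                Math.max(ratio(base, withA), ratio(base, withB)));
    }

    public static void main(String[] args) {
        System.out.println(ratio("fuzzy wuzzy", "wuzzy fuzzy"));
        System.out.println(tokenSortRatio("fuzzy wuzzy", "wuzzy fuzzy"));
        System.out.println(partialRatio("york", "new york mets"));
        System.out.println(tokenSetRatio("new york", "new york mets"));
    }
}
```

In this sketch, ratio("goolge", "google") evaluates to 83, and tokenSortRatio gives 100 for "fuzzy wuzzy" against "wuzzy fuzzy" while plain ratio does not. Scores from the real library may differ for other inputs.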

Advanced usage

[1] FuzzySearch.extractOne(String s, Collection<String> choices)
Extracts the single choice with the highest score.

Example: FuzzySearch.extractOne("cowboys", Arrays.asList("Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"))
(string: Dallas Cowboys, score: 90, index: 3)

[2] FuzzySearch.extractTop(String s, Collection<String> choices, int num)
Extracts the num choices with the highest scores.

Example: FuzzySearch.extractTop("goolge", Arrays.asList("google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"), 3)
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7)]

[3] FuzzySearch.extractAll(String s, Collection<String> choices)
Scores every string in choices and returns the results in their original order.

Example: FuzzySearch.extractAll("goolge", Arrays.asList("google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"));
[(string: google, score: 83, index: 0), (string: bing, score: 20, index: 1), (string: facebook, score: 29, index: 2), (string: linkedin, score: 29, index: 3), (string: twitter, score: 15, index: 4), (string: googleplus, score: 63, index: 5), (string: bingnews, score: 29, index: 6), (string: plexoogl, score: 43, index: 7)]

[4] FuzzySearch.extractAll(String s, Collection<String> choices, int score)
Scores every string in choices and returns only those that reach the given score, in their original order.

Example: FuzzySearch.extractAll("goolge", Arrays.asList("google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"), 40)
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7)]

[5] FuzzySearch.extractSorted(String s, Collection<String> choices)
Scores every string in choices and returns the results sorted from highest to lowest score.

Example: FuzzySearch.extractSorted("goolge", Arrays.asList("google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"));
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7), (string: facebook, score: 29, index: 2), (string: linkedin, score: 29, index: 3), (string: bingnews, score: 29, index: 6), (string: bing, score: 20, index: 1), (string: twitter, score: 15, index: 4)]

[6] FuzzySearch.extractSorted(String s, Collection<String> choices, int score)
Scores every string in choices and returns only those that reach the given score, sorted from highest to lowest score.

Example: FuzzySearch.extractSorted("goolge", Arrays.asList("google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"), 40);
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7)]
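
The extraction methods above all follow the same pattern: score each candidate, optionally filter by a cutoff, optionally sort, and optionally truncate. The following plain-Java sketch shows that pattern under stated assumptions. ExtractSketch, Result, and the stand-in ratio scorer are hypothetical names invented for this illustration, not JavaWuzzy classes, and the real library scores with its own algorithm, so its numbers can differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntBiFunction;

class ExtractSketch {

    // One scored choice: the string, its score, and its position in the input.
    static final class Result {
        final String string;
        final int score;
        final int index;

        Result(String string, int score, int index) {
            this.string = string;
            this.score = score;
            this.index = index;
        }

        @Override
        public String toString() {
            return "(string: " + string + ", score: " + score + ", index: " + index + ")";
        }
    }

    // Stand-in scorer (an assumption): 100 * 2 * LCS / combined length, rounded.
    static int ratio(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                dp[i][j] = a.charAt(i - 1) == b.charAt(j - 1)
                        ? dp[i - 1][j - 1] + 1
                        : Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
        int total = a.length() + b.length();
        return total == 0 ? 100 : (int) Math.round(200.0 * dp[a.length()][b.length()] / total);
    }

    // Score every choice in input order; keep those with score >= cutoff.
    static List<Result> extractAll(String query, List<String> choices,
                                   ToIntBiFunction<String, String> scorer, int cutoff) {
        List<Result> out = new ArrayList<>();
        for (int i = 0; i < choices.size(); i++) {
            int score = scorer.applyAsInt(query, choices.get(i));
            if (score >= cutoff) {
                out.add(new Result(choices.get(i), score, i));
            }
        }
        return out;
    }

    // Like extractAll, but ordered from highest to lowest score.
    // List.sort is stable, so ties keep their input order.
    static List<Result> extractSorted(String query, List<String> choices,
                                      ToIntBiFunction<String, String> scorer, int cutoff) {
        List<Result> out = extractAll(query, choices, scorer, cutoff);
        out.sort((x, y) -> Integer.compare(y.score, x.score));
        return out;
    }

    // The `limit` best-scoring choices.
    static List<Result> extractTop(String query, List<String> choices,
                                   ToIntBiFunction<String, String> scorer, int limit) {
        List<Result> sorted = extractSorted(query, choices, scorer, 0);
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    // The single best-scoring choice; this sketch assumes choices is non-empty.
    static Result extractOne(String query, List<String> choices,
                             ToIntBiFunction<String, String> scorer) {
        return extractSorted(query, choices, scorer, 0).get(0);
    }

    public static void main(String[] args) {
        List<String> choices = Arrays.asList("google", "bing", "facebook", "linkedin",
                "twitter", "googleplus", "bingnews", "plexoogl");
        System.out.println(extractOne("goolge", choices, ExtractSketch::ratio));
        System.out.println(extractTop("goolge", choices, ExtractSketch::ratio, 3));
        System.out.println(extractSorted("goolge", choices, ExtractSketch::ratio, 40));
    }
}
```

Passing the scorer as a parameter keeps the extraction logic independent of the scoring algorithm, so any of the ratios from the basic usage section could be plugged in.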

References

https://blog.csdn.net/sunyao_123/article/details/76942809
