JavaWuzzy is a Java port of FuzzyWuzzy, used to compute how closely two strings match.
GitHub repositories:
[1] JavaWuzzy: https://github.com/xdrop/fuzzywuzzy
[2] FuzzyWuzzy: https://github.com/seatgeek/fuzzywuzzy
Adding it to a Maven project
<dependency>
Basic usage
[1] FuzzySearch.ratio(String s1, String s2)
Full-string match; sensitive to word order.
[2] FuzzySearch.partialRatio(String s1, String s2)
Search (partial) match; sensitive to word order.
[3] FuzzySearch.tokenSortRatio(String s1, String s2)
Sorts the words first, then does a full-string match; insensitive to word order (the score stays high even after the words are rearranged).
[4] FuzzySearch.tokenSortPartialRatio(String s1, String s2)
Sorts the words first, then does a search (partial) match; insensitive to word order.
[5] FuzzySearch.tokenSetRatio(String s1, String s2)
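Builds a set of the words first (dropping duplicates), then does a full-string match; insensitive to word order. Scores 100 when every word of the first string also appears in the second.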
[6] FuzzySearch.tokenSetPartialRatio(String s1, String s2)
Builds a set of the words first, then does a search (partial) match; insensitive to word order.
[7] FuzzySearch.weightedRatio(String s1, String s2)
Uses a different algorithm that combines the scorers above with weights; word order still affects the score.
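To make the difference between these scorers concrete, here is a minimal standard-library sketch. It is not JavaWuzzy's implementation: the class name RatioSketch and its helper methods are hypothetical, the full match is assumed to be 2 x LCS / (combined length) scaled to 0-100, and the partial match is simplified to a sliding window over the longer string.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.TreeSet;

// Illustrative sketch only; names and formulas are assumptions, not the library's code.
class RatioSketch {

    // Length of the longest common subsequence (classic dynamic programming).
    static int lcsLength(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                dp[i][j] = a.charAt(i - 1) == b.charAt(j - 1)
                        ? dp[i - 1][j - 1] + 1
                        : Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
        return dp[a.length()][b.length()];
    }

    // Full match: 2 * LCS / (|a| + |b|), scaled to 0-100 (assumed formula).
    static int ratio(String a, String b) {
        int total = a.length() + b.length();
        if (total == 0) return 100;
        return (int) Math.round(200.0 * lcsLength(a, b) / total);
    }

    // Partial match (simplified): best full-match score of the shorter string
    // against every equally long window of the longer string.
    static int partialRatio(String a, String b) {
        String shorter = a.length() <= b.length() ? a : b;
        String longer = a.length() <= b.length() ? b : a;
        int best = 0;
        for (int start = 0; start + shorter.length() <= longer.length(); start++) {
            String window = longer.substring(start, start + shorter.length());
            best = Math.max(best, ratio(shorter, window));
        }
        return best;
    }

    // Lowercase the input and split it into words on whitespace.
    static String[] words(String s) {
        return s.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    // Sort the words first, then do a full match.
    static int tokenSortRatio(String a, String b) {
        String[] wa = words(a);
        String[] wb = words(b);
        Arrays.sort(wa);
        Arrays.sort(wb);
        return ratio(String.join(" ", wa), String.join(" ", wb));
    }

    // Build word sets first, then compare the shared words against each side.
    static int tokenSetRatio(String a, String b) {
        TreeSet<String> setA = new TreeSet<>(Arrays.asList(words(a)));
        TreeSet<String> setB = new TreeSet<>(Arrays.asList(words(b)));
        TreeSet<String> common = new TreeSet<>(setA);
        common.retainAll(setB);
        TreeSet<String> onlyA = new TreeSet<>(setA);
        onlyA.removeAll(setB);
        TreeSet<String> onlyB = new TreeSet<>(setB);
        onlyB.removeAll(setA);
        String base = String.join(" ", common);
        String withA = (base + " " + String.join(" ", onlyA)).trim();
        String withB = (base + " " + String.join(" ", onlyB)).trim();
        return Math.max(ratio(base, withA), Math.max(ratio(base, withB), ratio(withA, withB)));
    }

    public static void main(String[] args) {
        System.out.println(ratio("goolge", "google"));                           // 83
        System.out.println(ratio("cowboys", "dallas cowboys"));                  // 67
        System.out.println(partialRatio("cowboys", "dallas cowboys"));           // 100
        System.out.println(tokenSortRatio("new york mets", "mets new york"));    // 100
        System.out.println(tokenSetRatio("new york", "new york new york city")); // 100
    }
}
```

In this sketch the full match penalizes the extra word in "dallas cowboys", the partial match does not, and the two token-based scorers ignore word order and repeated words.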
Advanced usage
[1] FuzzySearch.extractOne(String s, String[] list)
Returns the single best match.
Example: FuzzySearch.extractOne("cowboys", ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"])
(string: Dallas Cowboys, score: 90, index: 3)
[2] FuzzySearch.extractTop(String s, String[] list, int num)
Returns the num best matches.
Example: FuzzySearch.extractTop("goolge", ["google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"], 3)
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7)]
[3] FuzzySearch.extractAll(String s, String[] list)
Scores every string in list.
Example: FuzzySearch.extractAll("goolge", ["google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"])
[(string: google, score: 83, index: 0), (string: bing, score: 20, index: 1), (string: facebook, score: 29, index: 2), (string: linkedin, score: 29, index: 3), (string: twitter, score: 15, index: 4), (string: googleplus, score: 63, index: 5), (string: bingnews, score: 29, index: 6), (string: plexoogl, score: 43, index: 7)]
[4] FuzzySearch.extractAll(String s, String[] list, int score)
Scores every string in list and returns those scoring at least score.
Example: FuzzySearch.extractAll("goolge", ["google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"], 40)
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7)]
[5] FuzzySearch.extractSorted(String s, String[] list)
Scores every string in list and returns the results sorted by score, highest first.
Example: FuzzySearch.extractSorted("goolge", ["google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"])
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7), (string: facebook, score: 29, index: 2), (string: linkedin, score: 29, index: 3), (string: bingnews, score: 29, index: 6), (string: bing, score: 20, index: 1), (string: twitter, score: 15, index: 4)]
[6] FuzzySearch.extractSorted(String s, String[] list, int score)
Scores every string in list and returns those scoring at least score, sorted by score, highest first.
Example: FuzzySearch.extractSorted("goolge", ["google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"], 40)
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index: 5), (string: plexoogl, score: 43, index: 7)]
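The extract methods above all follow the same pattern: score each candidate, optionally drop the ones below a cutoff, optionally sort, optionally keep the first few. Here is a rough standard-library sketch of that pattern. The class ExtractSketch, its Result type, and the method bodies are hypothetical and are not JavaWuzzy's code. The scorer is assumed to be 2 x LCS / (combined length), and the cutoff is assumed to be inclusive.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; names and the scoring formula are assumptions.
class ExtractSketch {

    // One scored candidate: the string, its score, and its position in the input.
    static final class Result {
        final String string;
        final int score;
        final int index;

        Result(String string, int score, int index) {
            this.string = string;
            this.score = score;
            this.index = index;
        }

        @Override
        public String toString() {
            return "(string: " + string + ", score: " + score + ", index: " + index + ")";
        }
    }

    // Assumed scorer: 2 * LCS / (|a| + |b|), scaled to 0-100.
    static int ratio(String a, String b) {
        int total = a.length() + b.length();
        if (total == 0) return 100;
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                dp[i][j] = a.charAt(i - 1) == b.charAt(j - 1)
                        ? dp[i - 1][j - 1] + 1
                        : Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
        return (int) Math.round(200.0 * dp[a.length()][b.length()] / total);
    }

    // Score every choice and keep those at or above the cutoff, in input order.
    static List<Result> extractAll(String query, String[] choices, int cutoff) {
        List<Result> results = new ArrayList<>();
        for (int i = 0; i < choices.length; i++) {
            int score = ratio(query, choices[i]);
            if (score >= cutoff) {
                results.add(new Result(choices[i], score, i));
            }
        }
        return results;
    }

    // Same as extractAll, but ordered by score, highest first (ties keep input order).
    static List<Result> extractSorted(String query, String[] choices, int cutoff) {
        List<Result> results = extractAll(query, choices, cutoff);
        results.sort(Comparator.comparingInt((Result r) -> r.score).reversed());
        return results;
    }

    // Keep only the first `limit` entries of the sorted results.
    static List<Result> extractTop(String query, String[] choices, int limit) {
        List<Result> sorted = extractSorted(query, choices, 0);
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    // The single best match; rejects an empty choice list.
    static Result extractOne(String query, String[] choices) {
        if (choices.length == 0) {
            throw new IllegalArgumentException("choices must not be empty");
        }
        return extractTop(query, choices, 1).get(0);
    }

    public static void main(String[] args) {
        String[] choices = {"google", "bing", "facebook", "linkedin",
                "twitter", "googleplus", "bingnews", "plexoogl"};
        System.out.println(extractOne("goolge", choices));
        System.out.println(extractTop("goolge", choices, 3));
        System.out.println(extractSorted("goolge", choices, 40));
    }
}
```

With this assumed scorer, the "goolge" examples produce the same scores and ordering as the outputs listed above. The "cowboys" example does not carry over, since its score of 90 comes from a different scorer.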