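I am looking for a way to output the match percentage between two strings (for example, names), while also accounting for the case where they are the same but the words appear in a different order. I tried SequenceMatcher(), but the results are only partly satisfactory: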
from difflib import SequenceMatcher
a = "john doe"
b = "jon doe"
c = "doe john"
d = "jon d"
e = 'john do'
s = SequenceMatcher(None, a, b)
s.ratio()
0.9333333333333333
s = SequenceMatcher(None, a, c)
s
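
One possible way to make the comparison insensitive to word order is to sort the words of each string before handing them to SequenceMatcher. The following is only a minimal sketch under that assumption; the helper name token_sort_ratio is hypothetical and not part of difflib:

```python
from difflib import SequenceMatcher

def token_sort_ratio(x, y):
    """Similarity in [0, 1] that ignores word order (hypothetical helper)."""
    # Lowercase, split on whitespace, sort the words, and rejoin them,
    # so "doe john" and "john doe" normalize to the same string.
    norm_x = " ".join(sorted(x.lower().split()))
    norm_y = " ".join(sorted(y.lower().split()))
    return SequenceMatcher(None, norm_x, norm_y).ratio()

a = "john doe"
b = "jon doe"
c = "doe john"

print(token_sort_ratio(a, c))  # 1.0: same words, different order
print(token_sort_ratio(a, b))  # below 1.0: one word differs by a letter
```

Sorting the tokens only handles reordering. Partial strings such as "jon d" would still score according to how many characters they share with the other string.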
Diff function on two arrays (or how to turn Old into New)
Example
One[]={2,3,4,5,6,7}
Two[]={1,2,3,5,5,5,9}
Example Result
Diff: insert 1 into One[0], One[]={1,2,3,4,5,6,7}
Diff: delete 4 from One[3], One[]={1,2,3,5,6,7}
Diff: modify 6 into 5 in One[4], One[]={1,2,3,5,5,7}
Diff: modify 7 into 5 i
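
As a rough illustration of the idea, Python's difflib can produce a list of edit operations that turn one sequence into another. This is a sketch, not necessarily the algorithm the question has in mind, and the operations it reports may be grouped differently from the hand-written steps above. The function name apply_opcodes is hypothetical:

```python
from difflib import SequenceMatcher

one = [2, 3, 4, 5, 6, 7]
two = [1, 2, 3, 5, 5, 5, 9]

def apply_opcodes(old, new):
    """Rebuild `new` from `old` by replaying SequenceMatcher edit operations."""
    result = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: copy it from the old sequence.
            result.extend(old[i1:i2])
        elif tag in ("replace", "insert"):
            # New material comes from the new sequence.
            result.extend(new[j1:j2])
        # For "delete", old[i1:i2] is simply not copied.
        print(tag, old[i1:i2], "->", new[j1:j2])
    return result

rebuilt = apply_opcodes(one, two)
print(rebuilt == two)  # True
```

Each opcode is a tuple (tag, i1, i2, j1, j2), where tag is one of "equal", "replace", "delete", or "insert", and the index pairs mark the affected slices of the old and new sequences.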
I am using a list of Buenos Aires streets as my corpus:
av. de mayo
av. del libertador
av. diaz velez
Some of the tender location fields contain text like the following:
of. de compras hosp. c. durand (diaz velez 5044) c.a.b.a
av. de mayo 525, planta baja, oficina 11, ciudad de buenos aires
oficina de compras - av. diaz velez 5044 - cap. fed. -
I am reading this book because it has a "location extraction" section, which I implemented. The problem with this code is that the corpus
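A rotation means that one string is created by shifting another string to the right by one or more positions. For example, abc and cab are rotations of each other, while abcd and bacd are not. I wrote the code below, but it fails the last test case (I don't know what it is). Can anyone give me some hints about what went wrong? Or is there a more efficient algorithm: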
int isLetterInWord(char c, char* word) // find the first letter in word that is equal to c
{
    int len = strlen(word);
    for (int i = 0; i < len; ++i)