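I am trying to use the LanguageTool Java API to correct misspelled words in a text file. After going through the LT documentation and https://languagetool.org/, I tried the following sample code: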
String text = "I.- Any reference _in this Section to a panicular genus or species of an anirmgl, cxccpl where the context";
JLanguageTool langTool = new JLanguageTool(Language.AMERICAN_ENGLISH);
langTool.activateDefaultPatternRules();
List<RuleMatch> matches = langTool.check(text);
for (RuleMatch match : matches) {
    // Report the position and message of each potential error
    System.out.println("Potential error at line " +
            match.getEndLine() + ", column " +
            match.getColumn() + ": " + match.getMessage());
    // Report the suggestions offered for it (may be an empty list)
    System.out.println("Suggested correction: " +
            match.getSuggestedReplacements());
}
The output is as follows:
Potential error at line 0, column 19: Possible spelling mistake found
Suggested correction: [Lin, Min, ain, bin, din, fin, gin, in, kin, min, pin, sin, tin, win, yin]
Potential error at line 0, column 41: Possible spelling mistake found
Suggested correction: []
Potential error at line 0, column 74: Possible spelling mistake found
Suggested correction: []
Potential error at line 0, column 83: Possible spelling mistake found
Suggested correction: []
Expected output:
Starting check in English (American)...
1. Line 1, column 19
Message: Possible spelling mistake found (deactivate)
Correction: in; win; bin; pin; tin; min; Lin; din; gin; kin; yin; ain; fin; sin; IN; In; Min; PIN
Context: I.- Any reference _in this Section to a panicular genus or sp...
2. Line 1, column 41
Message: Possible spelling mistake found (deactivate)
Correction: particular; funicular
Context: ...I.- Any reference _in this Section to a panicular genus or species of an anirmgl, cxccpl ...
3. Line 1, column 74
Message: Possible spelling mistake found (deactivate)
Correction: animal
Context: ...n to a panicular genus or species of an anirmgl, cxccpl where the context
4. Line 1, column 83
Message: Possible spelling mistake found (deactivate)
Context: ...nicular genus or species of an anirmgl, cxccpl where the context
Potential problems found: 4 (time: 171ms)
How you can improve LanguageTool
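I got this expected output from the LT standalone desktop application. I compared its installation folder and contents with my source code and the API jars, but found nothing special that would explain why the desktop application gives better suggestions.
Also, I would like to replace each misspelled word with the first element of its suggestion list.
Any kind of help would be greatly appreciated.
Posted on 2016-10-03 12:35:01
I was using an old version of LanguageTool. Use this one instead: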
<dependency>
    <groupId>org.languagetool</groupId>
    <artifactId>language-en</artifactId>
    <version>3.5</version>
</dependency>
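In addition, the spelling correction itself can be done by taking the misspelled word from the span match.getFromPos() to match.getToPos() and replacing it with the most suitable word from the suggestion list (it is up to the programmer to decide which word that is).
Hope this helps.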
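The replacement step described above can be sketched roughly as follows. This is only a minimal illustration that uses the plain Java standard library so it runs on its own: `SpellingFixer` and `Span` are hypothetical names, with `Span` standing in for the values you would read from each `RuleMatch` (`getFromPos()`, `getToPos()`, `getSuggestedReplacements()`). It assumes the end offset is exclusive and that the matches do not overlap, and it simply picks the first suggestion, as asked in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SpellingFixer {

    /** Hypothetical stand-in for the data read from one RuleMatch. */
    static final class Span {
        final int fromPos;              // start offset, e.g. match.getFromPos()
        final int toPos;                // end offset (assumed exclusive), e.g. match.getToPos()
        final List<String> suggestions; // e.g. match.getSuggestedReplacements()

        Span(int fromPos, int toPos, List<String> suggestions) {
            this.fromPos = fromPos;
            this.toPos = toPos;
            this.suggestions = suggestions;
        }
    }

    /** Replaces each span with its first suggestion; spans without suggestions are left alone. */
    static String applyFirstSuggestions(String text, List<Span> spans) {
        List<Span> ordered = new ArrayList<>(spans);
        // Work from the end of the text backwards so that earlier offsets
        // stay valid even when a replacement changes the text length.
        ordered.sort(Comparator.comparingInt((Span s) -> s.fromPos).reversed());

        StringBuilder sb = new StringBuilder(text);
        for (Span s : ordered) {
            if (s.suggestions.isEmpty()) {
                continue; // nothing to replace the word with
            }
            sb.replace(s.fromPos, s.toPos, s.suggestions.get(0));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "a panicular genus of an anirmgl, cxccpl where";
        List<Span> spans = Arrays.asList(
                new Span(2, 11, Arrays.asList("particular", "funicular")),
                new Span(24, 31, Arrays.asList("animal")),
                new Span(33, 39, Collections.<String>emptyList()));
        // prints: a particular genus of an animal, cxccpl where
        System.out.println(applyFirstSuggestions(text, spans));
    }
}
```

With the real library you would build the spans by looping over the list returned by `langTool.check(text)`. Taking the first suggestion blindly can still pick a wrong word, so a manual review or a smarter ranking may be worthwhile.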
https://stackoverflow.com/questions/39828618