NLTK for Python 3: Version Change Notes

数据饕餮
Published 2019-01-14 16:05:34

1. Here are some changes you may need to make:

  • grammar: ContextFreeGrammar → CFG, WeightedGrammar → PCFG, StatisticalDependencyGrammar → ProbabilisticDependencyGrammar, WeightedProduction → ProbabilisticProduction
  • draw.tree: TreeSegmentWidget.node() → TreeSegmentWidget.label(), TreeSegmentWidget.set_node() → TreeSegmentWidget.set_label()
  • parsers: nbest_parse() → parse()
  • ccg.parse.chart: EdgeI.next() → EdgeI.nextsym()
  • Chunk parser: top_node → root_label; chunk_node → chunk_label
  • WordNet properties are now access methods, e.g. Synset.definition → Synset.definition()
  • sem.relextract: mk_pairs() → _tree2semi_rel(), mk_reldicts() → semi_rel2reldict(), show_clause() → clause(), show_raw_rtuple() → rtuple()
  • corpusname.tagged_words(simplify_tags=True) → corpusname.tagged_words(tagset='universal')
  • util.clean_html() → BeautifulSoup.get_text(). clean_html() has been dropped; install and use BeautifulSoup or some other HTML parser instead.
  • util.ibigrams() → util.bigrams()
  • util.ingrams() → util.ngrams()
  • util.itrigrams() → util.trigrams()
  • metrics.windowdiff → metrics.segmentation.windowdiff(); metrics.windowdiff.demo() was removed.
  • parse.generate2 was rewritten and merged into parse.generate
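
The renames above lend themselves to a quick mechanical check of an old code base. The following is a minimal sketch, not part of NLTK: the helper name find_old_names and the RENAMES table are hypothetical, and the table holds only a subset of the names listed above. It matches plain text, so it will also flag names that appear in comments or strings.

```python
import re

# Old name -> new name, copied from the list above (a subset, for illustration).
RENAMES = {
    "ContextFreeGrammar": "CFG",
    "WeightedGrammar": "PCFG",
    "StatisticalDependencyGrammar": "ProbabilisticDependencyGrammar",
    "WeightedProduction": "ProbabilisticProduction",
    "nbest_parse": "parse",
    "ibigrams": "bigrams",
    "itrigrams": "trigrams",
    "ingrams": "ngrams",
}

# \b anchors each match at identifier boundaries, so a longer identifier
# such as my_nbest_parse_helper is not flagged.
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def find_old_names(source):
    """Return (line_number, old_name, new_name) for each old name found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


# Made-up NLTK 2 style snippet used only as input text; it is never executed.
sample = """from nltk.grammar import ContextFreeGrammar
trees = parser.nbest_parse(tokens)
pairs = list(ibigrams(tokens))
"""

for lineno, old, new in find_old_names(sample):
    print(f"line {lineno}: {old} -> {new}")
```

Entries that change more than a name, such as the simplify_tags=True → tagset='universal' argument or the switch from WordNet properties to method calls, still need to be reviewed by hand.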

2. Creating objects from strings:

Many objects now support a fromstring() method:

  • tree.Tree.parse() → tree.Tree.fromstring()
  • tree.Tree() → tree.Tree.fromstring()
  • chunk.RegexpChunkRule.parse() → chunk.RegexpChunkRule.fromstring()
  • grammar.parse_cfg() → CFG.fromstring() (same for other types of grammar)
  • sem.LogicParser.parse() → sem.Expression.fromstring()
  • sem.DrtParser.parse() → sem.DrtExpression.fromstring()
  • sem.parse_valuation() → sem.Valuation.fromstring()
  • sem.parse_type() → sem.Type.fromstring()

Operations on lists of sentences or other items:

  • tokenize.batch_tokenize() → tokenize.tokenize_sents()
  • tag.batch_tag() → tag.tag_sents()
  • parse.batch_parse() → parse.parse_sents()
  • classify.batch_classify() → classify.classify_many()
  • sem.batch_interpret() → sem.interpret_sents()
  • sem.batch_evaluate() → sem.evaluate_sents()
  • chunk.batch_ne_chunk() → chunk.ne_chunk_sents()

Changes in probability.FreqDist:

  • fdist.keys() → sorted(fdist)
  • fdist.inc(x) → fdist[x] += 1
  • fdist.samples() → sorted(fdist.keys())
  • fdist.Nr(r) → fdist.Nr()[r]
  • fdist.Nr_nonzero() → fdist.Nr().items()
  • cfdist.conditions() → sorted(cfdist.conditions())

Porter stemmer changes:

  • adjust_case(), cons(), cvc(), doublec(), m(), step1ab(), step1c(), step2(), step3(), step4(), step5(), vowelinstem() were made private
  • ends(), r(), setto() were removed
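
The FreqDist replacements listed above can be tried without installing NLTK. The sketch below uses collections.Counter as a stand-in, on the assumption that NLTK 3's FreqDist behaves like a Counter for these operations; the variable names and the sample sentence are made up for illustration.

```python
from collections import Counter

tokens = "the cat sat on the mat the end".split()

# Stand-in for nltk.probability.FreqDist (assumption: Counter behaves the
# same way for the operations shown here).
fdist = Counter()
for token in tokens:
    fdist[token] += 1      # replaces fdist.inc(token)

samples = sorted(fdist)    # replaces fdist.keys() and fdist.samples()

# Plain-Python version of a "how many samples were seen r times" table,
# which is what the list above refers to as fdist.Nr().
count_of_counts = Counter(fdist.values())

print(samples)             # alphabetical order, not frequency order
print(fdist["the"])        # number of times "the" was counted
print(count_of_counts[1])  # number of samples that occur exactly once
```

Note that sorted(fdist) orders the samples alphabetically, so code that relied on a particular ordering from the old keys() call should be checked separately.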

3. Removed modules, classes and functions:

  • classify.svm was removed. For classification based on support vector machines (SVMs), use classify.scikitlearn or scikit-learn directly. See https://github.com/nltk/nltk/issues/450.
  • The probability.GoodTuringProbDist class was removed. See https://github.com/nltk/nltk/issues/381.
  • HiddenMarkovModelTaggerTransformI and its subclasses were removed. See https://github.com/nltk/nltk/issues/374.
  • classify.maxent no longer supports algorithms backed by scipy.maxentropy. See https://github.com/nltk/nltk/issues/321.
  • misc.babelfish was removed. See https://github.com/nltk/nltk/issues/265.
  • sourcedstring was removed. See https://github.com/nltk/nltk/issues/322.
  • yamltags was removed. JSON is now preferred instead. See https://github.com/nltk/nltk/issues/540
  • mallet was removed, including the tag.crf module. See https://github.com/nltk/nltk/issues/104
  • tag.simplify was removed. See https://github.com/nltk/nltk/issues/483
  • model was removed. See https://github.com/nltk/nltk/issues?labels=model
  • corpus.reader.wordnet._lcs_by_depth was removed. See https://github.com/nltk/nltk/issues/422.
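
Imports of the removed modules can be located before running the code. A rough sketch using the standard ast module follows; the function names are hypothetical, the module paths are the ones named above with an assumed nltk. prefix, and removed classes such as GoodTuringProbDist are not covered.

```python
import ast

# Modules the list above reports as removed, written with an "nltk." prefix.
REMOVED_MODULES = {
    "nltk.classify.svm",
    "nltk.misc.babelfish",
    "nltk.sourcedstring",
    "nltk.yamltags",
    "nltk.tag.crf",
    "nltk.tag.simplify",
    "nltk.model",
}


def removed_module_for(name):
    """Return the removed module that `name` equals or lives under, else None."""
    for module in REMOVED_MODULES:
        if name == module or name.startswith(module + "."):
            return module
    return None


def removed_imports(source):
    """Return sorted (line_number, module) pairs for imports of removed modules."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from nltk.classify import svm" names the module via the alias,
            # so both the package and package.alias are checked.
            candidates = [node.module]
            candidates += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in candidates:
            module = removed_module_for(name)
            if module is not None:
                found.add((node.lineno, module))
    return sorted(found)


# Made-up input text; it is parsed, never executed.
sample = """import nltk
from nltk.classify import svm
from nltk.model.ngram import NgramModel
import nltk.yamltags
from nltk import tokenize
"""

for lineno, module in removed_imports(sample):
    print(f"line {lineno}: imports removed module {module}")
```

Because the source is parsed rather than executed, the check works even when the installed NLTK no longer contains these modules.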

4. Miscellaneous changes:

  • probability.ConditionalProbDist.default_factory now inherits from dict instead of defaultdict
  • probability.ConditionalProbDistI.default_factory now inherits from dict instead of defaultdict
  • probability.DictionaryConditionalProbDist.default_factory now inherits from dict instead of defaultdict
  • tag.senna.SennaTagger → classify.Senna
  • tag.senna.POSTagger → tag.SennaTagger
  • tag.senna.CHKTagger → tag.SennaChunkTagger

5. Printing changes (from 3.0.2, see https://github.com/nltk/nltk/issues/804):

  • classify.decisiontree.DecisionTreeClassifier.pp → pretty_format
  • metrics.confusionmatrix.ConfusionMatrix.pp → pretty_format
  • sem.lfg.FStructure.pprint → pretty_format
  • sem.drt.DrtExpression.pretty → pretty_format
  • parse.chart.Chart.pp → pretty_format
  • Tree.pprint() → pformat
  • FreqDist.pprint → pformat
  • Tree.pretty_print → pprint
  • Tree.pprint_latex_qtree → pformat_latex_qtree

Environment variables for third-party software:

  • These have been normalised; please see Installing Third Party Software

More background on Python 3 and NLTK 3:

  • http://docs.python.org/2/library/2to3.html
  • http://docs.python.org/dev/whatsnew/3.0.html
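
Code that has to run both before and after the printing changes listed above can hide the rename behind a small wrapper. This is only a sketch: format_tree is a hypothetical helper, the two classes are stand-ins rather than real NLTK types, and it assumes the older pprint() returned a string, as the Tree.pprint() → pformat entry suggests.

```python
def format_tree(tree, **kwargs):
    """Return the tree as a string, preferring the newer method name."""
    if hasattr(tree, "pformat"):
        return tree.pformat(**kwargs)   # name after the printing changes
    return tree.pprint(**kwargs)        # older name (assumed to return a string)


class OldStyleTree:
    """Hypothetical stand-in for a Tree from before the rename."""

    def pprint(self):
        return "(S (NP I))"


class NewStyleTree:
    """Hypothetical stand-in for a Tree from after the rename."""

    def pformat(self):
        return "(S (NP I))"


print(format_tree(OldStyleTree()))
print(format_tree(NewStyleTree()))
```

Checking for pformat first matters because, per the list above, the name pprint is reused for a different method after the change.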

Originally published: 2018-05-21. To report infringement, contact cloudcommunity@tencent.com for removal.
