
Q: NLTK: converting a subtree into a Python list (chunks from an RSS feed)

Stack Overflow user

Asked 2013-10-11 19:59:27

2 answers · 2.1K views · 0 followers · Score 1

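Using the code below, I am chunking an RSS feed that has already been tokenized and tagged. "print subtree.leaves()" outputs:

[('Prime', 'NNP'), ('Minister', 'NNP'), ('Stephen', 'NNP'), ('Harper', 'NNP')]
[('What', 'NNP')]
[('CBC', 'NNP'), ('News', 'NNP')]

This looks like a Python list, but I can't figure out how to access it directly or iterate over it. I think it is subtree output.

I want to convert each subtree into a list that I can manipulate. Is there an easy way to do that? This is the first time I have run into trees in Python and I am lost. I would like to end up with this list:

docs = ["Prime Minister Stephen Harper", "US President Barack Obama", "What", "Keystone XL", "CBC News"]

Is there a simple way to make this happen?

Thanks, as always, for your help!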

import nltk

# posDocuments holds the RSS feed text after tokenizing and POS tagging
grammar = r""" Proper: {<NNP>+} """

cp = nltk.RegexpParser(grammar)
result = cp.parse(posDocuments)
nounPhraseDocs.append(result)

for subtree in result.subtrees(filter=lambda t: t.node == 'Proper'):
    # print the noun phrase as a list of part-of-speech tagged words
    print subtree.leaves()
print " "

Tags: nltk, chunks, list, parsing, tree

2 Answers

Stack Overflow user

Accepted answer

Posted 2013-10-16 11:25:39

docs = []

for subtree in result.subtrees(filter=lambda t: t.node == 'Proper'):
    docs.append(" ".join([a for (a,b) in subtree.leaves()]))

print docs

This should do the trick.
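To make the join step concrete, here is a minimal Python 3 sketch that needs no NLTK at all. It assumes the value returned by subtree.leaves() is a plain list of (word, tag) tuples, as the printed output in the question suggests. The sample data and the helper name join_words are hypothetical, chosen only for illustration.

```python
# Hypothetical sample: the kind of list that subtree.leaves() returns for one chunk.
leaves = [("Prime", "NNP"), ("Minister", "NNP"), ("Stephen", "NNP"), ("Harper", "NNP")]

# leaves is an ordinary list of (word, tag) tuples, so indexing and iteration work.
first_word, first_tag = leaves[0]


def join_words(tagged_leaves):
    """Drop the tags and join the words with single spaces."""
    return " ".join(word for word, _tag in tagged_leaves)


phrase = join_words(leaves)
print(phrase)
```

The loop in the answer applies the same join to every matching subtree and collects the resulting strings in docs.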

Score 1

Stack Overflow user

Posted 2017-01-16 09:56:46

node has since been replaced by label(). So, modifying Viktor's answer:

docs = []

for subtree in result.subtrees(filter=lambda t: t.label() == 'Proper'):
    docs.append(" ".join([a for (a,b) in subtree.leaves()]))

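This gives you a list containing only the tokens that are part of a Proper chunk. You can also remove the filter argument from the subtrees() call; you then get one entry for every subtree, including the root, each holding all the tokens under that parent node.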
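As a rough end-to-end sketch of the filtered and unfiltered cases, the Python 3 snippet below runs the same grammar on a small tagged sentence. It assumes NLTK 3 or later is installed; the sentence is a hypothetical stand-in for one item from the feed, and the names tagged, proper_docs, and all_docs are illustrative only.

```python
import nltk

# Hypothetical tagged sentence standing in for one RSS item (not from the original feed).
tagged = [
    ("Prime", "NNP"), ("Minister", "NNP"), ("Stephen", "NNP"), ("Harper", "NNP"),
    ("spoke", "VBD"), ("to", "TO"), ("CBC", "NNP"), ("News", "NNP"),
]

# Same chunk grammar as in the question: one or more consecutive NNP tokens.
parser = nltk.RegexpParser(r"Proper: {<NNP>+}")
tree = parser.parse(tagged)

# With the filter: one string per Proper chunk.
proper_docs = [
    " ".join(word for word, _tag in subtree.leaves())
    for subtree in tree.subtrees(filter=lambda t: t.label() == "Proper")
]

# Without the filter: subtrees() also yields the root, so the first entry
# is the whole sentence, followed by the chunks nested under it.
all_docs = [
    " ".join(word for word, _tag in subtree.leaves())
    for subtree in tree.subtrees()
]

print(proper_docs)
print(all_docs)
```

With this input, proper_docs holds only the two proper-noun phrases, while all_docs starts with the full sentence because the root node is itself a subtree.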

Score 6

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/19326278
