Question: Python - How do I loop over every index position in a list?

Stack Overflow user

Asked 2021-11-29 20:39:38

Answers: 2, Views: 320, Followers: 0, Score: 1

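Given a list of the form [[["source1"], ["target1"], ["alignment1"]], [["source2"], ["target2"], ["alignment2"]], ...], I want to extract the words in the source that are aligned with words in the target. For example, in the English-German sentence pair for "The hat is on the table.":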

The - Der
hat - Hut
is - liegt
on - auf
the - dem
table - Tisch
. - .

So I wrote the following:

en_de = [
[['The', 'hat', 'is', 'on', 'the', 'table', '.'], ['Der', 'Hut', 'liegt', 'auf', 'dem', 'Tisch', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6'], 
[['The', 'picture', 'is', 'on', 'the', 'wall', '.'], ['Das', 'Bild', 'hängt', 'an', 'der', 'Wand', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6'], 
[['The', 'bottle', 'is', 'under', 'the', 'sink', '.'], ['Die', 'Flasche', 'ist', 'under', 'dem', 'Waschbecken', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6']
]

for group in en_de:
    src_sent = group[0]
    tgt_sent = group[1]
    aligns = group[2]

    split_aligns = aligns.split()

    hyphen_split = [align.split("-") for align in split_aligns]

    align_index = hyphen_split[0]

    print(src_sent[int(align_index[0])],"-", tgt_sent[int(align_index[1])])

As expected, this prints the words at index position 0 of src_sent and tgt_sent:

The - Der
The - Das
The - Die

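Now I don't know how to print the words at every index position of src_sent and tgt_sent. Obviously, I could manually update align_index to a new index position for each position in the sentence pair, but in the full dataset some sentences have up to 25 index positions. Is there a way to loop over every index position? When I try: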

align_index = hyphen_split[0:]
print(src_sent[int(align_index[0])],"-", tgt_sent[int(align_index[1])])

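I get TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'. Clearly align_index cannot be a list here, but I don't know how to turn it into something that does what I want. Any suggestions or help would be appreciated. Thanks in advance.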

Tags: nlp, linguistics, python, for-loop

2 Answers

Stack Overflow user

Accepted answer

Answered 2021-11-29 21:06:42

You forgot to iterate over your hyphen_split list:

for group in en_de:
    src_sent = group[0]
    tgt_sent = group[1]
    aligns = group[2]

    split_aligns = aligns.split()

    hyphen_split = [align.split("-") for align in split_aligns]

    for align_index in hyphen_split:
        print(src_sent[int(align_index[0])],"-", tgt_sent[int(align_index[1])])

See the last two lines of the code, which are the updated part.
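To see why the slice in the question fails while a loop works, here is a minimal sketch using a shortened alignment string. The names `first`, `sliced`, and `pairs` are illustrative and do not come from the original code.

```python
aligns = '0-0 1-1 2-2'
hyphen_split = [align.split("-") for align in aligns.split()]

# Indexing returns one pair: a list of two strings.
first = hyphen_split[0]
print(first)  # ['0', '0']

# Slicing returns a list of pairs, so sliced[0] is itself a list,
# and int(sliced[0]) raises the TypeError from the question.
sliced = hyphen_split[0:]
try:
    int(sliced[0])
except TypeError as exc:
    print("TypeError:", exc)

# Iterating visits each pair in turn; unpacking names both halves.
pairs = [(int(src), int(tgt)) for src, tgt in hyphen_split]
print(pairs)  # [(0, 0), (1, 1), (2, 2)]
```

Unpacking each pair into two names also avoids the repeated `align_index[0]` and `align_index[1]` lookups, though that is a matter of taste.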

Score: 0

Stack Overflow user

Answered 2021-11-29 20:44:45

You want something like this:

en_de = [
    [['The', 'hat', 'is', 'on', 'the', 'table', '.'], ['Der', 'Hut', 'liegt', 'auf', 'dem', 'Tisch', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6'],
    [['The', 'picture', 'is', 'on', 'the', 'wall', '.'], ['Das', 'Bild', 'hängt', 'an', 'der', 'Wand', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6'],
    [['The', 'bottle', 'is', 'under', 'the', 'sink', '.'], ['Die', 'Flasche', 'ist', 'under', 'dem', 'Waschbecken', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6']
]


for sentences in en_de:
    for en, de in zip(*sentences[:2]):
        print(f'{en} - {de}')

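This prints each English-German word pair for every sentence. It should work as long as the words always line up pairwise. So if the alignment is always linear, you don't need the alignment string at all.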
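One caveat worth checking before relying on this: `zip` pairs items purely by position and stops at the shorter input, so a length mismatch goes unnoticed. The sketch below uses a made-up sentence pair (not from the question's data) to show the effect.

```python
# Hypothetical pair where the German side has one token fewer.
en = ['I', 'am', 'going', 'home', '.']
de = ['Ich', 'gehe', 'heim', '.']

paired = list(zip(en, de))
print(len(en), len(de), len(paired))  # 5 4 4

# The last English token is silently dropped, and the final
# pair no longer matches up.
print(paired[-1])  # ('home', '.')

# A simple guard: only trust positional pairing when lengths agree.
is_safe = len(en) == len(de)
print(is_safe)  # False
```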

If the alignment is not always linear, you need to take it into account as well:

en_de = [
    [['The', 'hat', 'is', 'on', 'the', 'table', '.'], ['Der', 'Hut', 'liegt', 'auf', 'dem', 'Tisch', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6'],
    [['The', 'picture', 'is', 'on', 'the', 'wall', '.'], ['Das', 'Bild', 'hängt', 'an', 'der', 'Wand', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6'],
    [['The', 'bottle', 'is', 'under', 'the', 'sink', '.'], ['Die', 'Flasche', 'ist', 'under', 'dem', 'Waschbecken', '.'], '0-0 1-1 2-2 3-3 4-4 5-5 6-6']
]


for sentences in en_de:
    # alternative to the below for loop
    # alignment = [(int(a), int(b)) for a, b in [p.split('-') for p in sentences[2].split()]]
    alignment = []
    for pair in sentences[2].split():
        e, g = pair.split('-')
        alignment.append((int(e), int(g)))

    english = [sentences[0][i] for i, _ in alignment]
    german = [sentences[1][i] for _, i in alignment]
    for en, ge in zip(english, german):
        print(f'{en} - {ge}')
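
To illustrate the non-linear case, here is a small sketch with a made-up sentence pair and alignment string (neither appears in the question's data) in which the German word order differs from the English. The helper name `aligned_pairs` is hypothetical.

```python
def aligned_pairs(src_sent, tgt_sent, aligns):
    """Return (source word, target word) tuples for an 'i-j i-j ...' string."""
    pairs = []
    for pair in aligns.split():
        src_idx, tgt_idx = pair.split('-')
        pairs.append((src_sent[int(src_idx)], tgt_sent[int(tgt_idx)]))
    return pairs


# Hypothetical example: the participle moves to the end in German.
en = ['I', 'have', 'seen', 'it', '.']
de = ['Ich', 'habe', 'es', 'gesehen', '.']
result = aligned_pairs(en, de, '0-0 1-1 2-3 3-2 4-4')

for src_word, tgt_word in result:
    print(f'{src_word} - {tgt_word}')
```

A plain `zip(en, de)` would pair "seen" with "es" here, which is why the alignment string has to be used once the word order diverges.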

Score: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/70161048
