I have a dictionary named pdf_sequence_dict.
pdf_sequence_dict = {'document-page1.pdf': (1, 1),
'document-page10.pdf':(10, 0),
'document-page2.pdf': (2, 0),
'document-page3.pdf': (3, 1),
'document-page4.pdf': (4, 0),
'document-page5.pdf': (5, 1),
'document-page6.pdf': (6, 0),
'document-page7.pdf': (7, 1),
'document-page8.pdf': (8, 0),
'document-page9.pdf': (9, 1)}
Next, I convert the pdf_sequence_dict dictionary into a list, pdf_sequence_list, and sort it in ascending order by index 0 of each value tuple (the page number).
pdf_sequence_list = [ (v , k) for k , v in pdf_sequence_dict.items()]
pdf_sequence_list.sort()
print(pdf_sequence_list)
The sorted list looks like this:
[((1, 1), 'document-page1.pdf'),
((2, 0), 'document-page2.pdf'),
((3, 1), 'document-page3.pdf'),
((4, 0), 'document-page4.pdf'),
((5, 1), 'document-page5.pdf'),
((6, 0), 'document-page6.pdf'),
((7, 1), 'document-page7.pdf'),
((8, 0), 'document-page8.pdf'),
((9, 1), 'document-page9.pdf'),
((10, 0), 'document-page10.pdf')]
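Next, I want to build sub-lists from pdf_sequence_list, grouping the entries by the 1s and 0s at index 1 of each inner tuple.
Example: page1 and page2 both belong to a single PDF set.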
((1, **1**), 'document-page1.pdf')
((2, **0**), 'document-page2.pdf')
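I loop over the list and always append the PDF name at index zero (document-page1.pdf), because the first flag is always 1. The flags that follow may be any number of 0s. For example: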
((1, **1**), 'document-page1.pdf')
((2, **0**), 'document-page2.pdf')
((3, **0**), 'document-page3.pdf')
((4, **0**), 'document-page4.pdf')
Whenever there is a 1, a new PDF set is starting. Based on this, I split the PDF files into groups. The code is below.
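flag = 1
subList = []
finalList = []
for i in pdf_sequence_list:
    y = i[-1]  # the PDF file name
    if pdf_sequence_list.index(i) == 0:
        # the first entry always opens the first set
        subList.append(y)
    else:
        if i[0][1] * flag == 0:
            # flag 0: this page continues the current set
            subList.append(y)
        else:
            # flag 1: close the current set and start a new one
            finalList.append(subList)
            subList = []
            subList.append(y)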
finalList looks like this:
[['document-page1.pdf', 'document-page2.pdf'],
['document-page3.pdf', 'document-page4.pdf'],
['document-page5.pdf', 'document-page6.pdf'],
['document-page7.pdf', 'document-page8.pdf']]
At the moment, the sub-list code is not very efficient. Is there a more efficient alternative?
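A note on the loop in the question: as written, it appends subList to finalList only when the next flag 1 arrives, so the last set is never appended. That would explain why document-page9.pdf and document-page10.pdf are missing from the finalList shown above. The following is a minimal sketch of the same single-pass idea with a final flush. The function name split_into_sets and the snake_case variable names are illustrative and not part of the original code.

```python
def split_into_sets(pdf_sequence_list):
    """Group file names into sets; a flag of 1 marks the first page of a set.

    Expects the sorted list of ((page, flag), name) tuples from the question.
    """
    final_list = []
    sub_list = []
    for (page, flag), name in pdf_sequence_list:
        # A flag of 1 closes the set collected so far (if any).
        if flag == 1 and sub_list:
            final_list.append(sub_list)
            sub_list = []
        sub_list.append(name)
    # Flush the last set, which the original loop leaves behind.
    if sub_list:
        final_list.append(sub_list)
    return final_list


pdf_sequence_list = [
    ((1, 1), 'document-page1.pdf'),
    ((2, 0), 'document-page2.pdf'),
    ((3, 1), 'document-page3.pdf'),
    ((4, 0), 'document-page4.pdf'),
    ((5, 1), 'document-page5.pdf'),
    ((6, 0), 'document-page6.pdf'),
    ((7, 1), 'document-page7.pdf'),
    ((8, 0), 'document-page8.pdf'),
    ((9, 1), 'document-page9.pdf'),
    ((10, 0), 'document-page10.pdf'),
]

print(split_into_sets(pdf_sequence_list))
```

Unpacking each tuple in the for statement also removes the need for pdf_sequence_list.index(i), which rescans the list on every iteration.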
Answered 2018-06-14 15:38:38
#python 3.5.2
import json
pdf_sequence_dict = {
'document-page1.pdf': (1, 1),
'document-page10.pdf':(10, 0),
'document-page2.pdf': (2, 0),
'document-page3.pdf': (3, 1),
'document-page4.pdf': (4, 0),
'document-page5.pdf': (5, 1),
'document-page6.pdf': (6, 0),
'document-page7.pdf': (7, 1),
'document-page8.pdf': (8, 0),
'document-page9.pdf': (9, 1)
}
# Sort the (value, name) pairs by page number.
pdf_sequence_list = sorted([(v, k) for k, v in pdf_sequence_dict.items()], key=lambda tup: tup[0][0])
# Pair up consecutive pages; slicing avoids an IndexError if the page count is odd.
final_list = [[name for _, name in pdf_sequence_list[i:i + 2]] for i in range(0, len(pdf_sequence_list), 2)]
# Pretty-print the list
print(json.dumps(final_list, indent=4))
Output:
[
[
"document-page1.pdf",
"document-page2.pdf"
],
[
"document-page3.pdf",
"document-page4.pdf"
],
[
"document-page5.pdf",
"document-page6.pdf"
],
[
"document-page7.pdf",
"document-page8.pdf"
],
[
"document-page9.pdf",
"document-page10.pdf"
]
]
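Pairing consecutive pages fits the sample data, where every set happens to have exactly two pages. The question also describes sets in which a 1 is followed by several 0s, and fixed-size pairs would split those incorrectly. One possible alternative, sketched here under the assumption that a flag of 1 always marks the first page of a set, is to take a running total of the flags and group on it with itertools.groupby. The function name group_pages and the sample data are illustrative only.

```python
from itertools import accumulate, groupby


def group_pages(pdf_sequence_dict):
    """Group file names into sets of any length, using the 1/0 flags."""
    # Sort (name, (page, flag)) items by page number.
    items = sorted(pdf_sequence_dict.items(), key=lambda kv: kv[1][0])
    # Running total of the flags: it goes up by one at every flag == 1,
    # so all pages of one set share the same running total.
    group_ids = accumulate(flag for _, (_, flag) in items)
    paired = zip(group_ids, items)
    return [
        [name for _, (name, _) in group]
        for _, group in groupby(paired, key=lambda pair: pair[0])
    ]


# Hypothetical data: one set of four pages followed by a single-page set.
sample = {
    'document-page1.pdf': (1, 1),
    'document-page2.pdf': (2, 0),
    'document-page3.pdf': (3, 0),
    'document-page4.pdf': (4, 0),
    'document-page5.pdf': (5, 1),
}

print(group_pages(sample))
```

For the dictionary in the question, this produces the same five pairs as the output above, because there every 1 is followed by exactly one 0.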
https://stackoverflow.com/questions/-100005385