I am learning Python (2.7), and I am trying to print the ranges from two files only when they overlap.
Suppose input1.txt contains:
p1234: 4-5, 7-12, 15-19
p5678: 7-59, 78-345
p4356: 3-4, 6-10
and input2.txt contains:
p1234: 1-3, 6-13, 16-20, 22-25
p4356: 9-10
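For each id (the leftmost column in each file) that appears in both input files, I want to keep only the ranges that overlap a range under the same id in the other file, and discard the rest.
That is, the two output files should look like this:
output1.txt: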
p1234: 7-12, 15-19
p4356: 6-10
output2.txt:
p1234: 6-13, 16-20
p4356: 9-10
I learned that, to print only the overlapping part of two ranges, I can use:
x = range(1,10)
y = range(8,20)
intersection = [i for i in x if i in y]
try:
    print x
    print y
except NameError:
    print intersection
This gives:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
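From here I can get the overlap between two ranges (although the last number, which also overlaps, is not shown), but I do not know how to match the ids across the two input files and then print only the overlapping ranges from each file, in the two output formats above. Please help.
Thank you for your consideration.
Answered 2017-07-24 01:17:46
As I suggested in the comments, and building on what you have already started, take the following function as an example: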
def intersect_or_not(range_, list_of_ranges):
    for range2_ in list_of_ranges:
        intersections = [i for i in range_ if i in range2_]
        if intersections:
            return True # return also breaks (if one range intersected you have a match!)
    return False

x = range(1,10)
y = [range(8,20),range(9,20)]
print(intersect_or_not(x,y))
#True
I think you can work out the rest from here! :)
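As an aside, the same check can be made without building the lists of numbers at all, by comparing endpoints. The following is only a minimal sketch: it assumes each range is a (start, end) tuple that includes both ends, which the question does not state explicitly, and the names overlaps and overlaps_any are made up for illustration.

```python
def overlaps(a, b):
    """Return True if the inclusive ranges a and b share at least one value.

    a and b are (start, end) tuples; treating both ends as inclusive
    is an assumption about the input data.
    """
    return a[0] <= b[1] and b[0] <= a[1]


def overlaps_any(rng, others):
    """Return True if rng overlaps at least one range in others."""
    return any(overlaps(rng, other) for other in others)


print(overlaps((7, 12), (6, 13)))                  # True
print(overlaps((4, 5), (1, 3)))                    # False
print(overlaps((3, 4), (4, 5)))                    # True, both contain 4
print(overlaps_any((15, 19), [(1, 3), (16, 20)]))  # True
```

Comparing endpoints takes constant time per pair, whereas the list-based check walks every number in the range, which matters for wide ranges such as 78-345.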
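Update:
OK, this is a bit too convoluted, but I will post it anyway. Assuming you read the files into tuples (startpos, endpos), you can find the values like this: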
def intersect_or_not(range_, list_of_ranges):
    for range2_ in list_of_ranges:
        intersections = [i for i in range_ if i in range2_]
        if intersections:
            return True
    return False

list1 = [(4,5), (7,12), (15,19)]
list2 = [(1,3), (6,13), (16,20), (22,25)]

# Note: range(*i) stops before endpos, so two ranges that share
# only an endpoint, such as (3,4) and (4,5), are not matched.

#output 1
[i for i in list1 if intersect_or_not(range(*i),[range(*ii) for ii in list2])]
# [(7, 12), (15, 19)]

#output 2
[i for i in list2 if intersect_or_not(range(*i),[range(*ii) for ii in list1])]
# [(6, 13), (16, 20)]
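To connect this with the file format in the question, here is a rough end-to-end sketch. It is not the only way to do it, and it rests on several assumptions: every line looks like "id: start-end, start-end", the numbers are non-negative integers, both ends of a range are inclusive, and ids are written out in sorted order. The helper names (parse_line, parse_lines, keep_overlapping) are made up for illustration, and the input is given as lists of lines so that the example runs without any files.

```python
def parse_line(line):
    """Turn 'p1234: 4-5, 7-12' into ('p1234', [(4, 5), (7, 12)])."""
    id_, _, rest = line.partition(':')
    ranges = []
    for chunk in rest.split(','):
        chunk = chunk.strip()
        if chunk:
            start, end = chunk.split('-')
            ranges.append((int(start), int(end)))
    return id_.strip(), ranges


def parse_lines(lines):
    """Map each id to its list of ranges, skipping blank lines."""
    return dict(parse_line(line) for line in lines if line.strip())


def overlaps(a, b):
    """True if the inclusive ranges a and b share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def keep_overlapping(mine, theirs):
    """For ids present in both dicts, keep the ranges in `mine` that
    overlap some range under the same id in `theirs`; return output lines."""
    result = []
    for id_ in sorted(mine):
        if id_ not in theirs:
            continue
        kept = [r for r in mine[id_]
                if any(overlaps(r, other) for other in theirs[id_])]
        if kept:
            ranges_text = ', '.join('%d-%d' % r for r in kept)
            result.append('%s: %s' % (id_, ranges_text))
    return result


# Sample data copied from the question.
input1 = ["p1234: 4-5, 7-12, 15-19", "p5678: 7-59, 78-345", "p4356: 3-4, 6-10"]
input2 = ["p1234: 1-3, 6-13, 16-20, 22-25", "p4356: 9-10"]

d1, d2 = parse_lines(input1), parse_lines(input2)
output1 = keep_overlapping(d1, d2)
output2 = keep_overlapping(d2, d1)
print('\n'.join(output1))
print('\n'.join(output2))
```

With real files, the two lists could be replaced by open('input1.txt') and open('input2.txt'), since parse_lines accepts any iterable of lines, and each result could be written to its output file with one line per entry.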
https://stackoverflow.com/questions/45267634