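I have two files called "hosts" (in different directories).
I want to compare them using Python to see whether they are identical. If they are not, I want to print the differences on the screen.
So far I have tried this: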
hosts0 = open(dst1 + "/hosts","r")
hosts1 = open(dst2 + "/hosts","r")

lines1 = hosts0.readlines()

for i,lines2 in enumerate(hosts1):
    if lines2 != lines1[i]:
        print "line ", i, " in hosts1 is different \n"
        print lines2
    else:
        print "same"
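But when I run this, I get:
  File "./audit.py", line 34, in <module>
    if lines2 != lines1[i]:
IndexError: list index out of range
Which means one of the hosts files has more lines than the other. Is there a better way to compare two files and report the differences?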
Posted on 2013-10-02 08:14:21
import difflib
lines1 = '''
dog
cat
bird
buffalo
gophers
hound
horse
'''.strip().splitlines()
lines2 = '''
cat
dog
bird
buffalo
gopher
horse
mouse
'''.strip().splitlines()
# Changes:
# swapped positions of cat and dog
# changed gophers to gopher
# removed hound
# added mouse
for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm=''):
    print line
Outputs the following:
--- file1
+++ file2
@@ -1,7 +1,7 @@
+cat
 dog
-cat
 bird
 buffalo
-gophers
-hound
+gopher
 horse
+mouse
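This diff gives you context: the lines surrounding each change, which help make it clear how the files differ. You can see "cat" here twice, because it was removed from below "dog" and added above it.
You can use n=0 to remove the context.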
for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0):
    print line
Outputs the following:
--- file1
+++ file2
@@ -0,0 +1 @@
+cat
@@ -2 +2,0 @@
-cat
@@ -5,2 +5 @@
-gophers
-hound
+gopher
@@ -7,0 +7 @@
+mouse
But now it is full of "@@" lines telling you the position in the file that has changed. Let's remove the extra lines to make it more readable.
for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0):
    for prefix in ('---', '+++', '@@'):
        if line.startswith(prefix):
            break
    else:
        print line
Giving us this output:
+cat
-cat
-gophers
-hound
+gopher
+mouse
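The same filter can be written without the inner for/else loop, because str.startswith also accepts a tuple of prefixes. The following is only a sketch in Python 3 syntax; the helper name changed_lines is made up for illustration, and it keeps the same behaviour as the loop above, including its limitations.

```python
import difflib


def changed_lines(a, b):
    """Yield only the '+' and '-' lines of a zero-context unified diff."""
    diff = difflib.unified_diff(a, b, fromfile='file1', tofile='file2',
                                lineterm='', n=0)
    for line in diff:
        # startswith() with a tuple is true if any one of the prefixes matches.
        if not line.startswith(('---', '+++', '@@')):
            yield line


lines1 = ['dog', 'cat', 'bird', 'buffalo', 'gophers', 'hound', 'horse']
lines2 = ['cat', 'dog', 'bird', 'buffalo', 'gopher', 'horse', 'mouse']

for line in changed_lines(lines1, lines2):
    print(line)
```

With the sample lists from this answer, it should print the same six lines as shown above.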
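Now what do you want it to do? If you ignore all removed lines, then you won't see that "hound" was removed. If you are happy just showing the additions to the file, then you could do this: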
diff = difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0)
lines = list(diff)[2:]
added = [line[1:] for line in lines if line[0] == '+']
removed = [line[1:] for line in lines if line[0] == '-']

print 'additions:'
for line in added:
    print line
print
print 'additions, ignoring position:'
for line in added:
    if line not in removed:
        print line
Outputs:
additions:
cat
gopher
mouse
additions, ignoring position:
gopher
mouse
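By now you have probably figured out that there are many ways to "print the differences" between two files, so you will need to be very specific if you want more help.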
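If the first step is simply deciding whether the two files are identical at all, the standard library's filecmp module can answer that before any diff is printed. This is a minimal sketch in Python 3 syntax; the helper name files_identical and the temporary file names and contents are invented for the demonstration.

```python
import filecmp
import os
import tempfile


def files_identical(path_a, path_b):
    """Return True if both files have exactly the same contents."""
    # shallow=False compares the file contents rather than relying only on
    # the os.stat() signature (file type, size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)


# Demonstration with throwaway files; names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, 'hosts_a')
    path_b = os.path.join(tmp, 'hosts_b')
    with open(path_a, 'w') as f:
        f.write('127.0.0.1 localhost\n')
    with open(path_b, 'w') as f:
        f.write('127.0.0.1 localhost\n10.0.0.5 db\n')
    same = files_identical(path_a, path_a)
    different = files_identical(path_a, path_b)

print(same, different)
```

Only when this returns False is it worth running one of the difflib variants to show what changed.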
Posted on 2013-10-02 00:12:56
The difflib library is useful for this, and it is part of the standard library. I like the unified diff format.
http://docs.python.org/2/library/difflib.html#difflib.unified_diff
import difflib
import sys

with open('/tmp/hosts0', 'r') as hosts0:
    with open('/tmp/hosts1', 'r') as hosts1:
        diff = difflib.unified_diff(
            hosts0.readlines(),
            hosts1.readlines(),
            fromfile='hosts0',
            tofile='hosts1',
        )
        for line in diff:
            sys.stdout.write(line)
Outputs:
--- hosts0
+++ hosts1
@@ -1,5 +1,4 @@
 one
 two
-dogs
 three
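And here is a rough version that ignores certain lines (the "---", "+++" and "@@" lines). There may be edge cases where it does not work, and there are surely better ways to do this, but maybe it will be good enough for your purposes.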
import difflib
import sys

with open('/tmp/hosts0', 'r') as hosts0:
    with open('/tmp/hosts1', 'r') as hosts1:
        diff = difflib.unified_diff(
            hosts0.readlines(),
            hosts1.readlines(),
            fromfile='hosts0',
            tofile='hosts1',
            n=0,
        )
        for line in diff:
            for prefix in ('---', '+++', '@@'):
                if line.startswith(prefix):
                    break
            else:
                sys.stdout.write(line[1:])
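One edge case of the prefix filter: a removed line whose own text begins with "--" appears in the diff as "---...", so it is mistaken for the file header and silently dropped. A possible workaround, sketched here in Python 3 syntax, is to skip the two header lines by position and filter only the "@@" hunk markers. The helper name changed_lines_by_position and the sample data are hypothetical.

```python
import difflib
from itertools import islice


def changed_lines_by_position(a, b):
    """Yield the '+'/'-' lines of a zero-context unified diff."""
    diff = difflib.unified_diff(a, b, fromfile='hosts0', tofile='hosts1', n=0)
    # When a diff exists, its first two lines are the '---' and '+++' file
    # headers, so they are skipped by position instead of by prefix.
    for line in islice(diff, 2, None):
        # Content lines always start with '+' or '-', never with '@@'.
        if not line.startswith('@@'):
            yield line


# Made-up sample: the removed line itself starts with '--'.
old = ['one\n', '-- a comment\n', 'three\n']
new = ['one\n', 'three\n']

# Prefix-based filtering loses the removed line entirely.
naive = [line for line in difflib.unified_diff(old, new, fromfile='hosts0',
                                               tofile='hosts1', n=0)
         if not line.startswith(('---', '+++', '@@'))]

result = list(changed_lines_by_position(old, new))
print(naive)
print(result)
```

Here the generator yields the full diff line including its "+" or "-" marker; slicing with line[1:] as in the version above would strip it again.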
Posted on 2013-10-02 00:01:48
hosts0 = open("C:path\\a.txt","r")
hosts1 = open("C:path\\b.txt","r")
lines1 = hosts0.readlines()
for i,lines2 in enumerate(hosts1):
    if lines2 != lines1[i]:
        print "line ", i, " in hosts1 is different \n"
        print lines2
    else:
        print "same"
The above code works for me. Can you point out what error you are facing?
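Note that this loop only works as long as hosts1 has no more lines than hosts0. Once hosts1 is longer, lines1[i] runs past the end of the list and raises the IndexError reported in the question. A minimal sketch of a length-safe line-by-line comparison, in Python 3 syntax (in Python 2 the function is itertools.izip_longest), could look like this. The function name report_differences and the sample lines are invented for illustration.

```python
from itertools import zip_longest


def report_differences(lines_a, lines_b):
    """Return (line_number, a_line, b_line) for every position that differs."""
    differences = []
    # zip_longest pads the shorter input with None, so a missing line is
    # reported as a difference instead of raising IndexError.
    for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        if a != b:
            differences.append((number, a, b))
    return differences


# Made-up sample data: the second list has one extra line.
hosts0 = ['127.0.0.1 localhost\n', '10.0.0.5 db\n']
hosts1 = ['127.0.0.1 localhost\n', '10.0.0.6 db\n', '10.0.0.7 cache\n']

for number, a, b in report_differences(hosts0, hosts1):
    print('line %d differs: %r != %r' % (number, a, b))
```

A None on either side means that file ended before the other one did.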
https://stackoverflow.com/questions/19120489