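1. Opening the file
Data file: sketch.txt
Before a program can process the data, it has to open the data file explicitly.
We start by importing the os module.
os.getcwd() #get the current working directory
os.chdir() #change the current working directory to the folder that contains the data file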
>>> import os
>>> os.getcwd()
'/home/mwx'
>>> os.chdir('/home/mwx/HeadFirstPython/chapter3')
>>> os.getcwd() #call it again to confirm the working directory is now the folder that holds the data file
'/home/mwx/HeadFirstPython/chapter3'
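>>> data=open('sketch.txt') #open the data file and assign the file object to 'data'
>>> print(data.readline(),end='') #read and print the first line of the file
Man: Is this the right room for an argument?
>>> data.seek(0) #seek(0) returns to the start of the file; tell() only reports the current position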
0
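>>> for each_line in data: #loop over the lines of the file
    print(data.readline(),end='') #caution: readline() consumes one more line on every pass, so only every other line is printed; use print(each_line,end='') to print them all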
Other Man: I've told you once.
Other Man: Yes I have.
Other Man: Just now.
Other Man: Yes I did!
Other Man: I'm telling you, I did!
Other Man: Oh I'm sorry, is this a five minute argument, or the full half hour?
Other Man: Just the five minutes. Thank you.
Man: You most certainly did not!
Man: Oh no you didn't!
Man: Oh no you didn't!
Man: Oh look, this isn't an argument!
Other Man: Yes it is!
(pause)
Other Man: No it isn't!
Other Man: It is NOT!
Other Man: No I didn't!
Other Man: No no no!
Other Man: Nonsense!
(pause)
Man: Yes it is!
>>> data.close()
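The same read loop can also be written with a with statement, which closes the file automatically so no explicit close() is needed. The following is a minimal sketch, not the original author's code: the helper name read_lines and the two-line stand-in file written to a temporary folder are assumptions made so the example runs anywhere.

```python
import os
import tempfile


def read_lines(path):
    """Return every line of the file at `path`, keeping line endings."""
    # The with statement closes the file even if an error occurs.
    with open(path) as data:
        # Iterating over the file object yields one line per pass.
        return [each_line for each_line in data]


# Hypothetical stand-in for sketch.txt (only its first two lines).
sample = (
    "Man: Is this the right room for an argument?\n"
    "Other Man: I've told you once.\n"
)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sketch.txt")
    with open(path, "w") as out:
        out.write(sample)
    lines = read_lines(path)

for each_line in lines:
    print(each_line, end="")
```

Printing each_line directly, rather than calling readline() inside the loop, is what keeps every line in the output.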
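2. Using split()
split() cuts a string into a list of substrings at a given separator. If maxsplit is given, at most maxsplit splits are made, so the result has at most maxsplit+1 items.
2.1 Syntax
str.split(sep=None, maxsplit=-1)
#sep -- the separator; by default any run of whitespace, including spaces, newlines (\n) and tabs (\t). maxsplit -- the maximum number of splits; -1 means no limit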
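A short sketch of how the separator and the split limit behave, using one line of the sketch that happens to contain two colons. The variable names are illustrative only.

```python
line = "Other Man: Now let's get one thing quite clear: I most definitely told you!"

# Without a limit, every ':' is a split point, giving three pieces here.
all_parts = line.split(":")

# With a limit of 1, only the first ':' is used, giving exactly two pieces.
role, line_spoken = line.split(":", 1)

# With no separator, runs of whitespace are collapsed and trimmed.
words = "  a \t b\n".split()

print(len(all_parts))
print(role)
print(line_spoken)
print(words)
```

Limiting the split to one is what lets the two-name assignment work even when the spoken text itself contains a colon.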
3. Processing the data
#Change the ':' in each line to ' said:'
>>> import os
>>> os.getcwd()
'/home/mwx'
>>> os.chdir('/home/mwx/HeadFirstPython/chapter3')
>>> os.getcwd()
'/home/mwx/HeadFirstPython/chapter3'
>>> data=open('sketch.txt')
>>> for each_line in data:
    (role,line_spoken)=each_line.split(':',1)
    print(role,end='')
    print(' said:',end='')
    print(line_spoken,end='')
Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Other Man said: Now let's get one thing quite clear: I most definitely told you!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh look, this isn't an argument!
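#The loop stops with an error here: the next line of the file is "(pause)", which contains no ':', so split() returns a single item and the assignment to two names raises ValueError.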
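The failure can be reproduced in isolation. This is a small sketch that uses a literal "(pause)" line in place of the file; the names parts and message are illustrative.

```python
each_line = "(pause)\n"

# There is no ':' in the line, so split() returns a list with a single item.
parts = each_line.split(":", 1)
print(parts)

try:
    # Unpacking one item into two names raises ValueError.
    (role, line_spoken) = parts
except ValueError as err:
    message = str(err)
    print("ValueError:", message)
```

Note that split() itself succeeds; it is the unpacking of its one-item result into two names that fails.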
4. Handling the error
#Check for ':' before splitting
for each_line in data:
    if not each_line.find(':')==-1: #find() returns -1 when the substring is not found
        (role,line_spoken)=each_line.split(':',1)
        print(role,end='')
        print(' said:',end='')
        print(line_spoken,end='')
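#Alternatively, attempt the split and catch the exception
for each_line in data:
    try:
        (role,line_spoken)=each_line.split(':',1)
        print(role,end='')
        print(' said:',end='')
        print(line_spoken,end='')
    except ValueError: #raised when the line has no ':' to split on
        pass #depending on the situation, an error can sometimes simply be ignored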
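The try/except approach can be wrapped in a function so it is easy to run on any list of lines. This is a sketch under stated assumptions: the function name format_lines and the three-line sample are hypothetical, and only ValueError is caught so that unrelated errors still surface.

```python
def format_lines(lines):
    """Return 'role said: text' for each line that contains a ':'."""
    formatted = []
    for each_line in lines:
        try:
            (role, line_spoken) = each_line.split(":", 1)
        except ValueError:
            # Lines such as "(pause)" have no ':'; skip them.
            continue
        formatted.append(role + " said:" + line_spoken)
    return formatted


# Illustrative sample, not the full contents of sketch.txt.
sample = [
    "Man: Oh look, this isn't an argument!\n",
    "(pause)\n",
    "Other Man: Yes it is!\n",
]
result = format_lines(sample)
for text in result:
    print(text, end="")
```

Catching a specific exception rather than everything is a design choice: a typo inside the try block would otherwise be silently ignored along with the lines that lack a colon.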
5. Error checks and common exceptions
os.path.exists('sketch.txt') #check whether the file exists
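One way to combine the existence check with exception handling is sketched below. The helper name read_first_line and the file name used in the call are hypothetical; the try/except remains because a file can still disappear or be unreadable after the check has passed.

```python
import os


def read_first_line(path):
    """Return the first line of `path`, or None if it cannot be read."""
    if not os.path.exists(path):
        # Checking up front avoids the exception in the common case.
        return None
    try:
        with open(path) as data:
            return data.readline()
    except OSError:
        # The file may vanish or be unreadable between the check and open().
        return None


# Hypothetical file name that is assumed not to exist.
missing = read_first_line("no_such_file_for_this_example.txt")
print(missing)
```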
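ValueError: raised when the data does not match the expected format
IOError: raised when the data cannot be accessed (for example, the file has been moved or renamed); in Python 3 it is an alias of OSError
AttributeError: raised when a nonexistent attribute or method is accessed
EOFError: raised when input() reaches end-of-file without reading any data
ImportError: raised when importing a module fails
IndexError: raised when a sequence index is out of range
KeyError: raised when a dictionary key does not exist
NameError: raised when a name that has not been defined is used
TabError: raised when indentation mixes tabs and spaces inconsistently
ZeroDivisionError: raised when dividing by zero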
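Several of these exceptions are easy to trigger deliberately, which helps when learning to recognize them. The following sketch uses a hypothetical helper, exception_name, that runs a small action and reports which exception type it raised.

```python
def exception_name(action):
    """Run `action` and return the name of the exception it raises, if any."""
    try:
        action()
    except Exception as err:
        return type(err).__name__
    return None


names = [
    exception_name(lambda: int("abc")),        # data in the wrong format
    exception_name(lambda: [1, 2, 3][5]),      # index out of range
    exception_name(lambda: {"a": 1}["b"]),     # missing dictionary key
    exception_name(lambda: 1 / 0),             # division by zero
    exception_name(lambda: "text".missing()),  # no such method
]
print(names)
```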