Python 3: File Operations

py3study

Published 2018-08-02 16:21:10

Included in the column: python3

Python file operations

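Open a file with the same encoding it was saved in.

Parameters of open():

1 File path

2 Encoding

3 Action (open mode): read-only, write-only, append, read-write, write-read

Example

Suppose there is a document named 制服护士空姐萝莉联系方式.txt. How do you open it with Python?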

f = open(r'D:\制服护士空姐萝莉联系方式.txt', encoding='gbk', mode='r')
content = f.read()
print(content)
f.close()

Output

没睡醒？赶紧去写代码吧。

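Explanation of the example above

f: a variable that holds the file handle

open: asks the operating system to open the file

If encoding is omitted, Python falls back to the platform default: typically gbk on Chinese-language Windows and utf-8 on Linux

r: open the file for reading

f.close(): close the file

Workflow: open a file to obtain a file handle, operate on the handle, then close the file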
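To check which encoding open() would fall back to on a particular machine, the locale module can be queried. This is a minimal sketch; the printed value depends on the operating system and its language settings, so none is shown as fixed.

```python
import locale

# Encoding that open() uses in text mode when no encoding argument is given.
default_encoding = locale.getpreferredencoding(False)

# Typical values: cp936 (gbk) on Chinese-language Windows, UTF-8 on most Linux systems.
print(default_encoding)
```

Passing encoding explicitly, as the examples in this article do, avoids depending on this value.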

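Reading:

r: read-only, returns str

rb: read-only, returns bytes (use rb for non-text files such as images or audio)

Another example

# raw string, so the backslashes (including \t) are not treated as escapes
f = open(r'D:\qycache\test.txt', encoding='utf-8')
content = f.read()
print(content)
f.close()

When mode is omitted, the file is opened read-only.

If the encoding does not match the file, decoding fails with an error such as

UnicodeDecodeError: 'gbk' codec can't decode byte 0xaa in position 14: illegal multibyte sequence

Always open a file with the encoding it was saved in.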
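The effect of a mismatched encoding can be reproduced without touching real data. The sketch below writes a temporary file as UTF-8 and reads it back twice; the file name demo.txt and the variable names are made up for illustration. Depending on the bytes involved, the wrong encoding either raises UnicodeDecodeError or silently produces garbled text, so the code handles both outcomes.

```python
import os
import tempfile

text = '人生苦短，我想学Python'
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Save the text as UTF-8.
with open(path, mode='w', encoding='utf-8') as f:
    f.write(text)

# Reading with the same encoding returns the original text.
with open(path, encoding='utf-8') as f:
    same_encoding = f.read()

# Reading with a different encoding fails or returns garbage.
try:
    with open(path, encoding='gbk') as f:
        wrong_encoding = f.read()
except UnicodeDecodeError as exc:
    wrong_encoding = None
    print('decode failed:', exc)

print(same_encoding == text)
print(wrong_encoding == text)
```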

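File paths

Absolute path: starts at the root directory and names every level down to the file

Relative path: resolved against the current working directory; for a file in that directory, the file name alone is enough

Relative path example

f = open('username.txt', encoding='utf-8')
content = f.read()
print(content)
f.close()

Keep the Python script and the txt file in the same folder, and run the script from that folder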
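A relative path is interpreted against the current working directory of the running process. A small sketch (username.txt is just the name from the example above, the file does not have to exist for this check):

```python
import os

relative = 'username.txt'

# abspath joins the relative path onto the current working directory.
absolute = os.path.abspath(relative)

print(os.getcwd())
print(absolute)
```

If the script is started from a different folder, the same relative name points somewhere else, which is the usual cause of an unexpected FileNotFoundError.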

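On some Windows systems, reading a file fails with

[Error 22] Invalid argument: '\u202adD:\\xx.txt'

Solutions

Option 1: prefix the string with r to make it a raw string: r'C:\log.txt'

Option 2: double each backslash so that it is escaped: 'C:\\log.txt'

The \u202a in the message is an invisible Unicode formatting character. It usually sneaks in when a path is copied from the Windows file properties dialog, and retyping the path by hand removes it.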
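The two spellings above produce the same string, while a single backslash can silently turn into an escape sequence. A short sketch (the paths are illustrative only and nothing is opened):

```python
escaped = 'C:\\log.txt'   # every backslash doubled
raw = r'C:\log.txt'       # raw string: backslashes are taken literally
print(escaped == raw)     # True

trap = 'D:\test.txt'      # here \t is read as a tab character
print('\t' in trap)       # True

# A pasted path that starts with the invisible U+202A character can be cleaned up.
pasted = '\u202aD:\\xx.txt'
cleaned = pasted.lstrip('\u202a')
print(cleaned == r'D:\xx.txt')  # True
```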

In r mode there are five ways to read a file

1: read everything at once with f.read()

f = open('tianqi.txt', encoding='utf-8')
content = f.read()
print(content)
f.close()

Output

03月27日(今天)
晴转多云
11～27℃
西南风 1级
重度污染

2: read one line at a time with f.readline()

f = open('天气.txt', encoding='utf-8')
print(f.readline())
f.close()

3: read every line of the file into a list with f.readlines()

f = open('天气.txt', encoding='utf-8')
print(f.readlines())
f.close()

Output

['03月27日(今天)\n', '晴转多云\n', '11～27℃\n', '西南风 1级\n', '重度污染']

4: read part of the file with read(n)

In r mode, read(n) reads n characters

f = open('天气.txt', encoding='utf-8')
print(f.read(3))
f.close()

Output

03月

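5: iterate with a for loop (the best of the five)

f = open('天气.txt', encoding='utf-8')
for i in f:
    print(i.strip())
f.close()

Output

03月27日(今天)
晴转多云
11～27℃
西南风 1级
重度污染

The for loop reads one line per iteration, and that line can be freed once the loop moves on, so the loop holds roughly one line of content in memory at any time.

Method 5 is the recommended one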
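The five ways of reading can be compared side by side on a throwaway file. This is a sketch under assumptions: weather.txt and its three English lines are made up so that the block runs anywhere, and with open is used to keep it short.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'weather.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('line one\nline two\nline three')

with open(path, encoding='utf-8') as f:
    everything = f.read()        # 1: the whole file as one str
with open(path, encoding='utf-8') as f:
    first_line = f.readline()    # 2: one line, trailing newline included
with open(path, encoding='utf-8') as f:
    all_lines = f.readlines()    # 3: a list with one element per line
with open(path, encoding='utf-8') as f:
    first_four = f.read(4)       # 4: the first 4 characters
with open(path, encoding='utf-8') as f:
    stripped = [line.strip() for line in f]  # 5: iterate line by line

print(repr(first_line))  # 'line one\n'
print(all_lines)         # ['line one\n', 'line two\n', 'line three']
print(first_four)        # line
print(stripped)          # ['line one', 'line two', 'line three']
```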

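Writing (w)

w: if the file does not exist, it is created and the content is written

If the file exists, its content is cleared first, then the new content is written

f = open('log.txt', encoding='utf-8', mode='w')
f.write('人生苦短，我想学Python')
f.close()

wb writes bytes: the content must be converted to bytes before it can be written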
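A minimal sketch of wb and rb, assuming a temporary file named log.bin (a made-up name). The text is encoded to bytes before writing, and binary mode takes no encoding argument.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'log.bin')
text = '人生苦短，我想学Python'
data = text.encode('utf-8')         # str -> bytes

with open(path, mode='wb') as f:    # binary mode: no encoding argument
    f.write(data)

with open(path, mode='rb') as f:
    raw = f.read()                  # the bytes come back unchanged

print(type(raw))
print(raw.decode('utf-8'))          # bytes -> str
```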

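a: append

If the file does not exist, it is created and the content is appended

If the file exists, the content is appended to the end

f = open('log2.txt', encoding='utf-8', mode='a')
f.write('666')
f.close()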

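r+: read and write; read first, then append

The wrong way

f = open('log.txt', encoding='utf-8', mode='r+')
f.write('BBB')
content = f.read()
print(content)
f.close()

The output does not contain the written text, and it is empty if the file held no more than three bytes

Why?

The cursor starts at position 0, and every operation moves it, reads included.

Here the write starts at position 0, overwrites the beginning of the file, and leaves the cursor right after the written text. f.read() then returns only what follows the cursor, never the text just written.

With r+, read first and write afterwards; otherwise existing content is overwritten or the read comes back incomplete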
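The cursor behaviour of r+ can be observed on a small ASCII file. This is a sketch with a made-up file name (cursor.txt) and content; it shows a write before any read, and then a read followed by a write.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'cursor.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('hello world')

# Write first: the write starts at position 0 and overwrites 'hel'.
with open(path, mode='r+', encoding='utf-8') as f:
    f.write('BBB')
    rest = f.read()      # continues from the cursor, after 'BBB'
    f.seek(0)            # go back to the start
    whole = f.read()

print(rest)   # lo world
print(whole)  # BBBlo world

# Read first: the cursor ends up at the end, so the write appends.
with open(path, mode='r+', encoding='utf-8') as f:
    f.read()
    f.write('!')

with open(path, encoding='utf-8') as f:
    final = f.read()

print(final)  # BBBlo world!
```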

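w+: write first, then read

f = open('log.txt', encoding='utf-8', mode='w+')
f.write('AAA')
content = f.read()
print(content)
f.close()

The output is empty

w+ clears the file on open, and after the write the cursor sits at the end, so there is nothing left to read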

The right way

f = open('log.txt', encoding='utf-8', mode='w+')
f.write('AAA')
print(f.tell())  # report the current cursor position (in bytes)
f.seek(0)  # move the cursor back to the start
content = f.read()
print(content)
f.close()

Output

3
AAA

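Another example

f = open('log.txt', encoding='utf-8', mode='w+')
f.write('中国')
print(f.tell())  # report the current cursor position (in bytes)
f.seek(2)  # move the cursor into the middle of the first character
content = f.read()
print(content)
f.close()

Output:

......

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 0: invalid start byte

seek counts bytes, and a Chinese character takes 3 bytes in UTF-8, so seek(2) lands in the middle of 中 and the remaining bytes cannot be decoded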
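The byte arithmetic behind this error is easier to see in binary mode, where seek and read work on raw bytes. A sketch with a made-up temporary file (zh.txt):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'zh.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('中国')

size = os.path.getsize(path)        # 2 characters, 3 bytes each

with open(path, mode='rb') as f:
    f.seek(3)                       # boundary between the two characters
    ok = f.read().decode('utf-8')
    f.seek(2)                       # inside the first character
    tail = f.read()

try:
    tail.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True                   # the first byte is not a valid start byte

print(size)    # 6
print(ok)      # 国
print(failed)  # True
```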

Resumable transfers, such as resuming an interrupted FTP download, depend on the cursor position and are built on tell and seek
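The idea behind resuming a transfer can be sketched with two local files. This is not an FTP client: resume_copy, chunk_size, and the file names are hypothetical and exist only to show how seek and tell cooperate.

```python
import os
import tempfile


def resume_copy(src, dst, chunk_size=4):
    """Copy src to dst, continuing from however many bytes dst already holds."""
    done = os.path.getsize(dst) if os.path.exists(dst) else 0
    with open(src, 'rb') as fin, open(dst, 'ab') as fout:
        fin.seek(done)                  # skip what was already transferred
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
        return fin.tell()               # position in src after the last read


folder = tempfile.mkdtemp()
src = os.path.join(folder, 'src.bin')
dst = os.path.join(folder, 'dst.bin')
with open(src, 'wb') as f:
    f.write(b'0123456789')
with open(dst, 'wb') as f:
    f.write(b'0123')                    # pretend the first transfer stopped here

total = resume_copy(src, dst)
with open(dst, 'rb') as f:
    copied = f.read()

print(total)   # 10
print(copied)  # b'0123456789'
```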

a+: append and read; no separate example is given here
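For completeness, a minimal sketch of a+ on a made-up temporary file (append.txt). The cursor starts at the end, writes always go to the end, and seek(0) is needed before reading.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'append.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('first')

with open(path, mode='a+', encoding='utf-8') as f:
    at_open = f.read()    # empty: the cursor starts at the end of the file
    f.write(' second')    # appended after the existing content
    f.seek(0)             # move back to the start before reading
    content = f.read()

print(repr(at_open))  # ''
print(content)        # first second
```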

r and r+ are the modes used most often; of the five ways to read in r mode, the for loop (method 5) is the one to master

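Other methods

truncate()  # cut the file down to a given size

writable()  # whether the file can be written

readable()  # whether the file can be read

truncate shortens the file, so the file must be opened in a writable mode. w and w+ are no use for trying it out because they empty the file on open, so test truncate in r+, a, or a+ mode

f = open('log.txt', encoding='utf-8', mode='r+')
# keep only the first 3 bytes
f.truncate(3)
content = f.read()
print(content)
f.close()

Output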

中
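The same truncate call can be tried on a throwaway file. In this sketch (trunc.txt is a made-up name) the file first holds 中国, which is 6 bytes in UTF-8, and is then cut to 3 bytes.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'trunc.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('中国')

with open(path, mode='r+', encoding='utf-8') as f:
    f.truncate(3)         # keep the first 3 bytes, i.e. one Chinese character
    content = f.read()    # the cursor is still at 0, so this reads what is left

size = os.path.getsize(path)

print(content)  # 中
print(size)     # 3
```

Since truncate counts bytes, a size that falls inside a multi-byte character would leave the file undecodable.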

Check whether the file is writable

f = open('log.txt', encoding='utf-8', mode='r+')
print(f.writable())
f.close()

Output

True
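readable() and writable() depend only on the mode the file was opened with. The sketch below tabulates them for the six text modes covered in this article, using a made-up temporary file (modes.txt).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('x')          # the file must exist before r and r+ can open it

capabilities = {}
for mode in ['r', 'w', 'a', 'r+', 'w+', 'a+']:
    with open(path, mode=mode, encoding='utf-8') as f:
        capabilities[mode] = (f.readable(), f.writable())

for mode, (can_read, can_write) in capabilities.items():
    print(mode, can_read, can_write)
```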

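Releasing resources:

1. f.close()  # releases the file opened at the operating-system level and closes the file handle

2. del f  # releases the application-level variable, that is, deletes the variable in Python code

To avoid forgetting to close the file handle, use with open: the handle is closed automatically when the block finishes

Feature 1: the file handle is closed automatically

with open('log.txt', encoding='utf-8') as f:
    print(f.read())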
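That the handle really is closed after the with block, even when the block raises, can be checked through the closed attribute. A sketch with a made-up temporary file (auto_close.txt) and a simulated error:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'auto_close.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('data')

with open(path, encoding='utf-8') as f:
    content = f.read()
closed_after_block = f.closed       # True: closed when the block ended

try:
    with open(path, encoding='utf-8') as g:
        raise ValueError('simulated failure')
except ValueError:
    closed_after_error = g.closed   # True: closed even though the block raised

print(closed_after_block, closed_after_error)
```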

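Feature 2: work with several files at once

with open('log.txt', encoding='utf-8') as f1, \
        open('log1.txt', encoding='utf-8', mode='r+') as f2:
    print(f1.read())
    print(f2.read())

In the rare cases where a file must be closed before some other action can run, with is not suitable.

with open is the recommended approach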

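Programs generally do not modify a file in place.

They go through five steps

1. Read the original file into memory.

2. Modify the content in memory to produce the new content.

3. Write the new string to a new file.

4. Delete the original file.

5. Rename the new file to the original name.

Replace every 张三 in the file log with 李四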

import os

# Step 1
with open('log', encoding='utf-8') as f1, \
        open('log.bak', encoding='utf-8', mode='w') as f2:
    content = f1.read()
    # Step 2
    new_content = content.replace('张三', '李四')
    # Step 3
    f2.write(new_content)
# Step 4 (both files are closed once the with block ends)
os.remove('log')
# Step 5
os.rename('log.bak', 'log')

This approach is poor for large files: f1.read() loads the entire file into memory, which can exhaust it

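Recommended approach

import os

# Step 1
with open('log', encoding='utf-8') as f1, \
        open('log.bak', encoding='utf-8', mode='w') as f2:
    for i in f1:
        # Step 2
        new_i = i.replace('张三', '李四')
        # Step 3
        f2.write(new_i)
# Step 4 (both files are closed once the with block ends)
os.remove('log')
# Step 5
os.rename('log.bak', 'log')

This way only one line is held in memory at a time.

Software that edits files generally follows these five steps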
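The five steps can be wrapped in a small reusable function. This is a sketch, not the article's code: replace_in_file is a hypothetical helper, the sample content is made up, and os.replace is swapped in for the separate remove and rename calls because it performs both in one step and also overwrites an existing target on Windows.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding='utf-8'):
    """Rewrite path line by line, replacing old with new (hypothetical helper)."""
    tmp_path = path + '.bak'
    with open(path, encoding=encoding) as src, \
            open(tmp_path, mode='w', encoding=encoding) as dst:
        for line in src:                 # one line in memory at a time
            dst.write(line.replace(old, new))
    os.replace(tmp_path, path)           # steps 4 and 5 in a single call


folder = tempfile.mkdtemp()
log_path = os.path.join(folder, 'log')
with open(log_path, mode='w', encoding='utf-8') as f:
    f.write('张三 logged in\n张三 logged out\n')

replace_in_file(log_path, '张三', '李四')

with open(log_path, encoding='utf-8') as f:
    result = f.read()
print(result)
```

Because the replacement works line by line, a match that spans a line break would not be found; for the single-word case in the article that is not a concern.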

Exercises:

#!/usr/bin/env python
# coding: utf-8
__author__ = 'www.py3study.com'
# 1. The file a.txt holds one product per line: name, price, amount.
# apple 10 3
# tesla 100000 1
# mac 3000 2
# lenovo 30000 3
# chicken 10 3
# Build this data structure from it: [{'name':'apple','price':10,'amount':3},{'name':'tesla','price':100000,'amount':1}......] and compute the total price.
# Method 1 (this solution reads the data from zuoye1.txt)
goods = []
sum1 = 0
with open('zuoye1.txt', encoding='utf-8', mode='r') as f1:
    for i in f1:
        ss = i.split()
        if not ss:  # skip blank lines
            continue
        goods.append({'name': ss[0], 'price': int(ss[1]), 'amount': int(ss[2])})

print(goods)
for i in goods:
    sum1 += i['amount'] * i['price']
print(sum1)

# Method 2
l1 = []
name_list = ['name', 'price', 'amount', 'year']  # 'year' is not used for this file
with open('zuoye1.txt', encoding='utf-8', mode='r') as f1:
    for i in f1:
        l2 = i.strip().split()
        if not l2:  # skip blank lines
            continue
        # pair each field with its column name: {'name': 'apple', 'price': '10', 'amount': '3'}
        l1.append(dict(zip(name_list, l2)))
sum2 = 0
for i in l1:
    # values are still strings here, so convert them before multiplying
    sum2 += int(i['amount']) * int(i['price'])
print(sum2)

# 3. The file a1.txt holds one product per line as key:value pairs.
# File content:
# name:apple price:10 amount:3 year:2012
# name:tesla price:100000 amount:1 year:2013
#
# Build this data structure from it:
# [{'name':'apple','price':10,'amount':3},
# {'name':'tesla','price':100000,'amount':1}......]
# and compute the total price.
# Idea: {key1 : {'name':'apple','price':10,'amount':3} ....}
# (this solution reads the data from zuoye3.txt and keeps the values as strings)
dic = {}
num2 = 0
sum2 = 0
with open("zuoye3.txt", encoding='utf-8', mode='r') as f1:
    for i in f1:
        # 'name:apple price:10 ...' -> ['name', 'apple', 'price', '10', ...]
        ss = i.strip().replace(' ', ':').split(':')
        if len(ss) < 8:  # skip blank or incomplete lines
            continue
        num2 += 1
        dic['key' + str(num2)] = {ss[0]: ss[1], ss[2]: ss[3], ss[4]: ss[5], ss[6]: ss[7]}
    print(list(dic.values()))
    for i in dic.values():
        sum2 += int(i['amount']) * int(i['price'])
    print("Total price: {}".format(sum2))
    
# 4. The file a2.txt holds a header row followed by one department per line.
# File content:
# 序号     部门      人数      平均年龄      备注
# 1       python    30         26         单身狗
# 2       Linux     26         30         没对象
# 3       运营部     20         24         女生多
# (columns: number, department, headcount, average age, remarks)
# Build this data structure from it:
# [{'序号':'1','部门':'python','人数':30,'平均年龄':26,'备注':'单身狗'},
# ......]
# and compute the total headcount.
# Idea: one dict per row, keyed by the column names from the header row
# (this solution reads the data from zuoye4.txt and keeps the values as strings)
rows = []
sum4 = 0
with open("zuoye4.txt", encoding='utf-8', mode='r') as f1:
    list1 = f1.readline().split()  # column names from the header row
    for x in f1:
        x = x.split()
        if len(x) < len(list1):  # skip blank or incomplete lines
            continue
        rows.append(dict(zip(list1, x)))

print(rows)
for y in rows:
    sum4 += int(y['人数'])
# total headcount
print('Total headcount: {}'.format(sum4))

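This article is part of the Tencent Cloud self-media syndication program and is shared from the author's personal site/blog.

Originally published: 2018/03/28. In case of infringement, contact cloudcommunity@tencent.com for removal.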
