I'm learning Python, and I've written a function that pulls text out of a .nfo file, as an improvement on this block of code:
# Book Author
_author = ["Author:", "Author.."]
logger_nfo.info('Searching Author book...')
with open(nfofile, "r+", encoding='utf-8') as file1:
    fileline1 = file1.readlines()
    for x in _author:  # <--- Loop through the list to check
        for line in fileline1:  # <--- Loop through each line
            line = line.casefold()  # <--- Set line to lowercase
            if x.casefold() in line:
                logger_nfo.info('Line found with word: %s', x)
                nfo_author = line
if nfo_author == '':
    logger_nfo.warning('Author not found.')
This is the function I ended up with:
nfofile_link = "The Sentence.nfo"
search_for = ["Author:", "Author.."]

def search_nfo(nfofile_link, search_for):
    logger_nfo.info('Searching nfo')
    with open(nfofile_link, "r+", encoding='utf-8') as file1:
        fileline1 = file1.readlines()
        for x in search_for:  # <--- Loop through the list to check
            for line in fileline1:  # <--- Loop through each line
                line = line.casefold()  # <--- Set line to lowercase
                if x.casefold() in line:
                    logger_nfo.info('Line found with word: %s', x)
                    global nfo_author
                    nfo_author = line

search_nfo(nfofile_link, search_for)
print(nfo_author)
It works, but I'd like to improve it. How can I do that?
An example of the contents of a .nfo file:
General Information
===================
Title: Agent in Berlin
Author: Alex Gerlis
Read By: Duncan Galloway
Copyright: 2021
Audiobook Copyright: 2021
Genre: Audiobook
Publisher: Canelo
Series Name: Wolf Pack Spies
Position in Series: 01
Abridged: No
Original Media Information
==========================
Media: Downloaded
Source: Audible
Condition: New
File Information
================
Number of MP3s: 42
Total Duration: 11:33:05
Total MP3 Size: 319.23 MB
Encoded At: 64 kbit/s 22050 Hz Mono
ID3 Tags: Set, v1.1, v2.3
Posted on 2021-12-11 16:06:59
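Embrace functions, not global variables. Putting your code in a function is good. However, you are not taking full advantage of functions, because you use the function to modify a global variable. Except in very rare situations, that is not a good idea. Instead, functions should take arguments and return values. Organize your programs around that basic pattern.
Push algorithmic details into easily tested functions. A function that requires a file to exist is harder to test and debug than one that takes text and returns a value. For example, a simple data-oriented function can be pasted into a Python session and experimented with. Automated tests for such a function are easy to write, because you only need to define a list or tuple of strings and pass it in. For these reasons, I would reduce search_nfo() to the bare minimum: open the file, read the lines, and do any logging that is needed. That's it. Delegate most of the work, especially the algorithmic details that might contain bugs or need debugging, testing, or future modification, to a separate function.
Don't lowercase more often than you need to. Currently, you lowercase values inside nested for-loops, so every line is casefolded once per search term. If the data volume is modest or small, that is not a big deal. Still, it is worth getting into the habit of editing code to eliminate unnecessary operations. Don't take this virtue to unreasonable extremes, but keep it in mind.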
def main():
    nfo_author = search_nfo('foobar.nfo', ('Author:', 'Author..'))
    print(nfo_author)

def search_nfo(nfofile_link, search_for):
    with open(nfofile_link) as fh:
        nfo_author = search_nfo_lines(fh.readlines(), search_for)
    return nfo_author

def search_nfo_lines(lines, search_for):
    targets = tuple(s.casefold() for s in search_for)
    for line in lines:
        lower_line = line.casefold()
        if any(t in lower_line for t in targets):
            # Do you want to return the original line
            # or the lowercase line? Adjust as needed.
            return line
    return None

if __name__ == '__main__':
    main()
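To illustrate the testability point, here is a minimal sketch of how search_nfo_lines() can be exercised with plain sequences of strings and no file at all. The function body is repeated so the snippet runs on its own; the sample lines and the checks are illustrative choices of mine, not part of the original code.

```python
def search_nfo_lines(lines, search_for):
    # Same logic as above, repeated so this snippet is self-contained.
    targets = tuple(s.casefold() for s in search_for)
    for line in lines:
        if any(t in line.casefold() for t in targets):
            return line
    return None


# Hypothetical sample data: any list or tuple of strings will do.
SAMPLE = (
    'Title: Agent in Berlin\n',
    'Author: Alex Gerlis\n',
    'Read By: Duncan Galloway\n',
)

# The first matching line is returned unchanged (original case, newline kept).
assert search_nfo_lines(SAMPLE, ('Author:', 'Author..')) == 'Author: Alex Gerlis\n'
# Matching is case-insensitive because both sides are casefolded.
assert search_nfo_lines(SAMPLE, ('AUTHOR:',)) == 'Author: Alex Gerlis\n'
# No match yields None rather than raising.
assert search_nfo_lines(SAMPLE, ('Publisher:',)) is None
```

Checks like these can be moved into a test file more or less as they are, since nothing in them touches the file system.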
Posted on 2021-12-11 16:32:03
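I fully agree with the other answer here; I just want to add my two cents.
When reading a data structure such as the one in the nfo file, it is wise to convert it into an object that is more natural in Python. If you were doing more work with the file, an NFO class would probably be best, but for now I think the contents of the file are best represented by a dictionary.
As mentioned, we want to make full use of functions and avoid global variables by returning or yielding. If we want to go through the contents of the nfo file line by line, we can do the following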
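from pathlib import Path
from pprint import pprint
from typing import Iterable

def path_2_nfo(path: Path) -> Iterable[str]:
    # Read-only mode is enough here; "r+" would also require write access.
    with open(path, "r") as f:
        for line in f:
            # Skip blank lines and yield the rest without surrounding whitespace.
            if stripped := line.strip():
                yield stripped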
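The code above yields the contents of the nfo file line by line, skipping blank lines. For example, you can run the code below to see what it does.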
BOOK_PATH = Path(r"The Sentence.nfo")
nfo = path_2_nfo(BOOK_PATH)
for line in nfo:
    print(line)
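I would also strongly recommend avoiding special characters in file names, but that is a story for another day. The second part is simply reading our nfo into a dictionary. At first glance, the nfo consists of three different kinds of lines: headers, which are single lines that do not contain `:`; separator lines, which consist entirely of `=`; and finally content lines, which contain `:`.
Read the file line by line and sort each line into one of these three types, as follows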
def nfo_2_dict(nfo: Iterable[str], sep: str = ":") -> dict:
    nfo_dict = dict()
    for line in nfo:
        # Lines are expected to arrive stripped and non-empty (see path_2_nfo).
        is_linebreak = all("=" == char for char in line)
        is_content = sep in line
        is_header = not is_linebreak and not is_content
        if is_header:
            header = line
            nfo_dict[header] = dict()
        elif is_content:
            # Split on the first separator only, so values such as
            # "11:33:05" keep their own colons.
            key, _, value = line.partition(sep)
            nfo_dict[header][key.strip()] = value.strip()
    return nfo_dict
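The three-way classification can also be pulled out into a helper of its own, which makes it easy to try on individual lines. This is only a sketch: classify_line is a hypothetical name that does not appear in the code above, and like nfo_2_dict it assumes each line has already been stripped and is not empty.

```python
def classify_line(line: str, sep: str = ":") -> str:
    """Return 'linebreak', 'content' or 'header' for a stripped, non-empty line."""
    # A separator line consists of '=' characters only.
    if all(char == "=" for char in line):
        return "linebreak"
    # A content line holds a key and a value around the separator.
    if sep in line:
        return "content"
    # Anything else is treated as a section header.
    return "header"


for sample in ("General Information", "===================", "Author: Alex Gerlis"):
    print(classify_line(sample), "|", sample)
```

With a helper like this, the loop in nfo_2_dict would only need to branch on the returned label.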
Our main routine now becomes
BOOK_PATH = Path(r"The Sentence.nfo")
nfo = path_2_nfo(BOOK_PATH)
nfo_dict = nfo_2_dict(nfo)
print(nfo_dict["General Information"]["Author"])
pprint(nfo_dict)
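Depending on how often you need to search the file, I find that building the dictionary once is much cleaner than scanning the file for every search.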
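Once the dictionary exists, a lookup that tolerates variants such as "Author:" and "Author.." can be a small function over it. The following is a rough sketch rather than part of the answer's code: find_field is a hypothetical helper, the dictionary is written out by hand to stand in for the result of nfo_2_dict, and stripping trailing dots from keys is an assumption about how such variants would show up.

```python
from typing import Optional


def find_field(nfo_dict: dict, name: str) -> Optional[str]:
    """Return the first value whose key matches `name`, ignoring case
    and trailing dots or spaces (so 'Author..' matches 'author')."""
    wanted = name.casefold().rstrip(". ")
    for section in nfo_dict.values():
        for key, value in section.items():
            if key.casefold().rstrip(". ") == wanted:
                return value
    return None


# Hand-written stand-in for the dictionary built above (hypothetical data).
nfo_dict = {
    "General Information": {"Title": "Agent in Berlin", "Author": "Alex Gerlis"},
    "File Information": {"Number of MP3s": "42"},
}

print(find_field(nfo_dict, "author"))     # Alex Gerlis
print(find_field(nfo_dict, "Publisher"))  # None
```

Because the function takes the dictionary as an argument and returns a value, it needs no global state and can be tested without a file.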
https://codereview.stackexchange.com/questions/270901