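I want to join two lines that are separated by only a single newline (\n). The next line sometimes starts with a quotation mark. I am trying to join them with this code, using \" to match the quotation mark:

comb_nextline = re.sub(r'(?<=[^\.][A-Za-z,-])\n[ ]*(?=[a-zA-Z0-9\(\"])', ' ', txt)

but it does not work for lines that start with a quotation mark. Is there a way to join lines that start with a quote? Thanks!

My txt looks like this:
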
import re
txt= '''
The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output
(I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called
"chip joining", RTC offers both a near infrared or forced convection oven.
'''
comb_nextline = re.sub(r'(?<=[^\.][A-Za-z,-])\n[ ]*(?=[a-zA-Z0-9\(\"])', ' ', txt)
print(comb_nextline)

I would like to get this:

txt =
'''
The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output (I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called "chip joining", RTC offers both a near infrared or forced convection oven.
'''

Posted on 2022-11-20 12:15:45

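You can also match optional spaces before the newline. This helps when a line ends with trailing spaces, because the lookbehind has to match directly before the text that gets replaced.

(?<=[^.][A-Za-z,-]) *\n *(?=[a-zA-Z0-9(\"])

Or use the negated character class [^\S\n] to match any whitespace character except a newline.

(?<=[^.][A-Za-z,-])[^\S\n]*\n[^\S\n]*(?=[a-zA-Z0-9(\"])

import re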
txt = '''
The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output
(I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called
"chip joining", RTC offers both a near infrared or forced convection oven.
'''
comb_nextline = re.sub(r'(?<=[^.][A-Za-z,-]) *\n *(?=[a-zA-Z0-9(\"])', ' ', txt)
print(comb_nextline)

Output

The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output (I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called "chip joining", RTC offers both a near infrared or forced convection oven.

https://stackoverflow.com/questions/74507967
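
To see how the three patterns differ, here is a minimal sketch. It assumes the failing line ends with a trailing space (or a tab) before the newline, which is not visible in the text as posted. The `sample` and `tabbed` strings are made up for the demonstration, and `join_lines` is a hypothetical helper, not part of the original code.

```python
import re

# Pattern from the question: the lookbehind must match directly before "\n".
ORIGINAL = r'(?<=[^\.][A-Za-z,-])\n[ ]*(?=[a-zA-Z0-9\(\"])'
# First variant from the answer: optional spaces on both sides of the newline.
SPACES = r'(?<=[^.][A-Za-z,-]) *\n *(?=[a-zA-Z0-9(\"])'
# Second variant: any whitespace except a newline, so tabs are covered too.
NO_NEWLINE_WS = r'(?<=[^.][A-Za-z,-])[^\S\n]*\n[^\S\n]*(?=[a-zA-Z0-9(\"])'

# Assumed input: a trailing space after "called", then a line starting with a quote.
sample = 'For a second process, called \n"chip joining", RTC offers an oven.'
# Assumed input: the same text with a trailing tab instead of a space.
tabbed = 'For a second process, called\t\n"chip joining", RTC offers an oven.'


def join_lines(pattern, text):
    """Replace every match of pattern (the line break plus padding) with one space."""
    return re.sub(pattern, ' ', text)


print(repr(join_lines(ORIGINAL, sample)))       # unchanged: the space blocks the lookbehind
print(repr(join_lines(SPACES, sample)))         # joined
print(repr(join_lines(SPACES, tabbed)))         # unchanged: " *" does not match a tab
print(repr(join_lines(NO_NEWLINE_WS, tabbed)))  # joined
```

If the real input only ever contains spaces, the ` *` variant is enough; `[^\S\n]*` is the safer choice when tabs or other horizontal whitespace may appear at line ends.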