Q: Python: extracting a column of a file using regex

Stack Overflow user

Asked 2018-07-12 05:13:41

Answers: 1 | Views: 276 | Followers: 0 | Votes: 0

I am currently extracting a column from a file by calling awk through os.system():

os.system("awk '{print $'%i'}' < infile > outfile"%some_column)
np.loadtxt('outfile')

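Is there an equivalent way to do this using regular expressions?

Thanks.

Edit: To clarify, I am looking for the best way to extract a specific column from a large file.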

python

regex

1 Answer

Stack Overflow user

Answered 2018-07-12 06:34:30

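Depending on what your data delimiter is, a regular expression may be overkill for this. If the delimiter is simple (whitespace or a specific character/string), you can separate the columns with the str.split method alone.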
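To make the distinction concrete, here is a minimal sketch contrasting the two approaches. The helper name `split_fields` and the sample lines are made up for illustration; they are not part of the question's data. A regex delimiter is probably only worth it when the separator varies from field to field.

```python
import re


def split_fields(line, pattern=None):
    """Split one line into fields.

    With pattern=None, fall back to str.split(), which splits on runs of
    whitespace. Otherwise treat `pattern` as a regular-expression delimiter.
    """
    line = line.strip()
    if pattern is None:
        return line.split()
    return re.split(pattern, line)


# Whitespace-delimited data: plain split is enough
print(split_fields("1.0   2.5\t3.7\n"))

# Mixed delimiters (comma or semicolon with optional spaces): a regex helps
print(split_fields("1.0, 2.5 ;3.7\n", r"\s*[,;]\s*"))
```

Both calls yield the same three fields; only the second needs the `re` module.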

Here is a sample program showing how to extract a column with split:

column = 0  # First column
with open("data.txt") as file:
  data = file.readlines()
columns = list(map(lambda x: x.strip().split()[column], data))

To break it down:

column = 0
# Read a file named "data.txt" into an array of lines
with open("data.txt") as file:
  data = file.readlines()
# This is where we will store the columns as we extract them
columns = []
# Iterate over each line in the file
for line in data:
  # Strip the whitespace (including the trailing newline character) from the
  # start and end of the string
  line = line.strip()
  # Split the line, using the standard delimiter (arbitrary number of
  # whitespace characters)
  line = line.split()
  # Extract the column data from the desired index and store it in our list
  columns.append(line[column])
# columns now holds a list of strings extracted from that column
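
Since the question mentions large files, the same idea can be sketched without `readlines()`, which loads every line into memory at once. The version below is only an illustration under stated assumptions: the function name `iter_column` is hypothetical, the data is whitespace-delimited, and blank lines are skipped (a line with too few fields still raises `IndexError`).

```python
import os
import tempfile


def iter_column(path, column):
    """Yield the field at index `column` from each non-blank line of `path`."""
    with open(path) as f:
        # Iterating over the file object reads one line at a time
        for line in f:
            fields = line.split()  # splits on runs of whitespace
            if fields:  # skip blank lines
                yield fields[column]


# Small demonstration using a temporary file in place of "data.txt"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 2 3\n4 5 6\n\n7 8 9\n")

second_column = list(iter_column(tmp.name, 1))
print(second_column)
os.remove(tmp.name)
```

If the column is numeric and NumPy is already in use, as in the question, `np.loadtxt` also accepts a `usecols` argument, so something like `np.loadtxt('infile', usecols=some_column - 1)` may replace the awk step entirely. Note that awk's `$1` is the first column, whereas `usecols` is zero-based.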

Votes: 0

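The original page content is provided by Stack Overflow. Translation support was provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link: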

https://stackoverflow.com/questions/51294237
