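I have an Excel spreadsheet with 100 rows. Each row holds several pieces of information, including an id and a cell containing a photo.
I use pandas to load the data into a list of dictionaries: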
import pandas as pd

df = pd.read_excel('myfile.xlsx')
data = []
for index, row in df.iterrows():
    data.append({
        'id': row['id'],
        'field2': row['field2'],
        'field3': row['field3']
    })
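For the image column, I would like to extract each image, name it after the id of its row (image_row['id'].jpg), and put it into a folder. Then I want to store the path to the image like this: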
for index, row in df.iterrows():
    data.append({
        'id': row['id'],
        'field2': row['field2'],
        'field3': row['field3'],
        # str() so the concatenation also works when the id is numeric
        'image': 'path/image_' + str(row['id']) + '.jpg'
    })
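I'm looking for a way to do this, or for a better alternative. Do you have any idea?
I'm on Linux, so I can't use the approach that relies on pywin32.
Many thanks
-EDIT
You can find an example of the worksheet I use here.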
Posted on 2020-05-27 14:16:40
I found a solution using the openpyxl and openpyxl-image-loader modules.
# installing the modules
pip3 install openpyxl
pip3 install openpyxl-image-loader
Then, in the script:
#Importing the modules
import openpyxl
from openpyxl_image_loader import SheetImageLoader
#loading the Excel File and the sheet
pxl_doc = openpyxl.load_workbook('myfile.xlsx')
sheet = pxl_doc['Sheet_name']
#calling the image_loader
image_loader = SheetImageLoader(sheet)
#get the image (put the cell you need instead of 'A1')
image = image_loader.get('A1')
#showing the image
image.show()
#saving the image
image.save('my_path/image_name.jpg')
Finally, I can loop over the rows and store the path and image name for each one in my dictionary.
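The per-row loop mentioned above could look roughly like the sketch below. It rests on a few assumptions that are not in the original post: the images sit in column B, the first data row is sheet row 2 (row 1 being the header that pandas consumes), and each DataFrame row lines up with the next sheet row. The helper names `save_row_images` and `load_records`, the column letter and the `images` folder are made up for illustration. `image_in` is the openpyxl-image-loader call for checking whether a cell holds an image, as far as I know from its README.

```python
from pathlib import Path


def save_row_images(ids, image_loader, column="B", out_dir="images", first_row=2):
    """Save the image found in `column` of each row as image_<id>.jpg.

    `image_loader` is expected to behave like SheetImageLoader:
    image_in(cell) -> bool and get(cell) -> PIL image.
    Returns a dict mapping each id to the saved path (None if the cell is empty).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    for offset, row_id in enumerate(ids):
        cell = f"{column}{first_row + offset}"
        if not image_loader.image_in(cell):
            paths[row_id] = None
            continue
        image = image_loader.get(cell)
        target = out / f"image_{row_id}.jpg"
        # JPEG has no alpha channel, so convert to RGB before saving
        image.convert("RGB").save(target)
        paths[row_id] = str(target)
    return paths


def load_records(xlsx_path="myfile.xlsx", sheet_name="Sheet_name"):
    """Hypothetical glue code: build the list of dicts from the question."""
    # Third-party imports live here so save_row_images stays usable on its own
    import openpyxl
    import pandas as pd
    from openpyxl_image_loader import SheetImageLoader

    df = pd.read_excel(xlsx_path)
    sheet = openpyxl.load_workbook(xlsx_path)[sheet_name]
    paths = save_row_images(df["id"].tolist(), SheetImageLoader(sheet))
    records = df[["id", "field2", "field3"]].to_dict("records")
    for record in records:
        record["image"] = paths[record["id"]]
    return records
```

Calling `load_records()` should then give one dictionary per row with an `image` key, provided the column and row assumptions match your sheet.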
Posted on 2021-10-19 16:16:33
import pathlib
import shutil


def extract_images_from_excel(path, dir_extract=None):
    """Extract images from an Excel file and name them with enumerated filenames.

    Args:
        path: pathlib.Path or str, Excel file path
        dir_extract: pathlib.Path, default=None, defaults to same dir as the Excel file

    Returns:
        new_paths: list[pathlib.Path], list of paths to the extracted images
    """
    path = pathlib.Path(path)
    if dir_extract is None:
        dir_extract = path.parent
    dir_extract = pathlib.Path(dir_extract)
    if path.suffix != '.xlsx':
        raise ValueError('path must be an xlsx file')
    name = path.name.replace(''.join(path.suffixes), '').replace(' ', '')  # name of excel file without suffixes
    temp_file = path.parent / 'temp.xlsx'  # temp xlsx
    temp_zip = temp_file.with_suffix('.zip')  # temp zip
    shutil.copyfile(path, temp_file)
    temp_file.rename(temp_zip)
    extract_dir = temp_file.parent / 'temp'
    extract_dir.mkdir(exist_ok=True)
    shutil.unpack_archive(temp_zip, extract_dir)  # unzip xlsx zip file
    paths_img = sorted((extract_dir / 'xl' / 'media').glob('*.png'))  # find images (PNG only)
    move_paths = {p: dir_extract / f'{name}-{n}.png' for n, p in enumerate(paths_img)}  # create move path dict
    new_paths = [pathlib.Path(shutil.move(str(old), str(new))) for old, new in move_paths.items()]  # move / rename image files
    shutil.rmtree(extract_dir)  # delete temp folder
    temp_zip.unlink()  # delete temp file
    return new_paths
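The above ^ copies the workbook, unpacks the copy as a zip archive, and moves the PNG files found under xl/media into the target folder with enumerated names.
It needs no third-party packages and does not require Windows to run.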
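One thing the enumerated names do not give you is the row each picture belongs to, which the question asks for. A rough sketch of how that could be recovered with the standard library alone follows. It assumes the pictures of the sheet are described in `xl/drawings/drawing1.xml` (a workbook with several sheets may use another drawing number), that each picture is anchored with an `xdr:from` element, and that there is at most one picture per row. The function name `images_by_row` is made up for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

NS = {
    "xdr": "http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "rel": "http://schemas.openxmlformats.org/package/2006/relationships",
}
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"


def images_by_row(xlsx_path, drawing="xl/drawings/drawing1.xml"):
    """Return {zero-based anchor row: path of the image inside the archive}."""
    folder, filename = posixpath.split(drawing)
    rels_path = f"{folder}/_rels/{filename}.rels"
    with zipfile.ZipFile(xlsx_path) as archive:
        rels_root = ET.fromstring(archive.read(rels_path))
        drawing_root = ET.fromstring(archive.read(drawing))
    # Relationship id -> media path relative to the archive root
    targets = {}
    for rel in rels_root.findall("rel:Relationship", NS):
        joined = posixpath.join(folder, rel.get("Target"))
        targets[rel.get("Id")] = posixpath.normpath(joined).lstrip("/")
    mapping = {}
    for anchor in drawing_root:  # oneCellAnchor / twoCellAnchor elements
        row = anchor.find("xdr:from/xdr:row", NS)
        blip = anchor.find(".//a:blip", NS)
        if row is None or blip is None:
            continue
        media = targets.get(blip.get(R_EMBED))
        if media is not None:
            # A later picture on the same row overwrites an earlier one
            mapping[int(row.text)] = media
    return mapping
```

The row numbers are zero-based, so a picture in sheet row 2 (the first data row under a header) shows up under key 1. The returned paths can be read with `zipfile.ZipFile(...).read(path)` and written out under the id of the matching row.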
Posted on 2021-01-28 07:01:53
You can rename the xlsx file to .zip and unzip it.
$ cp a.xlsx a.zip
$ unzip a.zip
$ ls -al xl/media
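If you would rather stay in Python, the same idea works without copying or renaming anything, since `zipfile` opens the .xlsx directly. This is only a minimal sketch: the function name `extract_media` and the output folder are my own choices, and it simply copies whatever sits under `xl/media/` regardless of image format.

```python
import zipfile
from pathlib import Path


def extract_media(xlsx_path, out_dir):
    """Copy every file stored under xl/media/ in the workbook into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in sorted(archive.namelist()):
            # Skip directory entries and everything outside xl/media/
            if name.startswith("xl/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                saved.append(target)
    return saved
```

For example, `extract_media('a.xlsx', 'images')` returns the list of files it wrote into `images/`.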
https://stackoverflow.com/questions/62039535