BadZipFile: File is not a zip file

Stack Overflow user
Asked 2018-05-31 02:57:35
Answers: 1 · Views: 18.4K · Followers: 0 · Votes: 3

Here is my code. I get the following error when I try to run this script:

 Error    raise BadZipFile("File is not a zip file")  
          BadZipFile: File is not a zip file

This is my source directory path:

data_dir = r'L:\DataQA\Python Unzip Files\Source Zipped'

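I have multiple zipped folders inside the "Source Zipped" folder, which is itself not zipped. The same code works when I zip all the subfolders of "Source Zipped" into a single zipped folder, but I don't want to use that approach.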

import os
import zipfile
import shutil
import json
import logging
import logging.config
import time

def my_start_time():
    global start_time, cumulative_time, start_time_stamp
    start_time = time.time()
    this_time = time.localtime(start_time)
    start_time_stamp = '{:4d}{:02d}{:02d} {:02d}:{:02d}:{:02d}'.format(\
                    this_time.tm_year, this_time.tm_mon, this_time.tm_mday,\
                    this_time.tm_hour, this_time.tm_min, this_time.tm_sec)
    cumulative_time = start_time - start_time 
    logging.info('Initial Setup: {:s}'.format(start_time_stamp))

def my_time():
    global cumulative_time
    time_taken = time.time() - start_time
    incremental_time = time_taken - cumulative_time
    cumulative_time = time_taken
    logging.info("Started: %s  Complete:  Cumulative: %.4f s  Incremental: %.4f s\n" \
          % (start_time_stamp, cumulative_time, incremental_time) )

logging.basicConfig(filename='myunzip_task_log.txt',level=logging.DEBUG)
my_start_time()

logging.info('Initial Setup...')

def write_to_json(data, file):
    value = False
    with open(file, 'w') as f:
        json.dump(json.dumps(data, sort_keys=True),f)   
        f.close()
        value = True
    return value


data_dir = r'L:\DataQA\Python Unzip Files\Source Zipped'
temp_dir =  r'L:\DataQA\Python Unzip Files\temp1'
new_dir = r'L:\DataQA\Python Unzip Files\temp2'
final_dir = r'L:\DataQA\Python Unzip Files\Destination Unzipped files'





big_list = os.listdir(data_dir)

archive_count = 0
file_count = 152865
basename1 = os.path.join(final_dir,'GENERIC_ROUGHDRAFT')
basename2 = os.path.join(final_dir,'XACTDOC')

my_time()
archive_count = len(big_list)
logging.info('Unzipping {} archives...'.format(archive_count))
for folder in big_list:
    prior_count = file_count
    logging.info('Starting: {}'.format(folder))

    try:
        shutil.rmtree(temp_dir)
    except FileNotFoundError: 
        pass
    os.mkdir(temp_dir)
    with zipfile.ZipFile(os.path.join(data_dir,folder),mode='r') as a_zip:
        a_zip.extractall(path = temp_dir)
        archive_count += 1
        logging.info('Cumulative total of {} archive(s) unzipped'.format(archive_count))
        bigger_list = os.listdir(temp_dir)
        logging.info('Current archive contains {} subfolders'.format(len(bigger_list)))
        for sub_folder in bigger_list:
            with zipfile.ZipFile(os.path.join(temp_dir,sub_folder),mode='r') as b_zip:
                b_zip.extractall(path = new_dir)
            file1 = "%s (%d).%s" % (basename1, file_count, 'xml')
            file2 = "%s (%d).%s" % (basename2, file_count, 'xml')
            shutil.copy(os.path.join(new_dir, 'GENERIC_ROUGHDRAFT.xml'), file1)
            shutil.copy(os.path.join(new_dir, 'XACTDOC.xml'), file2)
            file_count += 1
        logging.info('{} subfolders unzipped'.format(file_count - prior_count))
    #os.remove(data_dir)
    shutil.rmtree(data_dir)
    os.mkdir(data_dir)
    #os.unlink(data_dir)
    my_time()
logging.info('Total of {0} files -- {1} pairs -- should be in {2}'.format(2*(file_count-1), file_count-1, final_dir))

time.sleep(1)

my_time()


1 Answer

Stack Overflow user

Answered 2018-05-31 03:11:39

In both of the statements that open a zip archive:

with zipfile.ZipFile(os.path.join(data_dir,folder),mode='r')

with zipfile.ZipFile(os.path.join(temp_dir,sub_folder),mode='r')

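nothing (at least nothing that we can check) guarantees that the file names you are passing actually refer to .zip files. Each one could be a directory, an already extracted file, or some file that was already there...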
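One way to test that assumption directly is to ask the standard library whether each entry looks like a ZIP archive. The sketch below is only an illustration: `list_real_archives` is a hypothetical helper name, not part of the question's script, and it relies on `zipfile.is_zipfile`, which inspects the file's contents instead of its name.

```python
import os
import zipfile


def list_real_archives(directory):
    """Return the paths in `directory` that zipfile recognises as ZIP archives.

    Hypothetical helper for illustration; not part of the original script.
    """
    archives = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip directories, then let zipfile inspect the file's contents.
        if os.path.isfile(path) and zipfile.is_zipfile(path):
            archives.append(path)
    return archives
```

Logging the entries that are filtered out (the difference between `os.listdir(data_dir)` and this list) would show which name triggers the BadZipFile error.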

I suggest checking the file extension before you extract, for example:

import fnmatch
zfn = os.path.join(temp_dir, sub_folder)
# Only try to open the entry if its name ends in .zip
if fnmatch.fnmatch(zfn, "*.zip"):
    with zipfile.ZipFile(zfn, mode='r') as b_zip:
        b_zip.extractall(path=new_dir)

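Some of the .zip files could also be corrupted, but that is less likely. Also, if you want to extract .jar files and other zip-structured files that have a different extension, replace the fnmatch test with: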

if zfn.lower().endswith(('.zip','.jar','.docx')):
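To cover the less likely case of a corrupted archive as well, the extension check can be combined with a `try`/`except` around the extraction. This is a minimal sketch under stated assumptions: `extract_archives` and `ZIP_EXTENSIONS` are hypothetical names introduced here, and the function skips bad entries and reports them instead of stopping the whole run.

```python
import logging
import os
import zipfile

# Assumption: the same extensions as in the check above.
ZIP_EXTENSIONS = ('.zip', '.jar', '.docx')


def extract_archives(src_dir, dest_dir):
    """Extract every archive in src_dir into dest_dir.

    Hypothetical helper for illustration. Returns two lists of file
    names: (extracted, skipped).
    """
    extracted, skipped = [], []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        # Skip directories and names without a zip-like extension.
        if not os.path.isfile(path) or not name.lower().endswith(ZIP_EXTENSIONS):
            skipped.append(name)
            continue
        try:
            with zipfile.ZipFile(path, mode='r') as archive:
                archive.extractall(path=dest_dir)
        except zipfile.BadZipFile:
            # Right extension, but the contents are not a valid archive.
            logging.warning('Skipping %s: not a valid zip file', path)
            skipped.append(name)
            continue
        extracted.append(name)
    return extracted, skipped
```

Called as `extract_archives(data_dir, temp_dir)`, the second list names every entry that would otherwise have raised BadZipFile or been opened by mistake.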
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/50611708
