I have a set of .parquet files on my local machine that I am trying to upload to a container in Data Lake Gen2.
I cannot do it this way:
def upload_file_to_directory():
    try:
        file_system_client = service_client.get_file_system_client(file_system="my-file-system")
        directory_client = file_system_client.get_directory_client("my-directory")
        file_client = directory_client.create_file("uploaded-file.parquet")
        local_file = open("C:\\file-to-upload.parquet",'r')
        file_contents = local_file.read()
        file_client.append_data(data=file_contents, offset=0, length=len(file_contents))
        file_client.flush_data(len(file_contents))
    except Exception as e:
        print(e)
because the .parquet file cannot be read by the .read() function.
When I try this instead:
def upload_file_to_directory():
    file_system_client = service_client.get_file_system_client(file_system="my-file-system")
    directory_client = file_system_client.get_directory_client("my-directory")
    file_client = directory_client.create_file("uploaded-file.parquet")
    file_client.upload_file("C:\\file-to-upload.txt",'r')
I get the following error:
AttributeError: 'DataLakeFileClient' object has no attribute 'upload_file'
Any suggestions?
Posted on 2022-06-21 05:13:19
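You get this error because DataLakeFileClient has no upload_file method. The method it provides for uploading in a single call is upload_data; append_data followed by flush_data also works. These clients, including DataLakeServiceClient, come from the azure-storage-file-datalake package:
pip install azure-storage-file-datalake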
As for reading the .parquet file, one workaround is to use pandas. Here is the code that worked for me.
storage_account_name = '<ACCOUNT_NAME>'
storage_account_key = '<ACCOUNT_KEY>'

service_client = DataLakeServiceClient(
    account_url="https://{}.dfs.core.windows.net".format(storage_account_name),
    credential=storage_account_key)
file_system_client = service_client.get_file_system_client(file_system="container")
directory_client = file_system_client.get_directory_client(directory="directory")
file_client = directory_client.create_file("uploaded-file.parquet")

# Read the local file, then serialize it back to Parquet bytes
# (to_parquet() called without a path returns bytes in recent pandas versions)
local_file = pd.read_parquet("<YOUR_FILE_NAME>.parquet")
data = local_file.to_parquet()

# Either approach works. upload_data writes and commits in one call:
file_client.upload_data(data=data, overwrite=True)
# Alternatively, append the bytes and then flush explicitly:
# file_client.append_data(data=data, offset=0, length=len(data))
# file_client.flush_data(len(data))
The code above needs the following imports:
from azure.storage.filedatalake import DataLakeServiceClient
import pandas as pd
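If the file only needs to be copied as-is, the pandas round trip is probably unnecessary. The .read() failure in the question most likely comes from opening a binary file in text mode ('r'); opening it with 'rb' returns raw bytes that can be uploaded directly. The following is a minimal sketch under that assumption. The helper names upload_whole_file and upload_in_chunks and the chunk_size parameter are hypothetical and not part of the Azure SDK; file_client is expected to be a DataLakeFileClient obtained as shown above.

```python
def upload_whole_file(file_client, local_path):
    """Upload a local file in one call (hypothetical helper).

    The file is opened in binary mode ('rb') so that Parquet bytes are
    passed through unchanged instead of being decoded as text.
    """
    with open(local_path, "rb") as f:
        # upload_data accepts bytes or a readable stream and commits the file
        file_client.upload_data(data=f, overwrite=True)


def upload_in_chunks(file_client, local_path, chunk_size=4 * 1024 * 1024):
    """Upload a local file piece by piece (hypothetical helper).

    Uses append_data for each chunk, then flush_data once at the end.
    chunk_size is an arbitrary value chosen for this sketch.
    Returns the total number of bytes uploaded.
    """
    offset = 0
    with open(local_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Each chunk is appended at the current end of the remote file
            file_client.append_data(data=chunk, offset=offset, length=len(chunk))
            offset += len(chunk)
    # Nothing is committed until flush_data is called with the final length
    file_client.flush_data(offset)
    return offset
```

With the file_client created earlier, a call such as upload_whole_file(file_client, "C:\\file-to-upload.parquet") should be enough for files that fit comfortably in one request, while upload_in_chunks may be preferable for larger files.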
https://stackoverflow.com/questions/72692155