1. Directly Called Functions
1.1. API List
download_from_hdfs(hdfs_url, hdfs_path, local_path)
Download files from HDFS to a local machine.
:param hdfs_url: WebHDFS address, for example, http://10.0.3.16:4008.
:type hdfs_url: str
:param hdfs_path: path on HDFS.
:type hdfs_path: str
:param local_path: local path.
:type local_path: str
:return: local result path.
:rtype: str

upload_to_hdfs(local_path, hdfs_url, hdfs_path)
Upload a local directory to HDFS.
:param local_path: local path.
:type local_path: str
:param hdfs_url: WebHDFS address, for example, http://10.0.3.16:4008.
:type hdfs_url: str
:param hdfs_path: path on HDFS.
:type hdfs_path: str
:return: result path on HDFS.
:rtype: str

upload_to_hive_by_hdfs(local_path, hdfs_url, hive_server, table_name, database='default', auth='CUSTOM', username=None, password=None, overwrite=False, partition='')
Import the data in a local file into a Hive table, with HDFS as the intermediate storage.
Process: upload the local file to HDFS first, and then import the file from HDFS into the Hive table.
:param local_path: local file or folder. The folder cannot contain subfolders.
:type local_path: str
:param hdfs_url: WebHDFS URL, for example, http://10.0.3.16:4008.
:type hdfs_url: str
:param hive_server: HiveServer2 address.
:type hive_server: str
:param table_name: Hive table name.
:type table_name: str
:param database: database name.
:type database: str
:param auth: authentication method.
:type auth: str
:param username: username for database authentication.
:type username: str
:param password: password for database authentication.
:type password: str
:param overwrite: whether to delete the original data.
:type overwrite: bool
:param partition: partition selection.
:type partition: str

export_from_hive_by_hdfs(local_path, hdfs_url, hive_server, table_name='', sql='', database='default', auth='CUSTOM', username=None, password=None, row_format="row format delimited fields terminated by ','")
Export a Hive table to a local machine, with HDFS as the intermediate storage. For large files, this method is more efficient than writing directly from the Hive table to the local machine.
Process: export the Hive table to HDFS first, and then download it from HDFS to the local machine.
:param local_path: local directory.
:type local_path: str
:param hdfs_url: WebHDFS URL, for example, http://10.0.3.16:4008.
:type hdfs_url: str
:param hive_server: HiveServer2 address.
:type hive_server: str
:param table_name: Hive table name. This parameter can be ignored when sql is specified.
:type table_name: str
:param sql: SQL statement for querying data, for example, select * from t1.
:type sql: str
:param database: database name.
:type database: str
:param auth: authentication method.
:type auth: str
:param username: username for database authentication.
:type username: str
:param password: password for database authentication.
:type password: str
:param row_format: row output format.
:type row_format: str
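The import described above ultimately issues a Hive LOAD DATA statement against the staged HDFS path. As a hedged sketch of that final step (the helper name and exact statement layout are illustrative assumptions, not tikit internals), the statement can be assembled from the overwrite and partition parameters like this:

```python
def build_load_statement(hdfs_path, table_name, overwrite=False, partition=""):
    """Assemble the Hive LOAD DATA statement used to import staged HDFS data.

    ``partition`` is a spec such as "dt='2021-07-21'"; an empty string means
    the table is not partitioned.
    """
    parts = ["LOAD DATA INPATH '%s'" % hdfs_path]
    if overwrite:
        parts.append("OVERWRITE")  # replace the table's existing data
    parts.append("INTO TABLE %s" % table_name)
    if partition:
        parts.append("PARTITION (%s)" % partition)
    return " ".join(parts)

# With overwrite=True the statement replaces the table's current contents:
# build_load_statement("/tmp/stage/f1", "t1", overwrite=True)
# → "LOAD DATA INPATH '/tmp/stage/f1' OVERWRITE INTO TABLE t1"
```

This mirrors why overwrite=False appends to the table while overwrite=True deletes the original data first.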
1.2. Usage:
import tikit
tikit.upload_to_hdfs("dir1/file1", "http://10.0.3.16:4008", "/dir1/file1")
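The hdfs_url parameter follows the standard WebHDFS REST convention, where the service address is combined with the HDFS path under the /webhdfs/v1 prefix. A minimal sketch of that convention (the helper is illustrative only; tikit's internals may build the URL differently):

```python
def webhdfs_open_url(hdfs_url, hdfs_path):
    """Build the WebHDFS REST URL that reads a file (the OPEN operation).

    WebHDFS exposes HDFS paths under the "/webhdfs/v1" prefix, with the
    operation passed as the "op" query parameter.
    """
    return "%s/webhdfs/v1%s?op=OPEN" % (hdfs_url.rstrip("/"), hdfs_path)

# webhdfs_open_url("http://10.0.3.16:4008", "/dir1/file1")
# → "http://10.0.3.16:4008/webhdfs/v1/dir1/file1?op=OPEN"
```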
2. Methods Called Through the Client
describe_cos_buckets(self)
List all buckets.
:return: bucket list.
:rtype: dict
The returned result is as follows:
{
    "Owner": {
        "ID": "qcs::cam::uin/100011011262:uin/100011011262",
        "DisplayName": "100011011162"
    },
    "Buckets": {
        "Bucket": [
            {
                "Name": "bucket-58565",
                "Location": "ap-beijing-fsi",
                "CreationDate": "2021-07-21T11:06:00Z",
                "BucketType": "cos"
            },
            {
                "Name": "tai-1300158565",
                "Location": "ap-guangzhou",
                "CreationDate": "2021-10-22T11:04:40Z",
                "BucketType": "cos"
            }
        ]
    }
}

describe_cos_path(self, bucket, path, maker='', max_keys=1000, encoding_type='')
Obtain information on a Cloud Object Storage (COS) directory. A maximum of 1,000 entries under the directory can be returned per call. To list the files and folders under a directory, add a slash (/) at the end of its path.
:param bucket: COS bucket.
:type bucket: str
:param path: path.
:type path: str
:param maker: list entries starting from this marker.
:type maker: str
:param max_keys: maximum number of entries returned at a time. The maximum value is 1000.
:type max_keys: int
:param encoding_type: encoding method for returned results. The only supported value is url.
:type encoding_type: str
:return: directory information.
:rtype: dict

upload_to_cos(self, local_path, bucket, cos_path)
Upload files or directories from a local path to COS.
:param local_path: local path.
:type local_path: str
:param bucket: COS bucket.
:type bucket: str
:param cos_path: COS path.
:type cos_path: str
:return: None. Errors are reported by raising exceptions.

download_from_cos(self, bucket, cos_path, local_path)
Download files or directories from COS to a local machine.
Note: existing local files are overwritten directly. When cos_path is a directory and local_path is an existing directory, the folder name of cos_path is kept as a subdirectory of local_path.
:param bucket: COS bucket.
:type bucket: str
:param cos_path: COS path.
:type cos_path: str
:param local_path: local path.
:type local_path: str
:return: None. Errors are reported by raising exceptions.

delete_cos_path(self, bucket, delete_path)
Delete a COS path. A path without a trailing slash (/) is deleted as a file; a path with a trailing slash (/) is deleted as a folder, together with the files and folders under it.
:param bucket: COS bucket.
:type bucket: str
:param delete_path: path to be deleted.
:type delete_path: str
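Because describe_cos_path returns at most 1,000 entries per call, listing a large directory means paging with the maker parameter. A hedged sketch of such a loop (the pagination helper is the caller's code, not a tikit API, and the response keys assumed here — "Contents", "IsTruncated", "NextMarker" — follow the COS ListObjects convention):

```python
def list_all_objects(describe_fn, bucket, path, page_size=1000):
    """Page through a COS-style listing until it is exhausted.

    ``describe_fn`` is expected to behave like ``client.describe_cos_path``:
    each call returns a dict with "Contents" (the entries for this page),
    "IsTruncated" ("true"/"false"), and "NextMarker" when truncated.
    """
    entries, marker = [], ""
    while True:
        page = describe_fn(bucket, path, maker=marker, max_keys=page_size)
        entries.extend(page.get("Contents", []))
        if page.get("IsTruncated") != "true":
            return entries          # last page reached
        marker = page["NextMarker"]  # resume after the last returned key
```

A usage sketch, assuming a connected client: `list_all_objects(client.describe_cos_path, "bucket-58565", "dir1/")`.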
3. Methods Called Through HiveInitial
3.1. Initialization
from tikit.client import Client
from tikit.hive import HiveInitial

client = Client("your_secret_id", "your_secret_key", "<region>")
hive_init = HiveInitial(client)
hive_init.hive_initial("emr-xsjhbhf", "hadoop", "./emr.keytab")
3.2. APIs
def spark_hive_initial_wedata(self, wedata_id, source_account=None):
    """Initialize WeData Hive for Spark. (After calling this method, you can use Spark to perform Hive operations.)

    :param wedata_id: WeData data source ID.
    :type wedata_id: int
    :param source_account: If Hive is a system source, the account UIN needs to be specified.
    :type source_account: str
    """

def hive_initial_wedata(self, wedata_id, source_account=None):
    """Obtain the WeData Hive connection handle.

    :param wedata_id: WeData data source ID.
    :type wedata_id: int
    :param source_account: If Hive is a system source, the account UIN needs to be specified.
    :type source_account: str
    """

def spark_hive_initial(self, emr_id, username=None, keytab=None):
    """Initialize EMR Hive for Spark. (After calling this method, you can use Spark to perform Hive operations.)

    :param emr_id: Tencent Cloud EMR ID.
    :type emr_id: str
    :param username: If Kerberos authentication is used, the corresponding username needs to be specified.
    :type username: str
    :param keytab: keytab file path. If the default account (for example, hadoop) of a cluster is used, the keytab path needs to be provided.
    :type keytab: str
    """

def hive_initial(self, emr_id, username=None, keytab=None):
    """Obtain the EMR Hive connection handle.

    :param emr_id: Tencent Cloud EMR ID.
    :type emr_id: str
    :param username: If Kerberos authentication is used, the corresponding username needs to be specified.
    :type username: str
    :param keytab: keytab file path. If the default account (for example, hadoop) of a cluster is used, the keytab path needs to be provided.
    :type keytab: str
    """

def hive_initial_custom(self, host=None, port=None, scheme=None, username=None, database='default', auth=None,
                        configuration=None, kerberos_service_name=None, password=None, check_hostname=None,
                        ssl_cert=None, thrift_transport=None):
    """Connect to HiveServer2.

    :param host: What host HiveServer2 runs on.
    :param port: What port HiveServer2 runs on. Defaults to 10000.
    :param auth: The value of hive.server2.authentication used by HiveServer2. Defaults to ``NONE``.
    :param configuration: A dictionary of Hive settings (functionally the same as the ``set`` command).
    :param kerberos_service_name: Use with auth='KERBEROS' only.
    :param password: Use with auth='LDAP' or auth='CUSTOM' only.
    :param thrift_transport: A ``TTransportBase`` for custom advanced usage. Incompatible with host, port, auth, kerberos_service_name, and password.

    The way LDAP and GSSAPI are supported originates from cloudera/Impyla:
    https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62/impala/_thrift_api.py#L152-L160
    """

def upload_to_wedata_hive(self, wedata_id, local_path, table_name, database='default', overwrite=False,
                          partition='', source_account=None):
    """Upload files to WeData Hive.

    :param wedata_id: WeData data source ID.
    :type wedata_id: int
    :param local_path: local file path.
    :type local_path: str
    :param table_name: table name.
    :type table_name: str
    :param database: database.
    :type database: str
    :param overwrite: whether to delete the original data.
    :type overwrite: bool
    :param partition: partition selection.
    :type partition: str
    :param source_account: If Hive is a system source, the account UIN needs to be specified.
    :type source_account: str
    """

def export_from_wedata_hive(self, wedata_id, local_path, table_name='', database='default', sql='',
                            row_format="row format delimited fields terminated by ','", source_account=None):
    """Export WeData Hive data to a local machine.

    :param wedata_id: WeData data source ID.
    :type wedata_id: int
    :param local_path: local file path.
    :type local_path: str
    :param table_name: table name.
    :type table_name: str
    :param database: database.
    :type database: str
    :param sql: SQL statement for querying data, for example, select * from t1.
    :type sql: str
    :param row_format: row output format.
    :type row_format: str
    :param source_account: If Hive is a system source, the account UIN needs to be specified.
    :type source_account: str
    """
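Both export methods accept either a table name or a full SQL query, plus a row_format clause applied to the staged output. A hedged sketch of how the staging statement can be assembled (the helper name and exact statement shape are illustrative assumptions, not tikit internals; the INSERT OVERWRITE DIRECTORY form is standard Hive DML):

```python
def build_export_statement(hdfs_dir, row_format, table_name="", sql=""):
    """Assemble the Hive statement that stages table data onto HDFS.

    ``sql`` wins when both arguments are given, matching the documented
    rule that table_name can be ignored when SQL is specified.
    """
    query = sql if sql else "select * from %s" % table_name
    return "INSERT OVERWRITE DIRECTORY '%s' %s %s" % (hdfs_dir, row_format, query)

# build_export_statement("/tmp/out",
#                        "row format delimited fields terminated by ','",
#                        table_name="t1")
# → "INSERT OVERWRITE DIRECTORY '/tmp/out' row format delimited fields terminated by ',' select * from t1"
```

The default row_format produces comma-delimited text files, which is why the exported local files read as CSV-like data.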