Q: OAuth 2.0 for GCP BigQuery using Python

Stack Overflow user

Asked 2019-11-22 06:03:19

3 answers · 1.8K views · 0 followers · 0 votes

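I am looking for a code snippet that implements OAuth 2.0 authentication for GCP so that I can connect to the BigQuery service.

I am writing the Python code in Google Cloud Shell, but fetching the access token fails with a Bad Request error.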

access_token = google.fetch_token(token_url=token_url, client_id=client_id, client_secret=client_secret, authorization_response=redirect_response)

I also need to automate this process, so pasting the redirect_response in by hand has to be avoided.

python

google-cloud-platform

oauth-2.0

google-bigquery

google-oauth

3 Answers

Stack Overflow user

Answered 2019-11-22 06:18:35

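I recommend using the BigQuery Python client library, which is provided by the pip package google-cloud-bigquery. You also need to set GOOGLE_APPLICATION_CREDENTIALS to the path of a service account JSON file.

With this approach you do not need to handle token generation and renewal yourself, because the client library takes care of it in the background.

For details, see the Python section of BigQuery Client Libraries.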
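
As a rough illustration of that approach, here is a minimal sketch. It assumes google-cloud-bigquery is installed and that GOOGLE_APPLICATION_CREDENTIALS already points at a valid key file. The helper names run_query and rows_to_dicts are made up for this example and are not part of the library.

```python
def rows_to_dicts(rows):
    """Turn row objects exposing .items() (as BigQuery Row does) into plain dicts."""
    return [dict(row.items()) for row in rows]


def run_query(sql):
    """Run a query using Application Default Credentials and return a list of dicts."""
    # Imported lazily; requires `pip install google-cloud-bigquery`.
    from google.cloud import bigquery

    # No token handling here: the client reads GOOGLE_APPLICATION_CREDENTIALS
    # and refreshes access tokens on its own.
    client = bigquery.Client()
    return rows_to_dicts(client.query(sql).result())
```

Calling run_query("SELECT 1 AS x") would then need no OAuth redirect step, which is what makes this route suitable for automation.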

Votes: 0

Stack Overflow user

Answered 2019-11-22 11:04:59

BigQuery Client Libraries documents how to set up authentication from both the GCP Console and the command line.

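To use the BigQuery API library, you need to authenticate your service account. The gcloud command gcloud iam service-accounts keys create [FILE_NAME].json --iam-account [NAME]@[PROJECT_ID].iam.gserviceaccount.com generates a JSON key file containing the private information needed to do so (such as project_id, the private key, etc.).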
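
Before handing such a key file to the client, it can help to check that it really is a service account key. The sketch below uses only the standard library. The set of required fields is an assumption chosen for illustration (real key files carry more fields), and the function names are hypothetical.

```python
import json

# Assumed minimal set of fields; actual key files contain additional ones.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_key_fields(key_info):
    """Return the required fields that are absent or empty in a parsed key file."""
    return [name for name in REQUIRED_FIELDS if not key_info.get(name)]


def load_key_file(path):
    """Parse a key file and raise ValueError if it does not look like a service account key."""
    with open(path) as handle:
        key_info = json.load(handle)
    missing = missing_key_fields(key_info)
    if missing:
        raise ValueError("key file is missing fields: {}".format(", ".join(missing)))
    if key_info["type"] != "service_account":
        raise ValueError("unexpected key type: {}".format(key_info["type"]))
    return key_info
```

A check like this tends to surface a wrong or truncated file earlier, and with a clearer message, than a failed API call would.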

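When making BigQuery API calls, you need to supply these credentials to your application code. This can be done by setting the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the service account JSON file: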

export GOOGLE_APPLICATION_CREDENTIALS="PATH/TO/SERVICE_ACCOUNT.json"

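However, this only lasts for the current shell session, so if the session expires or you open a new one, you will need to set the variable again. Another way to authenticate is to use google.oauth2.service_account.Credentials.from_service_account_file inside the Python script.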
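
If you would rather keep relying on the environment variable, one possible workaround for the session problem is to set it from inside the script before any client is created. This is only a sketch: ensure_credentials_env is a hypothetical helper, and the path passed to it is a placeholder.

```python
import os

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def ensure_credentials_env(default_path, environ=None):
    """Set the credentials variable if the shell did not; return the value in effect."""
    environ = os.environ if environ is None else environ
    # setdefault keeps a value exported by the shell and only fills in a missing one.
    return environ.setdefault(ENV_VAR, default_path)
```

Calling ensure_credentials_env("/PATH/TO/SERVICE_ACCOUNT.json") at the top of the script would affect only that Python process, not the surrounding shell.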

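In the following Python code, the service account is authenticated with google.oauth2.service_account.Credentials.from_service_account_file. The script then creates a new BigQuery table from a CSV file in Google Cloud Storage and inserts new rows into that table.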

from google.cloud import bigquery
from google.oauth2 import service_account

# Path to the service account credentials
key_path = "/PATH/TO/SERVICE-ACCOUNT.json"
credentials = service_account.Credentials.from_service_account_file(
    key_path,
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Instantiate the BigQuery client with the explicit service account credentials
bigquery_client = bigquery.Client(credentials=credentials, project=credentials.project_id)

GCS_URI    = "gs://MY_BUCKET/MY_CSV_FILE"
DATASET_ID = "MY_DATASET"
TABLE_ID   = "MY_TABLE"

def bq_insert_from_gcs(target_uri = GCS_URI, dataset_id = DATASET_ID, table_id = TABLE_ID):
    """This method inserts a CSV file stored in GCS into a BigQuery Table."""

    dataset_ref = bigquery_client.dataset(dataset_id)

    job_config = bigquery.LoadJobConfig()
    # Schema autodetection enabled
    job_config.autodetect = True
    # Skip the first row, which corresponds to the field names
    job_config.skip_leading_rows = 1
    # Format of the data in GCS
    job_config.source_format = bigquery.SourceFormat.CSV
    load_job = bigquery_client.load_table_from_uri(
        target_uri,
        dataset_ref.table(table_id),
        job_config=job_config,
    )

    print('Starting job {}'.format(load_job.job_id))
    print('Loading file {} into the Bigquery table {}'.format(target_uri, table_id))

    load_job.result()
    return 'Job finished.\n'


def bq_insert_to_table(rows_to_insert, dataset_id = DATASET_ID, table_id= TABLE_ID):
    """This method inserts rows into a BigQuery table"""

    # Prepares a reference to the dataset and table
    dataset_ref = bigquery_client.dataset(dataset_id)
    table_ref = dataset_ref.table(table_id)
    # API request to fetch the table
    table = bigquery_client.get_table(table_ref)

    # API request to insert the rows_to_insert
    print("Inserting rows into BigQuery table {}".format(table_id))
    errors = bigquery_client.insert_rows(table, rows_to_insert)
    assert errors == []


bq_insert_from_gcs()

rows_to_insert = [('Alice', 'cat'),
                  ('John', 'dog')]
bq_insert_to_table(rows_to_insert)

In addition, I strongly recommend writing your script in Python 3, because google-cloud-bigquery will stop supporting Python 2 from January 1, 2020.

Votes: 0

Stack Overflow user

Answered 2019-11-25 09:19:18

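You need the service account's credentials exported to JSON. In the GCP console, go to IAM & Admin -> Service Accounts; under the three-dot menu you will find the Create key option for your account.

As mentioned in the previous answers, you also need the BigQuery library.

Then something like this should work: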

from google.cloud import bigquery
from google.oauth2 import service_account

def BigQuery():
  try:
    credentials = service_account.Credentials.from_service_account_file(
      '/Credentials.json')
    project_id = '[project_id]'
    client = bigquery.Client(credentials= credentials,project=project_id)

    query = ('SELECT Column1, Column2 FROM `{}.{}.{}` limit 20'.format('[project_id]','[dataset]','[table]'))
    query_job = client.query(query)
    results = query_job.result()
    for row in results:
      print('Column1: {}, Column2: {}'.format(row.Column1, row.Column2))
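  except Exception as exc:
    # Report the actual failure instead of hiding it behind a bare except
    print('Error: {}'.format(exc))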



if __name__ == '__main__':
  BigQuery()