With Data Lake Compute, you can run data analysis queries directly on data stored in COS within minutes. It currently supports multiple formats, including CSV, ORC, PARQUET, JSON, AVRO, and text files.
Preliminary Preparations
Before initiating a query, you need to grant the internal permissions of Data Lake Compute and configure the path for query results.
Step 1: Establish the necessary internal permissions for Data Lake Compute.
Note
If the user already has the necessary permissions, or if they are the root account administrator, this step can be disregarded.
If you are logging in as a sub-account for the first time, in addition to the necessary CAM authorization, you also need to ask a Data Lake Compute admin or the root account admin to grant you the required Data Lake Compute permissions on the Permission Management page in the left sidebar of the Data Lake Compute console (for a detailed explanation of permissions, see DLC Permission Overview).
1. Table permissions: Grant read and write permissions on the corresponding catalog, database, table, and view.
2. Engine permissions: Grant usage, monitoring, and modification permissions on the compute engine.
Note
The system automatically provides each user with a shared public engine based on the Presto kernel, so you can quickly try the service without purchasing a private cluster first.
1. When using Data Lake Compute for the first time, you must first configure the path for query results. Once configured, query results will be saved to this COS path.
2. Navigate to Data Exploration via the left sidebar menu.
3. Select Database, click +, choose Create Database to establish a new database. As shown below:
Enter the database name and its descriptive information.
4. After selecting the execution engine in the upper right corner, execute the generated 'create database' statement to complete the database creation.
The details are as shown below:
For detailed operation steps and configuration methods, please refer to Database Management.
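For reference, the generated statement is ordinary SQL DDL. The following is a minimal sketch, assuming Hive/Spark-style syntax; the database name demo_db is only an example, and the actual statement is produced by the console:

-- Create the database that will hold the tables for analysis; the name is an example.
CREATE DATABASE IF NOT EXISTS demo_db;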
Step 2: Create an External Table
If you are familiar with SQL statements, you can write the CREATE TABLE statement directly in the query editor and skip the creation wizard.
2. Navigate to Data Exploration via the left sidebar menu.
3. Select the database/table, right-click the newly created database, and choose Create External Table.
Note
External tables typically refer to data files stored in your own COS bucket. Data Lake Compute can create external tables over them directly for analysis, without any additional data loading. Given the nature of external tables, operations such as DROP TABLE in Data Lake Compute will not delete your original data; they only remove the table's metadata.
4. Follow the guide to generate the table creation statement, completing each step in the following order: Data Path > Data Format > Data Format Configuration > Edit Partition.
Step 1: Select the COS path where the data files are stored (the path must be a directory within the COS bucket, not the bucket itself). A shortcut for quickly uploading files to COS is also provided here; this operation requires the relevant COS permissions.
Step 2: Select the data format. Currently, Data Lake Compute supports text files, CSV, JSON, PARQUET, ORC, and AVRO.
Note
Structure inference is an auxiliary tool for table creation and cannot guarantee 100% accuracy. You still need to check whether the inferred field names and types meet your expectations and correct them as needed.
Step 3: If there are no partitions, you can skip this step. Using partitions appropriately can improve analysis performance. For details about partitions, see Query Partition Table.
5. Click Complete to generate the SQL table creation statement. Execute the generated statement after selecting the data engine to complete the table creation.
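For reference, the statement generated by the wizard for CSV files stored in COS generally follows Hive-style DDL. The sketch below is illustrative only: the database, table, column names, partition column, and cosn:// path are hypothetical, and the exact statement depends on your data format configuration:

-- External table over CSV files in a COS directory; all names and the path are examples.
CREATE EXTERNAL TABLE IF NOT EXISTS demo_db.demo_audit_table (
  _c0 STRING,
  _c1 STRING,
  _c5 STRING
)
PARTITIONED BY (dt STRING)  -- include only if the data is partitioned
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'cosn://examplebucket-appid/audit-data/';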
Step 3: Execute SQL Analysis
After the data is prepared, write the SQL analysis statement, select an appropriate compute engine, and start data analysis.
Sample
Write a SQL statement that queries all data whose result is SUCCESS, select a compute engine, and run the statement.
SELECT * FROM DataLakeCatalog.demo2.demo_audit_table WHERE _c5 = 'SUCCESS'
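If the table were partitioned (for example by a hypothetical dt column), filtering on the partition column limits the amount of data scanned, which is where the performance gain mentioned in the partition step comes from. An illustrative variant, assuming such a column exists:

-- Only files under the matching dt partition are read; the column and value are examples.
SELECT count(*) FROM DataLakeCatalog.demo2.demo_audit_table WHERE dt = '2023-06-01' AND _c5 = 'SUCCESS'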