Data Development Overview

Last updated: 2024-11-22 16:51:24

WeData Data Development gives data engineers and related roles an efficient, collaborative, and intelligent platform for data development. The module supports multiple engines, including EMR, Data Lake Compute (DLC), and Tencent Cloud Data Warehouse TCHouse-P, and offers script development, visual development, task orchestration, task release, and task operation and maintenance, helping organizations and enterprises build their data warehouses efficiently.

Use Limits

1. The compute task types available in WeData Data Development depend on the compute engine bound to the project. Binding an EMR engine requires the Hive and Spark services to be started, after which the HiveSQL, SparkSQL, and Spark task types can be used for development. Other EMR-based task types in WeData, such as Trino, likewise require the corresponding EMR service to be enabled. Binding a TCHouse-P engine enables TCHouse-P tasks, and binding a DLC engine enables the DLC SQL and DLC Spark task types. To develop against other relational databases, create a data source in WeData and use JDBC tasks to write SQL scripts; WeData Data Development is responsible for executing these scripts.
2. Each compute engine has its own data access control system. WeData does not enforce data access control itself; instead, it connects to each engine's permission system through account interoperability. For example, permissions on EMR data in WeData are managed through account mapping.
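
The engine-to-task-type rules in item 1 can be summarized as a small lookup. The following is only an illustrative sketch: the function `available_task_types`, its parameters, and the string labels are hypothetical names chosen for this example and are not part of any WeData API. It also assumes that an EMR engine contributes no task types until both Hive and Spark are started, which is one reading of the binding requirement above.

```python
# Illustrative sketch only: names and labels are hypothetical, not a WeData API.

# Task types enabled by binding each engine, as listed in item 1.
ENGINE_TASK_TYPES = {
    "DLC": {"DLC SQL", "DLC Spark"},
    "TCHouse-P": {"TCHouse-P"},
}

# Services that must be started before an EMR engine can be bound.
EMR_BASE_SERVICES = {"Hive", "Spark"}
EMR_BASE_TASK_TYPES = {"HiveSQL", "SparkSQL", "Spark"}

# Additional EMR task types that need their own EMR service (Trino is the
# only one named in the text).
EMR_OPTIONAL_TASK_TYPES = {"Trino": "Trino"}


def available_task_types(bound_engines, emr_services=frozenset(),
                         has_jdbc_data_source=False):
    """Return the sorted task types a project could use under these rules."""
    task_types = set()
    for engine in bound_engines:
        task_types |= ENGINE_TASK_TYPES.get(engine, set())

    # Assumption: EMR contributes nothing unless Hive and Spark are both started.
    if "EMR" in bound_engines and EMR_BASE_SERVICES <= set(emr_services):
        task_types |= EMR_BASE_TASK_TYPES
        for task_type, service in EMR_OPTIONAL_TASK_TYPES.items():
            if service in emr_services:
                task_types.add(task_type)

    # Other relational databases are reached through a WeData data source
    # and JDBC tasks.
    if has_jdbc_data_source:
        task_types.add("JDBC")

    return sorted(task_types)


if __name__ == "__main__":
    print(available_task_types({"EMR"}, {"Hive", "Spark", "Trino"}))
    print(available_task_types({"DLC"}, has_jdbc_data_source=True))
```

Keeping the rules in plain dictionaries makes it easy to check a project's configuration against the limits above before creating tasks; the actual set of task types shown in the console depends on the WeData version and the engine configuration.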