SCF provides a serverless execution environment for companies and developers. For more information, see SCF.
A common use case for TencentDB for PostgreSQL is to synchronize information from a message middleware to TencentDB for PostgreSQL for further analysis. This article provides a convenient method, namely using SCF to import data from Kafka into TencentDB for PostgreSQL in real-time, eliminating the need for users to maintain any services.
Precautions
Only Tencent Cloud CKafka is supported as the data source. External Kafka services are not supported.
One function can only import data to one table in CDWPG. To write data into multiple tables, you need to create one function for each table.
Directions
Step 1: Create a Function
Navigate to SCF Console and select Function Service > Create. In the Search field of the "Create Function" page, enter "ckafka data load to CDW" and click Next after setting up.
In the "Function Configuration" page, set up the Environment Configuration and Network Configuration as follows:
Environment configuration
Memory: set the memory based on the actual running status, which is 128 MB by default. If it is insufficient during data import, you should increase it.
Environment Variables:
Parameter
Required
Description
DB_DATABASE
Yes
Database name
DB_HOST
Yes
If the function is within a private network and shares the same subnet with TencentDB for PostgreSQL, you can enter the private IP of TencentDB for PostgreSQL. Otherwise, you need to enter the public IP and configure the whitelist.
DB_USER
Yes
Username
DB_PASSWORD
Yes
User password
DB_SCHEMA
Yes
Schema name, typically 'public' if not specified during table creation.
DB_TABLE
Yes
Table name
DB_PORT
No
The default port for TencentDB for PostgreSQL in the cloud data warehouse is 5436.
MSG_SEPARATOR_ASCII
No
The ASCII code for the data separator in CKafka is set to 39 (comma) by default. However, as commas often appear in business data, it is recommended to use 11 (Vertical tab) instead.
MSG_NULL
No
NULL value of CKafka consumption. The default value is \N
REPLACE_0X00
No
Whether to replace 0x00 in the string. The default is 0 (1 indicates replacement).
ENABLE_DEBUG
No
Whether to print erroneous records. The default is 0 (1 indicates printing).
ENABLE_COS
No
Whether to dump unwritten records to COS. The default is 0 (1 indicates dumping).
COS_SECRET_ID
No
The secret_id for accessing COS. This field is mandatory if ENABLE_COS is set to 1.
COS_SECRET_KEY
No
The secret_key for accessing COS. This field is mandatory if ENABLE_COS is set to 1.
COS_BUCKET
No
The name of the COS bucket. This field is mandatory if ENABLE_COS is set to 1.
STATMENT_TIMEOUT
No
The default query timeout period is set to 50 seconds.
Network configuration
Private Network: It is recommended to enable the private network and configure the VPC and subnet values to match those of the Cloud Data Warehouse PostgreSQL.
The following image shows the corresponding values for the Cloud Data Warehouse PostgreSQL.
Public Network Access: we recommend you also enable this option.
Step 2: Configuring the Trigger
In the SCF console, under the Function Service list, click on the newly created function to access its details page. On the left side of the page, select Trigger Management > Create Trigger to create a new trigger. The Trigger Method should be set to "Ckafka Trigger", as shown in the following figure: