COPY moves data between TBase tables and standard file-system files. COPY TO copies the contents of a table to a file; COPY FROM copies data from a file to a table, appending to the existing data. COPY TO can also copy the results of a SELECT query to a file. If a column list is specified, COPY copies only the listed columns to or from the file; for table columns not in the list, COPY FROM inserts their default values. When using COPY, the TBase server reads or writes the file directly on its local storage: the file must be accessible to the operating-system user running the server, and the path is interpreted from the server's perspective.
Experiment Table Structure and Data
```sql
\d+ t
```
```
 Column |   Type    |   Modifiers   | Storage  | Stats target | Description
--------+-----------+---------------+----------+--------------+-------------
 f1     | integer   | not null      | plain    |              |
 f2     | character | not null      | extended |              |
 f3     | timestamp | default now() | plain    |              |
 f4     | integer   |               | plain    |              |
```
The test data is reloaded and modified at various points during the examples below:
```sql
select * from t;
```
```
 f1 |  f2   |             f3             | f4
----+-------+----------------------------+----
  3 | pgxz  | 2017-10-28 18:24:05.645691 |
  1 | Tbase |                            |  7
  2 |       | 2017-10-28 18:24:05.643102 |  3
```
COPY TO Detailed Usage - Copy Data to a File
Export all columns
```sql
copy public.t to '/data/pgxz/t.txt';
```
```shell
cat /data/pgxz/t.txt
```
The default generated file content includes all columns of the table, with columns separated by tabs. NULL values are represented as \N.
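As an illustration of the TEXT-format rules just described, here is a minimal Python sketch (hypothetical helpers, not part of TBase) that renders and parses one row, assuming values contain no embedded tabs or backslashes (which the real format would backslash-escape):

```python
# Hypothetical helpers illustrating COPY's TEXT format: columns are joined
# by tabs and NULL is written as \N. Assumes no embedded tabs/backslashes,
# which the real format would backslash-escape.
def to_copy_text(row):
    return "\t".join(r"\N" if v is None else str(v) for v in row)

def from_copy_text(line):
    return [None if f == r"\N" else f for f in line.split("\t")]

row = [3, "pgxz", "2017-10-28 18:24:05.645691", None]
line = to_copy_text(row)
print(line)
print(from_copy_text(line))  # ['3', 'pgxz', '2017-10-28 18:24:05.645691', None]
```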
Export selected columns
```sql
copy public.t(f1,f2) to '/data/pgxz/t.txt';
```
```shell
cat /data/pgxz/t.txt
```
Only the f1 and f2 columns are exported.
Export query results
```sql
copy (select f2,f3 from public.t order by f3) to '/data/pgxz/t.txt';
```
```shell
cat /data/pgxz/t.txt
```
The query can be arbitrarily complex.
Specify file format
Generate in CSV format
```sql
copy public.t to '/data/pgxz/t.txt' with csv;
```
```shell
cat /data/pgxz/t.txt
```
Generate in binary format
```sql
copy public.t to '/data/pgxz/t.txt' with binary;
```
The default format is TEXT.
Specify the delimiter between columns using the delimiter option
```sql
copy public.t to '/data/pgxz/t.txt' with delimiter '@';
```
```shell
cat /data/pgxz/t.txt
```
Specify the character that separates columns in the file. The default is a tab character for text format and a comma for CSV format. The delimiter must be a single-byte character, meaning Chinese characters are not supported. This option is not allowed when using binary format.
Handling NULL values
```sql
copy public.t to '/data/pgxz/t.txt' with NULL 'NULL';
```
```shell
cat /data/pgxz/t.txt
```
NULL values are exported as the specified null string (here, the string NULL). This option is not allowed when using binary format.
Generate column header names
Export the table using CSV format and the HEADER option:
```sql
copy public.t to '/data/pgxz/t.txt' with csv HEADER;
```
View the export results:
```shell
cat /data/pgxz/t.txt
```
```
f1,f2,f3,f4
1,Tbase,,7
2,pgxc,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Attempting to use the HEADER option in non-CSV mode will result in an error:
```sql
copy public.t to '/data/pgxz/t.txt' with HEADER;
```
```
ERROR:  COPY HEADER available only in CSV mode
```
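For comparison, Python's csv module produces the same header-then-rows layout that `with csv HEADER` generates. This is an analogy using the example table's data, not TBase code; `''` stands in for NULL, which COPY's CSV format also writes as an empty field:

```python
import csv
import io

# Analogy for COPY ... WITH CSV HEADER: header line first, then data rows.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["f1", "f2", "f3", "f4"])  # the HEADER line
writer.writerow([1, "Tbase", "", 7])
writer.writerow([3, "pgxz", "2017-10-28 18:24:05.645691", ""])
print(buf.getvalue())
```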
Export the oids system column
Create a new table with oids:
```sql
drop table t;
CREATE TABLE t (
    f1 integer NOT NULL,
    f2 text NOT NULL,
    f3 timestamp without time zone,
    f4 integer
) with oids DISTRIBUTE BY SHARD (f1);
```
Import data including oids:
```sql
copy t from '/data/pgxz/t.txt' with csv;
```
Query to confirm the import results:
```sql
select * from t;
```
```
 f1 |    f2    |             f3             | f4
----+----------+----------------------------+----
  1 | Tbase    |                            |  7
  2 | pg'", xc | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz     | 2017-10-28 18:24:05.645691 |
```
Export data using the oids option:
```sql
copy t to '/data/pgxz/t.txt' with oids;
```
View the export results:
```shell
cat /data/pgxz/t.txt
```
```
35055	1	Tbase	\N	7
35056	2	pg'", xc	2017-10-28 18:24:05.643102	3
35177	3	pgxz	2017-10-28 18:24:05.645691	\N
```
Use the quote option to define the quote character
The default quote character for CSV export is the double quote:
```sql
copy t to '/data/pgxz/t.txt' with csv;
```
View the export results:
```shell
cat /data/pgxz/t.txt
```
```
1,Tbase,,7
2,"pg'"", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Set a custom quote character, here the percent sign:
```sql
copy t to '/data/pgxz/t.txt' with quote '%' csv;
```
View the export results:
```shell
cat /data/pgxz/t.txt
```
```
1,Tbase,,7
2,%pg'", xc%%%,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Use a custom escape character. If escape is not specified, it defaults to the QUOTE value:
```sql
copy t to '/data/pgxz/t.txt' with quote '%' csv;
```
Set the escape character to '@':
```sql
copy t to '/data/pgxz/t.txt' with quote '%' escape '@' csv;
```
View the export results:
```shell
cat /data/pgxz/t.txt
```
```
1,Tbase,,7
2,%pg'", xc@%%,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Attempting to use multibyte characters as escape will result in an error:
```sql
copy t to '/data/pgxz/t.txt' with quote '%' escape '@@' csv;
```
```
ERROR:  COPY escape must be a single one-byte character
```
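Python's csv dialect options can mimic the quote '%' escape '@' behaviour shown above. This is an analogy, not the server's implementation: with a custom quote character and doubling disabled, an embedded quote character is prefixed with the escape character instead of being doubled.

```python
import csv
import io

# Analogy for COPY ... WITH quote '%' escape '@' CSV.
buf = io.StringIO()
writer = csv.writer(buf, quotechar="%", escapechar="@", doublequote=False)
writer.writerow([2, "pg'\", xc%", "2017-10-28 18:24:05.643102", 3])
line = buf.getvalue().rstrip("\r\n")
print(line)  # 2,%pg'", xc@%%,2017-10-28 18:24:05.643102,3
```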
Force quoting for specific columns
Export specified columns using CSV format and the force_quote option:
```sql
copy t to '/data/pgxz/t.txt' (format 'csv', force_quote (f1,f2));
```
View the export results:
```shell
cat /data/pgxz/t.txt
```
```
"1","Tbase",,7
"2","pg'"", xc%",2017-10-28 18:24:05.643102,3
"3","pgxz",2017-10-28 18:24:05.645691,
```
Multiple columns can be force-quoted, and the column order is arbitrary:
```sql
copy t to '/data/pgxz/t.txt' (format 'csv', force_quote (f1,f4,f3,f2));
```
View the export results:
```shell
cat /data/pgxz/t.txt
```
```
"1","Tbase",,"7"
"2","pg'"", xc%","2017-10-28 18:24:05.643102","3"
"3","pgxz","2017-10-28 18:24:05.645691",
```
Attempting to use the force_quote option in non-CSV mode will result in an error:
```sql
copy t to '/data/pgxz/t.txt' (format 'text', force_quote (f1,f2,f3,f4));
```
```
ERROR:  COPY force quote available only in CSV mode
```
Use encoding to specify file content encoding for export
Export files using different encodings:
```sql
copy t to '/data/pgxz/t.csv' (encoding utf8);
copy t to '/data/pgxz/t.csv' (encoding gbk);
```
Set the client encoding with SET client_encoding and export a CSV file:
```sql
set client_encoding to gbk;
copy t to '/data/pgxz/t.csv' with csv;
```
Use multibyte characters as separators
Tencent Cloud Data Warehouse TCHouse-P version 5.06.2.1 and above supports multibyte separators.
Create a new table and insert data:
```sql
create table t_position_copy(f1 int, f2 varchar(10), f3 int);
insert into t_position_copy values(123, 'Adi', 2021);
```
Export data using multibyte separators:
```sql
copy t_position_copy to '/data/tbase/t_position_copy_mul_delimiter.text' with delimiter '@&';
```
View the export results:
```shell
cat /data/tbase/t_position_copy_mul_delimiter.text
```
```
123@&Adi@&2021
```
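The multibyte delimiter here is the two-character string '@&'. It simply separates whole column values; a Python sketch of splitting such a line, assuming the delimiter sequence never occurs inside a field:

```python
# Split a line exported with the multibyte delimiter '@&'
# (assumes the delimiter sequence does not occur inside any field).
DELIMITER = "@&"
line = "123@&Adi@&2021"
fields = line.split(DELIMITER)
print(fields)  # ['123', 'Adi', '2021']
```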
COPY FROM Detailed Usage - Copy File Content to a Table
Import all columns
```shell
cat /data/pgxz/t.txt
```
```sql
truncate table t;
copy t from '/data/pgxz/t.txt';
select * from t;
```
Import specified columns
Copy data from specified columns to a file:
```sql
copy t(f1,f2) to '/data/pgxz/t.txt';
```
View file content:
```shell
cat /data/pgxz/t.txt
```
```
1	Tbase
2	pg'", xc%
3	pgxz
```
Clear table t and import the specified columns from the file:
```sql
truncate table t;
copy t(f1,f2) from '/data/pgxz/t.txt';
```
Query to confirm the import results:
```sql
select * from t;
```
```
 f1 |  f2   |             f3             | f4
----+-------+----------------------------+----
  1 | Tbase | 2017-10-30 11:54:16.559579 |
  2 | pg'   | 2017-10-30 11:54:16.559579 |
  3 | pgxz  | 2017-10-30 11:54:16.560283 |
```
Specify the input file format
Import a text file:
```sql
copy t from '/data/pgxz/t.txt' (format 'text');
```
Import a CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv');
```
Import a binary file:
```sql
copy t from '/data/pgxz/t.bin' (format 'binary');
```
Query to confirm the import results:
```sql
select * from t;
```
```
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
```
Specify the delimiter between columns using the delimiter option
Import a text file using a custom delimiter:
```sql
copy t from '/data/pgxz/t.txt' (format 'text', delimiter '%');
```
Import a CSV file using a custom delimiter:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', delimiter '%');
```
Query to confirm the import results:
```sql
select * from t;
```
```
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
```
NULL value handling
The import file contains NULL strings:
```sql
copy t from '/data/pgxz/t.txt' (null 'NULL');
```
Query to confirm NULL value handling results:
```sql
select * from t;
```
```
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
```
Note:
The null string is matched exactly, including case; if it does not match the string used in the file, the import fails:
```sql
copy t from '/data/pgxz/t.txt' (null 'null');
```
```
ERROR:  invalid input syntax for type timestamp: "NULL"
CONTEXT:  COPY t, line 1, column f3: "NULL"
```
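The failure comes from an exact, case-sensitive comparison against the null string; a sketch of the rule with a hypothetical helper (not TBase code):

```python
# Hypothetical helper: COPY compares each field to the null string exactly
# (case-sensitive). A non-matching 'NULL' is passed through as literal text
# and then rejected by the column type's input parser.
def parse_field(raw, null_string):
    return None if raw == null_string else raw

print(parse_field("NULL", "NULL"))  # None -> stored as SQL NULL
print(parse_field("NULL", "null"))  # 'NULL' -> handed to the timestamp parser, fails
```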
Custom quote character
View CSV file content:
```shell
cat /data/pgxz/t.csv
```
```
1,Tbase,,7
2,%pg'", xc%%%,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Configure quote character when importing CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', quote '%');
```
Custom escape character
View CSV file content:
```shell
cat /data/pgxz/t.csv
```
```
1,Tbase,,7
2,"pg'@", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Configure escape character when importing CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', escape '@');
```
Query to confirm the import results:
```sql
select * from t;
```
```
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
```
Use the HEADER option to skip the first line of a CSV file
View CSV file content (with header):
```shell
cat /data/pgxz/t.csv
```
```
f1,f2,f3,f4
1,Tbase,,7
2,"pg'"", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Ignore header when importing CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', header true);
```
Query to confirm the import results:
```sql
select * from t;
```
```
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
```
Import oid column value
View text file content (including oid):
```shell
cat /data/pgxz/t.txt
```
```
35242	1	Tbase	\N	7
35243	2	pg'", xc%	2017-10-28 18:24:05.643102	3
35340	3	pgxz	2017-10-28 18:24:05.645691	\N
```
Import text file and retain oid:
```sql
truncate table t;
copy t from '/data/pgxz/t.txt' (oids true);
```
Query to confirm the import results:
```sql
select oid, * from t;
```
```
  oid  | f1 |    f2     |             f3             | f4
-------+----+-----------+----------------------------+----
 35242 |  1 | Tbase     |                            |  7
 35243 |  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
 35340 |  3 | pgxz      | 2017-10-28 18:24:05.645691 |
```
Use FORCE_NOT_NULL to convert null values in a column to zero-length strings
View CSV file content (including null values):
```shell
cat /data/pgxz/t.csv
```
```
1,Tbase,,7
2,"pg'"", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
4,,2017-10-30 16:14:14.954213,4
```
Import CSV file using FORCE_NOT_NULL:
```sql
truncate table t;
copy t from '/data/pgxz/t.csv' (format 'csv', FORCE_NOT_NULL (f2));
```
Query to confirm the null value handling result in a column:
```sql
select * from t where f2 = '';
```
```
 f1 | f2 |             f3             | f4
----+----+----------------------------+----
  4 |    | 2017-10-30 16:14:14.954213 |  4
```
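A sketch of what FORCE_NOT_NULL changes, using a hypothetical helper (not TBase code): in CSV mode an unquoted empty field normally loads as NULL, but the listed columns keep the zero-length string.

```python
# Hypothetical helper: in CSV mode an unquoted empty field normally becomes
# NULL; FORCE_NOT_NULL makes the listed columns keep the zero-length string.
def load_field(raw, column, force_not_null=()):
    if raw == "" and column not in force_not_null:
        return None  # default CSV behaviour: empty -> NULL
    return raw       # forced columns load as ''

print(repr(load_field("", "f2")))                          # None
print(repr(load_field("", "f2", force_not_null=("f2",))))  # ''
```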
Use encoding to specify the file encoding for import
Determine the file encoding using the enca command:
```shell
enca -L zh_CN /data/pgxz/t.txt
```
```
Simplified Chinese National Standard; GB2312
```
Import file into the database:
```sql
copy t from '/data/pgxz/t.txt';
```
Query to confirm the import results:
```sql
select * from t;
```
```
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
```
If the encoding format of the import file is not specified, Chinese characters cannot be imported correctly. Reimport using GBK encoding:
```sql
truncate table t;
copy t from '/data/pgxz/t.txt' (encoding gbk);
```
Query to confirm import results again:
```sql
select * from t;
```
```
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
  4 | Tencent   | 2017-10-30 16:41:09.157612 |  4
```
You can use the enconv command to convert the file encoding before importing the data:
```sql
truncate table t;
```
```shell
enconv -L zh_CN -x UTF-8 /data/pgxz/t.txt
```
```sql
copy t from '/data/pgxz/t.txt';
```
Query to confirm the import results after converting encoding:
```sql
select * from t;
```
Use Position to specify field length for import
View the content of the text file to be imported:
```shell
cat /data/tbase/t_position_copy.txt
```
```
123 Adi 2021
```
Use position to specify field length for import:
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:9), f3 position(10:13)) from '/data/tbase/t_position_copy.txt';
```
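position(m:n) selects bytes m through n of each line, 1-based and inclusive. A Python sketch of the slicing, using a made-up sample line ('Adi' padded with spaces to fill the 6-byte f2 field):

```python
# position(m:n) as 1-based, inclusive byte ranges over the raw line.
# The sample line is hypothetical: 'Adi' padded to the 6-byte f2 field.
def position(line, start, end):
    return line[start - 1:end]

line = b"123Adi   2021"
print(position(line, 1, 3))    # b'123'
print(position(line, 4, 9))    # b'Adi   '
print(position(line, 10, 13))  # b'2021'
```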
Use position to specify field lengths and apply a function on import
```shell
cat /data/tbase/t_position_copy.txt
```
```
123 Adi 2020010120200101121212
```
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:9), f3 position(10:17) to_date($3,'yyyymmdd'), f4 position(18:31) to_timestamp($4,'yyyymmddhh24miss')) from '/data/tbase/t_position_copy.txt';
```
Specify the file encoding when importing with position
Tencent Cloud Data Warehouse TCHouse-P version 5.06.2.1 and above supports this feature. This feature allows data files with character encoding different from the database to be imported correctly.
```shell
file /data/tbase/t_position_copy.txt
```
```
/data/tbase/t_position_copy.txt: ISO-8859 text
```
Importing data without specifying encoding will cause errors:
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:7), f3 position(8:11)) from '/data/tbase/t_position_copy.txt';
```
```
ERROR:  multibyte character "Ü2" is truncated for column f2
CONTEXT:  COPY t_position_copy, line 1: "123°¢µÜ2021", nodetype:1(1:cn,0:dn)
```
Specify the file encoding to be imported as gbk:
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:7), f3 position(8:11)) from '/data/tbase/t_position_copy.txt' encoding 'gbk';
select * from t_position_copy;
```
```
 f1 | f2  |  f3
----+-----+------
123 | Adi | 2021
```
Note: in GBK encoding each Chinese character occupies 2 bytes, which is why f2 spans the 4-byte range position(4:7) here (two characters).
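This can be checked from Python: the same character that needs 2 bytes in GBK needs 3 in UTF-8, which is why position ranges depend on the file's encoding.

```python
# The same Chinese character is 2 bytes in GBK but 3 bytes in UTF-8, so
# position(...) byte ranges depend on the file's encoding.
ch = "\u817e"  # an example CJK character
print(len(ch.encode("gbk")))    # 2
print(len(ch.encode("utf-8")))  # 3
```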
If your GBK file was exported from Oracle and is being imported into a UTF-8 TBase database, add the csv option to ensure a correct import:
```sql
copy t4(f1 position(1:1), f2 position(2:3)) from '/data/tbase/t4.txt' with csv encoding 'gbk';
```
Without the csv option, characters such as "猏" and "玕" cause import errors:
```sql
copy t4(f1 position(1:1), f2 position(2:3)) from '/data/tbase/t4.txt' encoding 'gbk';
```
```
ERROR:  character with byte sequence 0xab 0x0a in encoding "GBK" has no equivalent in encoding "UTF8"
CONTEXT:  COPY t4, line 1: "1«1ÌÚ", nodetype:1(1:cn,0:dn)
```
Use multibyte separators
This is supported in Tencent Cloud Data Warehouse TCHouse-P version 5.06.2.1 and above.
```sql
create table t_position_copy(f1 int, f2 varchar(10), f3 int);
```
View the file to be imported:
```shell
cat /data/tbase/t_position_copy_mul_delimiter.text
```
```
123@&Adi@&2021
```
Import using the multibyte delimiter:
```sql
copy t_position_copy from '/data/tbase/t_position_copy_mul_delimiter.text' with delimiter '@&';
```
```sql
select * from t_position_copy;
```
```
 f1 | f2  |  f3
----+-----+------
123 | Adi | 2021
```