
Usage of COPY

Last updated: 2024-08-22 17:03:13

COPY copies data between TBase tables and standard file-system files. COPY TO copies the contents of a table to a file, and COPY FROM copies data from a file into a table (the data is appended to whatever is already in the table). COPY TO can also copy the result of a SELECT query to a file. If a column list is specified, COPY copies only the specified columns to or from the file; for any table column not in the list, COPY FROM inserts the default value. With COPY, the TBase server reads from or writes to the file directly on its local storage: the file must be accessible to the operating-system user the server runs as, and the file name must be given from the server's perspective.

Experiment Table Structure and Data

```sql
\d+ t
 Column |   Type    |   Modifiers   | Storage  | Stats target | Description
--------+-----------+---------------+----------+--------------+-------------
 f1     | integer   | not null      | plain    |              |
 f2     | character | not null      | extended |              |
 f3     | timestamp | default now() | plain    |              |
 f4     | integer   |               | plain    |              |
```
The data is re-entered and modified over the course of the examples below.
```sql
select * from t;
 f1 |  f2   |             f3             | f4
----+-------+----------------------------+----
  3 | pgxz  | 2017-10-28 18:24:05.645691 |
  1 | Tbase |                            |  7
  2 | pgxc  | 2017-10-28 18:24:05.643102 |  3
(3 rows)
```
COPY TO in Detail: Copying Data to a File

Export all columns

```sql
copy public.t to '/data/pgxz/t.txt';
```
```shell
cat /data/pgxz/t.txt
```
By default the generated file contains all columns of the table, with columns separated by tabs and NULL values represented as \N.
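Outside the database, the default text format is easy to process; the sketch below (plain Python, not TBase code; the helper name is made up) shows how tab-separated fields and the \N null marker map to values:

```python
# Hypothetical helper illustrating COPY's default text format:
# columns are tab-separated and NULL is written as \N.
def parse_copy_text_line(line: str):
    fields = line.rstrip("\n").split("\t")
    return [None if f == "\\N" else f for f in fields]

# A line as COPY TO would write it for the row (1, 'Tbase', NULL, 7):
print(parse_copy_text_line("1\tTbase\t\\N\t7\n"))
# → ['1', 'Tbase', None, '7']
```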

Export selected columns

```sql
copy public.t(f1,f2) to '/data/pgxz/t.txt';
```
```shell
cat /data/pgxz/t.txt
```
Only the f1 and f2 columns are exported.

Export query results

```sql
copy (select f2,f3 from public.t order by f3) to '/data/pgxz/t.txt';
```
```shell
cat /data/pgxz/t.txt
```
The query can be arbitrarily complex.

Specify file format

Generate the file in CSV format:
```sql
copy public.t to '/data/pgxz/t.txt' with csv;
```
```shell
cat /data/pgxz/t.txt
```
Generate the file in binary format:
```sql
copy public.t to '/data/pgxz/t.txt' with binary;
```
The default format is TEXT.

Specify the delimiter between columns using the delimiter option

```sql
copy public.t to '/data/pgxz/t.txt' with delimiter '@';
```
```shell
cat /data/pgxz/t.txt
```
This option sets the character that separates columns in the file. The default is a tab character in text format and a comma in CSV format. The delimiter must be a single one-byte character, so multibyte characters such as Chinese characters are not supported. This option is not allowed in binary format.

Handling NULL values

```sql
copy public.t to '/data/pgxz/t.txt' with NULL 'NULL';
```
```shell
cat /data/pgxz/t.txt
```
NULL values are written out as the string NULL. This option is not allowed in binary format.

Generate column header names

Export the table using CSV format with the HEADER option:
```sql
copy public.t to '/data/pgxz/t.txt' with csv HEADER;
```
View the export results:
```shell
cat /data/pgxz/t.txt
f1,f2,f3,f4
1,Tbase,,7
2,pgxc,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Attempting to use the HEADER option in non-CSV mode results in an error:
```sql
copy public.t to '/data/pgxz/t.txt' with HEADER;
ERROR:  COPY HEADER available only in CSV mode
```

Export the oids system column

Create a new table with oids:
```sql
drop table t;
CREATE TABLE t (
    f1 integer NOT NULL,
    f2 text NOT NULL,
    f3 timestamp without time zone,
    f4 integer
) with oids DISTRIBUTE BY SHARD (f1);
```
Import the data:
```sql
copy t from '/data/pgxz/t.txt' with csv;
```
Query to confirm the import results:
```sql
select * from t;
 f1 |    f2    |             f3             | f4
----+----------+----------------------------+----
  1 | Tbase    |                            |  7
  2 | pg'", xc | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz     | 2017-10-28 18:24:05.645691 |
(3 rows)
```
Export the data using the oids option:
```sql
copy t to '/data/pgxz/t.txt' with oids;
```
View the export results:
```shell
cat /data/pgxz/t.txt
35055 1 Tbase \N 7
35056 2 pg'", xc 2017-10-28 18:24:05.643102 3
35177 3 pgxz 2017-10-28 18:24:05.645691 \N
```

Use the quote option to define the quote character

By default, CSV export uses the double quote as the quote character:
```sql
copy t to '/data/pgxz/t.txt' with csv;
```
View the export results:
```shell
cat /data/pgxz/t.txt
1,Tbase,,7
2,"pg'"", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Define a custom quote character, here the percent sign:
```sql
copy t to '/data/pgxz/t.txt' with quote '%' csv;
```
View the export results:
```shell
cat /data/pgxz/t.txt
1,Tbase,,7
2,%pg'", xc%%%,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
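For reference, Python's csv module follows the same quoting convention and can reproduce the percent-quoted line above (an illustration outside the database, not TBase code):

```python
import csv
import io

# With quotechar='%', a field containing the quote character or the
# delimiter is wrapped in '%' and embedded '%' characters are doubled,
# matching COPY's CSV quoting rule.
buf = io.StringIO()
writer = csv.writer(buf, quotechar="%", quoting=csv.QUOTE_MINIMAL)
writer.writerow(["2", "pg'\", xc%", "2017-10-28 18:24:05.643102", "3"])
print(buf.getvalue().strip())
# → 2,%pg'", xc%%%,2017-10-28 18:24:05.643102,3
```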

Use a custom escape character

If escape is not specified, it defaults to the QUOTE character:
```sql
copy t to '/data/pgxz/t.txt' with quote '%' csv;
```
Use '@' as a custom escape character:
```sql
copy t to '/data/pgxz/t.txt' with quote '%' escape '@' csv;
```
View the export results:
```shell
cat /data/pgxz/t.txt
1,Tbase,,7
2,%pg'", xc@%%,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Attempting to use a multibyte escape character results in an error:
```sql
copy t to '/data/pgxz/t.txt' with quote '%' escape '@@' csv;
ERROR:  COPY escape must be a single one-byte character
```
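Python's csv module can also reproduce the '@'-escaped output shown above when quote doubling is turned off (an illustration, not TBase code):

```python
import csv
import io

# With doublequote=False and escapechar='@', an embedded quote character
# is written as @% instead of %%, matching COPY's escape option.
buf = io.StringIO()
writer = csv.writer(buf, quotechar="%", escapechar="@", doublequote=False,
                    quoting=csv.QUOTE_MINIMAL)
writer.writerow(["2", "pg'\", xc%", "2017-10-28 18:24:05.643102", "3"])
print(buf.getvalue().strip())
# → 2,%pg'", xc@%%,2017-10-28 18:24:05.643102,3
```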

Force quoting on specific columns

Export with CSV format and the force_quote option:
```sql
copy t to '/data/pgxz/t.txt' (format 'csv', force_quote (f1,f2));
```
View the export results:
```shell
cat /data/pgxz/t.txt
"1","Tbase",,7
"2","pg'"", xc%",2017-10-28 18:24:05.643102,3
"3","pgxz",2017-10-28 18:24:05.645691,
```
Multiple columns can be forced to be quoted, in any order:
```sql
copy t to '/data/pgxz/t.txt' (format 'csv', force_quote (f1,f4,f3,f2));
```
View the export results:
```shell
cat /data/pgxz/t.txt
"1","Tbase",,"7"
"2","pg'"", xc%","2017-10-28 18:24:05.643102","3"
"3","pgxz","2017-10-28 18:24:05.645691",
```
Attempting to use the force_quote option in non-CSV mode results in an error:
```sql
copy t to '/data/pgxz/t.txt' (format 'text', force_quote (f1,f2,f3,f4));
ERROR:  COPY force quote available only in CSV mode
```

Use encoding to specify file content encoding for export

Export files using different encodings:
```sql
copy t to '/data/pgxz/t.csv' (encoding utf8);
copy t to '/data/pgxz/t.csv' (encoding gbk);
```
Alternatively, set the client_encoding parameter and then export the CSV file:
```sql
set client_encoding to gbk;
copy t to '/data/pgxz/t.csv' with csv;
```

Use multibyte characters as separators

Tencent Cloud Data Warehouse TCHouse-P 5.06.2.1 and later supports multibyte delimiters.
Create a new table and insert data:
```sql
create table t_position_copy(f1 int, f2 varchar(10), f3 int);
insert into t_position_copy values(123, 'Adi', 2021);
```
Export the data using a multibyte delimiter:
```sql
copy t_position_copy to '/data/tbase/t_position_copy_mul_delimiter.text' with delimiter '@&';
```
View the export results:
```shell
cat /data/tbase/t_position_copy_mul_delimiter.text
123@&Adi@&2021
```
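Outside the database, splitting on a multibyte delimiter such as '@&' is a plain string operation (Python illustration, not TBase code):

```python
# The exported line uses the two-character delimiter '@&'; splitting
# on it recovers the three column values.
line = "123@&Adi@&2021"
print(line.split("@&"))
# → ['123', 'Adi', '2021']
```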

COPY FROM in Detail: Copying File Contents into a Table

Import all columns

```shell
cat /data/pgxz/t.txt
```
```sql
truncate table t;
copy t from '/data/pgxz/t.txt';
select * from t;
```

Import specified columns

Copy data from specified columns to a file:
Copy the specified columns to a file:
```sql
copy t(f1,f2) to '/data/pgxz/t.txt';
```
View the file content:
```shell
cat /data/pgxz/t.txt
1 Tbase
2 pg'", xc%
3 pgxz
```
Clear table t and import the specified columns from the file:
```sql
truncate table t;
copy t(f1,f2) from '/data/pgxz/t.txt';
```
Query to confirm the import results:
```sql
select * from t;
 f1 |  f2   |             f3             | f4
----+-------+----------------------------+----
  1 | Tbase | 2017-10-30 11:54:16.559579 |
  2 | pg'   | 2017-10-30 11:54:16.559579 |
  3 | pgxz  | 2017-10-30 11:54:16.560283 |
(3 rows)
```

Specify the input file format

Import a text file:
```sql
copy t from '/data/pgxz/t.txt' (format 'text');
```
Import a CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv');
```
Import a binary file:
```sql
copy t from '/data/pgxz/t.bin' (format 'binary');
```
Query to confirm the import results:
```sql
select * from t;
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
(3 rows)
```

Specify the delimiter between columns using the delimiter option

Import a text file using a custom delimiter:
```sql
copy t from '/data/pgxz/t.txt' (format 'text', delimiter '%');
```
Import a CSV file using a custom delimiter:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', delimiter '%');
```
Query to confirm the import results:
```sql
select * from t;
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
(3 rows)
```

NULL value handling

The import file represents NULL values as the string NULL:
```sql
copy t from '/data/pgxz/t.txt' (null 'NULL');
```
Query to confirm the NULL handling results:
```sql
select * from t;
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
(3 rows)
```
Note:
The null string is matched exactly, including case. If the string in the file does not match the specified null string, the import fails:
```sql
copy t from '/data/pgxz/t.txt' (null 'null');
ERROR:  invalid input syntax for type timestamp: "NULL"
CONTEXT:  COPY t, line 1, column f3: "NULL"
```
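The behaviour comes down to an exact string comparison, which the minimal sketch below illustrates (plain Python with a hypothetical helper, not TBase code):

```python
# The null marker is compared exactly, including case: 'NULL' in the
# file does not match the marker 'null', so it stays a literal string
# and later fails to parse as a timestamp.
def read_field(field: str, null_marker: str):
    return None if field == null_marker else field

print(read_field("NULL", "NULL"))  # → None
print(read_field("NULL", "null"))  # → NULL (left as a string)
```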

Custom quote character

View the CSV file content:
```shell
cat /data/pgxz/t.csv
1,Tbase,,7
2,%pg'", xc%%%,2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Specify the quote character when importing the CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', quote '%');
```

Custom escape character

View the CSV file content:
```shell
cat /data/pgxz/t.csv
1,Tbase,,7
2,"pg'@", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Specify the escape character when importing the CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', escape '@');
```
Query to confirm the import results:
```sql
select * from t;
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
(3 rows)
```

Use HEADER to skip the first line of a CSV file

View the CSV file content (with a header line):
```shell
cat /data/pgxz/t.csv
f1,f2,f3,f4
1,Tbase,,7
2,"pg'"", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
```
Skip the header line when importing the CSV file:
```sql
copy t from '/data/pgxz/t.csv' (format 'csv', header true);
```
Query to confirm the import results:
```sql
select * from t;
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
(3 rows)
```


Import oid column values

View the text file content (including oids):
```shell
cat /data/pgxz/t.txt
35242 1 Tbase \N 7
35243 2 pg'", xc% 2017-10-28 18:24:05.643102 3
35340 3 pgxz 2017-10-28 18:24:05.645691 \N
```
Import the text file, retaining the oids:
```sql
truncate table t;
copy t from '/data/pgxz/t.txt' (oids true);
```
Query to confirm the import results:
```sql
select oid, * from t;
  oid  | f1 |    f2     |             f3             | f4
-------+----+-----------+----------------------------+----
 35242 |  1 | Tbase     |                            |  7
 35243 |  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
 35340 |  3 | pgxz      | 2017-10-28 18:24:05.645691 |
(3 rows)
```

Use FORCE_NOT_NULL to convert null values in a column to zero-length strings

View the CSV file content (including null values):
```shell
cat /data/pgxz/t.csv
1,Tbase,,7
2,"pg'"", xc%",2017-10-28 18:24:05.643102,3
3,pgxz,2017-10-28 18:24:05.645691,
4,,2017-10-30 16:14:14.954213,4
```
Import the CSV file using FORCE_NOT_NULL:
```sql
truncate table t;
copy t from '/data/pgxz/t.csv' (format 'csv', FORCE_NOT_NULL (f2));
```
Query to confirm that the empty value was read as a zero-length string rather than NULL:
```sql
select * from t where f2 = '';
 f1 | f2 |             f3             | f4
----+----+----------------------------+----
  4 |    | 2017-10-30 16:14:14.954213 |  4
(1 row)
```
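The rule can be sketched as follows (plain Python with a hypothetical helper, not TBase code):

```python
# In CSV mode an unquoted empty field normally matches the (empty)
# null string and becomes NULL; FORCE_NOT_NULL suppresses that match,
# so the field is read as a zero-length string instead.
def read_csv_field(field: str, force_not_null: bool = False):
    if field == "" and not force_not_null:
        return None  # matched the null string
    return field

print(read_csv_field(""))                       # → None
print(read_csv_field("", force_not_null=True))  # → '' (empty string)
```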

Specify import file encoding with Encoding

Determine the file encoding using the enca command:
```shell
enca -L zh_CN /data/pgxz/t.txt
Simplified Chinese National Standard; GB2312
```
Import the file into the database:
```sql
copy t from '/data/pgxz/t.txt';
```
Query to confirm the import results:
```sql
select * from t;
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
(3 rows)
```
If the encoding of the import file is not specified, the Chinese characters are not imported correctly. Re-import using GBK encoding:
```sql
truncate table t;
copy t from '/data/pgxz/t.txt' (encoding gbk);
```
Query to confirm the import results again:
```sql
select * from t;
 f1 |    f2     |             f3             | f4
----+-----------+----------------------------+----
  1 | Tbase     |                            |  7
  2 | pg'", xc% | 2017-10-28 18:24:05.643102 |  3
  3 | pgxz      | 2017-10-28 18:24:05.645691 |
  4 | Tencent   | 2017-10-30 16:41:09.157612 |  4
(4 rows)
```
Alternatively, convert the file encoding with the enconv command before importing the data:
```sql
truncate table t;
```
```shell
enconv -L zh_CN -x UTF-8 /data/pgxz/t.txt
```
Query to confirm the import results after converting the encoding:
```sql
copy t from '/data/pgxz/t.txt';
select * from t;
```

Use Position to specify field length for import

View the content of the text file to be imported:
```shell
cat /data/tbase/t_position_copy.txt
123 Adi 2021
```
Use position to specify the field positions for the import:
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:9), f3 position(10:13)) from '/data/tbase/t_position_copy.txt';
```
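position(a:b) selects a 1-based, inclusive byte range, i.e. Python's slice [a-1:b]. The sketch below uses an illustrative fixed-width line (the spacing is assumed, not taken from the file above):

```python
# position(1:3), position(4:9), position(10:13) on a fixed-width record
# correspond to these 0-based, end-exclusive Python slices.
line = "123Adi   2021"  # illustrative fixed-width record
f1, f2, f3 = line[0:3], line[3:9], line[9:13]
print([f1, f2.strip(), f3])
# → ['123', 'Adi', '2021']
```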

Use position with a conversion function for import

```shell
cat /data/tbase/t_position_copy.txt
123 Adi 2020010120200101121212
```
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:9), f3 position(10:17) to_date($3,'yyyymmdd'), f4 position(18:31) to_timestamp($4,'yyyymmddhh24miss')) from '/data/tbase/t_position_copy.txt';
```

Specify the encoding when importing with position

Tencent Cloud Data Warehouse TCHouse-P 5.06.2.1 and later supports this feature, which allows data files whose character encoding differs from the database's to be imported correctly.
```shell
file /data/tbase/t_position_copy.txt
/data/tbase/t_position_copy.txt: ISO-8859 text
```
Importing the data without specifying the encoding causes an error:
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:7), f3 position(8:11)) from '/data/tbase/t_position_copy.txt';
ERROR:  multibyte character "Ü2" is truncated for column f2
CONTEXT:  COPY t_position_copy, line 1: "123°¢µÜ2021", nodetype:1(1:cn,0:dn)
```
Specify gbk as the encoding of the imported file:
```sql
copy t_position_copy (f1 position(1:3), f2 position(4:7), f3 position(8:11)) from '/data/tbase/t_position_copy.txt' encoding 'gbk';
select * from t_position_copy;
 f1  | f2  |  f3
-----+-----+------
 123 | Adi | 2021
(1 row)
```
Note that in GBK encoding a Chinese character is 2 bytes long, which is why f2 spans byte positions 4:7 here.
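The 2-bytes-per-character point can be checked directly (Python illustration with an assumed two-character value, not TBase code):

```python
# In GBK, each Chinese character occupies 2 bytes, so a field holding
# two Chinese characters spans 4 byte positions in the file.
text = "阿弟"  # illustrative two-character value
raw = text.encode("gbk")
print(len(raw))
# → 4
```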
If your GBK file was exported from Oracle and is imported into a UTF-8 TBase database, add the csv option together with the encoding to ensure a correct import:
```sql
copy t4(f1 position(1:1), f2 position(2:3)) from '/data/tbase/t4.txt' with csv encoding 'gbk';
```
Without it, the import fails on characters such as "猏" and "玕":
```sql
copy t4(f1 position(1:1), f2 position(2:3)) from '/data/tbase/t4.txt' encoding 'gbk';
ERROR:  character with byte sequence 0xab 0x0a in encoding "GBK" has no equivalent in encoding "UTF8"
CONTEXT:  COPY t4, line 1: "1«1ÌÚ", nodetype:1(1:cn,0:dn)
```

Use multibyte separators

Tencent Cloud Data Warehouse TCHouse-P 5.06.2.1 and later supports multibyte delimiters.
```sql
create table t_position_copy(f1 int, f2 varchar(10), f3 int);
```
```shell
cat /data/tbase/t_position_copy_mul_delimiter.text
123@&Adi@&2021
```
```sql
copy t_position_copy from '/data/tbase/t_position_copy_mul_delimiter.text' with delimiter '@&';
select * from t_position_copy;
 f1  | f2  |  f3
-----+-----+------
 123 | Adi | 2021
(1 row)
```