I am doing a large insert by reading from a file. The file looks like this:
sampletext1
sampletext2
..........
..........
sampletextN
The file has millions of lines and is about 3 GB in size. Reading every line into a variable and then doing a single insert is not an option, because I have only about 2 GB of RAM.
I read the file line by line and build up a MySQL insert string. Once 5000 lines have been read, I insert them into the DB, so each INSERT carries 5000 records. The MySQL query in my code ( INSERT IGNORE INTO $curr VALUES $string ) runs at normal speed until roughly 25,000 lines have been read and inserted, but then it slows down, and a single INSERT takes about 5-10 seconds. It seems to degrade linearly as the number of records grows.
Perl code snippet:
sub StoreToDB {
    my $self = shift;
    $self->_doPreliminary();
    my $data_struc = $self->_getDATA();
    my $file = $data_struc->{DOMAIN_FILE};
    my ($count,$cnt,$string,$curr) = (0,0,'',$self->_getTLD() . '_current');
    open FH,$file or ( FullLogger($self->_getTLD(),"Cant open $file from StoreToDB : $!\n") and return );
    $self->_dbConnect();
    while (<FH>) {
        chomp;
        # Once MAX rows are buffered, flush them in a single multi-row insert.
        if ( $cnt == MAX ) {
            $self->_dbExecute("INSERT IGNORE INTO $curr VALUES $string");
            $count += $cnt;
            $cnt = 0;
            $string = '';
            Logger("Inside StoreToDB, count is : $count ***\n");
        }
        $string .= "('" . $_ . "')";   # raw interpolation; assumes lines contain no quotes
        ++$cnt;
        $string = ($cnt != MAX ? $string . ',' : $string . ';');
    }#while
    # Flush the final batch; without this, the last (up to MAX) buffered
    # rows are silently dropped when the loop ends.
    if ($cnt) {
        $string =~ s/[,;]\z//;
        $self->_dbExecute("INSERT IGNORE INTO $curr VALUES $string");
        $count += $cnt;
    }
    close FH;
    $self->_dbDisconnect();
    return 1;
}#StoreToDB
==============================
DB table details:
mysql> SHOW CREATE TABLE com_current;
+-------------+-------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------------+-------------------------------------------------------------------------------------------------------------------------------+
| com_current | CREATE TABLE `com_current` (
`domain` varchar(60) NOT NULL,
PRIMARY KEY (`domain`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------------+-------------------------------------------------------------------------------------------------------------------------------+
1 row in set (16.60 sec)
mysql>
MySQL status output:
Uptime: 1057 Threads: 2 Questions: 250 Slow queries: 33 Opens: 38 Flush tables: 1 Open tables: 28 Queries per second avg: 0.236
===============================================================
UPDATE:
So far I have tried the following approaches, but none of them performs any better:
1) LOCK TABLES my_table WRITE;
then, after the inserts, I unlock it:
UNLOCK TABLES;
2) INSERT DELAYED IGNORE INTO $curr VALUES $string
3) LOAD DATA INFILE '$file' IGNORE INTO TABLE $curr
this is currently in progress, but seems worse than the original method. A rough sketch of how I drive these attempts from Perl follows; after that I am posting my my.cnf, since I don't know whether something in it is the problem.
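Simplified sketch of the three variants, assuming an open DBI handle in $dbh instead of my _dbExecute wrapper (error handling omitted; $curr, $string and $file as in the code above):

# 1) Explicit table lock around the batched inserts:
$dbh->do("LOCK TABLES $curr WRITE");
$dbh->do("INSERT IGNORE INTO $curr VALUES $string");   # repeated per batch
$dbh->do("UNLOCK TABLES");

# 2) Delayed insert (MyISAM only):
$dbh->do("INSERT DELAYED IGNORE INTO $curr VALUES $string");

# 3) Let the server read the file directly, one value per line:
$dbh->do("LOAD DATA INFILE '$file' IGNORE INTO TABLE $curr");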
[client]
port = 3306
socket = /tmp/mysql.sock
[mysqld]
datadir = /mnt/mysql/data
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 16M
max_allowed_packet = 1M
table_open_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
log-bin=mysql-bin
binlog_format=mixed
server-id = 1
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[myisamchk]
key_buffer_size = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout
Posted on 2011-12-09 22:07:12
If you are starting with an empty table, or one with fewer rows in it than you are about to insert, then disabling the indexes will give a significant speed-up.
ALTER TABLE tbl_name DISABLE KEYS;
INSERT....
INSERT....
INSERT...
...
ALTER TABLE tbl_name ENABLE KEYS;
OTOH, if you already have a lot of data in the table, it may actually slow things down.
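From Perl, that could be wrapped around the batch loop roughly like this (a minimal sketch, not the poster's code; assumes a DBI connection, and note that DISABLE KEYS only suspends non-unique indexes, so it does not affect a PRIMARY KEY):

use DBI;

# Illustrative only: defer non-unique index maintenance during the load,
# then rebuild those indexes in a single pass afterwards.
my $dbh = DBI->connect('DBI:mysql:database=mydb;host=localhost',
                       'user', 'password', { RaiseError => 1 });

$dbh->do('ALTER TABLE com_current DISABLE KEYS');
# ... run the batched INSERT IGNORE statements here ...
$dbh->do('ALTER TABLE com_current ENABLE KEYS');   # one-pass index rebuild

$dbh->disconnect;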
Bulk (multi-row) inserts will help performance (particularly with the indexes enabled), e.g. from the MySQL manual:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
(Looking at what you are doing, you could consider using INSERT IGNORE ... and sorting the list first; see the sketch below.)
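In Perl, such a batch could be assembled with $dbh->quote rather than raw string interpolation (a sketch; @lines and $dbh are assumed to exist):

# Sort the batch so rows arrive in primary-key order, then build one
# multi-row statement with properly quoted values.
my @batch  = sort @lines;
my $values = join ',', map { '(' . $dbh->quote($_) . ')' } @batch;
$dbh->do("INSERT IGNORE INTO com_current (domain) VALUES $values");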
However, one of the best ways to improve bulk-insert performance is to load the data into a separate, dedicated table and then use INSERT ... SELECT ... (an ORDER BY on the SELECT, based on the most frequently used index, will help keep that index balanced).
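One possible shape for that approach, using a hypothetical staging table com_staging (table and column names are illustrative; $dbh and $file as assumed above):

# Load into an unindexed staging table first, then copy into the real
# table in key order so the PRIMARY KEY is filled sequentially.
$dbh->do('CREATE TABLE com_staging (domain varchar(60) NOT NULL) ENGINE=MyISAM');
$dbh->do("LOAD DATA INFILE '$file' INTO TABLE com_staging");
$dbh->do('INSERT IGNORE INTO com_current (domain)
          SELECT domain FROM com_staging ORDER BY domain');
$dbh->do('DROP TABLE com_staging');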
https://stackoverflow.com/questions/8446320