接着续上次提到的sqlldr的性能问题,加载一个表数据400多万条记录,竟然用了6个多小时。最后大家争论不休的时候,我发现应该是网络的问题。 http://blog.itpub.net/23718752/viewspace-1182534/ 今天客户IT的同事把网络做了调整,他们就想看看到底改进有多大。 下面是测试的一些记录。
àoriginal logs for issue table, loading around 6 hours.
Total logical records skipped: 0
Total logical records read: 4096786
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Wed Jun 11 08:52:55 2014
Run ended on Wed Jun 11 14:57:40 2014
Elapsed time was: 06:04:44.05
CPU time was: 00:00:38.18
àafter network enhanced, loading time has reduced to around 4 mins.
Total logical records skipped: 0
Total logical records read: 4096786
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Mon Jun 16 12:58:01 2014
Run ended on Mon Jun 16 13:02:15 2014
Elapsed time was: 00:04:13.28
CPU time was: 00:00:30.00
àconsidering impact from index, I loaded data again, keep table structure,index exactly the same.
Total logical records skipped: 0
Total logical records read: 4096786
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Mon Jun 16 13:09:34 2014
Run ended on Mon Jun 16 13:16:37 2014
Elapsed time was: 00:07:02.61
CPU time was: 00:00:31.23
àtune sqlldr command with below options, keep table structure,index exactly the same.
parallel=true bindsize=7500000 readsize=7500000 streamsize=7500000 rows=50000
Total logical records skipped: 0
Total logical records read: 4096786
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Mon Jun 16 13:21:13 2014
Run ended on Mon Jun 16 13:24:19 2014
Elapsed time was: 00:03:05.75
CPU time was: 00:00:41.27
所以综上所述,在排查了cpu,io,缓存,网络的因素后,还可以使用一些选项来优化sqlldr,在某种程度上得到提升。