专栏首页bisal的个人杂货铺JDBC读取数据优化-fetch size

JDBC读取数据优化-fetch size

最近由于业务上的需求,一张旧表结构中的数据,需要提取出来,根据规则,导入一张新表结构中,开发同学写了一个工具,用于实现新旧结构的transformation,

实现逻辑简单,就是使用jdbc从A表读出数据,做了一些处理,再存入新表B中,发现读取旧表的操作,非常缓慢,无法满足要求。

读取数据的示例代码,

conn = getConnection(); long start = System.currentTimeMillis(); ps = conn.prepareStatement(sql); rs = ps.executeQuery(); long mid_end = System.currentTimeMillis(); while (rs.next()) {     list.add(rs.getString(1)); } long end = System.currentTimeMillis(); rs.close(); System.out.println("Interval1=" + (mid_end - start)); System.out.println("Interval2=" + (end - mid_end));

SQL语句读取10000条记录,其中,

Interval1=160ms Interval2=29252ms

执行executeQuery()这个SQL检索的时间为160毫秒。

执行10000次rs.next以及rs.getString(1)的用时约为30秒,平均1条记录3毫秒。

如何才能提高读取的效率?

上面读取10000条记录,每一次rs.next时间只有3毫秒,但是由于需要10000次,所以才需要30秒,我们可以猜测,是否有可能每一次rs.next的执行,均需要和数据库交互,因为如果仅是字符串操作,不应该是这个数量级。

看一下官方文档的描述,《Database JDBC Developer's Guide》有一节介绍了Fetch Size,

By default, when Oracle JDBC runs a query, it retrieves a result set of 10 rows at a time from the database cursor. This is the default Oracle row fetch size value. You can change the number of rows retrieved with each trip to the database cursor by changing the row fetch size value. Standard JDBC also enables you to specify the number of rows fetched with each database round-trip for a query, and this number is referred to as the fetch size. In Oracle JDBC, the row-prefetch value is used as the default fetch size in a statement object. Setting the fetch size overrides the row-prefetch setting and affects subsequent queries run through that statement object. Fetch size is also used in a result set. When the statement object run a query, the fetch size of the statement object is passed to the result set object produced by the query. However, you can also set the fetch size in the result set object to override the statement fetch size that was passed to it. Changes made to the fetch size of a statement object after a result set is produced will have no affect on that result set.

JDBC默认每执行一次检索,会从游标中提取10行记录,10就是默认的row fetch size值,通过设置row fetch size,可以改变每次和数据库交互,提取出来的记录行总数。需要注意的是,需要在获得检索结果集之前,设置fetch size,否则就是无效。

可以使用如下方法设置,

Setting the Fetch Size The following methods are available in all Statement, PreparedStatement, CallableStatement, and ResultSet objects for setting and getting the fetch size:

  • void setFetchSize(int rows) throws SQLException
  • int getFetchSize() throws SQLException

简单来讲,Fetch相当于读缓存,默认Fetch Size值是10,读取10000条记录,一次数据库交互,即rs.next的操作,ResultSet会一次性从数据库服务器,得到10条记录,下次执行rs.next,就直接使用内存读取,不用和数据库交互了,但总计需要有1000次交互,如果使用setFetchSize设置Fetch Size为10000,则只需要一次数据库交互,本地缓存10000条记录,每次执行rs.next,只是内存操作,不会有数据库网络消耗,效率就会高些。但需要注意的是,Fetch Size值越高则占用内存越高,要避免出现OOM错误。

方案1:


rs = ps.executeQuery(); rs.setFetchSize(10000);

即在执行ps.executeQuery()之后,对rs设置值10000,统计如下,

执行executeQuery()这个SQL检索的时间为174毫秒。

执行10000次rs.next以及rs.getString(1)的用时约为190毫秒。

相比之前执行10000次rs.next,用了30秒,提高了将近150倍。


方案2:


ps = conn.prepareStatement(sql); ps.setFetchSize(10000);

即在执行conn.prepareStatement(sql)之后,执行ps.executeQuery()之前,对rs设置值为10000范围,统计如下,

执行executeQuery()这个SQL检索的时间为267毫秒。

执行10000次rs.next以及rs.getString(1)的用时约为87毫秒。

相比方案2,总用时几乎一致,但SQL执行和rs.next遍历的用时,有些区别。


针对方案1,

After you have run the query, you can call setFetchSize on the result set object to override the statement object fetch size that was passed to it. This will affect any subsequent trips to the database to get more rows for the original query, as well as affecting any later refetching of rows.

执行查询之后,对结果集设置setFetchSize,会影响任何接下来的数据库交互过程获得更多的记录行数,以及之后的fetch提取。

针对方案2,

To set the fetch size for a query, call setFetchSize on the statement object prior to running the query. If you set the fetch size to N, then N rows are fetched with each trip to the database.

执行查询之前,设置setFetchSize,表示每次和数据库交互,得到记录行数。

综上所述,建议执行SQL之前,设置此值,效率提升最高。

对于PrepareStatement、ResultSet和Statement,均有这一个方法,有一点出入的,就是默认值设置(0),从代码中使用getFetchSize(),得到的值均为10,不知道是我理解错了,还是有其他含义?欢迎各位指教。

PrepareStatement

  • setFetchSize void setFetchSize(int rows) throws SQLException Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for ResultSetobjects generated by this Statement. If the value specified is zero, then the hint is ignored. The default value is zero.
    • Parameters:
    • rows - the number of rows to fetch
    • Throws:
    • SQLException - if a database access error occurs, this method is called on a closed Statement or the condition rows >= 0 is not satisfied.
    • Since:
    • 1.2
    • See Also:
    • getFetchSize()

ResultSet

  • setFetchSize void setFetchSize(int rows) throws SQLException Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be.  The default value is set by the Statement object that created the result set.  The fetch size may be changed at any time.
    • Parameters:
    • rows - the number of rows to fetch
    • Throws:
    • SQLException - if a database access error occurs; this method is called on a closed result set or the condition rows >= 0 is not satisfied
    • Since:
    • 1.2
    • See Also:
    • getFetchSize()

Statement

  • setFetchSize void setFetchSize(int rows) throws SQLException Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for ResultSetobjects generated by this Statement. If the value specified is zero, then the hint is ignored. The default value is zero.
    • Parameters:
    • rows - the number of rows to fetch
    • Throws:
    • SQLException - if a database access error occurs, this method is called on a closed Statement or the condition rows >= 0 is not satisfied.
    • Since:
    • 1.2
    • See Also:
    • getFetchSize()

总结:

1. Fetch相当于读缓存,如果使用setFetchSize设置Fetch Size为10000,本地缓存10000条记录,每次执行rs.next,只是内存操作,不会有数据库网络消耗,效率就会高些。但需要注意的是,Fetch Size值越高则占用内存越高,要避免出现OOM错误。

2. 建议执行SQL语句之前设置,即ps.executeQuery();之前使用setFetchSize()函数设置。

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • imp错误IMP-00098: INTERNAL ERROR: impgst2Segmentation fault

    如果使用impdp要看dump的内容,可以使用sqlfile参数,他会将所有的DDL语句写入文件,

    bisal
  • ORA-12519的错误和解决

    版权声明:本文为博主原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。 ...

    bisal
  • 预估表和索引空间容量的方法

    最近有一个业务功能要上线,生产数据库环境之前已经到位,目前要做的是估算下,业务数据量对数据库空间,有何影响。开发同学根据表字段定义,分别统计出了最大占用空间,以...

    bisal
  • SAP CDS view自学教程之一:如何测试基于SAP CDS view自动生成的OData服务

    I am a newbie of CDS view related topic and recently I have to learn it. I will ...

    Jerry Wang
  • Kubernetes Scheduler Extender浅析

    Scheduler 组件可以视为一种监视 watche 和将 Pod 分配 assign 到 Node 的特殊类型控制器 controller。在 Kubern...

    runzhliu
  • 2019.8.15乘兴打Codeforces Round #569 (Div. 2)小记

    Recently, on the course of algorithms and data structures, Valeriy learned how t...

    glm233
  • The Note based on Data Structures and Algorithm Analysis in C CHAPTER 3 P1

    The Note based on Data Structures and Algorithm Analysis in C

    Chiptune
  • SAP WebClient UI界面元素ID生成的逻辑分析

    If we open a WebClient UI page with Chrome and inspect its UI element via Chrome...

    Jerry Wang
  • ZOJ 3202 Second-price Auction

    Time Limit: 1 Second      Memory Limit: 32768 KB

    ShenduCC
  • 安全与隐私(CS CR)

    线性查询可以提交到包含专用数据的服务器。服务器使用加性噪声对系统破坏的查询提供响应,以保护其数据存储在服务器上的用户的隐私。隐私的度量与费希尔信息矩阵的轨迹成反...

    WEIIILII

扫码关注云+社区

领取腾讯云代金券