前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >HttpClient在多线程环境下踩坑总结

HttpClient在多线程环境下踩坑总结

作者头像
编程随笔
发布2019-09-11 16:22:17
8.8K0
发布2019-09-11 16:22:17
举报
文章被收录于专栏:后端开发随笔后端开发随笔

问题现场

在多线程环境下使用HttpClient组件对某个HTTP服务发起请求,运行一段时间之后发现客户端主机CPU利用率呈现出下降趋势,而不是一个稳定的状态。 而且,从程序日志中判断有线程处于hang住的状态,应该是被阻塞了。

CPU使用率逐步下降
CPU使用率逐步下降

问题排查

一开始找不到原因,怀疑是多线程并发导致的死锁问题,但是通过代码审查并未定位到任何可能的多线程并发问题。 甚至开始怀疑是否是因为内存资源不够引起JVM频繁GC到导致业务线程被暂停,但是从GC的日志输出结果看,GC是正常的。 于是,进入一种丈二和尚摸不着头脑头脑的状态,再次Review代码,发现并未设置请求超时时间,于是设置超时控制,发现问题依然存在,彻底懵逼了。 最后,dump线程堆栈和内存堆栈,再对堆栈数据进行分析。从分析结果看,确认是因为Socket连接在读取数据时被阻塞引起线程hang住。搜索“httpclient 超时”关键字,找到各式各样设置HttpClient超时控制的方式,均尝试过但是并未生效。 实际上到后来才知道,HttpCient的超时控制在不同的版本中设置请求超时参数的方式均各不相同,这才导致了我使用了网上看到的方式设置之后并未生效。当然,根本原因还是因为对HttpClient这个组件不熟悉导致的.

问题重现

1.HttpClient版本

代码语言:javascript
复制
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
</dependency>

2.Java代码

代码语言:javascript
复制
public class HttpClientTest {
    private AtomicInteger counter = new AtomicInteger(0);
    private String url = "http://www.baidu.com/";

    public static void main(String[] args) {
        new HttpClientTest().test();
    }

    // 执行测试
    private void test() {
        int number = 100000; // 总请求数
        int concurrent = 50; // 每次并发请求数
        CountDownLatch countDownLatch = new CountDownLatch(number); // 计数器
        ExecutorService threadPool = Executors.newFixedThreadPool(concurrent); // 线程池
        int concurrentPer = concurrent;
        boolean over = false;
        while(!over) {
            number = number - concurrent;
            if(number <= 0) {
                concurrentPer = number + concurrent;
                over = true;
            }

            // 线程池批量提交
            for(int i = 0; i < concurrentPer; i++) {
                threadPool.execute(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            request(url);
                            Thread.sleep(100);
                        } catch (IOException | InterruptedException e) {
                            e.printStackTrace();
                        } finally {
                            countDownLatch.countDown();
                        }
                    }
                });
            }
        }
        try {
            countDownLatch.await();
            threadPool.shutdown();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    // 访问指定地址
    private void request(String url) throws IOException {
        HttpGet httpGet = new HttpGet(url);
        commnicate(httpGet);
    }

    // 负责底层通信处理
    private void commnicate(HttpRequestBase request) throws IOException {
        ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
            @Override
            public String handleResponse(HttpResponse response) throws ClientProtocolException, IOException {
                return EntityUtils.toString(response.getEntity());
            }
        };

        HttpClient client = HttpClients.createDefault();
        String body = client.execute(request, responseHandler); // 线程可能会在这里被阻塞
        System.out.println(String.format("body size: %s, counter: %s", body.length(), counter.incrementAndGet()));
    }
}

运行上述代码一段时间后很容易可以重现出问题,如下为运行控制台信息:

HttpClient重现报错
HttpClient重现报错

并且线程全部hang住,进程无法正常结束.

查看端口状态存在大量请求处于建立连接状态(ESTABLISHED):

代码语言:javascript
复制
# netstat -anpt
tcp        0      0 172.17.7.81:56408           180.118.128.179:56033       ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:57644           115.202.238.177:27016       ESTABLISHED 3847/java           
tcp        0      0 172.17.7.81:36616           117.57.21.0:25719           ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:59944           112.245.197.118:57220       ESTABLISHED 3847/java           
tcp        0      0 172.17.7.81:48722           218.5.215.10:40835          ESTABLISHED 4007/java           
tcp        0      0 172.17.7.81:52734           115.194.17.14:45210         ESTABLISHED 4007/java           
tcp        0      0 172.17.7.81:60586           59.32.37.129:16637          ESTABLISHED 3686/java           
tcp        0      0 172.17.7.81:36776           222.89.86.109:21667         ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:51690           60.161.249.162:59039        ESTABLISHED 3927/java           
tcp        0      0 172.17.7.81:42226           58.218.200.59:80            TIME_WAIT   -                   
tcp        0      0 172.17.7.81:56566           117.70.47.194:40879         ESTABLISHED 3686/java           
tcp        0      0 172.17.7.81:43266           182.120.202.204:45893       ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:55630           60.169.223.16:21280         ESTABLISHED 3927/java           
tcp        0      0 172.17.7.81:54922           60.168.81.26:25464          ESTABLISHED 3927/java           
tcp        0      0 172.17.7.81:53352           112.252.97.83:53584         ESTABLISHED 3847/java           
tcp        0      0 172.17.7.81:52684           113.121.242.43:14447        ESTABLISHED 3927/java           
tcp        0      0 172.17.7.81:54750           113.121.241.168:45173       ESTABLISHED 3686/java           
tcp        0      0 172.17.7.81:41510           113.105.202.106:47288       ESTABLISHED 4007/java           
tcp        0      0 172.17.7.81:38804           121.232.148.62:57938        ESTABLISHED 3847/java           
tcp        0      0 172.17.7.81:41468           113.105.202.106:47288       ESTABLISHED 3927/java           
tcp        0      0 172.17.7.81:45444           123.163.81.185:22012        ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:54810           113.121.241.168:45173       ESTABLISHED 4007/java           
tcp        0      0 172.17.7.81:51542           175.153.23.147:20766        ESTABLISHED 3927/java           
tcp        0      0 172.17.7.81:45644           218.5.215.10:40835          ESTABLISHED 4007/java           
tcp        0      0 172.17.7.81:35730           116.53.197.198:30042        ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:54738           113.121.241.168:45173       ESTABLISHED 3686/java           
tcp        0      0 172.17.7.81:60600           59.32.37.129:16637          ESTABLISHED 3686/java           
tcp        0      0 172.17.7.81:54862           113.121.241.168:45173       ESTABLISHED 4007/java           
tcp        0      0 172.17.7.81:40980           115.225.153.215:17292       ESTABLISHED 3686/java           
tcp        0      0 172.17.7.81:54166           123.149.162.129:18269       ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:60712           120.35.190.184:33054        ESTABLISHED 3766/java           
tcp        0      0 172.17.7.81:55802           106.42.211.65:59547         ESTABLISHED 3766/java 

同时,分析线程堆栈信息(jstack -F -l pid > thread_stack.log)可以看到如下信息:

代码语言:javascript
复制
"pool-1-thread-45" #55 prio=5 os_prio=0 tid=0x00007f78702df000 nid=0x33d5 runnable [0x00007f7830c1d000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:220)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:164)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:139)
    at org.chench.extra.HttpClientTest.commnicate(HttpClientTest.java:106) # 线程在这里阻塞
    at org.chench.extra.HttpClientTest.request(HttpClientTest.java:93)
    at org.chench.extra.HttpClientTest.access$100(HttpClientTest.java:31)
    at org.chench.extra.HttpClientTest$1.run(HttpClientTest.java:62)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - <0x0000000086d50638> (a java.util.concurrent.ThreadPoolExecutor$Worker)

从堆栈日志中可以看到,线程处于RUNNABLE状态,并且阻塞在at org.chench.extra.HttpClientTest.commnicate(HttpClientTest.java:106)处.

解决方案

通过线程堆栈日志分析可以定位到线程hang住是因为HttpClient在执行访问时被阻塞了,结合源代码找到阻塞原因是因为未设置请求超时时间.

上述问题本质上是因为HttpClient组件并未设置请求超时控制导致的:虽然连接超时,但是读取失败,导致线程一直被阻塞. 那么,应该如何设置HttpClient的超时时间呢?鉴于HttpClient的官方文档没有明确说明,并且不同版本的HttpClient组件设置超时控制的方式不一致,所以建议直接查看源码. HttpClient执行访问请求时序图如下:

HttpClient访问请求时序图
HttpClient访问请求时序图

顺藤摸瓜,在MainClientExec.javaexecute()方法中看到有2处使用了timeout参数,其含义各不相同: (1)在获取HttpClientConnection对象时需要读取配置参数中的ConnectionRequestTimeout值,该参数用于控制获取连接的超时时间.

timeout参数
timeout参数

(2)获取到HttpClientConnection对象之后读取配置参数中的SocketTimeout值,设置Socket超时时间.

设置Socket超时
设置Socket超时

显然,这2个timeout参数都需要从RequestConfig对象中获取. 既然找到了使用timeout参数的地方,下一步需要确定该参数是如何设置的.沿着HttpClient的请求时序图路径往回查找,在InternalHttpClient.java类的doExecute()方法中可以很清晰地看到设置了RequestConfig对象参数.

代码语言:javascript
复制
@Override
    protected CloseableHttpResponse doExecute(
            final HttpHost target,
            final HttpRequest request,
            final HttpContext context) throws IOException, ClientProtocolException {
        Args.notNull(request, "HTTP request");
        HttpExecutionAware execAware = null;
        if (request instanceof HttpExecutionAware) {
            execAware = (HttpExecutionAware) request;
        }
        try {
            final HttpRequestWrapper wrapper = HttpRequestWrapper.wrap(request, target);
            final HttpClientContext localcontext = HttpClientContext.adapt(
                    context != null ? context : new BasicHttpContext());
            // 通过RequestConfig对象配置连接参数
            RequestConfig config = null; 
            if (request instanceof Configurable) {
                // 如果在HttpRequest对象中设置了RequestConfig属性,直接使用
                config = ((Configurable) request).getConfig(); 
            }
            if (config == null) {
                // 如果在HttpRequest对象中未设置RequestConfig对象属性,则获取HttpParams属性构造RequestConfig对象
                final HttpParams params = request.getParams(); 
                if (params instanceof HttpParamsNames) {
                    if (!((HttpParamsNames) params).getNames().isEmpty()) {
                        config = HttpClientParamConfig.getRequestConfig(params);
                    }
                } else {
                    config = HttpClientParamConfig.getRequestConfig(params);
                }
            }
            if (config != null) {
                // 使用RequestConfig对象配置连接参数
                localcontext.setRequestConfig(config); 
            }
            setupContext(localcontext);
            final HttpRoute route = determineRoute(target, wrapper, localcontext);
            return this.execChain.execute(route, wrapper, localcontext, execAware);
        } catch (final HttpException httpException) {
            throw new ClientProtocolException(httpException);
        }
    }

(3)HttpClient默认使用的连接池为PoolingHttpClientConnectionManager,在建立连接时(connect()方法)会使用其中的SocketConfig配置参数对Socket进行配置,如下所示:

  • PoolingHttpClientConnectionManager.java
代码语言:javascript
复制
@Override
public void connect(
        final HttpClientConnection managedConn,
        final HttpRoute route,
        final int connectTimeout,
        final HttpContext context) throws IOException {
    Args.notNull(managedConn, "Managed Connection");
    Args.notNull(route, "HTTP route");
    final ManagedHttpClientConnection conn;
    synchronized (managedConn) {
        final CPoolEntry entry = CPoolProxy.getPoolEntry(managedConn);
        conn = entry.getConnection();
    }
    final HttpHost host;
    if (route.getProxyHost() != null) {
        host = route.getProxyHost();
    } else {
        host = route.getTargetHost();
    }
    final InetSocketAddress localAddress = route.getLocalSocketAddress();
    SocketConfig socketConfig = this.configData.getSocketConfig(host);
    if (socketConfig == null) {
        // 使用配置参数SocketConfig
        socketConfig = this.configData.getDefaultSocketConfig();
    }
    if (socketConfig == null) {
        socketConfig = SocketConfig.DEFAULT;
    }
    this.connectionOperator.connect(
            conn, host, localAddress, connectTimeout, socketConfig, context);
}
  • DefaultHttpClientConnectionOperator.java
代码语言:javascript
复制
@Override
    public void connect(
            final ManagedHttpClientConnection conn,
            final HttpHost host,
            final InetSocketAddress localAddress,
            final int connectTimeout,
            final SocketConfig socketConfig,
            final HttpContext context) throws IOException {
        final Lookup<ConnectionSocketFactory> registry = getSocketFactoryRegistry(context);
        final ConnectionSocketFactory sf = registry.lookup(host.getSchemeName());
        if (sf == null) {
            throw new UnsupportedSchemeException(host.getSchemeName() +
                    " protocol is not supported");
        }
        final InetAddress[] addresses = host.getAddress() != null ?
                new InetAddress[] { host.getAddress() } : this.dnsResolver.resolve(host.getHostName());
        final int port = this.schemePortResolver.resolve(host);
        for (int i = 0; i < addresses.length; i++) {
            final InetAddress address = addresses[i];
            final boolean last = i == addresses.length - 1;

            Socket sock = sf.createSocket(context);
            // 使用socketConfig参数中的超时时间对Socket进行配置
            sock.setSoTimeout(socketConfig.getSoTimeout());
            sock.setReuseAddress(socketConfig.isSoReuseAddress());
            sock.setTcpNoDelay(socketConfig.isTcpNoDelay());
            sock.setKeepAlive(socketConfig.isSoKeepAlive());
            if (socketConfig.getRcvBufSize() > 0) {
                sock.setReceiveBufferSize(socketConfig.getRcvBufSize());
            }
            if (socketConfig.getSndBufSize() > 0) {
                sock.setSendBufferSize(socketConfig.getSndBufSize());
            }

            final int linger = socketConfig.getSoLinger();
            if (linger >= 0) {
                sock.setSoLinger(true, linger);
            }
            conn.bind(sock);

            // ...
        }
    }

经过源码解读可以很明确地知道,在HttpClient 4.5.2版本中,设置连接参数有3种方式: (1)在HttpRequest对象中设置RequestConfig对象属性 (2)在HttpRequest对象中设置HttpParams对象属性. (3)在连接池对象中设置SocketConfig对象属性

既然找到了根源,下面分别通过这3种方式设置超时参数进行验证. 方式1: 通过RequestConfig对象设置超时参数

代码语言:javascript
复制
int timeOut = 5000;
RequestConfig requestConfig = RequestConfig.custom()
        .setConnectionRequestTimeout(timeOut) // 获取连接超时时间
        .setConnectTimeout(timeOut) // 设置HTTP连接超时时间
        .setSocketTimeout(timeOut) // 设置Socket超时时间
        .build();
request.setConfig(requestConfig);

方式2: 通过HttpParams对象设置超时参数

代码语言:javascript
复制
int timeOut = 5000;
HttpParams  params = new BasicHttpParams();
params.setParameter(CoreConnectionPNames.SO_TIMEOUT, timeOut);  // 设置Socket超时时间
params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, timeOut); // 设置HTTP连接超时时间
request.setParams(params);

方式3: 通过连接池对象设置超时参数

代码语言:javascript
复制
int timeOut = 5000;
PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager();
// 对连接池设置SocketConfig对象
connManager.setDefaultSocketConfig(SocketConfig.custom().setSoTimeout(timeOut).build());
client = HttpClients.custom().setConnectionManager(connManager).build();

通过上述3种方式分别设置超时参数验证,虽然在运行过程中会有报错,但是不会导致线程被阻塞,进程能正常运行结束:

有报错但是不会出现线程hang住
有报错但是不会出现线程hang住

解决问题之后客户端CPU使用率恢复正常:

CPU使用率基本稳定
CPU使用率基本稳定

总结/教训/反思

  1. 之所以会遇到这样的问题,还是因为对HttpClient组件不熟悉导致的;另外,在发现问题之后如何快速定位问题,并搜索稳定的解决方案很重要。
  2. HttpClient组件每个版本的API变化都比较大,在使用时一定要彻底清楚当前使用的版本是如何设置超时时间的。而如何确定知道超时时间控制,通过源代码查看最为妥当.
  3. 在Java平台使用Http客户端组件,可以有多个选择: (1)直接使用JDK提供的URL类访问 (2)使用HttpClient组件,有坑,不同版本设置参数的方式变动较大,最好是阅读一下当前使用版本的源码实现,正确设置好超时时间等参数 (3)如果使用了Spring MVC框架,还可以使用Spring MVC提供的RestTemplate组件,底层是使用Netty实现的客户端
  4. 遇到的这个坑本身并不属于技术难点,但是面对这个问题的解决思路值得总结: (1)程序日志,运行日志非常关键,是定位问题时第一时间需要查看的 (2)代码review,逐行逐行地审查,首先排除可能存在的代码逻辑问题,比如:死锁等 (3)通过jstack命令查看线程堆栈信息: jstack -l -F <pid> > stack.log (4)通过jmap命令查看内存堆栈信息: jmap -dump:live format=b,file=heap.bin <pid> (5)如果结合搜索引擎和上述排查步骤依然未能解决问题,应该第一时间考虑直接阅读组件的源代码实现,特别是使用了开源组件时这可能才是真正解决问题的最佳路径

【参考】 https://blog.csdn.net/u011191463/article/details/78664896 HttpClient超时设置详解 https://my.oschina.net/jywm/blog/1834702 解决httpclient超时设置不生效的问题 https://www.jianshu.com/p/4b3e172c4f2d HttpClient 4.5.2-(四)连接超时的配置 https://www.jianshu.com/p/6a41c95855e3 HttpClient 4.5.2-(五)连接池的配置 https://www.jianshu.com/p/c852cbcf3d68 HttpClient高并发下性能优化-http连接池 https://blog.csdn.net/u011402596/article/details/44619443 HttpClient 多线程处理 https://field-notes.iteye.com/blog/2383759 多线程消费使用HttpClient引发的坑 http://blog.51cto.com/lihao001/1788490 httpclient4.3 导致线程阻塞 https://study121007.iteye.com/blog/2304274 HttpClient4.5.2 连接管理 https://www.jianshu.com/p/c852cbcf3d68 HttpClient高并发下性能优化-http连接池 https://alafqq.iteye.com/blog/2325041 httpclient 多线程执行(网上版本太多了。。。误人子弟) https://gaozzsoft.iteye.com/blog/2352241 HttpClient 4.5.2版本设置连接超时时间-CloseableHttpClient设置Timeout https://www.cnblogs.com/softidea/p/6964347.html HttpClient 专题 https://blog.csdn.net/Fhaohaizi/article/details/78217903 httpclient4.5如何确保资源释放 https://blog.csdn.net/u010634066/article/details/83120122 一场HttpClient调用未关闭流引发的问题 https://www.cnblogs.com/mumuxinfei/p/5066633.html Apache HttpClient使用之阻塞陷阱 https://issues.apache.org/jira/browse/HTTPCLIENT-1584 CloseableHttpClient - SSL Handshake has no Socket Timeout https://monkeyissexy.github.io/2016/11/11/httpclient_ssl_handshake_socketTimeout_bug/ httpclient ssl handshake socketTimeout bug 分析解决过程 http://geekerwang.com/2017/10/22/%E8%AE%B0%E5%BD%95%E4%B8%80%E6%AC%A1%E5%9D%91%EF%BC%9AsocketRead-hang/ 记录一次坑:socketRead hang https://www.cnblogs.com/jessezeng/p/7448636.html 解决: httpclient ssl 验证导致死锁问题 http://itbang.me/solu/detail/201 JAVA线程卡死问题如何定位?

本文参与 腾讯云自媒体分享计划,分享自作者个人站点/博客。
原始发表:2019-03-27 ,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 问题现场
  • 问题排查
  • 问题重现
  • 解决方案
  • 总结/教训/反思
相关产品与服务
Elasticsearch Service
腾讯云 Elasticsearch Service(ES)是云端全托管海量数据检索分析服务,拥有高性能自研内核,集成X-Pack。ES 支持通过自治索引、存算分离、集群巡检等特性轻松管理集群,也支持免运维、自动弹性、按需使用的 Serverless 模式。使用 ES 您可以高效构建信息检索、日志分析、运维监控等服务,它独特的向量检索还可助您构建基于语义、图像的AI深度应用。
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档