专栏首页做不甩锅的后端java线程池(三):ThreadPoolExecutor源码分析

java线程池(三):ThreadPoolExecutor源码分析

在前面分析了Executors工厂方法类之后,我们来看看AbstractExecutorService的最主要的一种实现类,ThreadpoolExecutor。

1.类的结构及其成员变量

1.类的基本结构

ThreadPoolExecutor类是AbstractExecutorService的一个实现类。其类的主要结构如下所示:

我们可以看看这个类的注释:

/**
 * An {@link ExecutorService} that executes each submitted task using
 * one of possibly several pooled threads, normally configured
 * using {@link Executors} factory methods.
 *
 * <p>Thread pools address two different problems: they usually
 * provide improved performance when executing large numbers of
 * asynchronous tasks, due to reduced per-task invocation overhead,
 * and they provide a means of bounding and managing the resources,
 * including threads, consumed when executing a collection of tasks.
 * Each {@code ThreadPoolExecutor} also maintains some basic
 * statistics, such as the number of completed tasks.
 *
 * <p>To be useful across a wide range of contexts, this class
 * provides many adjustable parameters and extensibility
 * hooks. However, programmers are urged to use the more convenient
 * {@link Executors} factory methods {@link
 * Executors#newCachedThreadPool} (unbounded thread pool, with
 * automatic thread reclamation), {@link Executors#newFixedThreadPool}
 * (fixed size thread pool) and {@link
 * Executors#newSingleThreadExecutor} (single background thread), that
 * preconfigure settings for the most common usage
 * scenarios. Otherwise, use the following guide when manually
 * configuring and tuning this class:
 *
 * <dl>
 *
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize}).
 *
 * When a new task is submitted in method {@link #execute(Runnable)},
 * and fewer than corePoolSize threads are running, a new thread is
 * created to handle the request, even if other worker threads are
 * idle.  If there are more than corePoolSize but less than
 * maximumPoolSize threads running, a new thread will be created only
 * if the queue is full.  By setting corePoolSize and maximumPoolSize
 * the same, you create a fixed-size thread pool. By setting
 * maximumPoolSize to an essentially unbounded value such as {@code
 * Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary
 * number of concurrent tasks. Most typically, core and maximum pool
 * sizes are set only upon construction, but they may also be changed
 * dynamically using {@link #setCorePoolSize} and {@link
 * #setMaximumPoolSize}. </dd>
 *
 * <dt>On-demand construction</dt>
 *
 * <dd>By default, even core threads are initially created and
 * started only when new tasks arrive, but this can be overridden
 * dynamically using method {@link #prestartCoreThread} or {@link
 * #prestartAllCoreThreads}.  You probably want to prestart threads if
 * you construct the pool with a non-empty queue. </dd>
 *
 * <dt>Creating new threads</dt>
 *
 * <dd>New threads are created using a {@link ThreadFactory}.  If not
 * otherwise specified, a {@link Executors#defaultThreadFactory} is
 * used, that creates threads to all be in the same {@link
 * ThreadGroup} and with the same {@code NORM_PRIORITY} priority and
 * non-daemon status. By supplying a different ThreadFactory, you can
 * alter the thread's name, thread group, priority, daemon status,
 * etc. If a {@code ThreadFactory} fails to create a thread when asked
 * by returning null from {@code newThread}, the executor will
 * continue, but might not be able to execute any tasks. Threads
 * should possess the "modifyThread" {@code RuntimePermission}. If
 * worker threads or other threads using the pool do not possess this
 * permission, service may be degraded: configuration changes may not
 * take effect in a timely manner, and a shutdown pool may remain in a
 * state in which termination is possible but not completed.</dd>
 *
 * <dt>Keep-alive times</dt>
 *
 * <dd>If the pool currently has more than corePoolSize threads,
 * excess threads will be terminated if they have been idle for more
 * than the keepAliveTime (see {@link #getKeepAliveTime(TimeUnit)}).
 * This provides a means of reducing resource consumption when the
 * pool is not being actively used. If the pool becomes more active
 * later, new threads will be constructed. This parameter can also be
 * changed dynamically using method {@link #setKeepAliveTime(long,
 * TimeUnit)}.  Using a value of {@code Long.MAX_VALUE} {@link
 * TimeUnit#NANOSECONDS} effectively disables idle threads from ever
 * terminating prior to shut down. By default, the keep-alive policy
 * applies only when there are more than corePoolSize threads. But
 * method {@link #allowCoreThreadTimeOut(boolean)} can be used to
 * apply this time-out policy to core threads as well, so long as the
 * keepAliveTime value is non-zero. </dd>
 *
 * <dt>Queuing</dt>
 *
 * <dd>Any {@link BlockingQueue} may be used to transfer and hold
 * submitted tasks.  The use of this queue interacts with pool sizing:
 *
 * <ul>
 *
 * <li> If fewer than corePoolSize threads are running, the Executor
 * always prefers adding a new thread
 * rather than queuing.</li>
 *
 * <li> If corePoolSize or more threads are running, the Executor
 * always prefers queuing a request rather than adding a new
 * thread.</li>
 *
 * <li> If a request cannot be queued, a new thread is created unless
 * this would exceed maximumPoolSize, in which case, the task will be
 * rejected.</li>
 *
 * </ul>
 *
 * There are three general strategies for queuing:
 * <ol>
 *
 * <li> <em> Direct handoffs.</em> A good default choice for a work
 * queue is a {@link SynchronousQueue} that hands off tasks to threads
 * without otherwise holding them. Here, an attempt to queue a task
 * will fail if no threads are immediately available to run it, so a
 * new thread will be constructed. This policy avoids lockups when
 * handling sets of requests that might have internal dependencies.
 * Direct handoffs generally require unbounded maximumPoolSizes to
 * avoid rejection of new submitted tasks. This in turn admits the
 * possibility of unbounded thread growth when commands continue to
 * arrive on average faster than they can be processed.  </li>
 *
 * <li><em> Unbounded queues.</em> Using an unbounded queue (for
 * example a {@link LinkedBlockingQueue} without a predefined
 * capacity) will cause new tasks to wait in the queue when all
 * corePoolSize threads are busy. Thus, no more than corePoolSize
 * threads will ever be created. (And the value of the maximumPoolSize
 * therefore doesn't have any effect.)  This may be appropriate when
 * each task is completely independent of others, so tasks cannot
 * affect each others execution; for example, in a web page server.
 * While this style of queuing can be useful in smoothing out
 * transient bursts of requests, it admits the possibility of
 * unbounded work queue growth when commands continue to arrive on
 * average faster than they can be processed.  </li>
 *
 * <li><em>Bounded queues.</em> A bounded queue (for example, an
 * {@link ArrayBlockingQueue}) helps prevent resource exhaustion when
 * used with finite maximumPoolSizes, but can be more difficult to
 * tune and control.  Queue sizes and maximum pool sizes may be traded
 * off for each other: Using large queues and small pools minimizes
 * CPU usage, OS resources, and context-switching overhead, but can
 * lead to artificially low throughput.  If tasks frequently block (for
 * example if they are I/O bound), a system may be able to schedule
 * time for more threads than you otherwise allow. Use of small queues
 * generally requires larger pool sizes, which keeps CPUs busier but
 * may encounter unacceptable scheduling overhead, which also
 * decreases throughput.  </li>
 *
 * </ol>
 *
 * </dd>
 *
 * <dt>Rejected tasks</dt>
 *
 * <dd>New tasks submitted in method {@link #execute(Runnable)} will be
 * <em>rejected</em> when the Executor has been shut down, and also when
 * the Executor uses finite bounds for both maximum threads and work queue
 * capacity, and is saturated.  In either case, the {@code execute} method
 * invokes the {@link
 * RejectedExecutionHandler#rejectedExecution(Runnable, ThreadPoolExecutor)}
 * method of its {@link RejectedExecutionHandler}.  Four predefined handler
 * policies are provided:
 *
 * <ol>
 *
 * <li> In the default {@link ThreadPoolExecutor.AbortPolicy}, the
 * handler throws a runtime {@link RejectedExecutionException} upon
 * rejection. </li>
 *
 * <li> In {@link ThreadPoolExecutor.CallerRunsPolicy}, the thread
 * that invokes {@code execute} itself runs the task. This provides a
 * simple feedback control mechanism that will slow down the rate that
 * new tasks are submitted. </li>
 *
 * <li> In {@link ThreadPoolExecutor.DiscardPolicy}, a task that
 * cannot be executed is simply dropped.  </li>
 *
 * <li>In {@link ThreadPoolExecutor.DiscardOldestPolicy}, if the
 * executor is not shut down, the task at the head of the work queue
 * is dropped, and then execution is retried (which can fail again,
 * causing this to be repeated.) </li>
 *
 * </ol>
 *
 * It is possible to define and use other kinds of {@link
 * RejectedExecutionHandler} classes. Doing so requires some care
 * especially when policies are designed to work only under particular
 * capacity or queuing policies. </dd>
 *
 * <dt>Hook methods</dt>
 *
 * <dd>This class provides {@code protected} overridable
 * {@link #beforeExecute(Thread, Runnable)} and
 * {@link #afterExecute(Runnable, Throwable)} methods that are called
 * before and after execution of each task.  These can be used to
 * manipulate the execution environment; for example, reinitializing
 * ThreadLocals, gathering statistics, or adding log entries.
 * Additionally, method {@link #terminated} can be overridden to perform
 * any special processing that needs to be done once the Executor has
 * fully terminated.
 *
 * <p>If hook or callback methods throw exceptions, internal worker
 * threads may in turn fail and abruptly terminate.</dd>
 *
 * <dt>Queue maintenance</dt>
 *
 * <dd>Method {@link #getQueue()} allows access to the work queue
 * for purposes of monitoring and debugging.  Use of this method for
 * any other purpose is strongly discouraged.  Two supplied methods,
 * {@link #remove(Runnable)} and {@link #purge} are available to
 * assist in storage reclamation when large numbers of queued tasks
 * become cancelled.</dd>
 *
 * <dt>Finalization</dt>
 *
 * <dd>A pool that is no longer referenced in a program <em>AND</em>
 * has no remaining threads will be {@code shutdown} automatically. If
 * you would like to ensure that unreferenced pools are reclaimed even
 * if users forget to call {@link #shutdown}, then you must arrange
 * that unused threads eventually die, by setting appropriate
 * keep-alive times, using a lower bound of zero core threads and/or
 * setting {@link #allowCoreThreadTimeOut(boolean)}.  </dd>
 *
 * </dl>
 *
 * <p><b>Extension example</b>. Most extensions of this class
 * override one or more of the protected hook methods. For example,
 * here is a subclass that adds a simple pause/resume feature:
 *
 *  <pre> {@code
 * class PausableThreadPoolExecutor extends ThreadPoolExecutor {
 *   private boolean isPaused;
 *   private ReentrantLock pauseLock = new ReentrantLock();
 *   private Condition unpaused = pauseLock.newCondition();
 *
 *   public PausableThreadPoolExecutor(...) { super(...); }
 *
 *   protected void beforeExecute(Thread t, Runnable r) {
 *     super.beforeExecute(t, r);
 *     pauseLock.lock();
 *     try {
 *       while (isPaused) unpaused.await();
 *     } catch (InterruptedException ie) {
 *       t.interrupt();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void pause() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = true;
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void resume() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = false;
 *       unpaused.signalAll();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 * }}</pre>
 *
 * @since 1.5
 * @author Doug Lea
 */
public class ThreadPoolExecutor extends AbstractExecutorService {

}

其大意为: 一个ExecutorService的实现类,它使用一个具有多个线程的池中的一个线程来执行提交的任务,这些线程通常使用工厂方法Executors进行配置。 线程池用于解决两个不同的问题,由于减少了每个任务的调用开销,他们通常在执行大量异步任务的时候可提供改进的性能,并且他们提供了一种绑定和管理资源(包括线程)的方法。该资源在执行集合的执行时消耗任务,每个ThreadPoolExecutor还维护了一些基本的统计信息,例如已完成的任务数量。 为了在广泛的上下文中有用,该类提供了许多可调整的参数和可扩展的钩子函数。但是,强烈建议程序员使用Executors工厂方法的newCachedThreadPool,该方法的线程池无边界,具有自动的线程回收。newFixedThreadPool(固定大小的线程池)和newSingleThreadExecutor(单个后台线程)。可以为最常见的使用场景预配置设置。否则,在手动配置和调整此类时,请使用以下指南:

核心和最大池大小: ThreadPoolExecutor将根据corePoolSize和maximumPoolSize设置范围自动调整池的大小。请参见getPoolSize。当在方法execute中提交新任务,并且正在运行的线程少于corePoolSize线程时,即使其他工作线程处于空闲状态,也会创建一个新的线程来处理请求。如果运行的线程数大于corePoolSize,但是小于maximumPoolSize。则仅在队列已满的时候才创建线程。通过corePoolSize和maximumPoolSize设置为相同,可以创建为固定大小的线程池。通过将maximumPoolSize设置为本质上不受限制的Integer.MAX_VALUE,则可以允许线程池容纳任意数量的并发任务,通常,核心线程和最大的池大小仅在构造的时候设置。但也可以使用setCorePoolSize和setMaximumPoolSize动态更改。

按需构建: 默认情况下,甚至只有在有新任务到达的时候才开始启动核心线程,但是可以使用方法prestartCoreThread或者prestartAllCoreThreads动态的进行覆盖。如果使用非空队列构造线程池,则可能需要预启动线程。

创建新线程: 使用ThreadFactory创建新线程,如果没有指定,那么采用Executors的defaultThreadFactory默认方法。该方法创建的全部线程具有相同的NORM_PRIORITY优先级和非守护线程的状态。通过提供不同的ThreadFactiry可以更改线程的名称,线程组、优先级、守护线程状态等。如果通过向newThread返回null时要求ThreadFactory创建线程失败,执行程序将继续,但可能无法执行任何任务。线程应具有modifyThread的RuntimePermission。如果使用该线程池的工作线程或者其他线程不具有此权限,则服务可能会降级,配置更改可能不会即时生效。并且关闭线程池可能保持在可能终止但是没有完成的状态。

Keep-alive时间: 如果当前的池中的线程数超过corePoolSize,则多余的线程将在空闲的时间超过keepAliveTime时终止。请参考getKeepAliveTime,当不积极使用线程池时,这提供了减少资源消耗的办法,也可以使用方法setKeepAliveTime动态调整(long,TimeUnit)动态调整此参数。如果使用Long.MAX_VALUE和TimeUnit#NANOSECONDS,则有效的使用空闲线程不会再线程池关闭之前关闭。默认情况下,仅当corePoolSize线程数多时,保持活动策略才适用。但是方法allowCoreThreadTimeOut也可以用于将此超时策略应用于核心线程,只要keepAliveTime值不为0即可。

排队: 任何BlockingQueue均可用于传输和保留提交的任务,此队列的使用与线程池的大小互相影响:如果正在运行的线程少于corePoolSize线程,则执行程序总是添加新的线程来执行任务,而不是排队。如果正在运行corePoolSize或者更多的线程,则执行程序总是喜欢对请求进行排队,而不是添加新线程。如果无法将请求放入队列中,则创建一个新线程,除非该线程超过了maximumPoolSize,在这种情况下,该任务将被拒绝。

排队一般有三种策略:

  • 直接传递:SynchronousQueue是工作队列的一个默认的选择。他可以将任务移交给线程,而不必另外保留。在这里,如果没有立即可用的线程来运行任务,则试图将任务进行排队的尝试将失败,因此需要构造一个新的线程。在处理可能具有内部依赖性的请求集的时候,此策略避免了锁。直接切换通常需要无限制的maximumPoolSizes以避免拒绝提交的新任务。反过来,当平均而言,命令继续以比其处理速度更快的到达时,这可能会带来无限线程增长的可能性。
  • 无界队列:当所有corePoolSize都处于忙的时候,使用无届队列,如没有预定容量的LinkedBlockingQueue。将导致新任务在队列中等待。因此,仅创建corePoolSize线程。maximumPoolSize将没有任何作用。当每个任务完全独立于其他任务的时候,这可能是适当的。因此任务不会影响彼此的执行,例如,在网页服务器中。尽管这种排队方式对于消除短暂的突发请求很有用,但它承认当命令平均继续以比处理速度更快的速度到达时,工作队列会无限增长,这可能会造成OOM。
  • 有界队列:与有限的maximumPoolSizes一起使用时,有界队列如ArrayBlockingQueue,有助于防止资源耗尽,但是调优和控制将会非常困难。队列大小和最大的线程池大小可能会互相折衷。使用大队列和小的池可以最大程度的减少CPU的使用率,操作系统和上下文切换的开销,但是会人为的导致吞吐量下降。如果任务频繁阻塞,如I/O,则系统可能认为你安排的线程调度的时间超出了允许的范围。使用小队列通常需要更大的池的大小,这会使得CPU繁忙,但是可能会遇到无法接受的调度开销,这也会降低吞吐量。

拒绝任务: 当执行器关闭时,并且执行器对最大线程数和工作队列容量使用有限范围时,在方法execute提交的新任务将被拒绝。处于饱和。无论那种情况,execute将用RejectedExecutionHandler的RejectedExecutionHandler#rejectedExecution(Runnable,ThreadPoolExecutor)方法。提供了4个预订的处理策略:

  • 默认的拒绝策略ThreadPoolExecutor.AbortPolicy,处理程序在拒绝的时候会抛出RejectedExecutionException异常。
  • 在ThreadPoolExecutor.CallerRunsPolicy中,调用execute本身的线程运行任务。这提供了一种简单的反馈机制,将降低新任务的提交速度。
  • 在ThreadPoolExecutor.DiscardPolicy中,简单的删除了无法执行的任务。
  • 在ThreadPoolExecutor.DiscardOldestPolicy策略中,如果未关闭执行程序,则将丢弃工作队列开头的任务,然后重试执行,该操作可能再次失败,导致重复执行此操作。 也可以定义和使用其他类型的RejectedExecutionHandler,但是这样做需要格外小心,尤其在设计策略仅在特定容量或者排队策略下才能工作时。

Hook方法: 线程池类提供了protected权限的可重写的beforeExecute和afterExecute方法。这些方法在每个线程的之前前后被调用。这些可以用来对执行环境进行操作,例如,重新初始化ThreadLocals收集统计信息或者添加统计条目,此外,一旦线程完成终止,方法terminated可以被覆盖以执行需要执行的任何特殊处理。 如果钩子函数或者回掉方法出现异常,内部工作的线程可能失败并突然终止。

队列维护: 方法getQueue允许访问工作队列,以进行监视和调试,强烈建议不要将此方法用于任何其他目的,当取消大量排队任务的时候,可以使用提供的两种方法,remove和purge来帮助回收内存。

Finalization: 在一个程序中如果不再对线程池进行引用,或者没有剩余的线程,线程池将自动shutdown。如果即使用户忘记调用shutdown的情况下,如果你想确保回收未引用的线程池,则必须使用0核心线程的下限来设置适当的keepAlive时间,以使未使用的线程最终消亡。通过allowCoreThreadTimeOut设置。 扩展示例:此类的大多数扩展都覆盖一个或者多个受保护的hook方法,例如:以下是一个子类,它添加了一个简单的暂停/继续功能:

class PausableThreadPoolExecutor extends ThreadPoolExecutor {
   private boolean isPaused;
    private ReentrantLock pauseLock = new ReentrantLock();
    private Condition unpaused = pauseLock.newCondition();
 
    public PausableThreadPoolExecutor(...) { super(...); }
 
    protected void beforeExecute(Thread t, Runnable r) {
      super.beforeExecute(t, r);
      pauseLock.lock();
      try {
        while (isPaused) unpaused.await();
      } catch (InterruptedException ie) {
        t.interrupt();
      } finally {
        pauseLock.unlock();
      }
    }
 
    public void pause() {
      pauseLock.lock();
      try {
        isPaused = true;
      } finally {
        pauseLock.unlock();
      }
    }
 
    public void resume() {
      pauseLock.lock();
      try {
        isPaused = false;
        unpaused.signalAll();
      } finally {
        pauseLock.unlock();
      }
    }
  }}

1.2 成员变量及常量

在类的内部,还有大段关于线程池状态的注释:

/**
 * The main pool control state, ctl, is an atomic integer packing
 * two conceptual fields
 *   workerCount, indicating the effective number of threads
 *   runState,    indicating whether running, shutting down etc
 *
 * In order to pack them into one int, we limit workerCount to
 * (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
 * billion) otherwise representable. If this is ever an issue in
 * the future, the variable can be changed to be an AtomicLong,
 * and the shift/mask constants below adjusted. But until the need
 * arises, this code is a bit faster and simpler using an int.
 *
 * The workerCount is the number of workers that have been
 * permitted to start and not permitted to stop.  The value may be
 * transiently different from the actual number of live threads,
 * for example when a ThreadFactory fails to create a thread when
 * asked, and when exiting threads are still performing
 * bookkeeping before terminating. The user-visible pool size is
 * reported as the current size of the workers set.
 *
 * The runState provides the main lifecycle control, taking on values:
 *
 *   RUNNING:  Accept new tasks and process queued tasks
 *   SHUTDOWN: Don't accept new tasks, but process queued tasks
 *   STOP:     Don't accept new tasks, don't process queued tasks,
 *             and interrupt in-progress tasks
 *   TIDYING:  All tasks have terminated, workerCount is zero,
 *             the thread transitioning to state TIDYING
 *             will run the terminated() hook method
 *   TERMINATED: terminated() has completed
 *
 * The numerical order among these values matters, to allow
 * ordered comparisons. The runState monotonically increases over
 * time, but need not hit each state. The transitions are:
 *
 * RUNNING -> SHUTDOWN
 *    On invocation of shutdown(), perhaps implicitly in finalize()
 * (RUNNING or SHUTDOWN) -> STOP
 *    On invocation of shutdownNow()
 * SHUTDOWN -> TIDYING
 *    When both queue and pool are empty
 * STOP -> TIDYING
 *    When pool is empty
 * TIDYING -> TERMINATED
 *    When the terminated() hook method has completed
 *
 * Threads waiting in awaitTermination() will return when the
 * state reaches TERMINATED.
 *
 * Detecting the transition from SHUTDOWN to TIDYING is less
 * straightforward than you'd like because the queue may become
 * empty after non-empty and vice versa during SHUTDOWN state, but
 * we can only terminate if, after seeing that it is empty, we see
 * that workerCount is 0 (which sometimes entails a recheck -- see
 * below).
 */

大意为; 主线程池的控制状态,ctl,是一个打包两个概念的字段workerCount的原子整数。以及指示线程有效运行数量的runState。用来指示是否运行和关闭等状态。为了将他们打包到一个int上,我们将workerCount限制为(2 ^ 29 )-1约为5亿个线程。而不是(2 ^ 31)-1(20亿)个线程。如果将来有问题,可以使用AtomicLong进行替换。并调整shift / mask常数。但是在需要之前,使用int可以使代码更快。更简单。workCount是允许被启动不停止的worker数量,该值可能与活动线程的实际数量有暂时的不同,例如,ThreadFactory在被请求的时候未能创建线程,以及当退出线程任在终止之前执行记账操作等,用户可见的池大小报告为工作集合的当前大小。 runState提供了主生命周期控制,其值为:

  • RUNNING: 接收新任务并处理排队的任务。
  • SHUTDOWN:不接收新任务,但是处理排队的任务。
  • STOP:不接收新任务,不处理排队任务,并中断正在进行的任务。
  • TIDYING:所有任务已终止,workerCount为零,转换到状态TIDYING的线程将运行Terminated()钩子方法。
  • TERMINATED:terminated()方法已完成。

这些值之间的数字顺序很重要,可以进行有序的比较。runState随时间单调增加,但不必达到每个状态。可以按如下方式过渡:

  • RUNNING -> SHUTDOWN:关于shutdown()的调用,可能在finalize()中隐式地调用。
  • (RUNNING or SHUTDOWN) -> STOP:shutdownNow()相关调用。
  • STOP -> TIDYING 当pool为空的时候。
  • TIDYING -> TERMINATED: terminated()被调用的时候。 在waittermination()中等待的线程将返回状态到达终止。 检测从SHUTDOWN到TIDYING的转换并不像您想要的那样简单,因为在SHUTDOWN状态期间,队列在非空之后可能变为空,反之亦然,但是只有在看到它为空之后才能看到workerCount为0(有时需要重新检查-参见下文)。

1.2.1 ctl

ctl是ThreadPoolExecutor对线程状态runState和线程池workerCount的一个综合字段。将这两个属性打包放置在AtomicInteger上,ctl长度为32位,前面3位用于存放runState,后面29位存放workerCount。因此线程池最大的线程数量位2^29-1。其代码如下:

private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; }
private static int workerCountOf(int c)  { return c & CAPACITY; }
private static int ctlOf(int rs, int wc) { return rs | wc; }

/*
 * Bit field accessors that don't require unpacking ctl.
 * These depend on the bit layout and on workerCount being never negative.
 */

private static boolean runStateLessThan(int c, int s) {
    return c < s;
}

private static boolean runStateAtLeast(int c, int s) {
    return c >= s;
}

private static boolean isRunning(int c) {
    return c < SHUTDOWN;
}

/**
 * Attempts to CAS-increment the workerCount field of ctl.
 */
private boolean compareAndIncrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect + 1);
}

/**
 * Attempts to CAS-decrement the workerCount field of ctl.
 */
private boolean compareAndDecrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect - 1);
}

/**
 * Decrements the workerCount field of ctl. This is called only on
 * abrupt termination of a thread (see processWorkerExit). Other
 * decrements are performed within getTask.
 */
private void decrementWorkerCount() {
    do {} while (! compareAndDecrementWorkerCount(ctl.get()));
}

我们可以看看各状态的二进制情况:

可以看到,通过位移操作,将最高的三位用于标识runState状态,而后面29位用于存放workerCount. 方法runStateOf和workerCountOf则可以将这两部分数据通过位运算的方式取出来: 如下图所示,假定需要计算的c为0110 0000 0000 0000 0000 1110 0110 0000,那么位移过程如下:

这两个方法就可以很快的计算出结果。 而ctlOf方法,rs | wc,这就非常容易理解了。

上图就非常方便的将runState和workerCount进行了组合。 而根据第一张图可以看到,RUNNING状态为负数,是最小的,这些状态的全部ctl满足如下规则:

RUNNING < SHUTDOWN < STOP < TIDYING < TERMINATED

方法runStateLessThan和runStateAtLeast则是根据上述规则判断线程池当前所处的状态。 可以发现的是,比SHUTDOWN小的任何状态都是iRUNNING状态。这也是isRunning方法的原因。 此外还提供了基于CAS的增减WorkerCount的方法: compareAndIncrementWorkerCount、compareAndDecrementWorkerCount和decrementWorkerCount。

1.2.2 workQueue

/**
 * The queue used for holding tasks and handing off to worker
 * threads.  We do not require that workQueue.poll() returning
 * null necessarily means that workQueue.isEmpty(), so rely
 * solely on isEmpty to see if the queue is empty (which we must
 * do for example when deciding whether to transition from
 * SHUTDOWN to TIDYING).  This accommodates special-purpose
 * queues such as DelayQueues for which poll() is allowed to
 * return null even if it may later return non-null when delays
 * expire.
 */
private final BlockingQueue<Runnable> workQueue;

workQueue是用于保留任务并移交给工作线程的队列,我们不要指望通过workQueue.poll返回为null来判断队列是否为空,我们仅仅通过workQueue.isEmpty方法来判断队列是否为空,(将SHUTDOWN状态过度到TIDYING状态的时候可以采用这样做。)因为一些特殊的队列如DelayQueues允许poll的时候返回空,即使在延迟后的档期可以非空。这个workQueue依赖于构造函数传入。

1.2.3 workers

/**
 * Set containing all worker threads in pool. Accessed only when
 * holding mainLock.
 */
private final HashSet<Worker> workers = new HashSet<Worker>();

线程池中所有工作线程的集合。访问workers的时候需要获得锁mainLock。这个workers是hashSet。

1.2.4 mainLock & termination

/**
 * Lock held on access to workers set and related bookkeeping.
 * While we could use a concurrent set of some sort, it turns out
 * to be generally preferable to use a lock. Among the reasons is
 * that this serializes interruptIdleWorkers, which avoids
 * unnecessary interrupt storms, especially during shutdown.
 * Otherwise exiting threads would concurrently interrupt those
 * that have not yet interrupted. It also simplifies some of the
 * associated statistics bookkeeping of largestPoolSize etc. We
 * also hold mainLock on shutdown and shutdownNow, for the sake of
 * ensuring workers set is stable while separately checking
 * permission to interrupt and actually interrupting.
 */
private final ReentrantLock mainLock = new ReentrantLock();
    /**
 * Wait condition to support awaitTermination
 */
private final Condition termination = mainLock.newCondition();

调用这个锁的时候需要锁定worker的位置,并进行相关的记录,尽管我们使用某种并发集,但是事实证明,使用锁的方式通常是可取的。原因之一是它可以对interruptIdleWorkers进行序列化。从而可以避免不必要的中断风暴。尤其是在关机期间,否则,退出线程将同时中断哪些尚未中断的线程,它也简化了一些相关的数据。如maximumPoolSize。我们还在shutdown和shutdownNow上保留了mainLock,以确保在单独检查中断和实际中断权限的时候,workerSet是稳定的。 termination是一个wait条件,以支持awaitTermination。

1.2.5 其他成员变量

/**
 * Tracks largest attained pool size. Accessed only under
 * mainLock.
 */
private int largestPoolSize;

/**
 * Counter for completed tasks. Updated only on termination of
 * worker threads. Accessed only under mainLock.
 */
private long completedTaskCount;

/*
 * All user control parameters are declared as volatiles so that
 * ongoing actions are based on freshest values, but without need
 * for locking, since no internal invariants depend on them
 * changing synchronously with respect to other actions.
 */

/**
 * Factory for new threads. All threads are created using this
 * factory (via method addWorker).  All callers must be prepared
 * for addWorker to fail, which may reflect a system or user's
 * policy limiting the number of threads.  Even though it is not
 * treated as an error, failure to create threads may result in
 * new tasks being rejected or existing ones remaining stuck in
 * the queue.
 *
 * We go further and preserve pool invariants even in the face of
 * errors such as OutOfMemoryError, that might be thrown while
 * trying to create threads.  Such errors are rather common due to
 * the need to allocate a native stack in Thread.start, and users
 * will want to perform clean pool shutdown to clean up.  There
 * will likely be enough memory available for the cleanup code to
 * complete without encountering yet another OutOfMemoryError.
 */
private volatile ThreadFactory threadFactory;

/**
 * Handler called when saturated or shutdown in execute.
 */
private volatile RejectedExecutionHandler handler;

/**
 * Timeout in nanoseconds for idle threads waiting for work.
 * Threads use this timeout when there are more than corePoolSize
 * present or if allowCoreThreadTimeOut. Otherwise they wait
 * forever for new work.
 */
private volatile long keepAliveTime;

/**
 * If false (default), core threads stay alive even when idle.
 * If true, core threads use keepAliveTime to time out waiting
 * for work.
 */
private volatile boolean allowCoreThreadTimeOut;

/**
 * Core pool size is the minimum number of workers to keep alive
 * (and not allow to time out etc) unless allowCoreThreadTimeOut
 * is set, in which case the minimum is zero.
 */
private volatile int corePoolSize;

/**
 * Maximum pool size. Note that the actual maximum is internally
 * bounded by CAPACITY.
 */
private volatile int maximumPoolSize;

/**
 * The default rejected execution handler
 */
private static final RejectedExecutionHandler defaultHandler =
    new AbortPolicy();

/**
 * Permission required for callers of shutdown and shutdownNow.
 * We additionally require (see checkShutdownAccess) that callers
 * have permission to actually interrupt threads in the worker set
 * (as governed by Thread.interrupt, which relies on
 * ThreadGroup.checkAccess, which in turn relies on
 * SecurityManager.checkAccess). Shutdowns are attempted only if
 * these checks pass.
 *
 * All actual invocations of Thread.interrupt (see
 * interruptIdleWorkers and interruptWorkers) ignore
 * SecurityExceptions, meaning that the attempted interrupts
 * silently fail. In the case of shutdown, they should not fail
 * unless the SecurityManager has inconsistent policies, sometimes
 * allowing access to a thread and sometimes not. In such cases,
 * failure to actually interrupt threads may disable or delay full
 * termination. Other uses of interruptIdleWorkers are advisory,
 * and failure to actually interrupt will merely delay response to
 * configuration changes so is not handled exceptionally.
 */
private static final RuntimePermission shutdownPerm =
    new RuntimePermission("modifyThread");

/* The context to be used when executing the finalizer, or null. */
private final AccessControlContext acc;

类中还有部分其他的成员变量,整理如下:

变量名

类型

说明

largestPoolSize

int

线程池的大小,需要获得mainLock

completedTaskCount

long

已完成的任务的计数器,在工作线程终止的时候更新,也需要获得mainLock

threadFactory

volatile ThreadFactory

创建线程的工厂方法,需要在构造函数中传入

handler

volatile RejectedExecutionHandler

拒绝策略的调用方法,在线程池饱和或者关闭的时候如果有任务传入就调用

keepAliveTime

volatile long

以纳秒为单位的worker的等待超时时间。当当前大于corePoolSize或者allowCoreThreadTimeOut的时候,线程调用此超时,否则会永远等待新的worker

allowCoreThreadTimeOut

volatile boolean

如果为false,则核心线程即使在空闲的时候也保持活动,否则,核心线程将使用keepAliveTime来超时等待。默认值为false

corePoolSize

volatile int

核心池的大小是维持生存的worker的最小数量,除非设置了allowCoreThreadTimeOut,这种情况下,最小值为0

maximumPoolSize

volatile int

线程池大小的最大值,需要注意的是最大容量在内部是由CAPACITY决定的

defaultHandler

static final RejectedExecutionHandler

默认的拒绝策略:new AbortPolicy();

shutdownPerm

static final RuntimePermission

调用shutdown和shutdownNow所需要的权限,请参阅checkShutdownAccess要求调用者有权实际中断工作程序集中的线程,由Thread.interrupt控制,依赖于ThreadGroup.checkAccess。依次依赖于SecurityManager.checkAccess。仅在这些检查通过之后,才尝试执行shutdown。Thread.interrupt的所有实际调用,都会忽略SecurityExceptions。这意味着尝试中断会以静默的方式失败。除非SecurityManager的策略不一致。否则它们不应失败。在这种情况下,无法真正中断线程可能会禁用或延迟完全终止。建议使用interruptIdleWorkers的其他用途,而实际中断的失败只会延迟对配置更改的响应,因此不会进行特殊处理。

2.重要的内部类

在ThreadPoolExecutor中,内部类有两类,一类是执行线程的Worker,还有一类是拒绝策略。

2.1 Worker

Worker这个类,继承了AbstractQueueSynchronizer。

其代码如下:

/**
 * Class Worker mainly maintains interrupt control state for
 * threads running tasks, along with other minor bookkeeping.
 * This class opportunistically extends AbstractQueuedSynchronizer
 * to simplify acquiring and releasing a lock surrounding each
 * task execution.  This protects against interrupts that are
 * intended to wake up a worker thread waiting for a task from
 * instead interrupting a task being run.  We implement a simple
 * non-reentrant mutual exclusion lock rather than use
 * ReentrantLock because we do not want worker tasks to be able to
 * reacquire the lock when they invoke pool control methods like
 * setCorePoolSize.  Additionally, to suppress interrupts until
 * the thread actually starts running tasks, we initialize lock
 * state to a negative value, and clear it upon start (in
 * runWorker).
 */
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    /**
     * This class will never be serialized, but we provide a
     * serialVersionUID to suppress a javac warning.
     */
    private static final long serialVersionUID = 6138294804551838833L;

    //当前worker工作的线程,如果factory方法失败则为空
    /** Thread this worker is running in.  Null if factory fails. */
    final Thread thread;
    //需要运行的初始化任务,可能是空
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;
    //线程任务计数器
    /** Per-thread task counter */
    volatile long completedTasks;

    /**
     * Creates with given first task and thread from ThreadFactory.
     * @param firstTask the first task (null if none)
     */
     //从ThreadFactory创建线程给firstTask任务
    Worker(Runnable firstTask) {
       //在运行worker之前禁止中断
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        //调用ThreadFactory
        this.thread = getThreadFactory().newThread(this);
    }

    /** Delegates main run loop to outer runWorker  */
    //将main的run委托给外部的runWorker方法。
    public void run() {
        runWorker(this);
    }

    // Lock methods
    //
    // The value 0 represents the unlocked state.
    // The value 1 represents the locked state.
    //0标识未锁定,1标识锁定
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    //尝试获得锁
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }
    //尝试释放锁
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }
    //将所有线程中断
    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
}

其注释大意为: worker类主要维护运行任务的线程的中断控制的状态,以及对各种情况进行记录,这个类继承了AbstractQueuedSynchronizer,以简化获取和释放围绕每个任务的锁,这样可以防止中断,这些中断旨在唤醒等待任务的工作线程,而不是中断正在运行的任务。我们实现了一个简单的不可重入锁,而不是使用ReentrantLock,因为我们不希望工作线程在调用诸如setCorePoolSize这样的池控制方法的时候能够重新获得锁。此外,为了在线程实际开始运行之前抑制中断。我们将锁的初始化状态设置为负值,并在启动的时候将其清除。 实际上,比较特殊的是Worker继承了AQS,并且实现了Runnable接口,使用firstTask来保存传入的任务,thread则是使用ThreadFactory来创建的线程。这个线程来处理任务。 在调用构造方法的时候,将任务传入,通过getThreadFactory().newThread(this);创建一个新线程。这个worker对象在启动的时候会调用其run方法,因为worker实现了Runnable接口,其本身也是一个线程。 为什么使用继承AQS而不是使用ReentrantLock呢,关键是在于tryAcquire这个方法是不允许重入的。而ReentrantLock则是允许重入。 lock方法一旦获取的独占锁,则标识当前线程正在执行任务。Worker继承了AQS,因此自身也是一把锁。在执行任务的过程中,不会释放锁。用来保证任务的正常执行。 因此,任务在运行的过程中,是不能被中断的。 如果Worker不是独占锁,也是空闲状态,则说明这个Worker没有处理任务,可以对其进行中断。线程池在执行shutdown和tryTerminate的时候会对空闲的线程进行中断。interruptIdleWorkers方法会使用tryLock方法来判断线程池中的线程是否是空闲状态。

2.2 拒绝策略类

在ThreadPoolExecutor中,支持的拒绝策略主要有4种,都是通过内部类提供的。分别是CallerRunsPolicy、AbortPolicy、DiscardPolicy、DiscardOldestPolicy。这些类都实现了RejectedExecutionHandler接口。

2.2.1 CallerRunsPolicy

/**
 * A handler for rejected tasks that runs the rejected task
 * directly in the calling thread of the {@code execute} method,
 * unless the executor has been shut down, in which case the task
 * is discarded.
 */
public static class CallerRunsPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code CallerRunsPolicy}.
     */
    public CallerRunsPolicy() { }

    /**
     * Executes task r in the caller's thread, unless the executor
     * has been shut down, in which case the task is discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
       //判断线程池的状态,如果没有关闭,则用提交任务的线程来执行run方法
        if (!e.isShutdown()) {
            r.run();
        }
    }
}

这个拒绝策略采用执行exec的线程来运行任务,除非当前线程池处于关闭状态。这个拒绝策略在正常情况下不会导致任务失败,可以有效的降低生产任务的速度。用户根据场景自行选择。

2.2.2 AbortPolicy

/**
 * A handler for rejected tasks that throws a
 * {@code RejectedExecutionException}.
 */
public static class AbortPolicy implements RejectedExecutionHandler {
    /**
     * Creates an {@code AbortPolicy}.
     */
    public AbortPolicy() { }

    /**
     * Always throws RejectedExecutionException.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     * @throws RejectedExecutionException always
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
      //抛出异常
        throw new RejectedExecutionException("Task " + r.toString() +
                                             " rejected from " +
                                             e.toString());
    }
}

这个拒绝策略会拒绝任务的执行,直接抛出异常。

2.2.3 DiscardPolicy

/**
 * A handler for rejected tasks that silently discards the
 * rejected task.
 */
public static class DiscardPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardPolicy}.
     */
    public DiscardPolicy() { }

    /**
     * Does nothing, which has the effect of discarding task r.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    }
}

该拒绝策略会丢弃当前任务,并且什么都不做。

2.2.4 DiscardOldestPolicy

/**
 * A handler for rejected tasks that discards the oldest unhandled
 * request and then retries {@code execute}, unless the executor
 * is shut down, in which case the task is discarded.
 */
public static class DiscardOldestPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardOldestPolicy} for the given executor.
     */
    public DiscardOldestPolicy() { }

    /**
     * Obtains and ignores the next task that the executor
     * would otherwise execute, if one is immediately available,
     * and then retries execution of task r, unless the executor
     * is shut down, in which case task r is instead discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
       //如果线程池未关闭,则将线程池任务队列中的旧任务poll丢弃,然后将当前任务调用execute添加到队列。
        if (!e.isShutdown()) {
            e.getQueue().poll();
            e.execute(r);
        }
    }
}

这个拒绝策略将当队列中的旧任务丢弃,之后将当前任务添加到队列,但是这个拒绝策略并不能保证当前任务能执行,还是会有可能被后续的任务继续丢弃。

3. 构造函数

ThreadpoolExecutor提供了4种构造函数,分别对前面可变的7个成员变量进行赋值。

3.1 ThreadPoolExecutor(7参数)

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
     //  corePoolSize必须大于等于0  maximumPoolSize必须大于0   且maximumPoolSize>=corePoolSize ,否则参数非法异常。                
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    //访问控制权限
    this.acc = System.getSecurityManager() == null ?
            null :
            AccessController.getContext();
    //赋值
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}

实际上对应系统的变量只有6个,这是因为,keepAliveTime和unit最终都转换为纳秒数unit.toNanos(keepAliveTime)。

变量

说明

corePoolSize

线程池常驻线程数,如果设置了allowCoreThreadTimeOut,则常驻线程在一定时间之后就会变成0,范围0<=corePoolSize<=maximumPoolSize

maximumPoolSize

线程池允许的最大线程数,其范围0<maximumPoolSize<=(2^29-1)

keepAliveTime

当线程数大于核心线程corePoolSize的时候,这些多余的线程在当前任务终止之后最长的驻留时间。

unit

keepAliveTime的单位,最终以纳秒在系统中生效

workQueue

任务队列,依赖于构造函数传入

threadFactory

产生线程的工厂方法

handler

拒绝策略,如果任务达到任务队列的长度将会触发拒绝策略,拒绝策略有四种

3.2 ThreadPoolExecutor(默认threadFactory)

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          RejectedExecutionHandler handler) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), handler);
}

实际上调用的还是3.1中的七参数的构造函数,只是使用了默认的Executors.defaultThreadFactory()做为线程产生的工厂方法。 这个方法实际上在Executors中。

public static ThreadFactory defaultThreadFactory() {
    return new DefaultThreadFactory();
}
DefaultThreadFactory() {
    SecurityManager s = System.getSecurityManager();
    group = (s != null) ? s.getThreadGroup() :
                          Thread.currentThread().getThreadGroup();
    namePrefix = "pool-" +
                  poolNumber.getAndIncrement() +
                 "-thread-";
}

3.3 ThreadPoolExecutor(默认defaultHandler)

这个方法将采用默认的拒绝策略:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         threadFactory, defaultHandler);
}

而默认的拒绝策略在前面变量和常量中已经做了说明:

private static final RejectedExecutionHandler defaultHandler =
        new AbortPolicy();

实际上采用的是丢弃当前任务的策略。

3.4 ThreadPoolExecutor(默认defaultThreadFactory和defaultHandler)

那么自然根据上述两个构造函数可以得到:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), defaultHandler);
}

将defaultThreadFactory和defaultHandler都采用默认值。因此这个构造方法是我们自己手动new ThreadPoolExecutor的时候会经常使用的。

4.基本原理

实际上在了解了前面的成员变量以及注释和构造函数之后,ThreadPoolExecutor的基本结构就非常清除了。个人觉得相比concurrentHashMap要简单很多。ThreadpoolExecutor的组成如下:

由一个HashSet为数据结构的workersPool来存放所有的线程。然后将所有Runnable的任务都放在构造函数定义的workQueue中,这个workQueue可以是任意的阻塞队列。总之可以自行定义。那么,当线程池正常启动之后,这个数据结构就开始工作。当外部线程添加一个Runnable的task提交sumbit方法的时候。此时,submit方法中有三个选项:

  • 1.新线程执行:一是判断,当前pool中的线程数量如果小于corePoolSize那么会调用构造函数定义的线程创建的工厂方法来创建一个线程。将这个线程添加到workers中。用这个新线程执行任务,即便works中有线程空闲。
  • 2.加入队列等待:当提交任务的时候,当线程池中workers的数量达到corePoolSize的时候,如果此时workQueue中不满,则将任务存入workQueue中。添加到队列的尾部。
  • 3.如果任务队列已满,而corePoolSize小于maxmumCoreSize,则用构造函数提供的构造方法创建一个新的线程,用这个新线程来执行提交的任务。
  • 4.线程池workPool中的workers在执行完当前任务之后处于空闲的情况下,将从workQueue中取出队首的任务。
  • 5.如果线程池workPool达到maxmumPoolSize,同时workQueue中的任务也存满,达到最大值的时候,此时将触发拒绝策略。执行任务的线程根据上述的4种拒绝策略来执行是否将任务丢弃或者用当前线程执行任务。
  • 6.如果allowCoreThreadTimeOut没有设置,默认为false,则空闲时间超过keepAliveTime的线程将会自动销毁。以保持线程池中的线程数维持在corePoolSize。
  • 7.如果设置了allowCoreThreadTimeOut为true,则所有的线程在时间超过keepAliveTime之后,都会自动销毁。如果线程池空闲,则线程池不会占用任何资源。

以上7点是ThreadPoolExecutor的核心。也是面试过程中经常被问到的部分。 结合在第一部分中线程的状态,各状态之间的切换如下图:

5.重要方法

在理解了线程池工作的基本原理之后,现在对线程池的一些常用方法进行分析。

5.1 execute

public void execute(Runnable command) {
    //如果任务为空,抛出异常
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
     //对于任务的处理,有三个步骤来处理,如果当前工作线程低于corepoolSize,则创建一个新的线程
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
    //创建新线程执行任务,成功则返回,否则再次获取线程池的状态和线程数
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    //如果线程池处于run状态,且工作队列workQueue可以添加
    if (isRunning(c) && workQueue.offer(command)) {
      //再次获取线程池状态和线程数
        int recheck = ctl.get();
        //判断运行状态,如果不处于运行状态且可以从阻塞队列中删除任务。则执行拒绝策略
        if (! isRunning(recheck) && remove(command))
           //执行拒绝策略
            reject(command);
        //如果线程池中的线程数量为0,则创建线程
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    //如果阻塞队列已满,无法添加,则执行拒绝策略
    else if (!addWorker(command, false))
        reject(command);
}

上述过程可以用流程图表示如下:

5.2 addWorker

private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        //检查线程池状态,如果处于SHUTDOWN的时候firstTask为null和workQueue不为空。或者线程池状态大于SHUTDOWN,这些状态下都不允许创建线程,因此直接返回false。此处相当于double check。
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;
        //死循环 采用CAS的方式修改count
        for (;;) {
            //获取count 这个方法是采用位运算
            int wc = workerCountOf(c);
            //如果数量大于总体容量2^29-1
            if (wc >= CAPACITY ||
                //此处根据传入的core决定通过corePoolSize还是maxmumPoolSize判断
                wc >= (core ? corePoolSize : maximumPoolSize))
                //上述这些条件如果为真则返回false
                return false;
            //通过cas的方式修改count,如果修改成功,则跳出死循环
            if (compareAndIncrementWorkerCount(c))
                //注意此处break和continue的区别  break是跳出当前的retry而continue则是继续执行
                break retry;
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
    }
    //初始化变量
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        //创建一个worker 其task即是firstTask
        w = new Worker(firstTask);
        //线程t指向worker的线程
        final Thread t = w.thread;
        //如果线程不为空
        if (t != null) {
           //获取mainLock
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                //获取线程池状态
                int rs = runStateOf(ctl.get());
                //如果rs为RUNNING状态或者处于SHUTDOWN的时候,firstTask为空
                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    //线程是否处于活动状态  如果线程以及run了则说明该线程已经启动,则会出问题,因为这个worker是新new的,并没有运行线程的run
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    //将worker添加到workers
                    workers.add(w);
                    //计算workers的size
                    int s = workers.size();
                    //判断size是否大于lagestPoolSize,那么修改LargestPoolSize
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    //worker的添加状态为true
                    workerAdded = true;
                }
            //不要忘记解锁
            } finally {
                mainLock.unlock();
            }
            //如果线程加入pool成功,则启动线程,并修改线程的启动状态为true
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        //如果线程启动状态不为true,则判定启动失败,将worker添加到失败的worker中
        if (! workerStarted)
            addWorkerFailed(w);
    }
    //返回worker的启动状态做为线程的执行结果
    return workerStarted;
}

经过这个方法,我们可以看到,实际上mainLock是在向workers的pool中添加和删除队列的时候才会用到。在添加或者删除之前要获取这个锁。 另外,本文开始定义了标签语句 retry: ,需要注意的是,通过continue会结束当前的循环重新开始retry。而break则会跳出retry。结束循环。

5.3 runWorker

我们在前面静态内部类部分,对worker进行了分析。那么实际上,worker在被启动之后,是怎么执行的呢?

public void run() {
    runWorker(this);
}

实际上调用的就是这个runWorker方法。

final void runWorker(Worker w) {
    //定义线程为wt
    Thread wt = Thread.currentThread();
    //定义任务为task 
    Runnable task = w.firstTask;
    w.firstTask = null;
    //此处调用unlock,这不是mainLock,这是worker继承了AQS,实际上是AQS中的release方法,还记得所有的worker中被初始化的状态是-1吗?此处调用release方法,就将-1改为了1,这样后面的中断方法就能对处于运行状态的线程进行中断
    w.unlock(); // allow interrupts
    //
    boolean completedAbruptly = true;
    try {
        //如果task不为null,或者task执行getTask也不为空。需要注意的是此处的getTask,是从队列中去获取的。因此,此处的逻辑就是,如果这个worker有任务那么就执行,执行完之后就遍历队列,继续执行任务。
        while (task != null || (task = getTask()) != null) {
            //执行任务的过程中加锁,那么加锁之后就不允许中断了
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            //判断线程池运行状态,并结合线程的中断状态,之后对线程进行中断
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                //前置操作
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    //调用run方法开始执行
                    task.run();
                //异常处理
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    //后置操作 
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        //处理完之后,当线程退出的时候,触发processWorkerExit方法,从workers中移除当前的worker
        processWorkerExit(w, completedAbruptly);
    }
}

5.4 processWorkerExit

我们再看看上述方法提到的线程退出的时候执行的方法

private void processWorkerExit(Worker w, boolean completedAbruptly) {
    if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
        decrementWorkerCount();
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //更新completedTaskCount 每个worker会维护一个completedTasks,当worker退出的时候,更新到线程池中
        completedTaskCount += w.completedTasks;
        //从队列中移除worker
        workers.remove(w);
    } finally {
        //解锁
        mainLock.unlock();
    }
    //尝试线程池是否可以进入terminate
    tryTerminate();
    //获得锁状态和线程数
    int c = ctl.get();
    //判断锁状态
    if (runStateLessThan(c, STOP)) {
        if (!completedAbruptly) {
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        //添加worker
        addWorker(null, false);
    }
}

5.5 tryTerminate

这个方法是每次线程退出的时候就触发,尝试判断线程池是否可以Terminate

final void tryTerminate() {
    //死循环 
    for (;;) {
        //获得运行状态和线程数,之后进行条件判断,如果添加不满足则退出
        int c = ctl.get();
        if (isRunning(c) ||
            runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
        //如果条件满足,但是workercount不为0,则对所有空闲的线程执行中断方法
        if (workerCountOf(c) != 0) { // Eligible to terminate
            interruptIdleWorkers(ONLY_ONE);
            return;
        }
        //获得锁
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            //采用cas进行更新
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    terminated();
                } finally {
                    ctl.set(ctlOf(TERMINATED, 0));
                    //这个termination 在awaitTermination中会等待keepAlive的时间。此处将唤醒这些等待的线程
                    termination.signalAll();
                }
                return;
            }
        } finally {
           //解锁
            mainLock.unlock();
        }
        // else retry on failed CAS
    }
}

5.6 awaitTermination

public boolean awaitTermination(long timeout, TimeUnit unit)
    throws InterruptedException {
    //将超时时间转换为纳秒
    long nanos = unit.toNanos(timeout);
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //死循环
        for (;;) {
            //判断运行状态,如果已经处于TERMINATED状态则直接退出
            if (runStateAtLeast(ctl.get(), TERMINATED))
                return true;
            //计算的纳秒时间必须大于0
            if (nanos <= 0)
                return false;
            //通过lock的条件变量等待,将当前线程变为TIME_WAITING状态
            nanos = termination.awaitNanos(nanos);
        }
    } finally {
        //解锁
        mainLock.unlock();
    }
}

同上述方法可以看出,mainLock是对worker操作的时候使用的,在添加和删除workers的时候需要获得锁。此外,当调用awaitTermination()方法的时候,对空闲的worker执行WAIT操作也采用lock的条件变量termination来执行。之后当线程进入TERMINATION状态的时候统一唤醒。

5.7 getTask

最后再分析一个关键方法。getTask,也就是前面的worker获取任务的方法。这个方法非常重要。

private Runnable getTask() {
    //是否需要使用超时时间 
    boolean timedOut = false; // Did the last poll() time out?
    //死循环
    for (;;) {
        //获得线程状态和count
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        //状态判断 当rs大于SHUTDOWN或者STOP状态的时候判断workQueue是否为空
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            //缩减workerCount
            decrementWorkerCount();
            return null;
        }
        //获取wc
        int wc = workerCountOf(c);

        // Are workers subject to culling?
        //判断wc是否大于corePoolSize,决定timed的状态,也就是说,当wc小于corePoolSize的时候,就不考虑阻塞队列的超时
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
        //如果wc大于最大线程或者timed的状态不满足 则continue
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            //如果cas减少workerCount失败则退出
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
            //通过这个循环将wc减少
        } 

        try {
            //如果需要超时获取,则按超时的方式获取任务,这也是阻塞队列的作用,如果有超时时间则会一直阻塞
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            //r为从队列中拿到的任务
            if (r != null)
                return r;
            timedOut = true;
            //如果被中断 则设置timedOut为false
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}

实际上通过这个方法的代码可以发现,之所以要使用阻塞队列,原因就在于,这个方法中如果需要通过使用keepAlive的时间,那么此方法就会用pull(timeout)方法来阻塞当前调用线程。这样就能将每个worker阻塞再getTask的过程中。 重点需要注意getTask被阻塞的条件:

 allowCoreThreadTimeOut || wc > corePoolSize;

开启核心线程超时或者当前线程数大于核心线程。

5.8 interruptIdleWorkers

如果5.7方法中有线程进入了TIME_WAITING状态。那么如果需要及时使用,那么就需要对这些阻塞的线程进行中断。

private void interruptIdleWorkers(boolean onlyOne) {
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //遍历workers
        for (Worker w : workers) {
            Thread t = w.thread;
            //获得AQS的锁,如果能获得AQS的锁且不处于中断状态则进行中断
            if (!t.isInterrupted() && w.tryLock()) {
                try {
                   //调用线程的中断方法
                    t.interrupt();
                } catch (SecurityException ignore) {
                } finally {
                   //worker的AQS解锁
                    w.unlock();
                }
            }
            //是否只执行1次的标识
            if (onlyOne)
                break;
        }
    } finally {
        //解锁
        mainLock.unlock();
    }
}

调用中断的时候,需要回去两层锁,一层是mainLock,另外一层是线程的AQS锁,如果被占用则该线程不会被打断。这样一来就可以明白,再最开始刚创建的worker是不会被打断的,另外处于工作中的线程也不会被打算,只有wait状态的worker才会被打断。 而interruptIdleWorkers方法被调用的时机:

  • tryTerminate
  • shutdown
  • setCorePoolSize -> workerCountOf(ctl.get()) > corePoolSize
  • allowCoreThreadTimeOut
  • setMaximumPoolSize -> workerCountOf(ctl.get()) > maximumPoolSize
  • setKeepAliveTime

6.总结

本文对ThreadPoolExecutor线程池的源码进行了分析。相对于ConcurrentHashMap这个类的代码并不是特别复杂。实际上ThreadpoolExecutor由一个hashSet构成的workerPool和一个自定义的阻塞队列workQueue组成。基本结构如下:

worker本身继承了AQS,然后还实现了Runnable接口。之后worker会不断的从任务队列workQueue中去获取任务。有两种获取任务的方法,getTask()。当开启了核心线程keepAlive或者当前线程数大于核心线程corepoolSize的时候,就会触发keepAliveTime的阻塞。被阻塞的线程就可以认为是空闲的线程,这些线程被阻塞队列阻塞住,这也是为什么线程池要用阻塞阻塞队列的原因。不需要被阻塞的线程则可以不断的执行run,源源不断的从工作队列workQueue中获取task执行。 由于worker实现AQS,其作用是在于当大部分线程都处于休眠的时候,线程池中活动的线程降低到队列核心线程数以下之后,此时就通过中断的方式唤醒哪些没有工作的线程。在关闭或者调整线程池的时候都适用。而AQS的目的就是在打断的时候,需要获得AQS锁,而新创建的worker和处于正常运行中的worker则会导致获取锁失败,不会被打断。 ReentrantLocak mainLock的作用在于,对workers的操作,增减或者打断都需要获得这个锁才能执行。而这个锁上的条件变量termination的唯一作用是在tryTermination中调用。 最后需要注意的是,线程池采用了大量的cas操作,最关键的是将线程的状态和workerSize合并到一个AotmicInteger对象上。通过位运算,最高三位标识状态。因此worker的大小也就是2^29-1。这个位运算的操作也是我们值得借鉴的地方。与HashMap中的位运算操作一样,都是让在看代码的时候觉得,代码还能这样写。特别提升设计能力的地方。

最关键的部分:

  • 5个状态及切换过程
  • 7个操作(基本原理一节)
  • 4个拒绝策略
  • 7个参数(构造函数)
  • 用阻塞队列的原因
  • 两种锁:AQS和mainLock
  • 添加任务、gettask、执行任务、以及中断过程

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • 多线程基础(一): 线程概念及生命周期

    什么是进程,相信大家都知道什么是进程却很难解释清楚。百科中的解释是:进程(Process)是计算机中的程序关于某数据集合上的一次运行活动,是系统进行资源分配和调...

    冬天里的懒猫
  • A Java Fork/Join Framework(Doug Lea 关于java Fork/Join框架的论文翻译)

    Doug Lea State University of New York at Oswego Oswego NY 13126 315−341−2688...

    冬天里的懒猫
  • java线程池(一):java线程池基本使用及Executors

    在前面学习线程组的时候就提到过线程池。实际上线程组在我们的日常工作中已经不太会用到,但是线程池恰恰相反,是我们日常工作中必不可少的工具之一。现在开始对线...

    冬天里的懒猫
  • JAVA三年面试总结,金九银十,你准备好了吗?

    Mshu
  • 让人头大的各种锁,从这里让你思绪清晰

    说到了锁我们经常会联想到生活中的锁,在我们日常中我们经常会接触到锁。比如我们的手机锁,电脑锁,再比如我们生活中的门锁,这些都是锁。

    乱敲代码
  • Vista 及后续版本的新线程池

    在上一篇的博文中,说了下老版本的线程池,在Vista之后,微软重新设计了一套线程池机制,并引入一组新的线程池API,新版线程池相对于老版本的来说,它的可控性更高...

    Masimaro
  • 【你问我答】这些Java并发问题,专家是这么回答的

    针对上期Java高并发【你问我答】中读者提出的问题,王锐同学的回答如下。 一 ---- 美团内部使用过Akka么?有什么坑? ——Absurd “ 答: 只简...

    美团技术团队
  • Python | 面试必问,线程与进程的区别,Python中如何创建多线程?

    其实关于元类还有很多种用法,比如说如何在元类当中设置参数啦,以及一些规约的用法等等。只不过这些用法比较小众,使用频率非常低,所以我们不过多阐述了,可以在用到的时...

    TechFlow-承志
  • 彻底理解Java线程池原理篇

    核心线程(corePool):线程池最终执行任务的角色肯定还是线程,同时我们也会限制线程的数量,所以我们可以这样理解核心线程,有新任务提交时,首先检查核心线程数...

    三好码农
  • Java多线程知识点

    进程和线程的区别?多线程有什么好处? 进程:正在进行中的程序(直译)。 线程:就是进程中一个负责程序执行的控制单元(执行路径)

    六月的雨

扫码关注云+社区

领取腾讯云代金券