JVM的Finalization Delay引起的OOM

今天在压力测试环境某一个服务出现crash了,经过一番检查,终于发现是由于JVM的Finalization Delay引起的,这个问题比较特殊,这里记录一下。

这个服务是用Java写的,主要完成的功能是根据特定的指令文件生成mp4文件,用到的java库主要有javacv,这个库底层其实是使用JNI调用操作系统里安装的ffmpeg。

检查日志文件

首先检查日志文件,发现日志里出现了OOM的报错

java.lang.OutOfMemoryError: null
	at sun.misc.Unsafe.allocateMemory(Native Method) ~[na:1.7.0_79]
	at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127) ~[na:1.7.0_79]
	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_79]
	at org.bytedeco.javacv.Frame.<init>(Frame.java:105) ~[javacv-1.1.jar:1.1]
	at org.bytedeco.javacv.Java2DFrameConverter.getFrame(Java2DFrameConverter.java:712) ~[javacv-1.1.jar:1.1]
	at org.bytedeco.javacv.Java2DFrameConverter.getFrame(Java2DFrameConverter.java:679) ~[javacv-1.1.jar:1.1]
	at org.bytedeco.javacv.Java2DFrameConverter.getFrame(Java2DFrameConverter.java:673) ~[javacv-1.1.jar:1.1]
	at org.bytedeco.javacv.Java2DFrameConverter.convert(Java2DFrameConverter.java:62) ~[javacv-1.1.jar:1.1]

很明显这里是申请DirectByteBuffer出现了OOM,所以应该是Direct Memory申请太多了。为了确认问题,将服务跑起来,使用jconsole看了下JVM的堆内存使用情况,发现堆内存使用一直都是比较稳定的,但使用top -p ${pid}查看进程占用的内存,发现RES字段的值一直是在增长的,而且增长得很快,不到半个小时值就从原来的500M增长到1.6G。

分析代码

接下来看一下关键代码

public void encodeFrame(BufferedImage image, long frameTime) {
    try {
        long t = frameTime * 1000L;
        if(t>recorder.getTimestamp()) {
            recorder.setTimestamp(t);
        }
        Frame frame = java2dConverter.convert(image);
        recorder.record(frame);
    } catch (FrameRecorder.Exception e) {
        log.error("JavaCVMp4Encoder encode frame error.", e);
    }
}

业务层是不停地调用encodeFrame将每一张图片编码到mp4文件里。

java2dConverter.convert(image);这句代码的实现里会申请一个DirectByteBuffer。如下面的代码。

public Frame(int width, int height, int depth, int channels) {
    int pixelSize = Math.abs(depth) / 8;
    this.imageWidth = width;
    this.imageHeight = height;
    this.imageDepth = depth;
    this.imageChannels = channels;
    this.imageStride = ((imageWidth * imageChannels * pixelSize + 7) & ~7) / pixelSize; // 8-byte aligned
    this.image = new Buffer[1];

    ByteBuffer buffer = ByteBuffer.allocateDirect(imageHeight * imageStride * pixelSize).order(ByteOrder.nativeOrder());
    switch (imageDepth) {
        case DEPTH_BYTE:
        case DEPTH_UBYTE:  image[0] = buffer;                  break;
        case DEPTH_SHORT:
        case DEPTH_USHORT: image[0] = buffer.asShortBuffer();  break;
        case DEPTH_INT:    image[0] = buffer.asIntBuffer();    break;
        case DEPTH_LONG:   image[0] = buffer.asLongBuffer();   break;
        case DEPTH_FLOAT:  image[0] = buffer.asFloatBuffer();  break;
        case DEPTH_DOUBLE: image[0] = buffer.asDoubleBuffer(); break;
        default: throw new UnsupportedOperationException("Unsupported depth value: " + imageDepth);
    }
}

这里ByteBuffer.allocateDirect方法申请的DirectByteBuffer并不是Java堆内存,而是直接在C堆上申请的。而DirectByteBuffer申请的C堆内存释放很特殊,并不是简单地由JVM GC完成的。

先看一下DirectByteBuffer的定义

class DirectByteBuffer extends MappedByteBuffer implements DirectBuffer {
  ...
  protected static final Unsafe unsafe = Bits.unsafe();
  ...
  private static class Deallocator
        implements Runnable
    {

        private static Unsafe unsafe = Unsafe.getUnsafe();

        private long address;
        private long size;
        private int capacity;

        private Deallocator(long address, long size, int capacity) {
            assert (address != 0);
            this.address = address;
            this.size = size;
            this.capacity = capacity;
        }

        public void run() {
            if (address == 0) {
                // Paranoia
                return;
            }
            unsafe.freeMemory(address);
            address = 0;
            Bits.unreserveMemory(size, capacity);
        }

    }

    private final Cleaner cleaner;
    ...

    DirectByteBuffer(int cap) {                   // package-private

        super(-1, 0, cap, cap);
        boolean pa = VM.isDirectMemoryPageAligned();
        int ps = Bits.pageSize();
        long size = Math.max(1L, (long)cap + (pa ? ps : 0));
        Bits.reserveMemory(size, cap);

        long base = 0;
        try {
            base = unsafe.allocateMemory(size);
        } catch (OutOfMemoryError x) {
            Bits.unreserveMemory(size, cap);
            throw x;
        }
        unsafe.setMemory(base, size, (byte) 0);
        if (pa && (base % ps != 0)) {
            // Round up to page boundary
            address = base + ps - (base & (ps - 1));
        } else {
            address = base;
        }
        cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
        att = null;



    }
    ...
}

可以看到创建DirectByteBuffer对象时实际上使用unsafe.allocateMemory申请一块C堆内存的。DirectByteBuffer对象内部有一个Cleaner cleaner,看样子应该是这个东东负责对申请申请的C堆内存进行释放的。看一下Cleaner的定义:

public class Cleaner extends PhantomReference<Object> {
...
    private static final ReferenceQueue<Object> dummyQueue = new ReferenceQueue();
        private Cleaner(Object var1, Runnable var2) {
        super(var1, dummyQueue);
        this.thunk = var2;
    }

    public static Cleaner create(Object var0, Runnable var1) {
        return var1 == null?null:add(new Cleaner(var0, var1));
    }

    public void clean() {
        if(remove(this)) {
            try {
                this.thunk.run();
            } catch (final Throwable var2) {
                AccessController.doPrivileged(new PrivilegedAction() {
                    public Void run() {
                        if(System.err != null) {
                            (new Error("Cleaner terminated abnormally", var2)).printStackTrace();
                        }

                        System.exit(1);
                        return null;
                    }
                });
            }

        }
    }
...
}

原来Cleaner实际上是一个PhantomReference

PhantomReference虚引用主要用来跟踪对象被垃圾回收器回收的活动。虚引用与软引用和弱引用的一个区别在于:虚引用必须和引用队列 (ReferenceQueue)联合使用。当垃圾回收器准备回收一个对象时,如果发现它还有虚引用,就会在回收对象的内存之前,把这个虚引用加入到与之关联的引用队列中。

在这个场景里也就是说当JVM垃圾回收器准备回收某个DirectByteBuffer对象时,发现这个DirectByteBuffer对象有虚引用,就会将虚引用加入到与之关联的引用队列中。将虚引用加入到与之关联的引用队列中有什么作用?看一下Reference的实现代码

public abstract class Reference<T> {
  ...
  private T referent;
  ...

  static private class Lock { }
    private static Lock lock = new Lock();


    /* List of References waiting to be enqueued.  The collector adds
     * References to this list, while the Reference-handler thread removes
     * them.  This list is protected by the above lock object. The
     * list uses the discovered field to link its elements.
     */
    private static Reference<Object> pending = null;

    /* High-priority thread to enqueue pending References
     */
    private static class ReferenceHandler extends Thread {

        private static void ensureClassInitialized(Class<?> clazz) {
            try {
                Class.forName(clazz.getName(), true, clazz.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
            }
        }

        static {
            // pre-load and initialize InterruptedException and Cleaner classes
            // so that we don't get into trouble later in the run loop if there's
            // memory shortage while loading/initializing them lazily.
            ensureClassInitialized(InterruptedException.class);
            ensureClassInitialized(Cleaner.class);
        }

        ReferenceHandler(ThreadGroup g, String name) {
            super(g, name);
        }

        public void run() {
            while (true) {
                tryHandlePending(true);
            }
        }
    }

    /**
     * Try handle pending {@link Reference} if there is one.<p>
     * Return {@code true} as a hint that there might be another
     * {@link Reference} pending or {@code false} when there are no more pending
     * {@link Reference}s at the moment and the program can do some other
     * useful work instead of looping.
     *
     * @param waitForNotify if {@code true} and there was no pending
     *                      {@link Reference}, wait until notified from VM
     *                      or interrupted; if {@code false}, return immediately
     *                      when there is no pending {@link Reference}.
     * @return {@code true} if there was a {@link Reference} pending and it
     *         was processed, or we waited for notification and either got it
     *         or thread was interrupted before being notified;
     *         {@code false} otherwise.
     */
    static boolean tryHandlePending(boolean waitForNotify) {
        Reference<Object> r;
        Cleaner c;
        try {
            synchronized (lock) {
                if (pending != null) {
                    r = pending;
                    // 'instanceof' might throw OutOfMemoryError sometimes
                    // so do this before un-linking 'r' from the 'pending' chain...
                    c = r instanceof Cleaner ? (Cleaner) r : null;
                    // unlink 'r' from 'pending' chain
                    pending = r.discovered;
                    r.discovered = null;
                } else {
                    // The waiting on the lock may cause an OutOfMemoryError
                    // because it may try to allocate exception objects.
                    if (waitForNotify) {
                        lock.wait();
                    }
                    // retry if waited
                    return waitForNotify;
                }
            }
        } catch (OutOfMemoryError x) {
            // Give other threads CPU time so they hopefully drop some live references
            // and GC reclaims some space.
            // Also prevent CPU intensive spinning in case 'r instanceof Cleaner' above
            // persistently throws OOME for some time...
            Thread.yield();
            // retry
            return true;
        } catch (InterruptedException x) {
            // retry
            return true;
        }

        // Fast path for cleaners
        if (c != null) {
            c.clean();
            return true;
        }

        ReferenceQueue<? super Object> q = r.queue;
        if (q != ReferenceQueue.NULL) q.enqueue(r);
        return true;
    }

    static {
        ThreadGroup tg = Thread.currentThread().getThreadGroup();
        for (ThreadGroup tgn = tg;
             tgn != null;
             tg = tgn, tgn = tg.getParent());
        Thread handler = new ReferenceHandler(tg, "Reference Handler");
        /* If there were a special system-only priority greater than
         * MAX_PRIORITY, it would be used here
         */
        handler.setPriority(Thread.MAX_PRIORITY);
        handler.setDaemon(true);
        handler.start();

        // provide access in SharedSecrets
        SharedSecrets.setJavaLangRefAccess(new JavaLangRefAccess() {
            @Override
            public boolean tryHandlePendingReference() {
                return tryHandlePending(false);
            }
        });
    }
    ...
    Reference(T referent, ReferenceQueue<? super T> queue) {
        this.referent = referent;
        this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
    }

}

这里代码看着有些糊涂,并没有代码给pending这个类变量赋值,为啥ReferenceHandler这个线程执行体里又在读取它的值,但看了看private static Reference<Object> pending = null;这一行上面的注释,想了想终于明白了,原来JVM垃圾回收器将将虚引用加入到与之关联的引用队列后,JVM垃圾回收器又负责逐个将引用队列中的引用拿出来赋于pending,然后通知ReferenceHandler线程,ReferenceHandler线程拿到引用后,发现如果是Cleaner,则调用其clean方法。然后终于与DirectByteBuffer里的Deallocator接上了,最终DirectByteBuffer申请的C堆内存被释放。

既然DirectByteBuffer申请的C堆内存释放是自动的,为啥在这个场景里会出现OOM呢?查阅java的bug记录,终于找到原因。http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4857305http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4469299

意思是如果DirectByteBuffer创建得过于频繁,服务器的CPU太繁忙,C堆内存还是会OOM的,原因是JVM来不及进行GC及Finalization,大量对象的销毁工作被推后,最终C堆内存无法得到释放。

解决方案

bug记录提到了3个解决方案:

Insert occasional explicit System.gc() invocations to ensure that direct buffers are reclaimed.

Reduce the size of the young generation to force more frequent GCs.

Explicitly pool direct buffers at the application level.

我这里采用了第一个解决方案,代码如下:

public void encodeFrame(BufferedImage image, long frameTime) {
    try {
        long t = frameTime * 1000L;
        if(t>recorder.getTimestamp()) {
            recorder.setTimestamp(t);
        }
        Frame frame = java2dConverter.convert(image);
        recorder.record(frame);
        if(System.currentTimeMillis() - lastGCTime > 60000){
            System.gc();
            System.runFinalization();
            lastGCTime = System.currentTimeMillis();
            Thread.yield();
            TimeUnit.SECONDS.sleep(3);
        }
    } catch (FrameRecorder.Exception e) {
        log.error("JavaCVMp4Encoder encode frame error.", e);
    }
}

意思是说每隔1分钟显式地调用System.gc();System.runFinalization();,并让出CPU休息3秒钟。经过长达10几个小时的测试,目前一切都正常了。

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

发表于

我来说两句

0 条评论
登录 后参与评论

相关文章

来自专栏mini188

java中的锁

java中有哪些锁 这个问题在我看了一遍<java并发编程>后尽然无法回答,说明自己对于锁的概念了解的不够。于是再次翻看了一下书里的内容,突然有点打开脑门的感觉...

54090
来自专栏芋道源码1024

Spring Webflux —— 源码阅读之 handler 包

查找给定请求的handler,如果找不到特定的请求,则返回一个空的Mono。这个方法被getHandler(org.springframework.web.se...

42350
来自专栏Spark生态圈

[Spark SQL] 源码解析之Analyzer

Analyzer模块将Unresolved LogicalPlan结合元数据catalog进行绑定,最终转化为Resolved LogicalPlan。跟着代码...

14220
来自专栏coolblog.xyz技术专栏

MyBatis 源码分析 - SQL 的执行过程

本篇文章较为详细的介绍了 MyBatis 执行 SQL 的过程。该过程本身比较复杂,牵涉到的技术点比较多。包括但不限于 Mapper 接口代理类的生成、接口方法...

86620
来自专栏喔家ArchiSelf

从构造函数看线程安全

线程是编程中常用而且强大的手段,在使用过程中,我们经常面对的就是线程安全问题了。对于Java中常见的数据结构而言,一般的,ArrayList是非线程安全的,Ve...

12120
来自专栏Android 研究

OKHttp源码解析(五)--OKIO简介及FileSystem

okio是由square公司开发的,它补充了java.io和java.nio的不足,以便能够更加方便,快速的访问、存储和处理你的数据。OKHttp底层也是用该库...

28430
来自专栏源码之家

word如何自动分割成多个文档

47450
来自专栏Android 研究

APK安装流程详解5——Installer、InstallerConnection和Installd守护进程

因为Installer继承自SystemService,所以我们看下Installer的onStart方法 代码在Installer.java 396行

18310
来自专栏iOS开发

iOS开发之 Method Swizzling 深入浅出

如果产品经理突然说:"在所有页面添加统计功能,也就是用户进入这个页面就统计一次"。我们会想到下面的一些方法:

49370
来自专栏Android知识点总结

Java总结IO篇之File类和Properties类

打开颜色选择器 :读流I-->字符串分割-->字符串存入Map-->使用Map对象还原用户配置 修改配置时 :写流O-->创建Map对象-->字符...

19520

扫码关注云+社区

领取腾讯云代金券