I tried upgrading to Apache Spark 1.6.0 RC3. My application now hits these errors on almost every task:
Managed memory leak detected; size = 15735058 bytes, TID = 830
I have set the logging level for org.apache.spark.memory.TaskMemoryManager to DEBUG and see this in the logs:
I2015-12-18 16:54:41,125 TaskSetManager: Starting task 0.0 in stage 7.0 (TID 6, localhost, partition 0,NODE_LOCAL, 3026 bytes)
I2015-12-18 16:54:41,125 Executor: Running task 0.0 in stage 7.0 (TID 6)
I2015-12-18 16:54:41,130 ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
I2015-12-18 16:54:41,130 ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
D2015-12-18 16:54:41,188 TaskMemoryManager: Task 6 acquire 5.0 MB for null
I2015-12-18 16:54:41,199 ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
I2015-12-18 16:54:41,199 ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
D2015-12-18 16:54:41,262 TaskMemoryManager: Task 6 acquire 5.0 MB for null
D2015-12-18 16:54:41,397 TaskMemoryManager: Task 6 release 5.0 MB from null
E2015-12-18 16:54:41,398 Executor: Managed memory leak detected; size = 5245464 bytes, TID = 6
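For reference, the DEBUG output above can be enabled through Spark's log4j configuration (Spark 1.6 ships with log4j 1.x); a minimal sketch, assuming the file lives at conf/log4j.properties:

```properties
# Leave the root level as-is, but turn on DEBUG for the task memory manager
# so every acquire/release (as in the log output above) is recorded.
log4j.logger.org.apache.spark.memory.TaskMemoryManager=DEBUG
```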
How do I debug these errors? Is there a way to log a stack trace for each allocation and release, so I can find what is leaking?
I don't know much about the new unified memory manager (SPARK-10000). Is the leak most likely my fault, or is it more likely a Spark bug?
Posted on 2020-05-18 14:10:58
I also ran into this warning message, but in my case it was caused by df.repartition(rePartNum, df("id")). My df was empty, and the number of warning lines was equal to rePartNum. Version: Spark 2.4 on Windows 10.
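The situation described in that comment can be sketched as follows (a sketch only, assuming a local Spark 2.4 session; `rePartNum` and the `id` column come from the comment above):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val rePartNum = 8
// An empty DataFrame with an "id" column, as in the comment above.
val df = Seq.empty[Long].toDF("id")

// Repartitioning an empty DataFrame by column; per the comment, on
// Spark 2.4 this emitted one "Managed memory leak detected" warning
// per resulting partition (rePartNum in total).
df.repartition(rePartNum, df("id")).count()
```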
https://stackoverflow.com/questions/34359211