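I am running a pyspark 2.2.0 job in Apache Spark local mode and see the following warning:
WARN RowBasedKeyValueBatch: Calling spill() on RowBasedKeyValueBatch. Will not spill but return 0.
What could be causing this warning? Is it something I should worry about, or can I safely ignore it?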
Posted 2018-01-25 22:50:56
Posted 2018-02-08 20:36:51
My guess is that this message is worse than a plain warning: it is on the verge of being an error.
Take a look at the source code:
    /**
     * Sometimes the TaskMemoryManager may call spill() on its associated MemoryConsumers to make
     * space for new consumers. For RowBasedKeyValueBatch, we do not actually spill and return 0.
     * We should not throw OutOfMemory exception here because other associated consumers might spill
     */
    public final long spill(long size, MemoryConsumer trigger) throws IOException {
      logger.warn("Calling spill() on RowBasedKeyValueBatch. Will not spill but return 0.");
      return 0;
    }
So I would say you are stuck in a loop of "a spill is needed, but nothing actually gets spilled".
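To make the Javadoc comment concrete, here is a minimal toy model in plain Python of a memory manager asking its consumers to spill. It is only a sketch: the `Consumer` class and the `acquire` function are hypothetical names invented for illustration, not Spark APIs, and the sketch makes no claim about how Spark's real TaskMemoryManager orders or retries spill requests. It only shows the idea stated in the comment: a consumer that returns 0 frees nothing, so the request can succeed only if some other consumer is able to spill.

```python
class Consumer:
    """Hypothetical stand-in for a memory consumer (not a Spark class)."""

    def __init__(self, name, used, spillable=True):
        self.name = name
        self.used = used            # bytes currently held
        self.spillable = spillable  # False imitates RowBasedKeyValueBatch

    def spill(self, size):
        """Release up to `size` bytes and return how many were freed."""
        if not self.spillable:
            # Imitates the quoted spill(): log a warning and free nothing.
            print(f"WARN {self.name}: Will not spill but return 0.")
            return 0
        freed = min(size, self.used)
        self.used -= freed
        return freed


def acquire(consumers, capacity, needed):
    """Ask each consumer once to spill until `needed` bytes are free.

    Returns True if the request can be satisfied, False otherwise.
    This single pass is an assumption of the toy model.
    """
    free = capacity - sum(c.used for c in consumers)
    for consumer in consumers:
        if free >= needed:
            break
        free += consumer.spill(needed - free)
    return free >= needed


# A non-spilling batch plus a spillable consumer: the second one makes room.
batch = Consumer("batch", used=60, spillable=False)
sorter = Consumer("sorter", used=40)
ok_with_helper = acquire([batch, sorter], capacity=100, needed=30)

# A non-spilling batch on its own: nothing can be freed.
lonely = Consumer("batch", used=100, spillable=False)
ok_alone = acquire([lonely], capacity=100, needed=10)
```

In this toy model the first request succeeds because `sorter` gives up memory, while the second fails because the only consumer never spills. If the warning in a real job is followed by out-of-memory failures, that second situation is the one to suspect.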
https://stackoverflow.com/questions/46907447