Background:
This is PySpark code run from a plain Python process, not from the pyspark shell. Each piece of code works correctly on its own. However, "sometimes," when the code finishes and exits, the following error appears, even after a time.sleep(10):
{{py4j.java_gateway:1038}} INFO - Error while receiving.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1035, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
Py4JNetworkError: Answer from Java side is empty
[2018-11-22 09:06:40,293] {{root:899}} ERROR - Exception while sending command.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 883, in send_command
response = connection.send_command(command)
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1040, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
[2018-11-22 09:06:40,293] {{py4j.java_gateway:443}} DEBUG - Exception while shutting down a socket
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 441, in quiet_shutdown
socket_instance.shutdown(socket.SHUT_RDWR)
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
File "/usr/lib64/python2.7/socket.py", line 170, in _dummy
raise error(EBADF, 'Bad file descriptor')
error: [Errno 9] Bad file descriptor
I suspect the cause is that the parent Python process tries to fetch log messages from the already-terminated JVM child process. The confusing part is that the error does not always occur.
Any suggestions?
Answered on 2018-12-03 10:16:18
The root cause is the py4j log level.
I had set the Python logging level to DEBUG, which caused the py4j client and the Java side to surface connection errors while PySpark was shutting down.
Therefore, setting the Python logging level to INFO or higher resolves the problem.
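A minimal sketch of the fix, assuming the logging level is adjusted before the SparkContext is created; raising just the "py4j" logger is enough if only that logger was set to DEBUG:

```python
import logging

# If the root logger was configured at DEBUG, keep it at INFO instead,
# so py4j's harmless shutdown errors are not surfaced.
logging.basicConfig(level=logging.INFO)

# Alternatively, silence only the py4j client while leaving the rest
# of the application at DEBUG.
logging.getLogger("py4j").setLevel(logging.INFO)
```

With this in place, the "Answer from Java side is empty" traceback emitted at DEBUG/INFO during JVM teardown is no longer propagated to the application's handlers.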
Reference: Gateway raises an exception during shutdown.
Reference: Turn down the logging level for callback server messages.
Reference: PySpark Internals.
https://stackoverflow.com/questions/53440309