gRPC: Rendezvous terminated with (StatusCode.INTERNAL, Received RST_STREAM with error code 2)

Stack Overflow user
Asked 2018-01-09 18:05:56
2 answers, 11.1K views, 0 followers, 7 votes

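I am implementing a gRPC client and server in Python. The server receives the data from the client successfully, but the client gets "RST_STREAM with error code 2".

What exactly does this mean, and how do I fix it?

Here is my proto file: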

service MyApi {
    rpc SelectModelForDataset (Dataset) returns (SelectedModel) {
    }
}
message Dataset {
    // ...
}
message SelectedModel {
    // ...
}

My service implementation looks like this:

class MyApiServicer(my_api_pb2_grpc.MyApiServicer):
    def SelectModelForDataset(self, request, context):
        print("Processing started.")
        selectedModel = ModelSelectionModule.run(request, context)
        print("Processing Completed.")
        return selectedModel

I start the server with the following code:

import grpc
from concurrent import futures
#...
server = grpc.server(futures.ThreadPoolExecutor(max_workers=100))
my_api_pb2_grpc.add_MyApiServicer_to_server(MyApiServicer(), server)
server.add_insecure_port('[::]:50051')
server.start()

My client looks like this:

channel = grpc.insecure_channel(target='localhost:50051')
stub = my_api_pb2_grpc.MyApiStub(channel)
dataset = my_api_pb2.Dataset() 
# fill the object ...
model = stub.SelectModelForDataset(dataset)  # call server

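After the client makes the call, the server starts processing and runs to completion (it takes about a minute), but the client returns immediately with the following error: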

Traceback (most recent call last):
  File "Client.py", line 32, in <module>
    run()
  File "Client.py", line 26, in run
    model = stub.SelectModelForDataset(dataset)  # call server
  File "/usr/local/lib/python3.5/dist-packages/grpc/_channel.py", line 484, in __call__
    return _end_unary_response_blocking(state, call, False, deadline)
  File "/usr/local/lib/python3.5/dist-packages/grpc/_channel.py", line 434, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INTERNAL, Received RST_STREAM with error code 2)>

If I make the request asynchronously and wait on the future,

model_future = stub.SelectModelForDataset.future(dataset)  # call server
model = model_future.result()

the client waits until processing completes, but it still gets an error afterwards:

Traceback (most recent call last):
  File "AsyncClient.py", line 35, in <module>
    run()
  File "AsyncClient.py", line 29, in run
    model = model_future.result()
  File "/usr/local/lib/python3.5/dist-packages/grpc/_channel.py", line 276, in result
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INTERNAL, Received RST_STREAM with error code 2)>

UPD: After enabling tracing with GRPC_TRACE=all, I found the following:

Client, immediately after the request:

E0109 17:59:42.248727600    1981 channel_connectivity.cc:126] watch_completion_error: {"created":"@1515520782.248638500","description":"GOAWAY received","file":"src/core/ext/transport/chttp2/transport/chttp2_transport.cc","file_line":1137,"http2_error":0,"raw_bytes":"Server shutdown"}            
E0109 17:59:42.451048100    1979 channel_connectivity.cc:126] watch_completion_error: "Cancelled"  
E0109 17:59:42.451160000    1979 completion_queue.cc:659]    Operation failed: tag=0x7f6e5cd1caf8, error={"created":"@1515520782.451034300","description":"Timed out waiting for connection state change","file":"src/core/ext/filters/client_channel/channel_connectivity.cc","file_line":133}
...(last two messages keep repeating 5 times every second)

Server:

E0109 17:59:42.248201000    1985 completion_queue.cc:659]    Operation failed: tag=0x7f3f74febee8, error={"created":"@1515520782.248170000","description":"Server Shutdown","file":"src/core/lib/surface/server.cc","file_line":1249}                                                                    
E0109 17:59:42.248541100    1975 tcp_server_posix.cc:231]    Failed accept4: Invalid argument                                                                             
E0109 17:59:47.362868700    1994 completion_queue.cc:659]    Operation failed: tag=0x7f3f74febee8, error={"created":"@1515520787.362853500","description":"Server Shutdown","file":"src/core/lib/surface/server.cc","file_line":1249}                                                                                                                                             
E0109 17:59:52.430612500    2000 completion_queue.cc:659]    Operation failed: tag=0x7f3f74febee8, error={"created":"@1515520792.430598800","description":"Server Shutdown","file":"src/core/lib/surface/server.cc","file_line":1249}
... (last message kept repeating every few seconds)                                                             

UPD2:

The full contents of my Server.py file:

import ModelSelectionModule
import my_api_pb2_grpc
import my_api_pb2
import grpc
from concurrent import futures
import time

class MyApiServicer(my_api_pb2_grpc.MyApiServicer):
    def SelectModelForDataset(self, request, context):
        print("Processing started.")
        selectedModel = ModelSelectionModule.run(request, context)
        print("Processing Completed.")
        return selectedModel


# TODO(shalamov): what is the best way to run a python server?
def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=100))
    my_api_pb2_grpc.add_MyApiServicer_to_server(MyApiServicer(), server)
    server.add_insecure_port('[::]:50051')
    server.start()

    print("gRPC server started\n")
    try:
        while True:
            time.sleep(24 * 60 * 60)  # run for 24h
    except KeyboardInterrupt:
        server.stop(0)


if __name__ == '__main__':
    serve()

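UPD3: It seems that ModelSelectionModule.run is what causes the problem. I tried isolating it in a separate thread, but that did not help. selectedModel is eventually computed, but by then the client is already gone. How can I prevent this call from interfering with grpc?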

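from multiprocessing.pool import ThreadPool

pool = ThreadPool(processes=1)
# Pass the callable and its arguments; run(request, context) here would execute it eagerly.
async_result = pool.apply_async(ModelSelectionModule.run, (request, context))
selectedModel = async_result.get()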

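This call is quite complex: it spawns and joins many threads and calls various libraries such as scikit-learn and smac. Posting all of it here would be too much.

While debugging, I found that after the client's request the server keeps two connections open (fd 3 and fd 8). If I manually close fd 8 or write some bytes to it, the error I see in the client changes to Stream removed (instead of Received RST_STREAM with error code 2). It looks like the socket (fd 8) is somehow being corrupted by a child process. How is that possible? How can I protect the socket from being accessed by child processes?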


2 Answers

Stack Overflow user

Accepted answer

Posted 2018-01-18 19:38:25

This is a result of using fork() in the process handler. gRPC does not support this use case.
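
As a rough illustration of one way to act on this (not part of the original answer, and not verified against the asker's setup), the heavy computation could be handed to a freshly started interpreter through `subprocess`, so that no long-lived forked copy of the server process keeps running alongside gRPC. The worker source, the `candidates` payload, and the `run_in_fresh_interpreter` helper below are hypothetical stand-ins for `ModelSelectionModule.run`; whether this avoids the error may depend on the gRPC version in use.

```python
import json
import subprocess
import sys

# Hypothetical worker: a stand-in for the expensive model-selection step.
# It reads a JSON payload on stdin and writes a JSON result to stdout.
WORKER_SOURCE = """
import json, sys
payload = json.load(sys.stdin)
# ... expensive model selection would happen here ...
json.dump({"selected": max(payload["candidates"])}, sys.stdout)
"""


def run_in_fresh_interpreter(payload, timeout=300):
    """Run the worker in a new Python process and return its JSON result."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps(payload),
        capture_output=True,  # requires Python 3.7+
        text=True,
        timeout=timeout,
        check=True,      # raise CalledProcessError on a non-zero exit code
        close_fds=True,  # already the default; stated here for emphasis
    )
    return json.loads(completed.stdout)
```

Inside the servicer, the handler would then convert the request to plain data, call `run_in_fresh_interpreter(...)`, and build the `SelectedModel` reply from the returned dictionary. The trade-off is that the request and result must be serializable, since nothing is shared with the child process.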

2 votes

Stack Overflow user

Posted 2020-01-08 08:04:29

I just ran into this problem and solved it. Are you using the with_call() method?

Failing code:

response = stub.SayHello.with_call(request=request, metadata=metadata)

The response is a tuple.

Working code: do not use with_call()

response = stub.SayHello(request=request, metadata=metadata)

The response is a response object.
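
If with_call() is still needed (for example, to read the status code or trailing metadata), the tuple can be unpacked instead. In grpc's Python API, with_call() on a unary-unary method returns the response message together with a Call object. A minimal sketch follows; the `say_hello` wrapper is a hypothetical helper, and `stub` is assumed to expose a `SayHello` method as in the snippets above.

```python
def say_hello(stub, request, metadata=None):
    """Invoke SayHello through with_call() and return only the response message."""
    # with_call() returns a (response, call) pair rather than a bare response.
    response, call = stub.SayHello.with_call(request=request, metadata=metadata)
    # `call` is the RPC's Call object; with a real stub, call.code() and
    # call.trailing_metadata() can be inspected here if needed.
    return response
```

Code that treats the return value of with_call() as the message itself will fail on attribute access, because it is actually holding the tuple.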

0 votes
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/48174240
