前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >使用gRPC基于Protobuf传输大文件或数据流

使用gRPC基于Protobuf传输大文件或数据流

作者头像
程序员小涛
发布2024-05-26 17:32:21
2970
发布2024-05-26 17:32:21
举报
文章被收录于专栏:涛的程序人生涛的程序人生

使用gRPC基于Protobuf传输大文件或数据流

在现代软件开发中,性能通常是关键的考虑因素之一,尤其是在进行大文件传输时。高效的协议和工具可以显著提升传输速度和可靠性。本文详细介绍如何使用gRPC和Protobuf进行大文件传输,并与传统TCP传输进行性能比较。

1. 背景和技术选择

在过去,大文件传输常常使用传统的TCP/IP协议,虽然简单但在处理大规模数据传输时往往速度较慢,尤其在网络条件不佳的环境下更是如此。在最近一个项目中,就有传输大数据文件的需求,用传统方式进行测试发现传输延时无法达到要求,于是在网上查阅资料发现了一个较优的解决方案。

相对于TCP,gRPC提供了一种现代的、高性能的解决方案。gRPC是一个高性能的远程过程调用(RPC)框架,由Google主导开发,使用HTTP/2作为传输层协议,支持多种开发语言,如C++, Java, Python和Go等。Protobuf(Protocol Buffers)则是一种轻量级的数据交换格式,可以高效地序列化结构化数据。

1.1 gRPC的优势
  • 高性能: 利用HTTP/2协议,支持多路复用、服务器推送等现代网络技术。
  • 跨语言支持: 支持多种编程语言,便于在不同的系统间交互。
  • 接口定义: 使用.proto文件定义服务,自动生成服务端和客户端代码,减少重复工作量。
  • 流控制: 支持流式传输数据,适合大文件传输和实时数据处理。
1.2 Protocol Buffers的优势
  • 高效: 编码和解码迅速,且生成的数据包比XML小3到10倍。
  • 灵活: 支持向后兼容性,新旧数据格式可以无缝转换。
  • 简洁: 简化了复杂数据结构的处理,易于开发者使用。

2. 项目配置与环境搭建

为了使用gRPC进行项目开发,首先需要在开发环境中安装gRPC及其依赖的库。以下是gRPC安装的步骤,适用于多种操作系统,包括Windows、Linux和macOS。

2.1 安装gRPC和Protocol Buffers

gRPC的安装可以通过多种方式进行,包括使用包管理器或从源代码编译。以下介绍Ubuntu下安装C++版本的gRPC(捆绑了Protocol Buffers)

注:如果gRPC和Protocol Buffers的版本不匹配会有问题,无法正常使用

2.1.1 安装Cmake
代码语言:javascript
复制
sudo apt install -y cmake
2.1.2 设置环境变量
代码语言:javascript
复制
export MY_INSTALL_DIR=$HOME/.local
mkdir -p $MY_INSTALL_DIR
export PATH="$MY_INSTALL_DIR/bin:$PATH" # 为了永久生效可以将该命令写入~/.bashrc文件中
2.1.3 安装必要的依赖
代码语言:javascript
复制
sudo apt install -y build-essential autoconf libtool pkg-config
2.1.4 下载gRPC源码
代码语言:javascript
复制
git clone --recurse-submodules -b v1.62.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc
2.1.5 编译gRPC和 Protocol Buffers
代码语言:javascript
复制
cd grpc
mkdir -p cmake/build
pushd cmake/build
cmake -DgRPC_INSTALL=ON \
      -DgRPC_BUILD_TESTS=OFF \
      -DCMAKE_INSTALL_PREFIX=$MY_INSTALL_DIR \
      ../..
make -j 4
make install
popd
2.2 CMake配置详解
2.1.1 通用配置

common.cmake 是一个辅助性的 CMake 模块文件,通常用于存放项目中共用的 CMake 配置,以简化和集中管理 CMakeLists.txt 文件中的代码。这种做法有助于提升项目的可维护性和可读性。

在 gRPC 项目中,示例代码中的common.cmake 包括以下内容:

  • 变量设置:定义项目中使用的常见路径和变量,例如 gRPC 和 protobuf 的安装路径,以便在整个项目中重用。
  • 库查找:使用 find_package()find_library() 命令来查找和配置项目所需的依赖库,如 gRPC、protobuf、SSL 等。
  • 编译器选项:统一设置编译器标志,例如 C++ 版本标准、优化级别、警告处理等。
  • 宏定义:创建复用的 CMake 宏或函数,例如用于处理 proto 文件生成相关命令的宏,这有助于避免在 CMakeLists.txt 文件中重复相同的代码块。
代码语言:javascript
复制
# Copyright 2018 gRPC authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# cmake build file for C++ route_guide example.
# Assumes protobuf and gRPC have been installed using cmake.
# See cmake_externalproject/CMakeLists.txt for all-in-one cmake build
# that automatically builds all the dependencies before building route_guide.

cmake_minimum_required(VERSION 3.8)

if(MSVC)
  add_definitions(-D_WIN32_WINNT=0x600)
endif()

find_package(Threads REQUIRED)

if(GRPC_AS_SUBMODULE)
  # One way to build a projects that uses gRPC is to just include the
  # entire gRPC project tree via "add_subdirectory".
  # This approach is very simple to use, but the are some potential
  # disadvantages:
  # * it includes gRPC's CMakeLists.txt directly into your build script
  #   without and that can make gRPC's internal setting interfere with your
  #   own build.
  # * depending on what's installed on your system, the contents of submodules
  #   in gRPC's third_party/* might need to be available (and there might be
  #   additional prerequisites required to build them). Consider using
  #   the gRPC_*_PROVIDER options to fine-tune the expected behavior.
  #
  # A more robust approach to add dependency on gRPC is using
  # cmake's ExternalProject_Add (see cmake_externalproject/CMakeLists.txt).

  # Include the gRPC's cmake build (normally grpc source code would live
  # in a git submodule called "third_party/grpc", but this example lives in
  # the same repository as gRPC sources, so we just look a few directories up)
  add_subdirectory(../../.. ${CMAKE_CURRENT_BINARY_DIR}/grpc EXCLUDE_FROM_ALL)
  message(STATUS "Using gRPC via add_subdirectory.")

  # After using add_subdirectory, we can now use the grpc targets directly from
  # this build.
  set(_PROTOBUF_LIBPROTOBUF libprotobuf)
  set(_REFLECTION grpc++_reflection)
  set(_ORCA_SERVICE grpcpp_orca_service)
  if(CMAKE_CROSSCOMPILING)
    find_program(_PROTOBUF_PROTOC protoc)
  else()
    set(_PROTOBUF_PROTOC $<TARGET_FILE:protobuf::protoc>)
  endif()
  set(_GRPC_GRPCPP grpc++)
  if(CMAKE_CROSSCOMPILING)
    find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)
  else()
    set(_GRPC_CPP_PLUGIN_EXECUTABLE $<TARGET_FILE:grpc_cpp_plugin>)
  endif()
elseif(GRPC_FETCHCONTENT)
  # Another way is to use CMake's FetchContent module to clone gRPC at
  # configure time. This makes gRPC's source code available to your project,
  # similar to a git submodule.
  message(STATUS "Using gRPC via add_subdirectory (FetchContent).")
  include(FetchContent)
  FetchContent_Declare(
    grpc
    GIT_REPOSITORY https://github.com/grpc/grpc.git
    # when using gRPC, you will actually set this to an existing tag, such as
    # v1.25.0, v1.26.0 etc..
    # For the purpose of testing, we override the tag used to the commit
    # that's currently under test.
    GIT_TAG        vGRPC_TAG_VERSION_OF_YOUR_CHOICE)
  FetchContent_MakeAvailable(grpc)

  # Since FetchContent uses add_subdirectory under the hood, we can use
  # the grpc targets directly from this build.
  set(_PROTOBUF_LIBPROTOBUF libprotobuf)
  set(_REFLECTION grpc++_reflection)
  set(_PROTOBUF_PROTOC $<TARGET_FILE:protoc>)
  set(_GRPC_GRPCPP grpc++)
  if(CMAKE_CROSSCOMPILING)
    find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)
  else()
    set(_GRPC_CPP_PLUGIN_EXECUTABLE $<TARGET_FILE:grpc_cpp_plugin>)
  endif()
else()
  # This branch assumes that gRPC and all its dependencies are already installed
  # on this system, so they can be located by find_package().

  # Find Protobuf installation
  # Looks for protobuf-config.cmake file installed by Protobuf's cmake installation.
  option(protobuf_MODULE_COMPATIBLE TRUE)
  find_package(Protobuf CONFIG REQUIRED)
  message(STATUS "Using protobuf ${Protobuf_VERSION}")

  set(_PROTOBUF_LIBPROTOBUF protobuf::libprotobuf)
  set(_REFLECTION gRPC::grpc++_reflection)
  if(CMAKE_CROSSCOMPILING)
    find_program(_PROTOBUF_PROTOC protoc)
  else()
    set(_PROTOBUF_PROTOC $<TARGET_FILE:protobuf::protoc>)
  endif()

  # Find gRPC installation
  # Looks for gRPCConfig.cmake file installed by gRPC's cmake installation.
  find_package(gRPC CONFIG REQUIRED)
  message(STATUS "Using gRPC ${gRPC_VERSION}")

  set(_GRPC_GRPCPP gRPC::grpc++)
  if(CMAKE_CROSSCOMPILING)
    find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)
  else()
    set(_GRPC_CPP_PLUGIN_EXECUTABLE $<TARGET_FILE:gRPC::grpc_cpp_plugin>)
  endif()
endif()
2.2.2 项目配置

这个配置文件包括了从proto文件生成C++代码的命令,以及编译这些生成的源代码文件为库和可执行文件的命令。利用CMake,我们能够确保项目在不同环境中具有可重复构建的能力。

代码语言:javascript
复制
cmake_minimum_required(VERSION 3.15)

project(grpcDemo)

set(CMAKE_CXX_STANDARD 17)

include(common.cmake)

# Proto file
get_filename_component(transfer_proto "transferfile.proto" ABSOLUTE)
get_filename_component(transfer_proto_path "${transfer_proto}" PATH)

# Generated sources
set(transfer_proto_srcs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.pb.cc")
set(transfer_proto_hdrs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.pb.h")
set(transfer_grpc_srcs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.grpc.pb.cc")
set(transfer_grpc_hdrs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.grpc.pb.h")
add_custom_command(
      OUTPUT "${transfer_proto_srcs}" "${transfer_proto_hdrs}" "${transfer_grpc_srcs}" "${transfer_grpc_hdrs}"
      COMMAND ${_PROTOBUF_PROTOC}
      ARGS --grpc_out "${CMAKE_CURRENT_BINARY_DIR}"
        --cpp_out "${CMAKE_CURRENT_BINARY_DIR}"
        -I "${transfer_proto_path}"
        --plugin=protoc-gen-grpc="${_GRPC_CPP_PLUGIN_EXECUTABLE}"
        "${transfer_proto}"
      DEPENDS "${transfer_proto}")

# Include generated *.pb.h files
include_directories("${CMAKE_CURRENT_BINARY_DIR}")

# transfer_grpc_proto
add_library(transfer_grpc_proto
  ${transfer_grpc_srcs}
  ${transfer_grpc_hdrs}
  ${transfer_proto_srcs}
  ${transfer_proto_hdrs})
target_link_libraries(transfer_grpc_proto
  ${_REFLECTION}
  ${_GRPC_GRPCPP}
  ${_PROTOBUF_LIBPROTOBUF})


# Targets greeter_[async_](client|server)
foreach(_target
  transfer_client transfer_server)
  add_executable(${_target} "${_target}.cpp")
  target_link_libraries(${_target}
    transfer_grpc_proto
    ${_REFLECTION}
    ${_GRPC_GRPCPP}
    ${_PROTOBUF_LIBPROTOBUF})
endforeach()
2.3 服务定义与proto文件

在gRPC中,服务和消息的定义是通过.proto文件进行的。例如,定义一个文件传输服务,可以在transferfile.proto中如下定义:

代码语言:javascript
复制
syntax = "proto3";

package filetransfer;

service FileTransferService {
  rpc Upload(stream FileChunk) returns (UploadStatus) {}
}

message FileChunk {
  bytes content = 1;
}

message UploadStatus {
  bool success = 1;
  string message = 2;
}

这里定义了一个FileTransferService服务,包含了一个Upload方法,该方法接受一个FileChunk类型的流,并返回一个UploadStatus状态。

3. 客户端和服务端的实现

客户端和服务端的实现是通过gRPC框架生成的接口进行的,这些接口基于前面定义的.proto文件。

3.1 gRPC客户端实现

客户端的主要职责是打开文件,读取数据,然后以流的形式发送到服务端。实现代码如下:

代码语言:javascript
复制
#include <iostream>
#include <string>
#include <fstream>
#include <chrono>
#include <grpcpp/grpcpp.h>
#include <grpcpp/channel.h>
#include <grpcpp/client_context.h>
#include <grpcpp/create_channel.h>
#include <grpcpp/security/credentials.h>
#include "transferfile.grpc.pb.h"

using grpc::Channel;
using grpc::ClientContext;
using grpc::ClientWriter;
using grpc::Status;
using transferfile::FileChunk;
using transferfile::FileUploadStatus;
using transferfile::TransferFile;

#define CHUNK_SIZE 3 * 1024 * 1024 // 3MB

class TransferClient
{
public:
    TransferClient(std::shared_ptr<Channel> channel) : stub_(TransferFile::NewStub(channel)) {}

    void uploadFile(const std::string &filename);

private:
    std::unique_ptr<TransferFile::Stub> stub_;
};

void TransferClient::uploadFile(const std::string &filename)
{
    FileChunk chunk;
    char *buffer = new char[CHUNK_SIZE];
    FileUploadStatus status;
    ClientContext context;
    std::ifstream infile;
    unsigned long len = 0;

    auto start = std::chrono::steady_clock::now();
    infile.open(filename, std::ios::binary | std::ios::in);
    if (!infile.is_open())
    {
        std::cerr << "Error: File not found" << std::endl;
        return;
    }

    std::unique_ptr<ClientWriter<FileChunk>> writer(stub_->UploadFile(&context, &status)); // Create a writer to send chunks of file
    while (!infile.eof())
    {
        infile.read(buffer, CHUNK_SIZE);
        chunk.set_buffer(buffer, infile.gcount());
        if (!writer->Write(chunk))
        {
            std::cerr << "Error: Failed to write chunk" << std::endl;
            break;
        }
        len += infile.gcount();
    }

    infile.close();
    delete[] buffer;

    writer->WritesDone();
    Status status1 = writer->Finish();
    if (status1.ok() && len != status.length())
    {
        auto end = std::chrono::steady_clock::now();
        std::cout << "File uploaded successfully" << std::endl;
        std::cout << "Time taken: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
        std::cout << "File size: " << len << " bytes" << std::endl;
        auto speed = (len / std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()) * 1000 / 1024 / 1024;
        std::cout << "Speed: " << speed << " MB/s" << std::endl;
    }
    else
    {
        std::cerr << "Error: " << status1.error_message() << "len: " << len << " status.length(): " << status.length()
                  << std::endl;
    }
}

int main(int argc, char *argv[])
{
    if (argc < 2)
    {
        std::cerr << "Usage: " << argv[0] << " <filename> [server_ip]" << std::endl;
        return 1;
    }

    std::string filename(argv[1]);
    std::string server_ip = "localhost";

    if (argc == 3)
    {
        server_ip = argv[2];
    }

    TransferClient client(grpc::CreateChannel(server_ip + ":50051", grpc::InsecureChannelCredentials()));
    client.uploadFile(filename);

    return 0;
}

客户端代码展示了如何创建一个gRPC客户端,如何打开文件,如何将文件切割成块,并且如何将这些块通过网络发送到服务端。

3.2 gRPC服务端实现

服务端的实现则负责接收来自客户端的数据块,并将其写入到服务器上的文件中。服务端代码如下:

代码语言:javascript
复制
#include <iostream>
#include <string>
#include <fstream>
#include <grpcpp/grpcpp.h>
#include <grpcpp/server.h>
#include <grpcpp/server_builder.h>
#include <grpcpp/server_context.h>
#include <grpcpp/security/server_credentials.h>
#include "transferfile.grpc.pb.h"
#include <sys/resource.h>
#include <unistd.h>
#include <chrono>

using grpc::Server;
using grpc::ServerBuilder;
using grpc::ServerContext;
using grpc::ServerReader;
using grpc::Status;
using transferfile::FileChunk;
using transferfile::FileUploadStatus;
using transferfile::TransferFile;
using namespace std::chrono;

#define CHUNK_SIZE 3 * 1024 * 1024 // 3MB

class TransferServiceImpl final : public TransferFile::Service
{
public:
    Status UploadFile(ServerContext *context, ServerReader<FileChunk> *reader, FileUploadStatus *status) override;

protected:
    void printUsage(const struct rusage &start, const struct rusage &end)
    {
        std::cout << "CPU Usage: User time: " << (end.ru_utime.tv_sec - start.ru_utime.tv_sec) + (end.ru_utime.tv_usec - start.ru_utime.tv_usec) / 1000000.0
                  << "s, System time: " << (end.ru_stime.tv_sec - start.ru_stime.tv_sec) + (end.ru_stime.tv_usec - start.ru_stime.tv_usec) / 1000000.0 << "s" << std::endl;
        std::cout << "Max resident set size: " << (end.ru_maxrss - start.ru_maxrss) << " KB" << std::endl;
    }
};

Status TransferServiceImpl::UploadFile(ServerContext *context, ServerReader<FileChunk> *reader, FileUploadStatus *status)
{
    FileChunk chunk;
    std::ofstream outfile;
    const char *data;

    struct rusage usage_start, usage_end;
    getrusage(RUSAGE_SELF, &usage_start);

    auto start = high_resolution_clock::now();

    outfile.open("output.bin", std::ios::binary | std::ios::out | std::ios::trunc);
    if (!outfile.is_open())
    {
        std::cerr << "Error: Failed to open file" << std::endl;
        return Status::CANCELLED;
    }

    while (reader->Read(&chunk))
    {
        data = chunk.buffer().c_str();
        outfile.write(data, chunk.buffer().length());
    }

    long pos = outfile.tellp();
    status->set_length(pos);
    outfile.close();

    auto end = high_resolution_clock::now();
    getrusage(RUSAGE_SELF, &usage_end);
    auto duration = duration_cast<milliseconds>(end - start);

    printUsage(usage_start, usage_end);
    std::cout << "Total transmission time: " << duration.count() << " ms" << std::endl;
    std::cout << "Total data transmitted: " << pos << " bytes" << std::endl;
    std::cout << "Transmission rate: " << (pos * 1000.0 / duration.count()) / 1024 / 1024 << " MB/s" << std::endl;

    return Status::OK;
}

void RunServer()
{
    std::string server_address("0.0.0.0:50051");
    TransferServiceImpl service;

    ServerBuilder builder;
    builder.AddListeningPort(server_address, grpc::InsecureServerCredentials());
    builder.RegisterService(&service);

    std::unique_ptr<Server> server(builder.BuildAndStart());
    std::cout << "Server listening on " << server_address << std::endl;
    server->Wait();
}

int main(int argc, char **argv)
{
    RunServer();
    return 0;
}

服务端代码展示了如何创建一个gRPC服务端,如何接收客户端发送的数据块,以及如何将这些数据块写入到磁盘文件中。

3.3 TCP客户端实现

功能同gRPC客户端

代码语言:javascript
复制
#include <iostream>
#include <fstream>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string>
#include <cstring>

int main(int argc, char *argv[])
{
    const char *server_ip = "127.0.0.1";
    int server_port = 8888;
    std::string file_path = argv[1];

    // Create socket
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
    {
        std::cerr << "Socket creation failed." << std::endl;
        return 1;
    }

    // Define the server address
    struct sockaddr_in server_address;
    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET;
    server_address.sin_port = htons(server_port);
    inet_pton(AF_INET, server_ip, &server_address.sin_addr);

    // Connect to the server
    if (connect(sock, (struct sockaddr *)&server_address, sizeof(server_address)) < 0)
    {
        std::cerr << "Connection to server failed." << std::endl;
        close(sock);
        return 1;
    }

    std::ifstream file(file_path, std::ios::binary);
    if (!file.is_open())
    {
        std::cerr << "Failed to open file: " << file_path << std::endl;
        close(sock);
        return 1;
    }

    // Send file contents
    char *buffer = new char[1024];
    while (file.read(buffer, sizeof(buffer)) || file.gcount())
    {
        if (send(sock, buffer, file.gcount(), 0) < 0)
        {
            std::cerr << "Failed to send data." << std::endl;
            break;
        }
    }

    delete[] buffer;

    std::cout << "File sent successfully." << std::endl;

    file.close();
    close(sock);

    return 0;
}
3.4 TCP服务端实现

功能同gRPC服务端

代码语言:javascript
复制
#include <iostream>
#include <fstream>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <cstring>
#include <string>
#include <sys/resource.h>
#include <unistd.h>
#include <chrono>

#define CHUNK_SIZE 3 * 1024 * 1024 // 3MB

using namespace std::chrono;

void printUsage(const struct rusage &start, const struct rusage &end)
{
    std::cout << "CPU Usage: User time: " << (end.ru_utime.tv_sec - start.ru_utime.tv_sec) + (end.ru_utime.tv_usec - start.ru_utime.tv_usec) / 1000000.0
              << "s, System time: " << (end.ru_stime.tv_sec - start.ru_stime.tv_sec) + (end.ru_stime.tv_usec - start.ru_stime.tv_usec) / 1000000.0 << "s" << std::endl;
    std::cout << "Max resident set size: " << (end.ru_maxrss - start.ru_maxrss) << " KB" << std::endl;
}

int main()
{
    int server_port = 8888;
    const char *output_file = "output_received_file";

    // Create socket
    int server_sock = socket(AF_INET, SOCK_STREAM, 0);
    if (server_sock < 0)
    {
        std::cerr << "Socket creation failed." << std::endl;
        return 1;
    }

    // Bind socket to IP / port
    struct sockaddr_in server_address;
    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET;
    server_address.sin_port = htons(server_port);
    server_address.sin_addr.s_addr = INADDR_ANY;

    if (bind(server_sock, (struct sockaddr *)&server_address, sizeof(server_address)) < 0)
    {
        std::cerr << "Bind failed." << std::endl;
        close(server_sock);
        return 1;
    }

    // Listen
    if (listen(server_sock, 10) < 0)
    {
        std::cerr << "Listen failed." << std::endl;
        close(server_sock);
        return 1;
    }

    std::cout << "Server is listening on port " << server_port << std::endl;

    // Accept connection
    struct sockaddr_in client_address;
    socklen_t client_len = sizeof(client_address);
    int client_sock = accept(server_sock, (struct sockaddr *)&client_address, &client_len);
    if (client_sock < 0)
    {
        std::cerr << "Accept failed." << std::endl;
        close(server_sock);
        return 1;
    }

    struct rusage usage_start, usage_end;
    getrusage(RUSAGE_SELF, &usage_start);

    auto start = high_resolution_clock::now();

    std::ofstream file(output_file, std::ios::binary);
    if (!file.is_open())
    {
        std::cerr << "Failed to open file for writing." << std::endl;
        close(client_sock);
        close(server_sock);
        return 1;
    }

    // Receive data
    char *buffer = new char[CHUNK_SIZE];
    int bytes_received;
    while ((bytes_received = recv(client_sock, buffer, sizeof(buffer), 0)) > 0)
    {
        file.write(buffer, bytes_received);
    }

    if (bytes_received < 0)
    {
        std::cerr << "Error in recv()." << std::endl;
    }
    else
    {
        std::cout << "File received successfully." << std::endl;
        long pos = file.tellp();
        auto end = high_resolution_clock::now();
        getrusage(RUSAGE_SELF, &usage_end);
        auto duration = duration_cast<milliseconds>(end - start);

        printUsage(usage_start, usage_end);
        std::cout << "Total transmission time: " << duration.count() << " ms" << std::endl;
        std::cout << "Total data transmitted: " << pos << " bytes" << std::endl;
        std::cout << "Transmission rate: " << (pos * 1000.0 / duration.count()) / 1024 / 1024 << " MB/s" << std::endl;
    }

    delete[] buffer;

    file.close();
    close(client_sock);
    close(server_sock);
    return 0;
}

4. 性能测试与分析

为了验证gRPC与Protobuf的效率,我设置了一个基准测试,比较使用gRPC和传统TCP直接传输大文件的性能差异。

4.1 测试方法

测试方法包括:

  1. 准备一定大小的测试文件,这里随机生成了2GB的文件。(fallocate -l 2G 2GBfile.txt
  2. 分别使用gRPC和TCP传输此文件,记录所需的总时间和CPU、内存等资源的使用情况。
  3. 重复测试,确保数据的准确性。
4.1.1 gPRC测试结果
grpc
grpc
4.1.2 TCP测试结果
tcp
tcp
4.1.3 多次测试结果

User time

System time

Max resident set size

Transmission time

Transmission rate

gRPC

28.5267 s

17.2359 s

18240 KB

18.265 s

112.127 MB/s

TCP

43.2163 s

143.779 s

0 KB

187.408 s

10.928 MB/s

gRPC

29.3631 s

17.477 s

17724 KB

18.312 s

111.839 MB/s

TCP

42.2783 s

140.658 s

0 KB

183.123 s

11.1837 MB/s

gRPC

28.8431 s

17.7514 s

16852 KB

18.691 s

109.571 MB/s

TCP

43.4233 s

144.232 s

0 KB

189.021 s

10.834 MB/s

gRPC

37.5632 s

19.1164 s

15424 KB

18.589 s

110.173 MB/s

TCP

40.2059 s

140.3 s

0 KB

180.772 s

11.3292 MB/s

gRPC

36.712 s

18.1812 s

14016 KB

18.274 s

112.072 MB/s

TCP

37.9126 s

144.276 s

0 KB

182.392 s

11.2286 MB/s

4.2 性能比较结果
vs
vs
4.2.1 CPU 使用

gRPC在CPU使用上明显低于传统的TCP socket方法。这主要因为gRPC内部使用了更现代的HTTP/2协议,它支持多路复用、服务器推送等高效的数据传输机制,而不需要像TCP那样对每个文件传输任务建立单独的连接。此外,gRPC的实现中可能包含了更优化的数据处理路径,减少了上下文切换和系统调用的开销。

4.2.2 内存复用

Max resident set size(最大常驻集大小)表示进程在内存中占用的最大空间。TCP方式的最大常驻集大小一直是0KB,而gRPC分配的比较多,这是由于gRPC框架本身的内存需求,以及可能的内存缓冲机制,这有助于提高数据处理的速率和效率。

4.2.3 传输时间和速率

gRPC在传输速度上极大超过了TCP socket。这种巨大的差异主要来自于gRPC使用HTTP/2的优势,如头部压缩、二进制帧传输和连接复用。HTTP/2的二进制帧结构使得传输更加高效,并减少了因为文本解析带来的开销。此外,连接复用允许在单一连接上并行交换消息,从而显著提升了数据传输效率,减少了因建立和关闭多个TCP连接所产生的延迟和资源消耗。

测试结果显示,使用gRPC和Protobuf传输大文件在多个方面均优于传统TCP方法:

  • 传输速度: gRPC利用HTTP/2的多路复用功能,可以在一个连接中并行传输多个文件,显著提升了传输效率。
  • 资源利用率: gRPC的传输过程中CPU利用率较低,内存复用率较高,更适合长时间运行的应用场景。
  • 错误处理: gRPC内置的错误处理机制能有效地管理网络问题和数据传输错误,保证数据的完整性。
  • 高效的数据序列化: Protobuf非常高效,生成的数据包体积小,通常比相等的XML小3到10倍。这意味着在网络上传输相同的数据量时,Protobuf需要的带宽更少。
  • 快速序列化与反序列化: Protobuf提供了非常快的数据序列化和反序列化能力,这对于性能要求高的应用尤其重要,可以显著减少数据处理时间。
  • 明确的结构定义: 使用Protobuf可以让数据结构更加清晰和严格,有助于团队内部的沟通和后期的维护。
  • 避免手动解析:与自定义的二进制格式相比,Protobuf避免了手动解析数据的错误和复杂性,因为解析工作是自动化的,由工具链支持。

5. 结论

使用gRPC和Protobuf传输大文件,不仅提高了传输速度,而且确保了更高的可靠性和更低的资源消耗。这使得gRPC成为大规模数据处理和分布式系统中的理想选择。未来,随着技术的进一步成熟和优化,预计gRPC在更多场景中将显示出其优越性。

希望本文对于那些寻求改进大文件传输性能的开发者有所帮助,并能够启发更多的技术创新和应用。

本文参与 腾讯云自媒体同步曝光计划,分享自作者个人站点/博客。
原始发表:2024-05-12,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 使用gRPC基于Protobuf传输大文件或数据流
    • 1. 背景和技术选择
      • 1.1 gRPC的优势
      • 1.2 Protocol Buffers的优势
    • 2. 项目配置与环境搭建
      • 2.1 安装gRPC和Protocol Buffers
      • 2.2 CMake配置详解
      • 2.3 服务定义与proto文件
    • 3. 客户端和服务端的实现
      • 3.1 gRPC客户端实现
      • 3.2 gRPC服务端实现
      • 3.3 TCP客户端实现
      • 3.4 TCP服务端实现
    • 4. 性能测试与分析
      • 4.1 测试方法
      • 4.2 性能比较结果
    • 5. 结论
    相关产品与服务
    领券
    问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档