文章/答案/技术大牛

发布

社区首页 >问答首页 >试图解析分块传输编码，但它不起作用，我解码的文件完全不可读

问试图解析分块传输编码，但它不起作用，我解码的文件完全不可读
EN

Stack Overflow用户

提问于 2021-03-31 14:12:39

回答 1查看 816关注 0票数 3

我试图解析Rest中块传输编码生成的数据，当我试图用字符串打印值时，我确实看到了数据有值，我认为它应该是工作的，但是当我试图将值赋值给文件时，文件是完全不可读的，下面的代码我使用了boost库，我将在代码中详细说明我的想法，我们将从代码的响应部分开始，我不知道我做错了什么。

   // Send the request.
    boost::asio::write(socket, request);

    // Read the response status line. The response streambuf will automatically
    // grow to accommodate the entire line. The growth may be limited by passing
    // a maximum size to the streambuf constructor.
    boost::asio::streambuf response;
    boost::asio::read_until(socket, response, "\r\n");

    // Check that response is OK.
    std::istream response_stream(&response);
    std::string http_version;
    response_stream >> http_version;
    unsigned int status_code;
    response_stream >> status_code;
    std::string status_message;
    std::getline(response_stream, status_message);
    if (!response_stream || http_version.substr(0, 5) != "HTTP/")
    {
        //std::cout << "Invalid response\n";
        return 9002;
         
    }
    if (status_code != 200)
    {
        //std::cout << "Response returned with status code " << status_code << "\n";
        return 9003;
    }
    
    // Read the response headers, which are terminated by a blank line.
    boost::asio::read_until(socket, response, "\r\n\r\n");

    // Process the response headers.
    //this portion of code I tried to parse the file name in the header of response which the file name is in the  content-disposition of header
    std::string header;
    std::string fullHeader = "";
    string zipfilename="", txtfilename="";
    bool foundfilename = false;
    while (std::getline(response_stream, header) && header != "\r")
    {
        fullHeader.append(header).append("\n");
        std::transform(header.begin(), header.end(), header.begin(),
            [](unsigned char c){ return std::tolower(c); });
        string containstr = "content-disposition";
        string containstr2 = "filename";
        string quotestr = "\"";
        if (header.find(containstr) != std::string::npos && header.find(containstr2) != std::string::npos)
        {
            int countquotes = 0;
            bool foundquote = true;
            
            std::size_t startpos = 0, beginpos, endpos;
            while (foundquote)
            {
                
                std::size_t myfound = header.find(quotestr, startpos);
                if (myfound != std::string::npos)
                {
                    if (countquotes % 2 == 0)
                        beginpos = myfound;
                    else
                    {
                        endpos = myfound;
                        foundfilename = true;
                    }

                    startpos = myfound + 1;
                    
                }
                else
                   foundquote = false;

                countquotes++;
            }

            if (endpos > beginpos && foundfilename)
            {
                size_t zipfileleng = endpos - beginpos;
                zipfilename = header.substr(beginpos+1, zipfileleng-1);
                txtfilename = header.substr(beginpos+1, zipfileleng-5);
            }
            else
                return 9004;

        }
    }

    if (foundfilename == false || zipfilename.length() == 0 || txtfilename.length() == 0)
        return 9005;

     //when the zipfilename has been found, we gonna get the data from the body of response, due to the response was  chunked transfer encoding, I tried to parse it,it's not complicated due to I saw it on the Wikipedia, it just first line was length of data,the next line was data,and it's the loop which over and over again ,all I tried to do was spliting all the data from the body of response by "\r\n" into a vector<string>, and I gonna read the data from that vector

      // Write whatever content we already have to output.
    std::string fullResponse = "";
    if (response.size() > 0)
    {
        std::stringstream ss;
        ss << &response;
        fullResponse = ss.str();
     
    
    }
    //tried split the entire body of response into a vector<string>

     vector<string> allresponsedata;
    split_regex(allresponsedata, fullResponse, boost::regex("(\r\n)+"));
    
    //tried to merge the data of response
    string zipfiledata;
    int myindex = 0;
    for (auto &x : allresponsedata) {
        std::cout << "Split: " << x << std::endl;// I tried to print the data, I did see the value in the variable of x

        if (myindex % 2 != 0)
        {
            zipfiledata = zipfiledata + x;//tried to accumulate the datas
        }


        myindex++;
    }
    
    //tried to write the data into a file
    std::ofstream zipfilestream(zipfilename, ios::out | ios::binary);
    zipfilestream.write(zipfiledata.c_str(), zipfiledata.length());
    zipfilestream.close();

    //afterward, the zipfile was built, but it's unreadable which it's not able to open,the zip utlities software says it's a damaged zip file though

我甚至尝试过类似于这个slow http client based on boost::asio - (Chunked Transfer)的其他方法，但是这种方式并不是那么有效，VS说

  1 IntelliSense: no instance of overloaded function "boost::asio::read" matches the argument list
        argument types are: (boost::asio::ip::tcp::socket, boost::asio::streambuf, boost::asio::detail::transfer_exactly_t, std::error_code)

它只是无法在以下行中编译：

size_t n = asio::read(socket, response, asio::transfer_exactly(chunk_bytes_to_read), error);

即使我读过asio::transfer_exactly的例子，在https://www.boost.org/doc/libs/1_57_0/doc/html/boost_asio/reference/transfer_exactly.html中也没有这样的例子

知道吗？

c++

http

boost

boost-asio

chunked-encoding

Stack Overflow用户

回答已采纳

发布于 2021-03-31 16:11:45

我看你看不出正确的格式：https://en.wikipedia.org/wiki/Chunked_transfer_encoding#Format

在积累完整的响应体之前，您需要读取块长度(十六进制)和任何可选的块扩展。

这需要在前面完成，因为您拆分的序列\r\n可以很容易地出现在块数据中。

再一次，我建议只使用Beast的支持，使其变得简单

 http::response<http::string_body> response;
 boost::asio::streambuf buf;
 http::read(socket, buf, response);

并且您将对标题进行完整的解析和解释(包括Trailer头！)以及response.body()中的内容作为std::string。

即使服务器不使用分块编码或与不同的编码选项组合，它也会做正确的事情。

根本没有理由重新发明轮子。

全演示

这演示了来自https://jigsaw.w3.org/HTTP/的分块编码测试url。

#include <boost/process.hpp>
#include <boost/beast.hpp>
#include <iostream>
namespace http = boost::beast::http;
using boost::asio::ip::tcp;

int main() {
    http::response<http::string_body> response;

    boost::asio::io_context ctx;
    tcp::socket socket(ctx);

    connect(socket, tcp::resolver{ctx}.resolve("jigsaw.w3.org", "http"));

    http::write(
            socket,
            http::request<http::empty_body>(
                http::verb::get, "/HTTP/ChunkedScript", 11));

    boost::asio::streambuf buf;
    http::read(socket, buf, response);

    std::cout << response.body() << "\n";
    std::cout << "Effective headers are:" << response.base() << "\n";
}

打印

This output will be chunked encoded by the server, if your client is HTTP/1.1
Below this line, is 1000 repeated lines of 0-9.
-------------------------------------------------------------------------
01234567890123456789012345678901234567890123456789012345678901234567890
01234567890123456789012345678901234567890123456789012345678901234567890
...996 lines removed ...
01234567890123456789012345678901234567890123456789012345678901234567890
01234567890123456789012345678901234567890123456789012345678901234567890

Effective headers are:HTTP/1.1 200 OK
cache-control: max-age=0
date: Wed, 31 Mar 2021 20:09:50 GMT
transfer-encoding: chunked
content-type: text/plain
etag: "1j3k6u8:tikt981g"
expires: Wed, 31 Mar 2021 20:09:49 GMT
last-modified: Mon, 18 Mar 2002 14:28:02 GMT
server: Jigsaw/2.3.0-beta3

票数 3

查看全部 1 条回答

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/66889515

复制

相似问题

问试图解析分块传输编码，但它不起作用，我解码的文件完全不可读
EN

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问试图解析分块传输编码，但它不起作用，我解码的文件完全不可读EN

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问试图解析分块传输编码，但它不起作用，我解码的文件完全不可读
EN