What do I need to do to make the Boost.Beast HTTP parser find the end of the body?

Stack Overflow user
Asked 2021-03-20 17:06:31
Answers: 1 · Views: 2.1K · Followers: 0 · Votes: 3

I am trying to parse an HTTPS response with boost::beast::http::parser. My parser is defined as follows:

boost::beast::http::parser<false, boost::beast::http::string_body> response_parser;

The callback for the asynchronous read looks like this:

void AsyncHttpsRequest::on_response_read(const boost::system::error_code &error_code, uint32_t bytes_transferred)
{
    if (bytes_transferred > 0)
    {
        response_parser.put(boost::asio::buffer(data_buffer, bytes_transferred), http_error_code);
        std::cout << "Parser status: " << http_error_code.message() << std::endl;
        std::cout << "Read " << bytes_transferred << " bytes of HTTPS response" << std::endl;
        std::cout << std::string(data_buffer, bytes_transferred) << std::endl;
    }
    if (error_code)
    {
        std::cout << "Error during HTTPS response read: " << error_code.message() << std::endl;
        callback(error_code, response_parser.get());
    }
    else
    {
        if (response_parser.is_done())
        {
            callback(error_code, response_parser.get());
        }
        else
        {
            std::cout << "Response is not yet finished, reading more" << std::endl;
            read_response();
        }
    }
}

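When the response has no body, everything works and response_parser.is_done() returns true. When the response does contain a body, however, it always returns false, even after the body has been read completely. The response also carries a Content-Length header that matches the number of bytes in the body, so that is not the problem.

The Boost docs say that response_parser.is_done() should return true if the message semantics indicate a body is expected and the entire body has been parsed.

When I send the request with Connection: keep-alive, I get stuck reading the response, because the server has nothing more to send and response_parser is not done yet. When I send it with Connection: close, the finish callback is called, but the parsed boost::beast::http::message has no body inside. My logging on stdout shows that the body is there and that it is complete.

What do I need to do to make boost::beast::http::parser recognize the end of the body and return true from is_done() once the number of bytes read equals Content-Length?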

1 Answer

Stack Overflow user

Accepted answer

Posted 2021-03-21 00:36:51

Your expectations are right.

Background, details and caveats:

You can observe that it does work:

Live On Coliru

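#include <boost/asio/buffer.hpp>
#include <boost/beast/http.hpp>
#include <algorithm>   // std::min
#include <iomanip>
#include <iostream>
#include <random>
#include <string>
#include <string_view>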
using boost::system::error_code;
namespace http = boost::beast::http;

int main() {
    std::mt19937 prng { std::random_device{}() };
    std::uniform_int_distribution<size_t> packet_size { 1, 372 };

    std::string const response = 
"HTTP/1.1 200 OK\r\n"
"Age: 207498\r\n"
"Cache-Control: max-age=604800\r\n"
"Content-Type: text/html; charset=UTF-8\r\n"
"Date: Sat, 20 Mar 2021 23:24:40 GMT\r\n"
"Etag: \"3147526947+ident\"\r\n"
"Expires: Sat, 27 Mar 2021 23:24:40 GMT\r\n"
"Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT\r\n"
"Server: ECS (bsa/EB15)\r\n"
"Vary: Accept-Encoding\r\n"
"X-Cache: HIT\r\n"
"Content-Length: 1256\r\n"
"\r\n"
"<!doctype html>\n<html>\n<head>\n    <title>Example Domain</title>\n\n    <meta charset=\"utf-8\" />\n    <meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\" />\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\" />\n    <style type=\"text/css\">\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n        font-family: -apple-system, system-ui, BlinkMacSystemFont, \"Segoe UI\", \"Open Sans\", \"Helvetica Neue\", Helvetica, Arial, sans-serif;\n        \n    }\n    div {\n        width: 600px;\n        margin: 5em auto;\n        padding: 2em;\n        background-color: #fdfdff;\n        border-radius: 0.5em;\n        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n    }\n    a:link, a:visited {\n        color: #38488f;\n        text-decoration: none;\n    }\n    @media (max-width: 700px) {\n        div {\n            margin: 0 auto;\n            width: auto;\n        }\n    }\n    </style>    \n</head>\n\n<body>\n<div>\n    <h1>Example Domain</h1>\n    <p>This domain is for use in illustrative examples in documents. You may use this\n    domain in literature without prior coordination or asking for permission.</p>\n    <p><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n</div>\n</body>\n</html>\n";

    std::string const input = response + response;
    std::string_view emulated_stream = input;

    error_code ec;
    while (not emulated_stream.empty()) {
        std::cout << "== Emulated stream of " << emulated_stream.size()
                  << " remaining" << std::endl;

        http::parser<false, http::string_body> response_parser;

        while (not (ec or response_parser.is_done() or emulated_stream.empty())) {
            auto next     = std::min(packet_size(prng), emulated_stream.size());
            auto consumed = response_parser.put(
                boost::asio::buffer(emulated_stream.data(), next), ec);

            std::cout << "Consumed " << consumed << std::boolalpha
                      << "\tHeaders done:" << response_parser.is_header_done()
                      << "\tDone:" << response_parser.is_done()
                      << "\tChunked:" << response_parser.chunked()
                      << "\t" << ec.message() << std::endl;

            if (ec == http::error::need_more)
                ec.clear();

            emulated_stream.remove_prefix(consumed);
        }

        auto res = response_parser.release();

        std::cout << "== Content length " << res["Content-Length"] << " and body "
                  << res.body().length() << std::endl;
        std::cout << "== Headers: " << res.base() << std::endl;
    }

    std::cout << "== Stream depleted " << ec.message() << std::endl;
}

Prints:

== Emulated stream of 3182 remaining
Consumed 101    Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 234    Headers done:true   Done:false  Chunked:false   Success
Consumed 305    Headers done:true   Done:false  Chunked:false   Success
Consumed 326    Headers done:true   Done:false  Chunked:false   Success
Consumed 265    Headers done:true   Done:false  Chunked:false   Success
Consumed 216    Headers done:true   Done:false  Chunked:false   Success
Consumed 144    Headers done:true   Done:true   Chunked:false   Success
== Content length 1256 and body 1256
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Emulated stream of 1591 remaining
Consumed 204    Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 131    Headers done:true   Done:false  Chunked:false   Success
Consumed 355    Headers done:true   Done:false  Chunked:false   Success
Consumed 137    Headers done:true   Done:false  Chunked:false   Success
Consumed 139    Headers done:true   Done:false  Chunked:false   Success
Consumed 89 Headers done:true   Done:false  Chunked:false   Success
Consumed 87 Headers done:true   Done:false  Chunked:false   Success
Consumed 66 Headers done:true   Done:false  Chunked:false   Success
Consumed 355    Headers done:true   Done:false  Chunked:false   Success
Consumed 28 Headers done:true   Done:true   Chunked:false   Success
== Content length 1256 and body 1256
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Stream depleted Success

So, perhaps:

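  • The stream content is not actually valid HTTP.
  • Your response has no Content-Length header at all. In that case need_eof() returns true once header parsing is complete. The docs:
      "Depending on the contents of the header, the parser may require an end of file notification to know where the end of the body lies. If this function returns true it will be necessary to call put_eof when there will never be additional data from the input."
  • Your packets are too small. If you shrink the packet size distribution to an extreme like
      std::uniform_int_distribution<size_t> packet_size { 1, 3 };
    you will see this effect: nothing gets consumed at all. The docs:
      "In some cases there may be an insufficient number of octets in the input buffer in order to make forward progress. This is indicated by the code error::need_more. When this happens, the caller should place additional bytes into the buffer sequence and call put again. The error code error::need_more is special. When this error is returned, a subsequent call to put may succeed if the buffers have been updated."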

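As for packets that are too small: in your real code you would not keep retrying with the same tiny amounts, because the buffer simply accumulates and eventually holds enough to make progress.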

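Bonus: Simplify!

The good news is that you rarely need anything this complicated. In most cases you can simply http::read or http::async_read directly into a response object.

This does the whole dance with the parser under the hood, so you do not have to deal with the details:

Live On Coliru
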
boost::beast::flat_buffer buf;
boost::system::error_code ec;
for (http::response<http::string_body> res; !ec && read(pipe, buf, res, ec); res.clear()) {
    std::cout << "== Content length " << res["Content-Length"] << " and body "
              << res.body().length() << std::endl;
    std::cout << "== Headers: " << res.base() << std::endl;
}

std::cout << "== Stream depleted " << ec.message() << "\n" << std::endl;

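That is all, and it still prints the following. Note the second body length of 2512 rather than 1256: as far as I can tell, res.clear() resets only the header fields, so the body string still holds the first response when the second one is appended. Starting each read from a fresh response object, or calling res.body().clear(), avoids that.
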
== Content length 1256 and body 1256
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Content length 1256 and body 2512
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Stream depleted end of stream
Votes: 2

Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/66724298
