
Q: In Erlang, why does a process run slower as its mailbox grows?

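Stack Overflow user

Asked 2016-03-25 08:06:11

3 answers, 1.9K views, 0 followers, 5 votes

Here is the example: server.erl

When a process has 10000 messages in its mailbox, it takes 0.043 seconds to process them all. With 50000 messages it should take about 0.215 seconds if the cost scaled linearly, but it actually takes about 2.4 seconds, roughly 10 times slower than expected. Why?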

Erlang/OTP 18 [erts-7.1] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:true]

Eshell V7.1 (abort with ^G)
1> test_for_gen_server:start_link().
{ok,<0.36.0>}
2> test_for_gen_server:test(10000).
ok
======gen_server: Times:10000 Cost:42863
3> test_for_gen_server:test(10000).
ok
======gen_server: Times:10000 Cost:43096
4> test_for_gen_server:test(10000).
ok
======gen_server: Times:10000 Cost:43223
5> test_for_gen_server:test(50000).
ok
======gen_server: Times:50000 Cost:2504395
6> test_for_gen_server:test(50000).
ok
======gen_server: Times:50000 Cost:2361987
7> test_for_gen_server:test(50000).
ok
======gen_server: Times:50000 Cost:2304715

erlang

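3 Answers

Stack Overflow user

Accepted answer

Answered 2016-03-25 10:19:00

Following the comments: in this case the slowdown is not actually caused by the mailbox size, because in a gen_server the mailbox message always matches. See its receive loop.

It turns out that the slower execution here is due to the extra complexity of the code, and specifically to many allocations of small amounts of data that the garbage collector then has to free (so it is not related to the mailbox size, but to how many times the code is executed).

Below is a slightly modified version of your code. The main difference is that the message queue is filled up after the start message has been received. Apart from your example there are 7 other variations, each a slightly modified or simplified version of your loop. The second loop is based on the flow you can find in the gen_server code.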

-module (test_for_gen_server).

-behaviour (gen_server).

%% APIs
-export([test1/1, test2/1, test3/1, test4/1, test5/1, test6/1, test7/1,
         test8/1, test9/1]).

%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

test1(N) ->
    {ok, Pid} = gen_server:start_link(?MODULE, [], []),
    Pid ! {start, N}.

test2(N) -> Pid = spawn(fun() -> loop2([undefined, 0]) end), Pid ! {start, N}.
test3(N) -> Pid = spawn(fun() -> loop3([undefined, 0]) end), Pid ! {start, N}.
test4(N) -> Pid = spawn(fun() -> loop4([undefined, 0]) end), Pid ! {start, N}.
test5(N) -> Pid = spawn(fun() -> loop5([undefined, 0]) end), Pid ! {start, N}.
test6(N) -> Pid = spawn(fun() -> loop6([undefined, 0]) end), Pid ! {start, N}.
test7(N) -> Pid = spawn(fun() -> loop7([undefined, 0]) end), Pid ! {start, N}.
test8(N) -> Pid = spawn(fun() -> loop8(undefined, 0) end), Pid ! {start, N}.
test9(N) -> Pid = spawn(fun() -> loop9({undefined, 0}) end), Pid ! {start, N}.

%%==============================================================================

init([]) ->
    {ok, []}.
handle_call(_Request, _From, State) ->
    {reply, nomatch, State}.
handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({start, N}, _State) ->
    do_test(N),
    {A,B,C} = os:timestamp(),
    Timestamp = (A * 1000000 + B) * 1000000 + C,
    {noreply, [Timestamp, 0]};
handle_info(stop, [Timestamp, Times]) ->
    {A,B,C} = os:timestamp(),
    Timestamp1 = (A * 1000000 + B) * 1000000 + C,
    Cost = Timestamp1 - Timestamp,
    io:format("======gen_server:  Times:~p Cost:~p~n", [Times, Cost]),
    {stop, normal, []};
handle_info(_Info, [Timestamp, Times]) ->
    {noreply, [Timestamp, Times + 1]}.

terminate(_Reason, _State) -> ok.

code_change(_OldVer, State, _Extra) -> {ok, State}.

do_test(0) -> self() ! stop;
do_test(N) -> self() ! a, do_test(N - 1).

%%==============================================================================

loop2(State) ->
    Msg = receive
              Input -> Input
          end,
    Reply = {ok, handle_info(Msg, State)},
    handle_common_reply(Reply, Msg, State).

handle_common_reply(Reply, _Msg, _State) ->
    case Reply of
        {ok, {noreply, NState}} -> loop2(NState);
        {ok, {stop, normal, _}} -> ok
    end.

%%==============================================================================

loop3(State) ->
    Msg = receive
              Input -> Input
          end,
    Reply = {ok, handle_info(Msg, State)},
    case Reply of
        {ok, {noreply, NState}} -> loop3(NState);
        {ok, {stop, normal, _}} -> ok
    end.

%%==============================================================================

loop4(State) ->
    Msg = receive
              Input -> Input
          end,
    case handle_info(Msg, State) of
        {noreply, NState} -> loop4(NState);
        {stop, normal, _} -> ok
    end.

%%==============================================================================

loop5(State) ->
    receive
        Input ->
            case handle_info(Input, State) of
                {noreply, NState} -> loop5(NState);
                {stop, normal, _} -> ok
            end
    end.

%%==============================================================================

loop6(State) ->
    receive
        {start, _N} = Msg ->
            {noreply, NState} = handle_info(Msg, State),
            loop6(NState);
        stop = Msg ->
            {stop, normal, []} = handle_info(Msg, State);
        Info ->
            {noreply, NState} = handle_info(Info, State),
            loop6(NState)
    end.

%%==============================================================================

loop7([Timestamp, Times]) ->
    receive
        {start, N} ->
            do_test(N),
            {A,B,C} = os:timestamp(),
            NTimestamp = (A * 1000000 + B) * 1000000 + C,
            loop7([NTimestamp, 0]);
        stop ->
            {A,B,C} = os:timestamp(),
            NTimestamp = (A * 1000000 + B) * 1000000 + C,
            Cost = NTimestamp - Timestamp,
            io:format("======Times:~p Cost:~p~n", [Times, Cost]);
        _Info ->
            loop7([Timestamp, Times + 1])
    end.

%%==============================================================================

loop8(Timestamp, Times) ->
    receive
        {start, N} ->
            do_test(N),
            {A,B,C} = os:timestamp(),
            NTimestamp = (A * 1000000 + B) * 1000000 + C,
            loop8(NTimestamp, 0);
        stop ->
            {A,B,C} = os:timestamp(),
            NTimestamp = (A * 1000000 + B) * 1000000 + C,
            Cost = NTimestamp - Timestamp,
            io:format("======Times:~p Cost:~p~n", [Times, Cost]);
        _Info ->
            loop8(Timestamp, Times + 1)
    end.

%%==============================================================================

loop9({Timestamp, Times}) ->
    receive
        {start, N} ->
            do_test(N),
            {A,B,C} = os:timestamp(),
            NTimestamp = (A * 1000000 + B) * 1000000 + C,
            loop9({NTimestamp, 0});
        stop ->
            {A,B,C} = os:timestamp(),
            NTimestamp = (A * 1000000 + B) * 1000000 + C,
            Cost = NTimestamp - Timestamp,
            io:format("======Times:~p Cost:~p~n", [Times, Cost]);
        _Info ->
            loop9({Timestamp, Times + 1})
    end.

Results:

28> c(test_for_gen_server).          
{ok,test_for_gen_server}
29> test_for_gen_server:test1(50000).
{start,50000}
======gen_server:  Times:50000 Cost:2285054

30> test_for_gen_server:test2(50000).
{start,50000}
======gen_server:  Times:50000 Cost:2170294

31> test_for_gen_server:test3(50000).
{start,50000}
======gen_server:  Times:50000 Cost:1520796

32> test_for_gen_server:test4(50000).
{start,50000}
======gen_server:  Times:50000 Cost:1526084

33> test_for_gen_server:test5(50000).
{start,50000}
======gen_server:  Times:50000 Cost:1510738

34> test_for_gen_server:test6(50000).
{start,50000}
======gen_server:  Times:50000 Cost:1496024

35> test_for_gen_server:test7(50000).
{start,50000}
======Times:50000 Cost:863876

36> test_for_gen_server:test8(50000).
{start,50000}
======Times:50000 Cost:5830

47> test_for_gen_server:test9(50000).
{start,50000}
======Times:50000 Cost:640157

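You can see that the execution time gets shorter with each change. Notice the difference between test2 and test3, where the only difference in the code is the additional function call. Pay special attention to the dramatic difference between test7 and test8, where the only difference in the code is that test7 creates and discards a two-element list on every iteration of the loop.

The loop8 variant (test8) can be executed without allocating anything on the heap, using only the VM's virtual registers, so it is the fastest. The other loops always allocate some data on the heap, which the garbage collector then has to free periodically.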

Note

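Just added test9 to show that using tuples instead of lists when passing arguments between functions generally gives better performance.

Previous answer left for reference

This is because the receive clause has to match incoming messages against the patterns that can occur in that clause. It takes each message from the mailbox in turn and tries to match it against the patterns. The first message that matches is processed.

So, if the queue builds up because messages do not match, processing each new incoming message takes longer and longer (because matching always starts from the first message in the queue).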
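The scanning cost described above can be observed with a small experiment. This is only a sketch: the module name selective_receive_demo and the message atoms noise and wanted are made up for illustration and do not come from the question's code.

```erlang
-module(selective_receive_demo).
-export([run/1]).

%% run(N) spawns a process whose mailbox holds N `noise` messages followed
%% by a single `wanted` message, then times a selective receive for `wanted`.
%% Returns {Microseconds, MessagesStillQueued}.
run(N) ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() ->
                  fill(N),
                  self() ! wanted,
                  %% The receive has to walk past all N noise messages
                  %% before it reaches the one that matches.
                  {Micros, ok} = timer:tc(fun wait_for_wanted/0),
                  {message_queue_len, Left} =
                      process_info(self(), message_queue_len),
                  Parent ! {Ref, Micros, Left}
          end),
    receive
        {Ref, Elapsed, Remaining} -> {Elapsed, Remaining}
    end.

%% Send K non-matching messages to ourselves.
fill(0) -> ok;
fill(K) -> self() ! noise, fill(K - 1).

%% Selective receive: only `wanted` matches, everything else stays queued.
wait_for_wanted() ->
    receive
        wanted -> ok
    end.
```

Calling selective_receive_demo:run(N) with increasing N should show the elapsed time growing with the number of unmatched messages, while the second element confirms that all N of them are still sitting in the mailbox afterwards.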

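Therefore it is good practice to always flush unknown messages in gen_servers, as suggested in Joe Armstrong's PhD thesis (section 5.8).

This article explains it in more detail: Erlang explained: Selective receive. It is also covered in section 3.4 of Joe Armstrong's thesis mentioned above.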
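For a hand-written receive loop, flushing unknown messages usually amounts to adding a catch-all clause. The following is a minimal sketch under the same made-up names as before (flush_unknown_demo, noise, wanted are hypothetical); in a gen_server the equivalent is a final handle_info(_Info, State) clause like the one in the code above.

```erlang
-module(flush_unknown_demo).
-export([run/1]).

%% run(N) spawns a process with N `noise` messages and one `wanted`
%% message in its mailbox. The loop drops everything it does not
%% understand. Returns {DroppedCount, MessagesStillQueued}.
run(N) ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() ->
                  fill(N),
                  self() ! wanted,
                  Dropped = loop(0),
                  {message_queue_len, Left} =
                      process_info(self(), message_queue_len),
                  Parent ! {Ref, Dropped, Left}
          end),
    receive
        {Ref, D, L} -> {D, L}
    end.

%% Send K non-matching messages to ourselves.
fill(0) -> ok;
fill(K) -> self() ! noise, fill(K - 1).

%% The catch-all clause consumes unknown messages, so the head of the
%% mailbox always matches and nothing accumulates.
loop(Dropped) ->
    receive
        wanted   -> Dropped;
        _Unknown -> loop(Dropped + 1)
    end.
```

Because every message matches some clause, each receive takes the first message in the queue and the mailbox is empty when the loop finishes.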

12 votes

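Stack Overflow user

Answered 2016-12-12 03:32:22

Recently I ran into the same problem in Elixir, and I finally found the answer in this paper.

Section 3 says that Erlang's memory architecture is process-centric. Each process allocates and manages its own memory area, which typically includes a PCB (process control block), a private stack, and a private heap.

This has a disadvantage: high memory fragmentation.

A process cannot use the memory (for example, the heap) of another process, even if that memory area contains a large amount of unused space. This typically means that a process can allocate only a small amount of memory by default, which in turn usually leads to a large number of calls to the garbage collector.

So a large mailbox can slow down the whole system.

3 votes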

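Stack Overflow user

Answered 2019-08-09 20:27:03

Adding to 蛋汤饭's very useful answer:

In Erlang/OTP 20, you can use the process flag message_queue_data = off_heap to avoid the GC slowdown on large mailboxes. See the documentation for process_flag(Flag :: message_queue_data, MQD) here. Here is an example from the Erlang standard library: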

init(_) ->
    %% The error logger process may receive a huge amount of
    %% messages. Make sure that they are stored off heap to
    %% avoid exessive GCs.
    process_flag(message_queue_data, off_heap),
    {ok, []}.

Alternatively, you can pass the {message_queue_data, off_heap} option when creating the process with erlang:spawn_opt/2.
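A minimal sketch of the spawn_opt/2 variant follows. The module name off_heap_demo is hypothetical, and the example assumes an OTP release that supports the message_queue_data option (OTP 19 or later). The spawned process reports its own setting back so the effect of the option is visible.

```erlang
-module(off_heap_demo).
-export([start/0]).

%% Spawn a process whose message queue is stored off heap and return
%% what the process itself reports for message_queue_data.
start() ->
    Parent = self(),
    Pid = spawn_opt(
            fun() ->
                    Info = process_info(self(), message_queue_data),
                    Parent ! {self(), Info}
            end,
            [{message_queue_data, off_heap}]),
    receive
        {Pid, Info} -> Info
    end.
```

On such a release, off_heap_demo:start() returns {message_queue_data, off_heap}.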

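However, in my experience the off_heap option does not help much. What does help is a custom flow control scheme: have the sender process check the receiver's queue length with process_info(ReceiverPid, message_queue_len) and throttle when it exceeds 1000. Check periodically rather than on every send, to avoid killing performance. See also here.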
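One way such a flow control scheme might look is sketched below. Only the threshold of 1000 comes from the answer; the module name throttled_sender, the check interval of 100 sends, and the 10 ms back-off are assumptions chosen for illustration. Note that process_info/2 only works for processes on the local node.

```erlang
-module(throttled_sender).
-export([send_all/2]).

-define(MAX_QUEUE, 1000).   %% threshold mentioned in the answer
-define(CHECK_EVERY, 100).  %% assumption: inspect the queue every 100 sends
-define(BACKOFF_MS, 10).    %% assumption: pause length while the queue drains

%% Send every message in Msgs to Receiver (a local pid), pausing whenever
%% the receiver's mailbox is too long. Returns the number of messages sent.
send_all(Receiver, Msgs) ->
    send_all(Receiver, Msgs, 0).

send_all(_Receiver, [], Sent) ->
    Sent;
send_all(Receiver, [Msg | Rest], Sent) ->
    %% Check only periodically, not on every send.
    case Sent rem ?CHECK_EVERY of
        0 -> wait_until_drained(Receiver);
        _ -> ok
    end,
    Receiver ! Msg,
    send_all(Receiver, Rest, Sent + 1).

wait_until_drained(Receiver) ->
    case process_info(Receiver, message_queue_len) of
        {message_queue_len, Len} when Len > ?MAX_QUEUE ->
            timer:sleep(?BACKOFF_MS),
            wait_until_drained(Receiver);
        _ ->
            %% Queue is short enough, or the receiver has exited
            %% (process_info/2 returns undefined for a dead process).
            ok
    end.
```

A sender would call throttled_sender:send_all(ReceiverPid, Messages) instead of sending in a tight loop. The check interval trades accuracy for overhead, since the queue can overshoot the threshold by up to one interval between checks.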

3 votes

Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/36216246
