文章/答案/技术大牛

发布

问Haskell堆栈溢出
EN

Stack Overflow用户

提问于 2011-05-11 05:15:23

回答 3查看 391关注 0票数 7

我正在编写一个遗传算法来生成字符串"helloworld“。但当n为10,000或更大时，演进函数会产生堆栈溢出。

module Genetics where

import Data.List (sortBy)
import Random (randomRIO)
import Control.Monad (foldM)

class Gene g where
    -- How ideal is the gene from 0.0 to 1.0?
    fitness :: g -> Float

    -- How does a gene mutate?
    mutate :: g -> IO g

    -- How many species will be explored?
    species :: [g] -> Int

orderFitness :: (Gene g) => [g] -> [g]
orderFitness = reverse . sortBy (\a b -> compare (fitness a) (fitness b))

compete :: (Gene g) => [g] -> IO [g]
compete pool = do
    let s = species pool
    variants <- (mapM (mapM mutate) . map (replicate s)) pool
    let pool' = (map head . map orderFitness) variants
    return pool'

evolve :: (Gene g) => Int -> [g] -> IO [g]
evolve 0 pool = return pool
evolve n pool = do
    pool' <- compete pool
    evolve (n - 1) pool'

有了species pool = 8，8个基因库可以复制到8个组中。每一组都发生变异，每组中最适合的被选择进行进一步的进化(返回到8个基因)。

GitHub

memory

haskell

random

stack-overflow

genetic-algorithm

回答 3

Stack Overflow用户

回答已采纳

发布于 2011-05-11 07:36:58

多亏了唐的deepseq建议，我能够将问题缩小到mapM mutate，它造成了太多的thunks。新版本有mutate'，它使用seq来防止thunking。

module Genetics where

import Data.List (maximumBy)
import Random (randomRIO)

class Gene g where
    -- How ideal is the gene from 0.0 to 1.0?
    fitness :: g -> Float

    -- How does a gene mutate?
    mutate :: g -> IO g

    -- How many species will be explored in each round?
    species :: [g] -> Int

best :: (Gene g) => [g] -> g
best = maximumBy (\a b -> compare (fitness a) (fitness b))

-- Prevents stack overflow
mutate' :: (Gene g) => g -> IO g
mutate' gene = do
    gene' <- mutate gene
    gene' `seq` return gene'

drift :: (Gene g) => [[g]] -> IO [[g]]
drift = mapM (mapM mutate')

compete :: (Gene g) => [g] -> IO [g]
compete pool = do
    let islands = map (replicate (species pool)) pool
    islands' <- drift islands
    let representatives = map best islands'
    return representatives

evolve :: (Gene g) => Int -> [g] -> IO [g]
evolve 0 pool = return pool
evolve n pool = compete pool >>= evolve (n - 1)

GitHub

票数 2

Stack Overflow用户

发布于 2011-05-11 06:39:12

如果您对性能感兴趣，我将使用快速随机数生成器，例如：

The mersenne-random package

其次，compete看起来非常可疑，因为它完全是懒惰的，尽管它构建了一些潜在的大型结构。试着用deepseq锤子把它重写得更严格一点：

import Control.DeepSeq    

compete :: (Gene g, NFData g) => [g] -> IO [g]
compete pool = do
    let s = species pool
    variants <- (mapM (mapM mutate) . map (replicate s)) pool
    let pool' = (map head . map orderFitness) variants
    pool' `deepseq` return pool'

不过，这些东西都不需要放在IO中(单独发布)。类似于Rand monad的东西可能是more appropriate。

票数 3

Stack Overflow用户

发布于 2011-05-11 07:08:45

在orderFitness是sortBy的情况下，您可以使用maximumBy和单个map，而不是使用(map head . map orderFitness)。这不会节省太多(因为您将从O(n log n)变为O(n)，并且可能会从消除双映射中获得另一个因子2)，但至少在某种程度上更简单和更有效。您还可以摆脱对反向的调用。

我怀疑这在没有deepseq的情况下解决了这个问题，但这应该是一个改进。

编辑:如果标准库和GHC是完美的，那么head . sortBy将产生与maximumBy相同的代码，而map head . map sortBy将产生与map (head . sortBy)相同的代码。遗憾的是，这两件事在实践中都不太可能是真的。sortBy会倾向于做一堆额外的内存分配，因为它是一种分而治之的算法。组合地图是您有时会得到的优化，但不应指望。

更重要的是，使用maximumBy更具声明性。更容易看到代码做了什么以及需要多长时间。在优化中利用它也应该更容易，因为我们知道目标是什么，而不仅仅是我们如何实现它。

票数 1

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/5956554

复制

相似问题

问Haskell堆栈溢出
EN

回答 3

Stack Overflow用户

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Haskell堆栈溢出EN

回答 3

Stack Overflow用户

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Haskell堆栈溢出
EN