Q: Finding a sequence in IEnumerable<T> using LINQ

Stack Overflow user

Asked on 2010-08-25 07:23:57

5 answers · 2.8K views · 0 followers · 6 votes

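What is the most efficient way to find a sequence within an IEnumerable<T> using LINQ?

I want to be able to create an extension method that allows the following call:

int startIndex = largeSequence.FindSequence(subSequence);

The match must be adjacent and in order.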

.net

linq

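5 Answers

Stack Overflow user

Posted on 2010-08-25 10:06:15

Here is an implementation of an algorithm that finds a subsequence within a sequence. I called the method IndexOfSequence because it makes the intent more explicit and mirrors the existing IndexOf method: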

using System.Collections.Generic;
using System.Linq;

public static class ExtensionMethods
{
    public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence)
    {
        return source.IndexOfSequence(sequence, EqualityComparer<T>.Default);
    }

    public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence, IEqualityComparer<T> comparer)
    {
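        var seq = sequence.ToArray();
        if (seq.Length == 0)
            return 0; // an empty sequence trivially matches at the start (avoids indexing seq[0])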

        int p = 0; // current position in source sequence
        int i = 0; // current position in searched sequence
        var prospects = new List<int>(); // list of prospective matches
        foreach (var item in source)
        {
            // Remove bad prospective matches
            prospects.RemoveAll(k => !comparer.Equals(item, seq[p - k]));

            // Is it the start of a prospective match ?
            if (comparer.Equals(item, seq[0]))
            {
                prospects.Add(p);
            }

            // Does the current item continue the partial match?
            if (comparer.Equals(item, seq[i]))
            {
                i++;
                // Do we have a complete match ?
                if (i == seq.Length)
                {
                    // Bingo !
                    return p - seq.Length + 1;
                }
            }
            else // Mismatch
            {
                // Do we have prospective matches to fall back to ?
                if (prospects.Count > 0)
                {
                    // Yes, use the first one
                    int k = prospects[0];
                    i = p - k + 1;
                }
                else
                {
                    // No, start from beginning of searched sequence
                    i = 0;
                }
            }
            p++;
        }
        // No match
        return -1;
    }
}

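I haven't tested it thoroughly, so it may still contain bugs. I only ran a few tests on well-known corner cases to make sure I hadn't fallen into any obvious traps. It seems to work fine so far.

I think the complexity is close to O(n), but I'm no expert in Big O notation, so I could be wrong... At least it enumerates the source sequence only once and never goes back, so it should be fairly efficient.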
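Since the answer says the code was only lightly tested, a brute-force reference can help cross-check its results on corner cases. The sketch below is not part of the original answer: the class name NaiveSequenceSearch and the method name NaiveIndexOfSequence are hypothetical, it buffers both inputs into lists, and it assumes that an empty sequence matches at index 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NaiveSequenceSearch
{
    // Hypothetical helper (not from the answer): O(n*m) reference implementation.
    public static int NaiveIndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence)
    {
        // Materialize both inputs so they can be indexed and re-read.
        var haystack = source.ToList();
        var needle = sequence.ToList();
        var comparer = EqualityComparer<T>.Default;

        // Try every start position where the needle could still fit.
        for (int start = 0; start + needle.Count <= haystack.Count; start++)
        {
            int matched = 0;
            while (matched < needle.Count &&
                   comparer.Equals(haystack[start + matched], needle[matched]))
            {
                matched++;
            }

            if (matched == needle.Count)
                return start; // every element of the needle matched at this offset
        }

        return -1; // no contiguous match
    }
}
```

Comparing its output with IndexOfSequence on inputs with repeated prefixes (for example searching "aab" in "aaab") is one way to look for disagreements.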

Votes: 3

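Stack Overflow user

Posted on 2010-08-25 07:42:29

The code you say you want to use isn't LINQ, so I don't see why it needs to be implemented with LINQ.

This is essentially the same problem as substring search (in fact, an enumeration where order matters is a generalization of "string").

Computer science has studied this problem extensively for a long time, so you can stand on the shoulders of giants.

Some reasonable starting points are: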

http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm

http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

http://en.wikipedia.org/wiki/Rabin-karp

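Even the pseudocode in the Wikipedia articles is enough to port to C# fairly easily. Look at the descriptions of performance in different cases and decide which cases your code is most likely to encounter.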
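As an illustration of such a port, here is a minimal sketch of Knuth-Morris-Pratt adapted to IEnumerable<T>. It is not taken from the answer: the names KmpSequenceSearch and KmpIndexOf are hypothetical, the optional comparer parameter is an added convenience, and treating an empty pattern as a match at index 0 is an assumption. The pattern is buffered into an array, while the source is enumerated once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KmpSequenceSearch
{
    // Hypothetical name; a sketch of Knuth-Morris-Pratt over IEnumerable<T>.
    public static int KmpIndexOf<T>(this IEnumerable<T> source, IEnumerable<T> pattern,
                                    IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        var pat = pattern.ToArray();
        if (pat.Length == 0)
            return 0; // assumption: an empty pattern matches at index 0

        // failure[i] = length of the longest proper prefix of pat[0..i]
        // that is also a suffix of pat[0..i].
        var failure = new int[pat.Length];
        for (int i = 1, k = 0; i < pat.Length; i++)
        {
            while (k > 0 && !comparer.Equals(pat[i], pat[k]))
                k = failure[k - 1];
            if (comparer.Equals(pat[i], pat[k]))
                k++;
            failure[i] = k;
        }

        int matched = 0;  // number of pattern items currently matched
        int position = 0; // index of the current item in source
        foreach (var item in source)
        {
            // On a mismatch, fall back to the longest shorter partial match.
            while (matched > 0 && !comparer.Equals(item, pat[matched]))
                matched = failure[matched - 1];
            if (comparer.Equals(item, pat[matched]))
                matched++;
            if (matched == pat.Length)
                return position - pat.Length + 1; // start index of the match
            position++;
        }

        return -1; // no match
    }
}
```

The failure table costs O(m) to build and the scan is O(n), so the whole search is O(n + m) comparisons. Whether that beats a simpler scan in practice depends on how repetitive the data is.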

Votes: 2

Stack Overflow user

Posted on 2015-11-16 16:50:01

I know this is an old question, but I needed exactly this method, so I wrote it like this:

// Requires: using System; using System.Collections.Generic; using System.Linq;
// These methods must be declared inside a static class.
public static int ContainsSubsequence<T>(this IEnumerable<T> elements, IEnumerable<T> subSequence) where T : IEquatable<T>
{
    return ContainsSubsequence(elements, 0, subSequence);
}

private static int ContainsSubsequence<T>(IEnumerable<T> elements, int index, IEnumerable<T> subSequence) where T : IEquatable<T>
{
    // Does the whole sub-sequence match starting at the current element?
    if (StartsWith(elements, subSequence))
        return index; // Matched! index is the start position

    // No elements left to start a match from
    if (!elements.Any())
        return -1; // Nope, didn't match

    // Try the next start position, always with the full sub-sequence,
    // so that matched elements are guaranteed to be adjacent
    return ContainsSubsequence(elements.Skip(1), index + 1, subSequence);
}

private static bool StartsWith<T>(IEnumerable<T> elements, IEnumerable<T> subSequence) where T : IEquatable<T>
{
    // No elements of sub-sequence left, which means even if there are
    // more elements, we matched the sub-sequence fully
    if (!subSequence.Any())
        return true;

    // No elements but sub-sequence not fully matched
    if (!elements.Any())
        return false;

    // Check the first element of the sub-sequence against the first element
    if (!EqualityComparer<T>.Default.Equals(subSequence.First(), elements.First()))
        return false;

    // Yes, it matched - move on to the next. Consume (skip) one element in each
    return StartsWith(elements.Skip(1), subSequence.Skip(1));
}

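Updated to return not only whether the sub-sequence matched, but also where it starts in the original sequence.

This is an extension method on IEnumerable, fully lazy, terminates early, and is more linear than the currently top-voted answer. But beware (as @wai-ha-lee pointed out): it is recursive and creates a lot of enumerators. Use it where applicable (performance/memory). It works for my needs, but YMMV.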
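For inputs where the recursion depth or the chain of Skip enumerators becomes a concern, one possible alternative is an iterative sliding window. This is a different technique from the recursive answer above, shown only as a sketch: the names WindowSequenceSearch and WindowIndexOf are hypothetical, the sub-sequence is buffered into a list, and an empty sub-sequence is assumed to match at index 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WindowSequenceSearch
{
    // Hypothetical name; iterative sliding-window search, no recursion.
    public static int WindowIndexOf<T>(this IEnumerable<T> source, IEnumerable<T> subSequence)
    {
        var needle = subSequence.ToList();
        if (needle.Count == 0)
            return 0; // assumption: an empty sub-sequence matches at index 0

        var comparer = EqualityComparer<T>.Default;
        var window = new Queue<T>(needle.Count); // holds the last needle.Count items
        int position = 0;                        // index of the current item in source

        foreach (var item in source)
        {
            window.Enqueue(item);
            if (window.Count > needle.Count)
                window.Dequeue(); // keep the window at the needle's length

            // Compare the window with the needle once it is full.
            if (window.Count == needle.Count && window.SequenceEqual(needle, comparer))
                return position - needle.Count + 1;

            position++;
        }

        return -1; // no contiguous match
    }
}
```

It enumerates the source once and stops at the first match, at the cost of up to O(n*m) comparisons and a buffer the size of the sub-sequence.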

Votes: 2

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/3561776
