If you are just checking for existence, <code>HashSet&lt;T&gt;</code> in .NET 3.5 is your best option - dictionary-like performance, but no key/value pair - just the values:

<pre><code>Random rand = new Random();
HashSet&lt;int&gt; data = new HashSet&lt;int&gt;();
for (int i = 0; i &lt; 1000000; i++)
{
    data.Add(rand.Next(50000000));
}
bool contains = data.Contains(1234567); // etc
</code></pre>

List.Contains is an O(n) operation.
Dictionary.ContainsKey is an O(1) operation on average, since it uses the key's hash code to jump straight to the matching bucket instead of scanning every entry.
I don't think it's a good idea to scan through a List that contains a million entries to find a few of them.
Isn't it possible to save those million entities into an RDBMS, for instance, and perform queries on that database?
If that is not possible, then I would use a Dictionary anyway if you want to do key lookups.
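
A rough way to see the difference is to time both lookups on the same data. The sketch below is illustrative only: the <code>LookupDemo</code> class and its method names are made up for this example, the sizes are arbitrary, and the timings will vary by machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LookupDemo
{
    // Counts how many probe values are found by a linear scan of the list.
    public static int CountHitsInList(List<long> list, long[] probes)
    {
        int hits = 0;
        foreach (long p in probes)
            if (list.Contains(p)) hits++;      // O(n) per call
        return hits;
    }

    // Same count, using a hash-based dictionary lookup.
    public static int CountHitsInDictionary(Dictionary<long, bool> dict, long[] probes)
    {
        int hits = 0;
        foreach (long p in probes)
            if (dict.ContainsKey(p)) hits++;   // O(1) on average per call
        return hits;
    }

    public static void Main()
    {
        const int size = 1000000;
        var list = new List<long>(size);
        var dict = new Dictionary<long, bool>(size);
        for (long i = 0; i < size; i++)
        {
            list.Add(i);
            dict[i] = true;
        }

        // Three values that are present and one that is not.
        long[] probes = { 0, size / 2, size - 1, -1 };

        Stopwatch sw = Stopwatch.StartNew();
        int listHits = CountHitsInList(list, probes);
        sw.Stop();
        Console.WriteLine("List:       {0} hits in {1}", listHits, sw.Elapsed);

        sw = Stopwatch.StartNew();
        int dictHits = CountHitsInDictionary(dict, probes);
        sw.Stop();
        Console.WriteLine("Dictionary: {0} hits in {1}", dictHits, sw.Elapsed);
    }
}
```

Both methods report the same number of hits; only the time per lookup differs, and the gap grows with the size of the collection.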

Dictionary isn't that bad, because the keys in a dictionary are designed to be found fast. To find a number in a list, the lookup has to iterate through the list, in the worst case all of it.

Of course, the dictionary only works if your numbers are unique and you don't need them kept in order.

I think there is also a <code>HashSet&lt;T&gt;</code> class in .NET 3.5, which also allows only unique elements.
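
As a small illustration of the uniqueness rule, here is a sketch using <code>HashSet&lt;long&gt;</code>. The <code>UniqueDemo</code> class and <code>ToSet</code> helper are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueDemo
{
    // Builds a set from the input; duplicates are silently dropped.
    public static HashSet<long> ToSet(IEnumerable<long> values)
    {
        var set = new HashSet<long>();
        foreach (long v in values)
            set.Add(v);   // Add returns false if v was already present
        return set;
    }

    public static void Main()
    {
        HashSet<long> set = ToSet(new long[] { 5, 7, 5, 9, 7 });
        Console.WriteLine(set.Count);        // 3 distinct values
        Console.WriteLine(set.Contains(9));  // True
        Console.WriteLine(set.Add(5));       // False: 5 is already in the set
    }
}
```

If the original list can legitimately contain the same number more than once and the count matters, a set alone loses that information.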

This is not exactly an answer to your question, but I have a class that increases the performance of Contains() on a collection. I subclassed a Queue and added a Dictionary that maps hashcodes to queues of objects. The <code>Dictionary.ContainsKey()</code> function is O(1) on average, whereas <code>List.Contains()</code>, <code>Queue.Contains()</code>, and <code>Stack.Contains()</code> are O(n).

The value type of the dictionary is a queue holding objects with the same hashcode. The caller can supply a custom object that implements IEqualityComparer. You could use this pattern for Stacks or Lists; the code would need just a few changes.

<pre><code>/// &lt;summary&gt;
/// This is a class that mimics a queue, except the Contains() operation is O(1) on average rather than O(n) thanks to an internal dictionary.
/// The dictionary remembers the hashcodes of the items that have been enqueued and not yet dequeued.
/// Hashcode collisions are stored in a queue to maintain FIFO order.
/// &lt;/summary&gt;
/// &lt;typeparam name="T"&gt;&lt;/typeparam&gt;
private class HashQueue&lt;T&gt; : Queue&lt;T&gt;
{
    private readonly IEqualityComparer&lt;T&gt; _comp;
    public readonly Dictionary&lt;int, Queue&lt;T&gt;&gt; _hashes; //_hashes.Count doesn't always equal base.Count (due to collisions)

    public HashQueue(IEqualityComparer&lt;T&gt; comp = null) : base()
    {
        this._comp = comp;
        this._hashes = new Dictionary&lt;int, Queue&lt;T&gt;&gt;();
    }

    public HashQueue(int capacity, IEqualityComparer&lt;T&gt; comp = null) : base(capacity)
    {
        this._comp = comp;
        this._hashes = new Dictionary&lt;int, Queue&lt;T&gt;&gt;(capacity);
    }

    public HashQueue(IEnumerable&lt;T&gt; collection, IEqualityComparer&lt;T&gt; comp = null) : base(collection)
    {
        this._comp = comp;

        this._hashes = new Dictionary&lt;int, Queue&lt;T&gt;&gt;(base.Count);
        foreach (var item in collection)
        {
            this.EnqueueDictionary(item);
        }
    }

    public new void Enqueue(T item)
    {
        base.Enqueue(item); //add to queue
        this.EnqueueDictionary(item);
    }

    private void EnqueueDictionary(T item)
    {
        int hash = this._comp == null ? item.GetHashCode() : this._comp.GetHashCode(item);
        Queue&lt;T&gt; temp;
        if (!this._hashes.TryGetValue(hash, out temp))
        {
            temp = new Queue&lt;T&gt;();
            this._hashes.Add(hash, temp);
        }
        temp.Enqueue(item);
    }

    public new T Dequeue()
    {
        T result = base.Dequeue(); //remove from queue

        int hash = this._comp == null ? result.GetHashCode() : this._comp.GetHashCode(result);
        Queue&lt;T&gt; temp;
        if (this._hashes.TryGetValue(hash, out temp))
        {
            temp.Dequeue();
            if (temp.Count == 0)
                this._hashes.Remove(hash);
        }

        return result;
    }

    public new bool Contains(T item)
    { //This is O(1) on average, whereas Queue.Contains is O(n)
        int hash = this._comp == null ? item.GetHashCode() : this._comp.GetHashCode(item);
        Queue&lt;T&gt; temp;
        if (!this._hashes.TryGetValue(hash, out temp))
            return false;

        //a matching hashcode is not proof of equality, so compare the few items that share it
        IEqualityComparer&lt;T&gt; comp = this._comp ?? EqualityComparer&lt;T&gt;.Default;
        foreach (T candidate in temp)
        {
            if (comp.Equals(candidate, item))
                return true;
        }
        return false;
    }

    public new void Clear()
    {
        foreach (var item in this._hashes.Values)
            item.Clear(); //clear collision lists

        this._hashes.Clear(); //clear dictionary

        base.Clear(); //clear queue
    }
}
</code></pre>

My simple testing shows that my <code>HashQueue.Contains()</code> runs much faster than <code>Queue.Contains()</code>. Running the test code with count set to 10,000 takes 0.00045 seconds for the HashQueue version and 0.37 seconds for the Queue version. With a count of 100,000, the HashQueue version takes 0.0031 seconds whereas the Queue takes 36.38 seconds!

Here's my testing code:

<pre><code>static void Main(string[] args)
{
    int count = 10000;

    { //HashQueue
        var q = new HashQueue&lt;int&gt;(count);

        for (int i = 0; i &lt; count; i++) //load queue (not timed)
            q.Enqueue(i);

        System.Diagnostics.Stopwatch sw = System.Diagnostics.Stopwatch.StartNew();
        for (int i = 0; i &lt; count; i++)
        {
            bool contains = q.Contains(i);
        }
        sw.Stop();
        Console.WriteLine(string.Format("HashQueue, {0}", sw.Elapsed));
    }

    { //Queue
        var q = new Queue&lt;int&gt;(count);

        for (int i = 0; i &lt; count; i++) //load queue (not timed)
            q.Enqueue(i);

        System.Diagnostics.Stopwatch sw = System.Diagnostics.Stopwatch.StartNew();
        for (int i = 0; i &lt; count; i++)
        {
            bool contains = q.Contains(i);
        }
        sw.Stop();
        Console.WriteLine(string.Format("Queue,     {0}", sw.Elapsed));
    }

    Console.ReadLine();
}
</code></pre>
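
For comparison, here is a sketch of a different way to get the same effect. It is not the class above: it wraps a queue instead of subclassing it, and it keys the dictionary on the items themselves (with a count per item) rather than on their hashcodes, so the dictionary handles collisions. <code>CountingQueue</code> is a hypothetical name, and the sketch assumes items are never null, since dictionary keys cannot be null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical alternative: wraps a Queue<T> and tracks how many
// copies of each item are currently queued.
public class CountingQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly Dictionary<T, int> _counts;

    public CountingQueue(IEqualityComparer<T> comparer = null)
    {
        // A null comparer makes Dictionary fall back to EqualityComparer<T>.Default.
        _counts = new Dictionary<T, int>(comparer);
    }

    public int Count { get { return _items.Count; } }

    public void Enqueue(T item)
    {
        _items.Enqueue(item);
        int n;
        _counts.TryGetValue(item, out n);   // n stays 0 when the item is new
        _counts[item] = n + 1;
    }

    public T Dequeue()
    {
        T item = _items.Dequeue();
        int n = _counts[item];
        if (n == 1) _counts.Remove(item);
        else _counts[item] = n - 1;
        return item;
    }

    // O(1) on average; items are compared by equality, not only by hashcode.
    public bool Contains(T item)
    {
        return _counts.ContainsKey(item);
    }
}

public static class CountingQueueDemo
{
    public static void Main()
    {
        var q = new CountingQueue<int>();
        q.Enqueue(1);
        q.Enqueue(2);
        q.Enqueue(1);
        q.Dequeue();                        // removes the first 1
        Console.WriteLine(q.Contains(1));   // True: one copy of 1 remains
        q.Dequeue();                        // removes 2
        Console.WriteLine(q.Contains(2));   // False
    }
}
```

Wrapping rather than hiding methods with <code>new</code> also means the bookkeeping cannot be bypassed by calling the queue through a <code>Queue&lt;T&gt;</code> reference.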

A <a href="http://msdn.microsoft.com/en-us/library/system.collections.sortedlist.aspx" rel="nofollow noreferrer">SortedList</a> will be faster to search, because lookups in sorted data can use a binary search, but slower to insert items into.
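
A minimal sketch of the sorted approach follows. It is an assumption on my part that the generic <code>SortedList&lt;TKey, TValue&gt;</code> and <code>List&lt;T&gt;.BinarySearch</code> are acceptable stand-ins for the non-generic class linked above; <code>SortedSearchDemo</code> and <code>ContainsSorted</code> are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class SortedSearchDemo
{
    // Returns true if value is present; the list must already be sorted ascending.
    public static bool ContainsSorted(List<long> sorted, long value)
    {
        // BinarySearch returns a non-negative index on a hit and a negative number on a miss.
        return sorted.BinarySearch(value) >= 0;
    }

    public static void Main()
    {
        var numbers = new List<long> { 42, 7, 19, 3, 88 };
        numbers.Sort();   // one-time O(n log n) cost
        Console.WriteLine(ContainsSorted(numbers, 19));  // True
        Console.WriteLine(ContainsSorted(numbers, 20));  // False

        // The generic SortedList keeps its keys ordered on every insert.
        var sortedList = new SortedList<long, bool>();
        foreach (long n in numbers)
            sortedList.Add(n, true);
        Console.WriteLine(sortedList.ContainsKey(88));   // True
    }
}
```

Each lookup is O(log n), which sits between the O(n) list scan and the average O(1) hash lookup.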

Why is a dictionary inappropriate?

To see if a particular value is in the list, you may need to walk the entire list. With a dictionary (or another hash-based container) it's much quicker to narrow down the number of objects you need to compare against. The key (in your case, the number) is hashed, and the hash points the dictionary at a small subset of objects to compare against.
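
The narrowing step can be pictured with a toy bucket structure. This is only an illustration of the idea, not how <code>Dictionary</code> or <code>HashSet</code> are actually implemented; <code>ToyBucketSet</code> and its members are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Toy illustration only; the real framework collections differ in detail.
public class ToyBucketSet
{
    private readonly List<long>[] _buckets;

    public ToyBucketSet(int bucketCount)
    {
        _buckets = new List<long>[bucketCount];
        for (int i = 0; i < bucketCount; i++)
            _buckets[i] = new List<long>();
    }

    // Maps a value to a bucket index via its hash code.
    private int BucketIndex(long value)
    {
        int hash = value.GetHashCode() & 0x7FFFFFFF;  // clear the sign bit
        return hash % _buckets.Length;
    }

    public void Add(long value)
    {
        List<long> bucket = _buckets[BucketIndex(value)];
        if (!bucket.Contains(value))
            bucket.Add(value);
    }

    // Only the single bucket that the hash points to is scanned.
    public bool Contains(long value)
    {
        return _buckets[BucketIndex(value)].Contains(value);
    }
}

public static class ToyBucketSetDemo
{
    public static void Main()
    {
        var set = new ToyBucketSet(1024);
        for (long i = 0; i < 10000; i++)
            set.Add(i);

        Console.WriteLine(set.Contains(1234));  // True
        Console.WriteLine(set.Contains(-5));    // False
    }
}
```

With 10,000 values spread over 1,024 buckets, each lookup compares against roughly ten candidates instead of all 10,000.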

I'm using this in the Compact Framework, where there is no support for HashSet, so I have opted for a Dictionary in which both the key and the value are the string I am looking for.

It means I get <code>List&lt;T&gt;</code> functionality with dictionary performance. It's a bit hacky, but it works.
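
A minimal sketch of that workaround, assuming string values; <code>DictionaryAsSetDemo</code> and <code>AddValue</code> are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryAsSetDemo
{
    // Adds value to the "set"; returns false if it was already there.
    public static bool AddValue(Dictionary<string, string> set, string value)
    {
        if (set.ContainsKey(value))
            return false;
        set.Add(value, value);   // key and value are the same string
        return true;
    }

    public static void Main()
    {
        var seen = new Dictionary<string, string>();
        Console.WriteLine(AddValue(seen, "alpha"));    // True
        Console.WriteLine(AddValue(seen, "alpha"));    // False
        Console.WriteLine(seen.ContainsKey("beta"));   // False
    }
}
```

Only the keys matter for the lookup, so the stored value could be anything; reusing the string avoids allocating a separate object per entry.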

Could anyone explain to me why the generic <code>List.Contains()</code> function is so slow?

I have a <code>List&lt;long&gt;</code> with about a million numbers, and code that constantly checks whether a specific number is among them.

I tried doing the same thing using <code>Dictionary&lt;long, byte&gt;</code> and the <code>Dictionary.ContainsKey()</code> function, and it was about 10-20 times faster than with the List.

Of course, I don't really want to use Dictionary for that purpose, because it wasn't meant to be used that way.

So, the real question here is: is there any alternative to <code>List&lt;T&gt;.Contains()</code> that is not as wacky as <code>Dictionary&lt;K,V&gt;.ContainsKey()</code>?

List<T>.Contains() is very slow?
