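I'm looking for a really fast way to check a list of objects for duplicates.
I was thinking of simply looping through the list and doing a manual comparison, but I thought LINQ might provide a more elegant solution...
Suppose I have an object: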
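public class dupeCheckee
{
    public string checkThis { get; set; }
    public string checkThat { get; set; }

    // The constructor must be public; without a modifier it is private
    // and "new dupeCheckee(...)" will not compile from outside the class.
    public dupeCheckee(string val, string val2)
    {
        checkThis = val;
        checkThat = val2;
    }
}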
And I have a list of these objects:
List<dupeCheckee> dupList = new List<dupeCheckee>();
dupList.Add(new dupeCheckee("test1", "value1"));
dupList.Add(new dupeCheckee("test2", "value1"));
dupList.Add(new dupeCheckee("test3", "value1"));
dupList.Add(new dupeCheckee("test1", "value1"));//dupe
dupList.Add(new dupeCheckee("test2", "value1"));//dupe...
dupList.Add(new dupeCheckee("test4", "value1"));
dupList.Add(new dupeCheckee("test5", "value1"));
dupList.Add(new dupeCheckee("test1", "value2"));//not dupe
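I want to find the dupes in the list. When I find them, I need to run some additional logic, not necessarily remove them.
When I try to use LINQ, my GroupBy call fails with this error...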
'System.Collections.Generic.List<dupeCheckee>' does not contain a definition for 'GroupBy' and no extension method 'GroupBy' accepting a first argument of type 'System.Collections.Generic.List<dupeCheckee>' could be found (are you missing a using directive or an assembly reference?)
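This tells me I'm missing a library reference. I'm having a hard time figuring out which one, though.
Once I get that sorted out, how would I essentially check for both conditions, i.e. that checkThis and checkThat both occur more than once?
UPDATE: What I came up with
This is the LINQ query I came up with after some quick research...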
test.Count != test.Select(c => new { c.checkThat, c.checkThis }).Distinct().Count()
I'm not sure whether this is necessarily better than this answer:
var duplicates = test.GroupBy(x => new {x.checkThis, x.checkThat})
.Where(x => x.Skip(1).Any());
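I know I can put the first statement into an if/else clause. I also ran a quick test. The duplicates query gave me back 1 when I was expecting 0, but it did correctly call out the fact that one of the sets I used contained duplicates.
The other approach does exactly what I expect. Here are the data sets I used to test this.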
Dupes:
List<DupeCheckee> test = new List<DupeCheckee>{
new DupeCheckee("test0", "test1"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test1", "test2"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test2", "test3"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test3", "test3"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test0", "test5"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test1", "test6"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test2", "test7"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test3", "test8"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test0", "test5"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test1", "test1"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test2", "test2"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test3", "test3"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test4", "test4"),//{ checkThis = "test", checkThat = "test1"}
};
No dupes:
List<DupeCheckee> test2 = new List<DupeCheckee>{
new DupeCheckee("test0", "test1"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test1", "test2"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test2", "test3"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test3", "test3"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test4", "test5"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test5", "test6"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test6", "test7"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test7", "test8"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test8", "test5"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test9", "test1"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test2", "test2"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test3", "test3"),//{ checkThis = "test", checkThat = "test1"}
new DupeCheckee("test4", "test4"),//{ checkThis = "test", checkThat = "test1"}
};
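As a rough sketch, the two checks from the update can be put side by side like this. The `DupeChecks` class and its method names are made up for illustration, and the class is spelled `DupeCheckee` as in the test data. One thing worth noticing: `test2` as listed contains ("test3", "test3") twice (fourth and twelfth entries), which would explain a result of 1 where 0 was expected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DupeCheckee
{
    public string checkThis { get; set; }
    public string checkThat { get; set; }

    public DupeCheckee(string val, string val2)
    {
        checkThis = val;
        checkThat = val2;
    }
}

// Hypothetical helper class, not part of the original question.
public static class DupeChecks
{
    // True when at least one (checkThis, checkThat) pair occurs more than once.
    // Anonymous types compare by value, so Distinct() collapses equal pairs
    // and the two counts differ.
    public static bool HasDupesByDistinct(List<DupeCheckee> items)
    {
        return items.Count !=
               items.Select(c => new { c.checkThat, c.checkThis }).Distinct().Count();
    }

    // Number of distinct pairs that occur more than once.
    public static int CountDupeGroups(List<DupeCheckee> items)
    {
        return items.GroupBy(x => new { x.checkThis, x.checkThat })
                    .Count(g => g.Skip(1).Any());
    }
}
```

The first method only answers yes or no, while the second tells you how many distinct pairs are repeated, so the two are not interchangeable when you need to act on the duplicates.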
Posted on 2013-04-25 00:36:11
You need to reference System.Linq (e.g. using System.Linq).
Then you can do this:
var dupes = dupList.GroupBy(x => new {x.checkThis, x.checkThat})
.Where(x => x.Skip(1).Any());
This gives you groups containing all the duplicates.
The test for duplicates would then be:
var hasDupes = dupList.GroupBy(x => new {x.checkThis, x.checkThat})
.Where(x => x.Skip(1).Any()).Any();
Or even call ToList() or ToArray() to force evaluation of the result, so that you can both check whether there are dupes and examine them.
For example:
var dupes = dupList.GroupBy(x => new {x.checkThis, x.checkThat})
                   .Where(x => x.Skip(1).Any()).ToArray();
if (dupes.Any()) {
    foreach (var dupeGroup in dupes) {
        // Each group holds every item sharing the key, so Count() - 1 is the number of extra copies.
        Console.WriteLine(string.Format("checkThis={0},checkThat={1} has {2} duplicates",
                          dupeGroup.Key.checkThis,
                          dupeGroup.Key.checkThat,
                          dupeGroup.Count() - 1));
    }
}
Alternatively:
var dupes = dupList.Select((x, i) => new { index = i, value = x})
.GroupBy(x => new {x.value.checkThis, x.value.checkThat})
.Where(x => x.Skip(1).Any());
This gives you the groups, where each item in a group stores its original position in the property index and the item itself in the property value.
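To illustrate how the stored index might be used, here is a small sketch that reports the original list positions of each duplicated pair. `DupeReport.Describe` is a hypothetical helper, and the class is spelled `DupeCheckee` with a public constructor so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DupeCheckee
{
    public string checkThis { get; set; }
    public string checkThat { get; set; }

    public DupeCheckee(string val, string val2)
    {
        checkThis = val;
        checkThat = val2;
    }
}

// Hypothetical helper class for illustration only.
public static class DupeReport
{
    // Returns one line per duplicated (checkThis, checkThat) pair,
    // listing the positions at which the pair occurs in the input.
    public static List<string> Describe(IEnumerable<DupeCheckee> items)
    {
        return items
            .Select((x, i) => new { index = i, value = x })
            .GroupBy(x => new { x.value.checkThis, x.value.checkThat })
            .Where(g => g.Skip(1).Any())
            .Select(g => string.Format(
                "checkThis={0}, checkThat={1} at indexes {2}",
                g.Key.checkThis,
                g.Key.checkThat,
                string.Join(", ", g.Select(x => x.index))))
            .ToList();
    }
}
```

With the eight-item list from the question, this should report the pair test1/value1 at indexes 0 and 3 and the pair test2/value1 at indexes 1 and 4.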
Posted on 2016-05-03 01:48:24
There are plenty of working solutions, but I think the following one is more transparent and easier to understand than all of them:
var hasDuplicatedEntries = ListWithPossibleDuplicates
.GroupBy(YourGroupingExpression)
.Any(e => e.Count() > 1);
if(hasDuplicatedEntries)
{
// Do whatever you need when the list contains duplicates
}
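Since the question asks for speed, a different technique is worth sketching as well: instead of grouping the whole list, track the keys seen so far in a `HashSet<T>` and stop at the first repeat. This is not what the answer above does. `FastDupeCheck.HasDuplicates` is a made-up name, and whether it is measurably faster on your data is an assumption you would need to benchmark.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class for illustration only.
public static class FastDupeCheck
{
    // Returns true as soon as some key is produced for the second time,
    // without building groups for the rest of the sequence.
    public static bool HasDuplicates<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (!seen.Add(keySelector(item)))
            {
                return true;
            }
        }
        return false;
    }
}
```

For the list in the question, a call might look like `FastDupeCheck.HasDuplicates(dupList, x => new { x.checkThis, x.checkThat })`. The anonymous type works as a key because it compares by value.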
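Posted on 2015-03-13 00:12:24
I like using this to know when there are any duplicates at all. Let's say you have a string and want to know whether it contains any duplicate letters. This is what I use.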
string text = "this is some text";
var hasDupes = text.GroupBy(x => x).Any(grp => grp.Count() > 1);
If you want to know how many different duplicates there are, regardless of what they are, use this.
var totalDupeItems = text.GroupBy(x => x).Count(grp => grp.Count() > 1);
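For example, "this is some text" contains...
letter t, total: 3
letter i, total: 2
letter s, total: 3
letter e, total: 2
space character, total: 3
So the variable totalDupeItems would equal 5: four different duplicated letters plus the space, because GroupBy counts every character, including spaces.
If you want the total number of items that are duplicated, regardless of what the duplicates are, use this.
var totalDupes = text.GroupBy(x => x).Where(grp => grp.Count() > 1).Sum(grp => grp.Count());
So the variable totalDupes would be 13. This is the sum of the counts of every duplicated character (10 if you filter out the spaces first).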
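The counts are easy to get wrong by hand, because `GroupBy` on a string groups every character, spaces included. The sketch below wraps the two queries and adds a letters-only variant. The `CharDupes` class and its method names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical helper class for illustration only.
public static class CharDupes
{
    // Number of distinct characters that occur more than once.
    public static int DuplicatedKinds(string text)
    {
        return text.GroupBy(c => c).Count(g => g.Count() > 1);
    }

    // Total occurrences of all characters that occur more than once.
    public static int DuplicatedTotal(string text)
    {
        return text.GroupBy(c => c)
                   .Where(g => g.Count() > 1)
                   .Sum(g => g.Count());
    }

    // Same as DuplicatedKinds, but ignores anything that is not a letter
    // (spaces, digits, punctuation).
    public static int DuplicatedLetterKinds(string text)
    {
        return text.Where(c => char.IsLetter(c))
                   .GroupBy(c => c)
                   .Count(g => g.Count() > 1);
    }

    // Same as DuplicatedTotal, but ignores anything that is not a letter.
    public static int DuplicatedLetterTotal(string text)
    {
        return text.Where(c => char.IsLetter(c))
                   .GroupBy(c => c)
                   .Where(g => g.Count() > 1)
                   .Sum(g => g.Count());
    }
}
```

For "this is some text", the unfiltered versions count the three spaces as a duplicated character (5 kinds, 13 occurrences), while the letters-only versions give 4 kinds and 10 occurrences.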
https://stackoverflow.com/questions/16197290