Q: How to create an index file without adding same-named files found in multiple folders

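Stack Overflow user

Asked 2018-06-03 02:10:02

Answers: 2, Views: 42, Followers: 0, Votes: 0

How can I create an index file without adding a file name twice when files with the same name are found in multiple folders?

The code below searches a directory, collects every PDF in it, and writes the paths to a txt file. My problem is that if the same file is found in two different folders, it is added to my index.txt file twice, which causes problems when the index file is searched.

Here is my code: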

    public void createIndexedFileWithContentFromDirectory(string indexPDFDocumentName, string sourceDirectory, string fileExtension)
    {
        bool indexFileExists = File.Exists(indexPDFDocumentName);
        if (indexFileExists == false) {

            var files = Directory.EnumerateFiles(sourceDirectory, fileExtension, SearchOption.AllDirectories);

            File.WriteAllLines(indexPDFDocumentName, files.Select(x => System.IO.Path.GetFileNameWithoutExtension(x) + "=" + x).ToArray());

        }
    }

The index file looks like this:

myfile1=C:\Folder1\myfile1.PDF

myfile2=C:\Folder2\myfile2.PDF

myfile3=C:\Folder3\myfile3.PDF

myfile1=C:\Folder4\myfile1.PDF

...

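Note that myfile1 is added twice because it exists in two different folders. What I would like to do is ignore a file if its name has already been found, so that the index file contains only the location of the first file found.

Like this: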

myfile1=C:\Folder1\myfile1.PDF

myfile2=C:\Folder2\myfile2.PDF

myfile3=C:\Folder3\myfile3.PDF

myfile4=C:\Folder4\myfile4.PDF

...

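What is the best way to filter the results so that only the first file found is added to the index file, even if that file exists in multiple directories?

Edit: Here is my solution. It may not be the most efficient, but it works well.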

    public void createIndexedFileWithContentFromDirectory(string indexPDFDocumentName, string sourceDirectory, string fileExtension)
    {

        bool indexFileExists = File.Exists(indexPDFDocumentName);
        if (indexFileExists == false) {

            var allFiles = Directory.EnumerateFiles(sourceDirectory, fileExtension, SearchOption.AllDirectories);

            string[] allFilesArray = allFiles.Select(x => System.IO.Path.GetFileNameWithoutExtension(x) + "=" + x).ToArray();

            /// This dictionary is created from the above array and it's used for filtering duplicates
            var dictionaryFromArray = new Dictionary<string, string>();
            dictionaryFromArray = allFilesArray.Select(s => s.Split('=')).GroupBy(a => a[0].ToUpper()).ToDictionary(e => e.Key, v => v.Select(a => a[1]).First());

            File.WriteAllLines(indexPDFDocumentName, dictionaryFromArray.Select(z => z.Key + "=" + z.Value).ToArray());

            MessageBox.Show("Indexing Complete");
        }
    }
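
A possible variation on the solution above, offered as a minimal sketch rather than a definitive implementation. It filters duplicates with a `HashSet<string>` while enumerating, so the `name=path` lines never have to be split on `=` again (a split that would misbehave for a path that itself contains `=`), and the original casing of the file name is kept in the output. The class name `IndexBuilder` and the method name `BuildIndexLines` are hypothetical, and the case-insensitive comparison is an assumption carried over from the `ToUpper()` call.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class IndexBuilder // hypothetical helper, not part of the original post
{
    // Returns "name=path" lines, keeping only the first path seen for each file name.
    public static List<string> BuildIndexLines(IEnumerable<string> paths)
    {
        // Assumption: names are compared case-insensitively, mirroring ToUpper() above.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var lines = new List<string>();

        foreach (var path in paths)
        {
            var name = Path.GetFileNameWithoutExtension(path);

            // HashSet.Add returns false when the name is already present,
            // so later files with the same name are skipped.
            if (seen.Add(name))
            {
                lines.Add(name + "=" + path);
            }
        }

        return lines;
    }
}
```

Inside the original method this could be called as `File.WriteAllLines(indexPDFDocumentName, IndexBuilder.BuildIndexLines(Directory.EnumerateFiles(sourceDirectory, fileExtension, SearchOption.AllDirectories)));`. Which duplicate counts as "first" then depends on the order in which `Directory.EnumerateFiles` returns the paths.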

2 Answers

Stack Overflow user

Posted 2018-06-03 02:23:42

Just making an assumption here (that the other files with the same name should be skipped):

var files = new DirectoryInfo(@"d:\temp")
            .EnumerateFiles("*.*", SearchOption.AllDirectories)
            .GroupBy(x => x.Name)
            .Select(x => x.First().FullName)
            .ToArray();

Votes: 1

Stack Overflow user

Posted 2018-06-04 09:33:02

//Assuming you get a list of filepaths as input.
List<string> filePathList = new List<string>()
{
    @"myfile1 = C:\Folder1\myfile1.PDF",
    @"myfile2 = C:\Folder2\myfile2.PDF",
    @"myfile3 = C:\Folder3\myfile3.PDF",
    @"myfile1 = C:\Folder4\myfile1.PDF"
};

//Group the files based on filenames (i.e Substring after the last '\' in their path)
//and select the "First" path of each group and ignore duplicates.
var uniqueFilePaths = filePathList.GroupBy(x => x.Split('\\').Last())
    .Select(x => x.First())
    .ToList();

/*  Output:
 *  "myfile1 = C:\Folder1\myfile1.PDF",
 *  "myfile2 = C:\Folder2\myfile2.PDF",
 *  "myfile3 = C:\Folder3\myfile3.PDF",
 */

The idea is simple: group by file name and keep the first (or the last) entry of each group. I also suggest taking a look at a similar question I answered.
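
The choice between keeping the first and the last entry of each group can be made explicit, as in the sketch below. It is illustrative only and is not the answerer's code: the names `IndexDeduplicator`, `Deduplicate`, `FileNameOf`, and `keepLast` are hypothetical, and comparing file names case-insensitively is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexDeduplicator // hypothetical name, for illustration only
{
    // Extracts the file name: everything after the last '\' or '/' in the entry.
    // If no separator is present, the whole entry is used as the key.
    private static string FileNameOf(string entry)
    {
        int lastSeparator = entry.LastIndexOfAny(new[] { '\\', '/' });
        return entry.Substring(lastSeparator + 1);
    }

    // Groups entries by file name and keeps one entry per group.
    public static List<string> Deduplicate(IEnumerable<string> entries, bool keepLast = false)
    {
        return entries
            // Assumption: file names are compared case-insensitively.
            .GroupBy(entry => FileNameOf(entry), StringComparer.OrdinalIgnoreCase)
            .Select(group => keepLast ? group.Last() : group.First())
            .ToList();
    }
}
```

With the four sample entries above, `Deduplicate(filePathList)` keeps the `Folder1` entry for `myfile1.PDF`, while `Deduplicate(filePathList, keepLast: true)` keeps the `Folder4` entry instead.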

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/50659791
