
Reading HTML from a tab-delimited text file into a string with C#

Stack Overflow user
Asked 2019-01-03 12:18:03
Answers: 1 · Views: 349 · Followers: 0 · Votes: 0

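I am trying to read HTML from a tab-delimited text file, write it out as an HTML file, and then convert that file to PDF. When I read the text file, I get strange characters such as ‘ and a few others. Here is my code: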

        var lines = System.IO.File.ReadAllLines(@"C:\temp\Laura.txt", Encoding.GetEncoding("Windows-1255"));
        var csv = lines.Select(x =>
        {
            var parts = x.Split('\t');
            return new Articles()
            {
                id = parts[0].Trim(),
                name = parts[1].Trim(),
                body = parts[2].Trim(),
                //body = WebUtility.HtmlDecode(parts[2].Trim()),
                //body = HttpUtility.HtmlEncode(parts[2].Trim()),
                //body = WebUtility.HtmlEncode(parts[2].Trim()),
                //body = SecurityElement.Escape(parts[2].Trim()),
            };
        }).ToList();
        foreach (var item in csv)
        {
            string id = item.name;
            string filename = item.name + ".html";
            string body = item.body;
            string path = @"c:\temp\" + filename;

            // This text is added only once to the file.
            if (!File.Exists(path))
            {
                // Create a file to write to.
                File.WriteAllText(path, body);
                Microsoft.Office.Interop.Word.Application ap = new Microsoft.Office.Interop.Word.Application();
                Document document = ap.Documents.Open(path);

                object oFalse = false;
                object oTrue = true;
                object OutputFileName = Path.Combine(
                Path.GetDirectoryName(path),
                Path.GetFileNameWithoutExtension(path) + ".pdf");
                object missing = System.Reflection.Missing.Value;
                document.PrintOut(
                oTrue,          // Background
                oFalse,         // Append
                ref missing,    // Range
                OutputFileName, // OutputFileName
                ref missing,    // From
                ref missing,    // To
                ref missing,    // Item
                ref missing,    // Copies
                ref missing,    // Pages
                ref missing,    // PageType
                ref missing,    // PrintToFile
                ref missing,    // Collate
                ref missing,    // ActivePrinterMacGX
                ref missing,    // ManualDuplexPrint
                ref missing,    // PrintZoomColumn
                ref missing,    // PrintZoomRow
                ref missing,    // PrintZoomPaperWidth
                ref missing     // PrintZoomPaperHeight
                );
            }
        }

I have already tried the commented-out lines, but nothing seems to work.

1 Answer

Stack Overflow user

Accepted answer

Posted 2019-01-04 00:31:47

Try this:

var lines = System.IO.File.ReadAllLines(@"C:\temp\Laura.txt",  Encoding.GetEncoding("Windows-1255"));
var csv = lines.Select(x =>
{
    var parts = x.Split('\t');
    return new Articles()
    {
        id = parts[0].Trim(),
        name = parts[1].Trim(),
        body = parts[2].Trim(),
    };
}).ToList();
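
Before settling on `Windows-1255`, it may help to check what encoding the file really uses. The sketch below is not part of the original answer: `EncodingProbe` and `LooksLikeUtf8` are hypothetical names, and it rests on the assumption that the stray characters come from a UTF-8 file being decoded with a single-byte code page. It decodes the raw bytes with a strict UTF-8 decoder, which throws on malformed input instead of silently substituting replacement characters.

```csharp
using System;
using System.Text;

public static class EncodingProbe
{
    // Returns true when the bytes decode cleanly as UTF-8.
    // Note: pure ASCII content also returns true, since ASCII is valid UTF-8.
    public static bool LooksLikeUtf8(byte[] bytes)
    {
        // throwOnInvalidBytes: true makes the decoder throw on malformed
        // sequences rather than replacing them with U+FFFD.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

A call such as `EncodingProbe.LooksLikeUtf8(File.ReadAllBytes(@"C:\temp\Laura.txt"))` returning true would suggest reading the file with `Encoding.UTF8` instead; a false result only rules out UTF-8 and does not confirm any particular code page.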

Then try exporting with wdExportFormatPDF:

var lines = System.IO.File.ReadAllText(@"1.html", Encoding.GetEncoding("Windows-1255"));
var path = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"2.html");
var app = new Microsoft.Office.Interop.Word.Application();
var doc = app.Documents.Open(path, false);
var OutputFileName = Path.Combine(
                          Path.GetDirectoryName(path),
                          Path.GetFileNameWithoutExtension(path)+
                          ".pdf");
doc.ExportAsFixedFormat(OutputFileName, WdExportFormat.wdExportFormatPDF);

Here is the complete code:

static void connvert()
{
    var lines =
        File.
        ReadAllLines
        (@"C:\temp\Laura.txt",
            Encoding.GetEncoding("Windows-1255")
        );

    var csv = lines.Select(x =>
    {
        var parts = x.Split('\t');
        return new Articles()
        {
            id = parts[0].Trim(),
            name = parts[1].Trim(),
            body = parts[2].Trim(),
        };
    }).ToList();



    foreach (var item in csv)
    {
        string id = item.id;
        string filename = item.name + ".html";
        string body = item.body;
        string path = @"c:\temp\" + filename;

        // This text is added only once to the file.
        if (!File.Exists(path))
        {
            // Create a file to write to.
            // File.WriteAllText(path, body);
            File.WriteAllText(path, body, Encoding.Unicode); // try this first
            // File.WriteAllText(path, body, Encoding.GetEncoding("Windows-1255")); // then try this

            var app = new Application();
            var doc = app.Documents.Open(path, false);
            var OutputFileName =
                Path.Combine(
                    Path.GetDirectoryName(path),
                    Path.GetFileNameWithoutExtension(path) +
                    ".pdf");
            doc.ExportAsFixedFormat
                (OutputFileName,
                    WdExportFormat.wdExportFormatPDF
                );
        }
    }
}
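
One difference between the plain `File.WriteAllText(path, body)` call and the `Encoding.Unicode` overload is that the latter writes UTF-16 little-endian with a byte-order mark (`FF FE`), which gives a consumer something to detect the encoding from. The sketch below only illustrates that difference; `HtmlWriter`, `WriteWithBom`, and `StartsWithUtf16Bom` are hypothetical names, and whether the BOM is what fixes the rendering in Word is an assumption that has not been verified here.

```csharp
using System;
using System.IO;
using System.Text;

public static class HtmlWriter
{
    // Writes the HTML as UTF-16 LE. Encoding.Unicode emits a byte-order
    // mark (FF FE) at the start of a newly created file.
    public static void WriteWithBom(string path, string html)
    {
        File.WriteAllText(path, html, Encoding.Unicode);
    }

    // Returns true when the file begins with the UTF-16 LE byte-order mark.
    public static bool StartsWithUtf16Bom(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE;
    }
}
```

Checking `HtmlWriter.StartsWithUtf16Bom(path)` on one of the generated `.html` files is a quick way to confirm which overload actually produced it.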
Votes: 0

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/54016282
