Q: Changing PDF annotations with JavaScript code into regular go-to-page actions

Code Review user

Asked on 2017-01-31 22:48:19

2 answers · 449 views · 0 followers · 3 votes

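I have just written my first program in C#. Please explain:

Have I used any bad practices?
How can the code be improved?

I would be very grateful if you could provide modified code with some comments, so that I can see where the problems are and do some further reading to correct my mistakes. The program works fine.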

using System;
using System.IO;
using System.Text.RegularExpressions;
using iTextSharp.text.pdf;

namespace pdfStamperMemory
{
    class Program
    {
        static void Main(string[] args)
        {
            // searching for JS based on: http://stackoverflow.com/a/41386971/2657875
            // MemoryStream based on: http://stackoverflow.com/a/23738927/2657875

            byte[] bytes;
            string script;
            string input = Path.GetFullPath(args[0]);
            string output = Path.Combine(Path.GetDirectoryName(input), Path.GetFileNameWithoutExtension(input) + "-itext.pdf");

            using (var ms = new MemoryStream())
            {
                using (var reader = new PdfReader(input))
                {
                    using (var stamper = new PdfStamper(reader, ms))
                    {
                        // get all page labels
                        string[] labels = PdfPageLabels.GetPageLabels(reader);
                        string[] arr = new string[labels.Length];

                        for (int i = 0; i < labels.Length; i++)
                        {
                            arr[i] += labels[i];
                            if ((arr[0].Equals("Cover")) && i >= 1)
                            {
                                arr[i] = arr[i].Remove(0, 5);
                            }
                        }

                        for (int i = 1; i <= reader.NumberOfPages; i++)
                        {
                            // Get a PDF page
                            PdfDictionary page = reader.GetPageN(i);

                            // Get all the annotations of page i
                            PdfArray annotsArray = page.GetAsArray(PdfName.ANNOTS);

                            // If page does not have annotations
                            if (annotsArray == null)
                            {
                                continue;
                            }

                            // For each annotation
                            for (int j = 0; j < annotsArray.Size; ++j)
                            {
                                // For current annotation
                                PdfDictionary curAnnot = annotsArray.GetAsDict(j);

                                // check if has JS
                                PdfDictionary annotAction = curAnnot.GetAsDict(PdfName.A);
                                if (annotAction == null)
                                {
                                    Console.Write("Page {0} annotation {1}: no action\n", i, j);
                                }

                                // test if it is a JavaScript action
                                else if (PdfName.JAVASCRIPT.Equals(annotAction.Get(PdfName.S)))
                                {
                                    PdfObject scriptObject = annotAction.GetDirectObject(PdfName.JS);
                                    if (scriptObject == null)
                                    {
                                        continue;
                                    }
                                    if (scriptObject.IsString())
                                        script = ((PdfString)scriptObject).ToUnicodeString();
                                    else if (scriptObject.IsStream())
                                    {
                                        using (MemoryStream stream = new MemoryStream())
                                        {
                                            ((PdfStream)scriptObject).WriteContent(stream);
                                            script = stream.ToString();
                                        }
                                    }
                                    else
                                    {
                                        Console.WriteLine("Page {0} annotation {1}: malformed JS entry\n", i, j);
                                        continue;
                                    }
                                    if (script.Contains("if (this.hostContainer"))
                                    {
                                        Regex regex = new Regex(@"pp_(.*)'");
                                        Match text2search = regex.Match(script);
                                        if (text2search.Success)
                                        {
                                            //this is a page *label*, but it needs a *number*
                                            //to use PdfAction.GotoLocalPage
                                            string pageLabel = text2search.Groups[1].Value;

                                            // get index of a page label                                            
                                            int labelIndex = Array.IndexOf(arr, pageLabel);
                                            // replace JS with GotoLocalPage
                                            if (labelIndex != -1)
                                            {
                                                // ++ because Array.IndexOf is zero-based
                                                labelIndex++;
                                                PdfAction action = PdfAction.GotoLocalPage(labelIndex, new PdfDestination(PdfDestination.XYZ, 0, reader.GetPageSize(labelIndex).Height, 1.25f), stamper.Writer);
                                                curAnnot.Put(PdfName.A, action);
                                            }

                                        }
                                    }
                                }
                            }
                        }
                        stamper.SetFullCompression();
                    }
                }
                // grab the bytes before closing things out
                bytes = ms.ToArray();
            }
            try
            {
                File.WriteAllBytes(output, bytes);
                Console.WriteLine("Done!");
            }
            catch
            {
                Console.WriteLine("Cannot save the file!");
            }
            finally
            {
                Console.ReadKey();
            }
        }
    }
}

pdf

2 Answers

Code Review user

Accepted answer

Posted on 2017-02-01 06:37:57

Good things

You are using the using statement to properly dispose of objects that implement IDisposable.
You are using braces {} even where they would be optional.

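Things to improve

Comments should describe why something is done the way it is done, for example by pointing out edge cases or explaining why a certain solution was chosen. Let the code itself tell the reader what is done by using meaningful, descriptive names for variables, methods and classes.
Names shouldn't involve abbreviations. For example, look at these lines:

    // For current annotation
    PdfDictionary curAnnot = annotsArray.GetAsDict(j);

Why not name it currentAnnotation? That would make the comment superfluous.
Variables should be declared as close to their usage as possible. Having, for example, byte[] bytes; at the top of the method but using it only at the bottom isn't optimal.
If the right-hand side of an assignment makes the type of the object clear, you should consider using var.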
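The last two points might look like this in practice. This is only a minimal sketch: DeclarationExample and CopyToArray are made-up names for illustration and are not part of the reviewed program.

```csharp
using System.IO;

static class DeclarationExample
{
    // Hypothetical helper, used only to illustrate the two points above.
    public static byte[] CopyToArray(byte[] source)
    {
        // var is fine here: the right-hand side already names the type.
        using (var buffer = new MemoryStream())
        {
            buffer.Write(source, 0, source.Length);

            // Declared right where it is needed, not at the top of the method.
            byte[] result = buffer.ToArray();
            return result;
        }
    }
}
```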

Now let's dig into the code.

    using (var ms = new MemoryStream())
    {
        using (var reader = new PdfReader(input))
        {
            using (var stamper = new PdfStamper(reader, ms))
            {

Using using is very good, but in this case it could and should be improved by stacking the usings like so

        using (var ms = new MemoryStream())
        using (var reader = new PdfReader(input))
        using (var stamper = new PdfStamper(reader, ms))
        {

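This saves you two levels of indentation, so you won't need to scroll horizontally to see all of the code.

You have one big method with everything stuffed into it. You should check which parts of the code can easily be extracted into methods of their own.

For example, this

    // get all page labels
    string[] labels = PdfPageLabels.GetPageLabels(reader);
    string[] arr = new string[labels.Length];

    for (int i = 0; i < labels.Length; i++)
    {
        arr[i] += labels[i];
        if ((arr[0].Equals("Cover")) && i >= 1)
        {
            arr[i] = arr[i].Remove(0, 5);
        }
    }

could become a method like so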

private static string[] GetPageLabels(PdfReader reader)
{
    string[] labels = PdfPageLabels.GetPageLabels(reader);
    if (!labels[0].Equals("Cover"))
    {
        return labels;
    }
    for (int i = 1; i < labels.Length; i++)
    {
        labels[i] = labels[i].Remove(0, 5);
    }  
    return labels;
}

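and your main method becomes shorter.

In general, you should use more and shorter methods. That makes the code easier to read and understand, and if you are hunting for a bug it will be easier to find.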
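Another candidate for extraction is the label lookup near the end of the inner loop. Here is a rough sketch of how that could look. PageLabelLookup and TryGetPageNumber are names I made up, and the helper deliberately has no iTextSharp dependency, so it can be checked on its own.

```csharp
using System;

static class PageLabelLookup
{
    // Hypothetical helper: maps a page label to a 1-based page number.
    // Returns false when the label is not among the known labels.
    public static bool TryGetPageNumber(string[] pageLabels, string pageLabel, out int pageNumber)
    {
        // Array.IndexOf is zero-based, while PDF page numbers start at 1.
        int index = Array.IndexOf(pageLabels, pageLabel);
        pageNumber = index + 1;
        return index >= 0;
    }
}
```

The caller would then only build the go-to action when the method returns true, which replaces the manual labelIndex != -1 check and the labelIndex++ adjustment.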

    PdfDictionary annotAction = curAnnot.GetAsDict(PdfName.A);
    if (annotAction == null)
    {
        Console.Write("Page {0} annotation {1}: no action\n", i, j);
    }

    // test if it is a JavaScript action
    else if (PdfName.JAVASCRIPT.Equals(annotAction.Get(PdfName.S)))

This is a big no-go. Don't place comments between an if and an else if. You, or Sam the maintainer, won't be able to see at first glance that the if and the else if belong together.
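If the comment is still wanted, it can move inside the branch it describes. A minimal sketch with plain strings instead of the iTextSharp types (ActionKinds and Describe are hypothetical names, not taken from the question):

```csharp
static class ActionKinds
{
    // Hypothetical stand-in for the annotation check in the question.
    public static string Describe(string actionType)
    {
        if (actionType == null)
        {
            return "no action";
        }
        else if (actionType == "JavaScript")
        {
            // The comment lives inside the branch, so the if and the
            // else if stay visually attached to each other.
            return "JavaScript action";
        }

        return "other action";
    }
}
```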

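If you use the same regular expression inside a loop, as with Regex regex = new Regex(@"pp_(.*)'");, you should consider creating the regex outside of the loop, using the overloaded constructor that takes RegexOptions.Compiled. From the documentation:

Specifies that the regular expression is compiled to an assembly. This yields faster execution but increases startup time.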
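One way to do that is a static readonly field, so the pattern is built once for the whole run. This is a sketch only: ScriptParser and ExtractPageLabel are names I chose, and the pattern is copied unchanged from the question.

```csharp
using System.Text.RegularExpressions;

static class ScriptParser
{
    // Built once and reused for every annotation instead of once per iteration.
    private static readonly Regex PageLabelPattern =
        new Regex(@"pp_(.*)'", RegexOptions.Compiled);

    // Returns the captured page label, or null when the script does not match.
    public static string ExtractPageLabel(string script)
    {
        Match match = PageLabelPattern.Match(script);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Whether RegexOptions.Compiled pays off depends on how many annotations are processed; for a handful of matches the extra startup cost may outweigh the gain, so hoisting the regex out of the loop is the more important half of the advice.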

Votes: 3

Code Review user

Posted on 2017-02-01 11:33:21
