Using iTextSharp to save to file the fonts used in a PDF file

This is pretty much a duplicate of this unanswered question, but hopefully someone in the know is watching now and can be helpful.

I'm looking for a way to have some .NET code extract the fonts embedded in a PDF and save each one to a font file. I'm currently using iTextSharp, but I'm open to other .NET libraries (e.g. PDFBox, PDF Clown, etc.). I'm able to iterate over the information returned by BaseFont.GetDocumentFonts(), but I'm not clear on how to stream a font out to a font file.

Thanks, Kenny


@Highmastdon - getting the font names is actually really simple, at least in iText/iTextSharp (and in pdfBox as well, but I don't have that code around right now). In iTextSharp you would do the following:

PdfReader reader = new PdfReader(strFileName);
List<object[]> strFonts = BaseFont.GetDocumentFonts(reader);

And there it is. Most libraries have built-in support for simple extraction of fonts (the names, in any case).
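To actually read the names out of that list, something like the sketch below may help. It assumes each `object[]` entry holds the font name as a string at index 0 (with an indirect reference to the font dictionary at index 1), which is how iTextSharp 5 appears to populate it; `FontListing` and `FontNames` are hypothetical helper names, not part of iTextSharp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FontListing
{
    // Assumption: entry[0] is the font name (string) and entry[1] is the
    // indirect reference to the font dictionary, as in iTextSharp 5.
    public static List<string> FontNames(IEnumerable<object[]> documentFonts)
    {
        return documentFonts
            .Where(entry => entry != null && entry.Length > 0)
            .Select(entry => Convert.ToString(entry[0]))
            .Distinct()   // the same name can appear under several references
            .ToList();
    }
}
```

Usage would then be along the lines of `foreach (string name in FontListing.FontNames(BaseFont.GetDocumentFonts(reader))) Console.WriteLine(name);`.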


I contributed a response before, but in the interest of adding solid examples to topics on this site (something I dreadfully needed three months ago), I will walk through the solution I ended up using.

I downloaded MuPDF and took the file mutool.exe from its bin folder. I then call it as a separate process from C#. It pulls out all of the fonts embedded in the PDF file and dumps them in the folder containing mutool.exe. After that it was just a matter of moving the fonts from there to the folder I wanted them in.

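        // Requires: using System.Diagnostics; using System.IO;

        /// <summary>
        /// Extract all fonts from a PDF by running "mutool extract" as a child process.
        /// </summary>
        /// <param name="strPDFName">Path of the PDF file to process.</param>
        public static void ExtractAll(string strPDFName)
        {
            // strMUTOOL (path to mutool.exe) and strFontFinal (presumably the
            // destination folder) must be defined elsewhere in the class.
            if (strMUTOOL != null && strFontFinal != null)
            {
                using (Process p = new Process())
                {
                    p.StartInfo.FileName = strMUTOOL;
                    p.StartInfo.Arguments = "extract \"" + strPDFName + "\"";
                    p.StartInfo.UseShellExecute = false;
                    p.StartInfo.RedirectStandardError = true;
                    p.StartInfo.RedirectStandardOutput = true;
                    p.StartInfo.CreateNoWindow = true;
                    // The extracted files end up in the working directory.
                    p.StartInfo.WorkingDirectory = Path.GetDirectoryName(strMUTOOL);

                    p.Start();

                    // Drain both streams before waiting; reading only after
                    // WaitForExit can deadlock once a redirected buffer fills up.
                    var standardErrorTask = p.StandardError.ReadToEndAsync();
                    var standardOutput = p.StandardOutput.ReadToEnd();
                    p.WaitForExit();

                    var standardError = standardErrorTask.Result;
                    var exitCode = p.ExitCode;
                }
            }
        }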
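The final "move the fonts" step could be sketched roughly as follows. This is only an illustration: `FontMover` and `MoveExtractedFonts` are hypothetical names, and the default `font-*` search pattern is an assumption about how mutool names its output files, so check what your version actually writes and adjust the pattern.

```csharp
using System;
using System.IO;

public static class FontMover
{
    // Moves files matching 'pattern' from sourceDir to targetDir and
    // returns how many files were moved. The default pattern is an
    // assumption about mutool's output naming, not a documented guarantee.
    public static int MoveExtractedFonts(string sourceDir, string targetDir,
                                         string pattern = "font-*")
    {
        Directory.CreateDirectory(targetDir); // no-op if it already exists
        int moved = 0;
        foreach (string path in Directory.GetFiles(sourceDir, pattern))
        {
            string destination = Path.Combine(targetDir, Path.GetFileName(path));
            if (File.Exists(destination))
            {
                File.Delete(destination); // overwrite leftovers from earlier runs
            }
            File.Move(path, destination);
            moved++;
        }
        return moved;
    }
}
```

With the fields from the method above, a call might look like `FontMover.MoveExtractedFonts(Path.GetDirectoryName(strMUTOOL), strFontFinal);`.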

As a bit of a heads up, most of these fonts are CFF files, and you will need to convert them if you plan on using them. Also, as has been stated, using these fonts may constitute software piracy if they are paid fonts. Finally, these fonts are usually only subsets and do not contain the complete glyph set - just the glyphs used in the PDF.


I didn't get an answer, but I did find several vendor-based solutions. pdfextract.exe, from pdf-tools.com, works very well. The library from quickpdflibrary.com also works very well; it is the vendor we went with, and so far we are very happy.
