Convert PDF to JPG or PNG using C# or Command Line

I need to convert a PDF file to images. For testing purposes I used "Total PDF Converter", which offers a command line, but it is shareware, so I need to find a free alternative.

Does anyone know of such a tool, or maybe even a free C# library?


The convert tool (or magick since version 7) from the ImageMagick bundle can do this (and a whole lot more).

In its simplest form, it's just

convert myfile.pdf myfile.png

or

magick myfile.pdf myfile.png
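For multi-page PDFs or sharper output, a slightly longer invocation may help. The following is only a sketch: the function name pdf_to_png and the 150 DPI default are assumptions for illustration, not part of ImageMagick. It relies on the -density option, which has to come before the input file so that it applies while the PDF is rasterized, and on a %03d counter in the output name so that each page gets its own file.

```shell
#!/bin/sh
# Sketch: rasterize every page of a PDF with ImageMagick.
# pdf_to_png is a hypothetical helper name; ImageMagick and Ghostscript
# are assumed to be installed.
pdf_to_png() {
    input=$1
    dpi=${2:-150}              # assumed default resolution
    prefix=${input%.pdf}       # myfile.pdf -> myfile

    # Prefer "magick" (ImageMagick 7), fall back to "convert" (version 6).
    if command -v magick >/dev/null 2>&1; then
        tool=magick
    else
        tool=convert
    fi

    # -density precedes the input so it is used while reading the PDF.
    # %03d becomes a zero-padded page counter in each output file name.
    "$tool" -density "$dpi" "$input" "$prefix-%03d.png"
}

# Usage: pdf_to_png myfile.pdf 300
```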


A Ghostscript answer is missing, and nothing here covers multi-page PDF export yet, so I think adding another variant is OK.

gs -dBATCH -dNOPAUSE -sDEVICE=pnggray -r300 -dUseCropBox -sOutputFile=item-%03d.png examples.pdf

Options description:

  • -dBATCH and -dNOPAUSE tell gs to run in batch mode, which more or less means it will not ask any questions. These parameters are also important if you want to run the command in a shell script.
  • -sDEVICE tells gs what output format to produce. pnggray is for grayscale, png16m for 24-bit RGB color. If you insist on creating JPEGs, use -sDEVICE=jpeg to produce color JPEG files. Use the -dJPEGQ=N parameter (N is an integer from 0 to 100, default 75) to control the JPEG quality.
  • -r300 sets the rendering resolution to 300 dpi. If you prefer smaller output files use -r70, or if your input PDF has a high resolution use -r600. If the images in your PDF are 300 dpi and you specify -r600, they will be upscaled.
  • -dUseCropBox tells gs to use a CropBox if one is defined. A CropBox specifies an area of interest on a page. If you have a PDF with a large white margin and you don't want this margin in your output, this option might help.
  • -sOutputFile defines the name(s) of the output file(s). The %03d part tells gs to include a counter for multiple files. A two-page PDF would result in two files named item-001.png and item-002.png.
  • The last (unnamed) parameter is the input file.
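The options above can be combined into a small wrapper for the JPEG case. This is a minimal sketch, not a definitive script: the function name pdf_to_jpeg, the argument order, and the defaults (300 dpi, quality 75) are assumptions chosen for illustration. Only the gs options already described above are used.

```shell
#!/bin/sh
# Sketch: convert every page of a PDF to color JPEGs with Ghostscript.
# pdf_to_jpeg is a hypothetical helper name; gs must be on the PATH.
pdf_to_jpeg() {
    input=$1                   # input PDF
    prefix=$2                  # output name prefix, e.g. "item"
    dpi=${3:-300}              # rendering resolution (assumed default)
    quality=${4:-75}           # JPEG quality, 0 to 100

    # One output file per page: prefix-001.jpg, prefix-002.jpg, ...
    gs -dBATCH -dNOPAUSE -sDEVICE=jpeg -dJPEGQ="$quality" -r"$dpi" \
       -dUseCropBox -sOutputFile="$prefix-%03d.jpg" "$input"
}

# Usage: pdf_to_jpeg examples.pdf item 150 90
```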

Availability: ImageMagick's convert command uses the gs command internally to read PDFs. If you can convert a PDF with ImageMagick, you already have gs installed.

Install ghostscript:

RHEL:

yum install ghostscript

SLES:

zypper install ghostscript

Debian/Ubuntu:

sudo apt-get install ghostscript

Windows:

You can find Windows binaries under http://www.ghostscript.com/download/gsdnld.html


I found this solution, which worked for me: https://github.com/jhabjan/Ghostscript.NET. It is also available as a NuGet package.

Here is sample code for converting all PDF pages into PNG images:

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using Ghostscript.NET;
    using Ghostscript.NET.Rasterizer;

    static class PdfToPng
    {
        // Renders every page of test.pdf to Page-N.png in the current directory.
        private static void Test()
        {
            // gsdll64.dll is expected next to the executable (64-bit process).
            var localGhostscriptDll = Path.Combine(Environment.CurrentDirectory, "gsdll64.dll");
            var localDllInfo = new GhostscriptVersionInfo(localGhostscriptDll);

            int desiredXDpi = 96;
            int desiredYDpi = 96;

            string inputPdfPath = "test.pdf";
            string outputPath = Environment.CurrentDirectory;

            // Dispose the rasterizer even if rendering a page throws.
            using (var rasterizer = new GhostscriptRasterizer())
            {
                rasterizer.Open(inputPdfPath, localDllInfo, false);

                // Page numbers are 1-based.
                for (int pageNumber = 1; pageNumber <= rasterizer.PageCount; pageNumber++)
                {
                    string pageFilePath = Path.Combine(outputPath, "Page-" + pageNumber + ".png");

                    using (Image img = rasterizer.GetPage(desiredXDpi, desiredYDpi, pageNumber))
                    {
                        img.Save(pageFilePath, ImageFormat.Png);
                    }
                }
            }
        }
    }


The answer by @Thomas didn't work in my case. I guess it works only if you have images in your PDF. What worked for me was pdftoppm (source: https://askubuntu.com/a/50180/37527):

pdftoppm input.pdf outputname -png

This will output each page in the PDF using the format outputname-01.png, with 01 being the index of the page.

Converting a single page of the PDF

pdftoppm input.pdf outputname -png -f {page} -singlefile

Change {page} to the page number. It's indexed at 1, so -f 1 would be the first page.

Specifying the converted image's resolution

The default resolution for this command is 150 DPI. Increasing it will result in both a larger file size and more detail.

To increase the resolution of the converted PDF, add the options -rx {resolution} and -ry {resolution}. For example:

pdftoppm input.pdf outputname -png -rx 300 -ry 300
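To convert a whole folder of PDFs in one go, the command can be wrapped in a loop. The sketch below is an illustration under stated assumptions: the function name convert_all is made up, and it uses pdftoppm's -r option, which sets the X and Y resolution together, instead of separate -rx and -ry values. Each PDF's own name (without the extension) is used as the output prefix.

```shell
#!/bin/sh
# Sketch: convert every PDF in a directory to PNG pages with pdftoppm.
# convert_all is a hypothetical helper name; pdftoppm (poppler-utils)
# is assumed to be installed.
convert_all() {
    dir=$1
    dpi=${2:-150}              # 150 DPI is pdftoppm's own default

    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$pdf" ] || continue

        # -r sets both the X and Y resolution; the output prefix is the
        # PDF path without its .pdf extension.
        pdftoppm -png -r "$dpi" "$pdf" "${pdf%.pdf}"
    done
}

# Usage: convert_all ./scans 300
```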


You may want to check out this free solution:

http://www.codeproject.com/Articles/32274/How-To-Convert-PDF-to-Image-Using-Ghostscript-API

It easily converts PDFs to images (a single file or multiple files), is open source, and uses Ghostscript (a free download).

Example of its use:

var converter = new PDFConverter();
converter.JPEGQuality = 90;
converter.OutputFormat = "jpg";
string output = "output.jpg";
converter.Convert("input.pdf", output);


You should use iTextSharp. It's a port of an open-source Java project for manipulating PDFs. http://sourceforge.net/projects/itextsharp/


The 2JPEG command-line tool can do it, for example:

2jpeg.exe -src "C:\In\*.pdf" -dst "C:\Out"
