Change font programmatically in Acrobat Pro 9.1
I have a large PDF file that uses a number of fonts. I have to export it to another application that only recognizes Arial or Times New Roman fonts. Is it possible to do this in JavaScript? I tried this with no luck:
    /* Changes font to Arial */
    var ckWord, numWords;
    for (var i = 0; i < this.numPages; i++) {
        numWords = this.getPageNumWords(i);
        for (var j = 0; j < numWords; j++) {
            ckWord = this.getPageNthWord(i, j);
            if (ckWord.font != "Arial") {
                ckWord.font = "Arial";
            }
        }
    }
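Acrobat's JS object model won't let you change page contents, no. getPageNthWord returns each word as a plain string with no link back to the page, so setting a font property on it has no effect on the document.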
Kludging one font into another is generally a bad idea anyway, visually speaking. The appropriate spacing between letters can vary enough from one font to another that your output would look... well... suboptimal. This distorted spacing can also throw off "word finder" algorithms, causing them to see word breaks where there are none, or to treat two or more words as one big word.
Not pretty.
It's also quite possible that the real problem is the font itself. More likely still, it's the font's encoding that is the problem: the way bytes in the content stream are interpreted as characters.
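To make the encoding point concrete, here is a toy illustration in plain JavaScript. It is not how Acrobat decodes text, and both lookup tables are made up for the example (the names standardLike, customSubset and decode are hypothetical). It only shows that the same bytes can come out as different text depending on which table is used to interpret them.

```javascript
// Toy example: two made-up byte-to-character tables.
// Neither is a real PDF encoding; they only illustrate the idea.
var standardLike = { 0x41: "A", 0x42: "B", 0x43: "C" };
var customSubset = { 0x41: "T", 0x42: "h", 0x43: "e" };

// Hypothetical helper: map each byte through a table,
// using U+FFFD for bytes the table does not know.
function decode(bytes, table) {
    return bytes.map(function (b) {
        return table.hasOwnProperty(b) ? table[b] : "\uFFFD";
    }).join("");
}

// The same three bytes, as they might appear in a content stream.
var bytes = [0x41, 0x42, 0x43];

var asStandard = decode(bytes, standardLike); // "ABC"
var asCustom = decode(bytes, customSubset);   // "The"
```

A program that assumes the first table when the font really uses the second will extract "ABC" from text that displays as "The", which is the kind of garbage an unusual encoding tends to produce.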
You can see the encoding used by each font in the "Fonts" tab of the document properties dialog (Ctrl+D). I suspect your non-Arial fonts are using something unusual... "Identity-H" or "Custom" most likely.
Changing the encoding of text in a PDF is a Very Hard Problem.
Finally, to see if it's even theoretically possible to extract the text, try to copy and paste it out of the PDF in Acrobat. If you can do that, then some other program can too. If you cannot (or it comes out as garbage), then other programs are likely to face a similar lack of success.
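The same check can be roughly approximated from Acrobat's JavaScript console. This is a minimal sketch, assuming it is run with `this` bound to the open document; the helper name pageText is hypothetical, while getPageNumWords and getPageNthWord are the same Doc methods the question already uses.

```javascript
// Hypothetical helper: join every word Acrobat can find on one page.
// `doc` is assumed to be an Acrobat Doc object (`this` in the console).
function pageText(doc, pageIndex) {
    var words = [];
    var count = doc.getPageNumWords(pageIndex);
    for (var j = 0; j < count; j++) {
        // By default the word comes back with punctuation and whitespace stripped.
        words.push(doc.getPageNthWord(pageIndex, j));
    }
    return words.join(" ");
}

// In the Acrobat console, print the first page for inspection:
// console.println(pageText(this, 0));
```

If the printed text is readable, extraction is at least possible. If it is empty or garbage, other tools will probably do no better.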
At that point the only thing you can do is OCR (Optical Character Recognition). I believe Acrobat Pro comes with a simple OCR program, though I could be mistaken. I've never used it.