Remove/Delete Page from multipage TIFF file
I need to delete a page from a multipage TIFF file. I am currently working in .NET but can move to another language if someone knows how to do it in that language.
The page would be either the second-to-last or the last page in the file. I need to do it without decompressing the previous pages in the file, so not by creating a new TIFF and copying over all the pages I still want.
I have code that does that already, but as the TIFF files I am working with are around 1.0 GB to 3.0 GB and heavily compressed, this is extremely time-consuming. If I can just remove the part of the file that I don't want, rather than creating a new file, it will go much faster.
The page that I need to remove is very small compared to the rest of the file, as is the page that may or may not come after it: around 500*500 pixels.
What I have tried: the LibTiff.Net library, found here
http://bitmiracle.com/libtiff/
After messing with it for a while I asked the developer about my issue; they said that there isn't currently support for that. I also looked into ImageMagick a bit, but I haven't been able to figure out how to do this there either.
Does anyone have any helpful ideas here?
OK, I got a solution working in Python.
    import mmap
    from struct import unpack

    def main():
        filename = input("Input file name: ")
        f = open(filename, "r+b")
        offList, compList = getOffsets(f)
        for i in range(len(offList)):
            print("offset: ", offList[i], "\t Compression: ", compList[i])
        print("ran right")
        # test stripping the trailing pages: directory 3 is the label here,
        # and the directory after it (the macro) is shifted up over it
        stripLabel(f, offList, 3)
        offList, compList = getOffsets(f)
        for i in range(len(offList)):
            print("offset: ", offList[i], "\t Compression: ", compList[i])
        f.close()

    def getOffsets(f):
        fmap = mmap.mmap(f.fileno(), 0)
        offsets = []
        compressions = []
        # byte order mark: "II" = little-endian, "MM" = big-endian
        order = '<' if fmap[0:2] == b'II' else '>'
        # get TIFF version (42 = classic TIFF; BigTIFF, 43, is not handled)
        ver = unpack(order + 'H', fmap[2:4])[0]
        if ver == 42:
            # get first IFD
            offset = unpack(order + 'L', fmap[4:8])[0]
            while offset != 0:
                offsets.append(offset)
                # get number of tags in this IFD
                tags = unpack(order + 'H', fmap[offset:offset+2])[0]
                # one entry per IFD, even if it has no compression tag
                compression = "Unknown"
                for i in range(tags):
                    tagID = unpack(order + 'H', fmap[offset+2:offset+4])[0]
                    # if the tag is a compression, get the compression SHORT value and
                    # if recognized use a string representation
                    if tagID == 259:
                        tagValue = unpack(order + 'H', fmap[offset+10:offset+12])[0]
                        if tagValue == 1:
                            compression = "None"
                        elif tagValue == 5:
                            compression = "LZW"
                        elif tagValue == 6 or tagValue == 7:
                            compression = "JPEG"
                        elif tagValue == 34712 or tagValue == 33003 or tagValue == 33005:
                            compression = "JP2K"
                    # each IFD entry is 12 bytes long
                    offset += 12
                compressions.append(compression)
                # the offset of the next IFD follows the last entry
                offset = unpack(order + 'L', fmap[offset+2:offset+6])[0]
        fmap.close()
        return offsets, compressions

    # Tested, doesn't break TIFF
    def stripLabel(f, offsetList, labelIndex):
        fmap = mmap.mmap(f.fileno(), 0)
        offsetLabel = offsetList[labelIndex]
        offsetMacro = offsetList[labelIndex+1]
        offsetEnd = fmap.size()
        macroSize = offsetEnd - offsetMacro
        # shift everything from the macro directory to the end of the file
        # up over the label directory.
        # Note: offsets stored inside the moved directory are absolute file
        # offsets and are not rewritten here.
        fmap.move(offsetLabel, offsetMacro, macroSize)
        fmap.flush()
        # the file now ends right after the moved bytes
        fmap.resize(offsetLabel + macroSize)
        fmap.close()

    if __name__ == "__main__":
        main()
Tested it, and it seems to work fine. The stripLabel method is specifically meant to remove the second-to-last page/directory and shift the last one up, but it should in theory work for any directory other than the last, and it could easily be modified to remove the last one too. It requires at least as much free RAM as the size of the file you are working on, but it runs fast, and file size isn't an issue with most TIFFs. It isn't the most elegant approach; if someone has another, please post it.
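For the case where the page to drop is the last one, something along these lines might do. This is only a rough sketch, not the code I tested above: the names read_ifd_offsets and unlink_last_page are made up for illustration, it handles classic TIFF only (version 42, not BigTIFF), and the optional truncation assumes that no data belonging to the remaining pages is stored at or after the last directory's offset, which may not hold for every TIFF writer.

```python
import struct

def read_ifd_offsets(f):
    """Return (byte-order prefix, list of IFD offsets) for a classic TIFF."""
    f.seek(0)
    header = f.read(8)
    if header[:2] == b"II":
        order = "<"          # little-endian
    elif header[:2] == b"MM":
        order = ">"          # big-endian
    else:
        raise ValueError("not a TIFF file")
    version, offset = struct.unpack(order + "HI", header[2:8])
    if version != 42:
        raise ValueError("only classic TIFF (version 42) is handled")
    offsets = []
    while offset != 0:
        offsets.append(offset)
        f.seek(offset)
        (count,) = struct.unpack(order + "H", f.read(2))
        # skip the 12-byte entries; the next-IFD offset follows them
        f.seek(offset + 2 + 12 * count)
        (offset,) = struct.unpack(order + "I", f.read(4))
    return order, offsets

def unlink_last_page(f, truncate=False):
    """Drop the last page by zeroing the previous IFD's next-IFD offset.

    Returns the offset of the removed IFD. With truncate=True the file is
    also cut at that offset (ASSUMPTION: nothing the remaining pages need
    is stored at or after it).
    """
    order, offsets = read_ifd_offsets(f)
    if len(offsets) < 2:
        raise ValueError("need at least two pages")
    prev = offsets[-2]
    f.seek(prev)
    (count,) = struct.unpack(order + "H", f.read(2))
    # overwrite the 4-byte next-IFD offset of the second-to-last IFD with 0
    f.seek(prev + 2 + 12 * count)
    f.write(struct.pack(order + "I", 0))
    if truncate:
        f.truncate(offsets[-1])
    return offsets[-1]
```

On a real file this would be called on a handle opened with "r+b". Only four bytes are written, so no page data is read or decompressed; without truncate the file keeps its size and the removed page's bytes simply become unreachable.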