Processing images using Leptonica in an Xcode project
In Xcode, I am trying to preprocess an image before sending it to OCR. The OCR engine, Tesseract, handles images using the Leptonica library.
As an example, take the Leptonica function pixConvertTo8(). Is there a way to transfer the raw image data from UIImage -> PIX (see pix.h in the Leptonica library), perform pixConvertTo8(), and then convert back from PIX -> UIImage, preferably all in memory, without saving to a file in between?
- (void) processImage:(UIImage *) uiImage
{
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
// preprocess UIImage here with fx: pixConvertTo8();
CGSize imageSize = [uiImage size];
int bytes_per_line = (int)CGImageGetBytesPerRow([uiImage CGImage]);
int bytes_per_pixel = (int)CGImageGetBitsPerPixel([uiImage CGImage]) / 8;
CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider([uiImage CGImage]));
const UInt8 *imageData = CFDataGetBytePtr(data);
// this could take a while.
char* text = tess->TesseractRect(imageData,
bytes_per_pixel,
bytes_per_line,
0, 0,
imageSize.width, imageSize.height);
These two functions will do the trick:
- (void) startTesseract
{
//code from http://robertcarlsen.net/2009/12/06/ocr-on-iphone-demo-1043
NSString *dataPath =
[[self applicationDocumentsDirectory]stringByAppendingPathComponent:@"tessdata"];
/*
Set up the data in the docs dir
want to copy the data to the documents folder if it doesn't already exist
*/
NSFileManager *fileManager = [NSFileManager defaultManager];
// If the expected store doesn't exist, copy the default store.
if (![fileManager fileExistsAtPath:dataPath]) {
// get the path to the app bundle (with the tessdata dir)
NSString *bundlePath = [[NSBundle mainBundle] bundlePath];
NSString *tessdataPath = [bundlePath stringByAppendingPathComponent:@"tessdata"];
if (tessdataPath) {
[fileManager copyItemAtPath:tessdataPath toPath:dataPath error:NULL];
}
}
NSString *dataPathWithSlash = [[self applicationDocumentsDirectory] stringByAppendingString:@"/"];
setenv("TESSDATA_PREFIX", [dataPathWithSlash UTF8String], 1);
// init the tesseract engine.
tess = new tesseract::TessBaseAPI();
tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");
}
- (NSString *) ocrImage: (UIImage *) uiImage
{
//code from http://robertcarlsen.net/2009/12/06/ocr-on-iphone-demo-1043
CGSize imageSize = [uiImage size];
double bytes_per_line = CGImageGetBytesPerRow([uiImage CGImage]);
double bytes_per_pixel = CGImageGetBitsPerPixel([uiImage CGImage]) / 8.0;
CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider([uiImage CGImage]));
const UInt8 *imageData = CFDataGetBytePtr(data);
imageThresholder = new tesseract::ImageThresholder();
imageThresholder->SetImage(imageData,(int) imageSize.width,(int) imageSize.height,(int)bytes_per_pixel,(int)bytes_per_line);
// this could take a while. maybe needs to happen asynchronously.
tess->SetImage(imageThresholder->GetPixRect());
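char* text = tess->GetUTF8Text();
// Do something useful with the text!
NSString *result = [NSString stringWithCString:text encoding:NSUTF8StringEncoding];
NSLog(@"Converted text: %@", result);
// GetUTF8Text() returns a buffer that the caller must free with delete[].
delete[] text;
// CGDataProviderCopyData() follows the Copy rule, so release the data here.
CFRelease(data);
return result;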
}
You will have to declare both tess and imageThresholder in the .h file:
tesseract::TessBaseAPI *tess;
tesseract::ImageThresholder *imageThresholder;
I've found some good code snippets in the Tesseract OCR engine that show how to do this, notably in the ImageThresholder class inside thresholder.cpp (see link below). I haven't tested it yet, but here is a short description:
The interesting part for me is the else block that handles a depth of 32. There, pixCreate(), pixGetData() and pixGetWpl() do the actual work.
thresholder.cpp in the Tesseract engine uses the method mentioned above.
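To illustrate what that 32-bit branch appears to do, here is a minimal, untested-on-device sketch in plain C++ (it should compile unchanged in an Objective-C++ .mm file). It does not call Leptonica at all: the std::vector<uint32_t> is only a stand-in for the buffer you would get from pixGetData(), and wpl (words per line, what pixGetWpl() would report) is assumed to equal the width for a 32 bpp image. PackRgbaRows and UnpackRgbaWords are hypothetical helper names, not part of Tesseract or Leptonica. The sketch also assumes the source bytes are in R, G, B, A order, which is not guaranteed for every CGImage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs rows of 4-byte pixels into 32-bit words, with
// the first byte of each pixel in the most significant byte of the word.
// The returned vector stands in for the buffer behind pixGetData().
std::vector<uint32_t> PackRgbaRows(const uint8_t* bytes, int width, int height,
                                   int bytes_per_line) {
  // Rows may be padded, but never shorter than width * 4 bytes.
  assert(bytes_per_line >= width * 4);
  const int wpl = width;  // assumption: one 32-bit word per pixel at 32 bpp
  std::vector<uint32_t> words(static_cast<size_t>(wpl) * height);
  for (int y = 0; y < height; ++y) {
    const uint8_t* src = bytes + static_cast<size_t>(y) * bytes_per_line;
    uint32_t* dst = words.data() + static_cast<size_t>(y) * wpl;
    for (int x = 0; x < width; ++x) {
      // Shifting explicitly keeps the result independent of host endianness.
      dst[x] = (static_cast<uint32_t>(src[4 * x]) << 24) |
               (static_cast<uint32_t>(src[4 * x + 1]) << 16) |
               (static_cast<uint32_t>(src[4 * x + 2]) << 8) |
               static_cast<uint32_t>(src[4 * x + 3]);
    }
  }
  return words;
}

// Hypothetical inverse: unpacks the words back into tightly packed bytes
// (width * 4 bytes per row, no padding), e.g. to feed a CGDataProvider.
std::vector<uint8_t> UnpackRgbaWords(const std::vector<uint32_t>& words,
                                     int width, int height) {
  assert(words.size() == static_cast<size_t>(width) * height);
  std::vector<uint8_t> bytes(words.size() * 4);
  for (size_t i = 0; i < words.size(); ++i) {
    bytes[4 * i] = static_cast<uint8_t>(words[i] >> 24);
    bytes[4 * i + 1] = static_cast<uint8_t>(words[i] >> 16);
    bytes[4 * i + 2] = static_cast<uint8_t>(words[i] >> 8);
    bytes[4 * i + 3] = static_cast<uint8_t>(words[i]);
  }
  return bytes;
}
```

With a real PIX you would presumably write into the pointer returned by pixGetData() instead of a vector, stepping by pixGetWpl() words per row, and read it back the same way after calling pixConvertTo8() or another Leptonica function. Note that an 8 bpp result packs four pixels into each word, so the unpacking loop above only applies to 32 bpp images.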