Languages other than English for Tesseract iOS

I am trying out the open-source Tesseract code to see whether I can compile it and recognize English characters on the iPhone, and I was able to do so. Now I have added "ita.traineddata" to tessdata and changed

tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding],    // Path to tessdata-no ending /.
           "eng");                                                  // ISO 639-3 string or NULL.

to

tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding],    // Path to tessdata-no ending /.
           "ita");                                                  // ISO 639-3 string or NULL.

but I get this error: Error openning data file /var/mobile/Applications/A37DB8B7-2272-4F80-9836-0034CEB56CC5/Documents/tessdata/ita.traineddata

What am I missing and how should this be handled?


First add the tessdata folder to your project (inside the project-name folder), and then (IMPORTANT) go to Targets / Build Phases / Copy Bundle Resources and add the tessdata folder as a folder REFERENCE!

Then initialize Tesseract like this:

// Set up the tessdata path. This is included in the application bundle
// but is copied to the Documents directory on the first run.
NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentPath = ([documentPaths count] > 0) ? [documentPaths objectAtIndex:0] : nil;

NSString *dataPath = [documentPath stringByAppendingPathComponent:@"tessdata"];
NSFileManager *fileManager = [NSFileManager defaultManager];
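// If tessdata is not in Documents yet, copy it from the app bundle.
// Note: an existing Documents/tessdata folder is never refreshed here, so
// remove it (or reinstall the app) after adding new language files.
if (![fileManager fileExistsAtPath:dataPath]) {
    // Get the path to the app bundle (which contains the tessdata dir).
    NSString *bundlePath = [[NSBundle mainBundle] bundlePath];
    NSString *tessdataPath = [bundlePath stringByAppendingPathComponent:@"tessdata"];
    if (tessdataPath) {
        NSError *copyError = nil;
        if (![fileManager copyItemAtPath:tessdataPath toPath:dataPath error:&copyError]) {
            NSLog(@"Could not copy tessdata: %@", copyError);
        }
    }
}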
setenv("TESSDATA_PREFIX", [[documentPath stringByAppendingString:@"/"] UTF8String], 1);
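// Init the Tesseract engine; Init returns 0 on success.
tesseract = new tesseract::TessBaseAPI();
if (tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "ita") != 0) {
    NSLog(@"Tesseract could not load ita.traineddata from %@", dataPath);
}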

Note: Tesseract initializes itself with English as the default language. I once removed the whole tessdata folder and it still worked without the eng.traineddata file. That is why it works with English but not with the Italian traineddata: your tessdata folder is not set up properly.
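
To confirm whether the language file really is where the error message says it should be, a small check before calling Init can help. The following is only a sketch: traineddataPath and languageDataReadable are hypothetical helper names, not part of Tesseract, and the sketch assumes the file is expected at <tessdata dir>/<lang>.traineddata, as the path in the error message suggests. The exact lookup path can differ between Tesseract versions.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not a Tesseract API): builds
// "<tessdataDir>/<lang>.traineddata", tolerating a trailing slash.
std::string traineddataPath(std::string tessdataDir, const std::string& lang) {
    if (!tessdataDir.empty() && tessdataDir.back() != '/') {
        tessdataDir += '/';
    }
    return tessdataDir + lang + ".traineddata";
}

// Hypothetical helper: returns true if the language file can be opened
// for reading. It does not validate the file's contents.
bool languageDataReadable(const std::string& tessdataDir, const std::string& lang) {
    std::ifstream file(traineddataPath(tessdataDir, lang), std::ios::binary);
    return file.is_open();
}
```

From the Objective-C++ code above this could be called as languageDataReadable([dataPath UTF8String], "ita") just before Init. If it returns false, the tessdata folder in Documents does not contain ita.traineddata, which usually points to the bundle copy step rather than to Tesseract itself.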
