pattern recognition in a matrix
I'm not sure I'm giving the right name to this, but anyway I have a matrix that essentially contains 1s or 0s. The matrix is a square matrix, its size can be 3x3, 4x4 or 5x5.
By pattern matching, I mean to find some "shapes" in my matrix such as a line, a T or a U, e.g.: 0 1 0 0 1 0 1 1 1that matrix contains a T, but it also contains 2 lines! Now if the matrix was a 4x4, the shapes don't increase but they can be positioned at more place obviously, e.g.: 0 0 0 0
1 1 1 0 0 0 1 0 1 1 1 0That matrix would contain a U (no lines though, this is the exception, lines have the size of the matrix).
Naively since the matrix is pretty small I would have tried all possibilities for each shape I'm wiling to support, but it's not very fun. I cannot figure out any algorithm for this though, and not being able to label this operation properly doesn't help ;) 开发者_如何学Python Has anyone got any idea how to would do this "efficiently" ? (efficiently may be a bit of an overstatement considering the size of the matrix, but you know what I mean).There's some ambiguity in your question. For instance, does:
1 1 1
1 1 1
1 1 1
contain 6 lines, a T, a U, and a bunch of other letters of the alphabet? Or are all letters separated? Your initial question implied that letters could be discovered in overlapping fashion, because the T template contains two lines. Thus, a matrix where all elements were 'on' would contain every possible letter/line in every possible position.
Also, I'm assuming you're only concerned about 90 degree rotations and you wouldn't want to try to find 45-degree offset letters when the matrix sizes get large enough to support it.
In terms of ease-of-implementation, the brute-force approach you're talking about (test every position for all four letter rotations) really wins out, I'd say.
Alternatively, you could get pretty fancy by (warning: vague algorithm descriptions ahead!):
1) Walking along the matrix elements until you found a 1. Then essentially flood-fill from that 1 on a stack and keep track of the direction changes. Then have some sort of rotation-invariant lookup that mapped a set of 'on' pixels to found letters.
2) Use some sort of integral-image or box-filter description to take sums of subsections of the matrix. You could then do lookups on the subsections and map the subsection sums to letter/line values.
3) Since the comments have determined that you're only really looking for 4 shapes, a new approach may be worthwhile. You're only examining 4 shapes (line, cross, T, and U) if I'm not mistaken. Each of them can be in 4 orientations. One quick tip is that you can just run the algorithm 4 times but rotate the underlying matrix by 90 degrees. Then you don't have to adjust for rotation in your algorithm. Also note that the cross only needs to be found in one orientation because it looks identical in all 4 orientations and the line is identical in two orientations. Anyway, you could save yourself some work by searching for the 'hardest' things to match first. Let's say I'm looking for an upright 'U' here:
1 0 1
1 0 1
1 1 1
I start in the top left. Rather than checking to make sure that any pixels are 'off' (or 0), I go to the next place I expect to find an 'on' value (or a 1). Let's say that's the pixel below the top left. I check the middle-left pixel, and indeed it's on. Then I check below that. If you develop a simple rule set for each letter, you can quickly abandon the search for it if you don't have the required values turned 'on'. If you then run the same algorithm 4 times and only search for upright values, I'm not sure you'd be able to do much better than this!
The approaches I've mentioned are just ideas. They may be more trouble than they're worth in terms of efficiency gains, though. And who knows, they may not work at all!
Good luck!
I thought I could contribute with what I ended up doing so here it is, following aardvarkk idea. (objective-c code) I wasn't very pedantic with the array size checks because my matrix is necessarily a square matrix. Also sorry if the code is ugly :D
I made a little class structure for the shapes I want to reconize, they have a list of "directions" which are essentially values of an enum.
-(BOOL)findShape:(NSInteger)size directions:(NSArray*)directions{
NSMutableArray* current = [mgs tokens];
for (int rot = 0; rot < 4; rot++) {
for (int i = 0; i < size; i++) {
for(int j = 0; j < size; j++){
NSInteger value = [[[current objectAtIndex:i] objectAtIndex:j] integerValue];
if(value){
BOOL carryOn = [self iterateThroughDirections:directions i:i j:j tokens:current size:size];
if(carryOn){
return YES;
}
}
}
}
current = [self rotate:current];
}
return NO;
}
-(BOOL) iterateThroughDirections:(NSArray*)directions i:(NSInteger)i j:(NSInteger)j tokens:(NSMutableArray*)tokens size:(NSInteger)size{
BOOL carryOn = YES;
for(int k = 0; k < [directions count] && carryOn; k++){
NSNumber* dir = [directions objectAtIndex:k];
NSInteger d = [dir integerValue];
//move in the direction
switch (d) {
case UP:
if(i > 0){
i--;
}else{
carryOn = NO;
}
break;
case DOWN:
if(i < size-1){
i++;
}else{
carryOn = NO;
}
break;
case LEFT:
if(j > 0){
j--;
}else{
carryOn = NO;
}
break;
case RIGHT:
if(j < size-1){
j++;
}else{
carryOn = NO;
}
break;
default:
NSAssert(NO, @"invalid direction");
break;
}
NSInteger v = [[[tokens objectAtIndex:i] objectAtIndex:j] integerValue];
//now that we moved, check if the token is active, if it's not we're done
if(!v){
carryOn = NO;
break;
}
}
return carryOn;
}
-(NSMutableArray*)rotate:(NSMutableArray*)matrix{
NSInteger w = [matrix count];
NSInteger h = [[matrix objectAtIndex:0] count];
NSMutableArray* rotated = [[NSMutableArray arrayWithCapacity:h] retain];
for (int i = 0; i < h; i++) {
[rotated addObject:[NSMutableArray arrayWithCapacity:w]];
}
for(int i = 0; i < h; ++i){
for(int j = 0; j < w; ++j){
[[rotated objectAtIndex:i] addObject:[[matrix objectAtIndex:j] objectAtIndex:h-i-1]];
}
}
return rotated;
}
This seems to be working well for me! Thanks again for the help!
精彩评论