Finding shortest unique prefix length using map/reduce
I have a list of strings (from documents in CouchDB).
I want to find the minimum prefix length so that all shortened strings (taking the first LEN characters) are unique.
For example:
- aabb
- aabc
- abcd
should give: LEN is three.
Is it possible to write this as 开发者_运维百科a map/reduce function ?
Doing it the brute force way:
MAP: Create for each input record "ABCDE" records with the keys - "A" - "AB" - "ABC" - "ABCD" - "ABCDE"
REDUCE:
- IF you have 1 value in the iterator output: "length(key)" "true"
- If you have more than one 1 value in the iterator output: "length(key)" "false"
MAP: Identity mapper
REDUCE: Output "true" if all input values are true. Else output false (or nothing);
That should result in a true for all lengths that are "unique"
精彩评论