开发者

Finding shortest unique prefix length using map/reduce

I have a list of strings (from documents in CouchDB).

I want to find the minimum prefix length so that all shortened strings (taking the first LEN characters) are unique.

For example:

  • aabb
  • aabc
  • abcd

should give: LEN is three.

Is it possible to write this as 开发者_运维百科a map/reduce function ?


Doing it the brute force way:

MAP: Create for each input record "ABCDE" records with the keys - "A" - "AB" - "ABC" - "ABCD" - "ABCDE"

REDUCE:

  • IF you have 1 value in the iterator output: "length(key)" "true"
  • If you have more than one 1 value in the iterator output: "length(key)" "false"

MAP: Identity mapper

REDUCE: Output "true" if all input values are true. Else output false (or nothing);

That should result in a true for all lengths that are "unique"

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜