How to remove duplicate string in big string data?

For example, I am getting data like "vivartvivartpandey" and I want output like "vivartpandey". The only thing that is fixed is that the data will arrive either as string1+string1+string2 or as string1+string2 (no duplicate); string1 and string2 are both variable.

So how can I identify string1 and remove the duplicate string1?


We need more constraints to be able to achieve this. For example, if you get "ssssabcd", there is no way to tell whether string1 is "ssss" or "ss" (i.e. whether a repetition occurred).
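To make the ambiguity concrete, here is a minimal sketch that lists every non-empty prefix that is immediately followed by a copy of itself. The class and method names (`RepeatedPrefixCandidates`, `candidates`) are made up for illustration and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;

class RepeatedPrefixCandidates {
    // Returns every non-empty prefix p such that text starts with p + p.
    static List<String> candidates(String text) {
        List<String> result = new ArrayList<>();
        for (int len = 1; 2 * len <= text.length(); len++) {
            // Compare text[0, len) with text[len, 2 * len) without creating substrings.
            if (text.regionMatches(0, text, len, len)) {
                result.add(text.substring(0, len));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(candidates("ssssabcd"));           // [s, ss]
        System.out.println(candidates("vivartvivartpandey")); // [vivart]
    }
}
```

When more than one candidate comes back, as with "ssssabcd", the input alone cannot tell you which one (if any) was the intended string1, so some extra rule such as "always take the longest" has to be chosen.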


Use a regular expression like this:

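    String s = "vivartvivartpandey";

    // Needs java.util.regex.Matcher and java.util.regex.Pattern.
    Matcher m = Pattern.compile("^(.+)\\1(.*)$").matcher(s);

    if (m.find()) {
        String prefix = m.group(1), suffix = m.group(2);
        System.out.println(prefix + suffix); // vivartpandey
    }

The first pair of parentheses in the regexp defines a group, and `\1` (written `\\1` inside a Java string literal) is a back-reference to whatever that group matched. If there is no repeated prefix, the pattern does not match and you keep the string as it is. Now, if your string were something like "vivavivavivavivaChile", there is more than one possible string1 ("viva" or "vivaviva", as the other answer mentions). Because `(.+)` is greedy, this pattern picks the longest one, "vivaviva"; use the reluctant form `(.+?)` if you want the shortest one instead.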
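As a rough sketch of how the choice between the longest and the shortest repeated prefix could look in Java, the snippet below uses the numeric back-reference `\1` with a greedy and a reluctant group. The class name, constant names and the `dedupe` helper are hypothetical, and `Pattern.DOTALL` is only there on the assumption that the data may contain line breaks.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedPrefixRegex {
    // Greedy group: prefers the longest prefix that is immediately repeated.
    static final Pattern LONGEST = Pattern.compile("^(.+)\\1(.*)$", Pattern.DOTALL);
    // Reluctant group: prefers the shortest prefix that is immediately repeated.
    static final Pattern SHORTEST = Pattern.compile("^(.+?)\\1(.*)$", Pattern.DOTALL);

    // Returns prefix + suffix when a repeated prefix is found, else the text unchanged.
    static String dedupe(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.matches() ? m.group(1) + m.group(2) : text;
    }

    public static void main(String[] args) {
        String s = "vivavivavivavivaChile";
        System.out.println(dedupe(LONGEST, s));               // vivavivaChile
        System.out.println(dedupe(SHORTEST, s));              // vivavivavivaChile
        System.out.println(dedupe(LONGEST, "vivartpandey"));  // vivartpandey
    }
}
```

Back-references force the regex engine to backtrack, so for very large inputs a plain loop over prefix lengths is likely to be the safer choice.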


I have tried to create a simple solution.

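    String text = "vivartvivartpandey";
    int index = 0;
    // Remember the largest i for which the first i characters are immediately repeated.
    for (int i = 1; i <= text.length() / 2; i++) {
        String string1 = text.substring(0, i);
        String string2 = text.substring(i, 2 * i);
        if (string1.equals(string2)) {
            index = i;
        }
    }
    // Dropping the first index characters removes one copy of the repeated prefix.
    System.out.println("without duplicate: " + text.substring(index));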
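Since the question mentions big string data, here is a possible variant of the same idea that avoids building two substrings per iteration by using `String.regionMatches`, and that stops at the first (longest) hit by scanning from the longest candidate down. The class and method names are invented for this sketch, and "remove the longest repeated prefix" is an assumption about what is wanted when several prefixes qualify.

```java
class DuplicatePrefixRemover {
    // Drops the first copy of the longest prefix that is immediately repeated.
    static String removeDuplicatePrefix(String text) {
        for (int len = text.length() / 2; len >= 1; len--) {
            // Compare text[0, len) with text[len, 2 * len) in place.
            if (text.regionMatches(0, text, len, len)) {
                return text.substring(len);
            }
        }
        return text; // no repeated prefix, nothing to remove
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicatePrefix("vivartvivartpandey")); // vivartpandey
        System.out.println(removeDuplicatePrefix("vivartpandey"));       // vivartpandey
        System.out.println(removeDuplicatePrefix("ssssabcd"));           // ssabcd
    }
}
```

The worst case is still quadratic in the length of the text, but no temporary strings are allocated while searching.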