Is there a way to delete the leading numbers only from those elements that start with a number followed by a period?

I have data in this format:

    1. New York Times - USA
    2. Guardian - UK
    Der Spiegel - Germany
    3. Le Monde - France
    Dagen - Denmark (12.6.2002)
    Norga-i-Dag (2) - Norway

I want to end up with newspaper values of:

    New York Times
    Guardian
    Der Spiegel
    Le Monde
    Dagen
    Norga-i-Dag

I'm using this code to parse out the newspaper and country values:

    String newspaper = "";
    String country = "";
    int hyphenIndex = unparsedText.indexOf("-");
    if (hyphenIndex > -1)
    {
        newspaper = unparsedText.substring(0, hyphenIndex);
    }
    country = unparsedText.substring(hyphenIndex + 1, unparsedText.length());
    country = country.trim();

Is there a way to delete the leading numbers only from those elements that start with a number followed by a period:

    1. New York Times
    2. Guardian
    3. Le Monde

In other words this would be fine as a compromise:

    . New York Times - USA
    . Guardian - UK
    Der Spiegel - Germany
    . Le Monde - France
    Dagen - Denmark (12.6.2002)
    Norga-i-Dag (2) - Norway

I want to avoid creating problems for elements like these that also contain numbers and/or periods:

    Dagen - Denmark (12.6.2002)
    Norga-i-Dag (2) - Norway


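Try this to remove one or more digits at the start of the string, followed by a period and any number of spaces. Use replaceFirst (or replaceAll) rather than replace, because String.replace treats its first argument as literal text, not as a regular expression:

    String text = unparsedText.replaceFirst("^[0-9]+\\. *", "");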
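As a minimal sketch of the per-line approach, the pattern can be compiled once and applied to each element in turn. The class name LeadingNumberDemo and the helper stripLeadingNumber are made up for illustration; only the regex comes from the answer above.

```java
import java.util.regex.Pattern;

public class LeadingNumberDemo {
    // One or more digits, a literal period, then any number of spaces,
    // anchored to the start of the input.
    private static final Pattern LEADING_NUMBER = Pattern.compile("^[0-9]+\\. *");

    // Hypothetical helper: returns the line without its "N. " prefix, if it has one.
    static String stripLeadingNumber(String line) {
        return LEADING_NUMBER.matcher(line).replaceFirst("");
    }

    public static void main(String[] args) {
        String[] samples = {
            "1. New York Times - USA",
            "Der Spiegel - Germany",
            "Dagen - Denmark (12.6.2002)",
            "Norga-i-Dag (2) - Norway"
        };
        for (String sample : samples) {
            System.out.println(stripLeadingNumber(sample));
        }
    }
}
```

Because the pattern is anchored with ^, the digits and periods inside (12.6.2002) and (2) are left alone.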


I'm sure you'll get a flood of answers shortly :-). In the meantime, I think you'll benefit from a regex tutorial. Hint: . is a special character in regex, so it has to be escaped to match a literal period.


This will remove any run of digits followed by a period and at least one whitespace character, e.g. "11. ". NOTE: It is best to apply this to each line separately, because the pattern is not anchored to the start of the string and may otherwise replace items you want to keep.

    unparsedText = unparsedText.replaceAll("\\d+\\.\\s+", "");


    String resultString = subjectString.replaceAll("(?m)^\\d+\\.\\s*", "");

should do.

It will delete a number, a dot and optional spaces, but only at the start of a line.
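To illustrate what the (?m) flag changes, here is a small sketch that runs the same pattern with and without it on a multi-line string. The class and method names are hypothetical; the sample text is taken from the question.

```java
public class MultilineStripDemo {
    // With (?m), ^ matches at the start of every line in the text.
    static String stripAll(String text) {
        return text.replaceAll("(?m)^\\d+\\.\\s*", "");
    }

    // Without (?m), ^ matches only at the very start of the whole string.
    static String stripWithoutMultiline(String text) {
        return text.replaceAll("^\\d+\\.\\s*", "");
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "1. New York Times - USA",
            "2. Guardian - UK",
            "Der Spiegel - Germany",
            "3. Le Monde - France");
        System.out.println(stripAll(text));
        System.out.println("---");
        System.out.println(stripWithoutMultiline(text));
    }
}
```

In the second call only the first line loses its prefix, which is why the flag matters when the whole block is processed as one string.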


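You can do the following to transform your input almost directly into the output. It strips the leading number and everything from the last hyphen on each line onward; the whitespace before that hyphen remains, so trim each line afterwards:

    String result = input.replaceAll("(?m)^\\d+\\.\\s*|-(?!.*-)\\s*.*?$", "");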
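A rough sketch of how that single-pass pattern might be wrapped up, assuming the input is one string with lines separated by \n. The class NewspaperNames and the method extract are illustrative names, not part of the answer. Each line is trimmed because the pattern leaves the space before the hyphen in place.

```java
import java.util.ArrayList;
import java.util.List;

public class NewspaperNames {
    // Removes the "N. " prefix and everything from the last hyphen of each
    // line onward, then trims the leftover whitespace line by line.
    static List<String> extract(String input) {
        String stripped = input.replaceAll("(?m)^\\d+\\.\\s*|-(?!.*-)\\s*.*?$", "");
        List<String> names = new ArrayList<>();
        for (String line : stripped.split("\n")) {
            names.add(line.trim());
        }
        return names;
    }

    public static void main(String[] args) {
        String input = String.join("\n",
            "1. New York Times - USA",
            "Der Spiegel - Germany",
            "Dagen - Denmark (12.6.2002)",
            "Norga-i-Dag (2) - Norway");
        for (String name : extract(input)) {
            System.out.println(name);
        }
    }
}
```

Note that a parenthetical before the last hyphen, such as the (2) in Norga-i-Dag (2), is kept by this pattern; removing it would need an extra step.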
