Is there a way to delete the leading numbers only from those elements that start with a number followed by a period?
I have data in this format:
1. New York Times - USA
2. Guardian - UK
Der Spiegel - Germany
3. Le Monde - France
Dagen - Denmark (12.6.2002)
Norga-i-Dag (2) - Norway
I want to end up with newspaper values of:
New York Times
Guardian
Der Spiegel
Le Monde
Dagen
Norga-i-Dag
I'm using this code to parse out the newspaper and country values:
String newspaper = "";
String country = "";
int hyphenIndex = unparsedText.indexOf("-");
if (hyphenIndex > -1)
{
newspaper = unparsedText.substring(0, hyphenIndex);
}
country = unparsedText.substring(hyphenIndex + 1, unparsedText.length());
country = country.trim();
Is there a way to delete the leading numbers only from those elements that start with a number followed by a period:
1. New York Times
2. Guardian
3. Le Monde
In other words this would be fine as a compromise:
. New York Times - USA
. Guardian - UK
Der Spiegel - Germany
. Le Monde - France
Dagen - Denmark (12.6.2002)
Norga-i-Dag (2) - Norway
I want to avoid creating problems for elements like these that also contain numbers and/or periods:
Dagen - Denmark (12.6.2002)
Norga-i-Dag (2) - Norway
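Try this to remove one or more digits, followed by a period and any number of spaces. Note that String.replace treats its first argument as literal text, so a regex needs replaceFirst (or replaceAll):
String text = unparsedText.replaceFirst("^[0-9]+\\. *", "");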
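As a rough illustration of the answer above, here is a minimal sketch applied to the sample entries one line at a time. The class name LeadingNumberStripper and the helper stripLeadingNumber are made up for this example, and it uses replaceFirst because String.replace does not interpret its argument as a regex.

```java
public class LeadingNumberStripper {

    // Hypothetical helper: removes a leading "<digits>." prefix (plus any
    // following spaces) from a single entry; other entries are returned as-is.
    static String stripLeadingNumber(String unparsedText) {
        // ^       start of the string
        // [0-9]+  one or more digits
        // \\.     a literal period
        // " *"    zero or more spaces
        return unparsedText.replaceFirst("^[0-9]+\\. *", "");
    }

    public static void main(String[] args) {
        String[] samples = {
            "1. New York Times - USA",
            "Der Spiegel - Germany",
            "Dagen - Denmark (12.6.2002)",
            "Norga-i-Dag (2) - Norway"
        };
        for (String sample : samples) {
            System.out.println(stripLeadingNumber(sample));
        }
    }
}
```

Because the pattern is anchored with ^, the digits and periods inside "(12.6.2002)" and "(2)" are left alone.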
I'm sure you'll get a flood of answers shortly :-). In the meantime I think you'll benefit from the RegEx tutorial. Hint: . is a special character in regex, so it has to be escaped to match a literal period.
This will remove any digits followed by a period followed by at least one whitespace character, e.g. "11. ". NOTE: It would be best if unparsedText were a single line, as otherwise this may replace items you want to keep.
unparsedText = unparsedText.replaceAll("\\d+\\.\\s+", "");
String resultString = subjectString.replaceAll("(?m)^\\d+\\.\\s*", "");
should do.
It will delete a number, a dot and optional spaces, but only at the start of a line.
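To show the effect of the (?m) flag, here is a small sketch that runs the same pattern over several entries held in one string. The class name MultilineStripper and the method stripAll are hypothetical, and the input is assumed to use \n line breaks.

```java
public class MultilineStripper {

    // Hypothetical helper: strips "<digits>." prefixes at the start of every line.
    static String stripAll(String input) {
        // (?m) turns on MULTILINE mode, so ^ matches after each line break
        // as well as at the very start of the input.
        return input.replaceAll("(?m)^\\d+\\.\\s*", "");
    }

    public static void main(String[] args) {
        String input = String.join("\n",
            "1. New York Times - USA",
            "2. Guardian - UK",
            "Der Spiegel - Germany",
            "3. Le Monde - France",
            "Dagen - Denmark (12.6.2002)",
            "Norga-i-Dag (2) - Norway");
        System.out.println(stripAll(input));
    }
}
```

Without (?m), ^ would match only at the very beginning of the input, so only the first entry would lose its prefix.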
You can do the following to directly transform your input to output:
String result = input.replaceAll("(?m)^\\d+\\.\\s*|-(?!.*-)\\s*.*?$", "");
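For anyone trying the combined pattern, here is a hedged sketch of what it does to a single entry. The class NewspaperExtractor and method extractNewspaper are invented for this example, and the trim() call is an addition: the pattern removes the last hyphen and everything after it, but leaves the space that preceded the hyphen.

```java
public class NewspaperExtractor {

    // Hypothetical helper: applies the combined pattern to one entry and
    // trims the space left in front of the removed " - Country" part.
    static String extractNewspaper(String line) {
        // First alternative:  leading "<digits>." plus whitespace.
        // Second alternative: the last hyphen on the line (no later hyphen
        //                     follows it) and everything up to the line end.
        String stripped = line.replaceAll("(?m)^\\d+\\.\\s*|-(?!.*-)\\s*.*?$", "");
        return stripped.trim();
    }

    public static void main(String[] args) {
        System.out.println(extractNewspaper("1. New York Times - USA"));
        System.out.println(extractNewspaper("Dagen - Denmark (12.6.2002)"));
        System.out.println(extractNewspaper("Norga-i-Dag (2) - Norway"));
    }
}
```

Note that the hyphens inside "Norga-i-Dag" survive because only the last hyphen on a line is matched, but the "(2)" stays in the result, so that entry would still need separate handling to reach the output in the question.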