string-manipulation - removing char from string
I currently have the following string in Java:
"Blah, blah, blah,~Part One, Part Two~,blah blah"
I need to remove the comma between the ~ characters so that it reads:
"Blah, blah, blah,~Part One Part Two~,blah blah"
Can anyone help me out please?
Many thanks,
String[] tests = {
    "a,b,c,d,e,f",
    "a,b,~c~,d,e",
    "~a,b,c,d,e~",
    "a,b,c,~d,e,f~,g,h,i,~j,k,l,~m,n,o~,q,r,~s,t,u",
};
for (String test : tests) {
    System.out.println(
        test.replaceAll(
            "(^[^~]*~)|([^~]*$)|([^,~]*),|([^,~]*~[^~]*~)",
            "$1$2$3$4"
        )
    );
}
The above prints:
a,b,c,d,e,f
a,b,~c~,d,e
~abcde~
a,b,c,~def~,g,h,i,~jkl~m,n,o~qr~s,t,u
How it works
There are 4 cases:
- We're at the beginning of the string, "outside"
    - Just match until we find the first ~, so next time we'll be "inside"
    - So, (^[^~]*~)
- There are no more ~ till the end of the string
    - If there is an even number of ~, we'll be "outside"
    - Just match until the end
    - So, ([^~]*$)
- If it's none of the above, we're "inside"
    - Keep finding the next comma before ~ (so we're still "inside")
        - So, ([^,~]*), (don't capture the comma!)
    - If we find ~ instead of a comma, then go out, then go back in on the next ~
        - So, ([^,~]*~[^~]*~)
In all cases, we make sure we capture enough to reconstruct the string.
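The inside/outside cases above can also be written as an explicit loop, which may be easier to follow than the single regex. This is only a sketch of the same idea, not the code from the answer: the class and method names (TildeScanner, removeCommasInside) are made up for illustration, and leaving the text after an unclosed ~ unchanged is an assumption chosen to mirror the "no more ~ till the end" case.

```java
class TildeScanner {
    /** Removes every comma that lies between an opening ~ and its closing ~. */
    static String removeCommasInside(String input) {
        StringBuilder out = new StringBuilder(input.length());
        StringBuilder segment = new StringBuilder(); // raw text seen since the opening ~
        boolean inside = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '~') {
                if (inside) {
                    // Closing ~: the segment is a complete pair, so drop its commas.
                    out.append(segment.toString().replace(",", ""));
                    segment.setLength(0);
                }
                out.append(c);
                inside = !inside;
            } else if (inside) {
                segment.append(c); // hold back until we know the pair is closed
            } else {
                out.append(c);     // outside: copy through unchanged
            }
        }
        // Assumption: an unclosed ~ leaves the trailing text as it was.
        out.append(segment);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Blah, blah, blah,~Part One Part Two~,blah blah
        System.out.println(removeCommasInside("Blah, blah, blah,~Part One, Part Two~,blah blah"));
    }
}
```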
References
- regular-expressions.info/Character Classes, Anchors, Grouping and backreferences
Related questions
- Remove a comma between two specific characters (PHP version)
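import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "Blah, blah, blah,~Part One, Part Two~,blah blah,~Part One, Part Two~,blah blah";
// Match each ~...~ section (at least one character between the tildes).
Pattern pattern = Pattern.compile("~[^~]+~");
Matcher matcher = pattern.matcher(text);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
    // quoteReplacement keeps any $ or \ in the matched text literal.
    matcher.appendReplacement(sb, Matcher.quoteReplacement(matcher.group(0).replaceAll(",", "")));
}
matcher.appendTail(sb);
text = sb.toString();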
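On Java 9 or later, the same find-and-rewrite loop can probably be shortened with Matcher.replaceAll, which accepts a function of the match. The sketch below assumes Java 9+ and uses illustrative names (TildeReplacer, removeCommasBetweenTildes) that are not part of the answer; the pattern is the one used above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TildeReplacer {
    // Same pattern as above: a ~, one or more non-~ characters, then a closing ~.
    private static final Pattern BETWEEN_TILDES = Pattern.compile("~[^~]+~");

    static String removeCommasBetweenTildes(String text) {
        // The function's result is treated as a replacement string, so quote it
        // to keep any $ or \ from the matched text literal.
        return BETWEEN_TILDES.matcher(text)
                .replaceAll(m -> Matcher.quoteReplacement(m.group().replace(",", "")));
    }

    public static void main(String[] args) {
        String text = "Blah, blah, blah,~Part One, Part Two~,blah blah,~Part One, Part Two~,blah blah";
        System.out.println(removeCommasBetweenTildes(text));
    }
}
```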
I haven't tested this, but I would do something like:
String sample = "Blah, blah, blah,~Part One, Part Two~,blah blah";
// Capture the text around one comma that sits between two ~ characters
// and put it back together without that comma.
String result = sample.replaceAll("(.+)~(.+),(.+)~(.+)", "$1~$2$3~$4");
I referenced Regular Expressions in Java. Here, .+ matches one or more of any character. More such patterns can be found here.
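Because .+ is greedy and the pattern contains a single literal comma, one pass appears to remove only one comma per match. The sketch below is a small check of that behavior, written with String.replaceAll; the class and method names (GreedyDemo, removeOneComma) are made up for illustration.

```java
class GreedyDemo {
    /** Applies the four-group pattern once over the whole string. */
    static String removeOneComma(String s) {
        return s.replaceAll("(.+)~(.+),(.+)~(.+)", "$1~$2$3~$4");
    }

    public static void main(String[] args) {
        // One comma between the tildes: it is removed.
        // prints: Blah, blah, blah,~Part One Part Two~,blah blah
        System.out.println(removeOneComma("Blah, blah, blah,~Part One, Part Two~,blah blah"));

        // Two commas between the tildes: only the last one is removed.
        // prints: x,~a,bc~,y
        System.out.println(removeOneComma("x,~a,b,c~,y"));
    }
}
```

If several commas can appear between the tildes, one of the loop-based answers is likely the safer choice.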
Here is a method that would do the job:
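public String deleteCharacterBetween(String deleteFrom, String betweenChar, String charToRemove) {
    int searchFrom = 0;
    while (true) {
        // Find the next opening and closing delimiter.
        int index = deleteFrom.indexOf(betweenChar, searchFrom);
        if (index < 0)
            return deleteFrom;
        int nextIndex = deleteFrom.indexOf(betweenChar, index + betweenChar.length());
        if (nextIndex < 0)
            return deleteFrom;
        String before = deleteFrom.substring(0, index);
        String toEdit = deleteFrom.substring(index, nextIndex);
        String after = deleteFrom.substring(nextIndex);
        toEdit = toEdit.replace(charToRemove, "");
        deleteFrom = before + toEdit + after;
        // Resume just past the closing delimiter, whose position has shifted
        // by the number of characters removed.
        searchFrom = before.length() + toEdit.length() + betweenChar.length();
    }
}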
You can call it like this:
String a = "Blah, blah, blah,~Part One, Part Two~,blah blahBlah, blah, blah,~Part One, Part Two~,blah blah";
System.out.println(deleteCharacterBetween(a, "~", ","));