Removing up to 4 spaces from a string
I have an array of Strings I'm looping through. For each string, I need to remove up to 4 spaces from the beginning. In other words, if there are only 2 spaces, I remove 2. If there are 6 spaces, I remove 4. How can I specify this in the loop?
for(int i=0; i<stringArray.length; i++) {
newString = REMOVE UP TO 4 SPACES FROM stringArray[i];
stringArray[i] = newString;
}
Thanks!
Try this:
stringArray[i] = stringArray[i].replaceFirst ("^ {0,4}", "");
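For completeness, here is a minimal runnable sketch of that line dropped into the loop from the question. The class name StripLeadingSpaces, the helper stripUpToFour, and the sample data are made up for illustration.

```java
class StripLeadingSpaces {
    // Hypothetical helper: removes at most 4 space characters from the start of s.
    static String stripUpToFour(String s) {
        // ^ anchors the match at the start of the string;
        // " {0,4}" greedily matches between 0 and 4 spaces.
        return s.replaceFirst("^ {0,4}", "");
    }

    public static void main(String[] args) {
        // Sample data (assumed): 2, 6, 0 and 4 leading spaces.
        String[] stringArray = { "  two", "      six", "none", "    four" };
        for (int i = 0; i < stringArray.length; i++) {
            stringArray[i] = stripUpToFour(stringArray[i]);
            System.out.println("\"" + stringArray[i] + "\"");
        }
        // prints "two", "  six", "none", "four" (one per line)
    }
}
```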
Regex is the best tool for this job. I will explain it step by step:
Introduction
System.out.println(
"# one ## two ### three #### four"
.replaceAll("##", "@@")
);
// "# one @@ two @@# three @@@@ four"
The above snippet should give you a good idea of how replaceAll works: it replaces all occurrences of "##" with "@@".
Pattern
As it turns out, replaceAll is a regex-based method: the first argument is a special pattern string, and the second argument is a special replacement string. The next snippet illustrates the idea:
System.out.println(
"# one ## two ### three #### four"
.replaceAll("#{2}", "@@")
);
// "# one @@ two @@# three @@@@ four"
Now we used "#{2}" as the first argument. Rather intuitively, in regex this means "# repeated exactly twice"; this is exactly the same pattern we had before, which is why we also get the same result.
Range
The bounded repetition syntax in regex is actually quite expressive: instead of exact repetition, we can also define a range as follows:
System.out.println(
"# one ## two ### three #### four"
.replaceAll("#{1,3}", ":")
);
// ": one : two : three :: four"
Rather intuitively, #{1,3} means "# repeated between 1 and 3 times".
Greed
Now note that regex repetition by default is greedy: it tries to match more if possible. This is illustrated by the following:
System.out.println(
"# one ## two ### three #### four"
.replaceAll("#{2,3}", ":")
);
// "# one : two : three :# four"
Note that #### is replaced with :#. This is because the first 3 # were taken by the first replacement, leaving only 1, which is too few to match again. Had #{2,3} taken only 2 # the first time, there would have been 2 # left for a second match, but since it's greedy, it took 3 # the first time, leaving it no chance to take the last #!
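To see what the "had it only taken 2" case looks like, Java also supports a reluctant form of the quantifier, written by appending ? to it. The sketch below runs both variants on the same input; the class and method names are made up for illustration.

```java
class GreedyVsReluctant {
    static final String INPUT = "# one ## two ### three #### four";

    // Greedy: each match takes as many # as it can (up to 3).
    static String greedy() {
        return INPUT.replaceAll("#{2,3}", ":");
    }

    // Reluctant: each match takes as few # as it can (exactly 2 here,
    // because nothing after the quantifier forces it to take more).
    static String reluctant() {
        return INPUT.replaceAll("#{2,3}?", ":");
    }

    public static void main(String[] args) {
        System.out.println(greedy());
        // "# one : two : three :# four"
        System.out.println(reluctant());
        // "# one : two :# three :: four"
    }
}
```

With the reluctant version, #### becomes :: because each match stops at 2, while ### now leaves a stray # behind.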
First
Now let's try another example as follows:
System.out.println(
"=====5====4===3==2=1"
.replaceAll("={1,4}", ":")
);
// "::5:4:3:2:1"
Now let's say that we only want the first ={1,4} match to be replaced with ":". For that we can use replaceFirst instead of replaceAll:
System.out.println(
"=====5====4===3==2=1"
.replaceFirst("={1,4}", ":")
);
// ":=5====4===3==2=1"
Voila! Everything works as expected!
Anchor
Now let's look at the next example:
System.out.println(
"0=====5====4===3==2=1"
.replaceFirst("={1,4}", ":")
);
// "0:=5====4===3==2=1"
The replacement is still doing what it's supposed to do, but let's suppose that we only want to match ={1,4} at the beginning of the string. Fortunately for us, regular expressions have a way to express this: we can anchor the pattern at the beginning of the string, which is denoted by ^.
System.out.println(
"0=====5====4===3==2=1"
.replaceFirst("^={1,4}", ":")
);
// "0=====5====4===3==2=1"
System.out.println(
"=====5====4===3==2=1"
.replaceFirst("^={1,4}", ":")
);
// ":=5====4===3==2=1"
System.out.println(
"===3==2=1"
.replaceFirst("^={1,4}", ":")
);
// ":3==2=1"
Voila! Everything works as expected!
Going back to the answer
And now we have enough information to answer the original question!
stringArray[i] = stringArray[i].replaceFirst("^ {1,4}", "");
So the pattern ^ {1,4} means:
- Anchored at the beginning of the string with ^ ...
- ... between 1 and 4 space characters, taking more if possible
We then replace the first occurrence of such a match with the empty string, essentially removing it.
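Note that the first answer in this thread writes the quantifier as {0,4} rather than {1,4}. As far as I can tell the two behave the same for this task: a string with no leading spaces is left unchanged either way, since {1,4} simply fails to match and {0,4} replaces an empty match with the empty string. A small sketch to check this (the class name, method names, and sample inputs are made up):

```java
class QuantifierCheck {
    static String withZeroToFour(String s) {
        return s.replaceFirst("^ {0,4}", "");
    }

    static String withOneToFour(String s) {
        return s.replaceFirst("^ {1,4}", "");
    }

    public static void main(String[] args) {
        // Sample inputs (assumed): empty, no spaces, 1, 4 and 6 leading
        // spaces, and trailing spaces only.
        String[] samples = { "", "x", " x", "    x", "      x", "x  " };
        for (String s : samples) {
            String a = withZeroToFour(s);
            String b = withOneToFour(s);
            System.out.println("\"" + s + "\" -> \"" + a
                    + "\" (same result: " + a.equals(b) + ")");
        }
    }
}
```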
More learning resources
That was a beginner's introduction to the basics of regular expressions. There are still many aspects to this wonderful world that haven't been explored yet.
References
- Java Tutorials/Essential Classes/Regular Expressions
- regular-expressions.info
- Repetition
- Anchors
Related questions
- Coming eventually
Do you care about trailing spaces?
You could check the string against a regular expression to see if it starts with 4 or more spaces, and use substring to clip off the first 4. Otherwise, if there are fewer than 4 spaces in front (the regex does not match), just use String.trim() in Java. Keep in mind that trim() also strips trailing whitespace.
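A regex-free variant of the same idea, sketched under the assumption that only the space character (not tabs) should be removed and that trailing whitespace must be kept: count the leading spaces, cap the count at 4, and take the substring. The class and method names are made up for illustration.

```java
class LeadingSpaceTrimmer {
    // Hypothetical helper: drops at most 4 leading space characters
    // and leaves the rest of the string, including trailing spaces, alone.
    static String removeUpToFourLeadingSpaces(String s) {
        int count = 0;
        // Stop at 4 spaces, at the end of the string, or at the first non-space.
        while (count < 4 && count < s.length() && s.charAt(count) == ' ') {
            count++;
        }
        return s.substring(count);
    }

    public static void main(String[] args) {
        System.out.println("[" + removeUpToFourLeadingSpaces("  two  ") + "]");
        // [two  ]
        System.out.println("[" + removeUpToFourLeadingSpaces("      six") + "]");
        // [  six]
    }
}
```

Unlike trim(), this never touches the end of the string, so it does not matter whether trailing spaces are significant.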