How to split a String on a separator without removing that separator in Java? [duplicate]
I'm having a problem splitting a String. I want to split a String on some separator without losing that separator.
When we use the somestring.split(String separator) method in Java, it splits the String but removes the separator from the result. I don't want this to happen.
Currently I get a result like the one below:
String string1="Ram-sita-laxman";
String seperator="-";
string1.split(seperator);
Output:
[Ram, sita, laxman]
But I want the result to look like the one below instead:
[Ram, -sita, -laxman]
Is there a way to get output like this?
string1.split("(?=-)");
This works because split actually takes a regular expression. What you're seeing is a "zero-width positive lookahead".
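Because split takes a regular expression, a separator that contains regex metacharacters (such as "." or "|") would have to be quoted before it is placed inside the lookahead. Here is a minimal sketch of that idea, assuming a hypothetical helper named splitKeepingSeparator (it is not part of the JDK):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class KeepSeparatorSplit {
    // Hypothetical helper: splits before each occurrence of a literal separator,
    // so the separator stays at the start of the piece that follows it.
    static String[] splitKeepingSeparator(String input, String separator) {
        // Pattern.quote makes characters such as "." or "|" match literally.
        String regex = "(?=" + Pattern.quote(separator) + ")";
        return input.split(regex);
    }

    public static void main(String[] args) {
        // prints [Ram, -sita, -laxman]
        System.out.println(Arrays.toString(splitKeepingSeparator("Ram-sita-laxman", "-")));
        // prints [a, .b, .c]
        System.out.println(Arrays.toString(splitKeepingSeparator("a.b.c", ".")));
    }
}
```

Without the quoting, a separator like "." would be read as "any character" and the split would happen before every character.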
I would love to explain more but my daughter wants to play tea party. :)
Edit: Back!
To explain this, I will first show you a different split operation:
"Ram-sita-laxman".split("");
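This splits your string on every zero-length string. There is a zero-length string between every character. Therefore, the result is as follows (Java 8 and later omit the leading empty string, both here and in the negative lookaround examples further down):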
["", "R", "a", "m", "-", "s", "i", "t", "a", "-", "l", "a", "x", "m", "a", "n"]
Now, I modify my regular expression ("") to only match zero-length strings if they are followed by a dash.
"Ram-sita-laxman".split("(?=-)");
["Ram", "-sita", "-laxman"]
In that example, the ?= means "lookahead". More specifically, it means "positive lookahead". Why the "positive"? Because you can also have negative lookahead (?!), which will split on every zero-length string that is not followed by a dash:
"Ram-sita-laxman".split("(?!-)");
["", "R", "a", "m-", "s", "i", "t", "a-", "l", "a", "x", "m", "a", "n"]
You can also have positive lookbehind (?<=), which will split on every zero-length string that is preceded by a dash:
"Ram-sita-laxman".split("(?<=-)");
["Ram-", "sita-", "laxman"]
Finally, you can also have negative lookbehind (?<!), which will split on every zero-length string that is not preceded by a dash:
"Ram-sita-laxman".split("(?<!-)");
["", "R", "a", "m", "-s", "i", "t", "a", "-l", "a", "x", "m", "a", "n"]
These four expressions are collectively known as the lookaround expressions.
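To try the four lookaround splits side by side, something like the following sketch could be used. The class and method names are made up for illustration, and the outputs in the comments are what Java 8 or later should print (older versions add a leading empty string for the two negative cases).

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Applies the given regex to the sample string from the question.
    static String[] splitSample(String regex) {
        return "Ram-sita-laxman".split(regex);
    }

    public static void main(String[] args) {
        // Positive lookahead: split before each dash.
        System.out.println(Arrays.toString(splitSample("(?=-)")));
        // [Ram, -sita, -laxman]

        // Negative lookahead: split everywhere except before a dash.
        System.out.println(Arrays.toString(splitSample("(?!-)")));
        // [R, a, m-, s, i, t, a-, l, a, x, m, a, n]

        // Positive lookbehind: split after each dash.
        System.out.println(Arrays.toString(splitSample("(?<=-)")));
        // [Ram-, sita-, laxman]

        // Negative lookbehind: split everywhere except after a dash.
        System.out.println(Arrays.toString(splitSample("(?<!-)")));
        // [R, a, m, -s, i, t, a, -l, a, x, m, a, n]
    }
}
```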
Bonus: Putting them together
I just wanted to show an example I encountered recently that combines two of the lookaround expressions. Suppose you wish to split a CapitalCase identifier up into its tokens:
"MyAwesomeClass" => ["My", "Awesome", "Class"]
You can accomplish this using this regular expression:
"MyAwesomeClass".split("(?<=[a-z])(?=[A-Z])");
This splits on every zero-length string that is preceded by a lower case letter ((?<=[a-z])) and followed by an upper case letter ((?=[A-Z])).
This technique also works with camelCase identifiers.
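As a quick check of the combined expression, here is a small sketch; the class name IdentifierSplitDemo and the method splitIdentifier are hypothetical names chosen for this example.

```java
import java.util.Arrays;

class IdentifierSplitDemo {
    // Splits wherever a lower case letter is directly followed by an upper case letter.
    static String[] splitIdentifier(String identifier) {
        return identifier.split("(?<=[a-z])(?=[A-Z])");
    }

    public static void main(String[] args) {
        // prints [My, Awesome, Class]
        System.out.println(Arrays.toString(splitIdentifier("MyAwesomeClass")));
        // prints [my, Awesome, Class]
        System.out.println(Arrays.toString(splitIdentifier("myAwesomeClass")));
        // Runs of capitals are not broken up: prints [parse, XMLFile]
        System.out.println(Arrays.toString(splitIdentifier("parseXMLFile")));
    }
}
```

Note that the pattern only splits at a lower-to-upper transition, so an acronym such as "XML" stays attached to the word that follows it.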
It's a bit dodgy, but you could introduce a dummy separator using a replace function. I don't know the Java methods, but in C# it could be something like:
string1.Replace("-", "#-").Split('#');
Of course, you'd need to pick a dummy separator that's guaranteed not to be anywhere else in the string.
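A rough Java equivalent of the dummy-separator idea might look like the sketch below. The helper name splitWithDummy is an assumption made for this example, and the check that rejects inputs already containing the dummy is an addition that the C# one-liner does not have.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DummySeparatorSplit {
    // Hypothetical helper: inserts a dummy marker before each separator,
    // then splits on the marker so the separator itself is kept.
    static String[] splitWithDummy(String input, String separator, String dummy) {
        if (input.contains(dummy)) {
            // The trick only works if the dummy never occurs in the input.
            throw new IllegalArgumentException("dummy separator occurs in input");
        }
        // String.replace works on literal text; split needs the dummy quoted as a regex.
        return input.replace(separator, dummy + separator).split(Pattern.quote(dummy));
    }

    public static void main(String[] args) {
        // prints [Ram, -sita, -laxman]
        System.out.println(Arrays.toString(splitWithDummy("Ram-sita-laxman", "-", "#")));
    }
}
```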
Adam hit the nail on the head! I used his answer to figure out how to insert filename text from the file dialog browser into a rich text box. My problem was adding a new line at each "\" in the file path: the string.split command was splitting at the \ and deleting it. By adapting Adam's code I was able to start a new line after each \ in the file name while keeping the \.
Here is the code I used:
// Requires the System.Windows.Forms and System.Text.RegularExpressions namespaces.
OpenFileDialog fd = new OpenFileDialog();
fd.Multiselect = true;
fd.ShowDialog();
foreach (string filename in fd.FileNames)
{
    string currentfiles = uxFiles.Text;
    string value = "\r\n" + filename;
    // The lookbehind makes Regex.Split split after each \ in the filename, keeping the \.
    string[] lines = Regex.Split(value, @"(?<=\\)");
    foreach (string line in lines)
    {
        uxFiles.Text = uxFiles.Text + line + "\r\n";
    }
}
Enjoy!
Walrusking
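For readers following along in Java rather than C#, the same lookbehind can be passed to String.split. Only the escaping changes, because a regular Java string literal needs each backslash doubled. A minimal sketch, with a made-up class and method name:

```java
import java.util.Arrays;

class BackslashSplitDemo {
    // Splits after every backslash, keeping the backslash at the end of each piece.
    static String[] splitAfterBackslash(String path) {
        // The regex is (?<=\\); each backslash is doubled again for the Java literal.
        return path.split("(?<=\\\\)");
    }

    public static void main(String[] args) {
        // The literal below is the path C:\Users\file.txt
        // prints [C:\, Users\, file.txt]
        System.out.println(Arrays.toString(splitAfterBackslash("C:\\Users\\file.txt")));
    }
}
```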
A way to do this is to split your string, then add your separator at the beginning of each extracted string except the first one.
String seperator = "-";
String[] splitstrings = string1.split(seperator);
// Start at 1 so the first piece is left without a leading separator.
for (int i = 1; i < splitstrings.length; i++)
{
    splitstrings[i] = seperator + splitstrings[i];
}
That is the code that goes with LadaRaider's answer.