JavaScript Split Array
I am trying to write a custom string splitting function, and it is harder than I would have expected.
Basically, I pass in a string and an array of the values that the string will split on, and it will return an array of substrings, removing empty ones and including the values it splits on. If the string can be split at the same place by two different values, the longer one has precedence.
That is,
split("Go ye away, I want some peace && quiet. & Thanks.", ["Go ", ",", "&&", "&", "."]);
should return
["Go ", "ye away", ",", " I want some peace ", "&&", " quiet", ".", " ", "&", " Thanks", "."]
Can you think of a reasonably simple algorithm for this? If there is a built-in way to do this in JavaScript (I don't think there is), it would be nicer.
Something like this?
function mySplit(input, delimiters) {
  // Sort a copy of the delimiters, longest first, so "&&" wins over "&"
  var sorted = delimiters.slice().sort(function(a, b) {
    return b.length - a.length;
  });
  var result = [];
  var start = 0; // where the current non-delimiter chunk begins
  var i = 0;
  // Examine input one position at a time
  while (i < input.length) {
    var match = null;
    for (var j = 0; j < sorted.length; j++) {
      if (sorted[j] && input.slice(i, i + sorted[j].length) === sorted[j]) {
        match = sorted[j];
        break;
      }
    }
    if (match === null) {
      i++;
      continue;
    }
    // Add the chunk before the delimiter (if any), then the delimiter itself
    if (i > start) {
      result.push(input.slice(start, i));
    }
    result.push(match);
    // Continue scanning right after the delimiter
    i += match.length;
    start = i;
  }
  // Add whatever is left after the last delimiter
  if (start < input.length) {
    result.push(input.slice(start));
  }
  return result;
}
var input = "Go ye away, I want some peace && quiet. & Thanks.";
var delimiters = ["Go ", ",", "&&", "&", "."];
console.log(mySplit(input, delimiters));
// Output: ["Go ", "ye away", ",", " I want some peace ",
// "&&", " quiet", ".", " ", "&", " Thanks", "."]
Exact solution asked for:
function megasplit(toSplit, splitters) {
  var splitters = splitters.sorted(function(a,b) {return b.length-a.length});
  // sort by length; put here for readability, trivial to separate rest of function into helper function
  if (!splitters.length)
    return toSplit;
  else {
    var token = splitters[0];
    return toSplit
      .split(token) // split on token
      .map(function(segment) { // recurse on segments
        return megasplit(segment, splitters.slice(1))
      })
      .intersperse(token) // re-insert token
      .flatten() // rejoin segments
      .filter(Boolean); // drop empty strings
  }
}
Demo:
> megasplit(
    "Go ye away, I want some peace && quiet. & Thanks.",
    ["Go ", ",", "&&", "&", "."]
  )
["Go ", "ye away", ",", " I want some peace ", "&&", " quiet", ".", " ", "&", " Thanks", "."]
Machinery (reusable!):
Array.prototype.copy = function() {
  return this.slice();
};
Array.prototype.sorted = function() {
  var copy = this.copy();
  copy.sort.apply(copy, arguments);
  return copy;
};
Array.prototype.flatten = function() {
  return [].concat.apply([], this);
};
Array.prototype.mapFlatten = function() {
  return this.map.apply(this, arguments).flatten();
};
Array.prototype.intersperse = function(token) {
  // [1,2,3].intersperse('x') -> [1,'x',2,'x',3]
  return this.mapFlatten(function(x){return [token,x]}).slice(1);
};
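On engines that support ES2019, Array.prototype.flat and Array.prototype.flatMap are built in, so the same machinery can probably be sketched as plain functions without extending Array.prototype. This is only an illustration under that assumption; the names sortedByLengthDesc, intersperse and megasplitPlain are hypothetical and do not come from the answer above.

```javascript
// Hypothetical standalone helpers (names invented for this sketch).

// Return a copy of the array sorted longest-first; the original is untouched.
function sortedByLengthDesc(strings) {
  return strings.slice().sort(function (a, b) { return b.length - a.length; });
}

// intersperse([1, 2, 3], 'x') -> [1, 'x', 2, 'x', 3]
function intersperse(items, token) {
  return items.flatMap(function (x) { return [token, x]; }).slice(1);
}

// Same recursive idea as megasplit, written against the built-in flat().
// Unlike megasplit, the base case returns an array rather than a string.
function megasplitPlain(toSplit, splitters) {
  var sorted = sortedByLengthDesc(splitters);
  if (!sorted.length) {
    return [toSplit];
  }
  var token = sorted[0];
  var segments = toSplit.split(token).map(function (segment) {
    return megasplitPlain(segment, sorted.slice(1));
  });
  return intersperse(segments, token) // re-insert the token between segments
    .flat()                           // rejoin the nested segment arrays
    .filter(Boolean);                 // drop empty strings
}

console.log(megasplitPlain(
  "Go ye away, I want some peace && quiet. & Thanks.",
  ["Go ", ",", "&&", "&", "."]
));
```

Because the token is re-inserted only after the segments have been split recursively, a longer token such as "&&" is never handed to the shorter "&" splitter.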
Notes:
- This required a decent amount of research to do elegantly:
  - (Deep) copying an array using jQuery
  - What is the most efficient way to concatenate N arrays in JavaScript? (created my own less-ugly method)
  - How can I split text on commas not within double quotes, while keeping the quotes? (junk answers, again created my own method)
- This was further complicated by the fact that the specification required that tokens (though they were to be left in the string) should NOT be split (or else you'd get "&", "&"). This made use of reduce impossible and necessitated recursion.
- I also personally would not ignore empty strings with splits. I can understand not wanting to recursively split on the tokens, but I'd personally simplify the function and make the output act like a normal .split and be like ["", "Go ", "ye away", ",", " I want some peace ", "&&", " quiet", ".", " ", "&", " Thanks", ".", ""]
- I should point out that, if you are willing to relax your requirements a little, this goes from being a 15/20-liner to a 1/3-liner:
1-liner if one follows canonical splitting behavior:
Array.prototype.mapFlatten = function() {
  ...
}
function megasplit(toSplit, splitters) {
  return splitters.sorted(...).reduce(function(strings, token) {
    return strings.mapFlatten(function(s){return s.split(token)});
  }, [toSplit]);
}
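For anyone who wants to run the reduce variant as-is, here is a minimal self-contained sketch. It assumes the built-in flatMap (ES2019) in place of the mapFlatten helper, and it assumes the elided sorted(...) call sorts longest-first as in the recursive megasplit; the name canonicalSplit is hypothetical. Note that, written this way, the tokens themselves are dropped and empty strings are kept, just like a plain .split.

```javascript
// Sketch of the reduce-based variant; canonicalSplit is a hypothetical name.
// Assumption: delimiters are tried longest-first, as in the recursive version.
function canonicalSplit(toSplit, splitters) {
  var byLength = splitters.slice().sort(function (a, b) {
    return b.length - a.length;
  });
  // Start with the whole string, then split every piece on each token in turn.
  return byLength.reduce(function (strings, token) {
    return strings.flatMap(function (s) { return s.split(token); });
  }, [toSplit]);
}

console.log(canonicalSplit("a,b&&c&d", [",", "&&", "&"]));
// -> ["a", "b", "c", "d"]
console.log(canonicalSplit(",a&&b,", [",", "&&", "&"]));
// -> ["", "a", "b", ""]
```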
3-liner, if the above was hard to read:
Array.prototype.mapFlatten = function() {
  ...
}
function megasplit(toSplit, splitters) {
  var strings = [toSplit];
  splitters.sorted(...).forEach(function(token) {
    strings = strings.mapFlatten(function(s){return s.split(token)});
  });
  return strings;
}
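On the question of a built-in way: String.prototype.split accepts a regular expression, and when that expression contains a capturing group, the captured separators are included in the result. That is a different technique from the recursive one above, but it may be enough here. A rough sketch, assuming the delimiters are plain strings that need escaping; regexSplit and escapeRegExp are hypothetical names, not built-ins.

```javascript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: split on any delimiter, keeping the delimiters.
function regexSplit(input, delimiters) {
  var alternatives = delimiters
    .filter(Boolean) // ignore empty delimiters
    .sort(function (a, b) { return b.length - a.length; }) // longest first
    .map(escapeRegExp);
  if (!alternatives.length) {
    return [input];
  }
  // The capturing group makes split() keep the matched delimiters.
  var pattern = new RegExp("(" + alternatives.join("|") + ")");
  return input.split(pattern);
}

var parts = regexSplit(
  "Go ye away, I want some peace && quiet. & Thanks.",
  ["Go ", ",", "&&", "&", "."]
);
console.log(parts);
// -> ["", "Go ", "ye away", ",", " I want some peace ", "&&",
//     " quiet", ".", " ", "&", " Thanks", ".", ""]
console.log(parts.filter(Boolean)); // same list without the empty strings
```

Sorting longest-first matters because regex alternation takes the first alternative that matches at a position, so "&&" has to come before "&". Adding .filter(Boolean) gives the output the question asked for; leaving it off gives the normal .split behavior with leading and trailing empty strings.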