Create an array of words (Strings) from a String
How do I create an array of strings from a string? For example, "hello world" would return ["hello", "world"]. This would need to take punctuation marks, etc. into account.
There's probably a great RegEx solution for this; I'm just not capable of finding it.
How about AS3's String.split?
var text:String = "hello world";
var split:Array = text.split(" "); // this will give you ["hello", "world"]
// then iterate and strip out any redundant punctuation like commas, colons and full stops
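The "iterate and strip" step could look something like the sketch below. The sample text and the set of punctuation characters in the pattern are assumptions; adjust the character class to whatever you want removed.

```actionscript
var text:String = "hello, world.";
var words:Array = text.split(" "); // ["hello,", "world."]

for (var i:int = 0; i < words.length; i++) {
    // Remove commas, full stops, colons, semicolons, and ! or ? from each word
    words[i] = String(words[i]).replace(/[,.:;!?]/g, "");
}

trace(words); // hello,world
```

Note that splitting on a single space leaves empty strings in the array wherever two spaces occur in a row.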
Think I've cracked it, here is the function in full:
public static function getArrayFromString(str:String):Array {
return str.split(/\W | ' | /gi);
}
Basically, it uses the 'not a word' condition but excludes apostrophes, is global and ignores case. Thanks to everyone who pointed me in the right direction.
Any reason that:
var myString:String = "hello world";
var reg:RegExp = /\W/i;
var stringAsArray:Array = myString.replace(reg, "").split(" ");
Won't work?
Maybe this one works too...
public static function getArrayFromString(str:String):Array {
return str.split(/[,\.\s\n\r\f¿\?¡!]+/gi); // split on runs of the listed delimiters
}
That should also work for languages other than English, because it lists the delimiters explicitly instead of relying on '\w', which does not match accented characters.
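To illustrate the accented-character point, here is a small sketch. The sample sentence is an assumption, and it uses match with a negated character class rather than split, which is a substitute technique and not this answer's exact code. It can be run from a frame script or from inside any method.

```actionscript
// Sample sentence chosen for illustration only.
var sample:String = "¿Cómo estás?";

// \w only covers A-Z, a-z, 0-9 and _, so accented letters break the words apart.
trace(sample.match(/\w+/g)); // C,mo,est,s

// Matching runs of "anything that is not a listed delimiter" keeps accented words whole.
trace(sample.match(/[^,\.\s¿\?¡!]+/g)); // Cómo,estás
```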
Here's what you need. Tested and working:
private function splitString(str:String):Array {
var r:RegExp = /\W+/g;
return str.split(r);
}
http://snipplr.com/view/63811/split-string-into-array/
This seems to do what you want:
package
{
import flash.display.Sprite
public class WordSplit extends Sprite
{
public function WordSplit()
{
var inText:String = "This is a Hello World example.\nIt attempts,\
to simulate! what splitting\" words ' using: punctuation\tand\
invisible ; characters ^ & * yeah.";
var regExp:RegExp = /\w+/g;
var wordList:Array = inText.match(regExp);
trace(wordList);
}
}
}
If not, please provide a sample input and output specification.
I think you might want something like this:
public static function getArrayFromString(str:String):Array {
return str.split(/[\W']+/gi);
}
Basically, you can add any characters that you want to be considered delimiters into the square brackets. Here's how the pieces work:
- The brackets define a set of characters.
- The things in the brackets are the characters in the set (with \W meaning "any non-word character", i.e. anything other than a letter, digit, or underscore)
- The plus sign means "one or more of the previous item"—in this case, the character set. That way, if you have something with several of the characters in a row, you won't get empty items in your array.
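The pieces above can be sketched as follows. The helper name splitWords and the sample sentence are hypothetical, not from the answer. The extra filter step is an addition: the plus sign prevents empty items between words, but a delimiter at the very start or end of the string can still leave an empty string at that end of the array.

```actionscript
// Hypothetical helper, shown only to illustrate the character-set approach.
function splitWords(str:String):Array {
    // [\W']+ matches one or more consecutive delimiter characters
    var parts:Array = str.split(/[\W']+/);

    // Drop any empty strings left by leading or trailing delimiters
    return parts.filter(function(item:*, index:int, arr:Array):Boolean {
        return String(item).length > 0;
    });
}

trace(splitWords("Hello, world! It's fine.")); // Hello,world,It,s,fine
```

As the output shows, treating the apostrophe as a delimiter splits "It's" into "It" and "s". If contractions should stay whole, the apostrophe has to be kept out of the delimiter set, for example with a class such as [^\w']+.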