How to create article spinner regex in Java?
Say for example I want to take this phrase:
{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}
and randomly make it into one of the following:
Hello world
Goodbye people
What's Up word
What's Up planet
Later citizens
etc.
The basic idea is that enclosed within every pair of braces will be an unlimited number of choices separated by "|". The program needs to go through and randomly choose one choice for each set of braces. Keep in mind that braces can be nested endlessly within each other. I found a thread about this and tried to convert it to Java, but it did not work. Here is the python code that supposedly worked:
import re
from random import randint
def select(m):
choices = m.group(1).split('|')
return choices[randint(0, len(choices)-1)]
def spinner(s):
r = re.compile('{([^{}]*)}')
while True:
s, n = r.subn(select, s)
if n == 0: break
return s.strip()
Here is my attempt to convert that Python code to Java.
public String generateSpun(String text){
String spun = new String(text);
Pattern reg = Pattern.compile("{([^{}]*)}");
Matcher matcher = reg.matcher(spun);
while (matcher.find()){
spun = matcher.replaceFirst(select(matcher.group()));
}
r开发者_如何学Goeturn spun;
}
private String select(String m){
String[] choices = m.split("|");
Random random = new Random();
int index = random.nextInt(choices.length - 1);
return choices[index];
}
Unfortunately, when I try to test this by calling
generateAd("{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}");
In the main of my program, it gives me an error in the line in generateSpun where Pattern reg is declared, giving me a PatternSyntaxException.
java.util.regex.PatternSyntaxException: Illegal repetition
{([^{}]*)}
Can someone try to create a Java method that will do what I am trying to do?
Here are some of the problems with your current code:
- You should reuse your compiled
Pattern
, instead ofPattern.compile
every time - You should reuse your
Random
, instead ofnew Random
every time - Be aware that
String.split
is regex-based, so you mustsplit("\\|")
- Be aware that curly braces in Java regex must be escaped to match literally, so
Pattern.compile("\\{([^{}]*)\\}");
- You should query
group(1)
, notgroup()
which defaults to group0
- You're using
replaceFirst
wrong, look upMatcher.appendReplacement/Tail
instead Random.nextInt(int n)
has exclusive upper bound (like many such methods in Java)- The algorithm itself actually does not handle arbitrarily nested braces properly
Note that escaping is done by preceding with \
, and as a Java string literal it needs to be doubled (i.e. "\\"
contains a single character, the backslash).
Attachment
- Source code and output with above fix but no major change to algorithm
To fix the regex, add backslashes before the outer {
and }
. These are meta-characters in Java regexes. However, I don't think that will result in a working program. You are modifying the variable spun
after it has been bound to the regex, and I do not think the returned Matcher will reflect the updated value.
I also don't think the python code will work for nested choices. Have you actually tried the python code? You say it "supposedly works", but it would be wise to verify that before you spend a lot of time porting it to Java.
Well , I just created one in PHP & Python , demo here http://spin.developerscrib.com , its at a very early stage so might not work to expectation , the source code is on github : https://github.com/razzbee/razzy-spinner
Use this, will work... I did, and working great
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());
and here
private String select(String m){
String[] choices = m.split("|");
Random random = new Random();
int index = random.nextInt(choices.length - 1);
return choices[index];
}
m.split("|")
use m.split("\\|")
Other wise it splits each an every character
and use Pattern.compile("\\{([^{}]*)\\}");
精彩评论