
Regexp equivalent of str.getSubstring(x, y);

What would be a RegExp equivalent of substring from position x to position y?

For example:


I know that the value of a field called 'status' in this fixed-length string is 8 characters starting from position 26. What would a regex in JavaScript (and Java) look like to get the "AWESOME" string?

I'm trying to build a parser for mainframe screens that arrive over JMS as fixed-length strings. The idea is to write a UI where a user can highlight a section of a string, fill in the 'field name' field, select a type (int, String, ...) and have a Java class generated automatically.


The first .{26} eats the first 26 characters, whatever they are (other than line terminators, which . does not match by default). The (.{8}) captures the next 8 characters and stores them. For JavaScript you can use

var matches = /.{26}(.{8})/.exec("0001..STACK.OVERFLOW...IS.AWESOME.13052011");

and matches[1] will be the substring that you're looking for (matches[0] always contains the entire matched string).

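Note that the dots can be replaced by character classes if you want, e.g. [\w]{26}([\w]{8}), as long as the class covers every character that can occur in the record (\w alone would not match the '.' padding in the example string).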

In Java, Pattern.compile("(?s)(?<=.{26}).{8}") should do it, and the substring is the matched text.
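A minimal Java sketch of the lookbehind version, run against the sample record from the question. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedFieldLookbehind {
    // The lookbehind requires 26 characters before the match without
    // consuming them, so the whole match (group 0) is the 8-char field.
    private static final Pattern STATUS = Pattern.compile("(?s)(?<=.{26}).{8}");

    // Returns the field, or null if the record is too short to contain it.
    static String extractStatus(String record) {
        Matcher m = STATUS.matcher(record);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String record = "0001..STACK.OVERFLOW...IS.AWESOME.13052011";
        System.out.println(extractStatus(record));        // AWESOME.
        System.out.println(record.substring(26, 34));     // AWESOME.
    }
}
```

Note that the 8-character field includes the trailing '.' padding, exactly as substring(26, 34) would return it.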

You can't rely on that in JavaScript, because JavaScript regexes did not support lookbehind before ES2018, but you can use a capturing group. The closest portable equivalent is /^[\s\S]{26}([\s\S]{8})/, and the substring is in group 1.
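The capturing-group form also works unchanged in Java, and it generalizes to arbitrary positions. Below is a rough sketch of a regex counterpart to s.substring(x, y); regexSubstring is a hypothetical helper name, not a library method.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSubstring {
    // Hypothetical helper: skip x chars from the start, capture the next y - x.
    // [\s\S] matches any char including newlines, so no DOTALL flag is needed.
    // Returns null when s is shorter than y (String.substring would throw).
    static String regexSubstring(String s, int x, int y) {
        Pattern p = Pattern.compile("^[\\s\\S]{" + x + "}([\\s\\S]{" + (y - x) + "})");
        Matcher m = p.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String record = "0001..STACK.OVERFLOW...IS.AWESOME.13052011";
        System.out.println(regexSubstring(record, 26, 34)); // AWESOME.
    }
}
```

In a real parser the compiled Pattern would presumably be cached per field rather than rebuilt on every call.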

Note that this counts chars (UTF-16 code units), not code points, so it might split a surrogate pair, but JavaScript's and Java's built-in substring functions have the same problem.
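To illustrate the surrogate-pair caveat, here is a small Java sketch. substringByCodePoints is a hypothetical helper built on String.offsetByCodePoints; it is only relevant if the records can contain characters outside the Basic Multilingual Plane.

```java
public class CodePointSubstring {
    // Hypothetical helper: like substring, but start/end count code points
    // instead of chars (UTF-16 code units).
    static String substringByCodePoints(String s, int start, int end) {
        int from = s.offsetByCodePoints(0, start);
        int to = s.offsetByCodePoints(from, end - start);
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (one code point stored as two chars), 'b', 'c'
        String s = "a\uD83D\uDE00bc";

        // Char-based substring cuts the pair in half: the second char of
        // the result is a lone high surrogate.
        String broken = s.substring(0, 2);
        System.out.println(Character.isHighSurrogate(broken.charAt(1))); // true

        // Code-point-based version keeps the pair intact (3 chars long).
        String whole = substringByCodePoints(s, 0, 2);
        System.out.println(whole.length()); // 3
    }
}
```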

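Something like (?<=^.{2}).*(?=.{3}$) will give you the substring starting after the first 2 characters and ending 3 positions before the end (the anchors have to sit inside the lookarounds, otherwise the pattern can never match). This relies on lookbehind, which older JavaScript engines lack, but even where it works, stick with substring.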
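A Java sketch of the "skip 2 from the start, stop 3 before the end" idea, written with the anchors inside the lookarounds, next to the substring call it mimics. The class and method names are illustrative only, and the sketch assumes single-line input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrimEnds {
    // Lookbehind: exactly 2 chars between the start of input and the match.
    // Lookahead: exactly 3 chars between the match and the end of input.
    private static final Pattern MIDDLE = Pattern.compile("(?<=^.{2}).*(?=.{3}$)");

    // Returns null if the input has fewer than 5 characters.
    static String trimEnds(String s) {
        Matcher m = MIDDLE.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "abcdefgh";
        System.out.println(trimEnds(s));                   // cde
        System.out.println(s.substring(2, s.length() - 3)); // cde
    }
}
```

The substring call is shorter and easier to read, which is the point of the advice above.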

I think this should work:


This will not work in JavaScript engines that predate ES2018, because they do not support look-behind.

Edit (regarding newlines): if you want the . to match newlines, you have to specify Pattern.DOTALL in Java. For more detail, look at Pattern in the Javadocs, particularly Pattern.compile(String regex, int flags).

You can also turn this on by including (?s) in your regex.
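A short Java sketch of the difference, assuming a record that happens to contain a newline inside the first 26 characters (the sample record below is the question's string with one '.' replaced by '\n'; the class and method names are made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllDemo {
    static final Pattern PLAIN = Pattern.compile("(?<=.{26}).{8}");
    static final Pattern FLAGGED = Pattern.compile("(?<=.{26}).{8}", Pattern.DOTALL);
    static final Pattern INLINE = Pattern.compile("(?s)(?<=.{26}).{8}");

    // Returns the first match of p in record, or null if there is none.
    static String extract(Pattern p, String record) {
        Matcher m = p.matcher(record);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String record = "0001..STACK\nOVERFLOW...IS.AWESOME.13052011";
        System.out.println(extract(PLAIN, record));   // null: . stops at the newline
        System.out.println(extract(FLAGGED, record)); // AWESOME.
        System.out.println(extract(INLINE, record));  // AWESOME.
    }
}
```

Without DOTALL the pattern here finds nothing; on a longer record it could instead match at a later, wrong offset, which is arguably worse.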

Another thought: if, instead of strictly getting the substring starting 28 characters in, you want to get one that follows some other pattern (like 8 characters after "STATUS"), you could just do this:
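As a minimal sketch of that idea in Java: the record layout below, with a literal STATUS marker directly in front of the field, is invented for illustration and is not the layout from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AfterMarker {
    // Capture the 8 characters that immediately follow the literal "STATUS".
    // (?s) lets . match newlines in case the field spans a line break.
    private static final Pattern AFTER_STATUS = Pattern.compile("(?s)STATUS(.{8})");

    // Returns the 8-char field, or null if the marker or field is missing.
    static String fieldAfterStatus(String record) {
        Matcher m = AFTER_STATUS.matcher(record);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical record: the marker can sit at any offset.
        String record = "0001..STATUSAWESOME.13052011";
        System.out.println(fieldAfterStatus(record)); // AWESOME.
    }
}
```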





