Multiline regexp matcher

The input file contains:

XX00002200000

XX00003300000

regexp:

(.{6}22.{5}\W)(.{6}33.{5})

I tried it in The Regex Coach (an app for testing regexps), and the strings match fine.

Java:

        pattern = Pattern.compile(patternString);
        inputStream = resource.getInputStream();

        scanner = new Scanner(inputStream, charsetName);
        scanner.useDelimiter("\r\n");

patternString is the regexp mentioned above, set as a bean property in an .xml file.

From Java, however, the match fails.


Simple solution: ".{6}22.{5}\\s+.{6}33.{5}". Note that \s+ matches one or more consecutive whitespace characters, which includes \r and \n.

Here's an example:

 public static void main(String[] argv) {
  // The two target lines are separated by \r\n, with extra text around them
  String input = "yXX00002200000\r\nXX00003300000\nshort";
  String regex = ".{6}22.{5}\\s+.{6}33.{5}";
  Pattern pattern = Pattern.compile(regex);
  Matcher m = pattern.matcher(input);

  // Print every match found in the input
  while (m.find()) {
   System.out.println(m.group());
  }
 }

With output:

XX00002200000
XX00003300000
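
A side note on why \s+ (or \W+) helps where the single \W from the question may not: \W consumes exactly one character, while a Windows line ending is two (\r\n), and by default "." does not match line terminators. The sketch below only illustrates that point; the class name LineBreakMatch and the helper found are made up for the demo and are not part of the original code.

```java
import java.util.regex.Pattern;

class LineBreakMatch {
    // Returns true if the regex matches somewhere inside the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String crlf = "XX00002200000\r\nXX00003300000"; // Windows line ending
        String lf = "XX00002200000\nXX00003300000";     // Unix line ending

        // A single \W eats only \r; the following "." cannot match \n.
        System.out.println(found(".{6}22.{5}\\W.{6}33.{5}", crlf));   // false
        // With a one-character line ending the single \W is enough.
        System.out.println(found(".{6}22.{5}\\W.{6}33.{5}", lf));     // true
        // One-or-more quantifiers cover both kinds of line ending.
        System.out.println(found(".{6}22.{5}\\W+.{6}33.{5}", crlf));  // true
        System.out.println(found(".{6}22.{5}\\s+.{6}33.{5}", crlf));  // true
    }
}
```

So if the file was saved with \r\n line endings, the original pattern cannot match across the break even when the whole file is read as one string.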

To play around with Java Regex you can use: Regular Expression Editor (free online editor)

Edit: I think you are altering the input while reading the data. Try:

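public static String readFile(String filename) throws FileNotFoundException {
    Scanner sc = new Scanner(new File(filename));

    StringBuilder sb = new StringBuilder();
    // nextLine() strips the line terminator, so put one back;
    // otherwise the lines are joined and \s+ has nothing to match
    while (sc.hasNextLine())
        sb.append(sc.nextLine()).append('\n');
    sc.close();

    return sb.toString();
}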

Or

static String readFile(String path) throws IOException {
    // try-with-resources closes the stream and channel even if mapping fails
    try (FileInputStream stream = new FileInputStream(new File(path));
         FileChannel channel = stream.getChannel()) {
        // Map the whole file into memory, read-only
        MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0,
                channel.size());
        // Decode with the platform default charset
        return Charset.defaultCharset().decode(buffer).toString();
    }
}

With imports like:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
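
For what it's worth, on Java 7 or later the whole file can also be read in one call through java.nio.file.Files, which keeps the line breaks intact. This is only a sketch: the class name WholeFileMatch, the helper firstMatch and the temp file are made up for the demo, and the charset passed in is an assumption that should be replaced with the file's real encoding.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeFileMatch {
    // Reads the complete file, line terminators included, into one String.
    static String readFile(String path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(Paths.get(path)), charset);
    }

    // Returns the first match of the regex in the text, or null if none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input file, written here so the demo is self-contained.
        Path tmp = Files.createTempFile("regex-input", ".txt");
        Files.write(tmp, "XX00002200000\r\nXX00003300000\r\n"
                .getBytes(StandardCharsets.US_ASCII));

        String text = readFile(tmp.toString(), StandardCharsets.US_ASCII);
        System.out.println(firstMatch(".{6}22.{5}\\s+.{6}33.{5}", text) != null);

        Files.delete(tmp);
    }
}
```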


Try changing the delimiter:

 scanner.useDelimiter("\\s+");

Also, why not use a more general regex like this:

 ".{6}[0-9]{2}.{5}"

The regex you mentioned above spans two lines. Since you set the delimiter to a line break, each token the scanner returns is a single line, so the regex has to fit a single line.
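
To make that concrete, here is a rough sketch of per-token matching with the "\\s+" delimiter and the single-line regex suggested above. The class name PerLineMatch and the helper matchingLines are invented for the illustration, and the input is passed as a String rather than a stream just to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class PerLineMatch {
    // Splits the input on whitespace and keeps the tokens that fully match.
    static List<String> matchingLines(String input, String lineRegex) {
        Pattern pattern = Pattern.compile(lineRegex);
        List<String> hits = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter("\\s+");
            while (scanner.hasNext()) {
                String token = scanner.next();
                if (pattern.matcher(token).matches()) {
                    hits.add(token);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "XX00002200000\r\nXX00003300000\r\nshort";

        // The single-line regex matches both 13-character lines.
        System.out.println(matchingLines(input, ".{6}[0-9]{2}.{5}"));
        // The two-line regex never matches, because no token contains a line break.
        System.out.println(matchingLines(input, ".{6}22.{5}\\s+.{6}33.{5}"));
    }
}
```

The second call returns an empty list, which is the same failure the question describes.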


Pardon my ignorance, but I am still not sure what exactly you are trying to search for. If you are trying to find this string (line break included)

XX00002200000
XX00003300000

then why are you splitting the input on line breaks when reading it?

To read the above string as it is, the following code works:

Pattern p = Pattern.compile(".{6}22.{5}\\W+.{6}33.{5}");

// try-with-resources closes the stream when done
try (FileInputStream in = new FileInputStream("C:\\new.txt")) {
    byte[] buf = new byte[100];
    // read() returns the number of bytes actually read, or -1 for an empty file
    int n = in.read(buf);
    // Decode only the bytes that were read, not the unused tail of the buffer
    String s = new String(buf, 0, Math.max(n, 0));
    Matcher m = p.matcher(s);
    if (m.find())
        System.out.println(m.group());
} catch (IOException e) {
    e.printStackTrace();
}

NB: here the file new.txt contains:

XX00002200000
XX00003300000
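
If you would rather keep the Scanner from the question than read a fixed 100-byte buffer, one possible option is Scanner.findWithinHorizon, which searches the input while ignoring the delimiter. A minimal sketch, assuming the stream and charset name come from wherever your code already gets them; the class name HorizonMatch and the helper findAcrossLines are hypothetical, and an in-memory stream stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.regex.Pattern;

class HorizonMatch {
    // Returns the first match in the stream, or null if there is none.
    // A horizon of 0 means "no bound", so the scanner may buffer all input.
    static String findAcrossLines(InputStream in, String charsetName, Pattern pattern) {
        try (Scanner scanner = new Scanner(in, charsetName)) {
            return scanner.findWithinHorizon(pattern, 0);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real file contents (an assumption for the demo).
        byte[] data = "XX00002200000\r\nXX00003300000\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        Pattern p = Pattern.compile(".{6}22.{5}\\W+.{6}33.{5}");

        String hit = findAcrossLines(new ByteArrayInputStream(data), "US-ASCII", p);
        System.out.println(hit != null);
    }
}
```

Because an unbounded horizon can pull the entire input into memory, this is probably only reasonable for small files like the one here.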
