Breaking up a String in Java

I have multiple strings that are in the following format:

12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]

From these strings I need to extract the date, time, first and last name of the person, and the card number. The word "Admitted" can be omitted, and anything following the final digit of the card number can be ignored.

I have a feeling I want to use StringTokenizer for this, but I'm not positive.

Any suggestions?


StringTokenizer is great when you have a common delimiter, but in this case I'd opt for regular expressions.
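
To make the regular-expression suggestion concrete, here is a minimal sketch that pulls all five fields out with a single pattern and named groups. The class name RecordRegexSketch, the method parse, and the group names are made up for illustration, and the pattern assumes every record follows exactly the layout shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
class RecordRegexSketch {
    // Assumed layout: MM/dd/yyyy HH:mm:ss <word> Last, First (Card #N) ...
    private static final Pattern RECORD = Pattern.compile(
        "(?<date>\\d{2}/\\d{2}/\\d{4}) (?<time>\\d{2}:\\d{2}:\\d{2}) \\w+ "
        + "(?<last>[^,]+), (?<first>[^(]+?) \\(Card #(?<card>\\d+)\\)");

    /** Returns {date, time, firstName, lastName, cardNumber}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = RECORD.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            m.group("date"), m.group("time"), m.group("first"), m.group("last"), m.group("card")
        };
    }

    public static void main(String[] args) {
        String line = "12/18/2009 02:08:26 Admitted Van Halen, Eddie (Card #222) at South Lobby [In]";
        System.out.println(String.join(" | ", parse(line)));
    }
}
```

Everything after the closing parenthesis of the card number is simply never matched, which is how the trailing location text gets ignored.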


Your record format is simple enough that I'd just use String's split method to get the date and time. As pointed out in the comments, having names that can contain spaces complicates things just enough that splitting the record by spaces won't work for every field. I used a regular expression to grab the other three pieces of information.

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args) {
    String record1 = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
    String record2 = "12/18/2009 02:08:26 Admitted Van Halen, Eddie (Card #222) at South Lobby [In]";
    String record3 = "12/18/2009 02:08:26 Admitted Thoreau, Henry David (Card #333) at South Lobby [In]";

    summary(record1);
    summary(record2);
    summary(record3);
}

public static void summary(String record) {
    String[] tokens = record.split(" ");

    String date = tokens[0];
    String time = tokens[1];

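    // Capture last name, first name, and card number; the literal parentheses are escaped.
    String regEx = "Admitted (.*), (.*) \\(Card #(.*)\\)";
    Pattern pattern = Pattern.compile(regEx);
    Matcher matcher = pattern.matcher(record);
    if (!matcher.find()) {
        // group() would throw IllegalStateException if no match was found
        System.out.println("\nUnrecognized record: " + record);
        return;
    }

    String lastName = matcher.group(1);
    String firstName = matcher.group(2);
    String cardNumber = matcher.group(3);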

    System.out.println("\nDate: " + date);
    System.out.println("Time: " + time);
    System.out.println("First Name: " + firstName);
    System.out.println("Last Name: " + lastName);
    System.out.println("Card Number: " + cardNumber);
}

The regular expression "Admitted (.*), (.*) \\(Card #(.*)\\)" uses grouping parentheses to store the information you're trying to extract. The parentheses that exist in your record must be escaped.

Running the code above gives me the following output:

Date: 12/18/2009
Time: 02:08:26
First Name: John
Last Name: Doe
Card Number: 111

Date: 12/18/2009
Time: 02:08:26
First Name: Eddie
Last Name: Van Halen
Card Number: 222

Date: 12/18/2009
Time: 02:08:26
First Name: Henry David
Last Name: Thoreau
Card Number: 333
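
The point about escaping the parentheses can be checked in isolation. The following is a minimal sketch; the class name EscapeSketch and the method cardNumber are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration.
class EscapeSketch {
    /** Extracts the digits from a literal "(Card #N)" fragment, or returns null if there is none. */
    static String cardNumber(String text) {
        // \\( and \\) match literal parentheses; the inner (...) is a capturing group.
        Matcher m = Pattern.compile("\\(Card #(\\d+)\\)").matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "(Card #111)";
        // Unescaped, the parentheses form a group, so the pattern describes only "Card #111".
        System.out.println(Pattern.matches("(Card #111)", text));       // false
        // Escaped, the parentheses are matched as literal characters.
        System.out.println(Pattern.matches("\\(Card #111\\)", text));   // true
        // Pattern.quote escapes a whole literal string at once.
        System.out.println(Pattern.matches(Pattern.quote(text), text)); // true
        System.out.println(cardNumber("Doe, John (Card #111) at South Lobby"));
    }
}
```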


I'd go for java.util.Scanner. This code will get you started, though you should really use the Pattern form of the Scanner methods rather than the String form that I used.

import java.util.Scanner;

public class Main
{
    public static void main(String[] args)
        throws Exception
    {
        final String  str;
        final Scanner scanner;
        final String  date;
        final String  time;
        final String  word;
        final String  lastName;
        final String  firstName;

        str       = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
        scanner   = new Scanner(str);
        date      = scanner.next("\\d+/\\d+/\\d+");
        time      = scanner.next("\\d+:\\d+:\\d+");
        word      = scanner.next();
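        lastName  = scanner.next(); // yields "Doe," with the comma still attached
        firstName = scanner.next(); // yields "John"; multi-word names need more than one token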
        System.out.println("date : " + date);
        System.out.println("time : " + time);
        System.out.println("word : " + word);
        System.out.println("last : " + lastName);
        System.out.println("first: " + firstName);
    }
}
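
As a rough illustration of the Pattern form mentioned above, the sketch below precompiles the patterns and uses findInLine to pick up the names and the card number, which the starter code leaves out. The class name ScannerPatternSketch, the method parse, and the REST pattern are assumptions made for this example, not part of the original answer.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

// Hypothetical helper for illustration.
class ScannerPatternSketch {
    private static final Pattern DATE = Pattern.compile("\\d+/\\d+/\\d+");
    private static final Pattern TIME = Pattern.compile("\\d+:\\d+:\\d+");
    // Assumed layout for the rest of the line: " Last, First (Card #N)"
    private static final Pattern REST = Pattern.compile("\\s*([^,]+), (.*) \\(Card #(\\d+)\\)");

    /** Returns {date, time, firstName, lastName, cardNumber}. */
    static String[] parse(String line) {
        try (Scanner scanner = new Scanner(line)) {
            String date = scanner.next(DATE); // throws InputMismatchException if the token differs
            String time = scanner.next(TIME);
            scanner.next();                   // skip the event word, e.g. "Admitted"
            // findInLine ignores the whitespace delimiter, so names may contain spaces.
            if (scanner.findInLine(REST) == null) {
                throw new IllegalArgumentException("Unrecognized record: " + line);
            }
            MatchResult rest = scanner.match();
            return new String[] { date, time, rest.group(2), rest.group(1), rest.group(3) };
        }
    }

    public static void main(String[] args) {
        String line = "12/18/2009 02:08:26 Admitted Thoreau, Henry David (Card #333) at South Lobby [In]";
        System.out.println(String.join(" | ", parse(line)));
    }
}
```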


A few things to keep in mind while you are parsing this line:

  • Last names can contain spaces, so look for the comma to find where the last name ends.
  • First names can contain spaces too, so look for the ( to find where the first name ends.

Because of this I would work off of TofuBeer's answer and adjust the next() calls for first and last name. String.split is gonna be messy due to the extra spaces.
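
The two boundaries described above can be sketched with plain indexOf and substring calls rather than Scanner. The class name NameBoundarySketch, the method parseName, and the reliance on the literal word "Admitted " as the start marker are assumptions made for this example.

```java
// Hypothetical helper for illustration.
class NameBoundarySketch {
    /** Returns {lastName, firstName}; assumes the "Admitted Last, First (Card #N)" layout. */
    static String[] parseName(String record) {
        String marker = "Admitted ";
        int start = record.indexOf(marker);
        if (start < 0) {
            throw new IllegalArgumentException("No \"Admitted\" marker in: " + record);
        }
        int nameStart = start + marker.length();
        int comma = record.indexOf(',', nameStart);               // end of the last name
        int paren = comma < 0 ? -1 : record.indexOf('(', comma);  // end of the first name
        if (paren < 0) {
            throw new IllegalArgumentException("Unrecognized record: " + record);
        }
        String lastName = record.substring(nameStart, comma).trim();
        String firstName = record.substring(comma + 1, paren).trim();
        return new String[] { lastName, firstName };
    }

    public static void main(String[] args) {
        String[] name = parseName(
            "12/18/2009 02:08:26 Admitted Van Halen, Eddie (Card #222) at South Lobby [In]");
        System.out.println("last: " + name[0] + ", first: " + name[1]);
    }
}
```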


Shortest regexp solution (with type conversion):

String stringToParse = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In] ";
// Groups: 1 = date and time, 2 = last name, 3 = first name, 4 = card number
Pattern pattern = Pattern.compile("((?:\\d{2}/){2}\\d{4}\\s(?:\\d{2}:){2}\\d{2})\\s\\w+\\s(.*),\\s(.*)\\s\\(Card #(\\d+)\\)");
Matcher matcher = pattern.matcher(stringToParse);
matcher.find();

String firstName = matcher.group(3);
String lastName = matcher.group(2);
int cardNumber = Integer.parseInt(matcher.group(4));

DateFormat df = new SimpleDateFormat("MM/dd/yyyy HH:mm:ss");
Date date = df.parse(matcher.group(1)); // throws ParseException
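
If Java 8 or later is available, the date conversion step could also be done with java.time instead of SimpleDateFormat. This is a sketch of that alternative, not what the answer above uses; the class name TimestampSketch and the method parseTimestamp are made up, and it assumes the timestamp always occupies the first 19 characters of the record.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration.
class TimestampSketch {
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss");

    /** Parses the leading "MM/dd/yyyy HH:mm:ss" portion of a record. */
    static LocalDateTime parseTimestamp(String record) {
        // "12/18/2009 02:08:26" is 10 + 1 + 8 = 19 characters long.
        return LocalDateTime.parse(record.substring(0, 19), FORMAT);
    }

    public static void main(String[] args) {
        String record = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
        System.out.println(parseTimestamp(record)); // prints 2009-12-18T02:08:26
    }
}
```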


Trust your guts... :) With StringTokenizer:

import java.util.StringTokenizer;

public class Test {
    public Test() {
    }

    public void execute(String str) {
        String date, time, firstName, lastName, cardNo;
        StringTokenizer st = new StringTokenizer(str, " ");
        date = st.nextToken();
        time = st.nextToken();
        st.nextToken(); // Admitted
        lastName = st.nextToken(",").trim();
        firstName = st.nextToken(",(").trim();
        st.nextToken("#"); // Card
        cardNo = st.nextToken(")#");
        System.out.println("date = " + date
                + "\ntime = " + time
                + "\nfirstName = " + firstName
                + "\nlastName = " + lastName
                + "\ncardNo = " + cardNo);
    }

    public static void main(String args[]) {
        Test t = new Test();
        String record1 = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
        String record2 = "12/18/2009 02:08:26 Admitted Van Halen, Eddie (Card #222) at South Lobby [In]";
        String record3 = "12/18/2009 02:08:26 Admitted Thoreau, Henry David (Card #333) at South Lobby [In]";
        t.execute(record1);
        t.execute(record2);
        t.execute(record3);
    }
}
