Tactics for parsing a fixed-width text log in Java
I'm trying to figure out how best to parse the following log file: split it into the sections separated by the horizontal lines, extract various pieces of data from each section (e.g. 'COMPANY123', 'BIMMU', the date such as 2/18), and then build a string containing all of the other data in that section.
I.e., I want to create an array of 'statement' objects each with the following attributes:
Company name, Account, Date, Data.
E.g. for the second record below,
Account = 'BIMMU'
Firm = 'Super Corporation'
Date = '9/14/11'
Data = '* * * * * * * * TODAYS ACCOUNT ACTIVITY * * * * * * * * * * *
9/14/11 Y9 CALL OESX OCT 11 ........ etc'
The log is a fixed-width text file and the variables (date etc.) always occur at the same position in the line, e.g. sSalesCode = line.substring(142, 147);
Should I maybe do this in two passes, i.e. first split the file into sections delimited by the horizontal lines, and then parse each section individually?
Just writing this out here has helped me get my train of thought, but if anybody else has any smart ideas then it would be great to hear them.
------------------------------------------------------------------------------------------------------------------------------------F BIASPBIMMU
BIMMU BIASP-COMPANY123 KG (Z ) 9/14/11 EU (T- I- ) MT-0 F BIASP²BIMMU
CALLS 2/18 YI 50.00-X (49) F BIASP²BIMMU
------------------------------------------------------------------------------------------------------------------------------------F BIASPBIMMU
BIMMU BIMM2-SUPER CORPORATION KG (Z ) 9/14/11 EU (T- I- ) MT-0 F BIMM2²BIMMU
F BIMM2²BIMMU
* * * * * * * * * * * * * * * * * * * T O D A Y S A C C O U N T A C T I V I T Y * * * * * * * * * * * * * * * * * * * *F BIMM2²BIMMU
9/14/11 Y9 500 GO CALL OESX OCT 11 2400 9.60 EU .00 F BIMM2²BIMMU
GO-PARFSecurities Ser F BIMM2²BIMMU
Y9 * 500 * COMMISSIONS EU 250.00- F BIMM2²BIMMU
Y9 PERTES & PROFITS NETS EU 250.00- F BIMM2BIMMU
CALLS 9/14 E1 17,825.00-H ( 1) F BIMM2²BIMMU
CALLS 9/14 E1 17,825.00-N ( 1) F BIMM2²BIMMU
-----------------------------------------------------------------------------------------------------------------------------------
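The two-pass idea from the question could be sketched in plain Java roughly as below. This is only a sketch under assumptions: the class and method names (StatementSplitter, Section, field) are made up for illustration, a separator is assumed to be any line starting with ten dashes, and the first line after a separator is assumed to be the header that carries account, firm and date. The real column positions have to come from the actual file.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a two-pass parser: split on separator lines, then read each section. */
final class StatementSplitter {

    /** One section of the log: its header line plus the remaining lines as raw text. */
    static final class Section {
        final String header;
        final String data;

        Section(String header, String data) {
            this.header = header;
            this.data = data;
        }
    }

    /** Assumption: a separator is any line that starts with a run of dashes. */
    static boolean isSeparator(String line) {
        return line.startsWith("----------");
    }

    /** Pass 1: group the lines between separators into sections. */
    static List<Section> split(List<String> lines) {
        List<Section> sections = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (isSeparator(line)) {
                addSection(sections, current);
                current = new ArrayList<>();
            } else {
                current.add(line);
            }
        }
        addSection(sections, current); // the file may not end with a separator
        return sections;
    }

    private static void addSection(List<Section> sections, List<String> body) {
        if (body.isEmpty()) {
            return; // nothing between two separators, or before the first one
        }
        String header = body.get(0);
        String data = String.join("\n", body.subList(1, body.size()));
        sections.add(new Section(header, data));
    }

    /** Pass 2 helper: substring that tolerates short lines and trims the padding. */
    static String field(String line, int start, int end) {
        if (start >= line.length()) {
            return "";
        }
        return line.substring(start, Math.min(end, line.length())).trim();
    }
}
```

With the lines read via something like Files.readAllLines, each Section's header can then be cut up with calls such as field(header, 0, 5), using whatever offsets the real layout has. The helper returns an empty string instead of throwing when a line is shorter than expected. The trailing tag column that appears on every line of the sample would still need to be cut off the data lines in the same way.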
You could try the Fixedformat4j framework. It is annotation-driven and works fast. I have implemented it in part of my own project to understand how it works.
You can create an annotated class like this:
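import java.util.Date;

import com.ancientprogramming.fixedformat4j.annotation.Field;
import com.ancientprogramming.fixedformat4j.annotation.FixedFormatPattern;
import com.ancientprogramming.fixedformat4j.annotation.Record;

// nn, mm and yy are placeholders: fill in the real column positions of your log.
// Fixedformat4j reads @Field from the getters, and its offsets are 1-based.
@Record
public class LogRecord {
    private String firm;
    private String user;
    private Date logonDate;
    private String logData;

    @Field(offset=11, length=10)
    public String getFirm() {
        return firm;
    }

    public void setFirm(String firm) {
        this.firm = firm;
    }

    @Field(offset=1, length=10)
    public String getUser() {
        return user;
    }

    public void setUser(String user) {
        this.user = user;
    }

    // MM is the month; lowercase mm would mean minutes.
    @Field(offset=nn, length=8)
    @FixedFormatPattern("MM/dd/yy")
    public Date getLogonDate() {
        return logonDate;
    }

    public void setLogonDate(Date logonDate) {
        this.logonDate = logonDate;
    }

    @Field(offset=mm, length=yy)
    public String getLogData() {
        return logData;
    }

    public void setLogData(String logData) {
        this.logData = logData;
    }
}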
Then load each line into an instance with a FixedFormatManager, e.g. new FixedFormatManagerImpl().load(LogRecord.class, line).
I had a similar problem recently and ended up using Flapjack (Google Code: Flapjack). See the examples on Google Code; I think they should help you out.