Tactics for parsing a fixed-width text log in Java

I'm trying to figure out how best to parse the following log file: split it into the sections separated by the horizontal lines, extract various pieces of data from each section (e.g. 'COMPANY123', 'BIMMU', the date (2/18 etc.)), and then build a string containing all of the other data in that section.

I.e., I want to create an array of 'statement' objects each with the following attributes:

Company name, Account, Date, Data.

E.g. for the second record below,

Account = 'BIMMU'
Firm = 'Super Corporation'
Date= 9/14/11
Data = '* * * * * * * * TODAYS ACCOUNT ACTIVITY * * * * * * * * * * *
        9/14/11 Y9 CALL OESX OCT 11 ........ etc'

The log is a fixed-width text file and the variables (date etc.) always occur at the same position in the line, e.g. sSalesCode = line.substring(142, 147);
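For what it's worth, a bare substring call throws StringIndexOutOfBoundsException on any line shorter than the end column, which is easy to hit with blank or truncated lines. Below is a minimal sketch of a bounds-safe wrapper. The class name FixedWidth and the method field are made up for illustration, and the columns used in main are only examples, not the real layout of the log.

```java
// Hypothetical helper: the names and the column offsets used below are illustrative,
// not taken from the real log specification.
final class FixedWidth {
    private FixedWidth() {}

    /** Returns the trimmed text in columns [start, end), or "" if the line is too short. */
    static String field(String line, int start, int end) {
        if (line == null || start < 0 || start >= line.length()) {
            return "";
        }
        // Clamp the end column so short lines do not throw.
        int safeEnd = Math.min(end, line.length());
        if (safeEnd <= start) {
            return "";
        }
        return line.substring(start, safeEnd).trim();
    }

    public static void main(String[] args) {
        String line = "BIMMU    BIASP-COMPANY123 KG";
        System.out.println(field(line, 0, 9));     // first nine columns, trimmed
        System.out.println(field(line, 142, 147)); // past the end of this short line
    }
}
```

With this in place, a call such as field(line, 142, 147) can be used on every line without checking its length first.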

Should I maybe do this in two passes, i.e. first split the file into sections delimited by the horizontal lines, and then parse each section individually?
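A rough sketch of that two-pass idea in plain Java is below. Everything in it is an assumption made for illustration: the names StatementParser and Statement are invented, a separator is taken to be any line starting with ten or more dashes, the first line of a section is treated as its header, the firm is taken as the text between the first '-' and the first '(' on the header, and the date is the first M/D/YY token on the header. With real fixed columns, the header parsing would use substring offsets instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: class, field and method names are illustrative only.
final class StatementParser {

    /** One section of the log: header fields plus the remaining lines as raw text. */
    static final class Statement {
        final String account;
        final String firm;
        final String date;
        final String data;

        Statement(String account, String firm, String date, String data) {
            this.account = account;
            this.firm = firm;
            this.date = date;
            this.data = data;
        }
    }

    // Assumption: the statement date is the first M/D/YY token on the header line.
    private static final Pattern DATE = Pattern.compile("\\d{1,2}/\\d{1,2}/\\d{2}");

    /** Pass 1: group lines into sections; a line starting with ten dashes is a separator. */
    static List<List<String>> splitSections(List<String> lines) {
        List<List<String>> sections = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("----------")) {
                if (!current.isEmpty()) {
                    sections.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        if (!current.isEmpty()) {
            sections.add(current); // file did not end with a separator
        }
        return sections;
    }

    /** Pass 2: read the header fields from the first line, keep the rest as data. */
    static Statement parseSection(List<String> section) {
        String header = section.get(0);
        String account = header.trim().split("\\s+")[0];
        int dash = header.indexOf('-');
        int paren = header.indexOf('(');
        String firm = (dash >= 0 && paren > dash)
                ? header.substring(dash + 1, paren).trim()
                : "";
        Matcher m = DATE.matcher(header);
        String date = m.find() ? m.group() : "";
        String data = String.join("\n", section.subList(1, section.size()));
        return new Statement(account, firm, date, data);
    }

    static List<Statement> parse(List<String> lines) {
        List<Statement> statements = new ArrayList<>();
        for (List<String> section : splitSections(lines)) {
            statements.add(parseSection(section));
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "--------------------",
                "BIMMU    BIMM2-SUPER CORPORATION KG    (Z )  9/14/11  EU",
                " 9/14/11   Y9   500  GO  CALL OESX",
                "--------------------");
        for (Statement s : parse(lines)) {
            System.out.println(s.account + " | " + s.firm + " | " + s.date);
        }
    }
}
```

Keeping the split and the per-section parse as separate methods means the header rules can change without touching the sectioning logic, and each pass can be tested on its own.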

Just writing this out here has helped me organize my thoughts, but if anybody has any smart ideas, it would be great to hear them.


------------------------------------------------------------------------------------------------------------------------------------F   BIASPBIMMU
BIMMU    BIASP-COMPANY123 KG              (Z )  9/14/11  EU (T-  I- )                      MT-0                              F   BIASP²BIMMU
CALLS     2/18  YI              50.00-X (49)                                                                                        F   BIASP²BIMMU
------------------------------------------------------------------------------------------------------------------------------------F   BIASPBIMMU
BIMMU    BIMM2-SUPER CORPORATION KG              (Z )  9/14/11  EU (T-  I- )                      MT-0                              F   BIMM2²BIMMU
                                                                                                                                    F   BIMM2²BIMMU
* * * * * * * * * * * * * * * * * * *     T O D A Y S    A C C O U N T    A C T I V I T Y    * * * * * * * * * * * * * * * * * * * *F   BIMM2²BIMMU
 9/14/11        Y9             500   GO  CALL OESX   OCT 11  2400            9.60    EU                                        .00  F   BIMM2²BIMMU
                                                              GO-PARFSecurities Ser                                                 F   BIMM2²BIMMU
                Y9        *    500 *     COMMISSIONS                                 EU                                     250.00- F   BIMM2²BIMMU
                Y9                       PERTES & PROFITS NETS                       EU                                     250.00- F   BIMM2BIMMU
CALLS     9/14  E1          17,825.00-H ( 1)                                                                                        F   BIMM2²BIMMU
CALLS     9/14  E1          17,825.00-N ( 1)                                                                                        F   BIMM2²BIMMU
-----------------------------------------------------------------------------------------------------------------------------------                                                                                                                                                                      


You could try the Fixedformat4j framework. It is annotation-driven and fast. I have used it in part of my own project to understand how it works.

You can create a class with annotations like this:

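import java.util.Date;

import com.ancientprogramming.fixedformat4j.annotation.Field;
import com.ancientprogramming.fixedformat4j.annotation.FixedFormatPattern;
import com.ancientprogramming.fixedformat4j.annotation.Record;

// Fixedformat4j expects its annotations on the getters, and its offsets are 1-based.
// The offsets and lengths here are illustrative; nn, mm and yy are placeholders
// that must be replaced with the real column positions before this compiles.
@Record
public class LogRecord {
    private String firm;
    private String user;
    private Date logonDate;
    private String logData;

    @Field(offset = 10, length = 10)
    public String getFirm() {
        return firm;
    }

    public void setFirm(String firm) {
        this.firm = firm;
    }

    // First column of the line (offset 1, not 0).
    @Field(offset = 1, length = 10)
    public String getUser() {
        return user;
    }

    public void setUser(String user) {
        this.user = user;
    }

    // "MM" is the month; lowercase "mm" would mean minutes.
    @Field(offset = nn, length = 8)
    @FixedFormatPattern("MM/dd/yy")
    public Date getLogonDate() {
        return logonDate;
    }

    public void setLogonDate(Date logonDate) {
        this.logonDate = logonDate;
    }

    @Field(offset = mm, length = yy)
    public String getLogData() {
        return logData;
    }

    public void setLogData(String logData) {
        this.logData = logData;
    }

}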

Then populate it from each line with a FixedFormatManager (for example, the load(LogRecord.class, line) method of FixedFormatManagerImpl).


I had a similar problem recently and ended up using Flapjack (Google Code: Flapjack). See the examples on Google Code; they should help you out.