How to parse MICR line data?
I have a digital check scanner that is able to capture the MICR line from the check. It will return the MICR line in raw format as a string, with delimiters to separate the account number, routing number, and check number. However, each bank formats this MICR line differently, so there's no standard way to parse this data.
Some companies I have tried are Inlite Research Inc and Accusoft Pegasus. The API from Inlite Research works for some banks, but cannot read Bank of America checks correctly. I'm still testing out the API from Accusoft.
What I am asking is whether anyone knows of an API that will accurately parse the MICR line into its different components. Is there an API that will let me add new check format definitions if I encounter a check that the API cannot handle correctly? Or does anyone know how to write, or has anyone already written, a routine to parse the MICR line?
I would appreciate any help I can get. Thank you.
Sorry for the late reply. I didn't see any answers to the question so I thought nobody responded.
To answer the questions above, I found a solution after thinking the problem over and talking with various vendors. The check scanner that I'm using is already able to read the MICR line. The problem lies in parsing the MICR line for relevant information such as the routing transit number, account number, check/serial number, and amount (if there is one). After speaking with a handful of 3rd party companies and trying out the available trial versions of MICR parsers, I came to the conclusion that there is no universal parser out there. I was still faced with the problem of the non-conforming On-Us field. Each bank formats this field differently, and sometimes the symbols are arranged differently as well. So, I decided to write my own parser. I think this is the most logical way to proceed, as these 3rd party vendors told me that they each roll their own parsing software.
The way I wrote the parser was to keep a table of MICR line patterns. Each time I encounter a new MICR line format, I add it to this table. My parser matches every scanned check against the table, and if it finds a match, it uses that pattern to pull out the relevant information.
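To make the pattern-table idea concrete, here is a minimal C# sketch. It is not the code described above, just one way such a table could look. It assumes the scanner emits 'T' for the transit symbol and 'U' for the on-us symbol (your device may use different characters), and the two starter patterns, the class name MicrPatternTable, and the group names are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MicrPatternTable
{
    // Assumption: 'T' = transit symbol, 'U' = on-us symbol in the raw scanner string.
    // Patterns are tried in order; the first one that matches wins.
    private static readonly List<Regex> Patterns = new List<Regex>
    {
        // Serial in the aux on-us field:  U001234U T123456789T 9876543210U
        new Regex(@"^U(?<check>\d+)U\s*T(?<routing>\d{9})T\s*(?<account>[\d -]+)U\s*$"),
        // Serial after the on-us symbol:  T123456789T 9876543210U 0101
        new Regex(@"^T(?<routing>\d{9})T\s*(?<account>[\d -]+)U\s*(?<check>\d+)\s*$"),
    };

    // Register a new format when a check turns up that nothing in the table handles.
    public static void Add(string pattern)
    {
        Patterns.Add(new Regex(pattern));
    }

    // Returns the parsed fields, or null when no pattern in the table matches.
    public static Dictionary<string, string> Parse(string micr)
    {
        string line = micr.Trim();
        foreach (Regex pattern in Patterns)
        {
            Match m = pattern.Match(line);
            if (m.Success)
            {
                return new Dictionary<string, string>
                {
                    ["routing"] = m.Groups["routing"].Value,
                    ["account"] = m.Groups["account"].Value.Trim(),
                    ["check"] = m.Groups["check"].Value,
                };
            }
        }
        return null;
    }
}
```

A null result is the signal to inspect the raw line and call Add with a new pattern. Because the first match wins, more specific patterns should sit earlier in the table.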
I hope my experience and the solution I came up with will also help those who ran across the same issue.
Thank you to all those who responded, and good luck.
The basic pattern of a MICR:
xxxxxxxxxxx /rrrrrrrrr/ ooooooooooo baaaaaaaaaab
where 'x' is AuxOnUs, 'r' is the routing number, 'o' is OnUs, and 'a' is the amount; 'b' and '/' are special MICR symbols.
A minimal MICR line is just:
/rrrrrrrrr/ ooooooooo
AuxOnUs is generally only used by business checks, and it pretty much always means there is a serial number.
The routing number is always consistent; it's the only part of the MICR that is universal.
Amount is generally not encoded in the MICR, but sometimes it is.
OnUs is the tricky part. It normally consists of the check serial number and the account, but each bank handles it differently. Usually the serial number will be 4 digits, but it may be 5 or more. If there's an AuxOnUs field, you can be pretty sure the OnUs is just the account number.
The OnUs can contain spaces and dashes. It would be nice if there were a consistent way they were divided, but I've seen so many variations, I think it's better to just leave it as an "OnUs" field instead of separating it into serial and account, unless you're the paying bank, in which case you should know what format your own checks are.
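As a rough sketch of the layout above, the C# below splits a line into those four fields and leaves the OnUs untouched. It assumes the raw string uses '/' for the transit symbol and 'b' for the amount symbol, exactly as in the notation above; real scanners use their own characters. The names MicrLayout and TrySplit are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MicrLayout
{
    // Assumption: '/' delimits the routing number and 'b' delimits the amount,
    // mirroring the notation  xxxxxxxxxxx /rrrrrrrrr/ ooooooooooo baaaaaaaaaab
    private static readonly Regex Layout = new Regex(
        @"^(?<aux>[^/]*)/(?<routing>\d{9})/(?<onus>[^b]*)(?:b(?<amount>\d+)b)?\s*$");

    // Returns false when the line does not fit the basic layout.
    // Fields that are absent (AuxOnUs, amount) come back as empty strings.
    public static bool TrySplit(string micr, out string aux, out string routing,
                                out string onUs, out string amount)
    {
        Match m = Layout.Match(micr);
        aux = m.Groups["aux"].Value.Trim();
        routing = m.Groups["routing"].Value;
        onUs = m.Groups["onus"].Value.Trim();   // spaces and dashes inside are kept
        amount = m.Groups["amount"].Value;
        return m.Success;
    }
}
```

For the minimal line /123456789/ 987654321 this gives an empty AuxOnUs and amount, the nine-digit routing number, and the OnUs as one unsplit string.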
This should be the correct answer based on my research as well. MICR patterns are too varied to parse reliably without a collection of regex patterns to pull out the relevant information. It would be nice to see the collection of regex patterns you have come up with, using group names such as:
<(?<checkNumber>[0-9\s]*)<[0-9\s]*:[0-9\s]*:.*
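For anyone unsure how a named group like that gets used, here is a small C# sketch built around the exact pattern above. I am assuming that in this raw format '<' stands for the on-us symbol and ':' for the transit symbol; the class and method names are just for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CheckNumberRegexDemo
{
    // The pattern quoted above, unchanged. Assumed reading: '<' = on-us, ':' = transit,
    // so it targets lines whose serial sits in a leading aux on-us field.
    private static readonly Regex AuxSerial =
        new Regex(@"<(?<checkNumber>[0-9\s]*)<[0-9\s]*:[0-9\s]*:.*");

    // Returns the serial from the named group, or null when the line has another shape.
    public static string ExtractCheckNumber(string micr)
    {
        Match m = AuxSerial.Match(micr);
        return m.Success ? m.Groups["checkNumber"].Value.Trim() : null;
    }
}
```

A line such as <001234< :123456789: 9876543210< yields 001234, while a line with no leading on-us pair does not match and would need its own pattern.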
It is 6 years after this question was originally asked, and I have run across it numerous times in the past 2 weeks. I finally found an ACTUAL solution for properly parsing a MICR line. I've written some code to do so, and it works on 99.9% of the checks I've scanned thus far, so I have to share it and make sure people understand how this should be done.
For 11 years I have done this job. We have always used Magtek check scanners. Recently I decided to move to an imaging scanner so we could get scans of all our checks. I went with Panini check scanners. Unfortunately, their API doesn't break apart the MICR line, but our Magtek scanners were programmable to give us whatever we wanted. I created a basic string that could be matched with a pattern every time. It would always come out as: <aaaaaaaaa/bbbbbbbb/ccc> where a is route number, b is account number, and c is check number. Over and over I keep wondering how the scanner, just a simple serial device, can figure it out and get it right EVERY SINGLE TIME for a decade.
I started by using Patrick's own answer, sort of, to build a table of MICR patterns I hadn't seen before. The problem is that I ran into a point where one pattern would closely match a different check and the data would be slightly off. I then tried doing it based on route number, until I ran across two checks from BofA that had identical route numbers and completely different MICR lines. I was so disappointed that my face met my desk in frustration.
After much more research, I found that the proper way is right-to-left parsing of the MICR line. MICR lines are read right-to-left, and of course the field giving us the most trouble is the on-us field. All my example snippets are C# code.
Start by looping through the string backwards:
for (int i = micr.Length - 1; i >= 0; i--)
Evaluate each character as you loop. If your first character is the amount character, it's a business check. Read until you get another amount character, then save that value. If the next character is the on-us symbol, assume that the check number is at the far left of the on-us field. If the next character is a digit, keep reading and filling a buffer (REMEMBER YOU ARE WORKING BACKWARDS!) with the digits until you reach the on-us character. If your buffer contains only digits, that's your check number. If it's empty, just move on and collect the entire on-us field in a buffer until you reach the transit character. Once you reach the transit character, keep reading and filling your buffer until you reach the next transit character. Your buffer is now your routing number. If it's a business check, you still have more characters to read. Keep reading until you reach ANOTHER on-us character. You've now reached the auxiliary on-us field, which should hold the check number. Read until you reach the next on-us character, which should be the end of your string. You now have your check number.
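Here is a condensed sketch of that backwards scan. It is a simplified illustration rather than my production code. It assumes the raw string uses 'T' for the transit symbol, 'U' for the on-us symbol, and 'A' for the amount symbol, so swap in whatever your scanner emits. The names MicrFields and MicrScanner are made up for the example.

```csharp
using System;
using System.Text;

public sealed class MicrFields
{
    public string Amount = "";
    public string TrailingSerial = ""; // characters to the right of the last on-us symbol
    public string OnUs = "";
    public string Routing = "";
    public string AuxOnUs = "";
}

public static class MicrScanner
{
    // Assumed symbol characters; adjust to your scanner's output.
    public const char Transit = 'T';
    public const char OnUsSym = 'U';
    public const char AmountSym = 'A';

    public static MicrFields Scan(string micr)
    {
        var f = new MicrFields();
        int i = micr.Length - 1;
        while (i >= 0 && char.IsWhiteSpace(micr[i])) i--;

        // Amount field, if the line ends with an amount symbol.
        if (i >= 0 && micr[i] == AmountSym)
        {
            i--;                                  // step past the closing amount symbol
            f.Amount = ReadBack(micr, ref i, AmountSym);
            i--;                                  // step past the opening amount symbol
        }

        // Whatever sits to the right of the last on-us symbol is the candidate check number.
        string tail = ReadBack(micr, ref i, OnUsSym, Transit);
        if (i >= 0 && micr[i] == OnUsSym)
        {
            f.TrailingSerial = tail;
            i--;                                  // step past the on-us symbol
            f.OnUs = ReadBack(micr, ref i, Transit);
        }
        else
        {
            f.OnUs = tail;                        // no on-us symbol before the transit field
        }

        // Routing number between the two transit symbols.
        if (i >= 0)
        {
            i--;
            f.Routing = ReadBack(micr, ref i, Transit);
            i--;
        }

        // Anything left over is the auxiliary on-us field.
        if (i >= 0)
        {
            ReadBack(micr, ref i, OnUsSym);       // skip up to the closing aux symbol
            i--;
            f.AuxOnUs = ReadBack(micr, ref i, OnUsSym);
        }
        return f;
    }

    // Collects characters while moving left until a stop symbol or the start of the string.
    private static string ReadBack(string s, ref int i, params char[] stops)
    {
        var sb = new StringBuilder();
        while (i >= 0 && Array.IndexOf(stops, s[i]) < 0)
        {
            sb.Insert(0, s[i]);
            i--;
        }
        return sb.ToString().Trim();
    }
}
```

The sketch only separates the fields. It does not verify that TrailingSerial is all digits, and it leaves the on-us field unsplit.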
Now, look at the value you stripped from the regular on-us field. If you have a check number, then that's your account number. If you DO NOT have a check number, then you should split the on-us field by spaces, and assume that your far left set (array element 0) of digits are your check number. HOWEVER, if after splitting by space you only have ONE element in the array, that means the on-us field likely contains dashes separating the items. Split the on-us field by dashes and assume that your far left array element is the check number and the rest are your account number. I've seen some that have as many as 3 dashes in the on-us field, like this: nnnn-1234-56-7, where nnnn is the check number and the rest is the account number.
Once you've got your account number separated from check number, strip any miscellaneous characters (spaces, dashes, etc.) from it and you're done.
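The on-us split described above can be sketched like this. Again, it is an illustration of the rules as I've stated them, not a guarantee for every bank's format. OnUsSplitter and its parameter names are made up for the example, and the tuple return type needs C# 7 or later.

```csharp
using System;
using System.Linq;

public static class OnUsSplitter
{
    // knownCheckNumber is a check number already found elsewhere on the line
    // (aux on-us field or after the on-us symbol); pass "" if none was found.
    public static (string CheckNumber, string Account) Split(string onUs, string knownCheckNumber = "")
    {
        onUs = onUs.Trim();

        // A check number was already found, so the whole field is the account.
        if (knownCheckNumber.Length > 0)
            return (knownCheckNumber, DigitsOnly(onUs));

        // Try spaces first, then fall back to dashes.
        string[] parts = onUs.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 2)
            parts = onUs.Split(new[] { '-' }, StringSplitOptions.RemoveEmptyEntries);

        // Nothing to split on: treat it all as the account and report no check number.
        if (parts.Length < 2)
            return ("", DigitsOnly(onUs));

        // Leftmost piece is the check number; everything else is the account.
        string account = string.Concat(parts.Skip(1));
        return (parts[0], DigitsOnly(account));
    }

    // Strips spaces, dashes, and any other non-digit characters.
    private static string DigitsOnly(string s)
    {
        return new string(s.Where(char.IsDigit).ToArray());
    }
}
```

With an on-us field of 0101-1234-56-7 and no known check number, this returns check number 0101 and account 1234567.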
This is my solution to all my MICR problems. Hopefully it helps someone else.
Thanks go, in part, to this document: http://www.transact-tech.com/uploads/printers/files/100-9094-Rev-C-MICR-Programmers-Guide.pdf