Is there finer granularity than LOAD/NEXT for reading structured data?
Imagine that I have a long file of Rebol-formatted data, with a million lines, that looks something like this:
REBOL []
[
[employee name: {Tony Romero} salary: $10,203.04]
[employee name: {Marcus "Marco" Marcami} salary: default]
[employee name: {Serena Derella} salary: ($10,000 + $203.04)]
...
[employee name: {Stacey Christie} salary: (10% * $102,030.40)]
]
If the enclosing block weren't there, I could use LOAD/NEXT to read through the employee items one at a time (as opposed to parsing the entire file into structured data with LOAD). Is there any way to do something similar if the enclosing block is there?
What if I wanted to go back to a previously visited item? Could there be a "structural seek"?
Is there a viable database solution that one could use for this kind of desire for Rebol-structured data, which might even permit random access insertions?
I recall that it was you who proved that this should be doable in PARSE? ;-)
Nevertheless, to give you a useful answer: the code I wrote for the link text does essentially that. It parses REBOL itself instead of relying on the default LOAD/NEXT, for cases where you need something else. So have a look, read the documentation, run the tests, write some tests, and if you have more questions, just ask.
If you are happy to tweak your file format a little so it is a file with one record per line, no enclosing blocks nor REBOL header:
employee-name: {Tony Romero} salary: $10203.04
employee-name: {Marcus "Marco" Marcami} salary: 'default
employee-name: {Serena Derella} salary: ($10000 + $203.04)
employee-name: {Stacey Christie} salary: (10% * $102030.40)
Then....
data: read/lines %data-file.txt
....gets you a block of unloaded strings
One way to work with them is like this:
foreach record data [
    record: make object! load/all record
    probe record
]
I had to tweak your data format too to make it easily loadable by REBOL:
- employee-name rather than employee name
- $10203.04 rather than $10'203.04
- 10% -- only works with REBOL3
If you can't tweak the data format like that, you could always do some edits on each string prior to LOAD/ALL to normalise it for REBOL.
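As a rough sketch of that kind of pre-LOAD normalisation (not part of the original answer): the helper name `normalise-record` and the two specific rewrites are assumptions chosen to match the sample records, and the file is assumed to already hold one record per line with no enclosing brackets.

```rebol
REBOL []

; Hypothetical helper: rewrite the parts of one raw line that would not
; survive LOAD/ALL followed by MAKE OBJECT!.
normalise-record: func [record [string!]] [
    record: copy record
    ; "employee name:" would load as a word plus a set-word; join them
    replace/all record "employee name:" "employee-name:"
    ; a bare DEFAULT would be evaluated as an unset word; quote it
    replace/all record "salary: default" "salary: 'default"
    record
]

foreach record read/lines %data-file.txt [
    probe make object! load/all normalise-record record
]
```

String-level edits like these are fragile if the same text can appear inside a name or other field, so they only suit data whose shape you control.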
Sunanda's answer does not work if a record can span multiple lines. You can use something like this instead:
data: {REBOL []
[
[employee name: {Tony Romero} salary: $10'203.04]
[employee name: {Marcus "Marco" Marcami} salary: default]
[employee name: {Serena Derella} salary: ($10'000 + $203.04)]
]}

unless all [
    set [value data] load/next data
    value = 'REBOL
][
    print "Not a REBOL data file!"
    halt
]

set [header data] load/next data
print ["data-file-header:" mold header]

data: find/tail data #"["

attempt [
    ;you must use attempt as there will be at least one error at the end of file!
    ;** Syntax Error: Missing [ at end-of-block
    indexes: copy []
    while [
        append indexes data
        set [loaded-row data] load/next data
        data
    ][
        probe loaded-row
    ]
]
print "done"
remove back tail indexes ;removes the last erroneous position
foreach data-at-pos reverse indexes [
    probe first load/next data-at-pos
]
So the output would be:
[employee name: "Tony Romero" salary: $10203.04]
[employee name: {Marcus "Marco" Marcami} salary: default]
[employee name: "Serena Derella" salary: ($10000.00 + $203.04)]
done
[employee name: "Serena Derella" salary: ($10000.00 + $203.04)]
[employee name: {Marcus "Marco" Marcami} salary: default]
[employee name: "Tony Romero" salary: $10203.04]