Condensing Stock Data
I have a data set formatted as follows:
2009,11,01,17,00,23,1.471700,1.472000
2009,11,01,17,01,04,1.471600,1.471900
2009,11,01,17,01,09,1.471900,1.472100
2009,11,01,17,01,12,1.472000,1.472300
2009,11,01,17,01,13,1.471900,1.472200
2009,11,01,17,01,14,1.471600,1.471900
2009,11,01,17,01,18,1.471700,1.472000
2009,11,01,17,01,18,1.471900,1.472200
I am using Octave to manipulate this data. I would like to use this tick data to create separate files containing the data in 5, 10 and 30 minute intervals. In that format the data could be plotted as a bar/candlestick chart and used for further calculations. However, I don't really have any idea how to approach looping over the data to create such files.
I am familiar with Octave and use this software, but this particular task could be undertaken in some other software to produce files for later import into Octave.
My first attempt to code this in Octave gives this error:
error: A(I,J,...) = X: dimensions mismatch
error: called from:
error: /home/andrew/Documents/forex_convert/tick_to_min.m at line 105, column 25
The code that produces it is
[i,j]=find(fMM>=45 & fMM<50);
min_5_vec(1:length(i),1)=tick_data(min(i):max(i),1); % line 105
The code checks the "minutes" vector fMM and should create a new vector, min_5_vec, containing all tick data that occurred between HH:45:00 and HH:49:59 in every hour. This code is part of a function, and it appears to fail only on this particular line, which I find very strange: the line was copied and pasted with only the figures 45 and 50 changed, and the similar parts of the function before line 105 do not fail. I have visually checked the raw data and can see nothing in it that would explain the failure. Any suggestions as to the possible cause?
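One possible cause, sketched under the assumption that fMM covers more than one hour: find then returns row numbers that are not contiguous, so min(i):max(i) spans more rows than length(i), and the two sides of the assignment differ in size. The values below are made up for illustration; only the names fMM, tick_data and min_5_vec come from the code above.

```octave
% Hypothetical minute values for ticks running from 17:44 into the next hour.
fMM = [44; 45; 47; 49; 50; 10; 45; 46];
tick_data = (1:numel(fMM))';        % stand-in for the real tick columns

i = find(fMM >= 45 & fMM < 50);     % matching rows are 2, 3, 4, 7, 8
n_matches = length(i);              % number of rows on the left-hand side
n_range   = max(i) - min(i) + 1;    % number of rows in min(i):max(i)
printf("left side: %d rows, right side: %d rows\n", n_matches, n_range);

% Logical indexing picks exactly the matching rows, so the sizes always agree.
mask = (fMM >= 45 & fMM < 50);
min_5_vec = tick_data(mask, 1);
```

Logical indexing avoids the size bookkeeping entirely. Whether it matches the intent of the rest of tick_to_min.m cannot be told from the excerpt.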
First, use datenum to convert your year, month, day, hour, minute and second values to times:
datenum(2009,11,01,17,00,23)
This returns a serial date number: the number of days, including the fraction for the time of day, counted from year 0 (1 January 0000 is day 1). Let's say you save all the times in a vector called times. It is then easy to find the first and last time you have:
first = min(times);
last = max(times);
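As a minimal sketch of this step, the first three sample rows from the question can be converted in a single call, since datenum accepts column vectors. The matrix name raw is an assumption, and in practice the rows would be read from the file rather than typed in.

```octave
% First three sample rows from the question: Y, M, D, h, m, s and two price columns.
raw = [2009 11 1 17 0 23 1.471700 1.472000;
       2009 11 1 17 1  4 1.471600 1.471900;
       2009 11 1 17 1  9 1.471900 1.472100];

% One serial date number per row.
times = datenum(raw(:,1), raw(:,2), raw(:,3), raw(:,4), raw(:,5), raw(:,6));

first = min(times);
last  = max(times);

% Show the earliest tick time in readable form.
disp(datestr(first, 'yyyy-mm-dd HH:MM:SS'));
```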
One minute, expressed in days, is:
ONE_MINUTE = 1/24/60;
The binning can then be done like this:
index = 1;
means = [];
for t = first:5*ONE_MINUTE:last
    % logical mask of the ticks that fall inside the current 5-minute bin
    current_bin = (times >= t) & (times < t + 5*ONE_MINUTE);
    % do something with all the data for which current_bin == 1
    means(index) = mean(data(current_bin));
    index = index + 1;
end
Just for the example, I calculated the mean of the data in each bin. This assumes you have a vector called data that holds one value for each entry of times.
(I know this can be optimized a lot, but I preferred clarity over performance for this answer)
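For candlestick bars, a vectorized variant could use accumarray instead of the loop. This is only a sketch with made-up ticks: the names price, bar_minutes and bars are assumptions, the ticks are assumed to be sorted by time, and bars are aligned to clock boundaries (17:00, 17:05, ...) rather than to the first tick.

```octave
% Hypothetical tick times (serial date numbers) and prices, for illustration only.
times = datenum(2009, 11, 1, 17, [0; 1; 4; 5; 9; 12], [23; 4; 59; 0; 30; 0]);
price = [1.4717; 1.4716; 1.4719; 1.4720; 1.4715; 1.4721];

bar_minutes = 5;                    % use 10 or 30 for the other interval lengths
day0 = floor(min(times));           % midnight of the first day

% Whole seconds since day0; round() guards against floating-point error at bar edges.
secs   = round((times - day0) * 86400);
bar_id = floor(secs / (bar_minutes * 60));

% Map bar numbers to consecutive group indices 1..n.
[bar_ids, ~, grp] = unique(bar_id);
grp = grp(:);                       % accumarray needs a column of subscripts

bar_open  = accumarray(grp, price, [], @(x) x(1));    % first tick in each bar
bar_high  = accumarray(grp, price, [], @max);
bar_low   = accumarray(grp, price, [], @min);
bar_close = accumarray(grp, price, [], @(x) x(end));  % last tick in each bar

bar_start = day0 + bar_ids(:) * bar_minutes / (24 * 60);
bars = [bar_start, bar_open, bar_high, bar_low, bar_close];
```

Each row of bars holds the bar start time followed by open, high, low and close. Intervals without any ticks produce no row. The matrix could then be written out with something like dlmwrite, one file per interval length.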