What is the most effective and accurate algorithm for choosing the bin intervals of a sample?
Excel, Matplotlib, MATLAB, R, etc. can draw histograms. In many cases we must reduce a large original sample to a set of intervals (bins). Wikipedia says there are different algorithms for this task, but that the most popular is the square-root choice. I don't see any proof of this statement in the article. So my question is: which algorithm is best for this task? What would you advise reading about this problem?
If you want a second opinion, complete with a more thorough justification, try section 4.3 of "Modern Multivariate Statistical Techniques..." by Izenman. For the particular case of the normal distribution, he comes up with a bin width of 3.4908*sigma*n^(-1/3), where sigma is the standard deviation and n is the sample size. This is the same form as Scott's normal reference rule, and it is pretty close to the Freedman-Diaconis choice in Wikipedia.
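To make the comparison concrete, here is a minimal sketch of the three rules mentioned in this thread, using only the Python standard library. The function names and the simulated normal sample are my own illustration, not anything from Izenman; the Freedman-Diaconis width is taken as 2*IQR*n^(-1/3) and the square-root choice as ceil(sqrt(n)) equal-width bins, as described in the Wikipedia article.

```python
import math
import random
import statistics


def normal_reference_width(data):
    """Bin width 3.4908 * sigma * n^(-1/3), the normal-case result quoted above."""
    n = len(data)
    return 3.4908 * statistics.stdev(data) * n ** (-1 / 3)


def freedman_diaconis_width(data):
    """Bin width 2 * IQR * n^(-1/3)."""
    n = len(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    return 2 * (q3 - q1) * n ** (-1 / 3)


def sqrt_choice_width(data):
    """Bin width obtained by splitting the data range into ceil(sqrt(n)) bins."""
    k = math.ceil(math.sqrt(len(data)))
    return (max(data) - min(data)) / k


# Hypothetical sample: 1000 draws from a standard normal distribution.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]

rules = [
    ("normal reference", normal_reference_width),
    ("Freedman-Diaconis", freedman_diaconis_width),
    ("square-root choice", sqrt_choice_width),
]
for name, rule in rules:
    width = rule(sample)
    bins = math.ceil((max(sample) - min(sample)) / width)
    print(f"{name:>20}: width = {width:.3f}, bins = {bins}")
```

For normal data the IQR is about 1.349*sigma, so the Freedman-Diaconis width works out to roughly 2.7*sigma*n^(-1/3), somewhat narrower than the normal reference width.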
However, Izenman also shows that, for the measure he optimises to produce this bin width, the histogram does pretty badly compared to other estimators. So if you are prepared to work hard to get as good an estimate as possible, I suggest you start off by changing from histograms to kernel density estimators (section 4.5 of Izenman and http://en.wikipedia.org/wiki/Kernel_density_estimation).
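For a feel of what a kernel density estimator does, here is a rough sketch with a Gaussian kernel, again in plain Python. The bandwidth rule 1.06*sigma*n^(-1/5) is a common normal-reference rule of thumb that I am assuming here for illustration; it is not necessarily the choice Izenman recommends, and the function names and sample are hypothetical.

```python
import math
import random
import statistics


def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth 1.06 * sigma * n^(-1/5) (assumes roughly normal data)."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)


def gaussian_kde(data, bandwidth):
    """Return a function x -> estimated density, averaging Gaussian bumps at each point."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical sample: 500 draws from a standard normal distribution.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]

h = normal_reference_bandwidth(sample)
estimate = gaussian_kde(sample, h)

print(f"bandwidth = {h:.3f}")
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x = {x:+.1f}  density = {estimate(x):.3f}")
```

Unlike a histogram, the result is a smooth curve and does not depend on where the bin edges happen to fall; the bandwidth plays the role that the bin width plays for a histogram. In practice you would probably use a library routine such as scipy.stats.gaussian_kde rather than this hand-rolled version.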