
How to perform a multidimensional search for "N-nearest neighbors?"

I am designing automated trading software for the foreign exchange market. In a MySQL database I have years of market data at five-minute intervals. Alongside the price and time, I have five different metrics for this data.

[Time|Price|M1|M2|M3|M4|M5] 
x ~400,0000

Time is the primary key, and M1 through M5 are different metrics (such as standard deviation or slope of a moving average).

Given an input of M1, M2, M3, M4, and M5, how can I efficiently locate the nearest 5,000 neighbors? Note that each metric is floating point and has a different distribution/range.


I don't know how you would determine the nearest neighbor. It seems you could do an absolute value difference between each metric and sum them up. (Without the absolute value, you could have two metrics that are way off, but cancel each other out.)

So, the nearest neighbor would be defined as the row with the lowest value of this expression:

ABS(M1 - @M1) + ABS(M2 - @M2) + ABS(M3 - @M3) + ABS(M4 - @M4) + ABS(M5 - @M5)
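To make the point about cancellation concrete, here is a small Python sketch comparing a plain sum of differences with the sum of absolute differences. The function names and the sample values are made up for illustration; they are not from the question's data.

```python
def signed_sum(row, query):
    # Sum of raw differences: positive and negative errors can cancel out.
    return sum(r - q for r, q in zip(row, query))


def manhattan(row, query):
    # Sum of absolute differences, mirroring the ABS(...) expression above.
    return sum(abs(r - q) for r, q in zip(row, query))


query = (1.0, 1.0, 1.0, 1.0, 1.0)      # hypothetical @M1..@M5
far_row = (11.0, -9.0, 1.0, 1.0, 1.0)  # M1 is +10 off, M2 is -10 off

print(signed_sum(far_row, query))  # 0.0, looks like a perfect match
print(manhattan(far_row, query))   # 20.0, correctly reported as far away
```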

If this works, then the query would be:

SELECT *
FROM YourTable
ORDER BY ABS(M1 - @M1) + ABS(M2 - @M2) + ABS(M3 - @M3) + ABS(M4 - @M4) + ABS(M5 - @M5)
LIMIT 5000;
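Because the ORDER BY is on a computed expression, the database has to evaluate it for every row. If the rows are already loaded into the application, roughly the same ranking can be done there. The sketch below is only an illustration: the tuple layout `(time, price, m1..m5)` and the helper names are assumptions, and `heapq.nsmallest` is used so that only the best `n` rows are kept rather than sorting everything.

```python
import heapq


def manhattan(row, query):
    # Sum of absolute differences across the five metrics.
    return sum(abs(r - q) for r, q in zip(row, query))


def nearest(rows, query, n):
    # rows: iterable of (time, price, m1, m2, m3, m4, m5) tuples (assumed layout).
    # row[2:] skips time and price so only the metrics are compared.
    return heapq.nsmallest(n, rows, key=lambda row: manhattan(row[2:], query))


# Tiny made-up data set standing in for the real table.
rows = [
    ("t1", 1.10, 0.0, 0.0, 0.0, 0.0, 0.0),
    ("t2", 1.11, 1.0, 1.0, 1.0, 1.0, 1.0),
    ("t3", 1.12, 5.0, 5.0, 5.0, 5.0, 5.0),
]
query = (0.9, 0.9, 0.9, 0.9, 0.9)

top2 = nearest(rows, query, 2)
print([r[0] for r in top2])  # ['t2', 't1']
```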

If you wanted, you could weight each metric differently as well:

SELECT *
FROM YourTable
ORDER BY 2 * ABS(M1 - @M1) + 5 * ABS(M2 - @M2) + ABS(M3 - @M3) + 3 * ABS(M4 - @M4) + ABS(M5 - @M5)
LIMIT 5000;
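Since the question mentions that the metrics have different ranges, one possible way to pick the weights (an assumption on my part, not something the question asks for) is to use the inverse of each metric's standard deviation, so that a metric with a large range does not dominate the sum. A rough Python sketch with made-up numbers and hypothetical helper names:

```python
import statistics


def inverse_std_weights(metric_rows):
    # metric_rows: list of (m1, m2, m3, m4, m5) tuples.
    # Returns one weight per metric: 1 / population standard deviation.
    weights = []
    for column in zip(*metric_rows):
        std = statistics.pstdev(column)
        # A constant column carries no information, so give it weight 0.
        weights.append(1.0 / std if std > 0 else 0.0)
    return weights


def weighted_distance(row, query, weights):
    # Weighted sum of absolute differences, like the weighted ORDER BY above.
    return sum(w * abs(r - q) for w, r, q in zip(weights, row, query))


# Made-up sample: the second metric spans a much wider range than the others.
metric_rows = [
    (0.0, 0.0, 1.0, 2.0, 3.0),
    (2.0, 200.0, 3.0, 4.0, 5.0),
]
weights = inverse_std_weights(metric_rows)
query = (1.0, 100.0, 2.0, 3.0, 4.0)

print(weights)
print(weighted_distance(metric_rows[0], query, weights))
```

The resulting weights could then be plugged into the ORDER BY in place of the hand-picked 2, 5 and 3.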