Oracle SQL query or function to cluster geographic data
I have a table containing geographic data and I want to group rows on the proximity of the X and Y coordinates. So, given an offset, n, and a table with columns X and Y, I want to group rows where ABS(row1.X - row2.X) < n, with a count of the number of rows in the group. Is this possible with SQL or do I need a function?
How big is your spatial range? How big is your data set? How accurate do you need to be? This matters because, if the points are close together, you don't have to worry about the curvature of the Earth.
Lines of latitude are parallel, so the distance between degrees of latitude stays almost constant. Lines of longitude, by contrast, are farthest apart at the equator and converge at the poles, so the length of a degree of longitude varies greatly. Each degree of latitude is approximately 69 miles (111 km). The length varies slightly, because the earth is a slightly flattened ellipsoid, from 68.703 miles (110.567 km) at the equator to 69.407 miles (111.699 km) at the poles. This is convenient because each minute (1/60th of a degree) is approximately one nautical mile. A degree of longitude is widest at the equator, at 69.172 miles (111.321 km), and gradually shrinks to zero at the poles. At 40° north or south, a degree of longitude spans about 53 miles (85 km).
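To make the longitude figures above concrete, here is a minimal sketch of the usual spherical approximation, in which the length of a degree of longitude scales with the cosine of the latitude. The function name and the constant name are made up for illustration, and the equatorial value is simply the one quoted above; a real ellipsoidal model would give slightly different numbers.

```python
import math

# Assumption: length of one degree of longitude at the equator, in km,
# taken from the figure quoted in the answer above.
KM_PER_DEG_LON_AT_EQUATOR = 111.321


def km_per_degree_longitude(latitude_deg: float) -> float:
    """Approximate length (km) of one degree of longitude at a latitude.

    Uses a spherical model: the length shrinks with cos(latitude).
    """
    return KM_PER_DEG_LON_AT_EQUATOR * math.cos(math.radians(latitude_deg))


if __name__ == "__main__":
    for lat in (0, 40, 60, 90):
        km = km_per_degree_longitude(lat)
        print(f"{lat:>2} deg latitude: {km:7.2f} km per degree of longitude")
```

The practical consequence for the question is that a fixed offset n in degrees covers a different ground distance in X than in Y unless the data sits near the equator.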
The other aspect is, if Fred is 4 miles from Bill and Bill is 4 miles from Tom, the distance between Fred and Tom might be 8 miles. If your proximity threshold is 5 miles, Bill and Fred are in the same group, Bill and Tom are in the same group, but Fred and Tom aren't.
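The Fred, Bill and Tom case can be sketched in a few lines. This is only one possible way to resolve the ambiguity: it treats "within the threshold" as a link and merges everything that is linked directly or through a chain (single-linkage grouping via a small union-find). The function name `chain_groups` and the one-dimensional positions are hypothetical, chosen to match the distances in the example.

```python
from itertools import combinations


def chain_groups(positions, threshold):
    """Group names whose positions are linked, directly or via a chain,
    by pairwise distances below `threshold` (single-linkage grouping)."""
    parent = {name: name for name in positions}

    def find(name):
        # Walk up to the root of the set, halving the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge the sets of every pair that is closer than the threshold.
    for a, b in combinations(positions, 2):
        if abs(positions[a] - positions[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in positions:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical positions in miles along a straight line.
positions = {"Fred": 0.0, "Bill": 4.0, "Tom": 8.0}
groups = chain_groups(positions, threshold=5.0)
print(groups)
```

With a 5 mile threshold all three end up in one group even though Fred and Tom are 8 miles apart, which is exactly the chaining effect described above. If that is not acceptable, the grouping rule needs to be defined differently, for example by snapping points to fixed grid cells.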
The following query might give you a useful start though:
-- Oracle Spatial expects geodetic points as (longitude, latitude), so X = lon and Y = lat.
select abs(abs(a.lat) - abs(b.lat)) lat_diff_deg,
       abs(abs(a.lon) - abs(b.lon)) lon_diff_deg,
       sdo_geom.sdo_distance(a.geom, b.geom, 0.005, 'unit=kilometer') dist_km
from
  (select sdo_geometry(2001, 8314, sdo_point_type(151, 33, null), null, null) geom, 33 lat, 151 lon from dual) a,
  (select sdo_geometry(2001, 8314, sdo_point_type(151, 34, null), null, null) geom, 34 lat, 151 lon from dual) b;
Do you have Oracle Spatial? If so, there are a number of built-in functions to handle this for you. If not, you'll need a function to determine cartographic distances (and then group off of that). I recall seeing an SO question on how to write such a function last week. (There are actually a few along these lines.)
How to limit a MySQL Distance Query
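For the route without Oracle Spatial, a distance function of the kind mentioned above is often written with the haversine formula. The sketch below is an illustration rather than the function from the linked question: the name `haversine_km` is made up, the earth is assumed to be a sphere with a mean radius of 6371.0088 km, and the same arithmetic would have to be ported to PL/SQL or plain SQL to be used inside a query.

```python
import math

# Assumption: mean earth radius in km for a spherical model.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Two points one degree of latitude apart on the same meridian.
    print(round(haversine_km(33, 151, 34, 151), 3))
```

Because this uses a sphere, its results can differ slightly from what an ellipsoid-based function such as sdo_geom.sdo_distance returns for the same pair of points.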
Best bet. Use the spatial extensions. They'll perform much better.