Extract points within a shape from a raster
I have a raster file (basically 2D array) with close to a million points. I am trying to extract a circle from the raster (and all the points that lie within the cir开发者_StackOverflow中文版cle). Using ArcGIS is exceedingly slow for this. Can anyone suggest any image processing library that is both easy to learn and powerful and quick enough for something like this?
Thanks!
Extracting a subset of points efficiently depends on the exact format you are using. Assuming you store your raster as a numpy array of integers, you can extract points like this:
from numpy import *
def points_in_circle(circle, arr):
"A generator to return all points whose indices are within given circle."
i0,j0,r = circle
def intceil(x):
return int(ceil(x))
for i in xrange(intceil(i0-r),intceil(i0+r)):
ri = sqrt(r**2-(i-i0)**2)
for j in xrange(intceil(j0-ri),intceil(j0+ri)):
yield arr[i][j]
points_in_circle
will create a generator returning all points. Please note that I used yield
instead of return
. This function does not actually return point values, but describes how to find all of them. It creates a sequential iterator over values of points within circle. See Python documentation for more details how yield
works.
I used the fact that for circle we can explicitly loop only over inner points. For more complex shapes you may loop over the points of the extent of a shape, and then check if a point belongs to it. The trick is not to check every point, only a narrow subset of them.
Now an example of how to use points_in_circle
:
# raster dimensions, 10 million points
N, M = 3200, 3200
# circle center and its radius in index space
i0, j0, r = 70, 20, 12.3
raster = fromfunction(lambda i,j: 100+10*i+j, (N, M), dtype=int)
print "raster is ready"
print raster
pts_iterator = points_in_circle((i0,j0,r), raster) # very quick, do not extract points yet
pts = array(list(pts_iterator)) # actually extract all points
print pts.size, "points extracted, sum = ", sum(pts)
On a raster of 10 million integers it is pretty quick.
Please describe file format or put a sample somewhere if you need more specific answer.
Numpy allows you to do this, and is extremely fast:
import numpy
all_points = numpy.random.random((1000, 1000)) # Input array
# Size of your array of points all_points:
(image_size_x, image_size_y) = all_points.shape
# Disk definition:
(center_x, center_y) = (500, 500)
radius = 10
x_grid, y_grid = numpy.meshgrid(numpy.arange(image_size_x),
numpy.arange(image_size_y))
# Array of booleans with the disk shape
disk = ((x_grid-center_x)**2 + (y_grid-center_y)**2) <= radius**2
# You can now do all sorts of things with the mask "disk":
# For instance, the following array has only 317 points (about pi*radius**2):
points_in_disk = all_points[disk]
# You can also use masked arrays:
points_in_circle2 = numpy.ma.masked_array(all_points, ~disk)
from matplotlib import pyplot
pyplot.imshow(points_in_circle2)
You need a library that can read your raster. I am not sure how to do that in Python but you could look at geotools (especially with some of the new raster library integration) if you want to program in Java. If you are good with C I would reccomend using something like GDAL.
If you want to look at a desktop tool you could look at extending QGIS with python to do the operation above.
If I remember correctly, the Raster extension to PostGIS may support clipping rasters based upon vectors. This means you would need to create your circles to features in the DB and then import your raster but then you might be able to use SQL to extract your values.
If you are really just a text file with numbers in a grid then I would go with the suggestions above.
精彩评论