
How would you find the height of objects given an image?

This isn't strictly a programming question. I just want to know how you would approach a common problem in digital image processing.

Let's say you have an image of a few trees in, say, JPG format. How would you go about finding the height of each of these trees? The photo is the only input you have.

I want to know your approaches, not the code. So it doesn't matter if your answers are vague or not DIP-specific.

Small correction: the height need not be the actual height of the tree. It can be expressed at any scale, as long as that scale is consistent across all objects in the picture.


Yes, it is possible. What you are describing has an entire industry around it, called photogrammetry.


There is a fair amount of computer vision research in this area. Assuming you don't know the camera constraints, you'll have to make assumptions about the scene and camera to determine the heights up to a scale factor. Note that without camera constraints or a reference height in the image, it is impossible to tell the difference between a tall tree photographed from a distance and a short tree photographed up close. A great place to start is the Single View Metrology work by Criminisi.
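To make the ambiguity concrete, here is a minimal sketch under an idealized pinhole camera model (my assumption, not something from the Criminisi work). The function name and the numbers are made up for illustration.

```python
def projected_height_px(real_height_m, distance_m, focal_length_px):
    """Pinhole model: image height = focal length * real height / distance."""
    return focal_length_px * real_height_m / distance_m


# Hypothetical scene: a tall tree far away and a short tree close by,
# photographed with the same (assumed) focal length of 1000 px.
tall_far = projected_height_px(real_height_m=20.0, distance_m=100.0, focal_length_px=1000.0)
short_near = projected_height_px(real_height_m=5.0, distance_m=25.0, focal_length_px=1000.0)

# Both trees cover the same number of pixels, so the photo alone cannot
# tell them apart without a distance or a reference height.
print(tall_far, short_near)
```

Only the ratio of height to distance survives the projection, which is why some extra constraint is needed.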


It is simple to find the size of an object from images using photogrammetry, the science of making measurements from photographs. For this we need to know two things:

  • the distance from the camera to the object.
  • the focal length (in mm, together with the pixels per mm) or the physical size of the image sensor.

Following are the steps:

Calibrate the Camera

Use OpenCV to calibrate the camera. You can use the OpenCV calibrate.py tool and the chessboard pattern PNG provided in the source code to generate a calibration matrix. Camera calibration is done to find the camera parameters. I took about a dozen photos of the chessboard from as many angles as I could with my webcam (to calibrate my webcam). For more details, check the OpenCV camera calibration documentation.

We will get f_x, f_y, c_x and c_y from the calibration matrix.
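As a rough sketch of that step: the camera matrix that cv2.calibrateCamera returns is a 3x3 array laid out as [[f_x, 0, c_x], [0, f_y, c_y], [0, 0, 1]]. The helper name and the matrix values below are hypothetical, and a nested list stands in for the NumPy array so the snippet runs without OpenCV.

```python
def intrinsics_from_matrix(camera_matrix):
    """Pull f_x, f_y, c_x, c_y out of a 3x3 camera (calibration) matrix."""
    f_x = camera_matrix[0][0]  # focal length in pixels along x
    f_y = camera_matrix[1][1]  # focal length in pixels along y
    c_x = camera_matrix[0][2]  # principal point, x coordinate
    c_y = camera_matrix[1][2]  # principal point, y coordinate
    return f_x, f_y, c_x, c_y


# Made-up calibration result for a 1280x720 webcam.
K = [
    [1000.0, 0.0, 640.0],
    [0.0, 1010.0, 360.0],
    [0.0, 0.0, 1.0],
]
f_x, f_y, c_x, c_y = intrinsics_from_matrix(K)
print(f_x, f_y, c_x, c_y)
```

The same indexing works on the NumPy array you get from a real calibration run.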

Checking the details of the photos you took, you will find their native resolution (height x width), and in their EXIF headers you can find the focal length value (f). These values may vary depending on your camera.

Pixels per millimeter

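We need to know the pixels per millimeter (px/mm) on the image sensor. The calibrated focal lengths f_x and f_y are in pixels, while the EXIF focal length f is in mm, so: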

f_x=f*m_x

f_y=f*m_y

Since we know two of the three variables in each formula, we can solve for m_x and m_y. I just averaged f_x and f_y to get f_xy.

m=f_xy/focal_length_of_camera
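A minimal sketch of this computation, assuming f_x and f_y come from the calibration matrix (in pixels) and the focal length comes from EXIF (in mm). The function name and the sample values are made up.

```python
def pixels_per_mm(f_x_px, f_y_px, focal_length_mm):
    """Estimate px/mm on the sensor as m = f_xy / focal_length."""
    f_xy = (f_x_px + f_y_px) / 2.0  # average of the two calibrated focal lengths
    return f_xy / focal_length_mm


# Hypothetical values: f_x = 1000 px, f_y = 1010 px, EXIF focal length = 4 mm.
m = pixels_per_mm(1000.0, 1010.0, 4.0)
print(m)  # px/mm at the calibration resolution
```

Averaging f_x and f_y assumes roughly square pixels; if they differ a lot, it may be safer to keep m_x and m_y separate.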

Insert the image

Load the image in which you want to measure the actual size of the object. You should know the distance between the object and the camera. Find the dimensions of this image (height1 x width1).

Find the Object size in pixels

Determine the size of the object in pixels. I simply use the distance formula to find the length of a selected line. You can adopt any other method.
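The distance formula here is just the Euclidean distance between the two endpoints of the selected line. A small sketch, with hypothetical pixel coordinates for the top and base of the object:

```python
import math


def pixel_length(p1, p2):
    """Euclidean distance between two (x, y) pixel coordinates."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])


# Hypothetical endpoints picked by hand: top and base of a tree.
top = (100, 50)
base = (100, 450)
object_size_in_pixels = pixel_length(top, base)
print(object_size_in_pixels)
```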

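Convert px/mm to the lower resolution

The value m was computed at the native resolution of the calibration photos (width). If the image being measured has a different width (width1), scale m accordingly:

pxpermm_in_lower_resolution = (width1*m)/width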

Size of object in the image sensor

size_of_object_in_image_sensor = object_size_in_pixels/(pxpermm_in_lower_resolution)

Actual size of object

The actual size of the object can be found from the above data as:

real_size = (dist*size_of_object_in_image_sensor)/focal_length
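Putting the last three formulas together, a sketch might look like the following. The function and parameter names are mine, the numbers are invented, and I'm assuming the distance is given in mm so the result is in mm too.

```python
def real_size_mm(object_size_px, distance_mm, focal_length_mm,
                 m_px_per_mm, calib_width_px, image_width_px):
    """Chain the steps: rescale px/mm, convert pixels to mm on the sensor,
    then scale by distance / focal length (similar triangles)."""
    # px/mm at the resolution of the image being measured
    pxpermm = (image_width_px * m_px_per_mm) / calib_width_px
    # size of the object's projection on the sensor, in mm
    size_on_sensor_mm = object_size_px / pxpermm
    # real-world size, in the same unit as the distance
    return (distance_mm * size_on_sensor_mm) / focal_length_mm


# Hypothetical numbers: 500 px object, 10 m away, 4 mm lens,
# m = 250 px/mm calibrated at 1280 px width, image is 640 px wide.
size = real_size_mm(500.0, 10_000.0, 4.0, 250.0, 1280.0, 640.0)
print(size)  # in mm
```

With these made-up inputs the projection is 4 mm on the sensor, which scales to a 10 m tall object.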


Assuming all the objects are the same distance away, and therefore at the same scale, you'd want to find a single unit of measurement you can rely on. For example, if there's a person in the photo at that same scale and you know they're exactly 6 feet tall, use them as your measure: count how many of them, stacked, make up the height of the tree. If it takes 3.5 of this person, then:

3.5 * 6 = 21

gives you a 21 foot tall tree.

Without a single point of reference for everything, or if the objects are at different scales, you would need much more information than you could easily get without having been there.
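The reference-object idea can be sketched in a few lines, assuming the tree and the reference are at the same distance from the camera. The function name and pixel measurements are hypothetical.

```python
def height_from_reference(object_px, reference_px, reference_height):
    """Scale a known reference height by the ratio of pixel heights."""
    return (object_px / reference_px) * reference_height


# Hypothetical measurements: the person is 200 px tall in the photo,
# the tree is 700 px tall, and the person is known to be 6 feet tall.
tree_height_ft = height_from_reference(object_px=700, reference_px=200, reference_height=6)
print(tree_height_ft)  # 3.5 person-heights * 6 ft
```

If no real height is known, setting reference_height to 1 still gives heights on a consistent relative scale, which is all the question asks for.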


I would rely on an object of known dimensions being present in the picture, for instance a man.

Or perhaps we could use the EXIF data to work out the size of the object from the camera's sensor dimensions, the lens and the focal length used. This again depends on the angle: we should get the most accurate results when the camera is held perpendicular to the subject.
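A rough sketch of the EXIF route, with two assumptions of mine: the sensor height is known (it usually has to be looked up for the camera model rather than read from EXIF), and the distance to the subject is known as well, since the photo alone only fixes the height-to-distance ratio. All names and numbers are made up.

```python
def object_size_from_exif(object_px, image_height_px, sensor_height_mm,
                          focal_length_mm, distance_mm):
    """Estimate real object size from sensor size, focal length and distance."""
    mm_per_px = sensor_height_mm / image_height_px  # physical size of one pixel
    size_on_sensor_mm = object_px * mm_per_px       # projection on the sensor
    return distance_mm * size_on_sensor_mm / focal_length_mm


# Hypothetical shot: 24 mm tall sensor, 3000 px tall image, 50 mm lens,
# subject 20 m away and 1500 px tall in the frame.
estimate_mm = object_size_from_exif(1500, 3000, 24.0, 50.0, 20_000.0)
print(estimate_mm)  # in mm
```

This assumes the camera is held perpendicular to the subject; a tilted camera would need a perspective correction first.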


If you mean the storage size of the image: a 3x3 image has 3 x 3 = 9 pixels, indexed 0 through 8. Assuming 1 bit per pixel, that is 9/8 bytes of raw data.

To get the size in KB, divide by 1024: (9/8)/1024 KB.

Divide by 1024 once more to get the result in MB.
