Distorting an image using a height map?

2023-02-14 05:31 问答作者：

I have a height map for an image, which tells me the offset of each pixel in the Z direction. My goal is to flatten a distorted image using on开发者_运维问答ly it's height map.

How would I go about doing this? I know the position of the camera, if that helps.

To do this, I was thinking about assuming that each pixel was a point on a plane, and then to translate each of those points vertically according to the Z-value I get from the height map, and from that translation (imagine you are looking at the points from above; the shift will cause the point to move around from your perspective).

From that projected shift, I could extract X and Y-shift of each pixel, which I could feed into cv.Remap().

But I have no idea how I could get the projected 3D offset of a point with OpenCV, let alone construct a offset map out of it.

Here are my reference images for what I'm doing:

Distorting an image using a height map?

I know the angle of the lasers (45 degrees), and from the calibration images, I can calculate the height of the book really easily:

h(x) = sin(theta) * abs(calibration(x) - actual(x))

I do this for both lines and linearly interpolate the two lines to generate a surface using this approach (Python code. It's inside a loop):

height_grid[x][y] = heights_top[x] * (cv.GetSize(image)[1] - y) + heights_bottom[x] * y

I hope this helps ;)

Right now, this is what I have to dewarp the image. All that strange stuff in the middle projects a 3D coordinate onto the camera plane, given it's position (and the camera's location, rotation, etc.):

class Point:
  def __init__(self, x = 0, y = 0, z = 0):
    self.x = x
    self.y = y
    self.z = z

mapX = cv.CreateMat(cv.GetSize(image)[1], cv.GetSize(image)[0], cv.CV_32FC1)
mapY = cv.CreateMat(cv.GetSize(image)[1], cv.GetSize(image)[0], cv.CV_32FC1)

c = Point(CAMERA_POSITION[0], CAMERA_POSITION[1], CAMERA_POSITION[2])
theta = Point(CAMERA_ROTATION[0], CAMERA_ROTATION[1], CAMERA_ROTATION[2])
d = Point()
e = Point(0, 0, CAMERA_POSITION[2] + SENSOR_OFFSET)

costx = cos(theta.x)
costy = cos(theta.y)
costz = cos(theta.z)

sintx = sin(theta.x)
sinty = sin(theta.y)
sintz = sin(theta.z)


for x in xrange(cv.GetSize(image)[0]):
  for y in xrange(cv.GetSize(image)[1]):
    
    a = Point(x, y, heights_top[x / 2] * (cv.GetSize(image)[1] - y) + heights_bottom[x / 2] * y)
    b = Point()
    
    d.x = costy * (sintz * (a.y - c.y) + costz * (a.x - c.x)) - sinty * (a.z - c.z)
    d.y = sintx * (costy * (a.z - c.z) + sinty * (sintz * (a.y - c.y) + costz * (a.x - c.x))) + costx * (costz * (a.y - c.y) - sintz * (a.x - c.x))
    d.z = costx * (costy * (a.z - c.z) + sinty * (sintz * (a.y - c.y) + costz * (a.x - c.x))) - sintx * (costz * (a.y - c.y) - sintz * (a.x - c.x))
    
    mapX[y, x] = x + (d.x - e.x) * (e.z / d.z)
    mapY[y, x] = y + (d.y - e.y) * (e.z / d.z)
    

print
print 'Remapping original image using map...'

remapped = cv.CreateImage(cv.GetSize(image), 8, 3)
cv.Remap(image, remapped, mapX, mapY, cv.CV_INTER_LINEAR)

This is turning into a huge thread of images and code now... Anyways, this code chunk takes my 7 minutes to run on a 18MP camera image; that's way too long, and in the end, this approach does nothing to the image (the offset for each pixel is << 1).

Any ideas?

I ended up implementing my own solution:

for x in xrange(cv.GetSize(image)[0]):
  for y in xrange(cv.GetSize(image)[1]):

    a = Point(x, y, heights_top[x / 2] * (cv.GetSize(image)[1] - y) + heights_bottom[x / 2] * y)
    b = Point()

    d.x = costy * (sintz * (a.y - c.y) + costz * (a.x - c.x)) - sinty * (a.z - c.z)
    d.y = sintx * (costy * (a.z - c.z) + sinty * (sintz * (a.y - c.y) + costz * (a.x - c.x))) + costx * (costz * (a.y - c.y) - sintz * (a.x - c.x))
    d.z = costx * (costy * (a.z - c.z) + sinty * (sintz * (a.y - c.y) + costz * (a.x - c.x))) - sintx * (costz * (a.y - c.y) - sintz * (a.x - c.x))

    mapX[y, x] = x + 100.0 * (d.x - e.x) * (e.z / d.z)
    mapY[y, x] = y + 100.0 * (d.y - e.y) * (e.z / d.z)


print
print 'Remapping original image using map...'

remapped = cv.CreateImage(cv.GetSize(image), 8, 3)
cv.Remap(image, remapped, mapX, mapY, cv.CV_INTER_LINEAR)

This (slowly) remaps each pixel using the cv.Remap function, and this seems to kind of work...

Distortion based on distance from the camera only happens with a perspective projection. If you have the (x,y,z) position of a pixel, you can use the projection matrix of the camera to unproject the pixels back into world-space. With that information, you can render the pixels in an orthographic way. However, you may have missing data, due to the original perspective projection.

Separate your scene out as follows:

you have an unknown bitmap image I(x,y) -> (r,g,b)
you have a known height field H(x,y) -> h
you have a camera transform C(x,y,z) -> (u,v) which projects the scene to a screen plane

Note that the camera transform throws information away (you do not get a depth value for each screen pixel). You may also have bits of scene overlap on screen, in which case only the foremost gets shown - the rest is discarded. So in general this is not perfectly reversible.

you have a screenshot S(u,v) which is a result of C(x,y,H(x,y)) for x,y in I
you want to generate a screenshot S'(u',v') which is a result of C(x,y,0) for x,y in I

There are two obvious ways to approach this; both depend on having accurate values for the camera transform.

Ray-casting: for each pixel in S, cast a ray back into the scene. Find out where it hits the heightfield; this gives you (x,y) in the original image I, and the screen pixel gives you the color at that point. Once you have as much of I as you can recover, re-transform it to find S'.
Double-rendering: for each x,y in I, project to find (u,v) and (u',v'). Take the pixel-color from S(u,v) and copy it to S'(u',v').

Both methods will have sampling problems which be helped by super-sampling or interpolation; method 1 will leave empty spaces in occluded areas of the image, method 2 will 'project through' from the first surface.

Edit:

I had presumed you meant a CG-style heightfield, where each pixel in S is directly above the corresponding location in S'; but this is not how a page drapes over a surface. A page is fixed at the spine and is non-stretchy - lifting the center of a page pulls the free edge toward the spine.

Based on your sample image, you'll have to reverse this cumulative pulling - detect the spine centerline location and orientation and work progressively left and right, finding the change in height across the top and bottom of each vertical strip of page, calculating the resulting aspect-narrowing and skew, and reversing it to re-create the original flat page.

继续阅读：image-processing opencv python unwarp

Distorting an image using a height map?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？