Friday, September 3, 2021

Best way to visualize 3D depth data

 Depth map images represent the Z (or distance from camera) as brightness or color.

Depth map image created from a laser scan of almonds.

The surfaces are hard for a human to pick out of the image.

In the depth image above there are boxes sitting on the floor.  The image gets brighter the further a surface is from the camera: as distance increases, the Z (depth) value gets larger, and image displays show higher pixel values as brighter intensities.
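As a minimal sketch of that mapping (the function name and the assumption that invalid pixels are stored as zero are mine, not from my scanner software), turning raw Z values into a viewable grayscale image can be as simple as:

import numpy as np

def depth_to_gray(depth):
    """Scale a 2D array of Z (distance) values into an 8-bit grayscale image.

    Farther points get larger values, so they display brighter.
    Pixels with depth == 0 are assumed to be invalid and stay black.
    """
    valid = depth > 0
    lo, hi = depth[valid].min(), depth[valid].max()
    scale = 255.0 / max(hi - lo, 1e-9)          # guard against a flat depth map
    gray = np.zeros(depth.shape, dtype=np.uint8)
    gray[valid] = ((depth[valid] - lo) * scale).astype(np.uint8)
    return gray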
  
But you can't tell where the box or its side walls are, and it is even harder to tell where the corners and flaps are.


The interesting information for humans (and eventually robots) is to color the image based on the surface, not on the depth.

This is the same image from the depth data, but the color is set by the surface normals.  That is, the normal vector to the plane each pixel sits on is encoded as a Red, Green and Blue value.  
 
a × b = normal vector  (does that make it an abnormal vector? dumb joke)

The normal vector is found by taking the cross product of two vectors.  Each pixel contributes an x,y image position and a z from the depth (intensity) value.  The two vectors for the cross product are created by subtracting the pixel's x,y,z from its neighbor along the x axis and from its neighbor along the y axis.
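Here is a rough Python/NumPy sketch of that idea.  Treating pixel spacing as 1 and using the raw depth value for z are assumptions for illustration (a real setup would scale by the camera geometry), and the edge pixels just wrap around:

import numpy as np

def depth_to_normals(depth):
    """Estimate a unit surface normal at each pixel of a depth image.

    Each pixel is treated as a 3D point (x, y, z) where x, y are pixel
    coordinates and z is the depth value.  Two tangent vectors come from
    subtracting the pixel from its neighbor in x and its neighbor in y;
    their cross product (a x b) is the surface normal.
    """
    h, w = depth.shape
    x, y = np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))
    pts = np.dstack([x, y, depth])           # (h, w, 3) point per pixel

    dx = np.roll(pts, -1, axis=1) - pts      # vector to the neighbor in +x
    dy = np.roll(pts, -1, axis=0) - pts      # vector to the neighbor in +y

    n = np.cross(dx, dy)                     # a x b = normal vector
    n /= np.linalg.norm(n, axis=2, keepdims=True) + 1e-12
    return n                                 # unit normals, shape (h, w, 3)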

The normal vector is 3 dimensional.  There is a convenient way of displaying 3D data by associating x, y, z with the colors red, green, blue.  This effectively moves the normal vector into the RGB color space.  That is super great because there are many color image machine vision tools!!!
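In code, that is just a rescale from the normal's [-1, 1] range into [0, 255].  This tiny sketch builds on the hypothetical depth_to_normals function above:

import numpy as np

def normals_to_rgb(normals):
    """Map unit normal components (x, y, z) in [-1, 1] to (R, G, B) in [0, 255]."""
    return ((normals + 1.0) * 0.5 * 255).astype(np.uint8)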
 
But the RGB color space is NOT how humans think about the world!
Artists use a hue, saturation and intensity color space, and the color wheel is a common way for humans to think about red, green and blue.
Color wheel.  Hue revolves around the circumference. Intensity increases toward the center.


  The normal vector traces out a sphere with radius 1.


That is, the normal vector only ever touches the surface of the sphere, never the inside.  This is great for us.  Converting the normal vector from RGB space to HSI or HSL means we can throw away the saturation, because saturation corresponds to the inside of the HSL sphere.
 



So now look at the color wheel again, but think of it as looking down onto a color sphere.

The normal vector pointing out of the image (straight at you) is white.  The normal pointing to the right is red, to the left is cyan, up is violet, down is greenish yellow.
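Here is a minimal sketch of that coloring, again building on the hypothetical normals array from earlier.  Using atan2 of the in-plane components for hue and |nz| for lightness is my assumption about one reasonable way to get this behavior, not the exact code behind the images; the per-pixel loop is slow but keeps it simple:

import colorsys
import numpy as np

def normals_to_hsl_image(normals):
    """Color each pixel by its surface normal using only hue and lightness.

    Hue comes from the direction the normal leans in the image plane
    (right = red, down = greenish yellow, left = cyan, up = violet).
    Lightness comes from how directly the surface faces the camera:
    straight at the camera = white, edge-on walls = fully saturated color.
    Saturation is pinned to 1, since unit normals never leave the sphere.
    """
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    hue = (np.arctan2(ny, nx) / (2 * np.pi)) % 1.0   # 0..1 around the color wheel
    light = 0.5 + 0.5 * np.abs(nz)                   # 0.5 (wall) .. 1.0 (facing camera)

    h, w = hue.shape
    rgb = np.zeros((h, w, 3), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            r, g, b = colorsys.hls_to_rgb(hue[i, j], light[i, j], 1.0)
            rgb[i, j] = (int(r * 255), int(g * 255), int(b * 255))
    return rgb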

Let's apply this idea to a depth image of a pallet of paper bags.
Depth image; brighter pixels are farther from the camera.

How can you tell the orientation of the stacks of bag bundles on the pallet in the center of the depth image?
Convert the surface normal to an HSL color sphere like this...

From the HSL normal image you can tell where the walls of the bundles are.  You can hopefully see that the bundle in the center is falling off.  It is tilted up and right, giving it a pink or magenta color.
You may even notice that a bundle has fallen onto the ground (the green-cyan blob above the pallet in the image); it is not noticeable in the depth image.

By coloring the surfaces using an HSL or HSI model, the walls of objects become easy to detect.  For a robot, the orientation or approach angle is part of the color image.

HSI-encoded normals image.  White pixels face the camera head-on.  Color (hue) encodes the rotation of each surface.



Thanks for reading my blog - Lowell Cady



