Object detection and inference using a point cloud with a 3D camera

Detecting and identifying objects and people in a point cloud is one of the main challenges in building systems with some degree of autonomy, and it is one of the problems with the greatest impact in computer vision.

Correct detection and inference are key to tasks such as autonomous navigation, robot arm assistance, video surveillance or, as in our case, defect detection.


What is a point cloud?

A point cloud represents a surface as a set of points, each with (X, Y, Z) coordinates in a three-dimensional coordinate system. Together, these points hold the information needed to generate a virtual model.
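As a minimal sketch of what this looks like in practice, a point cloud can be held in memory as an array with one row per point and one column per coordinate. The use of NumPy and the coordinate values below are our assumptions for illustration only.

```python
import numpy as np

# A point cloud with N points is an (N, 3) array: one row per point,
# columns are X, Y, Z. The values here are made up (metres).
points = np.array([
    [0.00, 0.00, 1.00],
    [0.10, 0.00, 1.02],
    [0.00, 0.10, 0.98],
    [0.10, 0.10, 1.00],
])

n_points = points.shape[0]                         # number of points
centroid = points.mean(axis=0)                     # average position
extent = points.max(axis=0) - points.min(axis=0)   # size along each axis
```

Simple quantities such as the centroid or the extent along each axis already say a lot about where an object is and how big it is.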


Where is a point cloud used?

Many industries depend on point clouds, for example civil and environmental engineering, cartography, geology, environmental monitoring, architecture, archaeology, structural inspection of buildings and topography, among others.


Ways to approach object detection

Let us look at which kind of camera is best suited to working with point clouds. Fundamentally, there are two ways to approach object detection.

The first uses monocular cameras (ordinary cameras, like those in our mobile phones), which provide two-dimensional images. Algorithms, which can vary widely, are applied to these images to obtain a bounding “box” that indicates the presence and location of the object to be detected in the image.

The second uses 3D cameras which, as the name suggests, add a third dimension to the two-dimensional image: depth. Thanks to this we can create point clouds, like the one in Image 1. At INLOC we use this second method to perceive the environment.
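To illustrate how depth turns into a point cloud, here is a minimal sketch that back-projects a depth image using the standard pinhole camera model. The function name `depth_to_point_cloud` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are assumptions for the example; a real camera provides its own calibrated values, and many 3D cameras deliver the point cloud directly through their SDK.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (metres) into an (N, 3) point cloud.

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with depth <= 0 are treated as invalid and dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column and row
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Toy 4x4 depth image: a flat wall 2 m away with one invalid pixel.
depth = np.full((4, 4), 2.0)
depth[0, 0] = 0.0
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
```

Each valid pixel becomes one 3D point, so the resulting cloud keeps the grid structure of the image, which is convenient for neighbourhood-based operations later on.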


The algorithm applied to the point cloud

The algorithms that can be applied are very varied, from the simplest to the most complex.

In this post we want to show the result of detecting a cup of coffee by chaining two very simple algorithms. First, we apply an edge detector to the original point cloud, which exposes the basic geometry of most of the elements. Second, we pass the edges through a RANSAC algorithm, which estimates the parameters of the geometry that best fits those edges.
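The post does not say which edge detector is used, so the following is only one possible sketch of the first step. It assumes the point cloud comes from a depth image and marks a pixel as an edge when its depth jumps by more than a threshold relative to a horizontal or vertical neighbour. The function name `depth_edges` and the 5 cm threshold are hypothetical.

```python
import numpy as np

def depth_edges(depth, threshold=0.05):
    """Mark pixels where depth jumps by more than `threshold` metres
    relative to a horizontal or vertical neighbour."""
    edges = np.zeros(depth.shape, dtype=bool)
    dx = np.abs(np.diff(depth, axis=1)) > threshold  # jumps between columns
    dy = np.abs(np.diff(depth, axis=0)) > threshold  # jumps between rows
    # A jump marks the pixels on both sides of the discontinuity.
    edges[:, :-1] |= dx
    edges[:, 1:] |= dx
    edges[:-1, :] |= dy
    edges[1:, :] |= dy
    return edges

# Toy scene: a background 2 m away with a square object 1 m closer.
depth = np.full((5, 5), 2.0)
depth[1:4, 1:4] = 1.0
edges = depth_edges(depth)
```

Only the pixels around the outline of the square are flagged; its interior and the far corners of the background are not, which is the kind of simplification that makes the later geometric fit cheap.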


Image 1. Point cloud

Which geometry to use will depend on the problem.

In the case of Image 1, the objective is to detect the cup of coffee on the table, so the most logical choice is to estimate the parameters of a circle.
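A RANSAC circle fit can be sketched as follows. This is a simplified illustration, not the implementation used for the images: it assumes the edge points have already been projected to 2D (for example onto the table plane), and the names `circle_from_three_points` and `ransac_circle`, the iteration count, the tolerance and the synthetic data are all hypothetical.

```python
import numpy as np

def circle_from_three_points(p1, p2, p3):
    """Return (centre, radius) of the circle through three 2D points,
    or None if the points are (nearly) collinear."""
    # Subtracting |p - c|^2 = r^2 pairwise gives a linear system in c.
    a = 2.0 * np.array([p2 - p1, p3 - p1])
    b = np.array([p2 @ p2 - p1 @ p1, p3 @ p3 - p1 @ p1])
    if abs(np.linalg.det(a)) < 1e-12:
        return None
    centre = np.linalg.solve(a, b)
    return centre, np.linalg.norm(p1 - centre)

def ransac_circle(points, n_iterations=200, tolerance=0.005, seed=0):
    """Keep the circle, fitted to 3 random points, with the most inliers."""
    rng = np.random.default_rng(seed)
    best_centre, best_radius = None, None
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iterations):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        model = circle_from_three_points(*sample)
        if model is None:
            continue
        centre, radius = model
        # Distance of every point to the candidate circle.
        residuals = np.abs(np.linalg.norm(points - centre, axis=1) - radius)
        inliers = residuals < tolerance
        if inliers.sum() > best_inliers.sum():
            best_centre, best_radius, best_inliers = centre, radius, inliers
    return best_centre, best_radius, best_inliers

# Synthetic edge points: a 4 cm rim centred at (0.30, 0.20) plus clutter.
rng = np.random.default_rng(1)
angles = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
rim = np.column_stack([0.30 + 0.04 * np.cos(angles),
                       0.20 + 0.04 * np.sin(angles)])
clutter = rng.uniform(0.0, 1.0, size=(20, 2))
edge_points = np.vstack([rim, clutter])

centre, radius, inliers = ransac_circle(edge_points)
```

Because each candidate is built from only three points, clutter that does not lie on the rim rarely gathers many inliers, which is why the approach copes with the background noise in a scene like this one.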

With this method the cup is detected successfully, as can be seen in Image 2.

The advantage of this method is that, when the object to be detected is reasonably simple, detection is both simple and fast.

Image 1 shows the point cloud from the 3D camera. Note the cup of coffee on the table, with the wall and the objects in the background acting as noise.


Image 2. Segmented Point Cloud

Image 2 shows the original point cloud with the coffee cup detected and located. The original points are in blue, the edges are in pink, and the points that fall inside the box (in red) that encloses the cup are in gray.
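Selecting the points inside the box can be sketched as a simple mask over the point cloud. How the box in Image 2 was built is not stated here, so this example assumes an axis-aligned box derived from a detected circle (centre and radius) and an assumed cup height, with Z measured upwards from the table. All names and values are hypothetical.

```python
import numpy as np

def points_in_box(points, box_min, box_max):
    """Boolean mask of the points lying inside an axis-aligned box."""
    return np.all((points >= box_min) & (points <= box_max), axis=1)

# Made-up points (metres): two on the cup, one elsewhere on the table,
# one above the cup.
points = np.array([
    [0.30, 0.20, 0.05],
    [0.33, 0.22, 0.02],
    [0.90, 0.10, 0.05],
    [0.30, 0.20, 0.50],
])

# Assumed detection result and cup height.
centre, radius, height = np.array([0.30, 0.20]), 0.04, 0.10
box_min = np.array([centre[0] - radius, centre[1] - radius, 0.0])
box_max = np.array([centre[0] + radius, centre[1] + radius, height])

mask = points_in_box(points, box_min, box_max)
cup_points = points[mask]
```

The mask is what lets the points belonging to the cup be coloured differently from the rest of the scene.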

This method is not, however, robust: when there are objects with complex geometries or many instances of similarly shaped objects in the 3D point cloud, the chances of finding unwanted objects are high.

For these cases, hybrid solutions must be explored, in which the RGB image of each candidate object is passed on to other algorithms such as convolutional neural networks, which have a greater capacity for abstraction and generalization, despite being much more complex and slower.

But that’s a story for another post!


We hope this article has been useful to you. If you have an engineering project on your hands and you think we can help you, here is the link where you can contact us and tell us more about it.