Once the data are generated by Blender, they can be used by the model-based tracker and the results can be compared with the ground truth.
In this tutorial we will consider a tea box. The CAD model file (teabox.cao) and the init file (teabox.init) are provided in the model/teabox folder.
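As a reminder, a .cao file is a plain-text file following ViSP's V1 model syntax: a header, then the list of 3D points, then lines, faces built from lines or from points, cylinders and circles. The structure of teabox.cao follows the sketch below (the exact coordinates are the ones shipped with ViSP):

V1
# 3D points
8                 # Number of points
0 0 0             # Point 0: X Y Z in meters
...               # Points 1 to 7, one per line
# 3D lines
0                 # Number of lines
# Faces from 3D lines
0                 # Number of faces
# Faces from 3D points
6                 # Number of faces
4 0 1 2 3         # Each face: number of points followed by their indices
...               # Remaining 5 faces
# 3D cylinders
0                 # Number of cylinders
# 3D circles
0                 # Number of circles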
Rather than creating an object with arbitrary dimensions, to simplify this tutorial we will create in Blender a tea box whose dimensions, object frame, and 3D point coordinates match those used in teabox.cao and teabox.init.
We will also consider an RGB-D camera in which the left camera is a classical color camera and the right camera is a depth camera. The distance between the left and right cameras is 10 cm. Both cameras grab 640 x 480 images.
In this section, we will create the corresponding tea box and RGB-D camera, and define the initial and final RGB-D camera poses used to animate the scene.
Here we will show how to create in Blender a box with the following dimensions:
Open Blender and do the following to turn the default cube into a box with the expected dimensions:
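If you prefer scripting these interactive steps, the same result can be obtained from Blender's Python console. A minimal sketch, assuming the default scene and the 16.5 x 6.8 x 8 cm tea box dimensions used by ViSP's teabox model:

import bpy

# Select the default cube and set its dimensions to the tea box size
cube = bpy.data.objects["Cube"]
cube.name = "teabox"
cube.dimensions = (0.165, 0.068, 0.08)

# Apply the scale so that the mesh vertices carry the real dimensions
bpy.context.view_layer.objects.active = cube
cube.select_set(True)
bpy.ops.object.transform_apply(location=False, rotation=False, scale=True)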
The coordinates of the box's vertices are expressed in the object frame, whose origin is at the box's center of gravity (cog) and whose axes are aligned with the scene reference frame. To conform to the required CAD model, we will move the origin of the object frame to point 0, whose position is given in the next image:
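In script form, this origin change amounts to placing the 3D cursor on point 0 and snapping the object origin to it. A sketch, assuming point 0 is the box corner located at half the dimensions from the cog (use the corner shown in the image):

import bpy

cube = bpy.data.objects["teabox"]

# Place the 3D cursor on point 0 (assumed corner, relative to the cog)
bpy.context.scene.cursor.location = (-0.0825, -0.034, 0.04)

# Snap the object origin to the 3D cursor
bpy.context.view_layer.objects.active = cube
cube.select_set(True)
bpy.ops.object.origin_set(type='ORIGIN_CURSOR')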
The next step consists in adding a texture so that tracking with keypoint features can be tested. You could add a realistic texture using an image texture but, here again to simplify this tutorial, we will just add a "Voronoi Texture". To this end:
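An equivalent Python sketch creates a node-based material with a Voronoi Texture plugged into the Base Color of the Principled BSDF (the material name "teabox_mat" is arbitrary):

import bpy

cube = bpy.data.objects["teabox"]

# Create a node-based material with a Voronoi Texture feeding the Base Color
mat = bpy.data.materials.new("teabox_mat")
mat.use_nodes = True
nodes = mat.node_tree.nodes
voronoi = nodes.new("ShaderNodeTexVoronoi")
bsdf = nodes["Principled BSDF"]
mat.node_tree.links.new(voronoi.outputs["Color"], bsdf.inputs["Base Color"])

# Assign the material to the box
cube.data.materials.append(mat)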
As shown in the previous image, the camera is not visible in the scene. This is simply because its location is too far from the scene origin; the camera frame origin is at X = 7.3589, Y = -6.9258, Z = 4.9583 meters. We need to move the camera closer to the object and choose a position that will ease the introduction of the depth camera.
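Moving the camera can also be done from the Python console by setting its location and rotation directly. The values below are placeholders; the actual pose to use is the one shown in the images of this tutorial:

import bpy

cam = bpy.data.objects["Camera"]
# Hypothetical pose: bring the camera about half a meter from the box,
# roughly looking at it (replace with the pose used in this tutorial)
cam.location = (0.0, -0.5, 0.3)
cam.rotation_euler = (1.2, 0.0, 0.0)  # radians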
Out of curiosity, you can now render an image of your object (shortcut: F12) and obtain an image like the following:
To simulate an RGB-D camera, we need to add a second camera to retrieve the depth map, and set the appropriate parameters to match the desired intrinsics following the same instructions as in the previous section. To this end:
Since the depth camera should be 10 cm to the right of the color camera, we need to modify its location:
Since we want to animate the movement of the stereo pair rather than each camera individually, we need to link the two cameras together using Blender's parenting concept (a scripted equivalent of these steps is sketched below):
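The three steps above (adding the depth camera, shifting it 10 cm to the right and parenting it to the color camera) can be scripted as follows. This is a sketch; the Camera_L / Camera_R names match those used later when reading the intrinsics:

import bpy

# Rename the existing color camera and create the depth camera
color_cam = bpy.data.objects["Camera"]
color_cam.name = "Camera_L"

depth_cam_data = bpy.data.cameras.new("Camera_R")
depth_cam = bpy.data.objects.new("Camera_R", depth_cam_data)
bpy.context.scene.collection.objects.link(depth_cam)

# Parent the depth camera to the color camera so that animating Camera_L
# moves the stereo pair as a whole, ...
depth_cam.parent = color_cam
# ... then place it 10 cm to the right along the parent's local X axis
depth_cam.location = (0.10, 0.0, 0.0)
depth_cam.rotation_euler = (0.0, 0.0, 0.0)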
The last thing to do is to modify the project to generate the depth maps. To this end:
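One way to script this is to enable the Z pass and route it to an OpenEXR File Output node in the compositor, so that metric depth is preserved in floating point. A sketch, where the output folder is arbitrary:

import bpy

scene = bpy.context.scene
# Enable the Z (depth) pass for the active view layer
bpy.context.view_layer.use_pass_z = True

# Build a small compositor graph: Render Layers -> File Output (OpenEXR)
scene.use_nodes = True
tree = scene.node_tree
render_layers = tree.nodes.new("CompositorNodeRLayers")
file_output = tree.nodes.new("CompositorNodeOutputFile")
file_output.base_path = "/tmp/depth"           # arbitrary output folder
file_output.format.file_format = "OPEN_EXR"   # float format keeps metric depth
tree.links.new(render_layers.outputs["Depth"], file_output.inputs["Image"])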
We are now ready to animate the scene. First we have to define the camera's initial position. This can be done easily:
Now we have to perform the same operation for the camera's final position.
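In script form, the two keyframes are inserted on the parent camera Camera_L. The final pose below is a placeholder for the one you chose:

import bpy

cam = bpy.data.objects["Camera_L"]
scene = bpy.context.scene

# Keyframe the initial pose on the first frame
scene.frame_set(scene.frame_start)
cam.keyframe_insert(data_path="location")
cam.keyframe_insert(data_path="rotation_euler")

# Move to the last frame, set the final pose and keyframe it
scene.frame_set(scene.frame_end)
cam.location = (0.1, -0.4, 0.25)      # hypothetical final position
cam.rotation_euler = (1.1, 0.0, 0.3)  # hypothetical final orientation
cam.keyframe_insert(data_path="location")
cam.keyframe_insert(data_path="rotation_euler")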
This completes the configuration of the scene in Blender. We strongly recommend that you save your project.
If you simply want to retrieve the camera intrinsics of a given camera set in Blender, we provide the get_camera_intrinsics.py Python script in the $VISP_WS/visp/tutorial/tracking/model-based/generic-rgbd-blender/ folder.
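The core of such a script is the pinhole relation between Blender's camera settings and the intrinsics: the focal length in pixels is the focal length in millimeters scaled by the image width over the sensor width, with the principal point at the image center. A simplified sketch of the computation (not the script itself, which ships with ViSP):

import bpy

def get_intrinsics(camera_name):
    """Approximate pinhole intrinsics (px, py, u0, v0) of a Blender camera."""
    scene = bpy.context.scene
    camd = bpy.data.objects[camera_name].data
    scale = scene.render.resolution_percentage / 100.0
    width = scene.render.resolution_x * scale
    height = scene.render.resolution_y * scale
    # Assumes the sensor fits horizontally and pixels are square
    px = camd.lens * width / camd.sensor_width
    py = px
    u0, v0 = width / 2.0, height / 2.0
    return px, py, u0, v0

print(get_intrinsics("Camera_L"))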
In the terminal from which you launched Blender, you should get something similar to:
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <sstream>

#include <visp3/core/vpConfig.h>
#include <visp3/core/vpDisplay.h>
#include <visp3/core/vpImageConvert.h>
#include <visp3/core/vpIoTools.h>
#include <visp3/core/vpPixelMeterConversion.h>
#include <visp3/core/vpXmlParserCamera.h>
#include <visp3/gui/vpDisplayFactory.h>
#include <visp3/io/vpImageIo.h>
#include <visp3/mbt/vpMbGenericTracker.h>
#if defined(VISP_HAVE_OPENCV) && defined(HAVE_OPENCV_IMGCODECS) && defined(VISP_HAVE_PUGIXML)
#ifdef ENABLE_VISP_NAMESPACE
using namespace VISP_NAMESPACE_NAME;
#endif
namespace
{
bool read_data(unsigned int cpt, const std::string &video_color_images, const std::string &video_depth_images,
               bool disable_depth, const std::string &video_ground_truth,
               vpImage<unsigned char> &I, vpImage<uint16_t> &I_depth_raw,
               unsigned int &depth_width, unsigned int &depth_height,
               std::vector<vpColVector> &pointcloud, const vpCameraParameters &cam_depth,
               vpHomogeneousMatrix &cMo_ground_truth)
{
  // Build the color image filename for frame cpt and read the image
  char buffer[FILENAME_MAX];
  snprintf(buffer, FILENAME_MAX, video_color_images.c_str(), cpt);
  std::string filename_color = buffer;

  if (!vpIoTools::checkFilename(filename_color)) {
    std::cerr << "Cannot read: " << filename_color << std::endl;
    return false;
  }
  vpImageIo::read(I, filename_color);

  if (!disable_depth) {
    // Build the depth map filename (EXR image rendered by Blender)
    snprintf(buffer, FILENAME_MAX, video_depth_images.c_str(), cpt);
    std::string filename_depth = buffer;

    if (!vpIoTools::checkFilename(filename_depth)) {
      std::cerr << "Cannot read: " << filename_depth << std::endl;
      return false;
    }
cv::Mat depth_raw = cv::imread(filename_depth, cv::IMREAD_ANYDEPTH | cv::IMREAD_ANYCOLOR);
if (depth_raw.empty()) {
std::cerr << "Cannot read: " << filename_depth << std::endl;
return false;
}
depth_width = static_cast<unsigned int>(depth_raw.cols);
depth_height = static_cast<unsigned int>(depth_raw.rows);
    I_depth_raw.resize(depth_height, depth_width);
pointcloud.resize(depth_width * depth_height);
for (int i = 0; i < depth_raw.rows; i++) {
for (int j = 0; j < depth_raw.cols; j++) {
I_depth_raw[i][j] = static_cast<uint16_t>(32767.5f * depth_raw.at<cv::Vec3f>(i, j)[0]);
        double x = 0.0, y = 0.0;
        // Convert pixel (u = j, v = i) into normalized coordinates using the depth intrinsics
        vpPixelMeterConversion::convertPoint(cam_depth, j, i, x, y);
        // Discard depth values farther than 2 m
        double Z = depth_raw.at<cv::Vec3f>(i, j)[0] > 2.0f ? 0.0 : static_cast<double>(depth_raw.at<cv::Vec3f>(i, j)[0]);
size_t idx = static_cast<size_t>(i * depth_raw.cols + j);
pointcloud[idx].resize(3);
pointcloud[idx][0] = x * Z;
pointcloud[idx][1] = y * Z;
pointcloud[idx][2] = Z;
}
}
}
  // Read the ground truth pose of the color camera for this frame
  snprintf(buffer, FILENAME_MAX, video_ground_truth.c_str(), cpt);
  std::string filename_pose = buffer;
  std::ifstream f_pose(filename_pose.c_str());
  cMo_ground_truth.load(f_pose);
return true;
}
}
void usage(const char **argv, int error, const std::string &data_path, const std::string &model_path, int first_frame)
{
std::cout << "Synopsis" << std::endl
<< " " << argv[0]
<< " [--data-path <path>] [--model-path <path>] [--first-frame <index>] [--depth-dense-mode <0|1>] "
<< " [--depth-normals-mode <0|1>] [--me-mode <0|1>] [--klt-mode <0|1>] [--step-by-step] [--display-ground-truth] [--help, -h]" << std::endl
<< std::endl;
std::cout << "Description" << std::endl
<< " --data-path <path> Path to the data generated by Blender get_camera_pose_teabox.py" << std::endl
<< " Python script."
<< " Default: " << data_path << std::endl
<< std::endl
<< " --model-path <path> Path to the cad model and tracker settings." << std::endl
<< " Default: " << model_path << std::endl
<< std::endl
<< " --first-frame <index> First frame number to process." << std::endl
<< " Default: " << first_frame << std::endl
<< std::endl
<< " --depth-dense-mode Whether to use dense depth features (0 = off, 1 = on). default: 1" << std::endl
<< std::endl
<< " --depth-normals-mode Whether to use normal depth features (0 = off, 1 = on). default: 0" << std::endl
<< std::endl
<< " --me-mode Whether to use moving edge features (0 = off, 1 = on). default: 1" << std::endl
<< std::endl
<< " --klt-mode Whether to use KLT features (0 = off, 1 = on). Requires OpenCV. default: 1" << std::endl
<< std::endl
<< " --step-by-step Flag to enable step by step mode." << std::endl
<< std::endl
<< " --display-ground-truth Flag to enable displaying ground truth." << std::endl
<< " When this flag is enabled, there is no tracking. This flag is useful" << std::endl
<< " to validate the ground truth over the rendered images." << std::endl
<< std::endl
<< " --help, -h Print this helper message." << std::endl
<< std::endl;
  if (error) {
    std::cout << "Error" << std::endl
      << " " << "Unsupported parameter " << argv[error] << std::endl;
  }
}
int main(int argc, const char **argv)
{
std::string opt_data_path = "data/teabox";
std::string opt_model_path = "model/teabox";
unsigned int opt_first_frame = 1;
int opt_meMode = 1, opt_kltMode = 1, opt_normalsMode = 0, opt_denseMode = 1;
bool disable_depth = false;
bool opt_disable_klt = false;
bool opt_display_ground_truth = false;
bool opt_step_by_step = false;
  for (int i = 1; i < argc; i++) {
    if (std::string(argv[i]) == "--data-path" && i + 1 < argc) {
      opt_data_path = std::string(argv[++i]);
    }
    else if (std::string(argv[i]) == "--model-path" && i + 1 < argc) {
      opt_model_path = std::string(argv[++i]);
    }
    else if (std::string(argv[i]) == "--depth-dense-mode" && i + 1 < argc) {
      opt_denseMode = atoi(argv[++i]);
      if (opt_denseMode < 0 || opt_denseMode > 1) {
        usage(argv, 0, opt_data_path, opt_model_path, opt_first_frame);
        return EXIT_FAILURE;
      }
    }
    else if (std::string(argv[i]) == "--depth-normals-mode" && i + 1 < argc) {
      opt_normalsMode = atoi(argv[++i]);
      if (opt_normalsMode < 0 || opt_normalsMode > 1) {
        usage(argv, 0, opt_data_path, opt_model_path, opt_first_frame);
        return EXIT_FAILURE;
      }
    }
    else if (std::string(argv[i]) == "--me-mode" && i + 1 < argc) {
      opt_meMode = atoi(argv[++i]);
      if (opt_meMode < 0 || opt_meMode > 1) {
        usage(argv, 0, opt_data_path, opt_model_path, opt_first_frame);
        return EXIT_FAILURE;
      }
    }
    else if (std::string(argv[i]) == "--klt-mode" && i + 1 < argc) {
      opt_kltMode = atoi(argv[++i]);
      if (opt_kltMode < 0 || opt_kltMode > 1) {
        usage(argv, 0, opt_data_path, opt_model_path, opt_first_frame);
        return EXIT_FAILURE;
      }
    }
    else if (std::string(argv[i]) == "--display-ground-truth") {
      opt_display_ground_truth = true;
    }
    else if (std::string(argv[i]) == "--step-by-step") {
      opt_step_by_step = true;
    }
    else if (std::string(argv[i]) == "--first-frame" && i + 1 < argc) {
      opt_first_frame = static_cast<unsigned int>(atoi(argv[++i]));
    }
    else if (std::string(argv[i]) == "--help" || std::string(argv[i]) == "-h") {
      usage(argv, 0, opt_data_path, opt_model_path, opt_first_frame);
      return EXIT_SUCCESS;
    }
    else {
      usage(argv, i, opt_data_path, opt_model_path, opt_first_frame);
      return EXIT_FAILURE;
    }
  }
disable_depth = opt_denseMode == 0 && opt_normalsMode == 0;
std::string color_camera_name = "Camera_L";
std::string depth_camera_name = "Camera_R";
std::cout << "Input data" << std::endl;
std::cout << " Color images : " << video_color_images << std::endl;
std::cout << " Depth images : " << (disable_depth ? "Disabled" : video_depth_images) << std::endl;
std::cout << " Extrinsics : " << (disable_depth ? "Disabled" : extrinsic_file) << std::endl;
std::cout << " Color intrinsics: " << color_intrinsic_file << std::endl;
std::cout << " Depth intrinsics: " << (disable_depth ? "Disabled" : depth_intrinsic_file) << std::endl;
std::cout << " Ground truth : " << ground_truth << std::endl;
std::cout << "Tracker settings" << std::endl;
std::cout << " Color config : " << mbt_config_color << std::endl;
std::cout << " Depth config : " << mbt_config_depth << std::endl;
std::cout << " CAD model : " << mbt_cad_model << std::endl;
std::cout << " First frame : " << opt_first_frame << std::endl;
std::cout << " Step by step : " << opt_step_by_step << std::endl;
if (opt_display_ground_truth) {
std::cout << " Ground truth is used to project the cad model (no tracking)" << std::endl;
}
else {
std::cout << " Init file : " << mbt_init_file << std::endl;
std::cout << " Features : moving-edges " << (opt_disable_klt ? "" : "+ keypoints") << (disable_depth ? "" : " + depth") << std::endl;
}
  // Select the visual features used by the color camera
  int colorTracker = 0;
  if (opt_meMode == 1) {
    colorTracker |= vpMbGenericTracker::EDGE_TRACKER;
  }
  if (opt_kltMode == 1) {
#if defined(VISP_HAVE_OPENCV) && defined(HAVE_OPENCV_IMGPROC) && defined(HAVE_OPENCV_VIDEO)
    colorTracker |= vpMbGenericTracker::KLT_TRACKER;
#else
    std::cerr << "Warning: keypoints cannot be used as features since ViSP is not built with OpenCV 3rd party" << std::endl;
#endif
  }
  if (colorTracker == 0) {
    std::cerr << "You should use at least one type of color feature. If OpenCV is not installed, KLT features are disabled" << std::endl;
    return EXIT_FAILURE;
  }

  // Select the visual features used by the depth camera
  std::vector<int> trackerTypes;
  trackerTypes.push_back(colorTracker);
  if (!disable_depth) {
    int depthTracker = 0;
    if (opt_denseMode == 1) {
      depthTracker |= vpMbGenericTracker::DEPTH_DENSE_TRACKER;
    }
    if (opt_normalsMode == 1) {
      depthTracker |= vpMbGenericTracker::DEPTH_NORMAL_TRACKER;
    }
    trackerTypes.push_back(depthTracker);
  }
  vpMbGenericTracker tracker(trackerTypes);
if (!disable_depth) {
    tracker.loadConfigFile(mbt_config_color, mbt_config_depth, true);
}
else {
tracker.loadConfigFile(mbt_config_color);
}
  // Load camera intrinsics from the XML files generated by the Blender scripts
  vpCameraParameters cam_color, cam_depth;
  vpXmlParserCamera p;
  if (p.parse(cam_color, color_intrinsic_file, color_camera_name, vpCameraParameters::perspectiveProjWithoutDistortion) != vpXmlParserCamera::SEQUENCE_OK) {
    std::cout << "Cannot find intrinsics for camera " << color_camera_name << std::endl;
  }
  if (p.parse(cam_depth, depth_intrinsic_file, depth_camera_name, vpCameraParameters::perspectiveProjWithoutDistortion) != vpXmlParserCamera::SEQUENCE_OK) {
    std::cout << "Cannot find intrinsics for camera " << depth_camera_name << std::endl;
  }
if (!disable_depth)
tracker.setCameraParameters(cam_color, cam_depth);
else
tracker.setCameraParameters(cam_color);
if (!disable_depth)
tracker.getCameraParameters(cam_color, cam_depth);
else
tracker.getCameraParameters(cam_color);
  std::cout << "cam_color:\n" << cam_color << std::endl;
  if (!disable_depth)
    std::cout << "cam_depth:\n" << cam_depth << std::endl;
  vpImage<unsigned char> I;
  vpImage<uint16_t> I_depth_raw;
  unsigned int depth_width = 0, depth_height = 0;
  std::vector<vpColVector> pointcloud;
  vpHomogeneousMatrix cMo_ground_truth;
  unsigned int frame_cpt = opt_first_frame;
  read_data(frame_cpt, video_color_images, video_depth_images, disable_depth, ground_truth,
            I, I_depth_raw, depth_width, depth_height, pointcloud, cam_depth, cMo_ground_truth);

  // Build a displayable image from the raw depth map
  vpImage<vpRGBa> I_depth;
  if (!disable_depth) {
    vpImageConvert::createDepthHistogram(I_depth_raw, I_depth);
  }
#if defined(VISP_HAVE_DISPLAY)
#if (VISP_CXX_STANDARD >= VISP_CXX_STANDARD_11)
  std::shared_ptr<vpDisplay> display1 = vpDisplayFactory::createDisplay();
  std::shared_ptr<vpDisplay> display2 = vpDisplayFactory::createDisplay();
#else
  vpDisplay *display1 = vpDisplayFactory::allocateDisplay();
  vpDisplay *display2 = vpDisplayFactory::allocateDisplay();
#endif
display1->init(I, 0, 0, "Color image");
if (!disable_depth) {
display2->init(I_depth, static_cast<int>(I.getWidth()), 0, "Depth image");
}
#endif
  // Set the constant transformation between the depth and color cameras
  vpHomogeneousMatrix depth_M_color;
  if (!disable_depth) {
    std::ifstream f_extrinsics(extrinsic_file.c_str());
    depth_M_color.load(f_extrinsics);
    tracker.setCameraTransformationMatrix("Camera2", depth_M_color);
  }
if (opt_display_ground_truth) {
tracker.initFromPose(I, cMo_ground_truth);
}
else {
    tracker.initClick(I, mbt_init_file, true);
}
try {
bool quit = false;
    while (!quit &&
           read_data(frame_cpt, video_color_images, video_depth_images, disable_depth,
                     ground_truth, I, I_depth_raw, depth_width, depth_height, pointcloud, cam_depth,
                     cMo_ground_truth)) {
      // Refresh the displayed images
      vpDisplay::display(I);
      if (!disable_depth) {
        vpImageConvert::createDepthHistogram(I_depth_raw, I_depth);
        vpDisplay::display(I_depth);
      }
if (opt_display_ground_truth) {
tracker.initFromPose(I, cMo_ground_truth);
}
else {
if (!disable_depth) {
std::map<std::string, const vpImage<unsigned char> *> mapOfImages;
std::map<std::string, const std::vector<vpColVector> *> mapOfPointClouds;
std::map<std::string, unsigned int> mapOfPointCloudWidths;
std::map<std::string, unsigned int> mapOfPointCloudHeights;
mapOfImages["Camera1"] = &I;
mapOfPointClouds["Camera2"] = &pointcloud;
          mapOfPointCloudWidths["Camera2"] = depth_width;
          mapOfPointCloudHeights["Camera2"] = depth_height;
          tracker.track(mapOfImages, mapOfPointClouds, mapOfPointCloudWidths, mapOfPointCloudHeights);
        }
        else {
          // Track using the color image only
          tracker.track(I);
        }
}
      // Get the estimated pose and compare it with the ground truth
      vpHomogeneousMatrix cMo = tracker.getPose();
      std::cout << "\nFrame: " << frame_cpt << std::endl;
      if (!opt_display_ground_truth)
        std::cout << "cMo:\n" << cMo << std::endl;
      std::cout << "cMo ground truth:\n" << cMo_ground_truth << std::endl;
      // Display the projected model and the object frame in both views
      if (!disable_depth) {
        tracker.display(I, cMo, cam_color, vpColor::red, 2);
        vpDisplay::displayFrame(I_depth, depth_M_color * cMo, cam_depth, 0.05, vpColor::none, 2);
      }
      else {
        tracker.display(I, cMo, cam_color, vpColor::red, 2);
      }
      vpDisplay::displayFrame(I, cMo, cam_color, 0.05, vpColor::none, 2);
      std::ostringstream oss;
      oss << "Frame: " << frame_cpt;
      vpDisplay::setTitle(I, oss.str());
      if (opt_step_by_step) {
        vpDisplay::displayText(I, 20, 10, "Left click: next frame. Right click: quit", vpColor::red);
      }
      else {
        vpDisplay::displayText(I, 20, 10, "Left click: step-by-step mode. Right click: quit", vpColor::red);
      }
      if (!opt_display_ground_truth) {
        {
          std::stringstream ss;
          ss << "Nb features: " << tracker.getError().size();
          vpDisplay::displayText(I, I.getHeight() - 50, 20, ss.str(), vpColor::red);
        }
        {
          std::stringstream ss;
          ss << "Features: edges " << tracker.getNbFeaturesEdge()
             << ", klt " << tracker.getNbFeaturesKlt()
             << ", dense depth " << tracker.getNbFeaturesDepthDense()
             << ", depth normals " << tracker.getNbFeaturesDepthNormal();
          vpDisplay::displayText(I, I.getHeight() - 30, 20, ss.str(), vpColor::red);
        }
      }
      vpDisplay::flush(I);
      if (!disable_depth) {
        vpDisplay::flush(I_depth);
      }

      // Left click: enter step-by-step mode. Middle click: stay in step-by-step mode.
      // Right click: leave step-by-step mode if enabled, quit otherwise.
      vpMouseButton::vpMouseButtonType button;
      if (vpDisplay::getClick(I, button, opt_step_by_step)) {
        if (button == vpMouseButton::button1 && opt_step_by_step == false) {
          opt_step_by_step = true;
        }
        else if (button == vpMouseButton::button3 && opt_step_by_step == true) {
          opt_step_by_step = false;
        }
        else if (button == vpMouseButton::button3 && opt_step_by_step == false) {
          quit = true;
        }
        else if (button == vpMouseButton::button2) {
          opt_step_by_step = true;
        }
      }
frame_cpt++;
}
}
catch (std::exception &e) {
    std::cerr << "Catch exception: " << e.what() << std::endl;
}
#if (VISP_CXX_STANDARD < VISP_CXX_STANDARD_11) && defined(VISP_HAVE_DISPLAY)
if (display1 != nullptr) {
delete display1;
}
if (display2 != nullptr) {
delete display2;
}
#endif
return EXIT_SUCCESS;
}
#else
int main()
{
std::cout << "To run this tutorial, ViSP should be built with OpenCV and pugixml libraries." << std::endl;
return EXIT_SUCCESS;
}
#endif
The default parameters allow running the binary with the data provided in ViSP. Just run:
You should be able to obtain tracking results similar to those shown in the next video.