Building a virtual environment in Unity based on a robot’s perception.

Nikita Vaneev
University of Twente
Thursday 5th July, 2018


Contents

1 Introduction
1.1 Context
1.2 Problem definition
1.3 Approach
1.4 Research questions
1.5 Report organization

2 Analysis
2.1 System description
2.2 Visual SLAM methods
2.3 Approach overview

3 Design and implementation
3.1 Master side
3.2 User side

4 Evaluation
4.1 Unity point cloud handling
4.2 Gazebo and Rviz alternative
4.3 Discussion

5 Conclusion and recommendations
5.1 Conclusion
5.2 Recommendations

A Appendix 1
A.1 ImageReceiver, DepthReceiver, MessageEventArgs, and MessageReceiver class diagram
A.2 RosConnector and Subscriber class diagram

B Appendix 2
B.1 RQT graph of all topics involved on the ROS side

Bibliography


1 Introduction

1.1 Context

i-Botics is an open innovation center founded by the University of Twente and TNO (http://i-botics.com/). One of the fields researched at the center is telerobotics, the field of study concerned with the control of semi-autonomous robots. The concept behind these robots is telepresence.

Telepresence is the use of virtual reality technology, especially for remote control of machinery or for apparent participation in distant events (https://en.wikipedia.org/wiki/Telerobotics). It could be useful when people need to work in unsafe environments.

By implementing teleoperated robots it would be possible to spare the many people who currently put their lives in danger by working in hazardous environments. Common dangerous areas where robots can replace humans, and where access is impossible or very risky, are nuclear plants, space, and underwater. Robots with proper sensory equipment are capable of working in conditions, such as low-visibility areas, where humans cannot operate effectively. Nevertheless, the machines are only an addition to human reflexes. Teleoperated robots offer capabilities such as maneuverability, mobility, dexterity, sensing, endurance, radio communication, durability, reliability, autonomy, logistics, and safety; thus, they are great tools to help in emergency cases (Russell, 2017). However, time delays between the human and the robot are a big issue, since human abilities and reflexes are essential in decision making, and even small delays could negatively affect the performance of the tasks. In order to solve this problem, a virtual environment could be generated which is updated in real time using the robot's sensors. The concept of the project described in this chapter is illustrated in Figure 1.1.

The robot is equipped with many sensors, but the one used in this project is an RGB-D camera. The RGB-D camera makes it possible to create rich and full 3D maps even in completely dark environments; however, it has drawbacks: depth can only be detected within a limited range, the images can be noisy depending on the camera, and the field of view is restricted to a specific angle.

1.2 Problem definition

In order for the project to be implemented successfully, there are three things that need to be taken into consideration that make the whole process challenging. The project deals with operations on two operating systems: Linux and Windows. To allow functionality and communication between the platforms, a bridge needs to be implemented through which the data from one computer is passed to the other. Such a bridge exists in the form of a ROS package called Rosbridge; however, it has to be integrated within both operating systems, which is the first challenge of the project.

Once the data on the Linux computer is obtained and sent through the bridge to Unity, it needs to be modeled into a virtual environment. The data from the sensor must be processed using proper visualization software that can produce a rich 3D point cloud in order to provide accurate feedback to the operator. The sensory information from the camera is essential, because if the image quality is not good enough, it would not be possible to recognize the environment and interact with it. However, there are not many ways to accurately generate a 3D point cloud in Linux using the Kinect, so the second challenge is to find software that is capable of producing good visuals.

Finally, once the RGB-D data is obtained for a single measurement, it needs to be saved and added to the previous measurements to create a full and detailed point cloud map. Even though this task could be done on the Linux side, because there are already ways of working with 3D point cloud maps there, it would be a challenge to do so on the Unity side, since this approach has not been attempted yet.

1.3 Approach

The goal of this project is to provide precise sensing from an RGB-D camera and then model a delay-free virtual environment in Unity based on the RGB-D data. All the software for working with robots and their sensors is made for Linux. In order to work with robotics in Linux, there must be some middleware between the sensors and the operating system. Such middleware exists and is called ROS (Robot Operating System). This system consists of tools and libraries that allow creating robust functionality for the sensors (http://www.ros.org/about-ros/).

The RGB-D camera functionality is based on an RGB-D SLAM algorithm, which provides the current position of the camera and creates a point cloud of the map. As for the modeling part, Unity3D is chosen because this game engine is used to develop three-dimensional and two-dimensional games (https://unity3d.com/). The key aspect of Unity for this project is the ability to implement virtual reality. Unity3D is compatible with different VR headsets and makes it easy to implement the chosen device within the platform.

Figure 1.1: The concept behind the project.

1.4 Research questions

Based on the problem definition the following main research question was formulated:

• How to retrieve and model 3D point-cloud data using an RGB-D camera?

Since there are two operating systems involved in this project, additional sub-questions were created that would help to achieve the goal of the project and answer the main research question. For the ROS side, the sub-questions are:


• How is a point cloud generated in ROS?

• How can RGB-D data be transferred via the network to Unity?

For the Unity3D side, the sub-questions are:

• How does Unity3D receive data from ROS?

• How can single RGB-D measurements from the Kinect be combined into a full 3D point cloud map and modeled in Unity3D?

1.5 Report organization

The outline is as follows:

• Chapter 2

This chapter describes the analysis that was made for this work. It consists of the description of the robotic system and the software used during the project, and it provides the analysis of visual SLAM algorithms.

• Chapter 3

This chapter discusses in detail the design and the implementation of the system. All the packages involved in the project are described and documented.

• Chapter 4

This chapter evaluates the final system and discusses the issues that were faced during the project. Also, possible alternatives to this work are provided.

• Chapter 5

This chapter concludes the work and gives recommendations for future improvements.

2 Analysis

2.1 System description

Figure 2.2: The robotic platform.

2.1.2 ROS

The Robot Operating System (ROS) is a middleware framework for the creation and deployment of robot software. It makes the integration of software components easier, making the entire process straightforward. A running instance of ROS has a master node which sets up the communication with the other nodes. Once the connection is established, the nodes can transmit data between each other through transport protocols. The most significant advantage of ROS is that it allows combining software created by different developers. The distribution of ROS used in this work is ROS Kinetic, because it is primarily targeted at Ubuntu 16.04.

Within the system, it is possible to use packages that contain libraries, nodes, datasets, etc. These packages allow a developer to easily reuse and share software, thus providing useful functionality.

2.1.3 Unity3D

Unity3D is cross-platform game development software that supports three scripting languages: JavaScript, C#, and the Python-like Boo (Wang et al., 2010). These languages can be mixed during game development. The implementation of the Kinect in Unity is limited, and the most popular and up-to-date asset in the store is MS-SDK. However, the sensory information from the Kinect comes from the ROS side; thus, the Kinect-based assets in Unity are not crucial for this project.

2.1.4 ROS and Unity communication

In order to have a connection between ROS and Unity, there must be a bridge to connect both systems. For Unity, the interface can be based on either the HTTP protocol (a communication protocol for information that includes graphics, audio, video, plain text, and hyperlinks) or the WebSocket protocol (a communication protocol that provides communication in both directions over a single TCP connection). For ROS, the interface is based on the software package Rosbridge (Codd-Downey, Forooshani, Speers, Wang, & Jenkin, 2014). The advantage of Rosbridge is that it can be easily used by people with no prior knowledge of robotics to build web-based applications. Rosbridge provides JSON (JavaScript Object Notation, a file format that uses human-readable text to transmit data objects) functionality to ROS and can interact with various interfaces, including WebSocket.

The most up-to-date way of communication between ROS and .NET (a software framework that runs on Microsoft Windows) based programs is ROS#, an open source tool in C# (Bischoff, 2018). The main parts of ROS# are RosBridgeClient and UrdfImporter. RosBridgeClient is based on two libraries: websocket_sharp (used for communication with Unity) and Json.NET (used to increase the speed of topic sending between the operating systems). UrdfImporter is based on two libraries as well: Math.NET Numerics and the TaskParallelLibrary for .NET 3.5, which are used for the robot's representation in Unity and Gazebo.
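As an illustration of the data exchange that Rosbridge makes possible, the following sketch shows how a plain .NET client could subscribe to a ROS topic over a WebSocket using Rosbridge's JSON protocol. It uses only the standard System.Net.WebSockets API rather than the ROS# classes themselves, and the host address and topic name are example values.

// Minimal sketch (not the ROS# API itself): subscribing to a ROS topic
// through Rosbridge's JSON protocol over a raw WebSocket connection.
using System;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

class RosbridgeSubscribeSketch
{
    static async Task Main()
    {
        using (var socket = new ClientWebSocket())
        {
            // Rosbridge listens on ws://<linux-host>:9090 by default; the address is an example.
            await socket.ConnectAsync(new Uri("ws://192.168.0.10:9090"), CancellationToken.None);

            // Rosbridge expects JSON operations such as "subscribe".
            string subscribe = "{\"op\":\"subscribe\"," +
                               "\"topic\":\"/camera/rgb/image_rect_color/compressed\"," +
                               "\"type\":\"sensor_msgs/CompressedImage\"}";
            await socket.SendAsync(new ArraySegment<byte>(Encoding.UTF8.GetBytes(subscribe)),
                                   WebSocketMessageType.Text, true, CancellationToken.None);

            // Each incoming frame is a JSON "publish" message containing the topic data;
            // a large image may arrive split over several frames, which is not handled here.
            var buffer = new byte[1 << 20];
            var result = await socket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
            Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, result.Count));
        }
    }
}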

2.1.5 SLAM

SLAM (simultaneous localization and mapping) is an algorithm used to navigate robots in unknown environments while producing a map of the area around the robot and updating the location of the system within that map. The algorithm calculates the 2D coordinates of the robot and estimates the position of the system. Additionally, loop closure in SLAM asserts that the robot has returned to a previously visited location, which leads to a more consistent 3D map. The data from the Kinect is processed by an RGB-D SLAM algorithm that not only localizes the camera in space but also produces a 3D point cloud.

To answer the research question "How to retrieve and model 3D point-cloud data using an RGB-D camera?" formulated in Section 1.4, a SLAM method needs to be implemented that produces a 3D point cloud. Since plain SLAM does not provide visualization, the method should be a visual SLAM algorithm, in which the point cloud consists of multiple measurements from the Kinect that are saved into one 3D point cloud map.
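As a conceptual illustration of what such merging of measurements involves on the modeling side, the sketch below accumulates per-frame points into one map, assuming that a camera pose (position and orientation) is available for every frame, for example from the SLAM odometry. The class and method names are hypothetical and are not part of the implementation described later in this report.

// Conceptual sketch (hypothetical helper, not from this project): merging
// per-frame Kinect measurements into one map, assuming the SLAM algorithm
// supplies the camera pose for every frame.
using System.Collections.Generic;
using UnityEngine;

public class PointCloudAccumulator
{
    private readonly List<Vector3> worldPoints = new List<Vector3>();

    // 'cameraPosition'/'cameraRotation' would come from the SLAM odometry
    // (e.g. the /tf or odometry output of RTAB-Map); 'framePoints' holds one
    // back-projected depth image expressed in the camera's local frame.
    public void AddFrame(Vector3 cameraPosition, Quaternion cameraRotation, IEnumerable<Vector3> framePoints)
    {
        Matrix4x4 cameraToWorld = Matrix4x4.TRS(cameraPosition, cameraRotation, Vector3.one);
        foreach (Vector3 p in framePoints)
        {
            // Transform the point from the camera frame into the world frame
            // and append it to the accumulated map.
            worldPoints.Add(cameraToWorld.MultiplyPoint3x4(p));
        }
    }

    public IReadOnlyList<Vector3> Map => worldPoints;
}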

2.2 Visual SLAM methods

This section analyzes different SLAM methods that could be used for making 3D point cloud maps using the Kinect. Some of the packages might be a better fit for the project than others, but all of them give good insight into how visual simultaneous localization and mapping works. Seven parameters were chosen to determine which SLAM performs best in comparison with the others. The parameters are visualization, 3D map quality, map density, noise level, odometry, speed of mapping, and RGB-D support. These parameters were chosen because they have the most significant effect on the visuals. The goal of this section is to find the RGB-D SLAM algorithm that best matches the parameters described above. Four possible options are described, starting from the SLAM that generates the least detailed map to the most detailed one. These options are Hector SLAM, ORB-SLAM, DPPTAM, and RTAB-Map. These algorithms were chosen because they can all use the Microsoft Kinect as their main camera (a 2D LIDAR sensor is recommended for Hector SLAM, but the Kinect can still produce 2D maps).

2.2.1 Hector SLAM

Hector SLAM is based on data obtained from a 2D LIDAR sensor (Kohlbrecher, Von Stryk, Meyer, & Klingauf, 2011). The algorithm deals with 2D data and can build a 2D map and localize the camera's position (Figure 2.3). The LIDAR sensor is usually mounted at an angle to the surface on which the robot moves. To build a map, the coordinate system of the sensor needs to be converted to the coordinate system of the surface. The algorithm makes a 2D grid that consists of cells.


Each cell has a color based on its occupancy status (black = occupied, light grey = free, dark grey = not scanned). A green line represents the path of the robot.

Due to the low field of view and the short range, the Kinect is not a good camera for Hector SLAM; however, for testing purposes, it could still work. Instead, RGB-D SLAM algorithms are a better approach when working with the Kinect. Additionally, based on the seven parameters from Section 2.2, such criteria as 3D map quality, map density, noise level, and RGB-D support are missing because it is a 2D SLAM; thus, Hector SLAM is not a suitable algorithm for this project.

Figure 2.3: Example of the map produced by Hector SLAM (image source: https://robotics.stackexchange.com/questions/7387/hector-slam-matching-algorithm).

2.2.2 ORB-SLAM

ORB-SLAM is based on real-time data obtained from monocular, stereo, or RGB-D cameras (Montiel, 2015). This SLAM implementation uses the Bundle Adjustment algorithm, which generates images and places them in a sparse 3D scene. The ORB (Oriented FAST and Rotated BRIEF) part of the algorithm uses a fast feature detector that allows it to operate in real time (Figure 2.4). ORB-SLAM detects loop closures and localizes the robot for each produced frame.

The data from the algorithm is processed with a ROS library that provides all the feedback and the GUI. ORB-SLAM produces sparse 3D scenes, which have their advantages and drawbacks. The method finds image features such as edges and corners and tracks them from one frame to another.

On the one hand, the data that needs to be transferred is not heavy because there are not many details, which makes the transfer much faster; on the other hand, the entire point of creating a virtual environment will not work with this SLAM due to the lack of detail. Regarding the parameters from Section 2.2, such important criteria as map quality and map density score low, and, as mentioned earlier, there are consequently not many details in the point cloud.


Figure 2.4: Example of the map produced by ORB-SLAM (image source: https://www.researchgate.net/figure/ORB-SLAM-using-data-from-CIT-Lab_fig7_311607258).

2.2.3 DPPTAM

DPPTAM (Dense Piecewise Planar Tracking and Mapping from a Monocular Sequence) is a real-time SLAM algorithm which estimates and reconstructs a 3D scene based on the assumption that homogeneous-color regions belong to the same planar areas (Figure 2.5). The algorithm is an online ROS node that builds dense point cloud maps while detecting monochrome flat objects such as walls and floors.

Based on the seven parameters, such criteria as 3D map quality, map density, noise level and RGB-D support are a strong part of DPPTAM. The most significant advantage of this SLAM is the 3D map quality. The rendering of the pixels makes it easy to distinguish between the objects visually. However, the downside is the time that it takes for the map to build and render.

Moreover, due to the low building speed, the entire map may crash while rescanning if new objects are added to already scanned areas (Civera & Concha, 2017).

2.2.4 RTAB-Map

RTAB-Map (Real-Time Appearance-Based Mapping) is a SLAM algorithm based on loop closure detection (Labbé, 2014). The algorithm behind the closure detection is bag-of-words, which compares new camera images with pictures of already visited locations (Figure 2.6). Every time a closure is detected, the map is optimized, creating a better visual representation.

RTAB-Map supports different RGB-D cameras, including the Kinect. This SLAM can easily be implemented in ROS through the existing package. The algorithm uses a 2D grid map consisting of cells that check the surface for occupancy, and then forms a 3D point cloud map. Each cell contains pixels with different texture and color depth, so the algorithm assumes that pixels with similar qualities belong to the same object. Once the map is created, it can be saved as a PLY or PCD file that can be opened and edited in any 3D software that supports these formats (including Unity).
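As a minimal sketch of how such an exported map could be read on the Unity side, the following loader parses an ASCII PLY file into a list of points. It assumes the ASCII variant of the format with x, y, and z as the first three vertex properties; a binary PLY or additional elements would require a fuller parser.

// Minimal sketch: loading vertices from an ASCII PLY export into a list of points.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using UnityEngine;

public static class PlyLoader
{
    public static List<Vector3> LoadAsciiPly(string path)
    {
        var points = new List<Vector3>();
        using (var reader = new StreamReader(path))
        {
            int vertexCount = 0;
            string line;

            // Parse the header to find the vertex count and the end of the header.
            while ((line = reader.ReadLine()) != null)
            {
                if (line.StartsWith("element vertex"))
                    vertexCount = int.Parse(line.Split(' ')[2]);
                if (line.Trim() == "end_header")
                    break;
            }

            // Read one vertex per line: "x y z [other properties...]".
            for (int i = 0; i < vertexCount && (line = reader.ReadLine()) != null; i++)
            {
                string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                points.Add(new Vector3(
                    float.Parse(parts[0], CultureInfo.InvariantCulture),
                    float.Parse(parts[1], CultureInfo.InvariantCulture),
                    float.Parse(parts[2], CultureInfo.InvariantCulture)));
            }
        }
        return points;
    }
}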

In comparison with other SLAM algorithms, namely Hector SLAM, ORB-SLAM and DPPTAM, RTAB-Map scores highest on almost every parameter (except the noise level).


Figure 2.5: Example of the map produced by DPPTAM (image source: https://github.com/alejocb/dpptam).

Figure 2.6: Example of the map produced by RTAB-Map.

2.2.5 Systematic comparison of Visual SLAM methods

Figures 2.7 and 2.8 provide a systematic comparison of the four visual SLAM methods. It can be seen that Hector SLAM falls behind on many parameters since it is a 2D-only SLAM; thus, it is not a good option for the project. As for the other SLAM methods, it can be concluded that only two are qualified for proper 3D mapping. These SLAM algorithms are DPPTAM and RTAB-Map. While comparing both approaches, the following things need to be considered.

3 Design and implementation

3.1 Master side

The following section is split into several parts and describes in detail how each package operates on the Linux side of the system. The first part explains how the Kinect obtains the image data using the Freenect package. The second part focuses on the communication side using ROS#. The third part provides the details about the 3D point cloud mapping.

3.1.1 Freenect package

To obtain visual data from the Kinect, there must be a package that allows such functionality. For this reason, Freenect serves as the driver for the Kinect on the Linux computer. The data from the sensor is organized into topics that Unity uses for the image processing. It is possible to obtain a point cloud from the Freenect package by using two topics: color and depth (Figure 3.2). The color topic provides RGB data while the depth topic offers monochrome depth. However, the functionality of this package is limited, and it does not allow the generation of a 3D point cloud map. Instead, this package is used for two particular reasons. The first reason is to activate the Kinect so that it can be used in RTAB-Map. The second reason is to be able to subscribe to the RGB topic in Unity, so that the user can see from the robot's perspective in VR.

Figure 3.2: RGB topic (left) applied on monochrome depth topic (right).
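For reference, the sketch below shows how a registered depth image could be back-projected into 3D points with the pinhole camera model. The intrinsic parameters are example values for a 640x480 Kinect and would have to be replaced by the ones published on the camera_info topic; the depth values are assumed to be in metres, and axis conventions between the camera frame and Unity's coordinate system are not handled here.

// Sketch of back-projecting a registered depth image into 3D points with the
// pinhole camera model. The intrinsics are example values, not calibrated ones.
using System.Collections.Generic;
using UnityEngine;

public static class DepthBackProjection
{
    public static List<Vector3> ToPoints(float[,] depthMetres)
    {
        const float fx = 525.0f, fy = 525.0f;   // focal lengths in pixels (example values)
        const float cx = 319.5f, cy = 239.5f;   // principal point (example values)

        var points = new List<Vector3>();
        int height = depthMetres.GetLength(0);
        int width = depthMetres.GetLength(1);

        for (int v = 0; v < height; v++)
        {
            for (int u = 0; u < width; u++)
            {
                float z = depthMetres[v, u];
                if (z <= 0f) continue;          // zero means "no depth measured"

                // Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
                float x = (u - cx) * z / fx;
                float y = (v - cy) * z / fy;
                points.Add(new Vector3(x, y, z));
            }
        }
        return points;
    }
}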

3.1.2 RTAB-Map package

In the previous section, it was mentioned that by using the Freenect package it is possible to make a point cloud; however, the mapping functionality is missing. For this reason, RTAB-Map was chosen as the primary software to produce the visuals. General information about RTAB-Map is provided in Section 2.2.4. What is essential about this SLAM is that there is a ROS package that can be used within Rviz to produce and, later, save the point cloud, which is stored as a database. This database can be converted into an OBJ file and viewed in any software that supports this format. The package is called rtabmap_ros, and once it is run on the Linux computer, it executes the following functions: PointCloud2, TF, MapCloud, and Info. The most essential features of this package for this project are PointCloud2 and MapCloud because the

3.2 User side

would be following the perspective of the VIVE and, due to the sensitive position tracking of the headset, the plane would be constantly shaking, which may cause symptoms of dizziness for the person using the headset.


4 Evaluation

In the previous chapter, the implementation of the system was described and the role and functionality of each package were elaborated. Even though all the necessary software was installed properly on Windows and Linux, there were a few issues with the topics received from the ROS framework. In particular, Unity did not handle depth registration in the way ROS did, which caused problems with generating a proper virtual environment. The following chapter starts by evaluating the way Unity handles the point cloud, then focuses on why some parts of the system did not work as expected, and finally ends with a discussion on switching from Unity to Gazebo and Rviz.

4.1 Unity point cloud handling

While trying to build an environment in Unity based on the Kinect data, three topics need to be considered. The topics are RGB image, depth, and map. The first two are responsible for point cloud creation and the last one deals with creating a 3D map based on the point cloud.

In order to achieve the desired goal of generating a virtual environment in Unity, a systematic approach was proposed that consisted of three steps. The first step was to generate a 2D image based on the compressed /camera/rgb/color_image topic. The second step was to add depth to the 2D image produced in the previous step using the /camera/depth/image_raw topic. The third step was to map all the obtained data and view it in VR.

There were multiple attempts at extracting the image and the depth data, which resulted in a mostly positive outcome, as it was possible to see the depth of the map to some extent. As for the mapping, there were no successful outcomes in building a 3D map in Unity. Each step is documented in detail further in this report.

4.1.1 Image registration

From the three topics described earlier, RGB image processing is the most stable approach that worked in all scenarios. Unity has many components that deal with all kinds of mesh rendering.

When getting the data from the color topic of the Kinect, it is applied as a 2D texture on a 2D mesh represented as a plane. There were no issues whatsoever in running the system using only a 2D visualization (Figure 4.1). Two functions are involved in receiving the image.

The first function deals with getting the message, and the second function subscribes to the image topic /camera/rgb/image_rect_color/compressed. To test whether computing power affects the experience of the operator in VR, it was decided to involve two computers with a significant difference in hardware (Figure 4.2). The first computer (PC1) has a 6-core processor with a clock speed of 3.2 GHz, an Nvidia GTX 560 Ti GPU, and 8 GB of RAM, while the second computer (PC2) has only a 2-core processor with a clock speed of 2.4 GHz, an Intel HD 5500 GPU, and 8 GB of RAM. As expected, based on visual assessment, the weaker system (PC2) caused a delay on the Unity side, making the image unpleasant to watch and hard to recognize, while the other system did not have any issues on the Unity side.
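A minimal sketch of this image registration step is given below: the compressed image bytes received from the topic are decoded into a texture and applied to the plane's material. The class and field names are illustrative and do not correspond to the project's actual classes; note that LoadImage must be called on Unity's main thread.

// Sketch of applying a received compressed image (JPEG/PNG bytes from the
// /camera/rgb/image_rect_color/compressed topic) to a plane in the scene.
using UnityEngine;

public class RgbPlaneDisplay : MonoBehaviour
{
    public Renderer planeRenderer;      // the plane the operator looks at
    private Texture2D texture;

    void Start()
    {
        // The size is overwritten by LoadImage, so 2x2 is only a placeholder.
        texture = new Texture2D(2, 2);
        planeRenderer.material.mainTexture = texture;
    }

    // Called whenever a new compressed image message arrives.
    public void OnImageReceived(byte[] compressedBytes)
    {
        // Decodes the JPEG/PNG data and uploads it to the GPU.
        texture.LoadImage(compressedBytes);
    }
}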


4.2 Gazebo and Rviz alternative

Since the attempt at generating a virtual environment based on sensory information obtained by a robotic system did not work, it was decided to look for a possible alternative to evaluate whether Unity is a good match for this project after all. Some of the issues with Unity that were mentioned earlier are the distortion of the depth in the point cloud and the unbearable time delay after adding a depth topic. However, these issues are not present on the Linux side. Moreover, point cloud handling in ROS using RTAB-Map shows promising results in terms of the quality of the map. Also, the time delay while processing a lot of optical data is minimal in comparison with Unity. Hence, there is a possibility of omitting Unity, and Windows in general, in favor of Rviz and Gazebo. Gazebo can be used to manipulate the robot in space, while Rviz is used to visualize optical and depth data obtained by the Kinect.

4.2.1 Virtual reality in Linux

A simulation in Gazebo was implemented where a turtlebot is placed in a simulated environment. The turtlebot is controlled with a keyboard, and all the visual data is handled by RTAB-Map and illustrated in Rviz. The role of Gazebo is to replicate a real scenario where the robot could be used; hence, Rviz could be used as the only software for 3D point cloud generation, since the purpose of Rviz in this project is to visualize optical sensor data from ROS. In the future, Gazebo could be added in addition to Rviz as a tool to deal with the physical aspects of the robot and the environment.

The results of the simulation mentioned earlier were promising in both time delay and map quality. However, the biggest issue of this implementation is the current state of VR in Linux.

One of the main reasons why Unity was proposed for this project is that it supports all kinds of virtual reality headsets, which is not the case in Linux. The current state of VR in Linux is not nearly as good as in Windows, and software such as SteamVR, OSVR, or OpenHMD is barely working, providing only limited functionality. Nevertheless, there have been a few attempts at creating custom packages for Linux that would make it possible to, at least, add an HMD as a secondary monitor to view the desktop. The reason for considering VR in Linux for this project is that if the headset could be used within Rviz, there would be no need to transfer a lot of data through the network to a different operating system. In this case, the problem of generating a virtual environment on the user side would be completely solved by removing Unity and the issues that appeared with it. However, while trying to implement an HMD within Linux, many problems could not be resolved in time due to the lack of programming skills and the time restrictions of this project.

HTC VIVE

Currently, the only official support for the HTC VIVE in Linux comes from SteamVR, which is not enough for this project. There have been a few attempts at using the VIVE within Gazebo and Rviz, but all the packages are outdated. The most recent solution for using the VIVE in Rviz is the RVIZ plugin for the HTC VIVE. Most of the source code was updated two years ago, while the most recent updates were made a year ago. The main reason why this approach did not work is that the package is based on outdated software, and some parts of it could not be accessed anymore. It is worth noting that it was recommended to use this package with the original versions of SteamVR and VSAJ that were used when the package was created. However, as mentioned earlier, this did not work, which is why the test was moved forward and the package was tested with SteamVR version 1528940790 and OpenVR version 1.0.15. As a result, the package could not operate properly and crashed a few times because of the software differences; thus, this approach did not work either. More packages can be found on the internet, but all of them are in a development state and none have a stable release.

4.3 Discussion

program, which eliminates the opportunity of livestreaming; additionally, it does not match the idea behind the project, where the point cloud is received from the Linux computer.

All of this leads to the conclusion that Unity might not be a good platform in the scope of this work, and more research needs to be conducted directly on Unity to see how this software handles the data obtained from Linux. Also, Gazebo and Rviz seem to be a good alternative in this case, for multiple reasons. First, RTAB-Map handles 3D point cloud mapping without any major issues, and the visuals can be displayed in Rviz. Second, there is no communication loss, as there is in the Windows-to-Linux connection, since all the processing and handling of the data happens on one platform. Third, there is still a possibility to implement VR within Rviz, even though there are many problems that have to be resolved. However, these issues are mainly related to outdated code that could be updated by people with strong programming skills.


5 Conclusion and recommendations

5.1 Conclusion

The goal stated in the introduction was to generate a virtual environment based on the sensory information obtained by a robotic system. Based on this goal, a research question "How to retrieve and model 3D point-cloud data using an RGB-D camera?" was formulated and then answered to some extent.

For the Linux side, both sub-questions, namely "How is a point cloud generated in ROS?" and "How can RGB-D data be transferred via the network to Unity?", were answered. For the first question, the best way to generate a 3D point cloud is to use the visual SLAM algorithm RTAB-Map, because it produces dense point clouds, it has good odometry, the data can be opened in any 3D editing software, and it can be implemented within ROS. For the second question, Rosbridge implemented within ROS# is ideal to establish the connection with the Windows side and transfer the data.

For the Windows side, the sub-question "How does Unity3D receive data from ROS?" was answered, and "How can single RGB-D measurements from the Kinect be combined into a full 3D point cloud map and modeled in Unity3D?" was answered to some extent. For the first question, the communication is similar to the approach on the Linux side: as the basis of the connection, WebSocket is used within the ROS# package, where it gets connected with Rosbridge.

For the second question, the desired goal of making a 3D point cloud map in Unity was not achieved. Instead, it was possible to generate a stable 2D environment based on the RGB topic obtained from the Kinect. The main issue appeared with the added depth topic, which Unity did not handle well. The reason behind this problem was not discovered and requires further research on Unity3D.

Since the sub-question "How can single RGB-D measurements from the Kinect be combined into a full 3D point cloud map and modeled in Unity3D?" could not be answered, a different approach could be applied for this project. By implementing a head-mounted display, which could be either the HTC VIVE or the Oculus Rift, in Linux, it is possible to substitute Rviz for Unity, since both of them have a similar role in this work, namely visualizing a 3D point cloud map that a user can see in VR. As for Gazebo, it does not play an essential role specifically for this project; however, it is definitely an important addition to future implementations where the user would interact with an environment in VR, because Gazebo provides tools for getting physical information about the objects.

5.2 Recommendations

The following recommendations could be considered to improve the system:

• Improve the quality of the depth.

In the current state, the depth in Unity3D is handled as layers of 2D planes that simulate the appearance of the real depth from the Kinect.

• Add the mapping to save the point cloud in Unity.

At the moment there are two topics included in Unity3D which are color and depth. Once the issues with depth processing are resolved, the next step could be storing the obtained data.

• Switch from Unity3D to Rviz and Gazebo.


Due to the many problems that were faced during the project, it would be a good idea to use only the Rviz and Gazebo combination, because all the data generation and processing is already happening on the Linux side and can be viewed in Rviz.

• Update the Rviz package for the Oculus Rift.

Related to the previous recommendation, an option of adding a VR headset in Rviz could be included, but the software is outdated.


A Appendix 1

A.1 ImageReceiver, DepthReceiver, MessageEventArgs, and MessageReceiver class diagram

Figure A.1: Class overview


A.2 RosConnector and Subscriber class diagram

Figure A.2: Class overview


Bibliography

Aguilar, W., & Morales, S. (2016). 3D Environment Mapping Using the Kinect V2 and Path Planning Based on RRT Algorithms. Electronics, 5(4), 70. https://doi.org/10.3390/electronics5040070

Bischoff, M. (2018). ROS# is a set of open source software libraries and tools in C# for communicating with ROS from .NET applications, in particular Unity3D. Retrieved from https://github.com/siemens/ros-sharp

Codd-Downey, R., Forooshani, P. M., Speers, A., Wang, H., & Jenkin, M. (2014). From ROS to Unity: Leveraging robot and virtual environment middleware for immersive teleoperation. 2014 IEEE International Conference on Information and Automation, ICIA 2014, 932-936. https://doi.org/10.1109/ICInfA.2014.6932785

Civera, J., & Concha, A. (2017). DPPTAM: Dense Piecewise Planar Tracking and Mapping from a Monocular Sequence. Retrieved from https://github.com/alejocb/dpptam

Kohlbrecher, S., Von Stryk, O., Meyer, J., & Klingauf, U. (2011). A flexible and scalable SLAM system with full 3D motion estimation. 9th IEEE International Symposium on Safety, Security, and Rescue Robotics, SSRR 2011, 155-160. https://doi.org/10.1109/SSRR.2011.6106777

Labbé, M., & Michaud, F. (2014). Online global loop closure detection for large-scale multi-session graph-based SLAM. 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2661-2666.

Montiel, J. M. M. (2015). ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics, 31(5), 1147-1163.

Newcombe, R. A., Davison, A. J., Izadi, S., Kohli, P., Hilliges, O., Shotton, J., ... Fitzgibbon, A. (2011). KinectFusion: Real-Time Dense Surface Mapping and Tracking. IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), 127-136. https://doi.org/10.1109/ISMAR.2011.6092378

Russell, D. L. (2017, July 19). Emergency Response Robots. Retrieved April 11, 2018, from https://www.nist.gov/programs-projects/emergency-response-robots

Wang, S., Mao, Z., Zeng, C., Gong, H., Li, S., & Chen, B. (2010). A new method of virtual reality based on Unity3D. 2010 18th International Conference on Geoinformatics, 1-5. https://doi.org/10.1109/GEOINFORMATICS.2010.5567608
