Cyber RT vs ROS

在Apollo 3.5中,百度自研了运行时计算框架Cyber RT来代替ROS官方FAQ中提到其性能、延迟以及吞吐量都要优于ROS。这篇博客记录了我依据Issue #7220来验证性能是否提高的过程。

我们测试的是发送和接收消息的延时。在Cyber RTROS中分别有自带的demo来完成这一功能,我们只需将这两个demo中发送的数据统一一下,并在接受时输出latency即可。

测试Cyber RT

Clone apollo最新代码

1
git clone https://github.com/ApolloAuto/apollo.git -b master

运行docker环境

1
2
./docker/scripts/dev_start.sh
./docker/scripts/dev_into.sh

apollo提供了docker环境,已经预装了各种依赖的环境,官方也建议在docker中运行。

build项目

1
bash apollo.sh build

build后的可执行文件在bazel_bin/目录下。

我们要测试的两个可执行文件即为talkerlistener,它们在bazel_bin/cyber/examples/目录下,此时可以运行这两个文件测试功能是否正常。

修改talker和listener 的源代码

talker.cpplistener.cpp

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
/* cyber/examples/talker.cpp */
#include <iostream>
#include <sstream>

#include "cyber/cyber.h"
#include "cyber/message/raw_message.h"
#include "cyber/time/rate.h"
#include "cyber/time/time.h"

using apollo::cyber::Rate;
using apollo::cyber::Time;
using apollo::cyber::message::RawMessage;

int main(int argc, char *argv[]) {
// init cyber framework
apollo::cyber::Init(argv[0]);
// create talker node
auto talker_node = apollo::cyber::CreateNode("talker");
// create talker
auto talker = talker_node->CreateWriter<RawMessage>("channel/chatter");

Rate rate(10.0);
const std::string foo_content(307200, '+'); // 307200 = 480 * 640
int count = 0;

while (apollo::cyber::OK()) {
auto msg = std::make_shared<RawMessage>();
msg->message = foo_content;

std::stringstream ss;
ss << Time::Now().ToNanosecond();
msg->message.append(ss.str());

talker->Write(msg);
rate.Sleep();

++count;
if (count == 10000) {
std::cout << "talk over" << std::endl;
break;
}
}

return 0;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
/* cyber/examples/listener.cpp */
#include "cyber/cyber.h"
#include "cyber/message/raw_message.h"

using apollo::cyber::Time;
using apollo::cyber::message::RawMessage;

uint64_t count = 0;
uint64_t total_latency = 0;

void MessageCallback(const std::shared_ptr<RawMessage>& msg) {
uint64_t now = Time::Now().ToNanosecond();
std::string ts(msg->message, 307200);
total_latency += now - std::stoul(ts);
++count;
}

int main(int argc, char* argv[]) {
// init cyber framework
apollo::cyber::Init(argv[0]);
// create listener node
auto listener_node = apollo::cyber::CreateNode("listener");
// create listener
auto listener = listener_node->CreateReader<RawMessage>("channel/chatter",
MessageCallback);
apollo::cyber::WaitForShutdown();
std::cout << "count: " << count
<< " avg_latency_us: " << total_latency / count / 1000 << std::endl;
return 0;
}

修改conf文件

为了减少线程数,Cyber RT更改了共享内存的可读通知机制,现在默认使用UDP,但是由此导致的系统调用(sendto)会导致一些延迟,因此需要修改cyber/conf/cyber.pb.conf来更改这个机制来达到最低的延迟。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
# cyber/conf/cyber.pb.conf
transport_conf {
shm_conf {
# "multicast" "condition"
notifier_type: "condition"
shm_locator {
ip: "239.255.0.100"
port: 8888
}
}
participant_attr {
lease_duration: 12
announcement_period: 3
domain_id_gain: 200
port_base: 10000
}
communication_mode {
same_proc: INTRA
diff_proc: SHM
diff_host: RTPS
}
resource_limit {
max_history_depth: 1000
}
}

run_mode_conf {
run_mode: MODE_REALITY
}

scheduler_conf {
routine_num: 100
default_proc_num: 16
}

使用opt选项编译

1
2
bazel build -c opt --copt=-fpic //cyber/examples/talker
bazel build -c opt --copt=-fpic //cyber/examples/listener

测试latency

在两个终端分别运行talkerlistenertalker会在发送1000条消息后退出,之后Ctrl-clistener,会显示其平均latency

测试ROS

准备源代码 repo

获取apollo-platform源代码

将解压后的ros文件夹放在apollo/third_party中(上面clone的apollo项目中的)。

build项目

build同样要在之前的docker容器中进行。

1
2
cd /apollo/third_party/ros
./build.sh build

不出意外的话,这里不会成功。。。

我遇到了以下几个错误:

缺少某些python包

报错中提示(*** not found. try to install ***),按照提示装就好了

找不到gtest

此时运行:

1
sudo apt install libgtest-dev

你可能会发现提示gtest已经安装了,且是最新版。

解决方案是先卸载gtest,再重新安装(此处参考gist):

1
2
3
4
5
6
7
8
9
10
sudo apt-get remove libgtest-dev

sudo apt-get install libgtest-dev
cd /usr/src/gtest
sudo cmake CMakeLists.txt
sudo make
sudo cp *.a /usr/lib
sudo mkdir /usr/local/lib/gtest
sudo ln -s /usr/lib/libgtest.a /usr/local/lib/gtest/libgtest.a
sudo ln -s /usr/lib/libgtest_main.a /usr/local/lib/gtest/libgtest_main.a

找不到ogg, vorbis, theora

这是在build最后一个包的时候遇到的问题,stack overflow中说这可能是CMake的bug,将其回滚到3.2.3-r1版本即可,问题链接。我尝试后发现该问题并没有被解决。但是奇怪的是更改CMake版本后,vorbistheora都能找到了,而ogg仍找不到。

解决方案是从source开始重装这几个包(参考文章):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
cd ~

wget http://downloads.xiph.org/releases/ogg/libogg-1.3.1.tar.gz
tar -xf libogg-1.3.1.tar.gz
cd libogg-1.3.1/
./configure --host=arm-unknown-linux-gnueabi --enable-static
make
sudo make install

cd ~

wget http://downloads.xiph.org/releases/vorbis/libvorbis-1.3.3.tar.gz
tar -xf libvorbis-1.3.3.tar.gz
cd libvorbis-1.3.3/
./configure --host=arm-unknown-linux-gnueabi --enable-static
make
sudo make install

cd ~

wget http://downloads.xiph.org/releases/theora/libtheora-1.1.1.tar.bz2
tar -xf libtheora-1.1.1.tar.bz2
cd libtheora-1.1.1/
./configure --host=arm-unknown-linux-gnueabi --enable-static
make
sudo make install

其他

build共有113项,我们要测试的talkerlistener大概是70项就完成了,但是如果没有完全build成功,生成的talker和listener运行也会出错(提示是无法找到动态链接库***.so)(WHY?)

运行talker和listener

生成的可执行文件talkerlistener/apollo/third_party/ros/devel_isolated/roscpp_tutorials/lib/roscpp_tutorials/文件夹下。直接运行会报错:

1
2
3
4
5
6
7
[FATAL] [1392021564.231775029]: ROS_MASTER_URI is not defined in the
environment. Either type the following or (preferrably) add this to your
~/.bashrc file in order set up your local machine as a ROS master:

export ROS_MASTER_URI=http://localhost:11311

then type 'rescore' in another shell to actually launch the master program.

解决方案是先运行setup.sh来配置环境:

1
source /apollo/third_party/ros/install/ros_x86_64/setup.sh

我执行这一步就没问题了,如果遇到其他错误可以参考这里

修改talker和listener的源代码

talker.cpplistener.cpp

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
/* ros/ros_tutorials/roscpp_tutorials/talker/talker.cpp */
#include "ros/ros.h"
#include "std_msgs/String.h"

#include <sstream>
#include <string>

int main(int argc, char **argv) {
/* Ignored... */
ros::Rate loop_rate(10);

int count = 0;
const std::string foo_content(307200, '+');
while (ros::ok()) {
std_msgs::String msg;
msg.data = foo_content;

std::stringstream ss;
ss << ros::Time::now().toNSec();
msg.data.append(ss.str());

chatter_pub.publish(msg);

ros::spinOnce();
loop_rate.sleep();

++count;
if (count == 10000) {
std::cout << "talk over" << std::endl;
break;
}
}

return 0;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
/* ros/ros_tutorials/roscpp_tutorials/listener/listener.cpp */
#include "ros/ros.h"
#include "std_msgs/String.h"

#include <string>

uint64_t count = 0;
uint64_t total_latency = 0;

void chatterCallback(const std_msgs::String::ConstPtr& msg) {
uint64_t now = ros::Time::now().toNSec();
std::string ts(msg->data, 307200);
total_latency += now - atol(ts.c_str());
++count;
}

int main(int argc, char** argv) {
ros::init(argc, argv, "listener");
ros::NodeHandle n;
ros::Subscriber sub = n.subscribe("chatter", 1000, chatterCallback);
ros::spin();
std::cout << "count: " << count
<< " avg_latency_us: " << total_latency / count / 1000 << std::endl;
return 0;
}

重新编译tutorial并测试

1
cd /apollo/third_party/ros/build_isolated/roscpp_tutorials && /apollo/third_party/ros/install/ros_x86_64/env.sh make -j8 -l8

测试方法同上

测试结果

平均latency:

Cyber RT: 687us

ROS: 1040us

提升确实很多。。。