Raspberry Pi 2 Floating-point Benchmarks

By gaeddert on March 14, 2015

Keywords: raspberry pi benchmark thread beagle bone ARM

Here you'll find a quick benchmark shootout between a few inexpensive ARM-based platforms including the Raspberry Pi B+, the Raspberry Pi 2 B (names can be confusing), and an old Beagle Bone I had lying around.

blog/raspberry-pi2-benchmarks/boards.jpg

Inexpensive ARM-based hardware. From L to R: Beagle Bone, Raspberry Pi B+, Raspberry Pi 2 B.

Just a few weeks ago the Raspberry Pi Foundation announced the second version of their super popular micro ARM-based computer. I have been using a B+ for quite some time but was quite disappointed with the performance (more on that later). When compared to the Beagle Bone the RPi B+ seemed downright anemic. The B+ only has an ARMv6 core, which means it is incapable of serious vector floating-point operations, and while its fixed-point performance is nothing to scoff at, most enthusiasts in the SDR world want the luxury of not having to deal with bit precision. And I don't blame them.

So when my Raspberry Pi 2 arrived this week, I was ready with a card already flashed with the latest OS build. And while I am an ubergeek, I'm not going to shoot some sort of RPi2 unboxing video. I have my standards. I'll leave hilarious unboxing videos of nerdy computation equipment to the professionals.

The Hardware

blog/raspberry-pi2-benchmarks/beaglebone.jpg

Beagle Bone I've had since early 2012.

blog/raspberry-pi2-benchmarks/raspberrypibplus.jpg

Raspberry Pi 1 Model B+.

blog/raspberry-pi2-benchmarks/raspberrypi2.jpg

Raspberry Pi 2 Model B. Visual differences from version 1 are subtle. Performance differences are not.

Here are the deets:

  • Beagle Bone: $89 in 2011, AM335x 720 MHz ARM Cortex-A8, 256 MB DDR2 RAM. I would have used the more modern Beagle Bone Black but I didn't have one lying around.
  • Raspberry Pi Model B+: $35. To keep the cost low the developers opted for a 700 MHz single-core ARMv6 with 512 MB SDRAM.
  • Raspberry Pi 2 Model B: $35. Like the B+, the Pi 2 has an ARM core, but it has three major advantages: a quad-core CPU, double the RAM, and an improved ARM instruction set (ARMv7 rather than ARMv6).

While it's really unfair of me to compare the Raspberry Pi 2 to the original Beagle Bone, the purpose of this post is really to see how well the Pi 2 handles basic SDR tasks.

The Benchmarks

There are plenty of commercially available tools for benchmarking processors out there. But I am interested in seeing how well these ARM devices are suited to modern software radio design. I've decided to keep the tasks simple and stay away from implementing a particular waveform. Three of the most common and computationally intense tasks for software radio are filtering, FFTs, and FEC decoding. For this entry I've decided to focus on filtering because it seems to be the most relevant for most users.

To compare these three devices I've chosen two benchmarks:

  • Single-threaded FIR filter with complex inputs, complex outputs, and real coefficients
  • Multi-threaded firfilt_crcf; same as above but with 4 threads running concurrently
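
For anyone who hasn't written one, here is a minimal sketch of what "complex inputs, complex outputs, and real coefficients" means in practice. This is not the code behind firfilt_crcf and not what I benchmarked; the function name fir_crcf_block is made up for illustration, and it assumes the filter starts with an all-zero history. It's just the textbook direct-form sum in plain C99.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper (illustration only, not a library API):
 * direct-form FIR filter with real coefficients h[0..n-1] applied to a
 * block of complex samples x[0..num-1]. Samples before x[0] are assumed
 * to be zero. Each output is y[i] = sum_k h[k] * x[i-k]. */
static void fir_crcf_block(const float *h, size_t n,
                           const float complex *x, size_t num,
                           float complex *y)
{
    for (size_t i = 0; i < num; i++) {
        float complex acc = 0.0f;
        /* only use taps whose input sample x[i-k] actually exists */
        for (size_t k = 0; k < n && k <= i; k++)
            acc += h[k] * x[i - k];   /* real * complex: 2 real multiplies */
        y[i] = acc;
    }
}
```

Because the coefficients are real, each tap costs two real multiplies rather than the four a complex-by-complex product needs, so an 81-tap filter works out to 162 real multiply-accumulates per output sample.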

The program is fairly simple: 1 or 4 independent threads are started, each of which is given an input and an output buffer. These buffers are 256 samples long, pre-filled with 128 samples, and chained together in a ring fashion such that each thread writes its output to the next thread's input. The program runs for a fixed amount of time and counts the total number of samples processed through the chain. Each thread uses a low-pass firfilt_crcf object with 81 coefficients. The benchmark results are plotted in Figure [performance], below.
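
The ring arrangement described above can be sketched roughly as follows. This is not the benchmark program itself, just an illustration with POSIX threads under some stated assumptions: the names (ring_t, worker_t, run_chain) are hypothetical, the 64-sample block size is my own choice for the sketch, the filter is a hand-rolled direct-form loop, and each thread processes a fixed number of blocks instead of running for a fixed time, which keeps shutdown simple.

```c
#include <complex.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define RING_LEN 256   /* buffer length from the post */
#define PREFILL  128   /* samples pre-loaded into each buffer */
#define BLOCK     64   /* assumed samples per read/write (not from the post) */
#define NUM_TAPS  81   /* filter length from the post */
#define MAX_THR    8   /* arbitrary cap for this sketch */

typedef struct {
    float complex   buf[RING_LEN];
    size_t          head, count;     /* read index, samples stored */
    pthread_mutex_t lock;
    pthread_cond_t  changed;
} ring_t;

typedef struct {
    ring_t *in, *out;
    const float *taps;
    unsigned long blocks, samples;   /* work to do, samples done */
} worker_t;

static void ring_init(ring_t *r)
{
    memset(r->buf, 0, sizeof r->buf);
    r->head = 0;
    r->count = PREFILL;
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->changed, NULL);
}

/* Block until BLOCK samples are available, then copy them out. */
static void ring_read(ring_t *r, float complex *dst)
{
    pthread_mutex_lock(&r->lock);
    while (r->count < BLOCK)
        pthread_cond_wait(&r->changed, &r->lock);
    for (size_t i = 0; i < BLOCK; i++)
        dst[i] = r->buf[(r->head + i) % RING_LEN];
    r->head = (r->head + BLOCK) % RING_LEN;
    r->count -= BLOCK;
    pthread_cond_broadcast(&r->changed);
    pthread_mutex_unlock(&r->lock);
}

/* Block until there is room for BLOCK samples, then copy them in. */
static void ring_write(ring_t *r, const float complex *src)
{
    pthread_mutex_lock(&r->lock);
    while (RING_LEN - r->count < BLOCK)
        pthread_cond_wait(&r->changed, &r->lock);
    for (size_t i = 0; i < BLOCK; i++)
        r->buf[(r->head + r->count + i) % RING_LEN] = src[i];
    r->count += BLOCK;
    pthread_cond_broadcast(&r->changed);
    pthread_mutex_unlock(&r->lock);
}

static void *worker_run(void *arg)
{
    worker_t *w = arg;
    /* first NUM_TAPS-1 entries hold filter history, the rest the new block */
    float complex x[NUM_TAPS - 1 + BLOCK] = {0};
    float complex y[BLOCK];
    for (unsigned long b = 0; b < w->blocks; b++) {
        ring_read(w->in, x + NUM_TAPS - 1);
        for (size_t i = 0; i < BLOCK; i++) {
            float complex acc = 0.0f;
            for (size_t k = 0; k < NUM_TAPS; k++)
                acc += w->taps[k] * x[NUM_TAPS - 1 + i - k];
            y[i] = acc;
        }
        /* keep the newest NUM_TAPS-1 inputs as history for the next block */
        memmove(x, x + BLOCK, (NUM_TAPS - 1) * sizeof x[0]);
        ring_write(w->out, y);
        w->samples += BLOCK;
    }
    return NULL;
}

/* Run num_threads workers chained in a ring; return total samples filtered. */
static unsigned long run_chain(int num_threads, unsigned long blocks,
                               const float *taps)
{
    ring_t rings[MAX_THR];
    worker_t workers[MAX_THR];
    pthread_t tids[MAX_THR];
    unsigned long total = 0;
    if (num_threads < 1 || num_threads > MAX_THR)
        return 0;
    for (int i = 0; i < num_threads; i++)
        ring_init(&rings[i]);
    for (int i = 0; i < num_threads; i++) {
        /* thread i reads ring i and writes into the next thread's ring */
        workers[i] = (worker_t){ &rings[i], &rings[(i + 1) % num_threads],
                                 taps, blocks, 0 };
        pthread_create(&tids[i], NULL, worker_run, &workers[i]);
    }
    for (int i = 0; i < num_threads; i++) {
        pthread_join(tids[i], NULL);
        total += workers[i].samples;
    }
    for (int i = 0; i < num_threads; i++) {
        pthread_mutex_destroy(&rings[i].lock);
        pthread_cond_destroy(&rings[i].changed);
    }
    return total;
}
```

To turn this into a throughput number, time a call such as run_chain(4, blocks, taps) with a wall-clock timer and divide the returned sample count by the elapsed seconds. The half-full pre-fill is what lets every thread start reading immediately while still leaving room for its neighbor to write. Build with your compiler's pthread option (for example -pthread with gcc).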

blog/raspberry-pi2-benchmarks/performance.png

Figure [performance]. Threaded performance

Looking solely at the single-threaded performance, both the Beagle Bone and the Raspberry Pi 2 drastically outperform the Raspberry Pi B+. This is primarily due to the vastly improved ARMv7 CPU architecture over the ARMv6. Because of their similar architectures and CPU speeds, the Beagle Bone is roughly on par with the Pi 2 in single-threaded performance. Furthermore, the Pi 2 gets about a 4x speedup when running with multiple threads thanks to its quad-core ARM CPU.

None of this is particularly surprising, but it is nice to see performance match expectations on a new embedded platform. Still, the looming question remains: is the RPi finally fast enough to do some respectable DSP lifting for SDR? If you're considering using this as the back-end processor for an RTL-SDR, then it's probably fine. Considering how popular and inexpensive these devices are, how easy they are to set up, and their widespread community support, I feel the Raspberry Pi is seriously lowering the barrier to entry for SDR hobbyists.