# Raspberry Pi 2 Floating-point Benchmarks

By Joseph D. Gaeddert on March 14, 2015

Here you'll find a quick benchmark shootout between a few inexpensive ARM-based platforms including the Raspberry Pi B+, the Raspberry Pi 2 B (names can be confusing), and an old Beagle Bone I had lying around.

Inexpensive ARM-based hardware. From L to R: Beagle Bone, Raspberry Pi B+, Raspberry Pi 2 B.

Just a few weeks ago the Raspberry Pi Foundation announced the second version of their super popular micro ARM-based computer. I have been using a copy of the B+ for quite some time but was quite disappointed with the performance (more on that later). When compared to the Beagle Bone, the RPi B+ seemed downright anemic. The B+ only has an ARMv6, which means it is incapable of serious vector floating-point operations, and while its fixed-point performance is nothing to scoff at, most enthusiasts in the SDR world want the luxury of not having to deal with bit precision. And I don't blame them.

So when my Raspberry Pi 2 arrived this week, I was ready with a card already flashed with the latest OS build. And while I am an ubergeek, I'm not going to shoot some sort of RPi2 unboxing video. I have my standards. I'll leave hilarious unboxing videos of nerdy computation equipment to the professionals.

## The Hardware

Beagle Bone I've had since early 2012.

Raspberry Pi 1 Model B+.

Raspberry Pi 2 Model B. Visual differences from version 1 are subtle. Performance differences are not.

Here are the deets:

• Beagle Bone: $89 in 2011, AM335x 720 MHz ARM Cortex-A8, 256 MB DDR2 RAM. I would have used the more modern Beagle Bone Black but I didn't have one lying around.
• Raspberry Pi Model B+: $35. To keep the cost low the developers opted for a 700 MHz single-core ARMv6 with 512 MB SDRAM.
• Raspberry Pi 2 Model B: $35. Like the B+, the Pi 2 has an ARM core, but it has three major advantages: quad-core CPU, double the RAM, and an improved ARM instruction set.

While it's really unfair of me to compare the Raspberry Pi 2 to the original Beagle Bone, the purpose of this post is really to see how the Pi 2 fares at basic SDR tasks.

## The Benchmarks

There are plenty of commercially available tools for benchmarking processors out there. But I am interested in seeing how well these ARM devices are suited to modern software radio design. I've decided to keep the tasks simple and stay away from implementing a particular waveform. Three of the most common and computationally intense tasks for software radio are filtering, FFTs, and FEC decoding. For this entry I've decided to focus on filtering because it seems to be the most relevant to most users.

To compare these three devices I've chosen two benchmarks:

• Single-threaded FIR filter with complex inputs, complex outputs, and real coefficients
• Multi-threaded firfilt_crcf; same as above but with 4 threads running concurrently
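To make the filter operation concrete, here is a minimal pure-Python sketch of a direct-form FIR filter with complex inputs, complex outputs, and real coefficients. It is only an illustration of the arithmetic, not the firfilt_crcf implementation used in the benchmark; the function name `fir_filter_crcf` and the 4-tap moving-average coefficients are hypothetical choices made for the example.

```python
from collections import deque

def fir_filter_crcf(coeffs, samples):
    """Direct-form FIR: complex samples in, real coefficients, complex samples out.

    Illustrative stand-in only; this is not the benchmark's filter object.
    """
    # Delay line holds the most recent len(coeffs) inputs, newest first.
    delay = deque([0j] * len(coeffs), maxlen=len(coeffs))
    out = []
    for x in samples:
        # Pushing on the left drops the oldest sample off the right.
        delay.appendleft(complex(x))
        # y[n] = sum over k of h[k] * x[n-k]
        out.append(sum(h * v for h, v in zip(coeffs, delay)))
    return out

# Hypothetical example: 4-tap moving average applied to a constant input.
h = [0.25, 0.25, 0.25, 0.25]
y = fir_filter_crcf(h, [1 + 1j] * 6)
print(y)
```

Because the coefficients are real, each tap costs two real multiplies per sample rather than the four needed for a full complex multiply, which is part of why this filter type is a common baseline for SDR work.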

The program is fairly simple: one or four independent threads are started, each of which is given an input and an output buffer. These buffers are 256 samples long, pre-filled with 128 samples, and chained together in a ring such that each thread writes its output to the next thread's input. The program runs for a fixed amount of time and counts the total number of samples processed through the chain. Each thread uses a low-pass firfilt_crcf object with 81 coefficients. The benchmark results are plotted in Figure [performance], below.
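The ring structure described above can be sketched roughly as follows. This is a hedged Python illustration of the layout only, not the original benchmark program: `run_ring`, the use of `queue.Queue` as the buffer, the sample-at-a-time hand-off, and the 81-tap moving-average coefficients are all assumptions made for the example. Note also that CPython threads do not execute Python code in parallel, so this shows how the threads and buffers are wired together rather than reproducing any multi-core speedup.

```python
import queue
import threading
import time

BUF_LEN = 256   # buffer capacity, as described in the post
PREFILL = 128   # samples pre-loaded into each buffer

def run_ring(num_threads, duration, taps):
    """Run num_threads filter stages chained in a ring; return samples processed."""
    bufs = [queue.Queue(maxsize=BUF_LEN) for _ in range(num_threads)]
    for b in bufs:
        for _ in range(PREFILL):
            b.put(1 + 0j)
    counts = [0] * num_threads          # one counter per thread, no sharing
    stop = threading.Event()

    def worker(i):
        # Each stage reads its own buffer and writes to the next stage's buffer.
        src, dst = bufs[i], bufs[(i + 1) % num_threads]
        delay = [0j] * len(taps)
        while not stop.is_set():
            try:
                x = src.get(timeout=0.05)
            except queue.Empty:
                continue
            delay = [x] + delay[:-1]    # shift the new sample into the delay line
            y = sum(h * v for h, v in zip(taps, delay))
            while not stop.is_set():
                try:
                    dst.put(y, timeout=0.05)
                    counts[i] += 1
                    break
                except queue.Full:
                    continue

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    time.sleep(duration)                # fixed run time
    stop.set()
    for t in threads:
        t.join()
    return sum(counts)

# Placeholder low-pass: 81-tap moving average (assumed, not the author's design).
taps = [1.0 / 81] * 81
single_total = run_ring(1, 0.2, taps)
quad_total = run_ring(4, 0.2, taps)
print("1 thread :", single_total, "samples")
print("4 threads:", quad_total, "samples")
```

Pre-filling each buffer to half capacity means no stage starts out starved or blocked: every thread has input waiting and room to write, so the ring can begin circulating samples immediately.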