About CMSIS DSP

ARM Ltd has developed a range of optimized DSP functions for all of the Cortex MCUs. I have found them a challenge to use in "baremetal" gcc-based projects as they rely on a particular directory structure and certain compiler directives. The project outlined below tackles these difficulties and implements a real-time Finite Impulse Response (FIR) filter on an STM32F303 Nucleo board. You can download the CMSIS libraries from https://github.com/ARM-software/CMSIS (download zip).
The program behaved as hoped - resulting in a very sharp cutoff at 1kHz.

Directory structure

The CMSIS library zip file was extracted and placed into a file system hierarchy as shown below. The project in this article is in the 'fir' directory as shown. The relative placement of these directories is important and is reflected in the variables used in the Makefile shown later.

Makefile

The full Makefile is listed below, followed by an explanation of the CCFLAGS and LIBSPEC variables.

# Specify the compiler to use
CC=arm-none-eabi-gcc
# Specify the assembler to use
AS=arm-none-eabi-as
# Specify the linker to use
LD=arm-none-eabi-ld

CCFLAGS=-mcpu=cortex-m4 -mthumb -g -mfloat-abi=hard -fsingle-precision-constant -mfpu=fpv4-sp-d16  -I ../../CMSIS/CMSIS-master/CMSIS/Include -D ARM_MATH_CM4  -D __FPU_PRESENT=1
# Tell the linker where to find the libraries -> important: use thumb versions
LIBSPEC=-L /usr/local/gcc-arm-none-eabi/lib/gcc/arm-none-eabi/4.8.3/armv7-m -L../../CMSIS/CMSIS-master/CMSIS/Lib/GCC -larm_cortexM4lf_math 

# List the object files involved in this project
OBJS=	init.o \
		serial.o \
		main.o 

# The default 'target' (output) is main.elf and it depends on the object files being there.
# These object files are linked together to create main.elf
main.elf : $(OBJS)
	$(LD) $(OBJS) $(LIBSPEC) -lgcc -T linker_script.ld --cref -Map main.map -nostartfiles -o main.elf
# The object file main.o depends on main.c.  main.c is compiled to make main.o
main.o: main.c
	$(CC) -c $(CCFLAGS) main.c -o main.o

init.o: init.c
	$(CC) -c $(CCFLAGS) init.c -o init.o
serial.o: serial.c
	$(CC) -c $(CCFLAGS) serial.c -o serial.o


# if someone types in 'make clean' then remove all object files and executables
# associated with this project
clean: 
	rm $(OBJS) 
	rm main.elf 


The compiler flags (CCFLAGS)

-mcpu=cortex-m4: The STM32F303 has an ARM Cortex M4F core
-mthumb: generate Thumb rather than ARM machine code
-g: generate debugging information
-mfloat-abi=hard: generate code that uses hardware floating point
-mfpu=fpv4-sp-d16: this is the particular floating point unit in the 'F303
-fsingle-precision-constant: treat floating point constants as single precision
-D ARM_MATH_CM4: This is required by the CMSIS library - it causes particular sections to be included
-D __FPU_PRESENT=1: Tells CMSIS library that there is an FPU available
-I ../../CMSIS/CMSIS-master/CMSIS/Include: Header files for CMSIS are back up in these directories

The library specification (LIBSPEC)

-L /usr/local/gcc-arm-none-eabi/lib/gcc/arm-none-eabi/4.8.3/armv7-m: Tell the linker where to find libraries like gcc
-lgcc: (appears later in makefile) Include code from the gcc library e.g. long divide
-L../../CMSIS/CMSIS-master/CMSIS/Lib/GCC: Tell the linker where to find the CMSIS maths libraries
-larm_cortexM4lf_math: Include code from the CMSIS Cortex M4 library

The code

This program implements an FIR filter. The ARM CMSIS code for FIR filters processes data in blocks. The program must acquire a block of data, pass it to the FIR function, and then output the resulting data. In a real-time application you can use double buffering with block data processing to achieve continuous data flows. Data is acquired and output on a timed interrupt basis. At each interrupt, data is inserted into an input buffer and read from an output buffer. When the input buffer is full (and the output buffer is empty), the program switches to a different pair of input/output buffers and passes the freshly acquired data to the data processing function. The next time the input buffer is full these original buffers are swapped back and acquisition/output continues in this vein with more buffer swapping. Two input buffers and two output buffers are used by this program. Each buffer holds 256 samples (some experimenting should be done here to optimize this buffer size).
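The buffer swapping described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the article's actual code: the names sample_tick, process_if_ready, copy_block and data_ready are hypothetical, and copy_block merely stands in for the filter call. The sketch runs on a PC with no hardware attached.

```c
#include <stddef.h>

#define BLOCK_SIZE 256  /* samples per buffer, as in the article */

/* Two input and two output buffers, used as two ping-pong pairs. */
static float in_buf[2][BLOCK_SIZE];
static float out_buf[2][BLOCK_SIZE];

static volatile int active = 0;         /* pair currently owned by the "ISR" */
static volatile int data_ready = 0;     /* set when a full pair is waiting   */
static volatile size_t sample_idx = 0;  /* position within the active pair   */

/* Hypothetical per-sample routine: on the target this would run in the
   timer interrupt. It stores one input sample and returns the output
   sample held at the same position of the active output buffer. */
float sample_tick(float adc_sample)
{
    float dac_sample = out_buf[active][sample_idx];
    in_buf[active][sample_idx] = adc_sample;
    sample_idx = sample_idx + 1;
    if (sample_idx == BLOCK_SIZE) {
        sample_idx = 0;
        active = active ^ 1;  /* acquisition moves to the other pair  */
        data_ready = 1;       /* the pair just filled can be processed */
    }
    return dac_sample;
}

/* Placeholder for the filter: copies input to output unchanged. */
void copy_block(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

/* Hypothetical main-loop step: process the pair the "ISR" has just left.
   Returns 1 if a block was processed, 0 if nothing was waiting. */
int process_if_ready(void (*process)(const float *, float *, size_t))
{
    if (!data_ready) {
        return 0;
    }
    data_ready = 0;
    int full = active ^ 1;  /* the pair that is not being filled right now */
    process(in_buf[full], out_buf[full], BLOCK_SIZE);
    return 1;
}
```

With this arrangement a sample fed in during one block comes back out two blocks later, so the latency is two buffer lengths. The processing step must also finish before the other pair fills up, otherwise data is overwritten.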
The SysTick timer is used to generate a 20kHz interrupt rate. During the interrupt service routine, the ADC is read and the DAC is written; a count is maintained and when the input buffer is full, the buffers are swapped. The example code normalizes the sample values on input and denormalizes them on output. Strictly speaking this is not necessary and the values of 0 to 4095 from the ADC could be passed to the filter function - indeed many would say that doing floating point in an interrupt service routine is bad. Either way, the filter works fine although the SysTick ISR executes a lot faster without the (de)normalization. The actual FIR function is carried out in the main loop. The SysTick interrupt service routine uses a shared global variable (DataReady) to trigger the main loop into running the FIR function.
The code includes serial I/O functions for debugging and uses D13 (the LED) for performance measurement using an oscilloscope.
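As a rough illustration of the normalization step, the sketch below maps 12-bit converter codes to floats and back. The 0 to 1 range, the clamping and the function names adc_normalize and dac_denormalize are assumptions made for this example; the article's code may scale differently.

```c
#include <stdint.h>

#define ADC_FULL_SCALE 4095.0f  /* 12-bit converter: codes 0..4095 */

/* Map a raw 12-bit ADC code to a float in the range 0.0 to 1.0
   (assumed range, chosen for illustration). */
float adc_normalize(uint16_t code)
{
    return (float)code / ADC_FULL_SCALE;
}

/* Map a float back to a 12-bit DAC code. Values outside 0.0..1.0 are
   clamped, and 0.5 is added so the conversion rounds to nearest. */
uint16_t dac_denormalize(float x)
{
    if (x < 0.0f) {
        x = 0.0f;
    }
    if (x > 1.0f) {
        x = 1.0f;
    }
    return (uint16_t)(x * ADC_FULL_SCALE + 0.5f);
}
```

Because an FIR filter is linear, scaling the input only scales the output by the same factor, which is why the raw 0 to 4095 values could be filtered directly.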
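To make the block-based processing in this section concrete, here is a plain C reference version of a block FIR filter that carries its input history from one block to the next. It is a sketch for checking results on a PC, not the CMSIS implementation: the type fir_ref_t and the functions fir_ref_init and fir_ref_block are hypothetical names, and the code makes no attempt at the optimizations the library provides.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TAPS 128  /* enough for the 128-tap filter in this article */

typedef struct {
    size_t num_taps;
    const float *coeffs;          /* b[0] .. b[num_taps-1]                */
    float history[MAX_TAPS - 1];  /* last num_taps-1 inputs, oldest first */
} fir_ref_t;

/* Returns 0 on success, -1 if num_taps is out of range. */
int fir_ref_init(fir_ref_t *f, const float *coeffs, size_t num_taps)
{
    if (num_taps == 0 || num_taps > MAX_TAPS) {
        return -1;
    }
    f->num_taps = num_taps;
    f->coeffs = coeffs;
    memset(f->history, 0, sizeof f->history);
    return 0;
}

/* Computes y[i] = sum over k of b[k] * x[i-k] for one block of n samples.
   Samples before the start of the block come from the saved history.
   src and dst must not overlap. */
void fir_ref_block(fir_ref_t *f, const float *src, float *dst, size_t n)
{
    size_t hist_len = f->num_taps - 1;

    for (size_t i = 0; i < n; i++) {
        float acc = 0.0f;
        for (size_t k = 0; k < f->num_taps; k++) {
            float x;
            if (i >= k) {
                x = src[i - k];                       /* inside this block */
            } else {
                x = f->history[hist_len - (k - i)];   /* from earlier block */
            }
            acc += f->coeffs[k] * x;
        }
        dst[i] = acc;
    }

    /* Save the most recent hist_len inputs for the next call. */
    if (n >= hist_len) {
        memcpy(f->history, src + (n - hist_len), hist_len * sizeof(float));
    } else {
        memmove(f->history, f->history + n, (hist_len - n) * sizeof(float));
        memcpy(f->history + (hist_len - n), src, n * sizeof(float));
    }
}
```

For comparison, the float version of the CMSIS routine is arm_fir_f32, set up with arm_fir_init_f32. Its documentation asks for the coefficients in time-reversed order and for a state buffer of numTaps + blockSize - 1 floats. With a symmetric low-pass design the reversal makes no difference to the result.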

The filter coefficients

Octave was used to generate the filter coefficients as follows:
pkg load signal;
FcN=1000/10000;
b=fir1(127,FcN);
The coefficients were written to a CSV file as follows:
csvwrite('/tmp/b.txt',b);
The cutoff frequency (FcN) is expressed as a 'Normalized' value, i.e. divided by the Nyquist frequency (sample rate divided by 2). With a 20kHz sample rate the Nyquist frequency is 10kHz, so a 1kHz cutoff gives FcN = 0.1.
The filter order is 127, which requires 128 taps.
The contents of this file were then pasted into the declaration of the filter coefficients in the program code.

Get the code

You can download the program code here. The CMSIS library should be downloaded from the github link above.