For this project we will use two of Digilent’s boards, the Eclypse Z7 and the Genesys ZU.

The Eclypse Z7 board will be used as a signal generator, producing an analog signal modulated with Frequency Shift Keying (FSK). That is, the signal will have a base frequency for logic '0' and a different frequency for logic '1'. The signal will be synthesized by the ZMOD DAC.

The signal will be acquired by the ZMOD ADC on the Genesys ZU board. Once all the samples are acquired, a Goertzel algorithm is executed in the RPU of the Zynq US+ to detect whether the frequency that activates the output is present in the signal.

FSK stands for Frequency Shift Keying, and is one of the digital modulations we can find. This modulation uses different frequencies to encode a digital signal. For example, we can encode a logic '1' when the signal has a frequency of 3 MHz, and a logic '0' when it has a frequency of 1.5 MHz.

In general, this kind of modulation works with a band of frequencies, so we can encode not only binary values but also entire bytes, where each bit is encoded with one frequency, and the decoder only has to analyze the analog signal with an FFT to get the value of each bit.

## A little bit of theory

When we want to know the value of one harmonic, we normally have to compute a DFT, which gives us the value of nsamples harmonics. Obtaining the value of all possible harmonics is not free: it translates into a long computing time and a large delay.

If we analyze the DFT equation, we can notice that to compute each harmonic (k), the operation is recursive.

We can represent this series with the following graph.

As the term W is complex, each sample requires 4 real multiplications and 4 imaginary multiplications. This can be halved by applying some simplifications, obtaining the following second-order filter.

Notice that both terms are constants, so they can be computed in advance.
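The figures that carried these equations are not reproduced in the text, so the following is only a sketch of the derivation. It assumes the conventions used by the C code later in this article, namely $W = e^{j2\pi k/N}$ and $c = 2\cos(2\pi k/N)$, with $N$ the window length; the symbol names are an assumption.

Since $W^N = 1$, harmonic $k$ of the DFT can be written as a power series in $W$:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi kn/N} = \sum_{n=0}^{N-1} x[n]\, W^{N-n}$$

Evaluating that series sample by sample gives a first-order recursion whose last output is the harmonic:

$$y[n] = W\,\bigl(x[n] + y[n-1]\bigr), \qquad y[-1] = 0, \qquad X[k] = y[N-1]$$

Its transfer function can be made to have a real denominator by multiplying numerator and denominator by $1 - W^{*}z^{-1}$:

$$H(z) = \frac{W}{1 - W z^{-1}} = \frac{W - z^{-1}}{1 - c\,z^{-1} + z^{-2}}$$

which corresponds to the second-order difference equation:

$$y[n] = W\,x[n] - x[n-1] + c\,y[n-1] - y[n-2]$$

Only the product $W\,x[n]$ involves a complex constant, and both $W$ and $c$ depend only on $k$ and $N$.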

## Download board files

First we need to download the board files for the Eclypse Z7 and Genesys ZU 3EG from Digilent's GitHub.

Once downloaded, copy the genesys-zu-3eg and eclypse-z7 folders to <vivado dir>/data/boards/board_files.

## Trick for Linux users

If you use Linux and your system language differs from your regional (locale) settings, Vivado sometimes has problems with the decimal separator. To avoid that, set the LC_NUMERIC environment variable explicitly. In my case, I added this line to .bashrc:

```bash
export LC_NUMERIC="en_US.UTF-8"
```

## FSK generator design

Let's start with the generator project. This project is based on Digilent's Eclypse Z7 board. This time we will use only one ZMOD port, where the ZMOD DAC will be connected. The DAC driver and the modulation will be implemented in the PL of the Zynq.

First, we have to create a new project and select the Eclypse Board. This project will be implemented in RTL, so in this case we do not need a block design.

To achieve high output signal frequencies, we need a fast clock. The Eclypse board has two clock sources connected to the Zynq: one goes directly to the PS, and the other comes from the Ethernet PHY and is connected to the PL. In this case, since we don't use the PS, we will use the clock connected to the PL. This clock has a frequency of 125 MHz (the clk125mhz input in the instantiation below), and we will use one MMCM to generate from it the 100 MHz clock that drives the design.

```verilog
clk_wiz_0 clk_wiz (
  .clk_out1(clk100mhz),
  .reset(1'b0),
  .locked(pll_locked),
  .clk_in1(clk125mhz)
);
```

To generate the FSK signal, one period of a sinusoidal signal is stored in a BRAM. On every clock edge, one address of the BRAM is read, and the address is then incremented for the next edge. That increment can be 1, so the read addresses are 0, 1, 2, 3, ...; 5, so the read addresses are 0, 5, 10, ...; or 10, so the read addresses are 0, 10, 20, .... This way, with the 32-sample table, we obtain 3 different frequencies: 100e6*1/32 ≈ 3.1 MHz, 100e6*5/32 ≈ 15.6 MHz and 100e6*10/32 ≈ 31.3 MHz.

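```verilog
reg [13:0] r14x32_signal [31:0];  // 32-entry table of 14-bit samples
wire [13:0] w14_signal2write;
reg [5:0] r6_mem_index;           // read index into the table

// Load the sine samples from the memory file
initial $readmemh("signal_goertzel.mem", r14x32_signal);

// Advance the read index every clock; the buttons select the step
always @(posedge clk100mhz)
  if (rst)
    r6_mem_index <= 6'd0;
  else
    if (i2_btn[0])
      r6_mem_index <= r6_mem_index+6'd10;
    else if (i2_btn[1])
      r6_mem_index <= r6_mem_index+6'd5;
    else
      r6_mem_index <= r6_mem_index+6'd1;
```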
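The relation between the index step and the output frequency can be checked with a few lines of C. This is only a sketch: it assumes the table holds exactly one sine period and that the read index wraps at the table length, and the names `dds_output_hz` and `dds_next_index` are hypothetical helpers, not part of the project.

```c
#include <stdint.h>

/* Average output frequency of a table-lookup generator.
 * One table pass takes table_len/step clock cycles, so
 * f_out = f_clk * step / table_len (integer division). */
uint32_t dds_output_hz(uint32_t clk_hz, uint32_t step, uint32_t table_len)
{
    return (uint32_t)(((uint64_t)clk_hz * step) / table_len);
}

/* Next read index, wrapping at the end of the table (assumed behaviour). */
uint32_t dds_next_index(uint32_t index, uint32_t step, uint32_t table_len)
{
    return (index + step) % table_len;
}
```

Calling `dds_output_hz(100000000u, step, 32u)` with the steps 1, 5 and 10 gives the three frequencies discussed above.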

The output range of the ZMOD DAC is configurable through an analog multiplexer, IC3. This mux sets the gain of the output signal, selecting between 2 output ranges. When the low range is set, the output voltage can go from -1.25 to 1.25 volts. For the high range, the output can go from -5 to 5 volts.

In this case, low range will be used.

The Goertzel filter will be implemented on the Genesys ZU board, using its Zynq MPSoC. This device contains a quad-core ARM A53 APU, a dual-core ARM R5 RPU for real-time purposes, and a GPU. For this project we will use the RPU.

The ZMOD ADC is based on the AD9648, an Analog Devices dual-channel ADC with a resolution of 14 bits, a 1.8 volt supply and 105 MSPS. The ADC is connected to the ZU3EG through a DDR interface.

On Zynq US+ devices, IDDRE1 primitives have to be used to create a DDR interface. These primitives are similar to the IDDR from the 7 series, but the difference makes the Eclypse driver incompatible. To make it compatible, we have to change the IDDR instantiations in the Eclypse driver to IDDRE1 primitives.

The PS has to generate 2 clock signals: one at 50 MHz for the SPI communication used to configure the ADC, and one at 100 MHz for the DDR interface and the AXI modules, which corresponds to the sampling frequency.

As we can see in the reference manual of the Genesys ZU, the power supply for SYZYGY and FMC is not enabled by default, so we have to set the correct voltage through the VADJ_LEVEL0 and VADJ_LEVEL1 pins and, once configured, trigger a falling edge on the VADJ_AUTO pin. In the case of the ZMOD ADC, the required voltage is 1.8 volts. According to the reference manual, to configure 1.8 volts, both VADJ_LEVEL0 and VADJ_LEVEL1 have to be set to '1'. The module I have developed sets the correct level on the pins and, 2.6 ms later, clears the VADJ_AUTO pin, setting the correct voltage on SYZYGY and FMC.

The Goertzel filter, like the DFT, needs a fixed number of samples to compute the selected harmonic. For this project, I've developed an AXI IP that reads a configurable number of samples and stores them in BRAM. A guide to creating a custom AXI IP can be found here.

The BRAM is configured as a true dual-port BRAM. One port is connected to my IP, and the other one is connected to a single-port AXI BRAM Controller. To me, this is the easiest way to share a medium amount of data between PL and PS.

Axi_acquisition_control has 2 input registers: one to start the acquisition, and a second one to set the length of the window.
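From the software side, driving such an IP could look like the sketch below. The register order, the word offsets and the name `acq_start` are assumptions made for illustration only; the real register map of Axi_acquisition_control is not described in this article and has to be taken from the IP itself.

```c
#include <stdint.h>

/* Hypothetical register layout, in 32-bit words from the IP base address.
 * These offsets are NOT taken from the real IP. */
enum {
    ACQ_REG_START  = 0,  /* assumed: writing 1 starts an acquisition */
    ACQ_REG_LENGTH = 1   /* assumed: number of samples in the window */
};

/* Program the window length first, then request the acquisition. */
void acq_start(volatile uint32_t *regs, uint32_t window_len)
{
    regs[ACQ_REG_LENGTH] = window_len;
    regs[ACQ_REG_START]  = 1u;
}
```

On the target, `regs` would point to the base address assigned to the IP in the Vivado address editor; writing the length before the start bit avoids starting a window with a stale length.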

## Configuring ZMOD ADC input

The ZMOD ADC has 2 relays on each analog input. For CH1, relay IC2 sets the coupling of the input, and IC1 the gain. According to the reference manual (link), if low gain is selected, the input range is ±25 volts; on the other hand, if we select high gain, the input range is reduced to ±1 V. In our case, we will select high gain, and we will make sure that the input does not reach the limit, to avoid saturation that can distort the signal.

To select low gain, we have to activate IC1. Regarding the coupling, since our signal will be centered on zero, AC coupling is optional.

The entire block diagram will include the Zynq US+ block, the ZMOD driver, the acquisition control module, the BRAM and AXI BRAM Controller to store samples, and all the AXI blocks needed.

First we have to open SDK and create a new BSP for the RPU.

Now, we have to create a new application project using the BSP we have created.

Then, we are ready to code.

The C code for the Goertzel filter is shared in the repository; we only have to add it to our project.

In the code, we first have to define the constants for the desired harmonic; in my case, I want to detect the 10th harmonic. The W term, even though it is computed only once, is complex, so we have to separate the real part and the imaginary part.

```c
/* Goertzel constant definition */
w_re10 = cos(2*pi*10/n_samples);
w_im10 = sin(2*pi*10/n_samples);
c10 = 2*cos(2*pi*10/n_samples);
```

Samples will be stored in BRAM. To access them from the RPU we have to define a pointer to the first address of the BRAM.

```c
long* x;
x = (long*)0x80000000;
```

Then, in a loop, we will compute the Goertzel filter algorithm.

```c
for (i=0; i<(n_samples*4); i=i+4) {
  y_re10 = (float)*x*w_re10-x_1+c10*y_re10_1-y_re10_2;
  y_im10 = (float)*x*w_im10+c10*y_im10_1-y_im10_2;

  y_re10_2 = y_re10_1;
  y_re10_1 = y_re10;
  y_im10_2 = y_im10_1;
  y_im10_1 = y_im10;
  x_1 = *x;

  x++; /* New sample */
}
```

The algorithm returns a numeric value for the real part and another for the imaginary part. To increase speed, in my case I do not compute the true magnitude. Instead, I add the absolute values of both parts.

To detect whether the harmonic is present or not, we have to set a threshold, whose value must be chosen according to the amplitude of the signal read by the ADC.
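The detection step described above can be sketched as follows. The names `l1_magnitude` and `tone_present` are hypothetical, and the threshold is left as a parameter because it depends on the amplitude of the acquired signal. Keep in mind that the sum of absolute values lies between the true magnitude and about 1.41 times the true magnitude, so the threshold has to allow for that spread.

```c
#include <stdbool.h>

/* Absolute value without pulling in the math library. */
static float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* Cheap magnitude estimate: |re| + |im| instead of sqrt(re^2 + im^2). */
float l1_magnitude(float re, float im)
{
    return abs_f(re) + abs_f(im);
}

/* The harmonic is considered present when the estimate exceeds the threshold. */
bool tone_present(float re, float im, float threshold)
{
    return l1_magnitude(re, im) > threshold;
}
```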

## Conclusions

The Goertzel algorithm works fine on the Zynq UltraScale+ thanks to the RPU, but it has a drawback as an edge signal processing algorithm: it needs to acquire an entire window before it can be executed. Also, even though the RPU runs at up to 500 MHz, processing all the samples takes 60 µs, which means that the maximum rate at which we can process data is 1/60 µs ≈ 16.7 kHz.

To improve on that we have several options, most of them relying on the power of the Zynq US+. First, we can execute the algorithm in the PL, so we can process each sample at the same time it is acquired. To do that, we need a clock at least twice as fast as the sampling clock, but this is not a problem because the PLL from PS to PL can reach up to 533 MHz, and as the PL of the device is big enough, we will not have timing problems.

We can also change the algorithm to a notch filter plus a peak detector. With this method, if the harmonic that we are looking for is the 10th, then depending on the Q factor of the filter we can obtain a correct signal in 2 or 3 cycles. At that moment the peak of the signal will have increased, so we can detect the value of this harmonic in 2/10 or 3/10 of the Goertzel time. Notice that with this method we do not have information about the signal phase.

Another option is to use twice the power. Zynq US+ devices have a dual-core RPU that can run in split mode. Unlike the DFT, where we can implement a decimation in time, on IIR filters we can't do that because we need the past outputs to compute the next one, but we can acquire windows at twice the frequency. In this case we will keep the delay of 60 µs, but we will improve the throughput, because the output will be updated twice as often.

As designed, SYZYGY expansion ports are a very good choice for signal processing mezzanine cards like the ZMOD ADC and ZMOD DAC. Also, the Zynq US+ has been designed with real-time signal processing in mind thanks to its RPU, so the combination of SYZYGY + Zynq US+ is a very good option to discover and test different signal processing algorithms.
