====== Using the Digital Discovery to look at Zynq boot sequence ======
~~TechArticle~~

==== Introduction ====
When developing new FPGA boards, it is important to know the specifications of the hardware on the board and to see the timing of its signals. The Digital Discovery provides a High Speed Logic Analyzer that lets you visualize and analyze the signals traveling through the board. During the development of a new Zynq board, the speed of the QSPI transactions in the boot sequence was not an evident specification. This project uses the Digital Discovery to capture the boot sequence and determine its timing.

==== Inventory ====
  * Digital Discovery
  * Zynq board with flash
    * **Note:** //This document was written using a Zybo Z7 of a revision earlier than D.0.//
  * SOIC clip if available
  * Wires

{{ :learn:instrumentation:tutorials:zynq-qspi-boot:soic_clip.png?400 |Figure 1. SOIC clip.}}

When deciding how to tackle this problem, two of the smaller instrumentation devices with a Logic Analyzer were considered: the Analog Discovery 2 and the Digital Discovery. The Digital Discovery was chosen for two reasons. First, QSPI transactions can take place at high clock speeds, over 100 MHz, so an adequate sample rate is very important. Second, thanks to its 512 MB DDR memory, the Digital Discovery can perform very large acquisitions.

===== Step 1: Connecting the Digital Discovery =====
The following connections are required:

^ QSPI signal ^ QSPI/clip pin ^ Digital Discovery pin ^
^ cs | 7 | DIO0 |
^ clk | 16 | DIO1 |
^ d0 | 15 | DIO2 |
^ d1 | 8 | DIO3 |
^ d2 | 9 | DIO4 |
^ d3 | 1 | DIO5 |
^ gnd | 10 | gnd |

{{ :learn:instrumentation:tutorials:zynq-qspi-boot:digitaldiscovery_zyboz7-demo-2000.png? |Figure 2. Connections}}

Make sure to check for signal integrity/crosstalk when using cables like this.
In some cases, twisting a signal wire with a GND wire will be needed (in this case it was the blue cs wire).

===== Step 2: QSPI script =====
A custom interpreter translates the QSPI signals into data. It is activated by adding a "Custom" channel in the Logic instrument in WaveForms. Below is the JavaScript code that interprets the QSPI signals.

<code javascript>
// rgData: input, raw digital sample array
// rgValue: output, decoded data array
// rgFlag: output, decoded flag array

var c = rgData.length; // c = number of raw samples
var pClock = false; // previous clock signal level
var iStart = 0; // used to keep track of the word start index
var cByte = 0; // byte count per transmission
var cBits = 0; // nibble counter (4 bits are sampled per clock edge)
var bValue = 0; // value variable
var fCmd = true; // not used by this script

for(var i = 0; i < c; i++){ // for each sample
    var s = rgData[i]; // current sample
    var fSelect = 1&(s>>0); // pin0 is the select signal
    var fClock = 1&(s>>1); // pin1 is the clock signal
    var fData = 1&(s>>2); // pin2 is the d0 data signal (not used by this script)
    var fData4 = 0xF&(s>>2); // pins 2-5 carry DQ0-DQ3

    if(fSelect != 0){ // select active low
        // while select inactive reset our counters/variables
        iStart = i+1; // select might become active with next sample
        cByte = 0;
        cBits = 0;
        bValue = 0;
        pClock = false;
        fCmd = true;
        continue;
    }
    if(pClock == 0 && fClock != 0){ // sample on clock rising edge
        bValue <<= 4; // shift in one nibble, most significant nibble first
        bValue |= fData4;
        cBits++;
        if(cBits==2){ // two nibbles make a full byte, so store it
            cByte++;
            // store rgValue/Flag from word start index to current sample position
            for(var j = iStart; j < i; j++){
                // Flag change will be visible on plot even when data remains constant.
                // This is useful in case we get more consecutive equal values.
                rgFlag[j] = cByte;
                rgValue[j] = bValue;
            }
            iStart = i+1; // next word might start after this sample
            cBits = 0; // reset nibble count for the next byte
            bValue = 0; // reset value variable
        }
    }
    pClock = fClock; // previous clock level
}
</code>

In parallel with this interpreter, a standard SPI interpreter can also be used to see instructions that are not sent via QSPI, for example the first read instruction.

===== Step 3: Trigger and acquisition =====
Although the maximum QSPI clock frequency is about 100 MHz, a maximum frequency of 25 MHz is used when booting. The entire boot transfer takes about 700 ms. Because of this, both a large number of samples and a decent sample rate are needed, and this is where the Digital Discovery comes in handy: 268 million samples at 200 MHz translate into a ~1.3 second frame. The acquisition itself is quite demanding, using a lot of the PC's memory (16 GB), and it also takes a long time to process the data.

The trigger is set on the falling edge of the CS signal. Below is the entire QSPI transaction captured by WaveForms.

{{ :learn:instrumentation:tutorials:zynq-qspi-boot:full_transaction.png |Figure 3. Full transaction.}}

Notice the short pause near the left end of the acquisition; that is where the clock frequency changes from 5.4 MHz to 25 MHz.

===== Step 4: Boot transfers =====
Two documents need to be read in order to understand what the data transfers represent. One is the [[https://docs.xilinx.com/v/u/en-US/ug585-Zynq-7000-TRM|Zynq TRM]] and the other is the [[https://www.infineon.com/dgdl/Infineon-S25FL128S_S25FL256S_128_Mb_(16_MB)_256_Mb_(32_MB)_3.0V_SPI_Flash_Memory-DataSheet-v18_00-EN.pdf?fileId=8ac78c8c7d0d8da4017d0ecfb6a64a17|flash memory's datasheet]].

The instructions sent from the Zynq to the flash memory are always sent via SPI using D0. The first instruction sent is 0x03 0x00 0x00 0x20, which means SPI READ from address 0x20. The reply is also received via SPI, using D1: 0x66 0x55 0x99 0xaa.
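The first transfer can be checked by hand with a few lines of JavaScript. The sketch below is only an illustration and is not part of the WaveForms script: the helper names ''decodeReadCommand'' and ''toLittleEndianWord'' are hypothetical, and the code assumes the 3-byte address is sent most significant byte first and that the reply is read as a little-endian 32-bit word.

```javascript
// Hypothetical helpers for illustration only; not part of WaveForms.

// Split a 4-byte SPI command into its opcode and 24-bit address.
// Assumes the address bytes arrive most significant byte first.
function decodeReadCommand(bytes) {
    if (bytes.length !== 4) {
        throw new Error("expected an opcode plus 3 address bytes");
    }
    var address = (bytes[1] << 16) | (bytes[2] << 8) | bytes[3];
    return { opcode: bytes[0], address: address };
}

// Combine 4 reply bytes into one 32-bit word, assuming little-endian order.
function toLittleEndianWord(bytes) {
    // ">>> 0" keeps the result unsigned when bit 31 is set
    return ((bytes[3] << 24) | (bytes[2] << 16) | (bytes[1] << 8) | bytes[0]) >>> 0;
}

// Bytes taken from the capture described above.
var cmd = decodeReadCommand([0x03, 0x00, 0x00, 0x20]);
var word = toLittleEndianWord([0x66, 0x55, 0x99, 0xaa]);

console.log("opcode 0x" + cmd.opcode.toString(16) + ", address 0x" + cmd.address.toString(16));
console.log("reply word 0x" + word.toString(16));
```

Under these assumptions the command decodes to opcode 0x03 with address 0x20, and the four reply bytes form the word 0xaa995566, which can then be compared against the value described in the Zynq TRM.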
The flash read instruction is explained on page 85 of the datasheet.

{{ :learn:instrumentation:tutorials:zynq-qspi-boot:first_transfer.png |Figure 4. First transfer.}}

Pages 170 and 179 of the Zynq TRM explain what that reply means. In short, that set of bytes tells the Zynq that the memory is QSPI capable. It is also important to observe that, at this point, the SPI clock frequency is 5.405 MHz, which is a relatively low speed.

From this point on, since it has been determined that the memory supports QSPI, all transactions return their data on all 4 data lines. For instance, the next instruction is 0x6b followed by a 3-byte address. 0x6b is a quad read instruction, and the response appears on the QSPI interpreter after 8 "dummy" clock periods.

{{ :learn:instrumentation:tutorials:zynq-qspi-boot:second_transfer.png |Figure 5. Second transfer.}}

In this case, the address is 0x1d and 7 bytes are read. The first three bytes come from addresses 0x1d, 0x1e and 0x1f, which are part of an interrupt table. The remaining 4 bytes come from address 0x20 and are the same bytes returned by the first SPI read. The Zynq then proceeds to read bytes, incrementing the address until it reaches 0x45, which is the end of the BootROM header.

Unfortunately, because we do not have access to the BootROM code, the rest of the boot sequence is not so transparent. At some point, the FSBL (first stage boot loader) begins to run, most likely where the SPI clock frequency changes to 25 MHz as seen below, 84 ms after the boot process started.

{{ :learn:instrumentation:tutorials:zynq-qspi-boot:fsbl_start.png |Figure 6. FSBL start.}}

The FSBL then reads the boot image and analyzes the different partitions that it contains, including the .bit file, which configures the Zynq's PL, and the .elf, which runs on the ARM. More details on the boot image and boot process can be found in [[https://docs.xilinx.com/v/u/en-US/ug821-zynq-7000-swdev|this user guide]].
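As a recap of the quad read described in this step, the following sketch lists which flash addresses a 7-byte read starting at 0x1d covers and roughly how many clock cycles the transaction takes. It is an illustration only: ''readAddresses'' and ''quadReadClocks'' are hypothetical helpers, and the cycle count assumes the opcode and the 3-byte address go out one bit per clock on D0, followed by 8 dummy clocks, with each data byte taking 2 clocks on the four data lines.

```javascript
// Hypothetical helpers for illustration only; not part of WaveForms.

// List the flash addresses covered by one read, assuming the address
// increments by one for every byte returned in the transaction.
function readAddresses(startAddress, byteCount) {
    var addresses = [];
    for (var i = 0; i < byteCount; i++) {
        addresses.push(startAddress + i);
    }
    return addresses;
}

// Estimate the clock cycles of one 0x6b quad read.
// Assumptions: opcode and address are sent one bit per clock on D0,
// then 8 dummy clocks, then 2 clocks per data byte (4 bits per clock).
function quadReadClocks(byteCount) {
    var opcodeClocks = 8;
    var addressClocks = 3 * 8;
    var dummyClocks = 8;
    var dataClocks = byteCount * 2;
    return opcodeClocks + addressClocks + dummyClocks + dataClocks;
}

var addrs = readAddresses(0x1d, 7);
var lastAddress = addrs[addrs.length - 1];
var clocks = quadReadClocks(7);

console.log("addresses: " + addrs.map(function (a) { return "0x" + a.toString(16); }).join(" "));
console.log("clock cycles: " + clocks);
```

With these assumptions the read spans 0x1d through 0x23, so its fourth byte is the one at address 0x20, matching the overlap with the first SPI read noted above.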
{{tag>digital-discovery project }}