In his blog post, Pablo explains that FPGAs have two big advantages over other means of processing:
- The ability to use custom width in data to save time and space
- The ability to handle many operations in parallel, which can multiply effective throughput well beyond what a sequential processor achieves
Pablo walks you through accelerating an algorithm by executing it partially (or entirely) on the FPGA. The algorithm used is an iterative square root finder: it loops, multiplying a counter by itself and comparing the product with the input value. When the product exceeds the input, the square root is the counter minus 1. Each iteration performs one multiplication and one comparison, so the algorithm's duration depends on the input: the number of multiply-and-compare operations is roughly the square root of the input value.

When this algorithm runs on an APU or RPU, the execution time is simply the time the processor spends executing those instructions. With the accelerator, the time depends on the programmable logic instead: the processor sends the input value over AXI to the PL, the PL executes the algorithm, and the result travels back to the processor. This kind of acceleration pays off only when the time spent exchanging data is significantly lower than the execution time; otherwise, the acceleration can turn into deceleration.
To read the entire guide, head over to ControlPaths.com for a detailed example of creating custom AXI IP for accelerating your applications.