Compute Acceleration
Compute acceleration workloads today are as diverse as their end applications, ranging from financial trading and genomics to machine learning inference and training. They nonetheless share common characteristics: the types of arithmetic functions, the number formats (integer and floating point), and aggressive performance targets. Furthermore, as processing migrates closer to the edge, power consumption, thermal behavior, and performance per watt become key metrics. It is in these areas that FPGAs in general, and the Speedster7t family in particular, excel.
The Speedster7t FPGA family is optimized for high-bandwidth workloads and eliminates the performance bottlenecks associated with traditional FPGAs. Built on the TSMC 7nm FinFET process, Speedster7t FPGAs feature a revolutionary new 2D network-on-chip (2D NoC), an array of new machine learning processors (MLPs) optimized for high-bandwidth and artificial intelligence/machine learning (AI/ML) workloads, high-bandwidth GDDR6 interfaces, 400G Ethernet and PCI Express Gen5 ports. The 2D NoC connects all of the interfaces to over 80 access points in the FPGA fabric to deliver ASIC-level performance while retaining the full programmability of FPGAs. Get started today with the VectorPath accelerator card, featuring the Speedster7t FPGA.
Speedster7t Solution
- Speedster7t FPGAs provide a high-performance, power-efficient computational acceleration solution for defense, financial, medical, scientific, oil and gas, and life science applications, including:
  - Machine learning (ML) inference and edge training
  - Financial analysis and high-frequency trading
  - Genomic analysis
  - Video and image processing
- The inherent parallelism and flexibility of the FPGA architecture are well suited to these high-throughput applications.
- High-speed interfacing is provided by PCIe Gen5 connectivity and high-performance Ethernet, along with a dedicated 2D network-on-chip (NoC) for high-bandwidth data movement.
- Large data sets can be stored using DDR4/5 interfaces for bulk storage and GDDR6 interfaces for high-bandwidth access to external memory.
- Data processing supports a wide variety of number formats, from low-bit-width integer math to high-performance floating-point operations, including native support for matrix multiplications and complex arithmetic (for example, to support beamforming applications).
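To make the low-bit-width integer matrix math mentioned above concrete, here is a minimal software sketch of an int8 matrix-vector product built from multiply-accumulate steps. It is purely illustrative and does not model the MLP hardware or any Achronix tool flow; the function names `check_int8` and `int8_matvec` and the sample values are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127


def check_int8(values):
    """Raise ValueError if any value falls outside the signed 8-bit range."""
    for v in values:
        if not INT8_MIN <= v <= INT8_MAX:
            raise ValueError(f"{v} does not fit in int8")


def int8_matvec(weights, activations):
    """Multiply an int8 matrix by an int8 vector using wide accumulation."""
    check_int8(activations)
    outputs = []
    for row in weights:
        if len(row) != len(activations):
            raise ValueError("row length must match vector length")
        check_int8(row)
        acc = 0  # Python ints are unbounded; this stands in for a wide accumulator
        for w, x in zip(row, activations):
            acc += w * x  # one multiply-accumulate step
        outputs.append(acc)
    return outputs


weights = [[1, -2, 3], [4, 5, -6]]
activations = [7, 8, 9]
print(int8_matvec(weights, activations))  # [18, 14]
```

Each output element is an independent chain of multiply-accumulate steps, which is why this kind of workload maps naturally onto parallel hardware.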
Speedster7t FPGAs are particularly well suited to ML inference and edge analytics operations.
Application Requirements | Speedster Value |
---|---|
Need for high bandwidth external connectivity | Multiple ports of 400G Ethernet and PCIe Gen5 |
Highest memory bandwidth for buffering, >1 Tbps | Up to 16 independent GDDR6 channels at 16 Gbps offering up to 4 Tbps of total bandwidth |
Wide and high-performance datapath | Dataflow optimized for compute acceleration matrix vector mathematics |
Significant computational requirement for integer arithmetic | |
Neural network inferencing requires a large number of matrix multiplications, high-performance computation and significant amounts of data movement | Optimized multiply-accumulate core for integer and floating-point arithmetic |
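The GDDR6 figure in the table above can be sanity-checked with a short calculation. The sketch below assumes that "16 Gbps" is the per-pin data rate and that each GDDR6 channel is 16 bits wide; the channel width is an assumption taken from the usual GDDR6 device organization, not from this page. The function and constant names are illustrative.

```python
def aggregate_bandwidth_gbps(channels, pin_rate_gbps, bits_per_channel):
    """Total bandwidth in Gbps = channels x per-pin rate x channel width."""
    return channels * pin_rate_gbps * bits_per_channel


CHANNELS = 16          # from the table: up to 16 independent GDDR6 channels
PIN_RATE_GBPS = 16     # from the table: 16 Gbps (assumed to be per pin)
BITS_PER_CHANNEL = 16  # assumption: 16-bit channel width, not stated on this page

total_gbps = aggregate_bandwidth_gbps(CHANNELS, PIN_RATE_GBPS, BITS_PER_CHANNEL)

print(f"{total_gbps} Gbps")              # 4096 Gbps
print(f"{total_gbps / 1000:.1f} Tbps")   # 4.1 Tbps
print(f"{total_gbps // 8} GB/s")         # 512 GB/s
```

Under these assumptions the total comes to roughly 4 Tbps, consistent with the stated figure and well above the >1 Tbps buffering requirement in the left-hand column.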
Feature | Machine Learning / Deep Learning | High Performance Compute | Genomics | Video & Image Processing |
---|---|---|---|---|
Highest Performance SerDes | ||||
112G multi-standard SR/MR/LR PHY | Yes | Yes | Yes | Yes |
Most Advanced Interface IP | ||||
PCIe Gen5 | Yes | Yes | Yes | Yes |
GDDR6 - 4 Tbits/sec of memory bandwidth | Yes | Yes | Yes | Yes |
DDR4 - up to 3,200 MHz, 3DS stacked memory | Yes | Yes | Yes | Yes |
DDR5 - up to 4,400 MHz | Yes | Yes | Yes | |
Application specific interface | Yes | Yes | ||
Terabit Speed Routing | ||||
NoC | Yes | Yes | Yes | Yes |
Bus routing | Yes | Yes | ||
Fully flexible bit-wise routing | Yes | |||
High-Throughput Processing | ||||
Datapath crypto | Yes | Yes | ||
MLP | Yes | Yes | Yes | Yes |
Fine grain hardware reprogrammability (examples listed) | Format conversion, activation function | Monte Carlo analysis | PairHMM algorithm | Custom codecs |