January 15, 2007
Issue 6: Spotlighting FPGAs, last of 3
Performance Multipliers for Data Stream Processing
Yes, star crossed in pleasure the stream flows on by Yes, as we're sated in leisure, we watch it fly.And time waits for no one, and it won't wait for me And time waits for no one, and it won't wait for me. Time can tear down a building or destroy a woman's face Hours are like diamonds, don't let them waste. Time waits for no one, no favours has he Time waits for no one, and he won't wait for me.
- The Rolling Stones, Time Waits for No One (Jagger/Richards), from the album, "It's Only Rock 'n Roll" (1974)
We've dedicated the last several postings to the Field Programmable Gate Array (FPGA) - a key performance multiplier in the NPS® system architecture. Last time out I talked about the market growth of FPGAs as a mainstream technology in multiple application settings outside of data warehousing.
This is the last of a three-part series on FPGAs, spanning the following topics:
- "So, What Is an FPGA?" - aimed at providing a most-basic introductory primer of the technology, its capabilities and its promise (posted 28th November).
- "FPGAs in the Mainstream & Some of Their Practical Uses" - a look at the use of FPGA technology across a broad swathe of market applications. (posted 20th December)
- "OK - How Does Netezza Get a Performance Edge from FPGAs & What Does the Future Hold?" - linking FPGA capabilities to the benefits it brings to the NPS system and possible future directions it could enable.
Today, we'll dive a bit into how FPGAs enable high performance at low cost in the NPS appliance, and what types of applications the technology may enable for the NPS in the future.
OK - So how does Netezza get a performance edge from FPGAs?
A critical element of Netezza's architecture is the implementation of direct-attach storage in a massively parallel array of query processing elements. Called Snippet Processing Units (SPUs), these query processing elements collocate CPU, memory and FPGA with each disk drive. The SPUs are arranged in an array that can be as small as several dozen or as large as nearly a thousand in today's NPS systems.
A critical component of overall data warehouse performance lies in the disk bandwidth that can be applied to a given problem and in turn, the level of processing horsepower that can be applied to that data. In short-hand terms, Netezza refers to its architectural approach as "bringing the query to the data." Rather than moving vast amounts of data across high-speed interconnecting (and sometimes non-blocking) networks as other systems do, the NPS system reduces the data to the information essential to the query as close to the disk source as possible.
The focus of the architecture is to enable streaming processing of the data: eliminating unneeded data as early as possible and processing the rest as rapidly as it can be read from the disk drives. That's where the FPGA comes in. The FPGA in a Netezza SPU has two primary roles.
In the first, it acts as the disk controller, controlling all of the disk read and write activities on the SPU.
In the second, the FPGA efficiently applies low-level database primitives, offloading significant work from other processing elements in the system. As table data streams from the disk on the SPU, the FPGA applies the transaction visibility list (only transactions that were current in the database at the start of the query are visible to it) and then applies the appropriate column projection and row restriction rules. Then only data that satisfies the visibility, projection and restriction rules is sent from the FPGA to the memory and CPU on the SPU for additional processing, if necessary.
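To make the order of those three steps concrete, here is a minimal software sketch of a streaming filter. It is only an illustration of the idea, not how the FPGA logic is actually built: the function name `fpga_style_filter`, the `txn_id` field and the sample rows are all assumptions invented for this example.

```python
from typing import Callable, Dict, Iterable, Iterator, Sequence, Set, Tuple

Row = Dict[str, object]


def fpga_style_filter(
    rows: Iterable[Row],
    visible_txns: Set[int],
    restrict: Callable[[Row], bool],
    project: Sequence[str],
) -> Iterator[Tuple[object, ...]]:
    """Yield projected tuples for rows that pass visibility and restriction.

    Hypothetical sketch: rows are processed one at a time as they stream by,
    so nothing is buffered and rejected rows never reach the caller.
    """
    for row in rows:
        # 1. Visibility: skip rows written by transactions this query cannot see.
        if row["txn_id"] not in visible_txns:
            continue
        # 2. Restriction: apply the row-level (WHERE clause) predicate.
        if not restrict(row):
            continue
        # 3. Projection: pass along only the requested columns.
        yield tuple(row[col] for col in project)


# Made-up sample data; "txn_id" stands in for whatever the system tracks.
sample_rows = [
    {"txn_id": 1, "state": "FL", "gender": "F", "age": 7, "zip": 32605},
    {"txn_id": 2, "state": "FL", "gender": "M", "age": 7, "zip": 32605},  # not visible
    {"txn_id": 1, "state": "GA", "gender": "M", "age": 6, "zip": 30301},  # wrong zip
]

result = list(
    fpga_style_filter(
        sample_rows,
        visible_txns={1},
        restrict=lambda r: r["zip"] == 32605,
        project=("state", "gender", "age"),
    )
)
print(result)  # [('FL', 'F', 7)]
```

Because the function is a generator, each row is examined once as it arrives and is either dropped or trimmed down, which is the software analogue of filtering data as it streams off the disk.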
Adding to the performance boost provided by the FPGA in general, another important system feature known as "Zone Map" is realized in a software module of the NPS system known as the storage manager. We think of Zone Maps as an anti-Index in Netezza, telling the system what data not to read. For each numerical column, the Zone Map can take advantage of any natural ordering of the data in the table (e.g., date, customer number, order number, etc.) and reduce the number of data blocks read in response to a query to only those required. For example, if a query were looking for information about transactions that took place between the beginning of September and end of October, the Zone Map function of the storage manager would direct the FPGA to read only those data blocks containing records from September or October, thereby eliminating the need to perform a full disk scan for each query.
The FPGA implements the read of the appropriate disk blocks and additionally filters and projects only data relevant to the query. This can improve query-processing rates by two or more orders of magnitude.
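The Zone Map idea can be sketched in a few lines of code. The version below assumes the simplest possible layout, one minimum and one maximum value recorded per data block; the `ZoneEntry` class, the `blocks_to_read` function, the block contents and the year 2006 are all hypothetical and chosen only to mirror the September-October example above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class ZoneEntry:
    """Hypothetical zone map entry: the value range stored in one disk block."""
    block_id: int
    min_date: date
    max_date: date


def blocks_to_read(zone_map: List[ZoneEntry], low: date, high: date) -> List[int]:
    """Return only the blocks whose [min, max] range overlaps [low, high]."""
    return [
        z.block_id
        for z in zone_map
        if z.max_date >= low and z.min_date <= high
    ]


# Assumed layout: transactions were loaded roughly in date order,
# so each block covers a narrow date range.
zone_map = [
    ZoneEntry(0, date(2006, 7, 1), date(2006, 8, 15)),
    ZoneEntry(1, date(2006, 8, 16), date(2006, 9, 20)),
    ZoneEntry(2, date(2006, 9, 21), date(2006, 10, 31)),
    ZoneEntry(3, date(2006, 11, 1), date(2006, 12, 31)),
]

selected = blocks_to_read(zone_map, date(2006, 9, 1), date(2006, 10, 31))
print(selected)  # [1, 2]
```

Note that block 1 is still read even though it also holds late-August records; a range check of this kind can only rule blocks out, so the row-level filtering still has to discard the rows that fall outside the requested range.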
FPGA as performance multiplier: an example
As an example, consider the following simple SQL query:
Select state, gender, age, count(*)
From 8 billion Row Table
Where dob < '04/01/2000'
And dob > '12/31/1999'
And zip = 32605
Group by state, gender, age;
In this example, the storage manager and FPGA would use Zone Maps to first limit the disk read to only those disk extents with dates of birth occurring in the three-month period of January through March 2000, rather than the full table. Then, as the data was read from the disk, the FPGA would further restrict the rows returned to those records within the three-month range and with a zip code matching the query. Finally, the column data projected to the memory and CPU would be limited to only the state, gender and age information of each record. If the table in question contained 100 or more columns for each record, this could represent 3% or less of the column data. If one assumes the table in question contained birthdate information for just the last seven years, this would dramatically reduce the row count of data delivered to memory/CPU as well - specifically by more than 25:1, or 3 months out of 84.
Overall, for this example, the combination of Zone Maps with FPGA projecting and filtering of the data would result in just 0.1% of the full table data being sent to the memory and CPU for additional processing.
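The 0.1% figure can be checked with a quick back-of-the-envelope calculation. The sketch below multiplies the column fraction by the row fraction, assuming (these are simplifications, not statements about any real table) that all 100 columns are the same width and that birth dates are spread evenly across the seven years. The function name `fraction_delivered` is made up for this illustration.

```python
def fraction_delivered(
    cols_projected: int,
    cols_total: int,
    months_selected: int,
    months_total: int,
) -> float:
    """Rough share of table data passed on to memory/CPU.

    Assumes equal-width columns and evenly distributed dates.
    """
    column_fraction = cols_projected / cols_total    # 3 of 100 columns
    row_fraction = months_selected / months_total    # 3 of 84 months
    return column_fraction * row_fraction


fraction = fraction_delivered(3, 100, 3, 7 * 12)
print(f"{fraction:.2%}")  # 0.11%
```

This estimate counts only the date range and the column projection. The zip code restriction in the query would shrink the row count further, so the share of data actually delivered would be smaller still.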
From this, you can see how the FPGA acts as a Performance Multiplier for query processing. Before a single CPU cycle or memory location has been used, the FPGA has reduced the overall data required for processing by multiple orders of magnitude.
And what does the future hold?
As suggested by Keith Underwood of Sandia National Labs, the price-performance and power efficiency of FPGAs look like they will enjoy an order of magnitude advantage over the 'x86' CPU technology roadmaps by the end of the decade. Using these performance and I/O advantages, FPGA vendors are already able to embed CPU core technology (Xilinx - "Embedded Processing" & DSP-FPGA.com - "FPGAs - Poised to play in embedded applications") directly inside an FPGA.
[Figure: Projected FPGA Roadmap Capabilities. Source: Composite of FPGA Vendors' Historical & Roadmap Data]
We at Netezza fully expect the FPGA advantage to increase over time. Based on suppliers' and research technology roadmaps, by the end of the decade we are anticipating 5X enhancements in each of the following areas:
- cost
- available logic
- functionality per unit of power
- speed
The result will be extended, differentiated functionality introduced into current and/or future versions of FPGA technology, further increasing the price/performance and capability advantages of the NPS data warehouse appliance. Possibilities for expanded functionality include, but are certainly not limited to, in-line, streaming data compilation or encoding; advanced filtering and analytic logic operations ("Legacy FPGA Designs Can Be Migrated to Achieve Better Performance"); and even much more powerful pre-processing of query data by embedding CPU processing capabilities directly within the device ("FPGA Advances Pave The Way Toward True SoC Solutions"). If, how and when these may appear in the Netezza technology roadmap is still to be seen. However, on the strength of the FPGA technology roadmap and the technology's significant benefit to the streaming processing needs of data warehousing, it's clear to us that the FPGA will continue to play a major role with Netezza for the foreseeable future.
The technology trend for high-performance systems is clear. In more and more industry domains ("In Praise of FPGAs"), low-power programmable logic devices are going to act as either performance accelerators or even the primary performance engine. By offering high performance, low power requirements and highly flexible reprogrammability, the use of FPGAs promises to continue as a strong industry trend.
In short, we believe that the advantages that FPGA technology brings to the NPS system have 'legs'. We plan to continue to exploit those advantages for the benefit of our customers and don't intend to hide them "under a bushel" any longer.
Posted by Phil Francisco at January 15, 2007 11:00 PM