New FPGA Architectures Address Wide Gating Functions
By Om Agrawal, Vice President and Chief Technical Officer (om.agrawal@vantis.com), and Bill Harding, Senior Product Marketing Engineer (bill.harding@vantis.com), Vantis Corp.
Introduction
The explosive growth in FPGA density, coupled with recent advances in FPGA architecture, has led to speculation that FPGAs will displace CPLDs in virtually all PLD applications. Such speculation ignores fundamental differences between FPGA and CPLD architectures that make one device more appropriate than the other for a given application. While some integration of functionality will occur, a CPLD's wide logic and interconnect structure give it high, predictable performance from input pin to output pin, regardless of design complexity. An FPGA's flexible interconnect structure makes overall performance design-dependent, and performance is usually measured as the highest clock frequency that can be applied to the design without violating setup and hold times. Any design-independent measure of pin-to-pin delay in an FPGA is meaningless. Timing analysis tools, not data sheet tables, tell designers how fast their circuits will run in an FPGA.
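To make the idea of design-dependent performance concrete, here is a minimal Python sketch of how a timing tool might derive the highest usable clock frequency from register-to-register path delays. The function names and all delay values are hypothetical, and the model deliberately ignores clock skew and hold-time checks.

```python
def max_clock_mhz(clk_to_q_ns, logic_ns, routing_ns, setup_ns):
    """Highest clock frequency (MHz) that one register-to-register path allows.

    The clock period must cover the source flip-flop's clock-to-output delay,
    the logic delay, the routing delay, and the destination setup time.
    """
    period_ns = clk_to_q_ns + logic_ns + routing_ns + setup_ns
    return 1000.0 / period_ns


def design_fmax_mhz(paths):
    """The slowest path in the design limits the whole design."""
    return min(max_clock_mhz(*path) for path in paths)


# Hypothetical delays in nanoseconds: (clk_to_q, logic, routing, setup).
paths = [
    (1.0, 3.0, 2.0, 1.0),  # short path through one logic level: 7 ns period
    (1.0, 6.0, 5.0, 1.0),  # cascaded path with more routing: 13 ns period
]

print(round(design_fmax_mhz(paths), 1))  # 76.9
```

Changing the placement or the number of cascaded levels changes the logic and routing terms, which is why the result depends on the design and not on a data sheet figure.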
An architectural difference that plays a major role in determining the types of designs that a PLD can support is the number of registers that a device has relative to the amount of logic that it can implement. CPLDs tend to have relatively few registers while FPGAs have a considerably larger register set. This arrangement works very well when CPLDs are used to implement state machines and FPGAs are used to handle data path applications.
Finally, reprogrammable CPLDs usually have non-volatile EEPROM internal configuration memories while reprogrammable FPGAs (by far the largest segment of the FPGA market) have volatile SRAM internal configuration memories. Once its configuration program is loaded, a CPLD is fully functional whenever power is applied. An FPGA, however, loses its configuration program whenever power is turned off, thus requiring reconfiguration whenever power is applied. For this reason alone, many control applications cannot be integrated into an FPGA and must use a CPLD.
On the other hand, the more flexible SRAM internal configuration memory used by FPGAs allows them to be rapidly reconfigured by a host microprocessor within a system, even during normal system operation. This is a significant advantage for data path applications where reconfiguration "on the fly" may be beneficial, and may also be used by control functions that can otherwise tolerate FPGA characteristics.
Trends
System-level integration pulls diverse types of logic into a single chip. This diagram shows control logic (wide gating functions), FIFO and SRAM (embedded memory), data path logic (narrow gating), a core (predefined IP), and glue logic (random) integrated on a single chip. Neither traditional CPLD nor FPGA architectures can handle this level of integration. This level demands a new class of PLD architecture.
Therefore, if a control or state machine function gets many of its inputs from an FPGA, it may be possible to integrate that function into the FPGA that is the source of the input signals. The control function output signals may either exit the FPGA for use elsewhere in the system or become inputs to other functions within the FPGA.
The problem with the above scenario is that traditional narrow-gate FPGA architectures do not handle wide-gate functions very efficiently. To generate wide-gate functions, the narrow-gate blocks in a traditional FPGA must be cascaded through several levels of logic, with each level degrading performance.
If any traditional CPLD functions are to be integrated into FPGAs, the challenge is to create an FPGA architecture that can handle wide-gating functions efficiently.
This diagram shows how a single CPLD macrocell can implement a 20 product-term (32 to 36 input) function in one logic level, while an FPGA must cascade two or more levels to handle wide-gating functions. However, the VF1 FPGA shown in this diagram implements a 16-input function using a single building block without resorting to general routing resources. While the implementation is slower than the CPLD, it is faster than FPGAs that must cascade multiple building blocks to achieve the same circuit.
Approaches to Implementing Wide Gating Functions in an FPGA
Fixed Granularity FPGA Architecture
Applications that need four or fewer inputs are easily implemented in the fixed granularity architecture. While a four-input application makes the most efficient use of the building block, two- or three-input functions can be implemented with a 33% to 50% sacrifice of the LUT resource. Also, when an application calls for anything wider than four inputs, multiple four-input blocks may be cascaded to create whatever level of logic is needed.
On the plus side, the fixed-granularity 4LUT architecture is well known and makes efficient use of silicon. It scales easily to virtually any density. Because it is a simple architecture, the development software that supports it is also simple and can scale with the architecture.
On the other hand, cascading requires extensive use of general routing resources. As a result, performance suffers when 4LUT building blocks are cascaded to build the wide functions of the type required for CPLD applications. While process technology advances give this architecture better performance than it had in older technologies, it usually isn't enough of an increase to overcome the performance degradation caused by cascading. As a result, the fixed granularity architecture may prove unacceptable for integrating CPLD applications into an FPGA.
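The cost of cascading can be estimated with a short Python sketch. It assumes the wide function is a simple AND or OR (as in a product term) built as a straightforward tree of 4-input LUTs; the function name is hypothetical, and arbitrary wide functions may need more LUTs than this simple tree suggests.

```python
def cascade_cost(n_inputs, lut_size=4):
    """Return (levels, luts) for an n-input AND/OR built as a tree of LUTs.

    Each level groups the remaining signals into bundles of up to lut_size
    and replaces each bundle with one LUT output.
    """
    if n_inputs < 1 or lut_size < 2:
        raise ValueError("need at least one input and a LUT size of 2 or more")
    levels = 0
    luts = 0
    signals = n_inputs
    while signals > 1:
        signals = -(-signals // lut_size)  # ceiling division
        luts += signals
        levels += 1
    return levels, luts


for width in (4, 16, 32):
    print(width, cascade_cost(width))
# 4 (1, 1)
# 16 (2, 5)
# 32 (3, 11)
```

Under these assumptions, a 32-input gate needs three LUT levels, and every hop between levels passes through general routing, which is where the performance is lost.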
Mixed Logic FPGA Architecture
In this approach, both CPLD structures and FPGA architectures share a global, hierarchical FPGA-like interconnect structure for inputs and outputs. A control function implemented in a CPLD structure may receive its inputs from FPGA-based functions elsewhere in the FPGA, and may send its outputs to other FPGA or CPLD logic within the chip or directly to output pins. The global interconnect structure provides the lines of communications.
This approach has obvious advantages. It allows CPLD blocks to handle wide gating functions without cascading FPGA building blocks, thus improving performance within the local CPLD block. It also provides a level of predictability within the CPLD blocks that is not possible in classic FPGA architectures.
This approach has equally obvious, and in some cases not so obvious, disadvantages. The most obvious is that the hierarchical FPGA-like interconnect structure dampens the performance of the CPLD blocks and decreases their predictability. An embedded CPLD cannot possibly match the performance of a stand-alone CPLD. For applications where this is not a serious problem, this class of FPGAs may be a good choice. For those applications where stand-alone CPLD performance is needed, stand-alone CPLDs will be used.
A less obvious disadvantage is that mixing different types of architectures on a single chip puts a heavy burden on logic synthesis and place-and-route software. Providing efficient synthesis or place-and-route support is complex enough when dealing with a single type of CPLD or FPGA logic. New software must be developed to support these new merged architectures, and simply joining CPLD place-and-route software with FPGA software in a single package will not solve the problem. For example, how does the software recognize CPLD functions and assign them to CPLD blocks while assigning FPGA functions to FPGA blocks? This isn't a problem for stand-alone CPLD or FPGA software because this decision is made by the system designer before submitting the design to the software. New HDL coding styles may be required to allow the system designer to direct assignments of various blocks of logic.
Perhaps the toughest problem for the FPGA/CPLD architecture designers is determining the ratio of FPGA logic blocks to CPLD logic blocks. Any ratio that you choose will be exactly right for a small percentage of designs, almost right for a somewhat larger percentage, and wrong for an even larger percentage. The result will be inefficient use of silicon for most designs.
When integrating CPLD and FPGA architectures on a single chip, it's most likely that all the logic blocks in the integrated architecture chip will use volatile, SRAM-based internal configuration memories. To do otherwise would limit the device to being manufactured in a process technology that implements non-volatile memory cells. This should not be a problem for most CPLD applications, except for those applications where the function must be operational immediately on system power up. In those cases, stand-alone CPLDs will be used.
Variable Granularity FPGA Logic
Variable granularity allows both wide- and narrow-gating functions to be implemented using the same logic without cascading. In the Vantis VF1 FPGA architecture, for example, the lowest level building block is a 3LUT, with two 3LUTs sharing a single flip-flop. These two 3LUTs can be combined to create a classic 4LUT-plus-flip-flop architecture. But they can be combined even further to create 5LUT and 6LUT functions in a single level of logic, using only very high-performance multiplexers to establish the connections between them.
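One standard way to build a 4-input function from two 3LUTs and a multiplexer is Shannon expansion on the fourth input. The Python sketch below illustrates that general technique; it is an assumption about how such a combination can work, not a description of the actual VF1 circuit, and all function names are hypothetical.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Build a truth table (list of 0/1) for func; input 0 is the index LSB."""
    table = []
    for index in range(2 ** n_inputs):
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def read_lut(table, bits):
    """Look up the LUT output for the given input bits."""
    index = sum(bit << i for i, bit in enumerate(bits))
    return table[index]


def split_into_3luts(func4):
    """Shannon expansion on input d: one 3LUT for d=0 and one for d=1."""
    lut_d0 = make_lut(lambda a, b, c: func4(a, b, c, 0), 3)
    lut_d1 = make_lut(lambda a, b, c: func4(a, b, c, 1), 3)
    return lut_d0, lut_d1


def eval_combined(lut_d0, lut_d1, a, b, c, d):
    """Both 3LUTs see (a, b, c); a 2:1 multiplexer driven by d picks one."""
    low = read_lut(lut_d0, (a, b, c))
    high = read_lut(lut_d1, (a, b, c))
    return high if d else low


def example(a, b, c, d):
    """An arbitrary 4-input function used only for illustration."""
    return (a & b) ^ (c | d)


lut_d0, lut_d1 = split_into_3luts(example)
matches = all(
    eval_combined(lut_d0, lut_d1, *bits) == example(*bits)
    for bits in product((0, 1), repeat=4)
)
print(matches)  # True
```

The same expansion can be repeated with further multiplexer stages to reach five or six inputs, at the cost of doubling the number of 3LUTs for each added input.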
A variable-granularity architecture that targets CPLD applications must handle functions with more than six inputs. Preferably, thirty-two-input functions should be the minimum, with sixty-four inputs being even better. The Vantis VF1 architecture cited above handles thirty-two inputs in two logic levels using only local interconnect. That is a step in the right direction, and it demonstrates the viability of variable-grain architectures in handling wide-gating functions without embedding CPLD architectures on the chip.
A key element in a variable-granularity architecture is the use of high-performance local interconnect in addition to hierarchical interconnect. Hierarchical interconnect lines typically add as much or more delay than logic blocks. The use of short, dedicated interconnect lines reduces this delay significantly, thus improving performance for both CPLD and FPGA functions.
The variable-grain architecture approach offers several advantages over both the classic FPGA architecture and the mixed-architecture approach. First, it offers performance that is comparable to the mixed-architecture approach and higher than the classic FPGA cascaded approach. Second, because it uses only one type of architecture, it uses resources more efficiently than the mixed-architecture approach, resulting in more efficient use of silicon in most applications.
Since variable-granularity architectures are more complex than classic FPGAs, development software will be more complex, but should not be as complex as that needed for mixed-architecture devices. As with mixed-architecture devices, special HDL coding styles may be needed to extract the best possible performance.
Architecture characteristics comparisons for wide-gating functions
Conclusions
The most important advantage of new FPGA architectures, then, is the added flexibility they give designers in addressing wide-gating functions and in partitioning their systems. Narrow-gating functions will generally be implemented in an FPGA. If a wide-gating function is needed, the designer will have the option of integrating it in an FPGA if the FPGA delivers adequate performance, or implementing it in a stand-alone CPLD. Economics and performance considerations will tend to drive the decision.
Copyright © 2003 ChipCenter-QuestLink