Designing for Performance: CPLDs vs. FPGAs
By Anita Schreiber, Philips Semiconductors
Traditionally, large designs have been targeted to FPGAs; however, designers have found it challenging to achieve high performance from these devices. As the densities of CPLDs increase, applications that once were targeted only to FPGAs are now being targeted to CPLDs. Philips Semiconductors' CoolRunner 960 is an example of how larger, low-power devices are encroaching on the FPGA market. The CoolRunner 960 contains 960 macrocells and simultaneously delivers high performance and low power consumption. This device can run at system speeds well over 100 MHz while consuming less power (< 100 µA at standby and approximately 300 mA at 100 MHz) than an FPGA. The CoolRunner 960 also has deterministic timing and the ability to incorporate last-minute design changes without changing pinout. Designers are finding that the architecture of CPLDs such as Philips Semiconductors' CoolRunner devices makes high-performance applications easier to implement, with a shorter development cycle, because device resources are used more efficiently and there are no routing or timing issues to deal with.
Architectural Differences between FPGAs and CPLDs
Because the number of inputs to each FPGA logic cell is limited to 4 or 5, wide logic functions are spread across several of these cells. The delay through these additional cells increases the propagation time between registers and limits the maximum frequency of the design. The delay between registers also depends on the horizontal and vertical routing channels that connect the logic cells, and it can vary greatly depending on the relative location of the logic cells and the routing channels that are used. The total delay of the implementation of a logic function is not known until the design has been placed and routed by the FPGA software.
The basic building block of a CPLD is a macrocell, which typically consists of 4-5 product terms with up to 36 inputs followed by a D-type or T-type register. Macrocells are grouped into logic blocks, which are connected via a centralized interconnect array. Logic functions are synthesized into sum-of-products equations with up to 36 input terms.
A CPLD macrocell can implement a function with 36 inputs. Wide logic functions can be implemented in a single macrocell, so there are no additional delays through extra blocks. Since additional blocks are not necessary, relative placement of the macrocells and additional routing delays are not an issue. Thus, the delay of the implementation of the logic is predictable and is known from the data sheet values. The delay is known before the design has been placed and routed in the CPLD.
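The difference in logic depth can be illustrated with a rough count. The following is only a sketch under simplifying assumptions that are not in the article: the wide function is assumed to decompose into a balanced tree (as a wide AND or OR does), routing delay is ignored, and the function name `logic_levels` is hypothetical.

```python
def logic_levels(num_inputs: int, cell_inputs: int = 4) -> int:
    """Estimate how many levels of cell_inputs-wide logic cells a
    tree-decomposable function of num_inputs signals needs.

    Assumption: each level merges groups of up to cell_inputs signals
    into one signal, as in a balanced tree of small logic cells.
    """
    if num_inputs < 1 or cell_inputs < 2:
        raise ValueError("need num_inputs >= 1 and cell_inputs >= 2")
    levels = 0
    signals = num_inputs
    while signals > 1:
        # Ceiling division: number of cells needed at this level.
        signals = -(-signals // cell_inputs)
        levels += 1
    # Even a single-input function occupies one cell.
    return max(levels, 1)


# A 36-input function built from 4-input cells, versus a single cell
# that accepts all 36 inputs (the CPLD macrocell case described above).
fpga_levels = logic_levels(36, cell_inputs=4)
cpld_levels = logic_levels(36, cell_inputs=36)
print(fpga_levels, cpld_levels)
```

Under these assumptions the 4-input cells need three levels for a 36-input function, while a cell that accepts all 36 inputs needs one. Whether a particular function actually fits in a single macrocell also depends on its product-term count, which this sketch does not model.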
Techniques for Implementing High Performance Designs
Pipelining
Pipelining is not necessary with CPLDs. Because a wide logic function can be implemented in one macrocell, there is no need to break up the logic function. This decreases the latency of the design and leaves more registers available for implementing other logic functions within the device. The delay between registers within the design is based on the data book values, so the performance of the design is known before the design is placed and routed. Since registers are not wasted implementing the pipeline, the implementation of the design in a CPLD uses fewer registers than the FPGA implementation. Therefore, power dissipation is lower in the CPLD.
Replication of Logic to Reduce Fan-out
In a CPLD architecture, all macrocells are interconnected through a centralized interconnect array. An output from a macrocell is routed back into the interconnect array and is then available to all other macrocells in the device. The delay for this route is deterministic and does not vary depending on the number of loads or the location of those loads. It is therefore unnecessary to replicate logic in order to increase the performance of the device.
"1-Hot" Encoding of State Machines
The number of registers required to implement a "1-hot" state machine is larger than that of a binary encoded state machine; therefore, the number of registers required to implement a state machine in an FPGA increases. As the number of registers increases, the power dissipation of the device increases. Also, large state machines tend to be hard to implement because of the large number of registers and routing resources required by this encoding method.
CPLDs typically use a binary or Gray-code encoding method for state machine implementation. The number of registers used in a binary encoded state machine is log2 of the number of states, rounded up to the next integer. A 16-state state machine would use only 4 registers. This encoding method works well in CPLDs because the logic to determine the active state can have a large number of inputs. Fewer registers are required to implement a state machine, and therefore CPLDs can implement large state machines quite easily.
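The register counts for the two encodings can be compared with a few lines of arithmetic. This is a minimal sketch; the function names are hypothetical and it counts state registers only, not the decoding logic.

```python
def binary_registers(num_states: int) -> int:
    """Registers for a binary (or Gray-code) encoded state machine:
    the ceiling of log2(num_states), with a minimum of one register."""
    if num_states < 1:
        raise ValueError("num_states must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2.
    return max(1, (num_states - 1).bit_length())


def one_hot_registers(num_states: int) -> int:
    """Registers for a "1-hot" encoded state machine: one per state."""
    if num_states < 1:
        raise ValueError("num_states must be at least 1")
    return num_states


for states in (4, 16, 17, 100):
    print(states, binary_registers(states), one_hot_registers(states))
```

For the 16-state example in the text, this gives 4 registers for binary encoding against 16 for "1-hot" encoding; a 17-state machine needs 5 binary registers because the count rounds up.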
Effects of these Design Techniques on Logic Synthesis
To successfully synthesize a high performance HDL design to an FPGA, the HDL description of the design must break up a wide logic function into pieces that fit within the basic building block of the FPGA and include the additional pipeline registers. This then makes the HDL description of the design device specific and removes the capability of re-targeting the design without changes. It is also not intuitive, when describing the behavior of a design, to break up logic and insert registers that would otherwise be unnecessary.
For example, consider the HDL description of a 12-to-1 multiplexor. The device-independent and intuitive description of this function is to describe the decoding of the select lines to output the selected input signal. The implementation in an FPGA would require that this function be pipelined; therefore, the designer would have to decide how to break up the 12-to-1 multiplexor into smaller multiplexors that fit within the FPGA logic cell. The outputs of these smaller multiplexors are then multiplexed to form the 12-to-1 multiplexor. The HDL description would then be written to describe the smaller multiplexors with the insertion of the necessary pipeline registers followed by the final level of multiplexing. At this point, the HDL description is very device specific and is no longer intuitive.
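The contrast between the two descriptions can be sketched as a behavioral model. This is written in Python rather than an HDL, purely as an illustration. The split into three 4-to-1 multiplexors followed by a 3-to-1 multiplexor is one possible partition chosen for this sketch, not one prescribed by the article, and the names `mux12_flat` and `Mux12Pipelined` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def mux12_flat(inputs: Sequence[int], sel: int) -> int:
    """Intuitive, device-independent description: decode the select
    lines and output the selected input."""
    if len(inputs) != 12 or not 0 <= sel < 12:
        raise ValueError("need 12 inputs and a select value in 0..11")
    return inputs[sel]


class Mux12Pipelined:
    """Device-specific description: three 4-to-1 muxes, a pipeline
    register, then a 3-to-1 mux. The result appears one clock later."""

    def __init__(self) -> None:
        # Pipeline register: (three partial results, group select).
        self._stage: Optional[Tuple[Tuple[int, int, int], int]] = None

    def clock(self, inputs: Sequence[int], sel: int) -> Optional[int]:
        """Advance one clock. Returns the result for the inputs applied
        on the previous clock, or None on the first clock."""
        if len(inputs) != 12 or not 0 <= sel < 12:
            raise ValueError("need 12 inputs and a select value in 0..11")
        # Second stage: 3-to-1 mux over the registered partial results.
        out = None
        if self._stage is not None:
            partial, group = self._stage
            out = partial[group]
        # First stage: low select bits choose within each group of four.
        low, high = sel % 4, sel // 4
        first = (inputs[low], inputs[4 + low], inputs[8 + low])
        # Register the partial results and the group select together.
        self._stage = (first, high)
        return out


data = list(range(100, 112))
pipe = Mux12Pipelined()
print(mux12_flat(data, 7))   # result available immediately
print(pipe.clock(data, 7))   # nothing valid yet on the first clock
print(pipe.clock(data, 11))  # result for the previous clock's select
```

Both models compute the same function, but the pipelined version has to carry part of the select value through the register and adds a clock of latency, which is the kind of device-specific detail the text describes.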
As mentioned above, there are many signals with high fan-out where the generation of these signals has to be replicated to achieve high performance within an FPGA. This replication of logic is again not intuitive to the basic description of the application and makes the HDL description of the design device dependent and not easily re-targetable to an ASIC or other device without design changes.
Because the routing between logic cells in an FPGA contributes significantly to the delay of the design, achieving high performance from an FPGA typically requires floorplanning the design to keep routing delays to a minimum. The need to control the placement of blocks in the design also makes the HDL description of the design device dependent and makes synthesis more difficult.
The combination of high speed, low power and routable architectures with CoolRunner CPLDs addresses some of the market's most challenging design considerations. Since CoolRunner CPLDs have the capability to implement wide logic at high performance without pipelining, synthesis to the CPLD is much more effective and allows the HDL description of the design to be intuitive and device independent. This HDL description is directly reusable if the design needs to be re-targeted to an ASIC. Control of the placement of the logic is not necessary to ensure the performance of the design, nor is additional logic needed to replicate signals with high fan-out. The HDL description of the design does not need to include pipeline registers or replicated logic that are otherwise functionally unnecessary, but can instead describe the true desired behavior of the design. This enables designers to develop HDL descriptions of their designs more quickly and easily.
Summary
Because of the architectural differences between CPLDs and FPGAs, additional registers are not necessary to achieve high performance in a CPLD. This means that the registers within the device are utilized more effectively and that a design will not be forced into a larger device to accommodate functionally unnecessary registers. Since a high-performance design requires fewer registers in a CPLD than in an FPGA, the power dissipation of the CPLD will be lower. The use of an HDL for design entry is intuitive and device independent.
Copyright © 2003 ChipCenter-QuestLink