Improving Performance in Complex Programmable Logic Devices (CPLDs) with the FPGA Express Software
By Phil Simpson, Altera Corporation (psimpson@altera.com)
Introduction
The first step in this iterative process is to write HDL code that is "architecture-aware" and "synthesis-aware." This means that your HDL source code should use constructs that exploit the architectural features of PLDs, such as the abundance of registers and embedded memory blocks. While writing HDL code, you must also know how synthesis tools generally interpret particular HDL styles. The next step is to use design methodologies, such as pipelining and logic duplication, to improve the design implementation. The final step in this iterative process is to use the various options and constraints in the FPGA Express software or in the place-and-route tool to achieve the desired results.
Effective HDL Design Techniques for the FPGA Express Software
By using effective HDL design techniques in the FPGA Express software, you can streamline your designs, reduce and optimize logic, reduce logic delay and improve overall performance. This section describes the following techniques:
After creating a hierarchical design, you have the option of flattening the design during synthesis to allow FPGA Express to optimize across the boundaries. However, if you have different optimization conditions (area or speed) for different blocks of your design, you now have the option of preserving the hierarchy and setting the desired constraints during synthesis.
Latches vs. Registers: Since PLDs have registers built into the silicon, designing with latches uses more logic and gives lower performance than designing with registers. Therefore, when you are designing combinatorial logic, you should avoid unintentionally creating a latch through your HDL design style. For example, when Case or If statements do not cover all possible conditions of the inputs, the signal must hold its previous value, and the resulting combinatorial feedback generates a latch. Figure 1 shows sample VHDL code that generates a latch.
A latch is generated when the final ELSE clause or WHEN OTHERS clause is omitted from an If or Case statement, respectively. Figure 2 shows sample VHDL code that prevents the unintentional creation of a latch.
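A minimal sketch of the two cases described above, in one hypothetical entity (the names latch_demo, sel, a, y_latched and y_comb are illustrative and are not taken from the original figures):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: entity and signal names are illustrative only.
entity latch_demo is
  port (
    sel       : in  std_logic;
    a         : in  std_logic;
    y_latched : out std_logic;
    y_comb    : out std_logic
  );
end entity latch_demo;

architecture rtl of latch_demo is
begin
  -- Incomplete IF: y_latched must hold its value when sel is not '1',
  -- so synthesis infers a latch.
  infers_latch : process (sel, a)
  begin
    if sel = '1' then
      y_latched <= a;
    end if;
  end process infers_latch;

  -- The final ELSE assigns y_comb on every path, so the logic is
  -- purely combinatorial and no latch is created.
  no_latch : process (sel, a)
  begin
    if sel = '1' then
      y_comb <= a;
    else
      y_comb <= '0';
    end if;
  end process no_latch;
end architecture rtl;
```

The same rule applies to Case statements: a WHEN OTHERS branch that assigns every output plays the role of the final ELSE.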
Priority-encoded If statements: To reduce the propagation delay of critical-path signals in a design, you can use If statements to perform priority encoding. Figure 3 illustrates good design practice when sel1 is a late-arriving signal in the critical path. In this case, sel1 is tested first and therefore has the highest priority.
Figure 4 shows the schematic representation of the Verilog code from Figure 3. The late-arriving signal sel1 is placed such that it passes through minimum logic.
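The original example is in Verilog; the same idea can be sketched in VHDL as follows. The entity name and the data inputs a, b, c and d are assumptions made for illustration; only sel1 comes from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: only sel1 is named in the article.
entity priority_sel is
  port (
    a, b, c, d       : in  std_logic;
    sel1, sel2, sel3 : in  std_logic;
    y                : out std_logic
  );
end entity priority_sel;

architecture rtl of priority_sel is
begin
  process (a, b, c, d, sel1, sel2, sel3)
  begin
    -- sel1 is tested first, so it controls the multiplexer stage
    -- closest to the output and passes through the least logic.
    if sel1 = '1' then
      y <= a;
    elsif sel2 = '1' then
      y <= b;
    elsif sel3 = '1' then
      y <= c;
    else
      y <= d;
    end if;
  end process;
end architecture rtl;
```

Signals tested later in the chain (sel3 here) pass through more multiplexer levels, so early-arriving signals belong at the bottom of the If statement.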
"Don't Care" conditions: The FPGA Express software generally treats unknowns as "don't care" conditions to optimize logic. Within a design, you can assign the default case value to "don't care" instead of to a logic value to give the best logic optimization. However, you must verify all "don't care" conditions in simulation.
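As a sketch of a "don't care" default, the following hypothetical multiplexer assigns 'X' in the default branch. It assumes, as the paragraph above states, that the synthesis tool treats the unknown value as a "don't care"; the entity and signal names are illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a 3-input multiplexer with an unused select code.
entity dont_care_mux is
  port (
    sel     : in  std_logic_vector(1 downto 0);
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end entity dont_care_mux;

architecture rtl of dont_care_mux is
begin
  process (sel, a, b, c)
  begin
    case sel is
      when "00"   => y <= a;
      when "01"   => y <= b;
      when "10"   => y <= c;
      -- "11" is never expected; assigning 'X' leaves the tool free
      -- to pick whichever value minimizes the logic.
      when others => y <= 'X';
    end case;
  end process;
end architecture rtl;
```

In simulation, y becomes 'X' whenever sel takes the unused value, which makes an unexpected select code visible during verification.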
Gated clocks: Gated clocks create logic delays and clock skew, and use additional routing resources within devices. Therefore, you should avoid gated clocks where possible; in many cases you can use the register's clock enable input instead. However, if you must implement a gated clock in your design, some PLD architectures include features that reduce the associated hazards. For example, in Altera's FLEX devices you can use the GLOBAL primitive to place the gated clock on one of the high-fan-out internal global signals. Figure 5 shows an example that implements a gated clock using the GLOBAL primitive in a VHDL design.
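The clock enable alternative mentioned above can be sketched as follows. This is a generic VHDL register with an enable, not the GLOBAL primitive example from Figure 5; the entity and signal names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a register with a clock enable instead of a gated clock.
entity ena_reg is
  port (
    clk : in  std_logic;
    ena : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity ena_reg;

architecture rtl of ena_reg is
begin
  process (clk)
  begin
    -- The clock reaches the register ungated; ena decides whether
    -- the register loads d or keeps its current value.
    if rising_edge(clk) then
      if ena = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture rtl;
```

Because the missing else branch sits inside a clocked process, it describes a register that holds its value rather than a latch.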
State Machines: In designs containing state machines, separate the state machine logic from all arithmetic functions and data paths to improve performance, and use the state machine purely as control logic. One-hot encoding uses one bit per state, which requires more state registers but reduces the decoding logic. It therefore suits register-rich, look-up table (LUT)-based architectures such as Altera's FLEX devices, which work well with low-fan-in decoding logic, and it increases performance and efficiency for these devices.
Another design technique that increases state machine performance for PLDs with embedded memory blocks, such as the embedded array blocks (EABs) in Altera's FLEX 10K devices, is to place state machines in EABs. Extremely complex state machines with limited I/O are ideal candidates, because the EAB implements complex functions in a single logic level, resulting in more efficient device utilization and higher performance. Follow these basic guidelines when placing a state machine in Altera's FLEX 10K EABs:
FPGA Express Design Techniques for CPLDs
Register balancing: This technique shortens long register-to-register delays at the cost of lengthening shorter ones. It is especially beneficial in applications where registers are used purely for latency purposes. As Figure 6 shows, the technique satisfies register-to-register timing requirements by moving registers within the design. It does not change the latency of your design, because the number of register levels remains constant.
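A minimal sketch of register balancing, assuming a hypothetical four-input adder with two register levels (the entity sum4 and its ports are not from the original figure). Both architectures have the same two-cycle latency; only the position of the first register level differs.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: sums four 8-bit values (modulo 256) in two clock cycles.
entity sum4 is
  port (
    clk        : in  std_logic;
    a, b, c, d : in  unsigned(7 downto 0);
    y          : out unsigned(7 downto 0)
  );
end entity sum4;

-- Unbalanced: all three additions sit before the first register,
-- and the second register only adds latency.
architecture unbalanced of sum4 is
  signal stage1 : unsigned(7 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      stage1 <= a + b + c + d;
      y      <= stage1;
    end if;
  end process;
end architecture unbalanced;

-- Balanced: the first register level is moved back so that each
-- register-to-register path contains a single adder.
architecture balanced of sum4 is
  signal ab, cd : unsigned(7 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      ab <= a + b;
      cd <= c + d;
      y  <= ab + cd;
    end if;
  end process;
end architecture balanced;
```

The balanced version uses one more 8-bit register, which is usually inexpensive in register-rich devices.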
Pipelining: When pipelining a design, you add registers to break up large combinatorial delays. Because LUT-based PLD architectures include a register with each LUT, pipelining combinatorial logic generally does not require additional device resources. Therefore, pipelining improves performance without increasing the logic utilization in a device, although each added register level adds one clock cycle of latency. See Figure 7.
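As an illustration, the following hypothetical multiply-add is split into two pipeline stages. The entity name, ports and widths are assumptions, and the use of ieee.numeric_std is a choice made for this sketch rather than something the article specifies.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: y = a * b + c, computed in two pipeline stages.
entity mac_pipe is
  port (
    clk  : in  std_logic;
    a, b : in  unsigned(7 downto 0);
    c    : in  unsigned(15 downto 0);
    y    : out unsigned(15 downto 0)
  );
end entity mac_pipe;

architecture rtl of mac_pipe is
  signal prod_r : unsigned(15 downto 0);
  signal c_r    : unsigned(15 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: register the product; delay c so it stays aligned.
      prod_r <= a * b;
      c_r    <= c;
      -- Stage 2: add the registered operands (result wraps at 16 bits).
      y      <= prod_r + c_r;
    end if;
  end process;
end architecture rtl;
```

Without the stage 1 registers, the multiplier and the adder would lie on a single register-to-register path and would limit the clock frequency together.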
Logic Duplication: Logic duplication is a good method for reducing fan-out and improving design performance on a given path. Use logic duplication on high-fan-out nodes and flip-flops, because it reduces the number of loads these signals drive and can ease routing. The Synopsys FPGA Express software provides the Merge Duplicate Register option in the GUI. By default, this option is enabled, so duplicate registers are optimized out. Disabling the option forces FPGA Express to preserve duplicate registers during synthesis. Figure 8 shows an example of VHDL code that has duplicate registers.
Figure 9 shows the schematic representation of the VHDL logic from Figure 8 with the Merge Duplicate Register option disabled in FPGA Express.
Figure 10 shows the schematic representation without logic duplication. In this case the Merge Duplicate Register option is enabled.
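A minimal sketch of duplicate registers, using hypothetical names (this is not the code from Figure 8). The two registers en_a and en_b capture the same input, so each drives only half of the loads.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: one control bit registered twice to halve its fan-out.
entity dup_reg is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;
    a, b : in  std_logic_vector(7 downto 0);
    x, y : out std_logic_vector(7 downto 0)
  );
end entity dup_reg;

architecture rtl of dup_reg is
  signal en_a, en_b : std_logic;
begin
  -- Two registers with identical inputs: logically redundant,
  -- so a merge-duplicates optimization would collapse them into one.
  process (clk)
  begin
    if rising_edge(clk) then
      en_a <= en;
      en_b <= en;
    end if;
  end process;

  -- Each copy drives a separate group of loads.
  x <= a when en_a = '1' else (others => '0');
  y <= b when en_b = '1' else (others => '0');
end architecture rtl;
```

With the Merge Duplicate Register option enabled, en_a and en_b would be expected to merge into a single register that drives all sixteen output bits.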
LPM Functions: Functions from the Library of Parameterized Modules (LPM) are large building blocks that you can customize for your application by using different ports and setting different parameters. Some PLD vendors, such as Altera, tune LPM functions for efficient placement in their device architectures. In FPGA Express, LPM functions can be instantiated in the HDL source code or inferred from certain operators. Figure 11 shows a VHDL design that instantiates the LPM_MULT function.
In addition to instantiating LPM functions, the FPGA Express software infers LPM functions from certain operators. For example, in designs targeting Altera's FLEX devices, LPM functions are inferred from multiplication operators and from relational operators with non-constant operands. Figure 12 shows a VHDL design where an lpm_mult function is inferred by the FPGA Express software from the multiplication operator.
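A sketch of a design from which a multiplier could be inferred. The entity name and widths are hypothetical, and the use of ieee.numeric_std is an assumption; designs of this era may instead use the Synopsys arithmetic packages. Whether the operator is actually mapped to lpm_mult depends on the tool and target device, as described above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: an 8 x 8 unsigned multiplier described with "*".
entity mult_infer is
  port (
    a, b : in  unsigned(7 downto 0);
    p    : out unsigned(15 downto 0)
  );
end entity mult_infer;

architecture rtl of mult_infer is
begin
  -- Both operands are non-constant, so the synthesis tool is free
  -- to map the operator to a parameterized multiplier function.
  p <= a * b;
end architecture rtl;
```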
FPGA Express Constraints and Settings
Finite State Machine Implementation: The FPGA Express software provides many options to control the implementation of state machines. It automatically extracts state vectors and re-encodes state machines as one-hot, binary or Altera-specific zero-one-hot when the state vectors are described using an enumerated type. You must select the FSM encoding style before analyzing the HDL files in FPGA Express. Figure 13 shows a VHDL state machine design that uses an enumerated type. FPGA Express automatically encodes this design with the style you choose.
The FPGA Express software also allows you to specify the encoding style in the HDL source code. In a VHDL file, you can do this with the Synopsys attribute ENUM_ENCODING. In a Verilog design file, specify the enumerated states with the parameter keyword. Figures 14 and 15 show VHDL and Verilog HDL FSM examples, respectively, with user-specified encoding.
Additionally, FPGA Express provides control over the implementation of the "when others" statement in your VHDL state machine design. This statement is used to cover all states that aren't specified, including invalid states. In such a case, the FPGA Express software generates next-state logic to ensure that the implementation is an exact match of the VHDL description and that the FSM is able to recover from invalid state transitions.
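A minimal sketch of a state machine described with an enumerated type, so that the synthesis tool can choose the encoding. The states and ports are hypothetical and do not reproduce Figure 13.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a three-state controller with no explicit state encoding.
entity fsm_enum is
  port (
    clk, reset  : in  std_logic;
    start, stop : in  std_logic;
    busy        : out std_logic
  );
end entity fsm_enum;

architecture rtl of fsm_enum is
  -- Enumerated type: the state codes are left to the synthesis tool.
  type state_type is (s_idle, s_run, s_done);
  signal state : state_type;
begin
  process (clk, reset)
  begin
    if reset = '1' then
      state <= s_idle;
    elsif rising_edge(clk) then
      case state is
        when s_idle =>
          if start = '1' then
            state <= s_run;
          end if;
        when s_run =>
          if stop = '1' then
            state <= s_done;
          end if;
        when s_done =>
          state <= s_idle;
      end case;
    end if;
  end process;

  -- The state machine is pure control logic: it only produces a flag.
  busy <= '1' when state = s_run else '0';
end architecture rtl;
```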
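For the VHDL case, a user-specified encoding might look like the following sketch. The package name, state names and the one-hot codes are illustrative; the attribute is declared locally here so the fragment stands on its own.

```vhdl
-- Hypothetical example: a state type with user-specified one-hot codes.
package fsm_types is
  type state_type is (s_idle, s_run, s_done);

  -- Synopsys ENUM_ENCODING attribute: one code per enumeration
  -- literal, listed in declaration order and separated by spaces.
  attribute ENUM_ENCODING : string;
  attribute ENUM_ENCODING of state_type : type is "001 010 100";
end package fsm_types;
```

A design would then reference the package (for example with use work.fsm_types.all, assuming it is compiled into the work library) and declare its state signal with this type.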
However, if guaranteed recovery from invalid state transitions is not required, a smaller and faster implementation of the one-hot FSM can be generated. The selection logic for all of the invalid states from the "when others" choice is eliminated resulting in a smaller and faster one-hot state machine. This is the default option in FPGA Express.
Area/Speed Constraints: In the FPGA Express software you can choose to synthesize your design for area optimization or for speed optimization. These constraints can be applied globally or to individual hierarchical levels. The impact of this constraint varies with your HDL coding style and the size of your design; large design blocks generally show a significant difference in results.
Advanced Synthesis Optimization: The FPGA Express software allows you to choose between a High Effort and a Low Effort option. Low effort corresponds to a fast compilation mode, while high effort takes longer to compile because it uses advanced synthesis optimization algorithms. This option is selected during synthesis, either globally or for individual hierarchical levels. As described in the section on effective HDL design techniques, your HDL coding style plays an important role during synthesis and may therefore limit this optimization.
Device and Timing Assignments: The FPGA Express software allows you to make device and timing assignments for various PLD architectures. For Altera, these assignments are passed to the MAX+PLUS II software, Altera's place and route tool, via the ACF file (assignments and constraints file) produced by the FPGA Express software. Figure 16 shows an example of the ACF file generated by FPGA Express. This MAX+PLUS II compatible ACF file may have assignments for family, device, clock frequency, timing assignments, synthesis style and pad settings (pin location, slew rate and use of I/O register).
TimeTracker: TimeTracker is the timing analysis tool of FPGA Express. This tool helps speed up the design cycle by allowing you to identify and fix critical portions of your design before doing place and route.
In the pre-optimization stage, this tool allows you to enter timing constraints for clocks, path groups, multi-cycle paths, sub-paths and ports. After the chip has been optimized, you can view the results in a familiar, easy-to-use spreadsheet format. TimeTracker displays the required timing and the achieved timing side by side to show which constraints were met and which failed. The timing paths can also be traced in the schematic viewer. Although the timing results reported by the tool are pre-place-and-route estimates, they give you a good idea of where the design is not performing optimally. You can then change the HDL source code to achieve better performance before running the design through the place-and-route tool. However, once you are satisfied with the results of FPGA Express, you must use the place-and-route tool's timing analyzer to obtain the actual results.
Visual Tools for Analysis (Vista): Vista is the FPGA Express schematic viewing tool. This tool helps you visualize your synthesis results and improve your device's performance and area results.
You can use it to view your pre-optimized design. At this stage the design consists of generic gates, which lets you visualize how the FPGA Express synthesis engine interprets your HDL code. Once you optimize the design, the FPGA Express software maps the generic gates to architecture-specific elements. The optimized schematic shows the design mapped to LUTs, lcells, and carry and cascade chains. This view allows you to evaluate how well the design has been mapped to the given technology.
Vista's hierarchy browsing capability enables you to analyze your design at different levels of hierarchy and helps you visualize where changes might reduce device area or increase device performance.
Vista also highlights critical paths within your design, allowing you to quickly analyze problem areas. The optimized schematic view is tightly integrated with TimeTracker, so a clock group, path or cell selected in TimeTracker is shown in the schematic. You can move along the critical path by using the Next Pin and Previous Pin commands.
Vista also allows you to perform fan-in and fan-out analysis on both the pre-optimized and optimized designs. You can trace the fan-in and fan-out paths for a given cell using the filter options available in FPGA Express.
Conclusion
Copyright © 2003 ChipCenter-QuestLink