Guided Synthesis Using FPGA Express/FPGA Compiler II Block Level Incremental Synthesis
Karen Fidelak (kfidelak@xilinx.com)
Xilinx, Inc.
ABSTRACT
With design densities approaching the multi-million-gate level, the need for more sophisticated design methodologies becomes increasingly evident. Timing predictability and runtime reduction when making small design modifications are some of the important considerations. The Synopsys FPGA Express/FPGA Compiler II "Block Level Incremental Synthesis" feature provides a mechanism for identifying blocks of logic in a design such that the synthesized netlists for these blocks remain intact as other portions of the design are modified. This paper will demonstrate how to use Block Level Incremental Synthesis with Xilinx® Guided Place and Route to improve timing predictability and reduce runtime to improve overall productivity.
1.0 Introduction
As designs increase in size and complexity and design cycle time is reduced, it becomes increasingly necessary to identify a design methodology to support better predictability and efficiency when making incremental design changes. When design performance requirements have been achieved, it is a goal of the designer to maintain the results of the placed and routed design as much as possible as incremental changes are made. The desire to reduce runtime when compiling small changes and also maintain the timing results of the placed and routed design are driving forces for developing a guided synthesis methodology.
Ideally, synthesis and place and route software tools should recognize where changes have been made in the overall design and recompile just those portions that have changed. While we are not yet in this ideal situation, there have been recent improvements to the tools to improve productivity. Today, FPGA Express/FPGA Compiler II v3.4 along with Xilinx® 3.1i FPGA Implementation software provide the user with an improved guided synthesis methodology over previous releases. This paper will discuss the new FE/FCII "Block Level Incremental Synthesis" feature and show how using this BLIS flow along with the Guide functionality of the Xilinx place and route tools can significantly improve productivity by reducing runtime and improving timing predictability in an incremental design flow.
2.0 Block Level Incremental Synthesis
With FPGA Compiler II and FPGA Express v3.4, Synopsys introduced a new feature called "Block Level Incremental Synthesis" (BLIS) which is designed to facilitate the incremental design flow. As design changes are made, FE/FCII recognizes "blocks" of the design which have been changed in the source and intelligently updates only those portions of the design in the synthesized design. In this flow a "block" is defined as a module/entity and the hierarchy tree beneath it.
To enable the BLIS flow, the user designates the desired blocks in the design as "Block Roots" through the FE/FCII Constraint Editor GUI or the scripting language.

Figure 1 Constraint Editor - Specifying Block Roots
A "block root" is a "block" which is intelligently updated by FE/FCII in incremental synthesis runs. A "block root" has the following characteristics:
- A separate netlist is created by FE/FCII for each block root.
- From run to run, only those block roots whose corresponding source has been modified are re-synthesized.
- The block root has hard boundaries around it, meaning that no optimization occurs with neighboring modules.
There are two main advantages to this type of incremental flow. First, runtime for both synthesis and place and route improves, in some cases dramatically, because only the portion of the design whose source has been modified is re-synthesized and re-netlisted. The rest of the design is not re-elaborated or re-optimized, and the netlists for the unchanged portions are not rewritten. Second, because the netlists of the unchanged portions remain untouched, the user is assured that all net and instance names in that part of the design are identical to those of earlier runs. Timing predictability therefore improves, since the Guide function of the place and route tools, which relies on matched component names from run to run, will have a higher success rate.
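The selection rule described above can be illustrated with a minimal Python sketch. It is not how FE/FCII is implemented; the hierarchy, module names, and helper functions are hypothetical, and the sketch assumes block roots are not nested inside one another. It only shows the idea that a block root is re-synthesized when any module in the hierarchy tree beneath it has changed.

```python
from typing import Dict, List, Set


def block_members(hierarchy: Dict[str, List[str]], root: str) -> Set[str]:
    """Return the root module plus every module in the hierarchy tree beneath it."""
    members: Set[str] = set()
    stack = [root]
    while stack:
        module = stack.pop()
        if module not in members:
            members.add(module)
            stack.extend(hierarchy.get(module, []))
    return members


def roots_to_resynthesize(hierarchy: Dict[str, List[str]],
                          block_roots: List[str],
                          changed_modules: List[str]) -> List[str]:
    """Return the block roots whose block contains at least one changed module."""
    changed = set(changed_modules)
    return [root for root in block_roots
            if block_members(hierarchy, root) & changed]


# Hypothetical design: parent module -> list of child modules.
hierarchy = {
    "top": ["cpu", "uart", "dma"],
    "cpu": ["alu", "regfile"],
    "uart": ["uart_tx", "uart_rx"],
}
block_roots = ["cpu", "uart"]

# Only "alu" was edited, so only the "cpu" block needs a new netlist.
stale = roots_to_resynthesize(hierarchy, block_roots, ["alu"])
print(stale)  # ['cpu']
```

In this example the netlist for the "uart" block would be left untouched, which is what keeps its net and instance names stable for the Guide step.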
2.1 BLIS vs. ECO
It is important to understand the difference between the BLIS feature of FE/FCII and the incremental design flow commonly referred to as ECO or Engineering Change Order. BLIS maintains incremental design consistency on a block level as opposed to within an individual module. If a change is detected in a given block, the entire block will be re-synthesized. The flow is not granular enough to change only those portions of a module's implementation which have changed in the source.
2.2 Team Designing with BLIS
The BLIS flow is currently intended for use by a single designer. Although individual blocks of the design are written out to separate netlists, which is often conducive to a team design approach, BLIS still requires a single project in FE/FCII and is therefore easiest to use in a single-designer environment.
2.3 Limitations
There are a few limitations within the BLIS flow worth mentioning. FE/FCII relies on timestamps of the analyzed HDL files to determine whether the source for a block has changed or not. Therefore, for proper operation of the flow, it is recommended that all HDL files contain only one module or entity. If an HDL file contains more than one module or entity, then when any one module or entity within that file is modified and the file is re-saved, FE/FCII will believe that all modules/entities within the file have changed and will resynthesize all corresponding blocks in the design. This will eliminate the benefits of the BLIS flow.
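The consequence of file-level timestamp checking can be sketched in a few lines of Python. This is an illustration only, not the tool's actual mechanism; the file names, module names, and timestamp values are hypothetical. It shows why a file holding two modules causes both to be flagged when only one was edited.

```python
from typing import Dict, List


def changed_modules(file_modules: Dict[str, List[str]],
                    old_mtimes: Dict[str, float],
                    new_mtimes: Dict[str, float]) -> List[str]:
    """Flag every module in any file whose timestamp differs from the last run.

    Change detection is per file, so the check cannot tell which module
    inside a multi-module file was actually edited.
    """
    flagged: List[str] = []
    for path, modules in file_modules.items():
        if new_mtimes[path] != old_mtimes.get(path):
            flagged.extend(modules)
    return sorted(flagged)


# Hypothetical project: datapath.v holds two modules, uart.v holds one.
file_modules = {"datapath.v": ["alu", "shifter"], "uart.v": ["uart"]}
old = {"datapath.v": 100.0, "uart.v": 100.0}
new = {"datapath.v": 250.0, "uart.v": 100.0}  # only "alu" was edited and saved

flagged = changed_modules(file_modules, old, new)
print(flagged)  # ['alu', 'shifter']
```

With one module per file, the same edit would flag only "alu".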
As mentioned earlier, blocks in the design denoted as Block Roots will have hard boundaries. This means that optimization that may normally have been possible in a flow where the design was flattened will not take place. Good HDL coding techniques, namely placing registered outputs at the output ports of the modules and avoiding placing glue logic such as inverters outside of module boundaries, are especially important with the BLIS flow for this reason.
The most dramatic improvements over a traditional design flow will be seen when the design is well partitioned into several functional blocks and incremental changes affect only some of those blocks. In these cases the larger, unchanged portion of the design is not re-synthesized and re-implemented, which avoids both the potential alteration of net and instance names and the added runtime.
Finally, in order to take advantage of the BLIS flow, you must know which modules you plan to be changing ahead of time since the block must be tagged as a Block Root for an initial run before the modification is made. If the entire design is synthesized flat with no block roots specified and you later want to make an incremental change to one of the modules, you will not initially benefit from the incremental flow.
3.0 Design Flow Tests
Multiple designs were used as case studies for testing the BLIS flow for this analysis. The designs were synthesized using both a traditional synthesis flow and the BLIS flow and then implemented with the Xilinx implementation tools using a guided methodology. Xilinx place and route tools allow the user to specify an existing placed-and-routed design to be used as a "Guide" when implementing a design. In this flow, the existing placed and routed design is used as a template when re-implementing the design. Any portions of the design which exist in both the "Guide" design and the new modified design (determined by matching net and component names) will be placed in the same location in the new implementation as they were in the "Guide" design. New or changed logic will be implemented around existing, guided logic.
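As a rough illustration of why name stability matters for guiding, the Python sketch below computes the fraction of names in a new netlist that also appear in the guide design. This is a simplified model under stated assumptions: the function and the instance names are hypothetical, and the place and route report may compute its guide statistics differently.

```python
from typing import Iterable


def guide_match_rate(guide_names: Iterable[str], new_names: Iterable[str]) -> float:
    """Fraction of names in the new design that also exist in the guide design."""
    guide = set(guide_names)
    new = set(new_names)
    if not new:
        return 0.0
    return len(guide & new) / len(new)


# Hypothetical component names from the guide (original) design.
guide_design = {"u_cpu/alu_reg", "u_cpu/N12", "u_uart/tx_reg", "u_uart/N40"}

# After re-synthesis, one tool-generated name in the cpu block has changed.
new_design = {"u_cpu/alu_reg", "u_cpu/N57", "u_uart/tx_reg", "u_uart/N40"}

rate = guide_match_rate(guide_design, new_design)
print(rate)  # 0.75
```

Only components whose names match can be placed in their previous locations, so every renamed component lowers the portion of the design that can be guided.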
3.1 Traditional Design Flow Test - Overview
In the traditional design flow, the designs were first synthesized with FE/FCII using the Preserve Hierarchy switch set to OFF to result in a flattened design. The design was then completely placed and routed with the Xilinx implementation tools. Next, incremental changes were made to one or more of the design modules. The project was updated in FE/FCII and a new updated netlist was written. Finally, the design was re-implemented with the Xilinx tools using the placed and routed design from the initial run as a Guide file.
3.2 BLIS Design Flow Test - Overview
To test the BLIS design flow, the same basic flow was used as described above with the exception that the design modules which contained the incremental changes were set as Block Roots in FE/FCII. Therefore, when the project was updated in FE/FCII after the changes were made, only those blocks which contained modifications resulted in updated netlists. The modified design was again run through Xilinx place and route using the initial design (from the BLIS run) as a Guide.
3.3 Testcase Designs
The designs used in the analysis typically contained at least 20 modules/entities, each contained in a separate HDL file. The types of incremental changes made to the source ranged from adding synchronous processes to modules, to modifying combinatorial logic, to swapping bits of a bus.
The designs used in the tests were synthesized with the bulk of the design flattened and with selected modules set as block roots when using the BLIS flow. This allowed the design to benefit from the improved cross-boundary optimization for those portions of the design which were not likely to change, and yet still take advantage of the improved runtimes and timing predictability provided by BLIS for blocks of the design with incremental changes. It is worth noting that synthesis runtimes are generally improved when the design hierarchy is preserved as opposed to flattened, yet at the expense of potentially improved optimization across module boundaries. In these tests the choice was made to flatten the non-block-root portions of the design in order to benefit from the cross-boundary optimizations.
The diagrams below show the block-level hierarchy of the designs as well as the blocks in the design which were denoted as Block Roots.
Figure 2 Case 1 Design Hierarchy (~1500 Logic Cells)
Figure 3 Case 2 Design Hierarchy (~2800 Logic Cells)
Figure 4 Case 3 Design Hierarchy (~8500 Logic Cells)
4.0 Analyzing Results
Results were analyzed to compare runtimes and guide success rates between the traditional and BLIS flows. Runtimes included both synthesis runtimes and implementation runtimes. Guide success rates were taken from the Place and Route report and indicate a level of timing predictability from run to run, as consistent placement leads to consistent timing results.
4.1 Runtime Improvements
The key statistic to look at when analyzing runtime is the improvement in the runtime from the original design to the modified design and how this improvement compares between the traditional and BLIS flows. In an incremental design flow it is desirable to have reduced runtimes after incremental design changes are made.
Runtimes for the modified version of the design improved when using BLIS for both the synthesis and implementation phases of the flow. FE/FCII did not re-elaborate or re-optimize blocks which remained unchanged. Additionally, since the EDIF netlists were not updated for blocks which had not changed, the Xilinx translation tool NGDBUILD did not need to rerun the EDIF translation program EDIF2NGD for those blocks. The Guide feature of the place and route tool will generally improve runtime on subsequent runs regardless of whether the source has changed, which explains why improvements in implementation runtime can be seen for both the BLIS and traditional flows.
Figure 5 Runtime Results
Case 1 contained timing constraints which were used by the place and route tool and which explain the relatively larger implementation runtime. Since the design in Case 1 was smaller than Cases 2 and 3 and therefore the synthesis runtimes were faster across the board, the runtime advantage from using BLIS is not as dramatically evident. Case 2 and Case 3 show more clearly how the synthesis runtime can be greatly reduced when using the BLIS flow for incremental changes, as the runtime went from 10 minutes to 5 minutes and from 109 minutes to 54 minutes respectively when performing the incremental synthesis. Finally, the improvement in runtime for the original design in the BLIS flow (as compared to the original design with the traditional flow) may be explained by the fact that the design was completely flattened in the traditional flow, and only partially flattened in the BLIS flow due to the blocks denoted as block roots. Any preservation of hierarchy can lead to improved runtimes at the expense of further cross-boundary optimization.
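The synthesis runtimes quoted above for Cases 2 and 3 can be expressed as a percentage reduction. The short Python calculation below uses only the minute figures given in the text; the function name is an illustrative choice.

```python
def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Percentage by which runtime dropped from 'before' to 'after'."""
    return 100.0 * (before_minutes - after_minutes) / before_minutes


case2 = percent_reduction(10, 5)     # Case 2: 10 minutes -> 5 minutes
case3 = percent_reduction(109, 54)   # Case 3: 109 minutes -> 54 minutes

print(f"Case 2: {case2:.1f}%")  # Case 2: 50.0%
print(f"Case 3: {case3:.1f}%")  # Case 3: 50.5%
```

In both cases the incremental synthesis runtime was roughly halved.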
4.2 Guide Improvements
Guide success rates also improved when using the BLIS flow as opposed to the traditional incremental synthesis flow. This can be attributed to the increase in net and component name matches between the original placed and routed design and the incrementally modified version of the design. Since unchanged blocks of the design were not re-synthesized, the netlists remain untouched and thus identical to the original version. Even if there were no logic changes in the source, the mere fact that the block is re-synthesized can lead to net and component names being changed in the final netlist.
Figure 6 Guide Results
4.3 Flattening vs. Preserving Hierarchy
It was generally seen that the benefits of using BLIS were significantly enhanced when using an overall design-flattening methodology for the portions of the design which were static. On designs where the hierarchy was preserved on all modules, BLIS did not noticeably improve the guide success rates, which were quite good both with and without BLIS. In these cases, however, the portions of the design which were static were not benefiting from cross-boundary optimization. Using BLIS on the modified portions of the design and flattening the rest proved to give the best overall result with respect to runtime improvements, timing predictability, and design optimization.
4.4 Effect of Design Size on Benefits of BLIS
The designs used for testing were nowhere near the multi-million-gate densities that are becoming more and more feasible with FPGAs. It is quite conceivable that the benefits of using the BLIS flow will be considerably more significant on these much larger designs.
5.0 Conclusion
In summary, it can be seen through statistical data that use of the FE/FCII Block Level Incremental Synthesis flow can significantly enhance the productivity of the designer through runtime improvements and placement and timing consistency in the implemented design. This flow provides definite advantages over earlier versions of FPGA Express/FPGA Compiler II software (pre-3.4) which did not provide an intelligent incremental or guided synthesis flow. While this flow does require some planning on the part of the designer to denote ahead of time which blocks in the design should be considered "block roots," this paper has demonstrated that doing so will provide benefits of improved runtime and timing predictability as incremental changes are made.