The Programmable Logic Proving Ground

PLDs are an Increasingly Useful Component of System Design Throughout a Product's Lifetime

By Tom Troksa, Networking Processor Architect, Packet Engines (part of Alcatel); Steve Dabell, Networking Processor Architect, Packet Engines (part of Alcatel); Martin S. Won, (mwon@altera.com) Member of Technical Staff, Altera Corporation

Introduction
System designers are faced with larger, more complex projects and less time to complete them. Completing a system design today is not simply seeing a single vision through to completion; it is an event representing the successful convergence of several changing technologies. Programmable logic devices (PLDs) have emerged as useful tools in dealing with these issues. PLDs are familiar to many designers as interface or "glue" logic, but the rapid rise in PLD size and features now makes it desirable to implement large subsystems within a single chip.

The decision to use PLDs is accompanied by other choices, such as which device to use, how to integrate it into existing design flows, and where to include it in your development cycle. This article focuses on these aspects of PLD use through the example of a recent network processing device, Argus, developed by Gigabit Ethernet provider Packet Engines. Argus is a core device in Packet Engines' PowerRail 5200 enterprise routing switch. The PowerRail 5200 switch (shown in Figure 1) is a wire-speed router designed for the core of enterprise networks. Routing switches of this kind, which perform routing functions in custom ICs, are replacing traditional routers.

Figure 1: Packet Engines PowerRail 5200 Gigabit Ethernet Routing Switch

A brief description of Argus will assist in understanding the issues covered in this article. Argus is a PowerPC-compliant system controller for networking products. It was architected to provide massive amounts of system bandwidth between the PowerPC processor and the Gigabit Ethernet switch fabric. In a stand-alone configuration, Argus provides a PowerPC 603-, 604-, 740-, and 750-compliant data transfer engine that is capable of supporting 6 Gbit/sec memory bandwidth, a 2 Gbit/sec DMA receive channel, a 2 Gbit/sec DMA transmit channel, an industry-compliant I2C interface, and a 32-bit local bus. When coupled to a local bus controller, it can control an entire computer system including flash memory, PCMCIA cards, a modem interface, and an RS-232 serial monitor port. Argus integrates two independent synchronous DRAM controllers, a receive DMA channel, a transmit DMA channel, a 32-bit local bus, an I2C controller, an interrupt controller, and a system configuration controller. An internal multi-master/multi-slave parallel bus structure provides simultaneous connection between six independent execution units, supporting up to seven concurrent transactions. Figure 2 shows a block diagram of Argus.

Figure 2: Argus Block Diagram

Although Packet Engines' decision to start with its big switch required more complex development initially, we believed it would be more practical to scale its architecture down than to scale it up. The PowerRail 5200 routing switch is the industry's first wire-speed routing switch of such a scale. Therefore, as development of the PowerRail 5200 routing switch proceeded, it was necessary to prove its capabilities to early adopters. Packet Engines decided to use a PLD as the initial implementation technology for the Argus device to gain the confidence of the market and initiate the sales cycle ahead of the full production release of Argus.

When to Use Programmable Logic
The typical design timeline includes stages such as prototyping, initial manufacturing, and full-scale production, and there are good reasons to use PLDs in each. These reasons are generally most compelling early on, but advances in PLD capabilities are making it more attractive to introduce PLDs and keep them in a design for more of its lifetime. For example, the prototyping stage has always been a home for PLDs. Initial manufacturing is also a well-accepted use for PLDs, leading to early market introduction. Finally, falling prices make full-scale production using PLDs an increasingly viable option for large designs. The decision to use PLDs in each of these stages should be based upon an analysis that considers gate density, pin density, system performance, time to market, unit cost, NRE cost, and development risk.

In the case of Argus, use of PLDs made the most sense during the prototyping and initial manufacturing stages, with a transition to a gate array for full-scale production. The decision to use a gate array was based on the lower cost of a custom device in the density range (roughly 100K gates) that could accommodate Argus and its performance requirements. For lower density designs that achieve their speed goals in PLDs, it makes sense to perform a cost analysis to determine if a custom device is necessary. At today's PLD volume prices, many designs in the 50K-gate range could warrant full-scale production with PLDs alone.
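The unit-cost and NRE part of such a cost analysis can be sketched as a simple break-even calculation. The Python below is a minimal illustration, not the analysis performed for Argus: the dollar figures are hypothetical, the function name is our own, and the model deliberately ignores performance, time to market, and development risk.

```python
import math


def break_even_units(nre_cost, pld_unit_cost, gate_array_unit_cost):
    """Smallest volume at which the gate array's total cost drops below the PLD's.

    PLD total        = pld_unit_cost * n                  (no NRE charge assumed)
    Gate-array total = nre_cost + gate_array_unit_cost * n
    """
    saving_per_unit = pld_unit_cost - gate_array_unit_cost
    if saving_per_unit <= 0:
        return None  # the gate array never pays back its NRE
    return math.floor(nre_cost / saving_per_unit) + 1


# Hypothetical figures for illustration only; not taken from the Argus project.
NRE = 100_000.0
PLD_UNIT = 150.0
GATE_ARRAY_UNIT = 50.0

units = break_even_units(NRE, PLD_UNIT, GATE_ARRAY_UNIT)
print(f"Gate array becomes cheaper at {units} units")
```

Below the break-even volume the PLD is the cheaper choice on these two factors alone, which is why lower-density, lower-volume designs can stay with PLDs through full-scale production.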

Which Device to Use?
This choice should focus on the device characteristics and your design, but other issues are also relevant, such as PLD tools and how they integrate into your design flow (discussed later in the article). Initially, the most useful numbers are the design size (in logic gates, memory bits, and pin count) and the performance characteristics of a specific architecture. Previous experience with a device family is the best resource for estimating how your design will fit into a PLD, but in the absence of this, PLD vendors offer references for determining device capacity. In our situation, we had previous experience with different families of PLDs from the leading manufacturers and had a good idea of what to expect from each.

In some cases, specific device features like on-board memory will further steer the decision. Many high-density PLDs offer ways to implement on-board memory; the two prevalent schemes are embedded (in which the device includes dedicated memory structures) and distributed (in which logic resources are converted into memory resources). Each of these implementations has advantages and disadvantages. For Argus, we required two 256 x 64 memories (single-port RAMs) to interface between the DMA engines internal to Argus and our external switch fabric at 41 MHz. For memories of this size and speed, distributed memory was too slow and resource-inefficient, so we favored embedded memory.

The next consideration is pin count. There are no hard and fast rules for choosing the right pin count; some designers prefer a buffer of 5% to 10% extra I/O pins to accommodate changes and modifications. The Argus design required 450 I/O pins, and we estimated that the design logic would take roughly 100K gate-array gates along with 4 Kbytes of internal single-port memory. Among the embedded-memory PLDs, the EPF10K130 in Altera's FLEX 10K family seemed like a good fit. The EPF10K130 offers up to 130,000 usable gates, up to 32K RAM bits, and 470 I/O pins. We were also attracted to FLEX 10K's pin compatibility between different family members. Several PLDs offer this capability, which is also useful in planning future versions. Our plans to upgrade Argus required more logic and memory resources while using the same I/O. We planned to use an EPF10K250, with nearly twice the logic and memory resources of the EPF10K130 in the same pinout and packages.
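The fit check described above can be written down as a few comparisons. The following Python is a minimal sketch using the numbers quoted in this section; the class and function names are our own, and we assume that "32K RAM bits" means 32 x 1024 bits and that the "up to" gate figure is the usable upper bound.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    usable_gates: int
    ram_bits: int
    io_pins: int


@dataclass
class Design:
    gates: int
    ram_bits: int
    io_pins: int


def fits(design, device):
    """True if the design's estimated needs are within the device's resources."""
    return (design.gates <= device.usable_gates
            and design.ram_bits <= device.ram_bits
            and design.io_pins <= device.io_pins)


def spare_io_fraction(design, device):
    """Spare I/O pins as a fraction of the pins the design requires."""
    return (device.io_pins - design.io_pins) / design.io_pins


# Device figures as quoted in the text (32K bits assumed to be 32 * 1024).
EPF10K130 = Device("EPF10K130", usable_gates=130_000, ram_bits=32 * 1024, io_pins=470)

# Argus estimates: ~100K gates, two 256 x 64 single-port RAMs, 450 I/O pins.
ARGUS = Design(gates=100_000, ram_bits=2 * 256 * 64, io_pins=450)

print("Fits:", fits(ARGUS, EPF10K130))
print(f"Memory needed: {ARGUS.ram_bits} bits ({ARGUS.ram_bits // 8} bytes)")
print(f"Spare I/O: {spare_io_fraction(ARGUS, EPF10K130):.1%}")
```

With these inputs the two 256 x 64 memories consume all 32,768 RAM bits, and the 20 spare pins amount to a little under the 5% buffer some designers prefer, so the device fits with essentially no margin on memory or I/O.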

Integration Into Your Design Flow
Before you choose a PLD, understand how it will fit into your design flow. For most devices, the exact details of integrating a PLD into your design flow will vary depending on the PLD company and possibly on the family of device. To describe all the possible variations could be the subject of a separate full-length article, so instead we will focus on our experience with Argus.

The block diagram in Figure 3 shows our design flow. It begins with capturing the hierarchical design using an HDL (in our case, Verilog). The design synthesis stage follows (we use Design Compiler, FPGA Compiler, and FPGA Express from Synopsys), which yields a gate-level representation of the design. The PLD place-and-route phase comes next; it is similar to typical gate-array development with some minor differences. PLDs (especially the high-density PLDs we used) require place-and-route tools that are provided by the PLD vendor. These tools can typically be used either alone to develop PLD designs or together with gate-array design tools.

Figure 3

In our case, the Synopsys tool is directed to produce a netlist for processing by the PLD tools. During synthesis, we created a set of scripts, with a specific script associated with each Verilog file. A bottom-up strategy produces gate representations of each synthesizable leaf-level module. The gate-level output files produced by the leaf-level synthesis scripts are stored in a common directory. In the bottom-up strategy, the gate-level design files are connected (using scripts within the Synopsys environment) at higher levels of the design hierarchy until a top-level design file (representing the entire design structure as a hierarchical gate-level netlist) is created. This design file is output from the Synopsys environment as an EDIF hierarchical netlist, which is then provided to the PLD place-and-route tools.
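The ordering implied by a bottom-up strategy (leaf modules first, each parent only after all of its children) is a post-order walk of the design hierarchy. The Python below is a minimal sketch of that ordering only; the module names are hypothetical, and it does not reproduce the actual Synopsys scripts or commands.

```python
def bottom_up_order(hierarchy, top):
    """Return modules so that every child appears before its parent."""
    order = []
    seen = set()

    def visit(module):
        if module in seen:
            return  # a module instantiated in several places is processed once
        seen.add(module)
        for child in hierarchy.get(module, []):
            visit(child)
        order.append(module)  # appended only after all children are done

    visit(top)
    return order


# Hypothetical hierarchy for illustration; not the real Argus module names.
HIERARCHY = {
    "argus_top": ["ppc_interface", "memory_subsystem", "dma_subsystem"],
    "memory_subsystem": ["sdram_ctrl_0", "sdram_ctrl_1"],
    "dma_subsystem": ["dma_rx", "dma_tx"],
}

order = bottom_up_order(HIERARCHY, "argus_top")
for step, module in enumerate(order, start=1):
    kind = "link children" if module in HIERARCHY else "synthesize leaf"
    print(f"{step}. {module}: {kind}")
```

Leaf modules correspond to the per-file synthesis scripts, and the intermediate and top-level entries correspond to the steps that connect already-synthesized gate-level files into the hierarchical netlist.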

In our flow, the place-and-route tool is Altera's MAX+PLUS II. Hierarchical EDIF is useful because it allows the designer to manage the timing/area requirements with constraints assigned to any module within the hierarchy and at any hierarchical level. MAX+PLUS II provides a design hierarchy viewer/editor that serves this purpose well. Logic assignments that carry over to MAX+PLUS II can also be made within the Synopsys tool environment (in the case of Design Compiler, this requires entering commands at the dc_shell prompt).

Generally, the PLD vendor's tools provide more accurate post-route timing information than the pre-route estimates from the logic synthesis tool. Although this post-route information can be imported into the Synopsys static timing tool for static timing analysis, we used the MAX+PLUS II static timing analyzer. To check the functionality of the design, we exported a Verilog file of the compiled design from MAX+PLUS II into Cadence's gate-level simulator, Verilog-XL.

Board Layout and Hardware/Software Codesign
The design of Argus was conducted in parallel with the development of the PCB and the embedded software. Although PLDs are designed for flexibility, a designer can often reap benefits from intelligent placement of I/O pins depending upon the device architecture and the needs of the design. In our case, we knew that we were using 100% of the pins in the device and would have no opportunity to change the pinout after the PCB was completed. Accordingly, we identified I/O buses that required the most stringent timing and placed them on pins that corresponded to "rows" in the FLEX 10K device. This placement is based on the device interconnect, which utilizes rows and columns. A simple observation of the FLEX 10K architecture reveals that more I/O pins and logic resources are associated with a given row than with any given column, and that dedicated logic resources (called carry and cascade chains) communicate along rows. For applications in which buses of data are passed through several stages of processing, it makes sense to orient these signals along rows. With this I/O placement, we were able to lay the board out before synthesizing the Verilog. The 20 I/O pins not required by the design itself were later brought out to probe points for diagnostic use.

An incremental release strategy for the PLD design allowed for early hardware/software integration using subsets of the final design. We planned four prototyping releases of Argus, the fourth being the first production release. The first release, which took five weeks from specification to completion, contained the PowerPC interface, memory controllers, and the bus fabric. This release allowed our software developers to get to work with their PowerPC emulator and test routines for transactions between the PowerPC and its DRAM.

In the second release (one week later), we added the I/O bus, which gave Argus access to data sources in the system (i.e., UART, Flash memory, and a PCMCIA port). For this release, the software team developed code for the PowerPC to talk to the data sources (for example, booting off the Flash or the PCMCIA port and loading the corresponding instruction sets into the DRAM). Since the PowerPC-to-DRAM routines had already been established, the software team could focus on dealing with the new data sources. The third release came two weeks later; it included the DMA engine and the single-port RAMs for communicating between the external switch fabric and the PowerPC. The fourth release followed two weeks thereafter, in which we made minor enhancements and began intense software testing.
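Adding up the intervals above gives the overall prototyping timeline. The short Python sketch below does that arithmetic, assuming the stated intervals run back to back and that the first five weeks are counted from the start of specification; the release labels are paraphrased from the text.

```python
from itertools import accumulate

# (release contents, weeks since the previous milestone), as described in the text.
RELEASES = [
    ("Release 1: PowerPC interface, memory controllers, bus fabric", 5),
    ("Release 2: I/O bus", 1),
    ("Release 3: DMA engine and single-port RAMs", 2),
    ("Release 4: minor enhancements, first production release", 2),
]

# Running total of weeks elapsed at each release.
elapsed = list(accumulate(weeks for _, weeks in RELEASES))

for (label, _), week in zip(RELEASES, elapsed):
    print(f"Week {week:2d}: {label}")
```

Under these assumptions the four releases land at weeks 5, 6, 8, and 10, so the software team had working hardware to test against for roughly half of the ten-week period.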

Following the release of the PLD design, we started the gate-array retargeting. The difference between the PLD design and the gate-array design was the structure of the DMA engines' single-port memory. To ease this process, the hierarchical design isolated the interface to the memory. The resulting design changeover was smooth, although the timing of this portion of the design was scrutinized during Verilog simulation and testing of the gate-array design.

In-System Testing and Initial Production
PLDs provide the ability to test real hardware under actual operating conditions. This usage proves the design and verifies some of the more difficult-to-simulate aspects, such as system timing. This testing can be performed in parallel with the completion of the gate-array design so that design changes can be made before building the gate-array samples. While we were retargeting the Verilog towards the gate-array version, we subjected the board equipped with the PLD version to billions of Ethernet packets. The ability to stimulate the design on the board, using software executing on a physical PowerPC, with visibility into the PLD through a probe bus on spare I/O pins, provided a powerful verification platform.

Regarding PLD compilation times: as with gate arrays, compilation times vary with size and complexity, and are a factor in design cycle efficiency. With Argus, early releases took about 20 minutes (using MAX+PLUS II on a Sun UltraSPARC 2-based workstation) to compile. Final releases required up to six hours. The final PLD version of Argus occupied 82% of the logic and all of the memory resources of the device. Regarding gate counts, Altera states that the EPF10K130 provides 82K to 211K gates, depending on how the logic and memory are used. For comparison, the gate-array version of Argus required 95K "gate-array" gates and two 2-Kbyte single-port RAMs. So, by the measure of Argus, the logic cells in an EPF10K130 provided ~115K gates, and the EABs provided 4 Kbytes of memory.
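The gate-equivalence estimate above follows from scaling the gate-array gate count by the PLD's logic utilization. The Python below is a minimal sketch of that arithmetic, assuming the 82% of PLD logic implements the same function as the 95K gate-array gates and that capacity scales linearly; the function name is our own.

```python
def implied_full_device_gates(gate_array_gates, logic_utilization):
    """Gate-array gates the whole device would hold at 100% logic utilization."""
    return gate_array_gates / logic_utilization


GATE_ARRAY_GATES = 95_000   # gate-array version of Argus
LOGIC_UTILIZATION = 0.82    # share of EPF10K130 logic used by the PLD version

implied_gates = implied_full_device_gates(GATE_ARRAY_GATES, LOGIC_UTILIZATION)

# Two 2-Kbyte single-port RAMs, all held in the embedded array blocks (EABs).
ram_bytes = 2 * 2 * 1024
ram_bits = ram_bytes * 8

print(f"Implied logic capacity: about {implied_gates / 1000:.1f}K gate-array gates")
print(f"Embedded memory used: {ram_bytes} bytes ({ram_bits} bits)")
```

The result lands between 115K and 116K gates, in line with the ~115K figure and within the 82K to 211K range quoted for the device, and the memory total matches the 4 Kbytes (32K bits) of embedded RAM.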

Conclusions
Early releases of a product using PLDs can afford months of market penetration that are impossible with a gate-array-only strategy. We estimate that using a PLD netted an additional 2-3 months of market presence that, in a gate-array-only design, would likely have been spent on additional software simulation. The PLD vehicle also provides a contingency position if the first gate-array silicon release is delayed. Coming generations of PLDs will offer even more benefits to system design, while maintaining compatibility with existing devices. A future version of Argus, for example, will likely use a FLEX 10KE device, which offers a higher memory-to-logic ratio than existing FLEX 10K devices, allowing it to support deeper transmit and receive buffers.

With today's prices and product cycles, it makes sense for many 50K-gate designs with production runs in the thousands or fewer to consider using PLDs throughout the product's lifetime. In the near future, the size of design that can remain with PLDs will quickly grow to the 100K-gate range and beyond.


Copyright © 2003 ChipCenter-QuestLink