Incorporating Phase-Locked Loop Technology into Programmable Logic Devices
By Greg Steinke, Components Applications Engineering Manager, Altera Corporation
As higher density programmable logic devices (PLDs) become available, on-chip clock distribution becomes more important to the integrity and performance of the designs implemented in these devices. The impact of clock skew and delay becomes substantial in high density PLDs, exactly as in gate array and custom chip implementations. Existing solutions for this problem, such as hardwired clock trees, are less effective for the high density PLDs that are being released in today's programmable logic market. One recent solution to this problem is the incorporation of phase-locked loop (PLL) structures into the PLDs themselves. The PLL can then be used along with a balanced clock tree to minimize skew and delay across the device.
An additional benefit of a PLL is the ability to multiply the incoming device clock. Gate array and custom chip designers have found clock multiplication very useful in their designs; a common example is in microprocessors where a 100-MHz processor may be fed by a 50-MHz clock, which is doubled in the processor. This technique allows easier board design, as the clock path on the board does not have to distribute a high-speed signal.
This paper describes how to use an on-board PLL to perform these functions in Altera's FLEX 10K and MAX 7000S devices. Specific design examples of how to reduce clock skew and perform clock multiplication are given, including schematics, VHDL, and Verilog. Other considerations, such as timing and board layout, are also addressed.
ClockLock and ClockBoost Features in FLEX 10K and MAX 7000S
In the FLEX 10K device family, the ClockLock circuit locks onto the incoming clock, minimizing clock delay. The ClockBoost circuit can be engaged to multiply the incoming clock by two. Whether or not the clock is multiplied, the clock delay is reduced, improving clock-to-output and setup times. In the MAX 7000S device family, the clock delay is already quite low. Therefore, the ClockLock circuitry does not further reduce clock delays. The advantage of the ClockLock circuit in MAX 7000S is the ClockBoost circuit; in MAX 7000S, the ClockBoost circuit is always engaged when ClockLock is used. The ClockBoost circuit can multiply the incoming clock by two, three, or four in MAX 7000S devices.
Specifying ClockLock and ClockBoost Usage in MAX+PLUS II
The CLKLOCK primitive is parameterized to allow the user to specify the operating conditions. There are two parameters associated with the CLKLOCK primitive: INPUT_FREQUENCY and CLOCKBOOST.
The INPUT_FREQUENCY parameter tells MAX+PLUS II at what frequency this circuit will be clocked. Based on the INPUT_FREQUENCY parameter, MAX+PLUS II sets RAM bits in the configuration bitstream that tune the PLL in the ClockLock circuit to respond to the appropriate frequency. If the circuit is then clocked at a different frequency, the ClockLock circuit may not meet its specifications, or may not function correctly. The CLOCKBOOST parameter sets the clock multiplication factor. Depending on the device chosen, the CLOCKBOOST parameter can be set to 1, 2, 3, or 4. For instance, if CLOCKBOOST is set to 2, then the incoming clock will be multiplied by two. The CLKLOCK primitive can be used in MAX+PLUS II schematic designs, AHDL designs, or in a third-party tool. When creating a schematic design, the engineer will use the CLKLOCK symbol provided in MAX+PLUS II. Figure 1 shows an example of a schematic instantiation of the CLKLOCK symbol.
Figure 1. Schematic Instantiation of CLKLOCK primitive
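The relationship between the two CLKLOCK parameters can be sketched in a few lines of Python. This is only an illustration of the rules stated in this paper, not part of MAX+PLUS II: the function name, the family strings, and the table of allowed factors are assumptions drawn from the description above (FLEX 10K multiplying by one or two, MAX 7000S by two, three, or four).

```python
# Allowed CLOCKBOOST factors per family, as described in this paper (assumed table).
ALLOWED_FACTORS = {
    "FLEX 10K": (1, 2),      # ClockLock alone, or ClockBoost x2
    "MAX 7000S": (2, 3, 4),  # ClockBoost is always engaged with ClockLock
}


def output_frequency_mhz(family, input_frequency_mhz, clockboost):
    """Return the internal clock frequency for a given CLKLOCK configuration."""
    allowed = ALLOWED_FACTORS[family]
    if clockboost not in allowed:
        raise ValueError(f"{family} does not support CLOCKBOOST={clockboost}")
    return input_frequency_mhz * clockboost


# A 40-MHz pin clock doubled inside a FLEX 10K device.
print(output_frequency_mhz("FLEX 10K", 40, 2))  # prints 80
```

The check mirrors the point made above: the parameters describe the conditions the circuit is tuned for, so an unsupported factor is rejected rather than silently accepted.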
The CLKLOCK primitive can also be used in an AHDL design. Figure 2 shows an example of an AHDL instantiation of the CLKLOCK primitive.
Figure 2. AHDL Instantiation of CLKLOCK primitive
Subdesign TDM
(ClockX1, D : Input;
Q: Output;)
Variable
Begin
The CLKLOCK primitive can be used with a VHDL design as well. Version 7.0 of MAX+PLUS II supports instantiation of VHDL components with a GENERIC MAP clause. This GENERIC MAP clause is used to specify the expected input frequency and ClockBoost factor. Figure 3 shows an example of instantiating the CLKLOCK primitive in a VHDL design. This technique works in MAX+PLUS II VHDL, Cadence Synergy, and Mentor AutoLogic. A similar technique works with Verilog designs in Cadence Synergy.
Figure 3. VHDL Instantiation of CLKLOCK primitive
ENTITY design IS
END design;
ARCHITECTURE structure OF design IS
SIGNAL locked_clock : STD_LOGIC;
BEGIN
-- Through the rest of the design, locked_clock will be used for the clock
For designs created using Synopsys or Viewlogic ViewSynthesis tools, Altera provides a utility called gencklk. Using gencklk, a user can generate a black box which represents the ClockLock or ClockBoost circuit. This black box is instantiated into VHDL or Verilog HDL code. When MAX+PLUS II reads the resulting EDIF file, it interprets the name of the black box to turn on the ClockLock circuit with the appropriate parameters. Gencklk also generates a simulation model of the ClockLock circuit for pre-synthesis simulation.
When using gencklk, the user will enter the expected input frequency and the ClockBoost factor. The user also specifies the format for the black box and models: Verilog HDL, VHDL, or Viewlogic VHDL. Gencklk will then create the black box for instantiation and the appropriate simulation models. Figure 4 shows an example of instantiating a gencklk-generated model into VHDL code.
Figure 4. VHDL Instantiation of gencklk-generated Model
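LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
ENTITY design IS
    PORT (clkin : IN STD_LOGIC);  -- other ports of the design are not shown
END design;
ARCHITECTURE structure OF design IS
    SIGNAL locked_clock : STD_LOGIC;
    COMPONENT clklock_2_40 -- Name is generated by gencklk
        -- Port names follow the port map below; directions are inferred from the names
        PORT (inclk : IN STD_LOGIC; outclk : OUT STD_LOGIC);
    END COMPONENT;
BEGIN
    u1 : clklock_2_40 PORT MAP (inclk => clkin, outclk => locked_clock);
    -- locked_clock will be used as the clock through the rest of the design
END structure;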
Figure 5 shows an example of instantiating a gencklk-generated model into Verilog HDL code.
Figure 5. Verilog HDL Instantiation of gencklk-generated Model
module design (clkin,...);
clklock_2_40 u1 (.inclk(clkin), .outclk(locked_clock));
Finally, Altera has created schematic symbols for the CLKLOCK primitive for use with Viewlogic ViewDraw, Cadence Concept, and Mentor Design Architect. These symbols are included with MAX+PLUS II. For more details on using the ClockLock and ClockBoost circuits with a third-party tool, consult the MAX+PLUS II Software Interface Guide for that particular tool.
Details of ClockLock Usage
The clock pin that drives the ClockLock circuit may not drive any other logic in addition to the ClockLock circuit. In most cases, this will not present a problem: the user will want all registers to be clocked with the ClockLock-generated clock, and the ClockLock-generated clock will not drive logic. However, if the ClockBoost feature is used to clock some registers in the design, but not all, then the user may want the same clock to provide both a multiplied and a non-multiplied clock throughout the design. In this case, the user will drive the clock signal into the device on two pins: one pin will drive the ClockBoost circuit, and the other will drive the non-multiplied clock signal.
Finally, the ClockLock circuit locks only onto the rising edge of the incoming clock. The rising edge of the ClockLock circuit's output must be used throughout the design.
Figure 6 shows examples of illegal ClockLock and ClockBoost configurations.
Figure 6. Illegal Uses of ClockLock and ClockBoost
Figure 7 shows how to successfully use a multiplied and non-multiplied version of the same clock within a design.
Figure 7. Using Multiplied and Non-Multiplied Clocks in the Same Design
Timing Analysis
In FLEX 10K devices, when using the ClockLock or ClockBoost, the clock delay will be reduced. Additionally, the skew (difference in delay to different points in the device) will be eliminated. The Timing Analyzer in MAX+PLUS II will show these changes. There are three modes in the Timing Analyzer:
Delay Matrix
When using a PLL to reduce clock delay, a negative clock-to-output delay is possible. However, Altera has designed the ClockLock circuit to ensure that the clock-to-output delay is always positive. In fact, a minimum output data hold time is specified in the data sheet.
Setup/Hold Matrix
TSU = TDATA + TREG_SU - TCLOCK
TDATA is the data delay, TREG_SU is the setup time of the register, and TCLOCK is the clock delay. The ClockLock circuit reduces clock delay. Due to the reduced clock delay, the setup time at the pin is increased. To minimize setup time when using the ClockLock circuit, the designer can use the I/O registers to register the input.
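A small numeric sketch may make the effect of the equation concrete. The delay values below are hypothetical and chosen only for illustration; they are not taken from any data sheet. Only the formula itself comes from the text.

```python
def pin_setup_time_ns(t_data, t_reg_su, t_clock):
    """TSU = TDATA + TREG_SU - TCLOCK, with all values in nanoseconds."""
    return t_data + t_reg_su - t_clock


# Hypothetical delays: the only change is the shorter clock delay with ClockLock.
without_clocklock = pin_setup_time_ns(t_data=6.0, t_reg_su=1.0, t_clock=4.0)
with_clocklock = pin_setup_time_ns(t_data=6.0, t_reg_su=1.0, t_clock=1.0)

print(without_clocklock, with_clocklock)
```

Because TCLOCK is subtracted, shortening the clock path raises the setup time at the pin by the same amount, which is why registering the input in an I/O register (reducing TDATA) is suggested above.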
Registered Performance
When using the ClockBoost circuit in a MAX 7000S device, clock delay is unchanged. Therefore, the Delay Matrix and Setup/Hold Matrix results will be unchanged. However, in FLEX 10K and MAX 7000S, the Registered Performance result will change. The Registered Performance analysis will report the speed of the multiplied clock; the user can divide this by the ClockBoost factor to find the maximum speed at which the pin can be clocked.
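The division described above can be written out as a one-line helper. The function name and the 80-MHz figure are illustrative assumptions, not values reported by the Timing Analyzer for any particular design.

```python
def max_pin_clock_mhz(registered_performance_mhz, clockboost):
    """Convert the reported multiplied-clock speed to the maximum pin clock rate."""
    return registered_performance_mhz / clockboost


# Hypothetical report: 80 MHz internal performance with a ClockBoost factor of 2.
print(max_pin_clock_mhz(80.0, 2))  # prints 40.0
```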
If some registers in a design are clocked by the multiplied clock, and some are clocked by the non-multiplied clock, the Timing Analyzer will not compute the maximum performance of registers bridging between the multiplied and non-multiplied domains. The Timing Analyzer cannot compute this performance because it does not know the relationship between the two clocks. The user can approach this in one of two ways. One method is to use the Delay Matrix to analyze the delays between registers. Another method is to use a third-party timing analysis tool which can analyze multi-clock systems.
Simulation
The MAX+PLUS II simulation model acts as a silicon PLL and must sample the incoming clock before lock-on. The model won't begin to generate clocks until it has sampled three incoming clocks. Before the model locks on, it will output a logic low signal. If the incoming clock changes frequency or otherwise violates the specifications, the model will lose lock. Once lock is lost, the model will output a logic low, and will not attempt to re-acquire lock. Typically, this behavior is not an issue; a designer will generally simulate with a stable clock.
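The lock-on rules described for the simulation model can be illustrated with a toy behavioral model in Python. This is a sketch of the stated rules only, not the MAX+PLUS II model itself: the class name, the use of rising-edge timestamps, and the tolerance value are all assumptions made for the example.

```python
class ClockLockModel:
    """Toy model: lock after three sampled clocks, never re-acquire once lost."""

    EDGES_TO_LOCK = 3  # "sampled three incoming clocks"

    def __init__(self, tolerance_ns=0.1):
        self.tolerance_ns = tolerance_ns  # hypothetical allowed period deviation
        self.edges_seen = 0
        self.last_edge_ns = None
        self.period_ns = None  # reference period, frozen once locked
        self.locked = False
        self.lock_lost = False

    def rising_edge(self, time_ns):
        """Register one rising edge; return True while the model is locked."""
        if self.lock_lost:
            return False  # output stays low, no attempt to re-acquire
        if self.last_edge_ns is not None:
            period = time_ns - self.last_edge_ns
            if self.locked:
                if abs(period - self.period_ns) > self.tolerance_ns:
                    self.locked = False
                    self.lock_lost = True
                    return False
            else:
                self.period_ns = period
        self.last_edge_ns = time_ns
        self.edges_seen += 1
        if self.edges_seen >= self.EDGES_TO_LOCK:
            self.locked = True
        return self.locked


model = ClockLockModel()
# A steady 25-ns clock, then one stretched 40-ns period, then steady again.
for t in (0, 25, 50, 75, 100, 140, 165, 190):
    print(t, model.rising_edge(t))
```

With a steady input the model reports lock from the third edge onward; the single stretched period drops lock, and the later steady edges do not restore it, which matches the behavior described above.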
When performing a functional simulation in MAX+PLUS II, timing effects are ignored and all delays are assumed to be zero. When used without clock multiplication, the ClockLock circuit affects only the timing of the circuit, not the functionality. Therefore, no difference will be seen in the Functional Simulator when using only the ClockLock circuit, other than the lock-on process. However, the Functional Simulator will simulate the operation of the ClockBoost circuit, as that circuit affects the functionality of the design.
A designer can simulate the operation of ClockLock and ClockBoost circuits using VHDL and Verilog simulators. To perform a pre-synthesis simulation before MAX+PLUS II compilation, a designer can use the netlist output of gencklk. During compilation, MAX+PLUS II generates VHDL and Verilog models of the ClockLock and ClockBoost circuits when they are used in the design. When used in a simulation, the models require three clock cycles to lock onto the incoming clock. Also, if the incoming clock changes frequency or otherwise violates the specifications, the model will lose lock. When the model is not locked, it will output a logic low.
The designer can also use a gate-level simulator to simulate the operation of the ClockLock and ClockBoost circuits. A VHDL or Verilog HDL simulator can be used in conjunction with a gate-level simulator to simulate the operation of the ClockLock and ClockBoost circuits. Mentor QuickSim and Viewlogic ViewSim simulators are supported via this technique. A gate-level simulator can simulate the operation of the circuit before or after MAX+PLUS II compilation. For more details on simulation in a third-party tool, consult the appropriate Software Interface Guide.
ClockLock Status
To monitor the LOCK signal, the user can use an option in MAX+PLUS II. The Enable LOCK Output Device Option turns on the LOCK signal. The Report File will indicate which pin is the LOCK pin. The data sheet also lists which pin is the LOCK pin on all FLEX 10K package types. The LOCK signal can then be externally monitored. For instance, an external circuit could reset the device whenever the LOCK signal goes low and then reasserts. The LOCK signal cannot be internally monitored; the internal logic will experience incorrect operation once lock is lost, and must be externally controlled.
FLEX devices are configured upon power-up. The ClockLock configuration information is near the beginning of the configuration data stream, so the ClockLock may lock onto the incoming clock while the rest of the device is configuring. If the system clock is applied to the CLK1 pin during configuration, the ClockLock circuit will be locked onto that clock before the FLEX 10K device finishes configuration.
MAX devices begin operation as soon as VCC reaches the operating level. When using the ClockLock circuit, the user's system should monitor the LOCK signal and reset the MAX device once the ClockLock circuit is locked to the incoming clock.
For FLEX or MAX devices, the user's circuit should monitor the LOCK signal. If the LOCK signal goes low, then anything in the device clocked by the ClockLock circuit may have been incorrectly clocked, resulting in erroneous results. For best results, the system should reset the Altera device after LOCK asserts again.
System Startup Issues
Figure 8. Control Signal
The most obvious way to generate this control signal is to use a toggle flip-flop driven by the multiplied clock. However, this may not always work; the control signal could be inverted from the system clock, resulting in system malfunctions. Another approach is to create a control circuit that is clocked by the 1x and 2x clocks; the output will not become active until both clocks are active.
The first approach uses two registers connected to asynchronously clear each other. When the non-multiplied clock clocks the first register, it goes high, driving the control signal high. When the multiplied clock clocks the second register, it goes high, since its D input is connected to the output of the first register. The second register's output is inverted and drives back into the CLRN input of the first register, clearing it. When the first register is cleared, it drives the control signal low. When the control signal is driven low, it asynchronously clears the second register, releasing the clear on the first register. The non-multiplied clock will restart the process when it clocks the first register. This approach will always give a control signal synchronized to the non-multiplied clock, even if the multiplied clock begins to clock before the non-multiplied clock. Additionally, if there is a glitch on either clock, the circuit will reset itself when the clocks become regular again. A disadvantage of this approach is that the clock-to-output delay for the control output from the multiplied clock is longer, because it goes through the clear input of the first flip-flop. Figure 9 shows this circuit.
Figure 9. Control Signal Circuit
Another approach uses a chain of registers. The first is a DFF clocked by the multiplied clock. This drives a DFF clocked by the non-multiplied clock, which drives a TFF clocked by the multiplied clock. The output of the TFF is the control signal. This circuit ensures that the control signal is not generated until both clocks are operating. The toggle input to the TFF will not be driven high until both registers have been clocked, meaning that both clocks are operating. A disadvantage of this approach is that if the multiplied clock has a glitch, the output of the toggle flip-flop will be inverted. Figure 10 shows this circuit. In an alternative implementation, the LOCK signal could drive the first flip-flop. If lock is lost, the control signal will stop toggling. Once lock is regained, the control signal will restart.
Figure 10. Control Signal Circuit
A final approach is for external logic to synchronously clear the 1x-clocked and the 2x-clocked systems once LOCK has asserted, showing that the ClockBoost circuit has locked onto the incoming clock.
Multi-clock System Issues
There are two cases to consider:
Case 1
Figure 11. Clock Skew Example
In this case, the maximum frequency possible between the two registers will be slowed. The effective TCO of the source register is increased due to the increased clock delay. In this example, the clock cycle time is computed with the following equation:
tCYCLE = (tDELAY1 - tDELAY2) + tCO + tDATA + tSU
For the example shown in Figure 11, the minimum cycle time without clock skew is 20 ns. The skew between the two clocks raises this cycle time to 23 ns. The difference in clock delays decreases the maximum performance possible between the two registers. However, this will only impact system performance if the critical path for the system lies between the two registers. If this path is slowing system performance, the designer can speed it up with the usual techniques, such as pipelining, timing-driven synthesis, or cliquing.
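The cycle-time equation can be checked against the 20 ns and 23 ns figures quoted above. In this sketch only those two totals come from the text; the split of the 20 ns into tCO, tDATA, and tSU is hypothetical, and the 3 ns difference in clock delay is inferred from the two totals.

```python
def cycle_time_ns(t_delay1, t_delay2, t_co, t_data, t_su):
    """tCYCLE = (tDELAY1 - tDELAY2) + tCO + tDATA + tSU, all in nanoseconds."""
    return (t_delay1 - t_delay2) + t_co + t_data + t_su


# Hypothetical breakdown that sums to the 20 ns quoted in the text.
path = dict(t_co=2.0, t_data=15.0, t_su=3.0)

no_skew = cycle_time_ns(t_delay1=0.0, t_delay2=0.0, **path)
with_skew = cycle_time_ns(t_delay1=3.0, t_delay2=0.0, **path)

print(no_skew, with_skew)          # cycle times in ns
print(1000.0 / no_skew, 1000.0 / with_skew)  # corresponding frequencies in MHz
```

Under these assumptions the extra 3 ns of clock delay on the source register lowers the achievable register-to-register frequency from 50 MHz to roughly 43.5 MHz.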
The Altera-provided cycle-shared macrofunctions CSFIFO and CSDPRAM do not experience this slowdown. The critical path on those macrofunctions is not a case where the source register is clocked by the regular clock and the destination register is clocked by the ClockBoosted clock.
Case 2
Figure 12. Clock Skew Example
In this case, there is a possibility of a functional issue. If the sum of tCO, tLOGIC, and tH is less than the difference in the clock delays, then the new data from the source register will reach the destination register before the clock reaches the register. On FLEX 10K devices, the register tH is 0 ns. This case should be considered when the source register and the destination register are in the same LAB with no intervening logic cells. When both registers are in the same LAB, the sum of tCO and tDATA will be tCO + tSAMELAB + tLUT from the timing model. In the current -3 speed grade, this computes to 2.1 ns. The difference in delay between the two clock paths is about 3 ns. In this case, there is a potential for a functional problem. However, if there is another logic cell between the registers, it will introduce an additional 2.5 ns delay. Or, if the two registers are in different LABs, there will be an additional row delay of at least 2.5 ns. In either case, the delay is sufficient to ensure that the data delay exceeds the difference in clock delays, and the circuit will function as expected.
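The comparison in this case can be sketched numerically using the figures quoted above (2.1 ns for tCO + tSAMELAB + tLUT, about 3 ns of clock delay difference, and 2.5 ns for an extra logic cell or row delay). The function name is illustrative, and subtracting tH follows the conventional form of a hold check, which is an assumption here; with tH equal to 0 ns on FLEX 10K it makes no difference to the result.

```python
def hold_margin_ns(data_path_delay, clock_delay_difference, t_h=0.0):
    """Positive margin: data arrives after the delayed clock edge (safe)."""
    return data_path_delay - clock_delay_difference - t_h


SAME_LAB_PATH = 2.1      # tCO + tSAMELAB + tLUT, -3 speed grade (from the text)
CLOCK_DIFFERENCE = 3.0   # approximate difference between the two clock paths
EXTRA_DELAY = 2.5        # one more logic cell, or the minimum row delay

same_lab = hold_margin_ns(SAME_LAB_PATH, CLOCK_DIFFERENCE)
with_extra = hold_margin_ns(SAME_LAB_PATH + EXTRA_DELAY, CLOCK_DIFFERENCE)

print("same LAB, no logic:", "OK" if same_lab > 0 else "potential problem")
print("with extra delay:  ", "OK" if with_extra > 0 else "potential problem")
```

The first case yields a negative margin and the second a positive one, matching the conclusion that an intervening logic cell or a row delay is enough to make the circuit function as expected.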
In the Altera-provided cycle-shared macrofunctions CSFIFO and CSDPRAM, the delay path between registers clocked with the 2x clock and registers clocked with the 1x clock always exceeds the difference in the clock delays. Therefore, there is no possibility of a functional issue with clock delay differences with these macrofunctions.
The designer should use the MAX+PLUS II Timing Analyzer, or a third-party timing analyzer, to analyze the timing of the system to ensure that neither of these two cases will affect a design using the ClockBoost feature.
ClockLock and ClockBoost Specifications
Figure 13. ClockLock and ClockBoost Waveforms
Duty Cycle
Clock Deviation
Clock Stability
The ClockLock circuit is designed so that commercially available clock generators and oscillators can easily meet all requirements for the incoming clock. Commercially available clock generators and oscillators specify their precision in terms of parts-per-million, which far exceeds the requirements of the ClockLock circuit.
Lock Time
Jitter
Clock Delay
Board Layout
All devices with ClockLock circuitry have special VCC and GND pins which provide power to the ClockLock circuitry. The power and ground connected to these pins must be isolated from the power and ground to the rest of the Altera device, or to any other digital devices. These pins are named VCC_CKLK and GND_CKLK. There is one VCC_CKLK and one GND_CKLK pin on all Altera devices with ClockLock. The report file generated by MAX+PLUS II will show these pins. Also, the data sheet describing the device will show these pins.
Methods of isolating ClockLock power and ground include:
The designer of a mixed-signal system will have already partitioned the system into analog and digital sections, each with its own power and ground planes on the board. In this case, the VCC_CKLK and GND_CKLK pins can be connected to the analog power and ground planes. Most systems using Altera devices are fully digital, so there are usually no separate analog power and ground planes on the board. Adding two new planes to the board may be prohibitively expensive. Instead, the board designer can create islands for the power and ground. Figure 14 shows an example board layout with analog power and ground islands.
Figure 14. ClockLock Board Layout
The analog islands still need to be connected to power and ground. They can be connected to the digital power and ground through a lowpass power filter consisting of a capacitor and an inductor. Typically, ferrite inductors are used for power filtering. The ferrites act as shorts at DC, allowing power to drive the ClockLock circuit. The ferrites' impedance increases with frequency, filtering out high-frequency noise from the digital power and ground planes. The board designer should choose capacitance and inductance values for high impedance at frequencies of 50 MHz or higher. Figure 15 shows an example of power filtering between the digital and analog power planes.
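As a rough aid for choosing filter values, the corner frequency of an ideal LC low-pass filter can be estimated with the standard formula f = 1 / (2π√(LC)). The sketch below is a simplification: real ferrite inductors are lossy and frequency-dependent, so the manufacturer's impedance curves should be consulted. The 1 µH and 0.1 µF values are hypothetical and are not recommendations from this paper.

```python
import math


def lc_cutoff_hz(inductance_h, capacitance_f):
    """Corner frequency of an ideal LC low-pass filter, in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical component values: 1 uH inductor, 0.1 uF capacitor.
f_c = lc_cutoff_hz(1e-6, 0.1e-6)

print(f"corner frequency: {f_c / 1e3:.0f} kHz")
print("below the 50 MHz noise band:", f_c < 50e6)
```

A corner frequency well below 50 MHz means the filter presents a high impedance to noise in the range the text asks the designer to suppress, while still passing DC power to the ClockLock circuit.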
Figure 15. Isolating ClockLock Power
Due to board constraints, it may be impossible even to provide a separate power and ground island for the ClockLock circuit. In that case, the designer may run a trace from the power supply to the VCC_CKLK and GND_CKLK pins. This trace must be wider than a normal signal trace, and should be bypassed with a 0.2 µF capacitor as close to the VCC_CKLK and GND_CKLK pins as possible.
Conclusion
The MAX+PLUS II development software makes taking advantage of the ClockLock and ClockBoost features easy by providing an integrated solution for in-chip clock distribution. The combination of easy-to-use software and advanced on-chip clock management gives designers high performance at high densities. Design success with the ClockLock and ClockBoost circuits can be ensured by following the guidelines covered in this paper.
Copyright © 2003 ChipCenter-QuestLink