A Cost Effective Image Acquisition, Real-Time Processing and Display Architecture Using FPGA Technology
By John Smith, Principal Engineer, VisiCom (jsmith@visicom.com) and
Sheldon Liebman, Industry Advisor
Introduction
This paper will provide an overview of FPGA technology and describe how it is being
used in VisiCom's new VigraVision™ product family. The paper will also give an
example of an image processing function as it is implemented using FPGA technology
and describe how two companies have used earlier versions of this technology to create
or enhance OEM products.
Overview of FPGA Technology
Although the technology was originally developed as an alternative to PALs and used
for glue logic, there were early visionaries who perceived that the potential for FPGA
technology was much greater than that. Even in the early stages of FPGA development,
papers were published that suggested this technology could be used for complex
applications such as imaging.
VisiCom has made a strong commitment to this technology and has created an
environment to explore the full potential of FPGA technology. As a result of these efforts
and continued development of the chips by companies like Xilinx, FPGA chips can now
be used as a major computing component.
The key to FPGA technology is that it is reconfigurable. At any time, new software can
be loaded into the chip that completely changes its character and function. Although the
original FPGAs were relatively simple devices, this class of chip has grown in size and
complexity to the point that today, complex algorithms can be implemented using
FPGAs.
The programming tools for these products, however, have not advanced to the same
level as other, more mature technologies. As a result, creating software to run in an
FPGA environment requires a high level of skill. Developers create schematics or a
Hardware Description Language (HDL) representation of a design. Rather than being
built as a physical circuit, the design is then compiled into a bitstream that is loaded
into the chip.
Advancements in FPGA technology have allowed it to become a viable alternative to
other general purpose and specialized processors. FPGAs represent the next step in
computer design and control. When more power and specialization were required in the
personal computer and workstation markets, the introduction of RISC chips provided an
alternative to the traditional CISC architecture. For real-time computation, DSP
technology provided even more specialization and power. FPGAs continue to advance
this process. For many applications, the use of FPGAs offers a faster, less expensive
solution that is easier to upgrade as technology continues to move forward.
There are many advantages to using FPGA technology instead of DSPs in a real-time
imaging environment. In addition to higher speed and lower costs, the implementation
of an FPGA solution requires fewer chips on a board. This results in a smaller footprint
and a highly customizable product. In addition, FPGAs can be upgraded in the field
by simply sending new code to run the chips.
There are also a few negatives to consider when using this technology. As mentioned
above, the programming tools for FPGA chips are not as advanced as those of other
technologies. You can't just write a C++ program and have it magically translated into
FPGA code. This technology is also new to the areas of complex computing and
imaging. Offering an FPGA-based solution requires a significant amount of education of
the market and the customer. Especially in the area of image processing, the use of
FPGA technology continues to "push the envelope." This impacts both the development
and the adoption of FPGA-based products by traditional OEM customers.
In spite of the obstacles, VisiCom has been a pioneer in the development of FPGA-based
imaging solutions. VisiCom has been able to create and market imaging products that
use FPGA technology. A few years ago, the company introduced the Falcon-PCI™, a
high-speed capture, processing and display solution that included FPGA control logic.
For certain customers, real-time image processing functions were implemented in the
FPGA. The next level was achieved in 1996 with the introduction of the Falcon-XL™.
This product included expanded FPGA capability for grayscale image processing and
data acquisition.
Description of VigraVision
Figure 1 - VigraVision Block Diagram
The VigraVision product, as illustrated in Figure 1, consists of three basic areas representing the image acquisition, processing and display functions. A closer look at this diagram provides a complete overview of the product.
Image Acquisition
VigraVision also supports two digital input ports that can be used to connect with a variety of digital cameras. Currently, interfaces are available for the Kodak MEGAPLUS Model 1.4i, 1.6i, 4.2i and ES 1.0 high-resolution digital cameras and the Pulnix Model TM1000 digital camera. By reprogramming the FPGA, the product can adapt to custom cameras or custom digital signals at speeds up to 40 MHz. This capability allows images as large as 2048x2048 to be captured with the product.
Real-Time Image Processing
The FPGA provides very fast math capability for both image processing and control. In addition, the connection between the FPGA and the DRAM is fast enough to allow the DRAM to act as working memory for the image processing functions. Due to the nature of the FPGA, new code can be downloaded at any time to change the functions performed by the chip.
To assist in creating OEM programs for the VigraVision, VisiCom provides the VigraVision ToolBox™ (VTB). This high-level subroutine library contains a full set of image processing functions that can be used to develop imaging applications. The software developer selects whether the functions are implemented directly in the FPGAs or through other methods. VisiCom's RtX-Windows™ is also available for those requiring a real-time X Window interface under VxWorks. Xvideo extensions provide a convenient wrapper for VTB functions for RtX-Windows and Solaris OpenWindows. See Appendix A for a list of real-time functions implemented in FPGA.
Accelerated 2D/3D Graphics
The Virge is capable of displaying 135 million pixels per second at up to 1280x1024 resolution. It contains 4 MB of SGRAM for both image and overlay. One of the nice features about this memory is that the non-destructive overlay and the video display do not have to be programmed at the same bit depth. This means that an 8-bit overlay can be used with a 16- or 24-bit video display, greatly increasing the quality of the combined display. As with all VisiCom imaging products, the overlay is non-destructive. The display section of VigraVision also supports standard NTSC or PAL video output in composite or component (Y/C) format. This allows the output of the system to be fed back into a video display monitor. For some applications, the ability to do video in and video out allows for a more compact design.
Summary of Features
An Example of FPGA Programming
Figure 2 - A 3x3 Median Filter
Xilinx XC4000 FPGAs are organized as a rectangular array of Configurable Logic Blocks (CLBs). Each CLB contains two programmable 16x1 lookup tables (arbitrary 4-input function generators), two registers, and some dedicated high-speed arithmetic carry logic. In the case of the median filter, the high-speed carry logic is used to implement an efficient min/max function. The carry logic in each CLB is set up for an A-B subtract function, while the function generators are used to implement a 2:1 mux. The mux is controlled by the carry out of the subtraction. Nodes where both outputs are used may be implemented in 9 CLBs (8 for the mux, 1/2 for carry chain initialization, 1/2 for carry out examine). Nodes where one output is discarded require 5 CLBs. Figure 3 illustrates how a single bit of the min/max function is implemented in ½ of a CLB.
Figure 3 - Carry Logic
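The min/max node described above can be modeled in software. The following Python sketch is illustrative only: the function names are hypothetical, and the network shown (sort three columns, then combine) is one well-known way to build a 3x3 median from compare-exchange nodes, not necessarily the exact graph of Figure 2.

```python
def min_max(a, b):
    """Compare-exchange node: subtract, then let the borrow steer two 2:1 muxes."""
    borrow = (a - b) < 0       # stands in for the carry out of the A-B subtract
    lo = a if borrow else b    # mux selecting the smaller input
    hi = b if borrow else a    # mux selecting the larger input
    return lo, hi


def sort3(a, b, c):
    """Fully sort three pixels using three compare-exchange nodes."""
    a, b = min_max(a, b)
    b, c = min_max(b, c)
    a, b = min_max(a, b)
    return a, b, c


def median9(window):
    """Median of a 3x3 window given as 9 pixel values in row-major order."""
    # Sort each column of the window (the "full sort" groups).
    cols = [sort3(window[i], window[i + 3], window[i + 6]) for i in range(3)]
    lows, mids, highs = zip(*cols)
    # Only one output of each of these sorts is kept; in hardware the
    # unused outputs would be dropped to save logic.
    max_of_lows = sort3(*lows)[2]
    mid_of_mids = sort3(*mids)[1]
    min_of_highs = sort3(*highs)[0]
    return sort3(max_of_lows, mid_of_mids, min_of_highs)[1]


print(median9([12, 200, 7, 45, 45, 90, 3, 255, 60]))  # prints 45
```

In this model a column sort depends only on the three pixels of that column, so a sliding window would need to sort just the newly arrived column at each step.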
When the circuit is implemented, pipelining is used to speed up the circuit and reduce its size (see Figure 4). Clocking in three pixels at once eliminates two of the three full sort node groups at the top of the graph. Total CLB usage for the real-time median sort circuit is 85 CLBs, less than one sixth of a Xilinx XC4013E chip. For real-time performance, this circuit can comfortably be clocked at 25 MHz. By comparison, the same median operation performed on a general purpose RISC or DSP processor requires more cycles and a higher clock speed.
Figure 4 - Using Pipelining
To implement this filter using a DSP, the code might be as follows:
    Load r1, P1
    ...                    ; Load 9 Pixels
    Load r9, P9
c1: Compare r1, r2         ; Comparison 1
    JumpLessThan c2
    Swap r1, r2
c2: Compare r2, r3         ; Comparison 2
    JumpLessThan c3
    ...
    Store r5, Median       ; Store result
Comparing the results requires some assumptions.
Assume that all instructions take a single cycle, except that each
jump taken requires 2 cycles. Also assume zero-overhead
looping, that all data is found in the cache, and that there is no
overhead for the calculation of addresses for the 9 pixels.
With these assumptions in place, the calculation of each median on the DSP processor would require 67 cycles. To match the speed of an FPGA running at 25 MHz, the DSP would have to run at 25 x 67 = 1,675 MHz. For this image processing application, the FPGA is thus about 17 times more powerful than a 100 MHz DSP processor. If the DSP were clocked at the same 25 MHz, the FPGA would be 4 x 17, or over 65, times more powerful.
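The arithmetic behind this comparison can be checked with a few lines of Python. This is a sketch under the stated assumptions, plus one more that the text implies but does not spell out: the pipelined FPGA circuit delivers one median result per clock. The variable names are illustrative.

```python
FPGA_CLOCK_MHZ = 25          # FPGA clock rate quoted in the text
DSP_CYCLES_PER_MEDIAN = 67   # DSP cycle count quoted in the text
DSP_CLOCK_MHZ = 100          # reference DSP clock used in the comparison

# Assumption: the FPGA produces one median per clock, so the DSP must
# complete 67 cycles in the time the FPGA completes one.
required_dsp_clock_mhz = FPGA_CLOCK_MHZ * DSP_CYCLES_PER_MEDIAN
speedup_vs_100mhz_dsp = required_dsp_clock_mhz / DSP_CLOCK_MHZ
speedup_at_equal_clock = required_dsp_clock_mhz / FPGA_CLOCK_MHZ

print(required_dsp_clock_mhz)   # 1675
print(speedup_vs_100mhz_dsp)    # 16.75
print(speedup_at_equal_clock)   # 67.0
```

At equal clock rates the ratio reduces to the DSP cycle count per result, which is why the figure is "over 65".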
Customer Case Studies
American Science and Engineering
The company serves two principal markets - detection of contraband for Customs agencies and security for both high-risk government facilities and executive offices of Fortune 500 companies. The first product that AS&E adapted to use the Falcon and FPGA combination is their Model 66Z for MailSearch™ and LobbySearch™ applications.
The combination of the Falcon-PCI and FPGA technology improved the Model 66Z in a number of areas. First and foremost was the ability to implement custom algorithms designed specifically for this application. AS&E submitted function definitions to VisiCom that were programmed into the Xilinx FPGA by VisiCom's engineers. These functions included a new edge enhancement algorithm and zoom function. Other image enhancement features, such as pan, scroll and density expand, were already present in VisiCom's Falcon Software Toolbox and were easily implemented into AS&E's existing code.
The image enhancement filtering specified by AS&E uses a 7x7 convolution. This high-pass filter enhancement highlights areas of the image with high spatial frequency components, such as circuit boards, wire bundles, guns and knives and other inorganic materials. The density expand function of the system is used to identify organic materials such as plastic explosives and flammable liquids. With the new product, the ability to identify these materials is enhanced.
Another advantage to the new system is that it is much more compact. Prior to moving to FPGA technology, AS&E used seven boards in their Model 66Z. This included four boards controlling two frame buffers and (RS-170 video) displays, one edge enhancement board and two preamplifier boards. In the new system, a single Falcon board controls both (VGA) displays and contains a Xilinx FPGA for edge enhancement. A separate data acquisition board connects to the Falcon for a total of only two boards in the new system.
By reducing the number of boards from seven to two, AS&E realized across-the-board savings in time and money for their Model 66Z. The new systems, which were first installed in 1996, take less time to set up and test. The cabling is simplified and maintenance is easier. Power issues and bus traffic conflicts have been eliminated. The reduction in the number of boards allows a smaller passive backplane to be used and reduces inventory requirements. In addition to the cost and configuration savings, the quality of the product has also been improved with the shift to the Falcon. Instead of using 512x480 interlaced (RS-170) video images, the new system utilizes 640x480 non-interlaced VGA displays. This provides more resolution while also decreasing eyestrain for the system operator.
JEOL USA
One of the products created by JEOL USA is the Xvision Plus option for the company's scanning electron microscopes. Xvision Plus is a computerized control for the microscopes that includes high-resolution image acquisition, processing, storage and database management. Introduced approximately two years ago, Xvision Plus utilizes VisiCom's Falcon-PCI with Xilinx FPGA technology. One of the requirements for the Xvision Plus was that it had to support recursive frame averaging in real time (see Appendix B). This process removes noise and creates a sharper, higher resolution image. To meet this goal, VisiCom and JEOL worked closely to program and debug the FPGA based on JEOL's spec. As a result, JEOL was the only company that produced a true 1K x 1K recursive-averaged image.
Summary
Investing in FPGA technology for imaging applications allows custom functions to be implemented and additional functionality to be introduced at a later date without the need to replace hardware in the field. The result is that FPGAs and VigraVision offer OEMs a powerful, affordable, high capability platform on which to develop and deliver custom imaging solutions.
APPENDIX A
Real Time Function List
This list presents some of the real-time functions being implemented in FPGA.
Inter-Frame Functions
Linear Spatial Filters
( ~180 CLBs )
The advantage of the fixed convolvers over the programmable one is that they require fewer FPGA resources to implement. The fixed convolvers generally require ~40 CLBs.
Non-Linear Spatial Filters
1) Rank Filters (3x3)
2) Other
Histogram and LUT Operations
Blob Functions
1) Single Level
Miscellaneous Functions
APPENDIX B
VigraVision Image Processing Xilinx Configuration 1
Useful for temporal frame averaging coupled with elementary 3x3 convolution.
Summary
The process is temporal frame averaging pipelined into a simple 3x3 convolution. The frame-averaging section is controlled by the IIR filter:
$P_m = a I_m + (1 - a) P_{m-1}$
Defining $N = 1/a$, N corresponds to the number of frames being averaged together. The value of N is supplied by a counter, which may be initialized to 1 and then counts frames up to a programmable value. The maximum programmable value is $N_{MAX} = 256$. The 3x3 convolution mask allows simple blurring, sharpening, and edge-detecting filters.
Frame Averaging and Line Buffer for Convolver
Figure 11
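The recursive averaging filter can be illustrated in software. The sketch below assumes that I_m is the incoming frame, that P_m is the averaged output, and that each frame is a flat list of pixel values; the function name and the n_limit parameter (standing in for the programmable counter limit) are hypothetical.

```python
def average_frames(frames, n_limit=256):
    """Recursive frame averaging: P_m = a*I_m + (1 - a)*P_{m-1}, with a = 1/N."""
    n = 0
    averaged = None
    for frame in frames:
        n = min(n + 1, n_limit)   # counter: 1, 2, 3, ... up to the programmed limit
        a = 1.0 / n
        if averaged is None:
            # P_0 is irrelevant: a = 1 on the first frame, so P_1 = I_1.
            averaged = [0.0] * len(frame)
        averaged = [a * i + (1.0 - a) * p for i, p in zip(frame, averaged)]
    return averaged


# Three two-pixel frames; with the counter still running, the result is
# the plain mean of the frames seen so far (approximately [20.0, 40.0]).
print(average_frames([[10, 20], [20, 40], [30, 60]]))
```

While the counter is still counting up, the output equals the arithmetic mean of the frames received so far. Once it reaches its limit, a stays fixed and the filter behaves as an exponentially weighted average that favors recent frames.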
Programmable 3x3 Convolver
Figure 12
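As a software illustration of what a programmable 3x3 convolver computes, the following Python sketch applies a mask to the interior of an image. The mask values, the border handling (border pixels left at zero) and the absence of output scaling are assumptions made for the example and are not taken from the VigraVision design.

```python
def convolve3x3(image, mask):
    """Apply a 3x3 mask to the interior pixels of a 2D list of pixel values."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            # Multiply-accumulate over the 3x3 neighborhood. The mask is not
            # flipped, which makes no difference for symmetric masks.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += mask[dy + 1][dx + 1] * image[y + dy][x + dx]
            out[y][x] = acc
    return out


# Example masks (illustrative values only).
BLUR = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]           # divide the result by 9 to normalize
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
EDGE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]

flat = [[10] * 4 for _ in range(4)]
print(convolve3x3(flat, EDGE)[1][1])     # a flat region has no edges: prints 0
print(convolve3x3(flat, SHARPEN)[1][1])  # a flat region is unchanged: prints 10
```

Changing the nine mask coefficients is all that is needed to switch between blurring, sharpening and edge detection.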
Copyright © 2003 ChipCenter-QuestLink