FPGA versus DSP design reliability and maintenance

DSP-FPGA.com — July 25, 2007

Due to advances in semiconductor technology, ever more complex digital signal processing (DSP) algorithms, protocols, and applications are becoming realizable. This, in turn, is rapidly increasing the complexity of these systems and products. As the complexity of systems increases, system reliability is no longer solely defined by hardware platform reliability, typically quantified in MTBF (mean time between failures) calculations. Today, system reliability is increasingly determined by hardware and software architectures, development and verification processes, and the level of design maintainability.

One of the fundamental architecture issues is the type of DSP platform. Digital functions are commonly implemented on two types of programmable platforms: DSPs and Field Programmable Gate Arrays (FPGAs). DSPs are a specialized form of microprocessor, while the FPGA is a form of highly configurable hardware. In the past, the use of DSPs was nearly ubiquitous, but with the needs of many applications outstripping the processing capabilities (MIPS) of DSPs, the use of FPGAs has become very prevalent. Currently, the primary reason most engineers choose an FPGA over a DSP is the MIPS requirement of the application. Thus, when comparing DSPs and FPGAs, the common focus is on a MIPS comparison – certainly important, but not the only advantage of an FPGA. Equally important, and often overlooked, is the inherent advantage that FPGAs have for product reliability and maintainability. This second advantage is the focus of this discussion.

Impact of design methodology on product reliability

Nearly all engineering project managers can readily quote the date of the next product software update or release. Most technology companies have a long internal list of software bugs or problem reports, along with the software release that will contain the associated patch or fix. It has generally come to be expected that all software (DSP code is considered a type of software) will contain some bugs, and that the best one can do is to minimize them.

By comparison, FPGA designs tend to be updated much less frequently, and it is generally a rather unusual event for a manufacturer to issue a field upgrade for an FPGA configuration file. The reason is that reliability and maintainability are much better in FPGA implementations than in those using a DSP.

Why? The engineering development processes for DSPs and FPGAs are dramatically different. There is a fundamental challenge in developing complex software for any type of processor. In essence, the DSP is a specialized processing engine constantly being reconfigured for many different tasks, some performing digital signal processing, others handling more control- or protocol-oriented work. Resources such as processor core registers, internal and external memory, DMA engines, and I/O peripherals are shared by all tasks, often referred to as “threads” (see Figure 1). This creates ample opportunity for the design or modification of one task to interact with another, often in unexpected or non-obvious ways. In addition, most DSP algorithms must run in “real-time”, so even unanticipated delays or latencies can cause system failures. Common causes of bugs include (one such pitfall is sketched in C after the list):

  • Failure of interrupts to completely restore processor state upon completion
  • Blocking of a critical interrupt by another interrupt or by an uninterruptible process
  • Undetected corruption or non-initialization of pointers
  • Failing to properly initialize or disable circular buffering addressing modes
  • Memory leaks or gradual consumption of available volatile memory due to failure of a thread to release all memory when finished
  • Unexpected memory rearrangement by optimizing memory linkers/compilers
  • Use of special “mode” instruction options in the core
  • Conflict or excessive latency between peripheral accesses, such as DMA, serial ports, L1, L2, and external SDRAM memories 
  • Corrupted stack or semaphores
  • Mixing C or other high-level language subroutines with assembly language subroutines
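
A minimal C sketch of one pitfall of this kind, with hypothetical names and no real hardware access: a task and an interrupt handler share a counter with no locking, so a read-modify-write in the task can be corrupted when the interrupt fires in the middle of it.

```c
#include <stdint.h>

static volatile uint32_t samples_pending = 0;   /* shared between ISR and task */

/* Receive interrupt: one new sample has arrived. */
void rx_isr(void)
{
    samples_pending++;            /* read-modify-write; not atomic on many cores */
}

/* Background task: consume everything that has arrived so far. */
void process_task(void)
{
    while (samples_pending > 0) {
        /* ... filter one sample ... */
        samples_pending--;        /* if rx_isr() fires mid-decrement, an increment is lost */
    }
}
```

A conventional fix is to bracket the shared update with an interrupt disable/enable pair or an atomic primitive, at the cost of added latency; that trade-off is exactly the one discussed below.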

Microprocessor, DSP, and operating system (OS) vendors have attempted to address these problems with different levels of protection or isolation of one task or “thread” from another. Typically the operating system, or kernel, is used to manage access to processor resources, such as allowable execution time and memory, and to common peripheral resources. However, there is an inherent conflict between processing efficiency and the level of protection offered by the OS. In DSPs, where processing efficiency and deterministic latency are often critical, the result is usually minimal or no OS isolation between tasks. Each task often requires unrestricted access to many processor resources in order to run efficiently.

Figure 1: Levels of protection or isolation of one task or “thread” from another
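
The trade-off can be seen in a hedged sketch such as the one below, which uses POSIX threads as a stand-in for a DSP RTOS kernel (the descriptor fields and function names are illustrative, not taken from any particular device): the protected routine serializes access to a shared DMA descriptor, but every lock and unlock adds latency that a hard real-time signal chain may not be able to afford.

```c
#include <pthread.h>
#include <stdint.h>

typedef struct {
    uint32_t src, dst, len;          /* hypothetical DMA descriptor fields */
} dma_desc_t;

static dma_desc_t shared_desc;
static pthread_mutex_t desc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Protected access: safe against other threads, but every call pays kernel overhead. */
void queue_transfer_protected(uint32_t src, uint32_t dst, uint32_t len)
{
    pthread_mutex_lock(&desc_lock);
    shared_desc.src = src;
    shared_desc.dst = dst;
    shared_desc.len = len;
    pthread_mutex_unlock(&desc_lock);
}

/* Unprotected access: minimal latency, which is why many DSP designs take this
 * route -- and inherit the cross-task interaction risks listed above. */
void queue_transfer_fast(uint32_t src, uint32_t dst, uint32_t len)
{
    shared_desc.src = src;
    shared_desc.dst = dst;
    shared_desc.len = len;
}
```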

Compounding these development difficulties is incomplete verification coverage, both during initial development and during regression testing for subsequent code releases. It is nearly impossible to test all the possible permutations (often referred to as “corner cases”) and interactions between different tasks or threads which may occur during field operation. This makes software testing arguably the most challenging part of the software development process. Even with automated test scripts, it is not possible to test all possible scenarios. This process must be repeated after every software update or modification to correct known bugs or add new features. Occasionally, a new software release also inadvertently introduces new bugs, which forces yet another release to correct the new bug. As products grow in complexity, the number of lines of code will increase, as will the number of processor cores, and an ever greater percentage of the development effort will need to be devoted to software testing.
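
As a hedged illustration of what such an automated regression check looks like at its smallest, the sketch below replays a fixed-point saturating add (a hypothetical routine, not from the article) against an independent reference over a handful of hand-picked corner cases; exhaustive coverage is feasible for a block this small, but rarely for a complete multi-threaded system.

```c
#include <stdint.h>
#include <stdio.h>

/* Routine under test: Q15 addition that saturates instead of wrapping. */
static int16_t sat_add_q15(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;
    if (s >  32767) s =  32767;
    if (s < -32768) s = -32768;
    return (int16_t)s;
}

int main(void)
{
    /* Corner cases: positive/negative extremes, zero, and sign changes. */
    const int16_t cases[][2] = {
        { 32767, 1 }, { -32768, -1 }, { 0, 0 }, { 16384, 16384 }, { -1, 1 }
    };
    int failures = 0;

    for (unsigned i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        int32_t ref = (int32_t)cases[i][0] + (int32_t)cases[i][1];
        if (ref >  32767) ref =  32767;
        if (ref < -32768) ref = -32768;
        if (sat_add_q15(cases[i][0], cases[i][1]) != (int16_t)ref)
            failures++;
    }

    printf("%d regression failure(s)\n", failures);
    return failures ? 1 : 0;     /* nonzero exit flags the build as broken */
}
```

In a real project, checks like this would run alongside many others as part of every release build.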

Improving the development process

So exactly how does the FPGA development process improve on this unhappy state of affairs?

The actual design of each task (or thread) is more or less equivalent in complexity whether a DSP or an FPGA implementation is used. Both implementation routes offer the option of using third-party implementations of common signal processing algorithms, interfaces, and protocols, and each offers the ability to reuse existing IP in future designs. But that is where the similarity tends to end. An FPGA is a more native implementation for most digital signal processing algorithms. Each task is allocated its own resources and runs independently (see Figure 2). It makes intuitive sense to process a continuously streaming signal in an assembly-line fashion, with dedicated resources for each step. And, as Henry Ford discovered nearly 100 years ago, the result is a dramatic increase in throughput. Because the FPGA is an inherently parallel implementation, it offers much higher digital signal processing rates in nearly all applications.

Figure 2: An FPGA is a more native implementation for most digital signal processing algorithms; each task is allocated its own resources and runs independently
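
A real FPGA design would be written in an HDL, but the short C model below (stage functions and register widths are purely illustrative) sketches the assembly-line idea: each stage owns its own register, and on every clock tick all stages advance in parallel, each working on a different sample.

```c
#include <stdint.h>

/* Each pipeline stage has its own dedicated register, mimicking the way an
 * FPGA allocates independent resources to each processing step. */
typedef struct {
    int16_t stage1_reg;                       /* output of the capture stage   */
    int16_t stage2_reg;                       /* output of the decimator stage */
} pipeline_t;

static int16_t decimate(int16_t x) { return x >> 1; }       /* placeholder stage */
static int16_t filter(int16_t x)   { return x - (x >> 2); } /* placeholder stage */

/* One clock tick: every stage consumes the previous stage's register and
 * produces into its own, so no stage touches another stage's resources. */
int16_t pipeline_tick(pipeline_t *p, int16_t new_sample)
{
    int16_t out   = filter(p->stage2_reg);    /* final stage output this tick */
    p->stage2_reg = decimate(p->stage1_reg);  /* middle stage advances        */
    p->stage1_reg = new_sample;               /* first stage captures input   */
    return out;
}
```

A new sample enters and a processed result leaves on every tick, which is where the throughput gain over batch-oriented processing comes from.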

The FPGA resources assigned to each task can be tailored to its requirements, and the tasks can be broken up along logical partitions. This usually makes for a well-defined interface between tasks and largely eliminates unexpected interaction between them. Because each task can run continuously, the memory required is often much less than in a DSP, which must buffer the data and process it in batches. And because FPGAs distribute memory throughout the device, each task can be permanently allocated its own dedicated memory, achieving a high degree of isolation between tasks. As a result, modifying one task is unlikely to cause unexpected behavior in another, which allows developers to isolate and fix bugs in a logical, predictable fashion.
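
The same isolation can be hinted at in C terms (a loose analogy only; in an FPGA the isolation is physical rather than a coding convention): each module below owns its own statically allocated storage and exposes nothing but a narrow function interface, so changing one module's internals cannot disturb the other's state.

```c
#include <stdint.h>

/* "Partition" A: a moving-average block with its own dedicated storage. */
static int16_t avg_history[8];               /* memory owned by this block only */

int16_t moving_average_step(int16_t x)
{
    static unsigned idx = 0;
    int32_t sum = 0;

    avg_history[idx] = x;
    idx = (idx + 1) % 8;
    for (unsigned i = 0; i < 8; i++)
        sum += avg_history[i];
    return (int16_t)(sum / 8);
}

/* "Partition" B: a peak detector with its own dedicated storage. Editing the
 * averager above cannot corrupt this state, because nothing is shared. */
static int16_t peak_value;

int16_t peak_detect_step(int16_t x)
{
    if (x > peak_value)
        peak_value = x;
    return peak_value;
}
```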

Design verification

The link between product reliability and design methodology is often underappreciated. Commonly, discussions about development tools emphasize engineering productivity increases. But as product complexities increase, an ever greater portion of the overall engineering process is dedicated to testing and verification. This is where FPGA design methodology offers large advantages compared to software-based design verification.

Fundamentally, FPGA design and verification tools are closely related to ASIC development tools; in practice, most ASIC designs are prototyped on FPGAs. This is a critical point, because bugs are most definitely not tolerated in ASICs. Unlike software, an ASIC offers essentially no possibility of a field upgrade to remedy a design bug. Because development time and costs are very high, ASIC developers go to extreme lengths to verify designs against nearly all scenarios. This has led to test methodologies that provide nearly complete coverage of every gate under all possible inputs, accurate modeling of routing delays within the devices, and comprehensive timing analysis. Since FPGA verification tools are close cousins of their ASIC counterparts, they have benefited enormously from the many years of investment in ASIC verification.
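
In HDL practice this coverage comes from self-checking test benches; the C sketch below (a stand-in only, with a hypothetical device under test) shows the same idea in miniature by exhaustively comparing a small block against a bit-accurate reference over every possible input.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical device under test: round an 8-bit value down to the nearest
 * multiple of 4 by clearing the two least significant bits. */
static uint8_t quantize_dut(uint8_t x)
{
    return (uint8_t)(x & 0xFC);
}

/* Bit-accurate reference model, written independently of the DUT. */
static uint8_t quantize_ref(uint8_t x)
{
    return (uint8_t)((x / 4) * 4);
}

int main(void)
{
    /* Exhaustive coverage: for an 8-bit input, all 256 cases can be checked,
     * the same way an ASIC-style test bench sweeps a small block's input space. */
    int mismatches = 0;
    for (int x = 0; x <= 255; x++) {
        if (quantize_dut((uint8_t)x) != quantize_ref((uint8_t)x))
            mismatches++;
    }
    printf("%d mismatch(es) out of 256 input values\n", mismatches);
    return mismatches ? 1 : 0;
}
```

Because the reference and the device under test are independent, the same check can be re-run unchanged after every later modification.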

The use of FPGA partitioning, test benches, and simulation models makes both integration and on-going regression testing very effective for quickly isolating problems, speeding the development process, and simplifying product maintenance and feature additions. These are crucial advantages in the FPGA vs. DSP development process and will become increasingly important as the complexity of designs and the size of development teams increase.

FPGA vendors provide a comprehensive set of in-house and third-party tools that form a unified flow for architecture, partitioning, floor planning, capturing design intent, simulation, timing closure, optimization, and maintenance. In particular, architectural partitioning is integral to the design entry process. This partitioning, which normally includes the chip resources required within each partition, is carried through the timing closure and ongoing maintenance phases of development, which guarantees a high degree of isolation. Each logical partition, as well as the overall design, can have independent test benches and simulation models. Test benches developed during the FPGA design cycle can be reused to verify proper functionality after later changes, which makes product maintenance much simpler. There is nothing as frustrating as a new release that, in the process of fixing certain bugs, inadvertently creates new ones.

There is a large industry (the EDA, or Electronic Design Automation, industry) continually driving the development of FPGA and ASIC test and verification tools. There is no comparable counterpart in the software verification space. This may change as the industry realizes the enormous costs and challenges of software verification, but for now, the practical solution in the software world is to keep downloading the latest patch.

Many engineering managers intuitively understand this. The rate of software updates to remedy bugs far exceeds the rate of comparable FPGA updates, and it is expected and normal to roll out bug fixes on a regular basis. With the availability of both low-cost and high-end DSP-optimized FPGA devices, extensive IP cores, and high-level design entry methods, along with the inherent robustness of the FPGA design and verification process, FPGAs will increasingly be the preferred choice for implementing digital signal processing.