# ARCHITECTURE OF A HIGH PERFORMANCE IMAGE PROCESSING SYSTEM

Hiroshi Takenaga<sup>\*</sup>, Yoshiki Kobayashi<sup>\*</sup>, Masao Takatoo<sup>\*</sup>, Yoshiyuki Okuyama<sup>\*</sup>, Shuichi Miura<sup>\*</sup>, Tadashi Fukushima<sup>\*</sup>, Kazuyoshi Asada<sup>\*\*</sup>, Kazunori Fujiwara<sup>\*\*</sup>

\*Hitachi Research Laboratory, Hitachi, Ltd. 4026 Kuji-cho, Hitachi, Ibaraki, 319-12 Japan

\*\*Omika Works, Hitachi, Ltd. 5-2-1 Omika-cho, Hitachi, Ibaraki, 319-12 Japan

# ABSTRACT

This paper describes the architecture of a high performance, compact image processing system. The main feature of the system is that its image processor is built from eight kinds of high speed VLSIs, including a real-time video image processing LSI (ISP-II). These VLSIs were developed to achieve both compactness and easy system extension. The ISP-II is a VLSI for gray scale image processing. It includes two line buffer memories and has a time-shared processing function, so one ISP-II can carry out a 3 x 3 spatial convolution at 6 MHz without additional circuitry. Using three ISP-IIs, a 3 x 3 spatial convolution can be executed at 24 MHz. The system's image processor is realized on one board, which includes three 8-bit gray scale image memories and three binary image memories of 512 x 512 pixels each, A/D and D/A converters, etc. The image processing speed is 6 MHz for a 3 x 3 spatial convolution, labelling, etc., and 12 MHz for an affine transform, inter-image processing, etc.

# 1. INTRODUCTION

There are four considerations to make when developing an image processing system: tolerance of environmental changes such as lighting, high speed processing, compactness, and low cost. Gray scale image processing, VLSI technology and parallel processing are essential technologies for meeting them. For this reason, an Image Signal Processor (ISP) fabricated by the 3 $\mu$m CMOS process was developed [1], and an image processing system was built using the ISP. The ISP belongs to the category of partially parallel processing devices, and three ISP LSIs can perform local operations on 256 x 256 non-interlace TV images in real-time.

However, further miniaturization and higher processing speed have been desired in order to widen the range of applications of image processing systems. Therefore, the Image Signal Processor-II (ISP-II) has been developed as the successor to the ISP [2][3]. Additionally, the basic elements needed to construct an image processing system, such as image memory addressing and feature extraction, have been realized on eight VLSIs, including the ISP-II. Using these VLSIs, a compact and high performance image processing system, the HITACHI-IP/200, has been developed.

# 2. DESIGN PRINCIPLES

Generally, pattern recognition entails the following process (Fig. 1). First, an image is taken from a TV camera, and preprocessing such as noise reduction, binarization and labelling is carried out on that image. Some features are then extracted from the processed image, and pattern recognition is carried out using them. To make the system compact and high performance, it is divided into three blocks following this flow: an image memory block, an image preprocessing block and a feature extraction block. The basic functions derived from each block are realized by the eight kinds of VLSIs described in Table 1.
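As an illustration of the preprocessing stage described above, the following minimal Python sketch binarizes a gray scale image and labels its connected regions. It is only a software analogy: the paper does not describe the algorithm used inside the labelling hardware, so the two-pass, 4-connected union-find method shown here is an assumption, and the names `binarize` and `label_components` are hypothetical.

```python
def binarize(gray, threshold):
    """Return a 0/1 image: 1 where the pixel density is >= threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


def label_components(binary):
    """Two-pass, 4-connected labelling (assumed method, not the BLP design).

    Returns (labels, count); labels uses 0 for background and 1..count
    for the independent objects.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # parent[i] links provisional label i toward its root

    def find(i):
        # Follow links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # First pass: raster scan, assign provisional labels, record merges.
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))      # new provisional label
                labels[y][x] = len(parent) - 1
            elif up and left:
                root_up, root_left = find(up), find(left)
                root = min(root_up, root_left)
                parent[max(root_up, root_left)] = root
                labels[y][x] = root
            else:
                labels[y][x] = up or left

    # Second pass: replace provisional labels by consecutive final labels.
    final = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels, len(final)
```

For example, `label_components(binarize(image, 128))[1]` gives the number of independent objects brighter than the (arbitrarily chosen) threshold 128.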

These VLSIs were developed according to two design principles. One is that each VLSI can execute its operation at the video signal rate. The other is that various types of image processing systems, for instance a compact system or an expandable system, can be developed easily by using the VLSIs. As a result, the following three features are realized.

- Affine transformation as well as raster scan can be carried out at video rate.
- A single ISP-II can perform local operations on 256 x 256 non-interlace TV images in real-time.
- Image features, such as area, gravity, histogram and others, can be extracted in parallel [4].

The architecture of the ISP-II is discussed further in the next section.

### IAPR Workshop on CV - Special Hardware and Industrial Applications OCT.12-14, 1988, Tokyo



# 3. THE ISP-II LSI

The ISP-II LSI, which is fabricated by the 1.8 $\mu$m BiCMOS process, has the same basic architecture as the preceding LSI, the ISP. Additionally, two line memories are integrated on the chip and a time-shared method is implemented. As a result, the ISP-II has three added features.

- Spatial image operations such as a 3 x 3 spatial convolution can be carried out without additional circuitry.
- The ISP-II can operate at 24 MHz (instead of 6 MHz for the ISP), and three ISP-IIs can perform operations on 512 x 512 non-interlace TV images in real-time.
- The ISP-II has the basic image processing functions, such as spatial convolution, pattern matching and inter-image processing, and the kernel of the executable local operations is expandable two-dimensionally.

# 3.1 Fundamental Architecture

The ISP-II was developed to realize spatial convolution on one chip. Spatial convolution multiplies each pixel in a local image region by a weighting coefficient and sums the products; it is used for image smoothing, contour emphasis, etc.

Fig. 2 shows a block diagram of the ISP-II, when it is executing a 3 x 3 spatial convolution. Incoming pixel data are distributed to processor elements (PEs) via two line memories (LMs) and 3 x 3 shift registers (SRs). Weighting coefficients are also supplied to the PEs from a data memory (DM) and multiplied by pixel data in the PEs in parallel. The processed data are then linked together by a linkage unit (LU).
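The data flow of Fig. 2 can be mimicked in software. The following is a minimal Python sketch, not a description of the actual circuit: two row-length lists stand in for the line memories, a 3 x 3 list stands in for the shift registers, and the nested sum stands in for the PEs and the linkage unit. The function name `stream_conv3x3` is hypothetical, the kernel is applied without flipping, and outputs are produced only where the window lies fully inside the image; these are simplifying assumptions.

```python
def stream_conv3x3(image, kernel):
    """Raster-scan 3 x 3 weighted sum using two line buffers.

    image  : list of rows of pixel values
    kernel : 3 x 3 list of weighting coefficients (applied unflipped)
    Returns a (height-2) x (width-2) list of weighted sums.
    """
    height, width = len(image), len(image[0])
    line1 = [0] * width  # holds the previous row
    line2 = [0] * width  # holds the row before the previous one
    out = [[0] * (width - 2) for _ in range(height - 2)]

    for y in range(height):
        # Window registers: win[r][c], r = 0 is the oldest row.
        win = [[0, 0, 0] for _ in range(3)]
        for x in range(width):
            pix = image[y][x]
            # One new column enters the window: rows y-2, y-1 and y.
            column = (line2[x], line1[x], pix)
            for r in range(3):
                win[r] = [win[r][1], win[r][2], column[r]]
            # Shift the incoming pixel through the two line memories.
            line2[x] = line1[x]
            line1[x] = pix
            # The window is valid once two rows and two columns are buffered.
            if y >= 2 and x >= 2:
                out[y - 2][x - 2] = sum(
                    kernel[r][c] * win[r][c]
                    for r in range(3)
                    for c in range(3)
                )
    return out
```

In hardware the nine multiplications would run in parallel in the PEs, whereas this sketch evaluates them sequentially; only the buffering scheme is meant to carry over.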

Although the preceding ISP had only the processing circuitry (enclosed by dotted lines in Fig. 2), the ISP-II also has two 1,024-word x 8b line memories. As a result, 3 x 3 kernel data can be taken from the input image without additional circuitry.

| Category             | VLSI                                                | Type / Speed         | Functions                                                                                     |
|----------------------|-----------------------------------------------------|----------------------|-----------------------------------------------------------------------------------------------|
| Image processing     | ISP-II<br>(Image Signal Processor-II)               | Custom<br>24 MHz     | • Convolution<br>• Inter-image processing<br>• Pattern matching                               |
| Image processing     | BFP<br>(Binary Figure Processor)                    | Gate Array<br>12 MHz | • Erosion, dilation, etc.<br>• Density conversion                                             |
| Image processing     | BLP<br>(Binary Labelling Processor)                 | Gate Array<br>6 MHz  | • Labelling                                                                                   |
| Feature extraction   | PFP<br>(Parallel Feature Processor)                 | Gate Array<br>12 MHz | • Parallel extraction of area, gravity, etc.<br>• Maximum/minimum density extraction          |
| Feature extraction   | HP<br>(Histogram Processor)                         |                      | • Density distribution<br>• Projection distribution                                           |
| Image memory control | AP/IAM<br>(Address Processor/Image Address Manager) | Gate Array<br>12 MHz | • Affine transform<br>• Contour tracking                                                      |
| Image memory control | IDM<br>(Image Data Manager)                         | Gate Array<br>12 MHz | • Flow control of image data and address                                                      |

Table 1 Features of image processing VLSIs


With the time-shared method, one ISP-II can carry out a 3 x 3 spatial convolution at about 8 MHz (one-third of 24 MHz, the maximum speed of the ISP-II).

# 3.2 Extensibility

To enhance performance or to expand kernel size of the spatial operations, a multi ISP-II configuration is used. In this configuration, shown in Fig. 3, many types of local operations are possible by programming the number of steps of the time-shared method and configuration of the LMs.

Fig. 3(a) shows the structure for enhancing performance or for handling lines of up to 2,048 pixels. In this case, the PEs process horizontally adjacent pixel data. Only one line memory is required for the vertical expansion of a kernel, so the line memory is extended to 2,048 words by cascading the two 1,024-word line memories. This configuration can carry out a 3 x 3 spatial convolution at 24 MHz without the time-shared method.



Fig. 3(b) shows the structure for expanding the kernel size, for instance to 7 x 7. In this case, the PEs process vertically adjacent pixel data, and a 7 x 7 spatial convolution is executed by time-shared processing in seven cycles. The system can therefore execute it at about 3.4 MHz (one-seventh of 24 MHz).
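The speed figures quoted for the time-shared method follow from dividing the 24 MHz operating frequency by the number of time-shared steps. The small Python sketch below just reproduces that arithmetic; the function name `time_shared_pixel_rate` is hypothetical, and the model assumes that each output pixel costs exactly one clock cycle per step.

```python
def time_shared_pixel_rate(clock_mhz, steps):
    """Pixel rate in MHz when each pixel needs `steps` clock cycles."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return clock_mhz / steps


# 3 x 3 kernel in three steps and 7 x 7 kernel in seven steps at 24 MHz.
rate_3x3 = time_shared_pixel_rate(24.0, 3)
rate_7x7 = time_shared_pixel_rate(24.0, 7)
print(f"3 x 3: {rate_3x3:.1f} MHz, 7 x 7: {rate_7x7:.1f} MHz")
```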

# 3.3 Performance and specifications

Fig. 4 shows the relation between processing speed and the number of ISP-IIs. A processing speed of 6 MHz is required for real-time processing of 256 x 256 pixel non-interlace TV images. In this case, one ISP-II can carry out a 3 x 3 spatial convolution, while four ISP-IIs are needed for a 5 x 5 kernel and six for a 7 x 7 kernel.

For 512 x 512 pixel non-interlace TV images, the required processing speed is 24 MHz. In this case, three, nine and seventeen ISP-IIs are required for 3 x 3, 5 x 5 and 7 x 7 kernels, respectively.

One ISP-II delivers a maximum of 150 million operations per second (MOPS) in a 3 x 3 spatial convolution.

Table 2 lists representative image processing operations of the ISP-II. Its specifications are summarized in Table 3 and a photomicrograph is shown in Fig. 5.



Fig. 4 Performance of Multi ISP-II System (horizontal axis: number of ISP-IIs)

# 4. IMAGE PROCESSING SYSTEM

The preprocessing functions of the ISP-II alone do not make a complete image processing system; image memory address generation, feature extraction functions, etc., are also needed. For this reason, the eight kinds of VLSIs in Table 1 were developed.

Using these VLSIs, various types of image processing systems, e.g. compact or high performance ones, can be realized easily. A new image processing system, the HITACHI-IP/200, has been developed. It offers more advanced compatibility than the preceding systems, the HITACHI-IP/10 and HITACHI-IP/20, and is more compact and higher in performance. It measures only 430 (W) x 110 (H) x 480 (D) mm, has a high processing speed of 83 ns per pixel, and supports both gray scale image processing and pipeline processing.
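The figure of 83 ns per pixel corresponds to the 12 MHz processing rate quoted in the abstract. The following Python sketch reproduces that conversion and gives a rough time for one pass over a 512 x 512 image. The function names are hypothetical, and the frame time is only an estimate that counts image pixels and ignores blanking intervals and any pipeline latency.

```python
def pixel_period_ns(rate_mhz):
    """Time per pixel in nanoseconds at a given pixel rate in MHz."""
    return 1000.0 / rate_mhz


def frame_time_ms(width, height, rate_mhz):
    """Approximate time in ms to stream width x height pixels.

    Assumes one pixel per cycle and no blanking (a simplification).
    """
    return width * height / (rate_mhz * 1000.0)


print(f"{pixel_period_ns(12.0):.1f} ns per pixel at 12 MHz")
print(f"{frame_time_ms(512, 512, 12.0):.1f} ms per 512 x 512 pass at 12 MHz")
print(f"{frame_time_ms(512, 512, 6.0):.1f} ms per 512 x 512 pass at 6 MHz")
```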

Table 2 Representative Image Processing of ISP-II

| Source Image | Image Processing                                                                                          |
|--------------|-----------------------------------------------------------------------------------------------------------|
| Binary       | • Pattern matching<br>• Pixel-to-pixel operations                                                         |
| Gray-scale   | • Binarization<br>• Pixel-to-pixel operations<br>• Spatial convolution<br>• Non-linear spatial filter     |
| Color        | • Gray scale conversion<br>• Color system conversion<br>• Classification by color-distance               |

Table 3 Specifications of ISP-II

| Item                | Specification                   |
|---------------------|---------------------------------|
| Technology          | 1.8 µm BiCMOS                   |
| Total device number | 164k                            |
| Gate count          | 15k gates                       |
| RAM capacity        | 16k bit                         |
| Chip size           | $10.2 \times 10.2 \text{ mm}^2$ |
| Operation cycle     | 24 MHz                          |
| I/O interface       | TTL compatible                  |
| Supply voltage      | 5 V                             |
| Package             | 120 pin PGA                     |

Fig. 5 Photomicrograph of ISP-II

Fig. 6 shows a block diagram of the image processor of the HITACHI-IP/200. It has three image memories. Each memory has an IAM LSI and an IDM LSI, which control both the image memory address flow and the image data flow. An AP LSI, which generates the image memory addresses, controls the whole image processor. Image data from a TV camera or an image memory are processed by the preprocessing LSIs: the ISP-II, BFP and BLP. A maximum of six features are extracted simultaneously from the processed image by the HP and PFP. These processes are executed in a pipeline. The image processor shown in Fig. 6 is realized on one board of 290 x 290 mm<sup>2</sup>.
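The simultaneous feature extraction described above can be illustrated in software by accumulating several features during a single raster pass over the image. The following Python sketch is only an analogy to the parallel hardware, not a description of the HP or PFP circuits. The function name `extract_features`, the choice of features, the 8-bit density range and the threshold-based object definition are all assumptions made for illustration.

```python
def extract_features(gray, threshold):
    """Accumulate several features in one pass over an 8-bit image.

    Pixels with density >= threshold are treated as the object.
    Returns a dict with area, centre of gravity, minimum and maximum
    density, and the density histogram.
    """
    area = sum_x = sum_y = 0
    min_density, max_density = 255, 0
    histogram = [0] * 256

    for y, row in enumerate(gray):
        for x, p in enumerate(row):
            # All accumulators are updated from the same pixel,
            # mirroring extraction "in parallel" from one data stream.
            histogram[p] += 1
            min_density = min(min_density, p)
            max_density = max(max_density, p)
            if p >= threshold:
                area += 1
                sum_x += x
                sum_y += y

    gravity = (sum_x / area, sum_y / area) if area else None
    return {
        "area": area,
        "gravity": gravity,
        "min_density": min_density,
        "max_density": max_density,
        "histogram": histogram,
    }
```

Because every accumulator depends only on the current pixel and its coordinates, no second pass over the image memory is needed, which is the property that makes this kind of extraction suitable for pipelining.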

Table 4 shows the image processing functions and their processing speeds. Spatial convolution, histogram processing, etc., are carried out at 6 MHz, as in the conventional system, but inter-image processing, the newly added affine transform, etc., run at 12 MHz, double the speed of the conventional system. Additionally, parallel extraction of features is realized, and the overall performance of the system is improved two to five times.
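An affine transform at pixel rate amounts to generating one source address per output pixel. The paper does not describe how the address processor computes these addresses, so the following Python sketch shows one common approach as an assumption: the source coordinates are kept in fixed point and advanced by additions only during the raster scan. The names `affine_addresses` and `affine_resample`, the 8 fractional bits and nearest-lower-pixel sampling are hypothetical choices.

```python
def affine_addresses(width, height, a, b, c, d, tx, ty, frac_bits=8):
    """Yield (x, y, u, v): output pixel (x, y) reads source pixel (u, v).

    u = a*x + b*y + tx and v = c*x + d*y + ty, evaluated incrementally
    in fixed point so that the inner loop uses additions only.
    """
    scale = 1 << frac_bits
    fa, fb, fc, fd = (int(round(k * scale)) for k in (a, b, c, d))
    row_u = int(round(tx * scale))
    row_v = int(round(ty * scale))
    for y in range(height):
        u, v = row_u, row_v
        for x in range(width):
            yield x, y, u >> frac_bits, v >> frac_bits
            u += fa  # step one pixel to the right
            v += fc
        row_u += fb  # step one line down
        row_v += fd


def affine_resample(src, out_w, out_h, a, b, c, d, tx, ty, fill=0):
    """Build an output image by reading src at the generated addresses."""
    src_h, src_w = len(src), len(src[0])
    out = [[fill] * out_w for _ in range(out_h)]
    for x, y, u, v in affine_addresses(out_w, out_h, a, b, c, d, tx, ty):
        if 0 <= u < src_w and 0 <= v < src_h:
            out[y][x] = src[v][u]
    return out
```

For instance, `affine_resample(src, 2 * w, 2 * h, 0.5, 0, 0, 0.5, 0, 0)` enlarges a `w` x `h` image by a factor of two, and a rotation is obtained by passing the cosine and sine of the angle as coefficients.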

For easy algorithm development and versatile processing, the software environment shown in Fig. 7 has been constructed. Approximately 250 commands are available under the operating system (OS-9/68000). They can be called from application programs written in C or BASIC. Additionally, multi-screen software is provided for pattern recognition without programming, from image input to recognition results; it lets the user develop an algorithm interactively using a mouse.



Fig. 6 Block Diagram of HITACHI-IP/200

Fig. 7 Software Configuration (components shown: OS-9/68000, Command Library, BASIC, User Application, Multi-Screen Software)

# 5. CONCLUSION

The architectures of the image signal processing VLSIs, especially the ISP-II, and of a high performance image processing system were presented here. The ISP-II is the successor to the ISP and has the same fundamental architecture with the addition of two line memories. The ISP-II was fabricated by the 1.8 $\mu$m BiCMOS process. As a result, one ISP-II can carry out spatial image operations such as a 3 x 3 spatial convolution in real-time on 256 x 256 pixel non-interlace TV images without additional circuitry. The other basic elements of an image processing system are realized on seven further VLSIs.

The compact and high performance image processing system, the HITACHI-IP/200, was developed by using these VLSIs. The system's image processor was realized on one board, including three 8-bit gray scale image memories and three binary image memories of 512 x 512 pixels each, A/D and D/A converters, etc. The image processing speed was 6 MHz in executing a 3 x 3 spatial convolution, labelling, etc., and 12 MHz in executing an affine transform, inter-image processing, etc.

# REFERENCES

- [1] T. Fukushima, et al., "An Image Signal Processor," IEEE ISSCC83, 1983
- [2] T. Fukushima, et al., "Architecture of an Image Signal Processor-2 (ISP-2)," Proc. of 8th ICPR, pp.38-41, 1986
- [3] Y. Kobayashi, et al., "A BiCMOS Image Signal Processor with Line Memories," IEEE ISSCC87, 1987
- [4] H. Takenaga, et al., "Architecture of a Feature Extraction Processor for Image Processing System," Proc. of Int. Workshop on Ind. Appl. of MVMI, pp.152-155, 1987

| Processing Function    | Description                                                                                                                                               | Speed  |
|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|--------|
| Convolution            | Image smoothing<br>Laplacian, etc.                                                                                                                        | 6 MHz  |
| Feature Extraction     | Erosion<br>Outline detection<br>Thinning, etc.                                                                                                            |        |
| Feature Code Extraction | Direction code<br>End point code<br>Intersection code, etc.                                                                                              | 12 MHz |
| Histogram Processing   | Object's area<br>Perimeter<br>Projection distribution, etc.                                                                                               | 6 MHz  |
| Density Conversion     | Gamma conversion<br>Enhancement stripping conversion                                                                                                      | 12 MHz |
| Labelling              | Number of independent objects on screen                                                                                                                   | 6 MHz  |
| Inter-image Processing | Binary inter-image processing (AND, OR, EOR)<br>Binary image gray-scale inter-image processing<br>Gray-scale inter-image processing, etc.                | 12 MHz |
| Affine Transformation  | Enlargement<br>Rotation, etc.                                                                                                                             | 12 MHz |
| Pattern Matching       | 8 x 12 pixel template matching                                                                                                                            | 6 MHz  |

Table 4 Major Functions and Performance Speeds of the System