Distributed Processing for Large Medical Image Database
Fumio AOKI, Haruyuki TATSUMI, Hiroki NOGAWA, Hirofumi AKASHI, Nozomi NAKAHASHI
Information Center of Computer Communication, Sapporo Medical University
Email: kaku, tatsumi, nogawa, hakashi, firstname.lastname@example.org
Abstract: Using a large medical image database for medical education requires powerful computers that can handle data sets of tens of gigabytes. In this article, we propose a distributed processing solution implemented on low-cost PCs connected by a high-performance network. Each distributed processor holds a subset of the image data in memory and works independently. Experiments on 35 PCs showed that the proposed solution responds in about 2 seconds when retrieving a 12MB image from an image database of about 15GB, which is over 1,000 times faster than a straightforward method on one PC.
Keywords: network computing, parallel processing, medical imaging, VHP, distributed systems
In the handling of high-resolution medical images, the problem used to be the storage space needed to keep the data; today the problem is the real-time processing power needed to visualize them as desired. To retrieve arbitrary dissection images from a large image database, we implemented a distributed processing system with two layers: a controller layer, called Gboss, which communicates with clients and manages the parallel processors, and a parallel data processing layer, called Gserver, which is fully controlled by Gboss and generates image pieces.
2 Gboss & Gserver
Gboss creates control procedures that start the process on each Gserver according to requests from clients. It sends each Gserver a message containing a command string (Gslice), the dissection type Stype (Trans, Long or Sagit), and the two ends of the dissection line (Slice1, Slice2). After the Gservers finish processing, Gboss receives the image pieces and assembles them into one large image to send back to the client. Because a large data stream goes from Gboss to the client, a data size packet precedes the RGB24 image data stream (Fig.1). Gboss communicates with the Gservers over TCP/IP socket connections. The messages include Gload, which loads an image data subset into each Gserver's memory; Gslice, which extracts dissection image pieces; and Abort, which terminates the Gservers' execution. Fig.2 shows the message exchange procedure between Gboss and the Gservers.
Fig.1 Messages between Gboss and Viewer
Fig.2 Messages between Gboss and Gservers
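The size-then-data exchange between Gboss and the viewer can be illustrated with a minimal Python sketch. It is not the authors' Unix C code. The 4-byte big-endian size field, the top-to-bottom ordering of pieces, and all function names are assumptions made for illustration; the paper only states that a data size packet precedes the RGB24 stream.

```python
import io
import struct

def assemble_image(pieces):
    """Join the per-Gserver RGB24 pieces in server order (assumed ordering)."""
    return b"".join(pieces)

def frame_image(rgb24):
    """Prefix the payload with its size as a 4-byte big-endian integer (assumed format)."""
    return struct.pack("!I", len(rgb24)) + rgb24

def read_exact(stream, n):
    """Read exactly n bytes from a file-like stream, or raise if it ends early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def read_image(stream):
    """Viewer side: read the size packet first, then that many bytes of RGB24 data."""
    (size,) = struct.unpack("!I", read_exact(stream, 4))
    return read_exact(stream, size)
```

With a real connection, the viewer could pass `sock.makefile("rb")` as the stream. Knowing the size in advance lets the viewer tell when the whole image has arrived without closing the socket.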
All Gservers run the same server program, although they process different subsets of the image data. The program acts on the processing task messages (Gtask) that it receives from Gboss over the socket connection. A Gtask message defines how to dissect the images through Stype, Slice1 and Slice2. When Stype is Trans, only one Gserver is started to retrieve the transverse image. When Stype is Long or Sagit, all Gservers are started to perform an arbitrary dissection for longitudinal or sagittal images, based on the start and end positions Slice1 and Slice2.
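The dispatch rule described above can be sketched as follows. This is an illustrative Python sketch, not the original implementation. It assumes that each Gserver holds a run of consecutive images and that, for Trans, Slice1 is the index of the requested transverse image; the function name and the `subset_sizes` parameter are hypothetical.

```python
def gservers_for_task(stype, slice1, subset_sizes):
    """Return the indices of the Gservers that Gboss would start.

    subset_sizes[i] is the number of consecutive images held by Gserver i
    (an assumed layout of the database).
    """
    if stype == "Trans":
        # Only the Gserver holding the requested transverse image is needed.
        first = 0
        for index, count in enumerate(subset_sizes):
            if first <= slice1 < first + count:
                return [index]
            first += count
        raise ValueError("image index outside the database")
    if stype in ("Long", "Sagit"):
        # Every Gserver contributes one piece of the dissection image.
        return list(range(len(subset_sizes)))
    raise ValueError("unknown Stype: " + repr(stype))
```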
The data for a Gslice message are generated in the Gservers' memory in parallel for the best responsiveness. A parallel projection algorithm is used to keep the algorithm simple and to keep the generated images the same width as the original images. This simple algorithm for dissection along an arbitrary direction performs a linear transformation between space coordinates and image coordinates and picks up the nearest pixel value.
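A minimal Python sketch of nearest-pixel sampling along a dissection line is given below. It is one reading of the description, not the authors' code. It assumes that Slice1 and Slice2 are the cut positions at the two opposite edges of each transverse image, and that a longitudinal cut keeps the image width while a sagittal cut keeps the image height. The names `extract_piece` and `nearest` are hypothetical.

```python
import math

def nearest(value):
    """Round to the nearest integer, with halves rounded up."""
    return int(math.floor(value + 0.5))

def extract_piece(stack, stype, slice1, slice2):
    """Extract one output row per image in the stack held by a Gserver.

    stack[z][y][x] is the pixel at column x, row y of image z.
    """
    height = len(stack[0])
    width = len(stack[0][0])
    piece = []
    for image in stack:
        row = []
        if stype == "Long":
            # Walk across the full width; interpolate the row index linearly.
            for x in range(width):
                t = x / (width - 1) if width > 1 else 0.0
                y = nearest(slice1 + t * (slice2 - slice1))
                y = min(max(y, 0), height - 1)
                row.append(image[y][x])
        elif stype == "Sagit":
            # Walk down the full height; interpolate the column index linearly.
            for y in range(height):
                t = y / (height - 1) if height > 1 else 0.0
                x = nearest(slice1 + t * (slice2 - slice1))
                x = min(max(x, 0), width - 1)
                row.append(image[y][x])
        else:
            raise ValueError("unknown Stype: " + repr(stype))
        piece.append(row)
    return piece
```

Because each image contributes exactly one output row, a Gserver holding 50 images returns a 50-row piece, and Gboss only has to stack the pieces.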
In our experiments, we used 35 Macintosh G3 computers (Apple Inc.) running MacOS X Server software (BSD Unix). Each has a PowerPC G3/450MHz CPU, 512MB of main memory and a 9GB hard disk. They are connected by a 100Mbps Fast Ethernet network. The Gboss and Gserver programs are implemented in Unix C to increase portability to other OS platforms. The 1878 original images (VHP Male), about 15GB in total, are divided into 35 subsets; each Gserver keeps 50 or 100 images, amounting to about 373MB or 747MB.
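The subset sizes and the piece sizes reported in Table 1 can be checked with a short calculation. The sketch below assumes RGB24 images of 2048 x 1216 pixels, a resolution that is not stated in this paper. Under that assumption a 50-image subset is 373,555,200 bytes and a 100-image subset is 747,110,400 bytes, in line with the figures above. The function names are hypothetical.

```python
WIDTH = 2048         # assumed image width in pixels (not stated in the paper)
HEIGHT = 1216        # assumed image height in pixels (not stated in the paper)
BYTES_PER_PIXEL = 3  # RGB24

def subset_bytes(n_images):
    """Memory needed by a Gserver that holds n_images full images."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL * n_images

def piece_bytes(stype, n_images):
    """Size of the piece one Gserver extracts: one row per image it holds."""
    if stype == "Long":
        row_pixels = WIDTH   # longitudinal cut keeps the image width
    elif stype == "Sagit":
        row_pixels = HEIGHT  # sagittal cut keeps the image height
    else:
        raise ValueError("unknown Stype: " + repr(stype))
    return row_pixels * BYTES_PER_PIXEL * n_images
```

For 50 and 100 images this gives 307,200 and 614,400 bytes for a longitudinal piece, and 182,400 and 364,800 bytes for a sagittal piece, which are the byte ranges listed in Table 1.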
Table 1: Performance Measurement
Operation | Process Time | Data Transfer Time | Gboss Involved Data | Gserver Involved Data
373.5MB - 747.1MB
307,200B - 614,400B
307,200B - 614,400B
182,400B - 364,800B
182,400B - 364,800B
Process Time for a Gserver includes the time to receive the task message and extract the dissection data. Data Transfer Time for Gboss includes the time to read data from the Gservers, generate the large image and send it to the client over the socket connection. Gboss Involved Data is the size of the data transferred to clients, and Gserver Involved Data is the size of the data extracted in each Gserver (Table 1). With a straightforward method that retrieves arbitrary dissection images on 1 CPU, including loading and processing the image data from local disk, longitudinal dissection takes 2,139 seconds and sagittal dissection takes 2,129 seconds. With our system it takes only 1-2 seconds, which means the system is about 1,000 times faster with 35 CPUs than with 1 CPU.
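The speedup figure follows directly from the measured times, as the small Python check below shows. It uses the 2-second end of the reported 1-2 second range; the function name is hypothetical.

```python
def speedup(single_cpu_seconds, distributed_seconds):
    """Ratio of the single-CPU time to the distributed response time."""
    return single_cpu_seconds / distributed_seconds

longitudinal = speedup(2139, 2)  # 1069.5
sagittal = speedup(2129, 2)      # 1064.5
```

Both ratios are far above the number of CPUs (35), because the single-CPU baseline includes loading the image data from local disk while the Gservers already hold their subsets in memory.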
In this article, we have proposed a distributed processing solution for high-resolution medical images. The high performance comes from (a) highly parallel in-memory processing by Gservers connected by a fast network, (b) centralized management of the parallel processors by Gboss, which provides a simple interface for clients, (c) a simple retrieval algorithm and a portable implementation on independent UNIX computers, and (d) improved response time from replacing the previous NFS data connection with TCP/IP socket connections. As future work, we have designed a parallel computer with 50 processors running Linux on low-cost PC cores, which can do the above work more efficiently.
 Aoki, Tatsumi, Nogawa, Akashi, Nakahashi, Guo: A Parallel Approach for VHP Image Viewer, Internet Workshop 2000, Intl. Workshop on APAN and its Applications, Proceedings pp.209-214, Feb. 2000.
 Tatsumi, Nogawa, Aoki, Nakamura, Nakahashi, Akashi: Next Generation Internet and Medical Virtual Private Network, The 19th Joint Conference on Medical Informatics, Proceedings pp.36-39, Nov.1999.
 Nogawa, Tatsumi, Nakamura, Kato, Takaoki: An Application of an End-User Computing Environment for the Visible Human Project, The Second Visible Human Project Conference, Proceedings pp.99-100, Oct. 1998.
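Summary: To use high-resolution medical images in education, arbitrary cross-sections must be extracted and displayed in real time. However, because these image data sets are enormous, this is difficult to achieve on an ordinary computer. In this study, we propose a method that processes the image data in memory through highly distributed processing on PCs connected by a high-speed network. We verified the effectiveness of the proposal with a prototype system of 35 CPUs: extracting an arbitrary cross-section took 2,100 seconds on 1 CPU with the conventional method, whereas the proposed method completed it within 2 seconds.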