abstract from
Illinois-Intel Multithreading Library (IML):
Multithreading Support for iA-based Multiprocessor Systems
Powerful desktop multiprocessor systems based on the Intel
Architecture (iA) offer scientific, engineering, and commercial
application developers a formidable alternative at an attractive
cost-performance ratio. However, the lack of adequate
compiler and runtime library support for multithreading and parallel
processing on Windows NT makes it difficult or impossible to fully
exploit the performance advantage of these multiprocessor systems. In
this paper we describe the design, development, and initial performance
results of the Illinois-Intel Multithreading Library (IML), which aims
to provide developers of multithreaded applications with an API that is
efficient and powerful in terms of the types of parallelism it supports.
IML implements a parallel execution environment, which creates,
enqueues/dequeues, binds and schedules user-level threads on Windows
NT threads and fibers. One novel feature of IML is
its support for both loop-level (data) parallelism and task-level
(functional) parallelism, as well as nested parallel threads.
Although loop-level parallelism is most useful in scientific and
engineering applications, functional parallelism is often the norm in
multimedia, internet and Java applications. IML upgrades the
multithreading support available on iA-based Windows NT platforms
to levels comparable or superior to those found on high-end
parallel computers and supercomputers. Multithreading a number of diverse
benchmarks (ranging from POV-Ray to SPECfp95 and the BLAS routines)
using IML resulted in significant speedups on a quad-Pentium Pro
system.
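
The distinction between loop-level and task-level parallelism drawn
above can be illustrated with a minimal sketch. It is written in Python
with the standard concurrent.futures module and does not use or
reproduce the IML API; the function names (scale_chunk,
loop_parallel_scale, task_parallel_stats) and the worker counts are
assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_chunk(data, lo, hi, factor):
    """Loop body applied to one contiguous slice [lo, hi) of the iteration space."""
    return [data[i] * factor for i in range(lo, hi)]


def loop_parallel_scale(data, factor, workers=4):
    """Loop-level (data) parallelism: the same body runs on disjoint index ranges."""
    n = len(data)
    # Ceiling division, clamped to 1 so an empty input does not yield a zero step.
    chunk = max(1, (n + workers - 1) // workers)
    bounds = [(lo, min(lo + chunk, n)) for lo in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so the slices reassemble in order.
        parts = pool.map(lambda b: scale_chunk(data, b[0], b[1], factor), bounds)
        return [x for part in parts for x in part]


def task_parallel_stats(data):
    """Task-level (functional) parallelism: different functions run concurrently."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            "min": pool.submit(min, data),
            "max": pool.submit(max, data),
            "sum": pool.submit(sum, data),
        }
        return {name: f.result() for name, f in futures.items()}
```

In the first function every worker executes the same loop body on its
own slice of the data; in the second, each worker executes a different
function on the same data. The sketch shows only the structure of the
two styles and makes no claim about performance.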
Future releases of IML will support several loop scheduling schemes as
well as controlled thread migration for the purpose of dynamic load
balancing. The programmer or the compiler will thus be able to
customize scheduling on a per-loop basis, taking into consideration
performance-sensitive characteristics such as branches inside loops
and data locality. The Intel FORTRAN compiler and the Parafrase-2
experimental parallelizing compiler are being enhanced in order to
generate IML API calls automatically, thereby facilitating the
development of multithreaded application codes that fully exploit the
performance potential of iA-based multiprocessor servers and desktops.
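
The abstract does not name the loop scheduling schemes planned for
future releases. As an illustration of what choosing a scheme per loop
can mean, the sketch below contrasts two well-known schemes from the
parallel-loop literature: static block scheduling and guided
self-scheduling. It is written in Python, is not taken from IML, and
the function names static_chunks and guided_chunks are hypothetical.

```python
def static_chunks(n_iters, n_workers):
    """Static scheduling: split iterations into near-equal contiguous blocks up front."""
    base, extra = divmod(n_iters, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        # The first `extra` workers each take one additional iteration.
        size = base + (1 if w < extra else 0)
        if size:
            chunks.append((start, start + size))
        start += size
    return chunks


def guided_chunks(n_iters, n_workers):
    """Guided self-scheduling: each grab takes ceil(remaining / n_workers) iterations."""
    chunks, start = [], 0
    while start < n_iters:
        remaining = n_iters - start
        size = -(-remaining // n_workers)  # ceiling division
        chunks.append((start, start + size))
        start += size
    return chunks
```

For a loop of 10 iterations and 4 workers, static_chunks(10, 4) yields
the half-open ranges (0, 3), (3, 6), (6, 8), (8, 10), while
guided_chunks(10, 4) yields (0, 3), (3, 5), (5, 7), (7, 8), (8, 9),
(9, 10). The shrinking chunks of the guided scheme tend to help when
iteration costs vary, for example because of branches inside the loop,
whereas fixed contiguous blocks tend to preserve data locality.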