Performance Comparison of Commercial and Open-source Compilers using SPEC CPU2006
Sergio Aldea López, Diego R. Llanos Ferraris and Arturo González-Escribano
The work of the compiler is fundamental to exploiting the hardware capabilities of a system running a particular application, not only to improve sequential execution time but also to make it possible to parallelize part of the code automatically. This article explores the relative performance of the code generated by three popular compilers (Intel C++/Fortran Compiler 10.0, Sun Studio 12 and GCC 4.1.2). To perform the comparison, we ran the reference tests provided by the SPEC CPU2006 benchmark suite using these compilers. Both sequential and automatically parallelized performance are analyzed, using different hardware architectures and configurations. The study includes a detailed description of the problems encountered while running the SPEC CPU2006 benchmarks with these compilers. Our evaluation shows that, in general terms, the Sun compiler obtains the best results with integer benchmarks, while Intel is the best choice for floating-point applications. With respect to the autoparallelization options, the commercial compilers considered noticeably improve on the performance of sequential code in only a few applications, while the GCC version evaluated does not support this feature.
This website contains the set of reports used in the comparison and the compilation flags used to produce them.
The following list shows the optimization flags used with each compiler to obtain the reports:
- GCC
  - without optimizations: -O2
  - with optimizations: -O3 -funroll-loops -fno-inline-functions -ftree-vectorize
- INTEL
  - without optimizations: -O2, plus -nofor-main (C and Fortran at once)
  - with optimizations (32-bit): -O3 -ipo -xT -axT -no-prec-div -funroll-all-loops, plus -nofor-main (C and Fortran at once)
  - with optimizations (64-bit): -O3 -ipo -xW -axW -no-prec-div -funroll-all-loops, plus -nofor-main (C and Fortran at once) and -parallel (parallel version)
- SUN
  - without optimizations: -xO3, plus -library=stlport4 (C++ benchmarks except 453.povray)
  - with optimizations (32-bit): -fast -xarch=sse3, plus -library=stlport4 (C++ benchmarks except 453.povray)
  - with optimizations (64-bit): -fast -xarch=sse3 -m64, plus -library=stlport4 (C++ benchmarks except 453.povray) and -xautopar -xreduction (parallel version)
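As a reading aid, the flag sets listed above can be sketched as a small lookup table that assembles the flag list for one build. This is only an illustration: the dictionary layout, the key names (`base`, `opt32`, `opt64`, `mixed`, `cxx`, `parallel`) and the function `flags_for` are assumptions made here, not part of the study or of the SPEC tooling.

```python
# Flag sets transcribed from the list above. The structure and names below
# are hypothetical and serve only to make the combinations explicit.
FLAGS = {
    "gcc": {
        "base": ["-O2"],
        "opt": ["-O3", "-funroll-loops", "-fno-inline-functions", "-ftree-vectorize"],
    },
    "intel": {
        "base": ["-O2"],
        "opt32": ["-O3", "-ipo", "-xT", "-axT", "-no-prec-div", "-funroll-all-loops"],
        "opt64": ["-O3", "-ipo", "-xW", "-axW", "-no-prec-div", "-funroll-all-loops"],
    },
    "sun": {
        "base": ["-xO3"],
        "opt32": ["-fast", "-xarch=sse3"],
        "opt64": ["-fast", "-xarch=sse3", "-m64"],
    },
}

# Conditional flags: added only for certain benchmarks or for the parallel runs.
EXTRA = {
    "intel": {"mixed": ["-nofor-main"], "parallel": ["-parallel"]},
    "sun": {"cxx": ["-library=stlport4"], "parallel": ["-xautopar", "-xreduction"]},
}


def flags_for(compiler, level, mixed_c_fortran=False, cxx=False, parallel=False):
    """Return the flag list for one compiler and optimization level.

    Conditional flags are appended in the order mixed, cxx, parallel.
    Compilers without an entry for a condition (for example GCC, which has
    no autoparallelization flags in this list) simply get nothing added.
    """
    flags = list(FLAGS[compiler][level])
    extra = EXTRA.get(compiler, {})
    if mixed_c_fortran:
        flags += extra.get("mixed", [])
    if cxx:
        flags += extra.get("cxx", [])
    if parallel:
        flags += extra.get("parallel", [])
    return flags


if __name__ == "__main__":
    # Sun, 64-bit optimized, C++ benchmark, parallel version.
    print(" ".join(flags_for("sun", "opt64", cxx=True, parallel=True)))
```

Note that the C++ library flag for Sun does not apply to 453.povray, so a caller would pass `cxx=False` for that benchmark.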
Compiler | Sequential 32-bit | Sequential 64-bit
---|---|---
GCC | INT / FP | INT / FP
INTEL | INT / FP | INT / FP
SUN | INT / FP | INT / FP

Compiler | Sequential 64-bit | Parallel 64-bit
---|---|---
GCC | INT / FP | -
INTEL | INT / FP | INT / FP
SUN | INT / FP | INT / FP
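
One way to compare a sequential 64-bit report against its parallel counterpart is to compute a per-benchmark speedup and summarize it with a geometric mean, the kind of average SPEC uses for its overall metrics. The sketch below assumes the run times have already been extracted from the reports into dictionaries. The benchmark names and timings are made-up placeholders, not results from this study, and the function names are hypothetical.

```python
from math import prod


def speedups(sequential, parallel):
    """Per-benchmark speedup: sequential time divided by parallel time.

    Only benchmarks present in both inputs are compared.
    """
    return {
        name: sequential[name] / parallel[name]
        for name in sequential
        if name in parallel
    }


def geometric_mean(values):
    """Geometric mean of a non-empty collection of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


if __name__ == "__main__":
    # Placeholder timings in seconds (NOT measured data).
    seq = {"bench_a": 100.0, "bench_b": 200.0}
    par = {"bench_a": 50.0, "bench_b": 200.0}

    per_benchmark = speedups(seq, par)
    for name, s in per_benchmark.items():
        print(f"{name}: {s:.2f}x")
    print(f"geometric mean: {geometric_mean(per_benchmark.values()):.3f}x")
```

A speedup close to 1.0 means autoparallelization left the run time essentially unchanged for that benchmark.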