Comparison of Java and C++ - Performance

Performance

In addition to running a compiled Java program, computers running Java applications generally must also run the Java virtual machine (JVM), while compiled C++ programs can run without external applications. Early versions of Java were significantly outperformed by statically compiled languages such as C++. This is because a statement in these two closely related languages may compile to a few machine instructions in C++, whereas in Java it compiles to several byte codes, each of which costs several machine instructions when interpreted by a JVM. For example:

Java/C++ statement    C++ generated code (x86)    Java generated byte code
vector[i]++;          mov edx,                    aload_1
                      mov eax,                    iload_2
                      inc dword ptr               dup2
                                                  iaload
                                                  iconst_1
                                                  iadd
                                                  iastore
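A byte code listing like the one in the table can be reproduced with a small sketch. It assumes the Java statement is an array-element increment, which the iaload and iastore instructions suggest. The class name ArrayIncrement and the method name increment are hypothetical. The method is an instance method with the array and the index as its first two parameters, so that they occupy local variable slots 1 and 2, matching aload_1 and iload_2.

```java
public class ArrayIncrement {
    // Instance method: slot 0 holds `this`, slot 1 `vector`, slot 2 `i`.
    void increment(int[] vector, int i) {
        // Load the array reference and index, duplicate both, read the
        // element, add one, and store the result back to the same element.
        vector[i]++;
    }

    public static void main(String[] args) {
        int[] vector = {1, 2, 3};
        new ArrayIncrement().increment(vector, 1);
        System.out.println(vector[1]);
    }
}
```

Compiling this with javac and disassembling it with javap -c ArrayIncrement should list the byte codes of increment. The exact output may vary with the compiler version.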

Certain inefficiencies are inherent to the Java language itself, primarily:

  • All objects are allocated on the heap. For functions using small objects this can degrade performance, since stack allocation, in contrast, costs essentially nothing. However, modern JIT compilers largely remove this disadvantage by using escape analysis or escape detection to allocate objects on the stack. Escape analysis was introduced in Oracle JDK 6.
  • Methods are virtual by default. This slightly increases memory usage by adding a single pointer to a virtual table per object. It also induces a startup performance penalty, since a JIT compiler must perform additional optimization passes even for de-virtualization of small functions.
  • Much casting is required, even when using standard containers, and this induces a performance penalty. However, most of these casts are statically eliminated by the JIT compiler, and the casts that remain in the code usually do not cost more than a single CPU cycle on modern processors, thanks to branch prediction.
  • Array access must be safe. The compiler is required to put appropriate range checks in the code. The naive approach of guarding each array access with a range check is not efficient, so most JIT compilers generate range check instructions only if they cannot statically prove the array access is safe. Even when not all runtime range checks can be statically elided, JIT compilers try to move them out of inner loops to keep the performance degradation as low as possible.
  • Lack of access to low-level details prevents the developer from improving the program where the compiler is unable to do so. However, programmers can interface with the OS directly by providing code in C or C++ and calling that code from Java by means of the Java Native Interface (JNI).
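Two of the points above, heap allocation of small objects and array range checks, can be illustrated with a minimal sketch. All names here (JitFriendlyPatterns, Point, sumOfCoordinates, sumArray) are hypothetical. The code only shows shapes that a JIT compiler can in principle optimise. Whether a particular JVM actually does so depends on its version and settings, and the code itself cannot show that.

```java
public class JitFriendlyPatterns {
    // Small value-like class used only as a temporary.
    private static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Each Point is created, read, and dropped inside one loop iteration.
    // It is never stored in a field, passed elsewhere, or returned, so it
    // does not escape and is a candidate for escape analysis.
    static long sumOfCoordinates(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.x + p.y;
        }
        return total;
    }

    // The index runs from 0 up to data.length - 1, so every access is
    // provably in range and the per-access check is a candidate for
    // elimination or hoisting out of the loop.
    static long sumArray(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfCoordinates(1_000));
        System.out.println(sumArray(new int[] {1, 2, 3}));
    }
}
```

The results are the same whether or not the optimisations are applied. Only the cost of allocation and range checking differs.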

In contrast, various optimizations in C++ are either too difficult or impractical to implement:

  • Pointers make optimization difficult since they may point to arbitrary data. However, in some cases this no longer applies, because newer compilers apply a strict-aliasing rule and support compiler extensions equivalent to the C99 keyword restrict.
  • Java garbage collection may have better cache locality than the usual usage of malloc/new for memory allocation, as its allocations are generally made sequentially. Nevertheless, arguments exist that both allocators equally fragment the heap and neither exhibits better cache locality.
  • Due to the lack of garbage collection in C++, programmers must supply their own memory management code, often in the form of reference-counted smart pointers.
  • Since the code generated from different instantiations of the same templated class in C++ is not shared, excessive use of templates may lead to a significant increase in executable code size.
  • Run-time compilation can potentially use additional information available at run-time to improve code more effectively, such as the processor on which the code will be executed. However, this advantage is largely obsolete, as most state-of-the-art C++ compilers generate multiple code paths to employ the full computational abilities of the given system.
  • Run-time compilation allows for more aggressive virtual function inlining than is possible for a static compiler, because the JIT compiler has complete information about all possible targets of the virtual call, even if they are in different dynamically loaded modules. Currently available JVM implementations have no problem inlining most monomorphic, mostly monomorphic and dimorphic calls, and research is in progress to also inline megamorphic calls, thanks to the invokedynamic enhancements added in Java 7. Inlining can allow for further optimisations like loop vectorisation or loop unrolling, resulting in a large overall performance increase.
  • Because dynamic linking is performed after code generation and optimisation in C++, function calls spanning different dynamic modules cannot be inlined.
  • Because thread support is provided by libraries in C++, C++ compilers have little opportunity to perform thread-related optimisations. In Java, thread synchronisation is built into the language, so the JIT compiler can, with the help of escape analysis, elide or coarsen locks, significantly improving the performance of multithreaded code. Sun JDK 6 also ships a related lock optimisation named biased locking.
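The virtual call inlining and lock elision points above can be sketched as follows. The names RuntimeOptimisationCandidates, Shape, Square, totalArea and join are hypothetical. The sketch only shows code a JIT compiler could optimise in the ways described. It does not show that any given JVM does so.

```java
public class RuntimeOptimisationCandidates {
    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        private final double side;

        Square(double side) {
            this.side = side;
        }

        @Override
        public double area() {
            return side * side;
        }
    }

    // If only Square instances ever reach this loop at run time, the call
    // site s.area() is monomorphic and is a candidate for inlining.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    // StringBuffer.append is synchronized, but buf is a local object that
    // never escapes this method, so no other thread can contend for its
    // lock and the lock operations are candidates for elision.
    static String join(String a, String b) {
        StringBuffer buf = new StringBuffer();
        buf.append(a);
        buf.append(b);
        return buf.toString();
    }

    public static void main(String[] args) {
        Shape[] shapes = {new Square(1.0), new Square(2.0)};
        System.out.println(totalArea(shapes));
        System.out.println(join("a", "b"));
    }
}
```

A static C++ compiler would need whole-program knowledge to make the same inlining decision, whereas a JIT compiler can act on the types it actually observes while the program runs.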
