java performance

A Cost Model for Java Performance

by Glen McCluskey
<[email protected]>

Glen McCluskey is a consultant with 15 years of experience and has focused on programming lan-guages since 1988. He specializes in Java and C++ perfor-mance, testing, and technical documenta-tion areas.

In his book Programming Pearls, Jon Bentley discusses the use of cost models in connection with programming languages. A cost model identifies the relative execution times of specific language constructs. For example, you can determine roughly how much time is required to perform a divide operation as compared to an add operation, or find out how expensive function calls are. In this column we'll look at a simple cost model for the Java language.

Determining the cost of language features does have some pitfalls. One is that a compiler optimizer may optimize away simple usage, for example, summing two numbers millions of times in a loop.

Related to this problem is the issue of Java just-in-time compilers that perform dynamic compilation and optimization. Such a compiler might, for example, dynamically inline a function.

This analysis uses Sun's Java Development Kit Java interpreter, without optimization and with a switch specified that disables just-in-time compilation:

java -Djava.compiler=NONE file

where file.class exists and was created from file.java.

This discussion includes a series of benchmark programs and concludes with a summary of results. Download the benchmarks from <ftp://ftp.glenmccl.com/pub/free/cost.zip>.

Empty Loops and Arithmetic

The initial set of benchmarks focuses on the cost of looping and arithmetic. The first program measures the cost of an empty loop:

public class cost_loop { static final int N = 70000000; public static void main(String args[]) { long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) ; long elapsed = System.currentTimeMillis() - start; System.out.print("cost_loop: "); System.out.println(elapsed * 1000000 / N); } }

In these programs, the value of N is adjusted so that each program takes approximately the same amount of time to run. Elapsed time is computed in milliseconds and then scaled up to nanoseconds.

The next three programs measure the cost of integer addition, integer division, and floating-point division:

public class cost_add { static final int N = 50000000; public static void main(String args[]) { int a = 37; int b = 47; int c; long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) c = a + b; long elapsed = System.currentTimeMillis() - start; System.out.print("cost_add: "); System.out.println(elapsed * 1000000 / N); } }

public class cost_divide { static final int N = 32000000; public static void main(String args[]) { int a = 659; int b = 47; int c; long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) c = a / b; long elapsed = System.currentTimeMillis() - start; System.out.print("cost_divide: "); System.out.println(elapsed * 1000000 / N); } }

public class cost_fp { static final int N = 20000000; public static void main(String args[]) { double a = 659.84; double b = 47.37; double c; long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) c = a / b; long elapsed = System.currentTimeMillis() - start; System.out.print("cost_fp: "); System.out.println(elapsed * 1000000 / N); } }

Method Calls

The next set of programs measures the costs of method invocation. The Java language features several kinds of method calls. The first one is a static call, a call of a class method without reference to a particular object:

public class cost_static { static final int N = 26000000; static void f() {} public static void main(String args[]) { long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) f(); long elapsed = System.currentTimeMillis() - start; System.out.print("cost_static: "); System.out.println(elapsed * 1000000 / N); } }

The second is an instance method call. It requires more work because the actual method to be called must be determined dynamically, based on the type of the object. (This type of call often goes by the name "virtual function.")

public class cost_virtual { static final int N = 24000000; void f() {} public static void main(String args[]) { cost_virtual obj = new cost_virtual(); long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) obj.f(); long elapsed = System.currentTimeMillis() - start; System.out.print("cost_virtual: "); System.out.println(elapsed * 1000000 / N); } }

The last type of method call is a call through an interface reference, which requires even more work:

interface A { public void f(); } public class cost_interface implements A { static final int N = 17500000; public void f() {} public static void main(String args[]) { A obj = new cost_interface(); long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) obj.f(); long elapsed = System.currentTimeMillis() - start; System.out.print("cost_interface: "); System.out.println(elapsed * 1000000 / N); } }

Array Indexing

The Java language specifies that all array subscripts are range-checked dynamically, with an exception thrown for out-of-bounds indices. So it's interesting to examine the cost of array lookup:

public class cost_array { static final int N = 40000000; public static void main(String args[]) { int vec[] = new int[10]; for (int i = 0; i < 10; i++) vec[i] = i; int x; int j = 5; long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) x = vec[j]; long elapsed = System.currentTimeMillis() - start; System.out.print("cost_array: "); System.out.println(elapsed * 1000000 / N); } }

String Operations

The Java language has a built-in operator, +, for concatenating strings. Java strings are instances of the java.lang.String class and are more sophisticated (and expensive) than simpler C-style strings. Here is a program that measures string operation time:

public class cost_string { static final int N = 360000; public static void main(String args[]) { String a = "abc"; String b = "def"; String c; long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) c = a + b; long elapsed = System.currentTimeMillis() - start; System.out.print("cost_string: "); System.out.println(elapsed * 1000000 / N); } }

Exception Handling and Synchronized Statements

Java exception handling is a mechanism for raising, propagating, and trapping exceptions that represent error conditions. Exception handling tends to be relatively costly, given the work in unwinding the stack, trying various exception handlers, and so on. A program that evaluates this feature is:

public class cost_eh { static final int N = 18000000; public static void main(String args[]) { Throwable exc = new Throwable(); long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) { try { throw exc; } catch (Throwable e) { } } long elapsed = System.currentTimeMillis() - start; System.out.print("cost_eh: "); System.out.println(elapsed * 1000000 / N); } }

A synchronized statement — a statement executed after obtaining a lock on an object — is used in thread programming. This program times such synchronization:

public class cost_sync { static final int N = 13000000; public static void main(String args[]) { Object obj = new Object(); long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) synchronized (obj) {} long elapsed = System.currentTimeMillis() - start; System.out.print("cost_sync: "); System.out.println(elapsed * 1000000 / N); } }

Casting

A Java class reference refers to objects of a given class type and also can refer to objects of subclasses of the type. If B is a subclass of A, then an A reference can refer to A objects or to B objects. If I have an A reference, I can cast it to a B reference, but the cast must be dynamically checked, because the A reference might really refer to an A object, which cannot be cast to a B.

Here's a program that measures the cost of casting:

class A {} class B extends A {} public class cost_cast { static final int N = 30000000; public static void main(String args[]) { A aref = new B(); B bref; long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) bref = (B)aref; long elapsed = System.currentTimeMillis() - start; System.out.print("cost_cast: "); System.out.println(elapsed * 1000000 / N); } }

Creating New Objects

The final area we'll examine is the cost of creating new objects and calling the constructor for each created instance. This particular benchmark also includes the cost of garbage collection, given that you don't explicitly deallocate no-longer-used objects:

class A { A() {} } public class cost_new { static final int N = 6000000; public static void main(String args[]) { A aref; long start = System.currentTimeMillis(); for (int i = 1; i <= N; i++) aref = new A(); long elapsed = System.currentTimeMillis() - start; System.out.print("cost_new: "); System.out.println(elapsed * 1000000 / N); } }

Timing Results

We can divide the timing results into sections, according to the above organization. The times include the "overhead" cost of loop iteration, which is around 138 nanoseconds per iteration. The numbers in parentheses are adjusted nanosecond time values, with the 138 ns loop cost subtracted from the raw value in each case.

Empty Loops and Arithmetic cost_loop: 138 (0) cost_add: 175 (37) cost_divide: 290 (152) cost_fp: 485 (347)

Method Calls cost_static: 350 (212) cost_virtual: 370 (232) cost_interface: 542 (404)

Array Indexing cost_array: 192 (54)

String Operations cost_string: 25119 (24981)

Exception Handling and Synchronized Statements, cost_eh: 545 (407) cost_sync: 721 (583)

Casting cost_cast: 299 (161)

Creating New Objects cost_new: 1704 (1566)

Using the adjusted numbers, we can make some observations about the costs of various operations. For example, calling a method through an interface reference is about twice as costly as through a normal object reference. Integer division costs about four times as much as integer addition, and floating-point division costs about nine times as much.

Creating a new object is more than 40 times as expensive as a simple operation like addition. The most expensive operation of all is string concatenation, which involves multiple method calls, creation of a new string, copying, and so on.

Summary

Cost models are a useful way to gain performance insights about the features of a programming language so that you can better gauge the relative costs of different language features.