java performanceMemory Fragmentation![]()
by Glen McCluskey
Glen McCluskey is a consultant with 15 years of experience and has
focused on programming languages since 1988. He specializes in Java and
C++ performance, testing, and technical documentation areas.
This problem has an obvious solution, which is to represent the data structure in pieces and map indices into the structure into the appropriate piece. Such a technique will work with any programming language, but languages like C++ and Java make it easier, because the details of representation and mapping can be encapsulated within a class. To see how this works, consider a Java class to represent very large or sparse arrays of objects: import java.util.*;
public class SparseArrayList extends AbstractList
{
// actual page size that is set
// pages
// default constructor
// constructor specifying page size
// number of slots currently allocated
// set an array slot to a given value and return
the old value int p = index / pagesz; // if page array not big enough, expand
if (p >= pages.length) { // need to allocate a new page?
if (pages[p] == null)
Object old = pages[p][index %
pagesz];
return old;
// get the value of a given array slot, null if
none
int p = index / pagesz;
return pages[p][index % pagesz]; Java 2 contains a set of classes known as a "collections framework," and the SparseArrayList class is based on this framework. Collection and List are top-level interfaces (an interface is a specification of a set of methods that an implementing class must declare and implement) in the framework, and the abstract classes AbstractCollection and AbstractList partially implement the functionality of these interfaces. SparseArrayList need only define constructors and the size(), get(), and set() methods in order to fully implement the interfaces. SparseArrayList represents an array as a series of pages, each containing a default of 1024 elements. An array index is split apart to find the page and offset within the page. A page is allocated only if needed. An array of 1 million elements will use 977 pages (1 million / 1024), so the array will be represented as a page array 977 long, together with a series of object arrays each 1024 long. Note that specifying very small page sizes may result in frequent reallocation and copying of the pages array, hurting performance. One solution to this problem would be to define a constructor that specifies the maximum size of the array in advance, and another solution would be to grow the pages array by more than one page slot at a time. A simple program that exercises this class is: import java.util.*;
public class testarray1 { for (int i = 1; i <= 1000000; i++) { // generate a random number int r = rn.nextInt(1000000);
// set its array slot and
then
sal.set(r, new
Integer(r)); There are obvious advantages to using a class to encapsulate a data structure. But what about the costs of doing so? Accessing an array through a method is slower than using primitive machine operations. One answer to the question is to come up with a simple benchmark: import java.util.*;
public class testarray2 {
for (int i = 1; i <= 1000000;
i++) { This program sets and then gets 1 million elements from a SparseArrayList structure. It requires around 1.35 seconds to run on a 300MHz Pentium, or 1.35 microseconds per iteration. For 8 million elements, the time increases to 1.7 microseconds per iteration, reflecting the extra time to reallocate the page table. One final point about SparseArrayList. Because it is designed to operate within the collections framework described above, an object of type SparseArrayList can be specified as a method argument wherever a Collection or List is called for, and standard operations such as iterators are automatically available. For example, the following program creates a SparseArrayList and then sets the values of the first few elements. Then the element values are dumped out via an iterator. import java.util.*;
public class testarray3 {
sal.set(0, new Integer(0));
Iterator iter =
sal.iterator(); The output is:
0 The iterator mechanism uses size() and get(), and thus the values for the whole page are reported.
|
![]() Last changed: 16 Mar. 1999 jr |
|