SAGE - Sage feature


Effective Perl Programming:
Simplicity Is a Good Object


by Joseph N. Hall

Joseph N. Hall is the author of Effective Perl Programming (Addison-Wesley, 1998). He teaches Perl classes, consults, and plays a lot of golf in his spare time.


In my previous column, I discussed Perl's framework for object-oriented programming. In this column, I discuss how some common object-oriented programming paradigms can be addressed with Perl. As before, I'm assuming you have a Perl reference handy to look up any features that you haven't already encountered.

Member Variables

By and large, objects in Perl are implemented using blessed hashes. Used this way, Perl hashes are analogous to "structs" or "records" in other programming languages. By extension of the analogy, each key-value pair in a hash represents a member variable. In reality, there are considerable differences in implementation: C++ structs/objects have a fixed number and type of members and a fixed layout in memory, while Perl hashes have an unconstrained composition and size. However, from a user's standpoint, things seem pretty much the same:

// C++ version:

joseph = new Person("Joseph", "Hall");
first_name = joseph->first;

# Perl version:

$joseph = new Person('first' => 'Joseph', 'last' => 'Hall');
$first_name = $joseph->{first};

The analogy works up to the point where you introduce the notion of public versus private member variables. Many object-oriented programming languages support a partitioning of the namespace of member variables such that some member variables, the "private" ones, are visible only within methods belonging to that object's class, or perhaps to methods derived from that class. Put another way, the private member data are hidden. But "public" member variables are accessible everywhere. Although Perl has some namespace-related features, like packages, Perl does not have any features that directly address object data hiding. So long as Perl objects are implemented as hashes, member variables (which are really just key-value pairs, remember) are always visible to any code using those objects. This becomes a fundamental rule in Perl, at least as the language now stands: in Perl, all member variables are public.
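To make the rule concrete, here is a minimal sketch (not taken from the previous column) of a hash-based Person class whose members are read, changed, and even extended by code outside the class. The `_secret` key is a hypothetical name chosen for illustration.

```perl
use strict;
use warnings;

package Person;

sub new {
    my $class = shift;
    my $self  = { @_ };              # key-value pairs become member variables
    return bless $self, $class;
}

package main;

my $joseph = Person->new(first => 'Joseph', last => 'Hall');

# Nothing prevents outside code from reading a member directly ...
my $first_name = $joseph->{first};

# ... or changing it, or adding a member the class never anticipated.
$joseph->{first}   = 'Joe';
$joseph->{_secret} = 'not really hidden';    # hypothetical key

print join(', ', sort keys %$joseph), "\n";
```

The leading underscore on `_secret` is only a naming hint; Perl treats the key like any other.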

You can struggle against this rule, and you can do various things to try to work around it (and I'll even show you one later), but you can't alter Perl's fundamental nature, not by writing Perl code, anyway. In a bit, I'll discuss ways of dealing with the lack of data-hiding features. First, though, let's turn away from member variables and look at class variables.

Public and Private Class Variables

In the lingo of object-oriented programming, a "class variable" is a variable whose single instance is shared by all members of that class. In Perl, classes are packages, so you might think that a Perl package variable would be the natural representation of a class variable in Perl. If so, good thinking! For example, suppose we want to extend our Person class from the last article to include a count of all the Person objects created so far. We need only add a package variable and modify the constructor accordingly:

package Person;

sub new {                       # constructor for class Person
    my $class = shift;
    my $self  = { @_ };
    $count++;                   # $Person::count contains the count
    bless $self, $class;        # return a new Person
}

Within package Person, $count contains the number of times the Person constructor has been called. Outside Person, we can still access the value with the qualified name $Person::count. We have created a public class variable. Suppose, however, that we would like to make $count private, that is, visible only to methods of the Person class. Once again, Perl lacks features allowing us to precisely address this data-hiding need, but in this case, we can come pretty durn close to what we want by using another feature. So long as we can group the methods of class Person into a single file or portion of a file, we can use a my variable to hide $count from the rest of the world:

{                               # put all Person methods inside these braces
    package Person;
    my $count;                  # $count is visible only within the braces

    sub new {
        my $class = shift;
        my $self  = { @_ };
        $count++;
        bless $self, $class;
    }

    sub get_count {             # add a class method to return the value of $count
        my $class = shift;      # should be 'Person' or subclass; we ignore it anyway
        $count;                 # return value of $count to the outside world
    }
}                               # $count no longer visible beyond here

# some intervening code ...

print "the current count is ", Person->get_count, "\n";    # example usage
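As a quick check of the hiding, the following self-contained sketch creates a few objects and then looks for the count both ways: through the class method and through the qualified package name. It assumes nothing beyond the listing above, apart from initializing the counter to 0.

```perl
use strict;
use warnings;

{
    package Person;
    my $count = 0;                   # lexical: visible only inside these braces

    sub new {
        my $class = shift;
        my $self  = { @_ };
        $count++;
        return bless $self, $class;
    }

    sub get_count { return $count }  # class method; the invocant is ignored
}

Person->new(first => 'Joseph') for 1 .. 3;

# The class method can see the lexical ...
print "via method: ", Person->get_count, "\n";

# ... but $Person::count is a separate package variable that was never set.
print "via package variable: ",
    (defined $Person::count ? $Person::count : 'undef'), "\n";
```

The second line of output reports undef: the my variable never enters the package's symbol table, so the qualified name refers to a different, untouched variable.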

A more conventional and advisable practice is to write Person as a module (modules are, alas, a topic for another day). Then all of Person goes into a single file, which provides a scope for the my variables and eliminates the need for the enclosing braces. Sometimes you may want the braces anyway, but as part of a BEGIN block:

BEGIN {                         # all code in here is executed at compile time
    package Person;
    my $count = 1000;           # initialize $count to 1000 before new is called

    sub new {
        # ... rest same as above; methods work the same enclosed in BEGIN
    }
}

Putting my $count = 1000 inside a BEGIN block ensures that $count is initialized before any of the methods of Person are called by the outside world, even if (for example) the code for Person appears textually after the first code that calls Person::new. The same thing is happening if Person is implemented as a separate module incorporated with the use directive: the code for Person is included as if it were all surrounded by a BEGIN block.
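The difference is easiest to see side by side. In this sketch, the class names Plain and WithBegin and the method next_id are hypothetical, invented for the comparison; both classes are called before their code appears in the file.

```perl
use strict;
use warnings;

# These calls appear textually before either class is defined.
my $early_plain = Plain->next_id;
my $early_begin = WithBegin->next_id;

{
    package Plain;
    my $count = 1000;        # this assignment runs at run time, after the call above
    sub next_id { return ++$count }
}

BEGIN {
    package WithBegin;
    my $count = 1000;        # this assignment runs at compile time
    sub next_id { return ++$count }
}

print "without BEGIN: $early_plain\n";   # counter started from undef
print "with BEGIN:    $early_begin\n";   # counter started from 1000
```

Both subroutines exist by the time the program runs, because named subs are defined at compile time. Only the initialization differs: in Plain, the early call increments an undefined $count and returns 1, and the later assignment then resets the counter to 1000.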

Private Member Variables, Sort of

Okay, so Perl doesn't really support private member variables. But no doubt if you've been using Perl for long, you've seen all kinds of weird language gymnastics that make seemingly impossible things possible. What about this case? Is there something we can do to implement private member variables in Perl? My answer is still no, not really. There just isn't a way to hide private member variables efficiently in Perl. This doesn't stop you from taking stabs at it. For example, the perltoot man page suggests using a closure (yet another topic for another day) as an object. True, data can be hidden very thoroughly within a closure, but (1) closures make very bulky, space-inefficient objects and (2) you run into a Godel-Escher-Bachish mess trying to come up with a way to reveal the contents of the closure to class methods without once again revealing them to the rest of the world. The perltoot example is not well written, and in any event the underlying concept is flawed.
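For readers who haven't seen the closure approach, here is a minimal sketch of the general idea. It is my own reduction, not the perltoot code: the object is a blessed code reference, and the data lives in a lexical hash that only the closure can reach.

```perl
use strict;
use warnings;

package Person;

sub new {
    my $class = shift;
    my %data  = @_;                  # reachable only through the closure below
    my $self  = sub {
        my $field = shift;
        $data{$field} = shift if @_;
        return $data{$field};
    };
    return bless $self, $class;
}

# A class method gets at the data by calling the closure ...
sub first { my $self = shift; return $self->('first', @_) }

package main;

my $joseph = Person->new(first => 'Joseph');

print $joseph->first, "\n";

# ... but so can anyone else holding the object.
print $joseph->('first'), "\n";
```

The hash itself is hidden, since $joseph->{first} no longer works, but the closure is the only door to the data and it opens for class methods and outside code alike. That is the second problem described above.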

If you have a limited amount of private data for each object, you might try something along the following lines. I've written some code that associates a unique identifier with each Person object:

{                               # for the my variables (or use a separate file)
    package Person;
    my $count;                  # $count is visible only within the braces
    my %PRIVATE_ID;             # also visible only within the braces

    sub new {
        my $class = shift;
        my $self  = bless { @_ }, $class;    # bless before hash assignment
        $count++;
        $PRIVATE_ID{$self} = $count;
        $self;
    }

    sub get_id {
        my $self = shift;       # should be reference to Person object
        $PRIVATE_ID{$self};     # look up id for this object
    }

    sub DESTROY {               # destructor cleans up as necessary
        my $self = shift;
        delete $PRIVATE_ID{$self};
    }
}

Here I've used an object reference, $self, as an index into a hash, %PRIVATE_ID. The hash is shared by the methods of Person and is inaccessible elsewhere. After a few moments of study, the way this code works may seem clear to you, but be careful. The code $PRIVATE_ID{$self} probably doesn't work the way you think it does. Perl cannot use a reference directly as a hash key because hash keys must be strings. The reference $self is converted to a string by Perl; the string will look something like "Person=HASH(0x18d76b4)". This string is guaranteed to remain unique so long as the object it refers to hasn't been destroyed. It's not a bad idea to employ a destructor (the subroutine named DESTROY, which is automatically called when objects of class Person are destroyed by Perl) to ensure that the contents of the hash remain consistent with the objects currently in existence.
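The string conversion is easy to observe. In this small sketch, %id_for is a hypothetical stand-in for %PRIVATE_ID, placed in the main program so the key can be inspected.

```perl
use strict;
use warnings;

package Person;

sub new { my $class = shift; return bless { @_ }, $class }

package main;

my $joseph = Person->new(first => 'Joseph');

my %id_for;                      # hypothetical stand-in for %PRIVATE_ID
$id_for{$joseph} = 1;            # $joseph is stringified to form the key

my ($key) = keys %id_for;

print "key:                 $key\n";
print "key is a reference?  ", (ref $key ? 'yes' : 'no'), "\n";
print "matches \"\$joseph\"?   ", ($key eq "$joseph" ? 'yes' : 'no'), "\n";
```

The key comes back as a plain string, so it can be used to look an object's data up but not to get back to the object itself. The hexadecimal address in the string will differ from run to run.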

This technique is speedy enough for many uses and is certainly space-efficient. Is it a hack nonetheless? I'd have to say yes, it is.

Keep It Simple, Stupid

The simplest and most efficient way to implement private instance variables is to do it by convention. Give private variables special names: for example, begin them with an underscore. Or, perhaps, provide a method-based interface for manipulating instance variables:

package Person;

# ... stuff from above ...

sub get_first {                 # return value of first
    my $self = shift;
    $self->{first}
}

sub get_last { shift->{last} }  # shorter version

Or even:

sub first {
    my $self = shift;
    $self->{first} = shift if @_;
    $self->{first};
}

The latter version permits a single method to function as a means of setting as well as retrieving an object value: $joseph->first to get $joseph's first name, and $joseph->first('Joe') to change it.
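Put together with a bare-bones constructor (a sketch standing in for the Person class built up earlier), the combined accessor can be exercised like this:

```perl
use strict;
use warnings;

package Person;

sub new { my $class = shift; return bless { @_ }, $class }

sub first {
    my $self = shift;
    $self->{first} = shift if @_;    # an argument means "set"
    return $self->{first};           # always return the current value
}

package main;

my $joseph = Person->new(first => 'Joseph', last => 'Hall');

my $before = $joseph->first;         # get
$joseph->first('Joe');               # set
my $after  = $joseph->first;         # get again

print "$before -> $after\n";
```

One consequence of testing @_ is that $joseph->first(undef) counts as a set and clears the name, which may or may not be what a caller expects.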

The one thing I don't recommend is creating complex boilerplate or conventions to impose your object-oriented desires on Perl. Perl has a simple object-oriented framework, and it is best enjoyed and appreciated for what it is. Perl doesn't make a very good Smalltalk or Eiffel or (especially) C++. That works both ways, though. Those languages make lousy Perls, and I'll take simplicity over complexity every time I can get away with it. By the way, I ran out of space for my CPAN subclassing example, but you can expect to see it in a forthcoming column.


6th November 1998 jr
Last changed: 6th November 1998 jr