effective perl programmingPerl and SQL Databases: A Tasty TiDBIt![]()
by Joseph N. Hall
Joseph N. Hall is the author of Effective Perl Programming (Addison-Wesley, 1998). He teaches Perl classes, consults, and plays a lot of golf in his spare time.
For years, programmers have used text files as databases. UNIX is rife with examples: the passwd and group files, for instance, as well as many others in /etc. Text files work well for small amounts of data a dozen rows or perhaps a hundred or more but become cumbersome at larger sizes. They're slow to access, they can't be written simultaneously by multiple users, and they're tedious to edit. The lack of any inherent structuring in their contents also limits the usefulness of text files. If you're keeping a database in text files, and things aren't working out, the obvious alternative is a "real" database, which nowadays means an SQL database. (There are a few intermediate alternatives, like DBM files, but not many problems fit that niche.) However, in years past, an SQL database wasn't an attractive solution for an everyday problem. Database servers were expensive and not really designed with small- to medium-sized chores in mind. But all this has changed! If you are working on a standard UNIX platform, you can build and install any one of several open-source SQL database servers in an hour or two. Even better, you can talk to it directly with Perl through a straightforward "DBI" interface. Nowadays, using SQL databases from within your Perl scripts isn't just possible it's a good idea.
DBI and DBD
Each different database has its own DBD (Database Driver) module. Oracle has DBD::Oracle, Sybase has DBD::Sybase, MySQL has DBD::mysql, and so on. Each DBD provides an interface between the corresponding database client library and the database-independent DBI module. We'll use two different DBDs in the examples that follow: DBD::MySQL and a simpler alternative called DBD::CSV, which applies the DBI interface to text files in CSV (Comma Separated Value) format. The examples assume that you know some basic SQL. If you don't happen to know any SQL, there are many good books on the topic one of my favorite introductory texts is The LAN Times Guide to SQL. You can also find some SQL tutorials on the Internet.
Installing MySQL
% scripts/mysql_install_db You can now start the MySQL server. Because we're just playing around with it, let's run it on a different port and socket for now:
% setenv MYSQL_TCP_PORT 4001
With the server running, create a database called "test_foo," which we'll use later:
% bin/mysqladmin -p create test_foo
You can see how the server is doing by running the MySQL client. (If you do this later, you'll need the environment variables set.) Try the status command:
% bin/mysql
That's all you have to do for now. This installation has no security, but that's a problem you can resolve later if you decide to use MySQL for real.
Installing DBI
First, fire up the CPAN shell from the appropriate copy of Perl: % /whereever/my/binary/is/perl -MCPAN -e shell You may have to configure CPAN if this is your first time using it. Make your life easier by setting the prerequisites_policy config variable to follow. Once in the CPAN shell, verify that you can find the DBI module:
cpan> i DBI
If so, go ahead and build and install it: cpan> make DBI ... output omitted ... cpan> install DBI ... output omitted ... So far, so good. Now, install some DBDs. First do DBD::CSV. (We'll do the Text::CSV_XS module, which is a prerequisite, first.)
cpan> install Text::CSV_XS
Even if you haven't got MySQL working you'll be able to get a feel for DBI with DBD::CSV. Speaking of which, to build the MySQL DBD: cpan> make Bundle::DBD::mysql When you are asked which database to install support for, answer "1" for MySQL only (unless you also happen to have mSQL installed). When asked for the host and port, if you are running MySQL on an alternative port as suggested above, respond with "localhost:4001" (or whatever value you used). Assuming the make went smoothly, test and install the MySQL DBD: cpan> install Bundle::DBD::mysql NOTE: The DBD bundles reinstall DBI, at least in some cases. This is normal, if seemingly boneheaded.
So It's Installed Now What?
We'll start with a very simple schema consisting of a single column, HOST, containing approved host names. To make this even simpler, let's start with DBD::CSV. Here is a Perl program that will "connect" to the "database" (really it's just a bunch of files) and create a table for us:
use DBI;
The argument to the connect method is the "data source" (DSN) string. This tells DBI which driver to use (CSV in this case). It also supplies additional arguments that are passed into the driver itself. In this example, we've supplied the argument f_dir=csv, which instructs the CSV driver to create its text files in the subdirectory csv. If connect fails, it will return false, and we die because there is no particular point in continuing. The connect method returns a database handle, which we store in the variable $dbh. Database handles represent active connections. The do method is one of several ways of executing SQL statements. It takes a string and passes it to the driver for execution. Again, it returns true or false indicating success or failure, respectively. Note that we've quoted the argument to do with the generalized single quote syntax q() this isn't strictly necessary, but it makes the code easier to read. After this program runs, the csv directory will contain a file named APPROVED_HOST, named after the APPROVED_HOST table. It won't contain anything other than a single line with the name of the table's (single) column, HOST, but we'll fix that in a moment. Now, let's write a program, called approve, to insert an approved host name in the table. This is also straightforward:
use DBI;
my $host = shift or die "usage: approve host\n";
Use approve like this: % approve foo.bar.com Here we are using the multi-argument form of the do method. The second argument is a hashref of "attributes" that isn't often needed (just put undef in it). The remaining arguments are "bind values" that are bound to placeholders in the SQL argument. Each question mark in the first argument is a placeholder. When DBI executes the SQL statement, it replaces the placeholders in it with their corresponding bind values (SQL escaping them in the process). In this example, there is a single placeholder (the value in the INSERT statement) and a single bind value that gets plugged into it (the $host variable). Our last simple example is a program called ok, which prints "yes" or "no" depending on whether or not its argument is an approved host:
use DBI;
($h) = $dbh->selectrow_array(q(
The selectrow_array method is a convenient way to run an SQL query statement when you need only the first row of the result. The row is returned as a list. If the query returns zero rows, selectrow_array returns an empty list. We use this to determine whether the host was found in the table and then print out "yes" or "no" accordingly. We could have more sensibly used a COUNT here, but the CSV driver, which is very basic, doesn't support it.
Connecting with DBD::mysql
my $dbh = DBI->connect("DBI:mysql:database=test_addr;"
The first part of the DSN string has changed from DBI:CSV to DBI:mysql. The rest of the DSN string is DBD-specific. The MySQL DBD allows quite a few different options. By default it connects to a MySQL server running on the local host through a UNIX socket. Because we started the server on a different (nonstandard) socket, we have to specify a value for mysql_socket. Setting the MYSQL_UNIX_PORT variable would also work, as would using a config file. Other optional arguments for the connect method include user and password. We're using the defaults, which is fine for our test database. Let's change the schema while we're at it. We'll make HOST a primary key, and add some DATETIMEs so that we can keep track of when approvals are created and expire them after a period of time.
$dbh->do(q(DROP TABLE IF EXISTS APPROVED_HOST));
The call to disconnect is a very good idea and avoids inconsistent operation and warning messages. To save space, though, I won't always show it. Next, let's look at a revised version of the approve program:
use DBI;
my $host = shift or die "usage: approve host\n";
$dbh->do(q(INSERT INTO APPROVED_HOST
The localtime and POSIX strftime functions are handy when converting UNIX times to formats that can be understood by databases. I insert a "now" date as well as a date 180 days in the future. The dates are in "YYYY-MM-DD" format, which is readily understood by both humans and MySQL. Next, the ok program: # use statements and connect omitted ...
my $host = shift or die "usage: ok host\n";
This works like the previous version of ok, except that it also checks to see that the approval hasn't expired, and it uses a count of the matching rows (there should be only one anyway). Next, let's look at a program called approved that lists all the currently approved hosts: # use statements and connect omitted ...
my $now_td = strftime("%Y-%m-%d", localtime);
This program is the first we've looked at that uses a query that will return multiple rows. There are several ways of working with such queries. In general, you first "prepare" the SQL statement into a statement handle. Then you execute the prepared statement and iterate over the rows in the result. The prepare method returns a statement handle object ($sth in this case). After calling the execute method on the statement handle, we read the resulting table with the fetchrow_array method. There are a number of alternative ways of handling query results see the DBI documentation for more information. Our last program, which accomplishes the promised task of "approving" mail messages, requires that you have the Mail::Internet bundle:
use DBI;
my $now_td = strftime("%Y-%m-%d", localtime);
We read the message from standard input, then use a few lines of Mail::Internet voodoo to extract the host name from the From: line. Then we look up the host name in the database and return an exit status of 0 or 1 depending on whether or not it is approved.
Databases: Free and Easy!
Meanwhile, I hope that with these examples I've shown you that nowadays
SQL databases are both inexpensive (free!) and easy to use. If you need
a safe, organized place to store some data no matter whether you
have a little or a lot consider doing it with Perl and an SQL
database.
|
![]() Last changed: 17 Apr. 2000 mc |
|