;login: The Magazine of USENIX & SAGE
Open Source

 

 


Learning with Source Code UNIX


by Bob Gray
<[email protected]>

Bob Gray is co-founder of Boulder Labs, a software consulting company. Designing architectures for performance has been his focus ever since he built an image processor system on UNIX in the late 1970s. He has a Ph.D. in computer science from the University of Colorado.

 

THANKS TO TOM POINDEXTER AND JANET BRACCIO.

My sister's high-school-age son is applying to the computer science department of Carnegie Mellon University. When I expressed my surprise, she said, "Oh, Evan is very good with computers." I proceeded to dig myself into trouble by asking what Evan does with computers.

As I suspected, he is an end user of applications. I said that if I were on the admissions board, I'd be looking for something more substantial to demonstrate an interest in the field. I suggested that if he were really interested in computers, he could install a copy of Linux on his computer, start writing programs, and learn Java. Then he would have something to show for his time in front of the screen.

To learn computer systems and programming, it is essential to acquire some starting knowledge, design and implement a solution to a problem, and then have your work reviewed by an expert. Ideally, the evaluation step is an ongoing, interactive process with one or more experienced mentors. A supplemental method is having Source Code UNIX in your corner. The running operating system, its utilities, and its thousands of ported applications, backed up with manuals and other printed documentation, provide a powerful surrogate for a real instructor.

In this month's article, I'll

  • elaborate on the extensive learning material available.

  • discuss the importance of coding style and conformity.

  • provide some personal examples of how I've learned from and leveraged Source Code UNIX systems.

    Learning Material
    Consider mainstream, shrink-wrapped software. Load it and use it. You can only surmise what is happening inside. Need altered behavior? You're out of luck. In contrast, ported applications are built from the source code. (See my October 1998 article, <http://boulderlabs.com/4.ports.html>, for a tour.) So if you need altered behavior, it may be relatively simple to achieve. You may need to develop a program that has a number of similarities to existing UNIX code. Why not learn how others have solved various aspects of your problem and leverage the experience from their working base?

    Books can provide valuable in-depth design discussion for even the most thoroughly documented code. My favorite example is The Design and Implementation of the 4.4BSD Operating System, by McKusick, Bostic, Karels, and Quarterman (Addison-Wesley, 1996). For those curious about the internals of BSD variants, this is the gospel, and it is best summed up in the book's dedication:

    This book is dedicated to the BSD community. Without the contributions of that community's members, there would be nothing about which to write.

    FreeBSD, NetBSD, OpenBSD, and BSDI all derive their base code from 4.4BSD, which is now about five years old. Fortunately, McKusick, Bostic, Karels, Leffler, and Greenman have signed a contract with Addison-Wesley to produce a follow-on volume, The Design and Implementation of the FreeBSD Operating System, due out about mid-2001.

    The state of computer science was greatly advanced when Gary Wright and the late W. Richard Stevens published TCP/IP Illustrated Volume 2 (Addison-Wesley, 1995). This book explains the workings of the 4.4BSD networking code. The data structures, algorithms, and thought processes behind much of the work are explained in detail. Given that TCP/IP is now the universal network protocol, this book is indispensable.

    Other resources for understanding source code include Web sites, tutorials, newsgroups, Frequently Asked Questions (FAQs), mailing lists, and search engines. General system-level documentation that helps explain source code can be found with the search engines. Some particularly valuable Web sites:

    <http://sunsite.unc.edu/mdw/HOWTO/Installation-HOWTO.html>

     <http://www.dejanews.com>

     <http://www.freebsd.org/handbook>

     <http://www.freebsd.org/search.html>

     <http://www.freebsd.org/tutorials>

     <http://www.linux.org/help/howto.html>

    If you want to be in on the action, subscribe to mailing lists such as <[email protected]> for daily or even hourly activity (<http://www.freebsd.org/support.html#mailing-list>). By the way, before bothering busy people on particular lists with your questions, be polite and check whether your question has already been answered. Learn how to use <http://www.dejanews.com> or archival search engines such as <http://www.freebsd.org/search/search.html> for mailing lists and newsgroups.

    As an outside interest and hobby, I work with maps, GPS, and astronomy. The huge body of knowledge and source code available on these topics can be found with a Web search engine. Subscribe to newsgroups such as <sci.geo.satellite-nav> to be connected to the group of contributors. Niels Elgaard Larsen has implemented software that places GPS track points on maps (<http://www.diku.dk/users/elgaard/eps>). Its GNU General Public License allows me access to the Java source code, and I can make the modifications I need for my project.

    Interactions with individuals often are the best way to learn. Make the effort to attend conferences in areas that interest you. Go to the Birds-of-a-Feather (BOF) sessions to meet with workers and disciples. From those sessions, you'll be able to establish one-to-one relationships that can continue with email and telephone calls.

    Coding Style and Conformity
    Programming is one of those areas where unorthodox style is generally not appreciated because, invariably, other people will need to look at and understand your code. A reader faced with an unorthodox style must adopt an unnatural frame of reference to follow your phrasing and can no longer rely on basic assumptions about indentation or other common practices. Imagine how much harder it would be for the home remodeler to accomplish her work if she could not rely on conventions about stud spacing in walls and electrical practices.

    Steve Bourne, the original author of /bin/sh, used the C preprocessor to give an Algol feel to his 1979 code. Constructs such as:

     #define IF if(
     #define THEN ){
     #define ELSE } else {
     #define FI ;}

    led to implementation code looking like this:

     WHILE (c = *s++, !any(c,ifsnod.namval) && c)
     DO *argp++ = c OD
     IF *cmdadr=='-' ANDF (input=pathopen(nullstr, profile))>=0
     THEN IF c
      THEN continue;
      ELSE return(count);
      FI
     ELIF c==0
     THEN s--;
     FI

    Granted, it is cute and interesting, but I claim that he did the community a disservice with that style. As a reader, I am constantly forced to look up the meanings of his constructs. For example, C statements are semicolon-terminated, but Bourne's code (e.g., the DO . . . OD construct) confuses this principle.
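To make the comparison concrete, here is a minimal sketch that writes one small routine twice: once with Algol-style macros and once in conventional C. The routine `count_blanks` is a hypothetical example and does not come from the shell source. The `IF`, `THEN`, `ELSE`, and `FI` macros are the ones quoted above; `WHILE`, `DO`, and `OD` are assumed here to follow the same pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Algol-flavored macros. IF/THEN/ELSE/FI are the ones quoted in the text;
 * WHILE/DO/OD are assumed for this sketch to follow the same pattern. */
#define IF    if(
#define THEN  ){
#define ELSE  } else {
#define FI    ;}
#define WHILE while(
#define DO    ){
#define OD    ;}

/* Count the blanks in s, written in the macro style.
 * Note that no statement carries its own semicolon:
 * FI and OD supply it when they expand to ";}". */
size_t count_blanks_algol(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    WHILE *s != '\0'
    DO  IF *s == ' '
        THEN n++
        FI
        s++
    OD
    return n;
}

/* The same routine in conventional C. */
size_t count_blanks_plain(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (*s != '\0') {
        if (*s == ' ')
            n++;
        s++;
    }
    return n;
}
```

Both functions compile to the same logic. The difference is only in what the reader must keep in mind: in the first version the semicolons are hidden inside `FI` and `OD`, so the usual rule for spotting the end of a statement no longer applies.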

    How did I come to this opinion? By looking at hundreds of thousands of lines of code over the years. You develop a feel for what is good style, and you easily become annoyed by "individuals" who want to express themselves. The best styles are those that don't seem to have any style at all, like the national TV news anchor who seems to have no accent at all. You should be able to look at a body of code and not find any surprises with indentation, braces, or idioms.

    I believe the best way to learn good programming and good style is to design and implement a solution yourself first. Then get feedback and comments from others. You'll eventually notice a consensus. Kernighan and Pike's The Practice of Programming (Addison-Wesley, 1999) is a gem for improving one's code. The authors show various solutions to problems in various languages and analyze the strengths and weaknesses of each. The beauty of their work is that they lead you along a normal solution path and show how simplicity, clarity, and generality can be gained along the way.

    For those wishing for a historical perspective on an operating system design and style, John Lions, in 1977, published two books: A Commentary on the UNIX Operating System and its companion source-code listing for his course at the University of New South Wales. After years of suppression (as trade secrets) by various owners of the UNIX code, the books were rereleased (Peer to Peer Communications, 1996). Greg Rose, one of John's students, wrote:

    John introduced a course in Operating Systems, and decided to study the Unix operating system. One of his motivations in doing this was to introduce the students to code which was well written by other people — at the time, this was not a common practice, although it is now well accepted.

    Personal Examples
    The disadvantage of learning from books is that the problems tackled are seldom the ones you are faced with. That's why running a Source Code UNIX system is important — you're likely to find some kernel facility or user application that largely overlaps with your problem. For example: in 1991, when designing my passive solar house, I wanted software to tell me the exact solar sky for my location at any time of the day throughout the year.

    Table 1 shows the output from my program. You see that for January 1, at 12:00 the sun rises to only 27 degrees elevation and is pointing almost due south (179 degrees). As expected, this date has the fewest hours of sunlight, with the sun sweeping the lowest arc in the sky. (Of course, if I printed daily activity, December 22 would show as the shortest day of the year.)

    Hour of day (local standard time)
    mm/dd   7       8       9       10      11      12      13      14      15      16      17
    10/01   104,11  115,22  127,32  143,40  162,45  184,47  205,44  223,38  237,29  249,18  260,7
    11/01   113,5   124,15  136,24  151,31  167,35  185,35  202,33  218,27  231,19  242,9   -
    12/01   -       128,8   139,17  152,23  167,27  183,28  198,26  213,21  225,14  236,5   -
    01/01   -       126,5   137,14  150,21  164,25  179,27  194,26  209,21  222,15  233,6   -
    02/01   -       120,8   132,17  145,25  160,30  176,33  193,32  209,28  223,21  235,12  245,2
    03/01   104,4   114,15  126,25  140,33  156,39  176,42  196,41  214,36  229,29  242,19  252,9

    Table 1. Azimuth and elevation in degrees of the sun for the first day of each winter heating month at latitude 40.0, longitude 105.0.

    The code (<http://boulderlabs.com/dailySun.C>) is leveraged from a friend's spherical-navigation code, but I could have easily worked with Bill Randle's public calentool package. My sunrise, sunset program (<http://boulderlabs.com/riseset.tz>), which uses calentool code, presents everything I want to know about both the sun and the moon patterns, including Julian days, local sidereal time, and declination of the earth. I developed a curiosity about things like the equation of time and found an excellent discussion on the Web at <http://susdesign.com/sunangle>. Further, in a Java FAQ, I once saw that extensive libraries were implemented for calendars and date calculations, so I grabbed the Java source code from <http://java.sun.com> and studied the fascinating code and comments in Date.java, GregorianCalendar.java, and TimeZone.java.

    I often record radio talk shows on my computer because it's easy to schedule (crontab), and it's easy to gain random access to the content when I later play it back. Most computer systems come with a GUI player, but for my needs that kind of an interface is clumsy. I found some audio source code and in a couple of hours added the features I wanted for command-line control. Simply, I wanted periodic printing of the time-code and file byte offset and an easy way to skip and maneuver within the file.
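The time-code and skip arithmetic behind such a player is simple enough to sketch. The code below is not the audio source mentioned above. It assumes headerless, uncompressed PCM data at a fixed rate, and `struct pcm_format`, `format_timecode`, and `skip_offset` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed layout: raw PCM with a constant sample rate. */
struct pcm_format {
    long sample_rate;       /* samples per second per channel */
    int  channels;
    int  bytes_per_sample;
};

/* Size of one sample across all channels. */
static long frame_bytes(const struct pcm_format *f)
{
    return (long)f->channels * f->bytes_per_sample;
}

static long bytes_per_second(const struct pcm_format *f)
{
    return f->sample_rate * frame_bytes(f);
}

/* Write the time code "hh:mm:ss" for a byte offset into buf. */
void format_timecode(const struct pcm_format *f, long offset,
                     char *buf, size_t len)
{
    long secs;

    assert(bytes_per_second(f) > 0 && offset >= 0);
    secs = offset / bytes_per_second(f);
    snprintf(buf, len, "%02ld:%02ld:%02ld",
             secs / 3600, (secs / 60) % 60, secs % 60);
}

/* Return the byte offset reached by skipping delta_seconds (which may be
 * negative) from offset, clamped to the file and aligned down to a frame
 * boundary so playback does not start in the middle of a sample. */
long skip_offset(const struct pcm_format *f, long offset,
                 long delta_seconds, long file_size)
{
    long target = offset + delta_seconds * bytes_per_second(f);

    if (target < 0)
        target = 0;
    if (target > file_size)
        target = file_size;
    return target - target % frame_bytes(f);
}
```

A player loop could call `format_timecode` every few seconds to print the current position alongside the raw byte offset, and pass the result of `skip_offset` to `lseek` or `fseek` when the user asks to jump forward or back.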

    I urge you to take advantage of the knowledge embedded in Source Code UNIX. Whenever possible, add to the body by making your own software clear, robust, and available under some kind of general public license.

    To end, I would like to highlight a huge event and honor three heroes in the history of UNIX.

    Up through about 1991, all UNIX users had to be under some kind of a license arrangement to access the source code. This was in spite of the fact that most of the Bell Labs UNIX code base over time had been replaced with better, more functional software from the huge body of public contributors. Keith Bostic, Mike Karels, and Kirk McKusick at Berkeley realized that most of the BSD UNIX system could be released to the public without the traditional AT&T/USL/Novell license, because it was publicly developed. They boldly proceeded to freely redistribute the system, resulting in USL initiating a lawsuit for an injunction to stop the software release. In 1994 the pioneers from Berkeley prevailed, and now anyone can have the 4.4BSD system or its derivatives, FreeBSD, OpenBSD, and NetBSD. For a great story and more details see <http://www.oreilly.com/catalog/opensources/book/kirkmck.html>.

    Last changed: 13 Apr. 2000 mc