Broken Paradigm
by Hal Miller
Hal Miller is president of the SAGE STG Executive Committee.

Today I placed an order for over $5 million in computing equipment. There will be many times that to come -- certainly the biggest single project I've ever worked on. What did I buy? Mostly storage: 8 terabytes. At the current growth predictions (much of which is already funded), I will approach, if not exceed, a petabyte in the next four years. "Neat!" you say. So did I. Then I realized: "bandwidth between disk and servers." Then I thought, "Oh no! Backups!"

Technology continues to advance. So does the demand for it. There is a significant gap between those rates of growth, and the future looks difficult for those of us tasked with using the former to supply the latter. Let's look at my real-life situation as an example, then see what we might do about at least breaking the problem down into solvable chunks, if not solving it as a whole.

With that much disk online, in a heavy-use environment (read-write all over, 7 x 24 x 52, lots of users, 90-day-long jobs), getting data back and forth between storage media and CPU servers is a problem. With that many heads and spindles to manage (over 1000!), the seek time for a given bit of information can be long. Given that it's all random-access filesystems (well, there is a database, too, just for complications, but it's "relatively" small), there isn't much of a way to index around and cut down search time. This is all UNIX filesystem. We have known for years that the UFS is nearing an upper limit on directory size, and it appears to have other limits as well.

How do we deal with data integrity? There are RAID5 and mirroring solutions, among others. Who wants to pay for the extra disk (let alone computer room space, power, and air conditioning) for my mirror?

Storage Area Networking is a solution for some of the bandwidth issue. But, as with the other points, how long, at this rate, before we outgrow that? Probably just about the time we finish our first backup.
Speaking of which, the backup paradigm we all know dictates copying either blocks or files to tape, in some pattern that allows data to be restored after hardware failures in some "reasonable" amount of time (plus, in some places, after user error). I'm putting 16 DLT7000s into this. Filling the tape library cabinet costs nearly $50,000 retail and will cover a week or so. Filling the tapes with data may take more than that week. That means I need to change what "backup" means. I can't take anything offline to dump, so I need either to "break" a mirror or back up a "snapshot." What technology I apply is really not the issue here (nor am I looking for those other large sites out there to pick on my scenario) -- whatever that technology is, we have already outgrown it, or will soon.

Enough on disk and backup. How about security? My site is pretty well hidden. There isn't a lot of reason for people to come looking for us except that our router answers up on the Net like everyone else's. We are under scan or more concerted attack every few hours, perhaps more; I have no control over my current network and can't really see into it effectively. Fighting this, and recovering from those incidents we've had (Linux mountd and NT, all on boxes I didn't know had been brought in and connected), is a full-time job, and I don't have anyone to apply to it. Tool building for IDS and other parts of the game proceeds, but not fast enough.

Technology advances have been staving off "defeat" for a while and will continue to attempt that, but we as an industry are losing the battle. Demand continues to skyrocket. Paradigms are stretched to the point of breaking throughout the computing world.

So what do we in SAGE do? Hard question without obvious answers. Let's start with what we can't do, and see what's left.
Then, remembering that our job as sysadmins seems to include "performing magic" to solve whatever odd problems nobody else dealt with, we will try to pull yet another rabbit from the hat.

We can't develop better hardware solutions. We can't fund new hardware products. Most of us aren't advanced hardware engineers. Maybe the vendors can do these things, but they aren't likely to do so of their own volition, since they make good money selling what they have to offer us now.

We need to apply our reputation and efforts to convincing vendors to join us in a "consortium" type of effort to devise new long-term solutions. We can sponsor workshops calling for work in progress, brainstorming sessions, or joint work proposals. We can fund our own members to work on software tools if they will return benefit to our community. We can put our collective experience together into reviewing what the requirements really are and designing methods to meet them. We can get vendors to build it if we show them what we want.

This year I would like to see the formation of a SAGE Development Fund, and a SAGE Vendor Liaison function. I hope they make some progress before I add the next few dozen terabytes to my backup system.
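
For readers who want to put rough numbers on the backup worry described earlier, here is a back-of-envelope sketch. The 16 drives, the 8 terabytes, and the petabyte come from the column; the per-drive figures (5 MB/s and 35 GB per cartridge, uncompressed) are assumptions based on the commonly published DLT7000 native specifications, and the `efficiency` parameter is a purely hypothetical knob for how far a busy filesystem falls short of streaming speed. It is an illustration, not a measurement of any real site.

```python
import math

# Back-of-envelope full-backup window.
DRIVES = 16                 # from the column
NATIVE_MB_PER_S = 5.0       # ASSUMPTION: published native rate per DLT7000
NATIVE_GB_PER_TAPE = 35.0   # ASSUMPTION: published native capacity per cartridge


def full_backup(data_tb, efficiency=1.0):
    """Return (days, tapes) for one full dump of data_tb terabytes.

    efficiency is a hypothetical fraction of the streaming rate that the
    drives actually sustain; 1.0 means every drive streams flat out.
    Decimal units are used throughout (1 TB = 10**6 MB = 10**3 GB).
    """
    data_mb = data_tb * 1_000_000
    seconds = data_mb / (DRIVES * NATIVE_MB_PER_S * efficiency)
    tapes = math.ceil(data_tb * 1000 / NATIVE_GB_PER_TAPE)
    return seconds / 86_400, tapes


if __name__ == "__main__":
    for label, size_tb in (("8 TB today", 8), ("1 PB in four years", 1000)):
        days, tapes = full_backup(size_tb)
        print(f"{label}: {days:.1f} days at ideal rate, {tapes} tapes")
```

Under these assumptions, 8 terabytes streams to tape in a bit over a day if every drive runs flat out, and a petabyte on the same 16 drives takes roughly 145 days. Setting `efficiency` to something like 0.15 (again, a made-up figure) pushes even the 8-terabyte dump past a week.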
Last changed: 15 Apr. 1999 jr