Small Office/Home Networking - Part 3

File/Print Servers

by James R. Williams Zavada, February 2004

Presentation Goal:

To provide enough information to allow a first-time sysadmin to make
the necessary hardware and software choices to set up a file/print server
that best suits his needs.

Introduction:

First, this presentation will not give extensive details about how to
configure samba, nfs, appletalk, lpr/lprng, cups, etc.  Likewise, 
although we will talk about hardware choices one might make to set up 
a new file/print server, we will not discuss how one installs the 
various pieces of hardware.

Rather, we will cover what each of these various software and hardware 
components may be used for and why you might choose one over another.  


Layout of presentation: 

1. Off-the-shelf vs. Do-it-yourself

   A. Print Servers

      i. Off-the-shelf

         a. advantages
            o Usually easier to install, configure and maintain
            o Smaller and easier to locate (networkable)

         b. disadvantages
            o Not very "customizable"
              Often leave file format processing up
              to the printer
            o Not very flexible or resilient:
              Little or no spooling,
              do not multitask well

      ii. Do-it-yourself

         a. advantages
            o Very "customizable"
            o Flexible and resilient:
              spooling, multitasking

         b. disadvantages
            o Harder to install, configure and maintain
            o Larger, thus harder to locate

      iii. Third Alternative: Combine i and ii

           Although much more complicated, it combines many of the
           advantages of i and ii.  This is what I do.  I have a
           Netgear print server that allows me to place a Canon BJC250 
           color inkjet printer on the family computer desk, while the
           file/print server is remotely located in a closet.
           Because all the workstations use the file/print server as
           the network printer, I can spool jobs, massage file
           formats, etc., then send the data to the Netgear print
           server.  I also have a HP Laserjet4 connected to the
           parallel port on the file/print server.  This printer is
           also located in the computer closet, but is used much
           less than the inkjet printer.
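
           The spooling idea behind this arrangement can be sketched in a
           few lines of Python.  This is only an illustration, not how
           Lprng or CUPS is implemented: the Spooler class and its deliver
           and convert callables are hypothetical names.  The point is that
           workstations hand off a job and return at once, while the server
           converts and delivers the jobs one at a time.

```python
import queue
import threading


class Spooler:
    """Toy print spooler: accept jobs at once, deliver them one at a time."""

    def __init__(self, deliver, convert=lambda data: data):
        self._deliver = deliver    # hypothetical: sends bytes on to the printer
        self._convert = convert    # hypothetical: e.g. a file-format filter
        self._jobs = queue.Queue()
        self.failed = []           # names of jobs that could not be delivered
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, name, data):
        """Queue a job and return immediately; the workstation need not wait."""
        self._jobs.put((name, data))

    def wait_until_idle(self):
        """Block until every queued job has been handled."""
        self._jobs.join()

    def _run(self):
        while True:
            name, data = self._jobs.get()
            try:
                self._deliver(name, self._convert(data))
            except Exception:
                self.failed.append(name)   # keep serving the remaining jobs
            finally:
                self._jobs.task_done()
```

           A failed delivery is only recorded here; a real spooler would
           also retry and keep the queue on disk.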

   B. File Servers

      i. Off-the-shelf

         OTS here means buying a commercially built server, for
         example, from Dell, HP, etc., that has all the hardware (and
         software) put together by the manufacturer.

         a. advantages
            o Much less work for the sysadmin: she only has to specify
              the hardware/software components (if possible), and the
              vendor does all the assembly work.  The sysadmin winds up
              with a server that is ready to go out of the box (almost).

         b. disadvantages
            o components may not be the optimum choices
            o hardware/software compatibility issues may arise
            o aftermarket configuration and customization may take
              a good deal of effort (the poor sysadmin may have to undo
              or redo a bunch of stuff)

      ii. Do-it-yourself

         DIY here means you specify the needed components, buy
         or download them, and assemble them yourself.

         a. advantages
            o The sysadmin ensures that all components meet
              specified requirements.  No hardware/software
              compatibility issues.

         b. disadvantages
            o Quite a bit of time and energy must be invested in
              component assembly/customization and QC/troubleshooting.

2. Server Hardware Issues
   A. Disk power
      Can we ever get enough disk space?  
      Keep in mind though, that RAID can make a series of smaller disks 
      provide more space.  However, I recommend buying the biggest disks 
      you can afford.
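
      The trade-off between disk count and usable space comes down to a
      little arithmetic.  The sketch below assumes equal-sized disks and
      the textbook definitions of RAID 0 (striping), RAID 1 (mirroring)
      and RAID 5 (striping with one disk's worth of parity).  The function
      name and the 80 GB figure are illustrative only.

```python
def usable_gb(level, disk_count, disk_gb):
    """Usable capacity of an array of equal-sized disks (textbook figures)."""
    if level == 0:                       # striping: all space, no redundancy
        if disk_count < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disk_count * disk_gb
    if level == 1:                       # mirroring: one disk's worth of space
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_gb
    if level == 5:                       # one disk's worth goes to parity
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disk_count - 1) * disk_gb
    raise ValueError("only RAID levels 0, 1 and 5 are covered in this sketch")


# Example with assumed 80 GB disks:
#   usable_gb(0, 4, 80) gives 320, usable_gb(1, 2, 80) gives 80,
#   usable_gb(5, 4, 80) gives 240
```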

      i. SCSI or IDE
      SCSI is better at multitasking (part of the design) but more
      expensive.  Best for a server that gets heavy use or simultaneous
      use by multiple users.
      IDE is cheaper, but sucks at multitasking.  Best for lightweight use
      by no more than one or two users at a time.

      ii. RAID: Hardware or Software
      (Note-to-self: The brief RAID tutorial goes here)

      For reliability, a server should at minimum use mirrored drives.

      Hardware RAID is much easier to set up, but much more expensive.
      The adapters usually have a menu-driven BIOS/CMOS setup.  Expect to
      pay $500+ for a SCSI RAID adapter, or you can spend even more to get
      a dedicated RAID device that looks like a single drive to the OS.
      The real problem is this: Can you manage and monitor the RAID from
      the OS?  And is it compatible with the OS?  These are questions that
      you need answered BEFORE you invest in a hardware RAID solution.

      Software RAID is much cheaper, but can be much harder to set up: 
      Some Linux distros (RedHat) let you create a RAID array, install
      the OS to it, and boot directly from the array, whereas others have 
      to be mightily hacked to allow booting from a RAID (Debian).  Another
      advantage is that the array is integrated into the OS toolset, and
      thus can be easily monitored, and can be maintained without booting
      into CMOS.
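
      As an illustration of monitoring from within the OS, Linux software
      RAID reports array status in /proc/mdstat, where a member map such
      as [UU] means all disks are up and [U_] means one is missing.  The
      sketch below scans that text for degraded arrays.  The function name
      and the sample text are made up for the example, and it only
      recognizes arrays named md0, md1 and so on.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose member map shows a missing disk."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+)\s*:", line)
        if header:
            current = header.group(1)      # e.g. "md0"
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(1):     # "_" marks a failed/missing member
                degraded.append(current)
            current = None
    return degraded


SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 hda2[0]
      39961856 blocks [2/1] [U_]

unused devices: <none>
"""

# On a live system one would read the real file instead of SAMPLE:
#   with open("/proc/mdstat") as fh:
#       print(degraded_arrays(fh.read()))
```

      Run from cron, a check like this could mail the sysadmin when a
      mirror loses a disk.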

   C. Memory power
      Memory = disk buffer = speed increase
      The more users and the heavier the use, the more memory you should have.
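
      To see how much memory a Linux box is actually using as disk
      buffer, one can look at the Buffers and Cached lines of
      /proc/meminfo.  The sketch below parses that format.  The function
      names and the sample numbers are invented for illustration.

```python
def parse_meminfo(text):
    """Parse 'Name:   12345 kB' lines into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def cache_share(values):
    """Fraction of total memory currently used as buffers plus page cache."""
    return (values["Buffers"] + values["Cached"]) / values["MemTotal"]


SAMPLE = """\
MemTotal:       512000 kB
MemFree:         20000 kB
Buffers:         52000 kB
Cached:         204000 kB
"""

# On a live Linux system:
#   with open("/proc/meminfo") as fh:
#       print(cache_share(parse_meminfo(fh.read())))
```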

   D. Network power
      Better network cards rely much less on the system's CPU to do their 
      processing, and are usually designed for greater efficiency/throughput.
      However, they are always more expensive, and any network card that is 
      compatible with the OS will work.  Just keep in mind that the more
      your server will be pounded on by users, the greater
      your need for a quality network card (SMC, Intel, etc.)

   E. CPU power
      On a Linux file/print server, the CPU is not nearly as important as
      the disk, memory and network subsystems.  Older/slower CPUs are more 
      than adequate, and always cost less than the latest and greatest.  
      Keep in mind though, that if you scrimp on the other subsystems, the 
      CPU you buy will make a difference.  I highly recommend, however, that
      you spend your dollars on disk, memory and network cards first.

   G. Backup power
      Although this item is often overlooked or under-emphasized, the first
      time you experience a catastrophic failure you'll wish you'd invested!
      Most folks use tapes for backup.  (Note-to-self: recount my tape
      repairman's story about disks being the best backup devices.)
      Whatever you use, make sure it is big enough, or can automatically feed
      itself.  If not, you'll relegate backups to the back burner, and come
      to regret it!  Also, test your backup system for adequate restores
      before you need it.  You may think you've got backups when you don't!!
      I'm in the process of re-evaluating my Backup system.  I used to use
      a custom shell script and afio, and have toyed with using Amanda.

   H. Electric power (UPS is a must!)
      A crashed or dead file/print server causes no end of misery to
      a sysadmin.  Do yourself a favour and buy a UPS, and install and TEST UPS
      software.  How long after the power goes out do you want your server
      running?  Long enough to shut down cleanly?  Until the power comes back?
      Maybe you need a generator!
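
      A back-of-the-envelope answer to "how long?" is battery energy
      divided by load.  The sketch below is only a rough estimate: the
      0.8 inverter efficiency and the safety factor of 2 are assumptions,
      real batteries deliver less at high loads and as they age, and the
      vendor's runtime tables should be the final word.

```python
def estimated_runtime_minutes(battery_wh, load_watts, inverter_efficiency=0.8):
    """Rough UPS runtime: usable battery energy divided by the load."""
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * inverter_efficiency / load_watts * 60


def covers_clean_shutdown(battery_wh, load_watts, shutdown_minutes,
                          safety_factor=2.0):
    """True if the estimate leaves a margin over the time a shutdown takes."""
    runtime = estimated_runtime_minutes(battery_wh, load_watts)
    return runtime >= shutdown_minutes * safety_factor


# Example with assumed figures: a 100 Wh battery feeding a 200 W server
# gives roughly 100 * 0.8 / 200 * 60 = 24 minutes.
```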

   I. Upgradeability
      Keep in mind that you may want to add more disk space,
      printers, and memory down the road, and buy accordingly.


3. Server Software Issues
   A. Print Servers
      i. Lpr
         Practically speaking, for Linux, Lpr is dead.  Most Linux distributions
         that have Lpr are using Lprng.  And if you're going to use another
         Unix, I'd highly recommend one of the other two.

      ii. Lprng
          This is what I've used in the past.  Very difficult to set
          up, configure and administer.  (NoteToSelf: research web/gui
          interface options.)  If you are an old hack like me, you'll find
          that this one is the most like the good old Lpr we all know and
          love (hate).

      iii. CUPS
           This is what I'm moving to. MacOSX uses this, and it has a web
           admin interface that makes it easier for Sysadmins and end-users to
           deal with.  If starting fresh, this is what I recommend, as it
           is based on the new IPP (Internet Printing Protocol) standard.
           One caveat, however: After setting up a Wireless network environment,
           the MacOSX laptop has had several problems accessing the file/print
           server's CUPS server.  This may be due to my having misunderstood and
           mis-configured something, though.

   B. File Servers
      i. NFS
         This is the ideal in an all-Unix environment, but be prepared to deal
         with different configuration setups/syntax on the different Unices.

      ii. Samba
          This works best for an all-Windows or mixed Windows/Unix
          environment, or if you are using MacOSX. 

      iii. Appletalk
           This works for an all-Apple environment or if you need to
           support older Macs.

      iv. Web
          If your users only need read-only access to files, you
          may want to consider using a web server.
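
          To illustrate the read-only idea, Python's standard library
          includes a small web server that answers GET and HEAD requests
          for files in a directory and nothing else.  This is a sketch for
          experimenting on a trusted LAN, not a substitute for a real web
          server such as Apache.  The function name, directory path and
          port below are placeholders.

```python
import functools
import http.server


def make_readonly_server(directory, host="127.0.0.1", port=8000):
    """Build an HTTP server that only serves files from `directory`.

    SimpleHTTPRequestHandler implements GET and HEAD only, so clients
    can fetch files but cannot upload, change or delete them.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    return http.server.ThreadingHTTPServer((host, port), handler)


# Usage (blocks until interrupted; the path is hypothetical):
#   server = make_readonly_server("/srv/shared")
#   server.serve_forever()
```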

   C. Backup Software
      i. Dump/restore
         Dump and restore are the traditional system-level Unix programs for 
         doing backups.
         1. Advantages
            - Inherently does incremental backups (dump levels).
            - As a system-level program, it is well-integrated with the 
            traditional Unix filesystem.
            - Newer versions can use compression inherently.
         2. Disadvantages
            - Only backs up entire filesystems; the SysAdmin cannot pick
            and choose what gets backed up and what doesn't on the
            filesystem.
            - May not work with newer, non-traditional filesystem types
            such as Reiserfs and XFS.
            - Doesn't do error-checking/-correction to/from the backup
            medium.
            - Older versions cannot use compression inherently.
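
         Dump levels work like this: a level 0 dump is a full backup, and
         a dump at any higher level copies whatever has changed since the
         most recent dump at a lower level.  The sketch below works out
         that reference point from a history of earlier dumps.  It is an
         illustration of the rule only, not dump's own code, and the
         names and timestamps are invented.

```python
def reference_time(history, level):
    """Return the timestamp a new dump at `level` is measured against.

    `history` is a list of (level, timestamp) pairs for earlier dumps.
    None means there is no lower-level dump, so everything is backed up.
    """
    lower = [stamp for lvl, stamp in history if lvl < level]
    return max(lower) if lower else None


# Example: a full dump on day 100, level 1 on day 200, level 2 on day 300.
HISTORY = [(0, 100), (1, 200), (2, 300)]
# A new level 2 dump saves changes since day 200 (the last level 1 dump);
# a new level 1 dump saves changes since day 100 (the last full dump).
```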

      ii. Tar
          Tar is a regularly used user-level Unix program for doing
          backups.
          1. Advantages
             - As a user-level program, it doesn't care about filesystem
             types, thus works with all of them.
             - SysAdmin can pick and choose what gets backed up and what
             doesn't.  New versions can backup links, pipes, sockets, etc.
             - Newer versions can use compression using external programs.
          2. Disadvantages
             - No incremental backups inherent to program.
             - Doesn't do error-checking/-correction to/from the backup
             medium.
             - Older versions cannot use compression using external programs.
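
          The "pick and choose" advantage can be sketched with Python's
          tarfile module, which writes the same archive format as tar.
          This stands in for the tar command line rather than reproducing
          it, and the function name and the excluded suffixes are
          arbitrary examples.

```python
import os
import tarfile


def backup_tree(src_dir, archive_path, exclude_suffixes=(".tmp", ".cache")):
    """Write a gzip-compressed tar of src_dir, skipping unwanted files."""
    suffixes = tuple(exclude_suffixes)

    def keep(tarinfo):
        # Returning None tells tarfile to leave this member out.
        if tarinfo.isfile() and tarinfo.name.endswith(suffixes):
            return None
        return tarinfo

    top = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=top, filter=keep)


# Usage (paths are hypothetical):
#   backup_tree("/home/shared", "/backup/shared.tar.gz")
```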

      iii. Cpio
           Cpio is also a user-level Unix program for doing backups.  It has
           many of the same advantages and disadvantages as tar; however,
           newer versions can do limited error-checking on files.

      iv. Afio
          Afio is a user-level program that is often overlooked.  It has
          many of the same advantages and disadvantages as tar; however, it
          is designed to do error-checking and correction.  Also, you will
          most likely have to find and compile the sources for this one, as
          it doesn't come with most Linux distributions (or any other Unix,
          for that matter).

      v. Amanda
         Amanda is a user-level program that is designed to provide a way
         to administer single local backups or multi-system backups across
         a network.  It relies on dump/restore and tar to do the underlying
         backups, so the features and problems of each apply to it as well.

      vi. Other Software
          There are several open source and commercial software packages
          for doing backups, none of which I'm familiar enough with to
          evaluate, other than to say that the obvious disadvantage of the
          commercial types is that they cost money, and frequently the
          disadvantage of the open source types is lack of "easy-to-use"
          documentation.
        
      vii. Summary
           I've used several combinations of the aforementioned programs.  I
           currently use afio with a customised shell script wrapper, but am
           evaluating the idea of creating a "backup server" that uses Amanda,
           so that I can backup several of the machines on my network.  Because
           there are so many alternatives, it is really hard for me to recommend
           a particular choice that will best meet your needs.  Additionally,
           this is one of those issues that SysAdmins like to argue about.  What
           it boils down to is this: Find the solution that works best for you,
           BUT make sure you test the restoration process BEFORE you need to
           rely on it!
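
           One way to test restoration, whatever tool you settle on, is to
           restore into a scratch directory and compare checksums against
           the live files.  The sketch below does this for a tar archive
           using Python's tarfile and hashlib modules.  It assumes the
           archive holds a single top-level directory named after the
           source directory, and all names in it are illustrative.

```python
import hashlib
import os
import tarfile
import tempfile


def file_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def restore_matches_source(archive_path, src_dir):
    """Restore the archive to a scratch directory and compare with src_dir."""
    top = os.path.basename(os.path.normpath(src_dir))
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive_path) as tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Pythons: use the safer "data" extraction filter.
                tar.extractall(scratch, filter="data")
            else:
                tar.extractall(scratch)
        restored = os.path.join(scratch, top)
        return file_digests(restored) == file_digests(src_dir)
```

           Files that change between the backup and the check will show up
           as mismatches, so a check like this is best run right after
           the backup completes.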

4. System Administration Issues
   A. DNS
      Make sure your workstations can "see" the file server on the
      network.  You'll need a public DNS entry (through your ISP, for
      example), a private local DNS server, or an /etc/hosts entry on
      every machine.
      If your file/print server will double as a DNS server, beef up
      your resources accordingly.
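
      A quick way to check whether a workstation can "see" the server by
      name is to ask the system resolver, which consults /etc/hosts and
      DNS as configured on that machine.  The sketch below uses Python's
      socket module.  The host names in the usage comment are
      placeholders for whatever you call your own machines.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address the system resolver gives, or None."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:          # covers socket.gaierror and socket.herror
        return None


def unresolved(hostnames):
    """Return the names from `hostnames` that this machine cannot resolve."""
    return [name for name in hostnames if resolve(name) is None]


# Usage on a workstation (names are hypothetical):
#   print(unresolved(["fileserver", "printserver"]))
```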

   B. NTP
      File timestamps can be extremely important to some users.  If you
      need accurate system time, set up a local timeserver on your network.
      If your file/print server will double as the time server, add to your
      server's resources accordingly.

   C. Security
      Answering the following questions will help you get an idea on what
      you need in a file/print server:

      If the local network on which your file/print server resides is
      connected to the Internet, have you made sure your network is
      secured?!?

      How much do you trust your users?  Have you trained them to act
      securely?  Do they choose secure passwords?  Do they give their
      passwords out to others?  Are they likely to want to crack into
      your network or server(s)?  Do you need to keep track of what your
      users are doing on your servers?

      When your server gets cracked (not if), will you need to do offline 
      forensics on it?  Do you have adequate resources to do a complete 
      install and restore from backups?  To a spare machine?  

      If a hard drive dies, how long can you wait for a new one?  What if
      a second dies while you're waiting?  What if your network card dies?
      Do you want to have spares?  Can you afford them?  Can you afford not
      to?  How much downtime _can_ you afford?

      Security is a tradeoff between convenience and lowering risk, so there
      is no magic answer that is right for everyone.  You have to answer
      these questions, then act accordingly.  (I highly recommend Bruce 
      Schneier's "Beyond Fear" as a first reading, then "Practical Unix & 
      Internet Security" after that.  Info on both is listed below.)


5. Resources
   A. Books
      "Essential System Administration" - Æleen Frisch. ISBN: 0596003439
      - I have this one, it's worth putting on your bookshelf.

      "Linux Network Administrator's Guide" - Kirch, Dawson. ISBN: 1565924002
      - I have read this one, but recommend "TCP/IP Network Administration"
        instead, as it has a broader scope (much more than just a "Linux"
        perspective).

      "Practical Unix & Internet Security" - Garfinkel, Spafford, Schwartz.
        ISBN: 0596003234
      - I have read this one, it's worth putting on your bookshelf.

      "TCP/IP Network Administration" - Craig Hunt. ISBN: 0596002971
      - I have read this one, it's worth putting on your bookshelf.

      "Unix System Administration Handbook" 
        (aka The Red Book, The Yellow Book, The Purple Book, etc.) 
        - Evi Nemeth, et al.
        Purple Book ISBN: 0130206016
        (Just Linux) Green Book ISBN: 0130084662
      - I have the Yellow Book, and have perused the Red Book; both are worth 
        putting on your bookshelf.  (Although I haven't checked out the Purple
        Book, if it only covers Linux, I recommend the Red Book instead.)

      "Beyond Fear" - Bruce Schneier. ISBN: 0387026207
      - I have this book, it's an excellent discussion of what security is
        and how to evaluate "security measures".  It's about security in its
        broadest sense, not just computer security.
      
      "Managing NFS and NIS" - Stern, Eisler, Labiaga. ISBN: 1565925106
      - I have this book, but unless you're doing a lot of NFS or NIS stuff,
        stick with "TCP/IP Network Administration".

      "Managing RAID on Linux" - Derek Vadala. ISBN: 1565927303
      - I've not read this one, so I can't give an evaluation: caveat emptor.

      "Network Printing" - Gast, Radermacher. ISBN: 0596000383
      - I've not read this one, so I can't give an evaluation: caveat emptor.

      "Unix Backup & Recovery" - W. Curtis Preston. ISBN: 1565926420
      - I've not read this one, but parts of it (pertaining to Amanda) are
        on the Web, which I have read, and found to be somewhat useful, but
        it didn't go in depth enough to cover my need for advanced 
        configuration info.

      "Using Samba" - Ts, Eckstein, Collier-Brown. ISBN: 0596002564
      - I've not read this one, so I can't give an evaluation: caveat emptor.


   B. Websites
      RAID - http://www.acnc.com/04_01_00.html
           - http://www.prepressure.com/techno/raid.htm

      SAMBA - http://www.samba.org

      APPLETALK - http://netatalk.sourceforge.net

      APACHE - http://www.apache.org

      CUPS - http://www.cups.org

      LPRNG - http://www.lprng.com

      AMANDA - http://www.amanda.org

      AFIO - http://freshmeat.net/projects/afio/


Conclusion:

It is my sincere hope that by now you've accumulated enough information that 
you can decide what you need to implement a file/print server that best suits
your needs, and/or that you are aware of the various information resources
available to help you with your decision and implementation processes.