Backups and Removable Media
Data backups in Linux were traditionally done by running commands to archive and compress the files to back up, then writing that backup archive to tape. Choices for archive tools, compression techniques, and backup media have grown tremendously in recent years. Tape archiving has, for many, been replaced with techniques
for backing up data over the network, to other hard disks, or to CDs, DVDs, or other low-cost removable media.
This chapter details some useful tools for backing up and restoring your critical data. The first part of the chapter details how to use basic tools such as tar, gzip, and rsync for backups.
Backing Up Data to Compressed Archives
If you are coming from a Windows background, you may be used to tools such as WinZip and PKZIP, which both archive and compress groups of files in one application. Linux offers separate tools for gathering groups of files into a single archive (such as tar) and compressing that archive for efficient storage (gzip, bzip2, and lzop).
However, you can also do the two steps together by using additional options to the tar command.
Creating Backup Archives with tar
The tar command, which stands for tape archiver, dates back to early Unix systems. Although magnetic tape was the common medium that tar wrote to originally, today tar is most often used to create an archive file that can be distributed to a variety of media.
The fact that the tar command is rich in features is reflected in the dozens of options available with tar. The basic operations of tar, however, are used to create a backup archive (-c), extract files from an archive (-x), compare differences between archives (-d), and update files in an archive (-u). You can also append files to (-r or -A) or delete files from (-d) an existing archive, or list the contents of an archive (-t).
NOTE: Although the tar command is available on nearly all Unix and Linux systems, it behaves differently on many systems. For example, Solaris does not support -z to manage tar archives compressed in gzip format. The Star (ess-tar) command supports access control lists (ACLs) and file flags (for extended permissions used by Samba).
As part of the process of creating a tar archive, you can add options that compress the resulting archive. For example, add -j to compress the archive in bzip2 format or –z to compress in gzip format. By convention, regular tar files end in .tar, while compressed tar files end in .tar.bz2 (compressed with bzip2) or .tar.gz (compressed with gzip). If you compress a file manually with lzop (see http://www.lzop.org), the compressed tar file should end in .tar.lzo.
Besides being used for backups, tar files are popular ways to distribute source code and binaries from software projects. That’s because you can expect every Linux and Unix-like system to contain the tools you need to work with tar files.
NOTE: One quirk of working with the tar command comes from the fact that tar was created before there were standards regarding how options are entered. Although you can prefix tar options with a dash, it isn’t always necessary. So you might see a command that begins tar xvf with no dashes to indicate the options.
A classic example for using the tar command might combine old-style options and pipes for compressing the output; for example:
Make archive, zip it and output
$ tar c *.txt | gzip -c > myfiles.tar.gz
The example just shown illustrates a two-step process you might find in documentation for old Unix systems. The tar command creates (c) an archive from all .txt files in the current directory. The output is piped to the gzip command and output to stdout (-c), and then redirected to the myfiles.tar.gz file. Note that tar is one of the few commands which don’t require that options be preceded by a dash (-).
New tar versions, on modern Linux systems, can create the archive and compress the output in one step:
Create gzipped tar file of .txt files
$ tar czf myfiles.tar.gz *.txt
Be more verbose creating archive
$ tar czvf myfiles.tar.gz *.txt
In the examples just shown, note that the new archive name (myfiles.tar.gz) must immediately follow the f option to tar (which indicates the name of the archive).
Otherwise the output from tar will be directed to stdout (in other words, your screen). The z option says to do gzip compression, and v produces verbose descriptions of processing.
When you want to return the files to a file system (unzipping and untarring), you can also do that as either a one-step or two-step process, using the tar command and optionally the gunzip command:
Unzips and untars archive
$ gunzip -c myfiles.tar.gz | tar x
Or try the following command line instead:
Unzips then untars archive
$ gunzip myfiles.tar.gz ; tar xf myfiles.tar
To do that same procedure in one step, you could use the following command:
$ tar xzvf myfiles.tar.gz
The result of the previous commands is that the archived .txt files are copied from the archive to the current directory. The x option extracts the files, z uncompresses (unzips) the files, v makes the output, and f indicates that the next option is the name of the archive file (myfiles.tar.gz).