I don't want to repeat things that other people have already said and done, so, so far, I have written only about using data compression around Oracle.
I outlined the rationale and a few pros and cons of using data compression in general under the heading of compression; the short version is that if you need to reduce disk usage, keep a watchful eye on CPU usage.
This section is based on experience using compression for files outside a database. Some databases also allow various types and degrees of compression inside the database itself; this is quite well documented by the vendors and is not covered here.
The simplest way to reduce disk occupancy for Oracle export dump files and the like is to compress the files after they have been produced. This has the advantage that the compression job can run outside the regular backup and export job stream, and its success or failure matters only in terms of the disk space occupied.
Here's a simplified example in an Oracle environment:
exp / $EXPORTARGS file=$DUMPFILE   # error checking removed
compress $DUMPFILE 2>zipdump.log &
Now the database has been dumped and you can compress the dump file - this could happen immediately or could be postponed until all the databases have been dumped.
nohup nice -10 compress $DUMPFILE 2>zipdump.log &
The nohup is a precaution for situations when your backup is run interactively instead of from cron. The nice simply reduces the impact on the rest of the system.
If there isn't enough space for the intermediate .dmp file, it becomes necessary to compress the data on the fly, using a named pipe as the output from the export and running the compression process in the background.
PIPE=/tmp/oracle_export_pipe.$$
mknod $PIPE p
nohup nice -3 compress -c < $PIPE 2>zipdump.log > $DUMPFILE &
exp / $EXPORTARGS file=$PIPE   # error checking removed
wait
The wait command at the end ensures that the compression job in the background is not prematurely terminated if the calling script which includes the export ends quickly (before the compression is complete).
If your system backups may run at the same time as the Oracle export, it is useful to name the compressed export as $DUMPFILE.part and rename it when it is complete, thus:
PIPE=/tmp/oracle_export_pipe.$$
mknod $PIPE p
( compress -c < $PIPE 2>zipdump.log > $DUMPFILE.part ; \
  mv $DUMPFILE.part $DUMPFILE ) &
exp / $EXPORTARGS file=$PIPE   # error checking removed
wait
This means that while the export is running, the data ends up in a file called $DUMPFILE.part; when the file is finished, it is immediately renamed to $DUMPFILE, so there is no confusion about whether the file is complete or partial.
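The .part rename pattern can be exercised outside Oracle too. Here is a sketch with gzip standing in for compress and cat of a sample file standing in for exp - all the file names are placeholders, not part of any real backup scheme:

```shell
# .part rename pattern: the completed file only appears under its
# final name once the background compression and rename have finished
DUMPFILE=/tmp/demo_part_dump.gz
SRC=/tmp/demo_part_src.dat
PIPE=/tmp/oracle_export_pipe.$$

printf 'part demo data\n' > $SRC      # stand-in for the database contents
mknod $PIPE p                         # create the named pipe
( gzip -c < $PIPE > $DUMPFILE.part 2>zipdump.log ; \
  mv $DUMPFILE.part $DUMPFILE ) &
cat $SRC > $PIPE                      # exp would write to the pipe here
wait                                  # the rename has happened once this returns
rm $PIPE
```

A system backup running concurrently would either see $DUMPFILE.part (and know to skip it) or the finished $DUMPFILE, never a half-written file under the final name.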
Recovering the data stored in a compressed dump file involves one extra step compared with a normal import: uncompressing the dump file. The file may be uncompressed in situ and imported as normal, or it may be uncompressed into a named pipe (in the background) and the import run using the named pipe instead of the usual dump file.
PIPE=/tmp/oracle_import_pipe.$$
mknod $PIPE p
zcat < $DUMPFILE > $PIPE 2>zcat_error.log &
imp / $IMPORTARGS file=$PIPE   # error checking removed
The zcat command may be replaced with uncompress -c if you prefer. The input redirection (the < symbol) is used to avoid problems with files larger than 2 GB.
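The in-situ alternative mentioned above is simpler still. A sketch with gzip rather than compress, using hypothetical file names; the imp line is commented out because it needs a real Oracle environment:

```shell
# in-situ restore: uncompress the dump file in place, then import as normal
DUMPFILE=/tmp/demo_insitu.dmp
printf 'in situ data\n' > $DUMPFILE   # stand-in for an export dump
gzip $DUMPFILE                        # leaves $DUMPFILE.gz on disk
gunzip $DUMPFILE.gz                   # recreates $DUMPFILE in place
# imp / $IMPORTARGS file=$DUMPFILE    # error checking removed
```

The obvious cost is that you need enough free disk space to hold the fully uncompressed dump file, which is exactly what the named-pipe approach avoids.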
I have used the compress command in these examples, but you could also use gzip, bzip2 or another data compression program.
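If gzip is chosen, only the compression command changes. A minimal sketch of the on-the-fly pattern, with cat of a sample file standing in for exp - again, every file name here is a placeholder:

```shell
# gzip flavour of the named-pipe export; cat stands in for exp
DUMPFILE=/tmp/demo_gzip_dump.gz
SRC=/tmp/demo_gzip_src.dat
PIPE=/tmp/oracle_export_pipe.$$

printf 'sample export data\n' > $SRC  # stand-in for the database contents
mknod $PIPE p                         # create the named pipe
nohup nice -3 gzip -c < $PIPE > $DUMPFILE 2>zipdump.log &
cat $SRC > $PIPE                      # exp would write to the pipe here
wait                                  # let the background gzip finish
rm $PIPE
```

Restoring is symmetrical: gunzip -c (or zcat, on systems where zcat understands gzip files) takes the place of zcat in the import example.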
If any of this doesn't make sense you can reach me by email at
© Copyright 2005 Colin MacKellar. All rights reserved.