This guide will explain how to split a large file into multi-part .tar files, and how to put them back together again.
Sometimes when you want to store a backup or some other large set of files online, or share them with someone else, you need a way to compress the data and split it into chunks of a fixed maximum size. I ran into this recently when I wanted to store my backups online and the storage service had a cap of 100 MB per file. I found a really neat solution based on the tar command: using this method I split my backup of about 1 GB into ten chunks of 100 MB each, with incremental filenames.
The 1 GB file I wanted to split was called dbbackup.db. Here’s the command I ran to create multiple tar files of 100 MB each out of it:
# tar -cf - dbbackup.db | split -b 100m - db_backup.tar
This command took a while to run. Once it was done I was left with ten files of 100 MB each, named db_backup.taraa, db_backup.tarab, db_backup.tarac, and so on.
Now I can copy these files to my external storage or ship them around with ease. To stitch the 1 GB file back together from the multi-part tar files and extract it, all I need to do is run the following command:
# cat db_backup.tara* | tar -xf -
And voila, I get my original file again.
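If you want a sanity check that nothing was lost along the way, compare a checksum of the original file against the restored copy (sha256sum is shown here, but any digest tool will do):
# sha256sum dbbackup.db
Note the hash, restore the file as above, then run the same command on the restored copy; the two hashes should match.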
OK, but what about .tgz files? The splitting works great, but how can I use cat to join the archives again?
I tried:
cat archive.tgza* | (tar x)
tar: Archive is compressed. Use -z option
tar: Error is not recoverable: exiting now
Then I did:
cat archive.tgza* | (tar z)
tar: You must specify one of the `-Acdtrux' options
Try `tar --help' or `tar --usage' for more information.
And then finally:
cat archive.tgza* | (tar xz)
worked, but it started unpacking the archive… How can I simply join it back into one archive?
Thanks!
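Since split only slices the byte stream, plain redirection should rebuild the compressed archive without unpacking it:
# cat archive.tgza* > archive.tgz
You can then list the contents without extracting, to confirm the rejoined archive is intact:
# tar -tzf archive.tgz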
Perhaps a better example would show the same operation used on multiple folders:
# tar -cf - /var/www /var/ftp | split -b 100m - my_backup.tar
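To restore from such a backup, the same cat-and-extract trick applies. Note that GNU tar strips the leading "/" from member names by default, so the folders are recreated relative to the directory you extract into; /restore below is just a hypothetical target:
# mkdir /restore
# cat my_backup.tara* | tar -xf - -C /restore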
To guard against missing or damaged pieces, install "par2cmdline" and create parity files so that lost or corrupted chunks can be repaired.
To create parity files with 15% redundancy:
# par2 create -r15 my_backup.tara*
To verify:
# par2 verify my_backup.taraa.par2
To repair missing pieces:
# par2 repair my_backup.taraa.par2
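To see the repair in action, you could simulate losing one of the ten pieces; with 15% redundancy there is enough parity data (roughly 150 MB) to reconstruct a single missing 100 MB chunk:
# rm my_backup.tarab
# par2 repair my_backup.taraa.par2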
To combine the pieces and extract:
# cat my_backup.tara? | tar -xf -
Perhaps I am missing something: for a single large file, why not simply use the split command by itself? Executing a single command might be faster than running both tar and split.
split --bytes=100m dbbackup.db db_backup_
cat db_backup_* > joined_file.db
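If the original file is still on hand, cmp gives a quick confirmation that the rejoined copy is byte-for-byte identical:
cmp dbbackup.db joined_file.db && echo OK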