Introduction
Compression Algorithms
Files inside zip archives can be compressed using different algorithms. The ZipArchive Library supports the deflate and bzip2 algorithms. Choose the algorithm that suits your needs and select it with the CZipArchive::SetCompressionMethod() method before compressing a file. There is no need to call this method when decompressing, because the decompression process automatically detects the algorithm used.
Sample Code
CZipArchive zip;
zip.Open(_T("C:\\Temp\\test.zip"), CZipArchive::zipCreate);
// select the bzip2 algorithm before compressing the file
zip.SetCompressionMethod(CZipCompressor::methodBzip2);
zip.AddNewFile(_T("C:\\Temp\\file1.dat"));
zip.Close();
Deflate
Deflate is the most frequently used algorithm in zip archives and is supported by all standard zip utilities. The implementation of this algorithm is provided by the Zlib library (see Acknowledgements: Credits and Used Third-Party Code Licensing Information for more information).
Bzip2
Bzip2 compresses files more efficiently than the deflate algorithm, but it is slower. It is supported by PKZIP since version 4.6 and by WinZip since version 10.0. Earlier versions of these programs cannot decompress archives that use the bzip2 algorithm. The implementation of this algorithm is provided by the bzip2 data compressor (see Acknowledgements: Credits and Used Third-Party Code Licensing Information for more information).
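The trade-off between the two algorithms can be summarized in a small decision helper. The sketch below is self-contained and does not call the ZipArchive Library: the Method enumeration and the ChooseMethod() function are hypothetical names used only for illustration. In real code, the result would decide which constant (for example CZipCompressor::methodBzip2) is passed to CZipArchive::SetCompressionMethod().

```cpp
// Hypothetical stand-in for the two algorithms described above.
enum class Method { Deflate, Bzip2 };

// Picks an algorithm from two requirements:
// - mustOpenEverywhere: the archive has to work with any standard zip utility
//   (bzip2 needs PKZIP 4.6 or later, or WinZip 10.0 or later),
// - preferSmallerSize: a smaller archive matters more than compression speed.
Method ChooseMethod(bool mustOpenEverywhere, bool preferSmallerSize)
{
    if (mustOpenEverywhere)
        return Method::Deflate;
    return preferSmallerSize ? Method::Bzip2 : Method::Deflate;
}
```

Deflate is the fallback in every case except when both compatibility is not a concern and size is the priority.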
Enabling Bzip2 Functionality
The bzip2 algorithm is available in the Full Version of the Library only and is enabled by default. If you don't need it, you can disable it by commenting out the _ZIP_BZIP2 definition in the _features.h file.
Using External Bzip2 Library
The ZipArchive Library already includes the bzip2 source files from the original bzip2 library distribution, but you can also use the bzip2 library that comes with your system, which is usually the case on Linux/Mac OS X systems.
- Under Windows, the ZipArchive Library uses the internal bzip2 sources by default.
- Under Linux/Mac OS X, the ZipArchive Library uses by default the libbzip2 library that comes with the system.
To use the bzip2 sources that come with the ZipArchive Library, make sure that _ZIP_BZIP2_INTERNAL is defined in the _features.h file when compiling the library. Undefine it to use the bzip2 library that comes with your system.
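As a rough sketch, the two bzip2 switches in _features.h could look like the fragment below. Both are shown defined, which corresponds to building with the bundled bzip2 sources. The surrounding layout of the file is assumed, and the two helper functions (Bzip2Enabled() and UsesInternalBzip2()) are hypothetical additions that only make the resulting configuration visible; they are not part of the library.

```cpp
// Assumed shape of the two switches in _features.h.
#define _ZIP_BZIP2           // comment out to build without bzip2 support
#define _ZIP_BZIP2_INTERNAL  // comment out to use the system's bzip2 library

// Hypothetical helper: true when bzip2 support is compiled in.
constexpr bool Bzip2Enabled()
{
#ifdef _ZIP_BZIP2
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: true when the bundled bzip2 sources are used.
constexpr bool UsesInternalBzip2()
{
#if defined(_ZIP_BZIP2) && defined(_ZIP_BZIP2_INTERNAL)
    return true;
#else
    return false;
#endif
}
```

Commenting out the second definition makes UsesInternalBzip2() return false, which is the configuration that requires linking against the system's bzip2 library.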
Compiling with Bzip2 under Linux/Mac OS X
- If you use the makefile provided with the ZipArchive Library, you can control whether the library compiles with the internal or the external bzip2 library through the INTERNAL_BZIP2 variable defined inside the makefile. Don't adjust the _ZIP_BZIP2_INTERNAL definition in the _features.h file. Instead:
  - To use the bzip2 sources provided with the ZipArchive Library, define the INTERNAL_BZIP2 variable (uncomment the appropriate line).
  - To use the bzip2 library that comes with your system, make sure this variable is not defined.
- If you use your own makefiles:
  - make sure that _ZIP_BZIP2_INTERNAL is not defined in the _features.h file,
  - link your program with the system's bzip2 library (use the -lbz2 linker option).
Easy Single File Compression
To quickly add a file to an archive, use the CZipArchive::AddNewFile(CZipAddNewFileInfo&) method or one of its overloads. You need to specify the file to compress. Additionally, you may specify other options, such as the file name stored in the archive or the compression level.
Sample Code
CZipArchive zip;
zip.Open(_T("C:\\Temp\\test.zip"), CZipArchive::zipCreate);
// add a file with the default settings
zip.AddNewFile(_T("C:\\Temp\\file1.dat"));
// store the file under a different name in the archive
zip.AddNewFile(_T("C:\\Temp\\file2.dat"), _T("renamed.dat"));
// use compression level 0 (no compression)
zip.AddNewFile(_T("C:\\Temp\\file3.dat"), 0);
zip.AddNewFile(_T("C:\\Temp\\file4.dat"), -1, false);
zip.Close();
Callbacks Called
The methods for easy compression can call callbacks to notify about the progress. To read more about using callback objects, see Progress Notifications: Using Callback Objects.
Easy Multiple Files Compression
To quickly add multiple files to an archive, use the CZipArchive::AddNewFiles() method or one of its overloads. You need to specify the directory that contains the files to compress. Additionally, you may filter the files and specify:
- whether the subfolders are also searched for files,
- the compression level (you may request no compression as well),
- whether to trim the root directory from the path stored in the archive,
- the smartness of the compression process, using one of the CZipArchive::Smartness values.
Using Filters
To have more control over which files are added to an archive, you can use filters with the CZipArchive::AddNewFiles() method.
Sample Code
using namespace ZipArchiveLib;

// A custom filter that accepts files whose size lies between uMinSize and uMaxSize.
class CSizeFileFilter : public CFileFilter
{
    ZIP_FILE_USIZE m_uMinSize;
    ZIP_FILE_USIZE m_uMaxSize;
public:
    // the base class is initialized first, matching the actual initialization order
    CSizeFileFilter(ZIP_FILE_USIZE uMinSize, ZIP_FILE_USIZE uMaxSize, bool bInverted = false)
        : CFileFilter(bInverted), m_uMinSize(uMinSize), m_uMaxSize(uMaxSize)
    {
    }
    bool Accept(LPCTSTR, LPCTSTR lpszName, const CFileInfo& info)
    {
        return info.m_uSize >= m_uMinSize && info.m_uSize <= m_uMaxSize;
    }
};

void EasyMultiCompress()
{
    CZipArchive zip;
    zip.Open(_T("C:\\Temp\\test.zip"), CZipArchive::zipCreate);
    CGroupFileFilter groupFilter;
    // files of at most 20 KB
    groupFilter.Add(new CSizeFileFilter(0, 20 * 1024));
    // inverted name filters (second argument set to true)
    groupFilter.Add(new CNameFileFilter(_T("*.tmp"), true));
    groupFilter.Add(new CNameFileFilter(_T("*.dat"), true));
    groupFilter.Add(new CNameFileFilter(_T("*.zip"), true));
    // inverted name filters that apply to directories
    groupFilter.Add(new CNameFileFilter(_T("*tmp*"), true, CNameFileFilter::toDirectory));
    groupFilter.Add(new CNameFileFilter(_T("*temp*"), true, CNameFileFilter::toDirectory));
    zip.AddNewFiles(_T("C:\\Temp"), groupFilter);
    zip.Close();
}
Filtering Directories
To match directories, use the _T("*") pattern in the name filter (ZipArchiveLib::CNameFileFilter). The _T("*.*") pattern would only match directories with a dot character in the name. Also, use the ZipArchiveLib::CNameFileFilter::toAll type.
To ignore empty directories with this filter, include CZipArchive::zipsmIgnoreDirectories in the iSmartLevel parameter of the CZipArchive::AddNewFiles() method.
Sample Code
CZipArchive zip;
zip.Open(_T("C:\\Temp\\test.zip"), CZipArchive::zipCreate);
CNameFileFilter filter(_T("*"), false, CNameFileFilter::toAll);
zip.AddNewFiles(_T("C:\\Temp\\Input1\\"), filter);
zip.AddNewFiles(_T("C:\\Temp\\Input2\\"), filter, true, -1, true,
CZipArchive::zipsmSafeSmart | CZipArchive::zipsmIgnoreDirectories);
zip.Close();
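To see why _T("*.*") misses directory names without a dot, it can help to look at how a simple wildcard matcher behaves. The function below is a minimal, self-contained sketch of such a matcher. The name WildcardMatch() is hypothetical, and it is not the matcher used by ZipArchiveLib::CNameFileFilter, which may support more wildcards or handle letter case differently.

```cpp
#include <cstddef>
#include <string>

// Minimal glob matcher: '*' matches any run of characters (possibly empty),
// '?' matches exactly one character, everything else must match literally.
// The comparison is case-sensitive.
bool WildcardMatch(const std::string& pattern, const std::string& name)
{
    std::size_t p = 0, n = 0;
    std::size_t starPos = std::string::npos; // position of the last '*' seen
    std::size_t starName = 0;                // position in name when it was seen
    while (n < name.size())
    {
        if (p < pattern.size() && pattern[p] == '*')
        {
            starPos = p++;                   // remember the star, match zero characters first
            starName = n;
        }
        else if (p < pattern.size() && (pattern[p] == '?' || pattern[p] == name[n]))
        {
            ++p;                             // direct match, advance both
            ++n;
        }
        else if (starPos != std::string::npos)
        {
            p = starPos + 1;                 // backtrack: the star absorbs one more character
            n = ++starName;
        }
        else
        {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*')
        ++p;                                 // trailing stars match the empty string
    return p == pattern.size();
}
```

With this matcher, the pattern "*" accepts a directory named "docs", while "*.*" rejects it because the literal dot in the pattern has nothing to match.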
Additional Considerations
- The filter classes are located in the ZipArchiveLib namespace. Use this namespace when using filters.
- To use directory traversing functionality in your application, you can reuse the ZipArchiveLib::CDirEnumerator class.
Callbacks Called
When adding multiple files, callbacks are called to notify about the progress. To read more about using callback objects when performing multiple operations, see Progress Notifications: Using Callback Objects.
Advanced Compression: More Control Over How Data is Written
The CZipArchive::AddNewFile(CZipAddNewFileInfo&) method and its overloads do most of the work for you; however, you may want to have more control over this process. To compress a file manually, follow these steps:
- Prepare a CZipFileHeader template and fill it with the required data. To read what data is reused from the template when adding a file, see the CZipArchive::OpenNewFile() method documentation.
- Open a new file record in the archive with the CZipArchive::OpenNewFile() method.
- Compress the data by repeatedly calling the CZipArchive::WriteNewFile() method.
- When you have no more data to compress, call the CZipArchive::CloseNewFile() method. If an exception was thrown while compressing data, call this method with the bAfterException parameter set to true to prevent further exceptions from being thrown when closing.
Sample Code
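CZipArchive zip;
// opens an existing archive; pass CZipArchive::zipCreate to create a new one
zip.Open(_T("C:\\Temp\\test.zip"));
// prepare the template for the new file record
CZipFileHeader templ;
templ.SetFileName(_T("data.txt"));
templ.SetSystemAttr(FILE_ATTRIBUTE_READONLY);
// open the new file record with compression level 9
zip.OpenNewFile(templ, 9);
LPCTSTR data1 = _T("This is data\r\n");
LPCTSTR data2 = _T("to be written");
// write the data in two steps; the sizes are given in bytes
zip.WriteNewFile(data1, (DWORD)(_tcslen(data1) * sizeof(TCHAR)));
zip.WriteNewFile(data2, (DWORD)(_tcslen(data2) * sizeof(TCHAR)));
zip.CloseNewFile();
zip.Close();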
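The sample above does not show what to do when writing fails. The following self-contained sketch illustrates the control flow for chunked writing with the bAfterException rule. FakeArchive is a hypothetical stand-in that only records calls and simulates a failure; it is not a library class. In real code, the calls would go to CZipArchive::OpenNewFile(), WriteNewFile() and CloseNewFile(), and the catch clause would use the library's own exception type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an archive with an open file record.
struct FakeArchive
{
    std::string stored;                 // bytes accepted so far
    std::size_t failAfter;              // simulate an error once more than this is written
    bool closedAfterException = false;  // how CloseNewFile() was last called

    explicit FakeArchive(std::size_t failAfterBytes) : failAfter(failAfterBytes) {}

    void OpenNewFile() {}
    void WriteNewFile(const char* data, std::size_t size)
    {
        if (stored.size() + size > failAfter)
            throw std::runtime_error("simulated write failure");
        stored.append(data, size);
    }
    void CloseNewFile(bool bAfterException = false)
    {
        closedAfterException = bAfterException;
    }
};

// Feeds the input to the archive in fixed-size chunks and always closes the
// file record, passing true to CloseNewFile() if writing threw an exception.
bool CompressInChunks(FakeArchive& zip, const std::string& input, std::size_t chunkSize)
{
    assert(chunkSize > 0);
    zip.OpenNewFile();
    try
    {
        for (std::size_t pos = 0; pos < input.size(); pos += chunkSize)
        {
            std::size_t size = std::min(chunkSize, input.size() - pos);
            zip.WriteNewFile(input.data() + pos, size);
        }
        zip.CloseNewFile();
        return true;
    }
    catch (const std::exception&)
    {
        zip.CloseNewFile(true);         // avoid a further exception while closing
        return false;
    }
}
```

The chunk size is a free choice in this sketch; the point is only that every path out of the write loop ends in a CloseNewFile() call, with the flag set on the failure path.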
Adding Directories
You can add a directory in two ways:
- To add an existing directory, you can use any of the CZipArchive::AddNewFile() methods.
- To add a directory that does not exist in the file system, use the CZipArchive::OpenNewFile() method.
Sample Code
CZipArchive zip;
zip.Open(_T("C:\\Temp\\test.zip"), CZipArchive::zipCreate);
// add an existing directory
zip.AddNewFile(_T("c:\\windows"), CZipCompressor::levelStore);
// add a directory that does not exist in the file system
CZipFileHeader header;
header.SetSystemAttr(ZipPlatform::GetDefaultDirAttributes());
header.SetFileName(_T("empty dir"));
header.SetTime(time(NULL));
zip.OpenNewFile(header, CZipCompressor::levelStore);
zip.CloseNewFile();
zip.Close();
Other Functionality
Adding Files From Other Archives
If you wish to add files from other archives to your archive and you would like to avoid extracting and then compressing them again, use one of the CZipArchive::GetFromArchive() methods.
- These methods copy data from the source archive to the destination archive without decompressing it.
- If an encryption method and a password are set, the data will be encrypted while it is being copied. Data that is already encrypted will not be encrypted again. See Encryption Methods: How to Best Protect Your Data for information about setting an encryption method and a password.
Sample Code
CZipArchive zipDest;
zipDest.Open(_T("C:\\Temp\\testDest.zip"), CZipArchive::zipCreate);
CZipArchive zipSource;
zipSource.Open(_T("C:\\Temp\\test.zip"));
// indexes of the files to copy from the source archive
CZipIndexesArray indexes;
indexes.Add(0);
indexes.Add(1);
// copy the compressed data without decompressing it
zipDest.GetFromArchive(zipSource, indexes);
zipSource.Close();
zipDest.Close();
Multithreaded Compression
Although compression to a single archive from multiple threads is not possible, you can perform multithreaded compression to some extent using the following steps:
- Compress files to separate archives. Use one thread per archive. For example, if you have six files to compress, you can put three files into one archive and the other three into another archive. In this case, you would use two threads.
- When all files are compressed, create a new (destination) archive and use one of the CZipArchive::GetFromArchive() methods to copy compressed data from the existing archives to the new archive. Perform this step in a single thread. In the example above, you would copy the compressed data to a new archive and delete the two existing archives.
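The first step requires deciding which files go to which thread. One possible way to do this, sketched below with standard C++ only, is to split the file list into contiguous groups of nearly equal size. The SplitIntoGroups() function is a hypothetical helper, not part of the library, and it does not start any threads or touch any archive.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Splits the input files into at most threadCount contiguous groups of
// nearly equal size. Each group is meant to be compressed into its own
// temporary archive by one worker thread.
std::vector<std::vector<std::string>> SplitIntoGroups(
    const std::vector<std::string>& files, std::size_t threadCount)
{
    std::vector<std::vector<std::string>> groups;
    if (files.empty() || threadCount == 0)
        return groups;
    std::size_t groupCount = std::min(threadCount, files.size());
    std::size_t base = files.size() / groupCount;   // minimum number of files per group
    std::size_t extra = files.size() % groupCount;  // the first groups get one more file
    std::size_t pos = 0;
    for (std::size_t g = 0; g < groupCount; ++g)
    {
        std::size_t size = base + (g < extra ? 1 : 0);
        groups.emplace_back(files.begin() + pos, files.begin() + pos + size);
        pos += size;
    }
    return groups;
}
```

For six files and two threads this yields two groups of three files, as in the example above. Each worker would then open its own CZipArchive object and add its group, and the copying with CZipArchive::GetFromArchive() would follow in a single thread once all workers have finished.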
Finalizing Archives and Preventing Archive Corruption
During an archive modification, the central directory is removed from the archive and kept in memory. It is written back when you call CZipArchive::Close(). However, if a crash occurs before the central directory is written, the archive will be unusable. You can request writing the central directory back to the archive after each modification with the CZipArchive::SetAutoFinalize() method, or perform it manually with the CZipArchive::Finalize() method. Use the finalizing methods sparingly; otherwise, performance can degrade.
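One way to finalize sparingly is to do it only after every N added files instead of after each one. The class below is a minimal sketch of such a policy under that assumption. FinalizeEvery is a hypothetical helper that does not call the library; the caller would invoke CZipArchive::Finalize() whenever OnFileAdded() returns true.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper that tells the caller when to write the central
// directory back to the archive.
class FinalizeEvery
{
public:
    explicit FinalizeEvery(std::size_t interval)
        : m_interval(interval), m_sinceLast(0)
    {
        assert(interval > 0);
    }

    // Call after each file is added. Returns true once per interval files.
    bool OnFileAdded()
    {
        if (++m_sinceLast < m_interval)
            return false;
        m_sinceLast = 0;
        return true;
    }

private:
    std::size_t m_interval;   // number of files between two finalize calls
    std::size_t m_sinceLast;  // files added since the last finalize
};
```

A larger interval costs less time but leaves more files at risk if a crash occurs before the next finalize, so the value is a trade-off to be chosen per application.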
The CZipArchive::Finalize() method (called manually or automatically) will not execute when there are any pending changes. See Modification of Archives: Replacing, Renaming, Deleting and Changing Data for more information.
To flush file buffers alone without writing the central directory to the disk, call the CZipArchive::FlushBuffers() method.
When removing files, you can remove them only from the central directory for safety. To do this, set the bRemoveData parameter to false. See CZipArchive::RemoveFile for more information.
Segmented Archives
If you finalize a segmented archive in creation, it will not be closed, but its state will be changed from "an archive in creation" to "an existing segmented archive". Finalize a segmented archive when you have finished adding files to it and you want to begin extracting or testing it. This means that you can finalize a segmented archive only once. However, if after finalizing a segmented archive it turns out that the archive consists of one segment only, the archive is converted to a normal archive and you can use it as such. If you want to know the state of the archive after finalizing it, call the CZipArchive::GetStorage() method and then the CZipStorage::IsSegmented() method. The method will return false if the archive was converted to a normal archive.
Committing Modification Changes
To prevent archive corruption, you may also want to adjust the commit changes mode. See Modification of Archives: Replacing, Renaming, Deleting and Changing Data for more information.
- Auto-Finalize cannot be enabled if there are any uncommitted changes pending.
System Compatibility
- Each file inside an archive has a flag that tells which platform the file is intended for. This can be one of the ZipCompatibility::ZipPlatforms values.
- The compatibility settings affect the conversion of file attributes, because file attributes are defined differently across platforms.
- When opening an existing archive, the ZipArchive Library assumes the system compatibility of the whole archive to be the same as the compatibility of the first file in the archive (if one is present). This affects only newly added files. In other cases, the current system value is assumed, which is determined by the ZipPlatform::GetSystemID() method call when creating or opening an archive.
- You can set the system compatibility with the CZipArchive::SetSystemCompatibility() method.
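The rule for choosing the compatibility of an opened archive can be written down as a one-line function. The sketch below only models that rule with plain integers and is not library code: ResolveArchiveCompatibility() is a hypothetical name, and the library itself works with ZipCompatibility::ZipPlatforms values and ZipPlatform::GetSystemID().

```cpp
#include <vector>

// Returns the platform value assumed for the whole archive: the value of the
// first file if the archive contains any files, otherwise the value of the
// current system.
// filePlatforms holds the platform flag of each file in archive order.
int ResolveArchiveCompatibility(const std::vector<int>& filePlatforms, int currentSystem)
{
    return filePlatforms.empty() ? currentSystem : filePlatforms.front();
}
```

Note that only the first file matters: an archive whose later files carry a different platform flag is still treated according to the first one.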
Setting Compressor Options
You can adjust the options of the Deflate or the Bzip2 compressor by calling the CZipArchive::SetCompressionOptions() method and providing as an argument an appropriate options object derived from the CZipCompressor::COptions class. Use ZipArchiveLib::CDeflateCompressor::COptions and ZipArchiveLib::CBzip2Compressor::COptions, respectively. Please refer to the sample code below and the documentation of these classes.
Sample Code
#include "ZipArchive.h"
#include "DeflateCompressor.h"
#include "Bzip2Compressor.h"

using namespace ZipArchiveLib;

void SetOptions()
{
    CZipArchive zip;
    // set the buffer size of the deflate compressor
    CDeflateCompressor::COptions deflateOptions;
    deflateOptions.m_iBufferSize = 4 * 65536;
    zip.SetCompressionOptions(&deflateOptions);
    // set the buffer size of the bzip2 compressor
    CBzip2Compressor::COptions bzip2Options;
    bzip2Options.m_iBufferSize = 65536;
    zip.SetCompressionOptions(&bzip2Options);
}
Additional Considerations (Windows Only)
If your system uses a large amount of memory during extensive file operations, see Modification of Archives: Replacing, Renaming, Deleting and Changing Data for a possible solution.