I wrote a small, zero-dependency wrapper for secure ZIP extraction.
https://github.com/barseghyanartur/safezip
What My Project Does
safezip is a zero-dependency wrapper around Python’s zipfile
module that makes secure ZIP extraction the default. It provides:
- ZipSlip protection: Blocks path traversal via relative paths (../),
absolute paths, Windows UNC paths, Unicode lookalike attacks, and null
bytes in filenames.
- ZIP bomb prevention: Enforces per-member and cumulative
decompression ratio limits at stream time, while bytes are actually
being decompressed, instead of trusting header values.
- ZIP64 consistency checks: Crafted archives with inconsistent ZIP64
extra fields are rejected before decompression begins.
- Symlink policy: Configurable as REJECT (default), IGNORE,
or RESOLVE_INTERNAL.
- Atomic writes: Extracts to a temp file first and only moves it to
the destination if all checks pass. If something fails, you don’t end
up with half-extracted junk on your disk.
- Environment variable overrides: All numeric limits can be set via
SAFEZIP_* environment variables for containerised deployments.
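To make the stream-time and atomic-write points concrete, here is a rough stdlib-only sketch of the general technique. It is not safezip's actual implementation: the names extract_member_bounded and LimitExceeded, the chunk size, and the default limits are all made up for illustration.

```python
import os
import tempfile
import zipfile

CHUNK = 64 * 1024  # illustrative read size


class LimitExceeded(Exception):
    """Raised when a member decompresses past the allowed size or ratio."""


def extract_member_bounded(zf, info, dest_path,
                           max_size=100 * 1024 * 1024, max_ratio=200):
    """Stream one member to dest_path, counting the bytes actually produced.

    Hypothetical helper: limits are checked while decompressing, and the
    output only appears at dest_path if every check passed.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    os.makedirs(dest_dir, exist_ok=True)
    # Temp file lives in the destination directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".partial-")
    written = 0
    try:
        with os.fdopen(fd, "wb") as out, zf.open(info) as src:
            while True:
                chunk = src.read(CHUNK)
                if not chunk:
                    break
                written += len(chunk)
                # Count real output instead of trusting info.file_size.
                if written > max_size:
                    raise LimitExceeded(f"{info.filename}: size limit hit")
                if written > max_ratio * max(info.compress_size, 1):
                    raise LimitExceeded(f"{info.filename}: ratio limit hit")
                out.write(chunk)
        # Only now does the file become visible under its final name.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Any failure: remove the partial temp file, then re-raise.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    return written
```

A cumulative limit across members would follow the same pattern, with the running total kept outside the per-member function.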
It’s meant to be an almost drop-in replacement. You can just do:
from safezip import safe_extract
safe_extract("path/to/file.zip", "/var/files/extracted/")
If you need more control, there’s a SafeZipFile context manager that
lets you tweak limits or monitor security events.
from safezip import SafeZipFile
with SafeZipFile("path/to/file.zip") as zf:
    print(zf.namelist())
    zf.extractall("/var/files/extracted/")
Target Audience
If you’re handling user uploads or processing ZIP files from untrusted
sources, this might save you some headache. It’s production-oriented but
currently in beta, so feedback and edge cases are very welcome.
Comparison
The standard library’s zipfile module historically wasn’t safe to
use on untrusted files. Even the official docs warn against
extractall() because of ZipSlip risks, and it doesn’t do much to
stop ZIP bombs from eating up your disk or memory. Modern Python does
address some of this: extractall() strips path components that
would escape the target directory. But it still leaves meaningful gaps:
no ZIP bomb protection, no stream-time size enforcement, no symlink
policy, no ZIP64 consistency checks, and no atomic writes. safezip
fills all of those. I got tired of writing the same boilerplate every
time, so I packaged it up.
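For a sense of what that boilerplate tends to look like, here is a minimal pre-flight check written against the standard library only. It is a sketch, not safezip's code: unsafe_reason is a hypothetical helper, and it covers only a few of the cases listed earlier (no Unicode lookalike or ZIP64 checks).

```python
import os
import stat
import zipfile


def unsafe_reason(info, dest_dir):
    """Return a short reason if a member looks unsafe, else None.

    Hypothetical helper for illustration only.
    """
    # Normalise Windows separators so "..\\evil" is caught on POSIX too.
    name = info.filename.replace("\\", "/")
    # Leading slash (including UNC-style "//host/share") or a drive letter.
    if name.startswith("/") or (len(name) > 1 and name[1] == ":"):
        return "absolute path"
    if ".." in name.split("/"):
        return "parent-directory component"
    # Unix mode bits are stored in the high 16 bits of external_attr.
    if stat.S_ISLNK(info.external_attr >> 16):
        return "symlink entry"
    # Final check: the resolved target must stay under the destination,
    # which also catches symlinks that already exist on disk.
    base = os.path.realpath(dest_dir)
    target = os.path.realpath(os.path.join(base, name))
    if os.path.commonpath([base, target]) != base:
        return "escapes destination"
    return None
```

Typical use would be to loop over ZipFile.infolist() and refuse the whole archive as soon as any member returns a reason.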
---
Documentation: https://safezip.readthedocs.io/en/latest/
Originally published as GitHub Gist #160a1cd9c12448447882f12b16e14d35