Bug report
Bug description:
Hey.
I think there exist a number of issues (which may, depending on how code is used, in principle even be security relevant).
First, AFAIU, ZIP files may contain (at least) regular files, directories (as "standalone" items in the archive, like an empty directory) and symbolic links.
For example, using the zip program:
$ mkdir empty
$ zip d.zip empty
adding: empty/ (stored 0%)
$ ln -s /dev/null bar
$ zip --symlinks s.zip bar
adding: bar (stored 0%)
$
When I open the one with the symlink in Python:
>>> import zipfile
>>> z = zipfile.ZipFile("s.zip", "r")
>>> z.namelist()
['bar']
>>> f = z.open("bar", "r")
>>> f.read()
b'/dev/null'
>>> i = z.getinfo("bar")
>>> i.is_dir()
False
>>> p = zipfile.Path(z, "bar")
>>> p.filename
PosixPath('s.zip/bar')
>>> p.is_dir()
False
>>> p.is_file()
True
>>> p.is_symlink()
True
>>>
- It's IMO debatable whether z.open("<symlink>", "r") should succeed or not. IMO zipfile.ZipFile.open() quite clearly is ZIP's open(), and open() AFAIK never opens symlinks themselves but only follows them.
- Even if that behaviour is desired (i.e. like an os.readlink() for ZIPs), it's still completely unexpected, and the ZipFile object has no is_symlink() method ... (only zipfile.Path has one, so one needs to create that first, which seems quite unhandy).
- Speaking of which, zipfile.Path's is_dir(), is_file() and is_symlink() functions seem either buggy or semantically inconsistent and/or badly documented.
Usually, "file" means either any type of file (directory, symlink, device, etc.) or regular files (and sometimes also symlinks if they point to regular files).
Here, the symlink points to nothing, so it cannot be the latter case. Also - see below - a directory wouldn't return True for is_file(), so it's not the former either.
The docs also don't mention what "file" means.
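For what it's worth, a symlink check seems possible on a plain ZipInfo without going through zipfile.Path. The following is only a rough sketch: entry_is_symlink() is a made-up helper name, not part of zipfile, and it assumes the archive was written on a Unix-like system so that the upper 16 bits of external_attr carry the st_mode value.

```python
import io
import stat
import zipfile


def entry_is_symlink(info: zipfile.ZipInfo) -> bool:
    """Hypothetical helper: True if the entry's mode bits mark a symlink."""
    # Assumption: the archive stores Unix mode bits in the upper
    # 16 bits of external_attr (as Unix zip tools do).
    return stat.S_ISLNK(info.external_attr >> 16)


# Build a small in-memory archive resembling s.zip from above.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    link = zipfile.ZipInfo("bar")
    link.create_system = 3  # 3 = Unix
    link.external_attr = (stat.S_IFLNK | 0o777) << 16
    zf.writestr(link, "/dev/null")  # the link target is stored as data
    zf.writestr("regular.txt", "hello")

# Check every entry without creating a zipfile.Path first.
with zipfile.ZipFile(buf) as zf:
    flags = {i.filename: entry_is_symlink(i) for i in zf.infolist()}
```

Archives created on other systems may not carry these mode bits at all, so a False result there would not prove much.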
Now the same with d.zip:
>>> import zipfile
>>> z = zipfile.ZipFile("d.zip", "r")
>>> z.namelist()
['empty/']
>>> f = z.open("empty/", "r")
>>> f.read()
b''
>>> i = z.getinfo("empty/")
>>> i.is_dir()
True
>>> p = zipfile.Path(z,"empty/")
>>> p.filename
PosixPath('d.zip/empty')
>>> p.is_dir()
True
>>> p.is_file()
False
>>> p.is_symlink()
False
>>>
- IMO, that zipfile.ZipFile.open() succeeds on a directory (and gives an empty bytes) seems pretty strange at best. It does so even if the directory isn't empty but contains files.
- There's also the fact that directory pathnames are sometimes suffixed by / and sometimes not. Maybe I've missed it, but that doesn't seem to be documented, yet it may be crucial when e.g. matching filenames, and is IMO unexpected.
- As mentioned above, is_file() here is False, which would imply that the function's meaning is that a file is either a regular file or a symbolic link pointing to such.
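Until the semantics are clarified, calling code could guard against this itself. The following is a minimal sketch of that idea, not a proposal for the actual API: entry_kind(), open_regular() and normalized_names() are hypothetical names, the symlink test again assumes Unix mode bits in external_attr, and other entry types (devices etc.) are simply lumped in with "regular".

```python
import io
import stat
import zipfile


def entry_kind(info: zipfile.ZipInfo) -> str:
    """Classify an entry as 'dir', 'symlink' or 'regular' (sketch only)."""
    if info.is_dir():
        return "dir"
    # Assumption: Unix mode bits live in the upper 16 bits of external_attr.
    if stat.S_ISLNK(info.external_attr >> 16):
        return "symlink"
    return "regular"


def open_regular(zf: zipfile.ZipFile, name: str):
    """Open an entry for reading, refusing directories and symlinks."""
    kind = entry_kind(zf.getinfo(name))
    if kind != "regular":
        raise ValueError(f"{name!r} is a {kind} entry, not a regular file")
    return zf.open(name, "r")


def normalized_names(zf: zipfile.ZipFile) -> dict:
    """Map entry names without the trailing '/' to their kind."""
    return {i.filename.rstrip("/"): entry_kind(i) for i in zf.infolist()}


# In-memory archive with one entry of each kind discussed above.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("empty/", "")  # directory entry, as in d.zip
    link = zipfile.ZipInfo("bar")
    link.create_system = 3  # 3 = Unix
    link.external_attr = (stat.S_IFLNK | 0o777) << 16
    zf.writestr(link, "/dev/null")
    zf.writestr("regular.txt", "hello")

with zipfile.ZipFile(buf) as zf:
    kinds = normalized_names(zf)
    with open_regular(zf, "regular.txt") as fh:
        data = fh.read()
    refused = []
    for name in ("empty/", "bar"):
        try:
            open_regular(zf, name)
        except ValueError:
            refused.append(name)
```

Stripping the trailing / in normalized_names() makes the keys comparable with what zipfile.Path reports, at the cost of no longer telling a directory from a same-named file by name alone, hence the kind stored as the value.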
Cheers,
Chris.
CPython versions tested on:
3.13
Operating systems tested on:
Linux