I'm trying to get a list of the CSV files in a directory with Python. This is really easy in a Unix shell:

ls -l *.csv

And, predictably, I get a list of the files that end with .csv in my directory. However, when I attempt the Python equivalent using the subprocess module:

>>> import subprocess as sp
>>> sp.Popen(["ls", "-l", "*.csv"], stdout = sp.PIPE)
<subprocess.Popen object at 0xb780e90c>
>>> ls: cannot access *.csv: No such file or directory

Can somebody please explain what's going on?

Edit: Adding shell = True removes the error, but instead of getting a list of just CSV files, I get a list of all the files in the directory.

4 Answers

4

If you want it to behave as it does at the shell, you need to pass shell=True (your mileage may vary here, depending on your system and shell). The problem in your case is that when you run ls -l *.csv, it is the shell, not ls, that evaluates what * means. (ls merely formats the results; the shell has already done the heavy lifting of determining which files match *.csv.) Without a shell, subprocess hands *.csv to ls literally, so ls looks for a file with that exact name, which of course doesn't exist (it would be an unusual filename to create).

What you really should be doing is using os.listdir and filtering the names yourself.
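
A minimal sketch of that suggestion, assuming a plain suffix check is enough for your case. The helper name list_csv_files is made up for illustration; only os.listdir comes from the standard library:

```python
import os

def list_csv_files(directory="."):
    """Return the sorted names of entries in `directory` that end in .csv.

    Hypothetical helper: os.listdir returns every entry name, and the
    filtering that the shell would do for *.csv is done here by hand.
    """
    return sorted(
        name for name in os.listdir(directory)
        if name.endswith(".csv")
    )

if __name__ == "__main__":
    for name in list_csv_files():
        print(name)
```

Note that this matches on the name only, so a directory called something.csv would be listed too.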

9 Comments

I've followed yours and Weeble's suggestions, but now I get a list of all the files in the directory, not just the CSV files I want. Do you know what the problem is?
OK, I've figured out how to solve my problem using Python's glob module - much simpler overall. However, I still want to know what's going on.
Ah-hah! :-) I almost suggested using the glob module, but I didn't think it was quite what you wanted (but it addresses the same problem). In general you shouldn't trust subprocess to accurately replicate shell behaviour - it may on some systems, but not on others.
Thanks. I guess this goes as another example against using the shell and ignoring python's library. :D
@Nick, what does /bin/sh -c ls -l *.csv do for you when you type it at a shell prompt? for me (on Linux) it works just like ls (ignoring the rest of the args) -- and that /bin/sh command is what shell=True with a list (instead of a string) is specified as doing.
4

Why not use glob instead? It's going to be faster than "shelling out"!

import glob
glob.glob('*.csv')

This gives you just the names, not all the extra info ls -l supplies, though you can get extra info with os.stat calls on files of interest.
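
As a rough sketch of the os.stat idea, the following collects the size and modification time for each match. The function name describe_csv_files and the choice of fields are assumptions for illustration, not a full replacement for ls -l:

```python
import glob
import os
import time

def describe_csv_files(pattern="*.csv"):
    """Return (name, size_in_bytes, modified) tuples for files matching pattern.

    Hypothetical helper: glob.glob does the wildcard expansion and os.stat
    supplies some of the details that ls -l would print.
    """
    rows = []
    for name in sorted(glob.glob(pattern)):
        info = os.stat(name)
        modified = time.strftime("%Y-%m-%d %H:%M", time.localtime(info.st_mtime))
        rows.append((name, info.st_size, modified))
    return rows

if __name__ == "__main__":
    for name, size, modified in describe_csv_files():
        print(size, modified, name)
```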

If you really must use ls -l, I think you want to pass it as a string for the shell to do the needed star-expansion:

proc = sp.Popen('ls -l *.csv', shell=True, stdout=sp.PIPE)

2 Comments

Yes, I discovered glob seconds before I saw your post. :P. However, I still want to figure out why Python can't produce the same output as the shell.
@Tom, sure it can -- "/bin/sh -c ls -l '*.csv'" (which as the docs say is the exact equivalent of shell=True with a list instead of a string) behaves just the same way, listing all file names -- try it! When you want the behavior you'd get by typing at the shell a plain string, you give Popen the same string (with shell=True), as I said.
1

When you enter ls -l *.csv at the shell, the shell itself expands *.csv into a list of all the filenames it matches. So the command that ls actually receives looks more like ls -l spam.csv eggs.csv ham.csv

The ls command doesn't understand wildcards itself, so when you pass it the argument *.csv it treats that as a filename, and there is no file with that name. As Nick says, you can use the shell=True parameter to have Python invoke a shell to run the subprocess, and the shell will expand the wildcards for you.
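
Another option, sketched here under the assumption of a POSIX system with ls on the PATH and Python 3.7 or later, is to do the expansion in Python with glob and pass the resulting names to ls without any shell. The function name ls_long is made up for this example:

```python
import glob
import subprocess

def ls_long(pattern="*.csv"):
    """Run `ls -l` on the files matching pattern, without invoking a shell.

    Hypothetical helper: glob.glob plays the role the shell normally plays,
    so ls receives real filenames instead of a literal '*.csv'.
    """
    names = sorted(glob.glob(pattern))
    if not names:
        # With no filename arguments, ls would list the whole directory.
        return ""
    result = subprocess.run(
        ["ls", "-l", "--", *names],  # "--" guards against names starting with "-"
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(ls_long(), end="")
```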

1
p=subprocess.Popen(["ls", "-l", "*.out"], stdout = subprocess.PIPE, shell=True)

causes

/bin/sh -c ls -l *.out

to be executed.

If you try this command in a directory, you'll see -- in typical mystifying-shell fashion -- all files are listed. And the -l flag is ignored as well. That's a clue.

You see, the -c flag is picking up only the ls as the command to run. The rest of the arguments are consumed by /bin/sh as its own positional parameters ($0, $1, and so on) and never reach ls.

To get this command to work right at the terminal, you have to type

/bin/sh -c "ls -l *.out"

Now /bin/sh sees the full command "ls -l *.out" as the argument to the -c flag.
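
To see where the extra arguments go, here is a small sketch that asks the shell to echo its positional parameters. It assumes a POSIX /bin/sh and Python 3.7 or later; the helper name sh_c is invented for the example:

```python
import subprocess

def sh_c(args_after_c):
    """Run /bin/sh -c with the given arguments and return what it prints.

    Hypothetical helper: the first item is the command string, and any
    further items become the shell's $0, $1, ... rather than arguments
    to the command itself.
    """
    result = subprocess.run(
        ["/bin/sh", "-c", *args_after_c],
        capture_output=True,
        text=True,
    )
    return result.stdout

if __name__ == "__main__":
    # Same shape as ["ls", "-l", "*.out"] with shell=True, but with echo
    # standing in for ls so the parameters become visible.
    print(sh_c(['echo "command name: $0, first argument: $1"', "-l", "*.out"]), end="")
    # prints: command name: -l, first argument: *.out
```

The -l and *.out end up in $0 and $1 of the shell, which is why ls in the list form runs with no arguments at all.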

So to get this to work out right using subprocess.Popen, you are best off just passing the command as a single string:

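import subprocess

p = subprocess.Popen("ls -l *.out", stdout=subprocess.PIPE, shell=True)
output, error = p.communicate()  # error is None because stderr is not piped
print(output.decode())  # stdout arrives as bytes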
