The primary issue here -- which accounts for why, e.g., $PS1
is not reported by env
-- is that env
is reporting from a non-interactive environment. Processes are executed from a fork of your interactive shell, but there's a subtlety involved in how their environment is set: It's actually inherited via a native C level external variable set for all exec()
'd processes (see man environ
). Here's an illustration:
#include <stdio.h>
extern char **environ;
int main (void) {
int i;
for (i = 0; environ[i] != NULL; i++) {
printf("%s\n", environ[i]);
}
return 0;
}
What's interesting about this is, if you compile and run it, you'll find the contents of **environ
match what is reported by env
almost exactly:
$ gcc test.c
$ ./a.out > aout.txt
$ env > env.txt
$ diff env.txt aout.txt
68c68
< _=/bin/env
---
> _=./a.out
The only difference is the name of the executable. So where does **environ
come from and why doesn't it contain, e.g., $PS1
?
The fundamental explanation is that processes are always created as children of other processes and they inherit **environ
, but PS1
was never part of it. At start up, a shell may source variables from standard places, and those places differ depending on whether the shell is interactive or not; see INVOCATION in man bash
. An aspect of this is that:
PS1 is set [...] if bash is interactive, allowing a shell
script or a startup file to test this state.
Now, notice in /etc/bashrc
something like this:
# are we an interactive shell?
if [ "$PS1" ]; then
Which is where your actual (fancy) prompt is set, and neither it nor the initial value of $PS1
were ever export
ed. The initial value was created by the shell at invocation because it was interactive, and then it sourced that file -- but PS1
did not get put into **environ
. You can see this if you execute:
#!/bin/sh
echo $PS1
Nothing -- even though if you echo $PS1
in your interactive shell it's defined. This is because the **environ
of the executed #!/bin/sh
is the same as that of the parent interactive shell, but that does NOT contain PS1
. This implies each shell uses an internal table of global variables that is separate from, but originally populated from, **environ
(this is confusing, since it means **environ
does not include many things referred to as environment variables).
The contents of **environ
as they were when the process was started are in /proc/[PID]/environ
, and if you check that for your current interactive shell, cat /proc/$BASHPID/environ
, you'll see PS1
is not there.
But how does stuff get into "environ"?
The simple answer is, via library calls such as putenv(). For example, if we throw some stuff into the example C program from earlier:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern char **environ;
int main (void) {
int i;
if (putenv("MYFOO=whatbar?")) {
fprintf(stderr, "putenv() failed: %s\n", strerror(errno));
exit(1);
}
for (i = 0; environ[i] != NULL; i++) {
printf("%s\n", environ[i]);
}
return 0;
}
MYFOO=whatbar?
will be in the output (see man putenv
). Since the shell creates processes by fork()
ing (which duplicates the parent's memory, environment included) and then calling execv()
(which passes on the duplicated **environ
), we can see a mechanism by which environment variables may be export
ed to child processes.
If you throw a fork()
into that example, you'll see this is the case, and (to reiterate), this process of fork'ing and potentially exec'ing is how child processes are created and inherit **environ
from their ancestors. exec
calls replace the process image, but as per man execv
and man environ
(nb. some versions of the former do not refer to this), **environ
is passed on by the system.
Here's a literal fork and exec of /usr/bin/env
with MYFOO=whatbar?
exported via putenv()
:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
extern char **environ;
int main (void) {
pid_t pid;
if (putenv("MYFOO=whatbar?")) {
fprintf(stderr, "putenv() failed: %s\n", strerror(errno));
exit(1);
}
pid = fork();
if (pid == -1) {
fprintf(stderr, "fork() failed: %s\n", strerror(errno));
exit(1);
}
if (!pid) execl("/usr/bin/env", "env", (char *)NULL);
return 0;
}
So where's the stuff that's not in "environ"?
It's private data of a particular shell instance. Bash will show you this, plus the inherited environ stuff, via set
with no arguments. Note this output also includes sourced functions.
But, if I find, for example, LC_CTYPE using env | grep "LC_CTYPE", it sends no output. In general, locale shows me 13 LC_* variables and env only nine:
I get no LC_
variables at all from env
(just LANG
) but 13 from locale
. I would presume these are not exported variables at all: locale
reports the effective value of each category, derived from LANG where no LC_* variable is set; the fact that you get any from env
perhaps reflects a naive error in some configuration somewhere.