285

I am trying to understand pointers in C, but I am currently confused by the following:

  • char *p = "hello"
    

    This is a char pointer pointing at the character array, starting at h.

  • char p[] = "hello"
    

    This is an array that stores hello.

What is the difference when I pass both these variables into this function?

void printSomething(char *p)
{
    printf("p: %s",p);
}
7
  • 8
    This would not be valid: char p[3] = "hello"; The initializer string is too long for the size of the array you declare. Typo?
    – Cody Gray
    Commented Apr 17, 2012 at 7:17
  • 24
    Or just char p[]="hello"; would suffice!
    – deepdive
    Commented Jul 2, 2014 at 10:03
  • possible duplicate of C: differences between char pointer and array
    – sashoalm
    Commented Feb 1, 2015 at 18:27
  • 1
    possible duplicate of What is the difference between char s[] and char *s in C? True, this also asks specifically about the function parameter, but that is not char specific. Commented Jun 5, 2015 at 7:44
  • 2
    You need to understand that they are fundamentally different. The only commonality is that the base of the array p[] acts as a constant pointer, which lets you access the array p[] via a pointer. p[] itself holds memory for a string, whereas *p just points to the address of the first element, a single char (i.e., it points to the base of an already allocated string). To illustrate this, consider: char *cPtr = {'h','e','l','l','o','\0'}; is an error, as cPtr is a pointer to only a character, whereas char cBuff[] = {'h','e','l','l','o','\0'}; is OK, because cBuff itself is a char array.
    – Ilavarasan
    Commented Oct 27, 2015 at 3:20

10 Answers

286

char* and char[] are different types, but it's not immediately apparent in all cases. This is because arrays decay into pointers, meaning that if an expression of type char[] is provided where one of type char* is expected, the compiler automatically converts the array into a pointer to its first element.

Your example function printSomething expects a pointer, so if you try to pass an array to it like this:

char s[10] = "hello";
printSomething(s);

The compiler pretends that you wrote this:

char s[10] = "hello";
printSomething(&s[0]);
4
  • 1
    Is something changed from 2012 to now. For a character array "s" prints entire array.. i.e., "hello"
    – Bhanu Tez
    Commented May 9, 2019 at 6:48
  • 5
    @BhanuTez No, how data is stored and what is done with the data are separate concerns. This example prints the entire string because that is how printf handles the %s format string: start at the address provided and continue until encountering null terminator. If you wanted to print just one character you could use the %c format string, for example.
    – jacobq
    Commented Jun 17, 2019 at 10:59
  • 1
    Just wanted to ask whether char *p = "abc"; the NULL character \0 is automatically appended as in case of char [] array?
    – asn
    Commented Jul 8, 2019 at 17:53
  • why i can set char *name; name="123"; but can do the same with int type? And after using %c to print name , the output is unreadable string: ?
    – TomSawyer
    Commented Apr 23, 2020 at 19:52
107

Let's see:

#include <stdio.h>
#include <string.h>

int main()
{
    char *p = "hello";
    char q[] = "hello"; // no need to count this

    printf("%zu\n", sizeof(p)); // => size of pointer to char -- 4 on x86, 8 on x86-64
    printf("%zu\n", sizeof(q)); // => size of char array in memory -- 6 on both

    // size_t strlen(const char *s) and we don't get any warnings here:
    printf("%zu\n", strlen(p)); // => 5
    printf("%zu\n", strlen(q)); // => 5

    return 0;
}

foo* and foo[] are different types, and the compiler handles them differently: a pointer is an address plus the type it points to, while an array also carries its length when that is known (for example, if the array is statically allocated). The details can be found in the standard. At runtime there is almost no difference between them (in assembler; see below).

Also, there is a related question in the C FAQ:

Q: What is the difference between these initializations?

char a[] = "string literal";   
char *p  = "string literal";   

My program crashes if I try to assign a new value to p[i].

A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:

  1. As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values of the characters in that array (and, if necessary, its size).
  2. Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.

Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).

See also questions 1.31, 6.1, 6.2, 6.8, and 11.8b.

References: K&R2 Sec. 5.5 p. 104

ISO Sec. 6.1.4, Sec. 6.5.7

Rationale Sec. 3.1.4

H&S Sec. 2.7.4 pp. 31-2

3
  • In sizeof(q), why doesn't q decay into a pointer, as @Jon mentions in his answer?
    – garyp
    Commented Apr 21, 2016 at 19:23
  • @garyp q doesn't decay into a pointer because sizeof is an operator, not a function (even if sizeof was a function, q would decay only if the function was expecting a char pointer ).
    – GiriB
    Commented Aug 14, 2016 at 17:07
  • thanks, but printf("%u\n" instead of printf("%zu\n" , I think you should remove z.
    – Zakaria
    Commented Feb 17, 2018 at 20:11
61

What is the difference between char array vs char pointer in C?

C99 N1256 draft

There are two different uses of character string literals:

  1. Initialize char[]:

    char c[] = "abc";      
    

    This is "more magic", and described at 6.7.8/14 "Initialization":

    An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

    So this is just a shortcut for:

    char c[] = {'a', 'b', 'c', '\0'};
    

    Like any other regular array, c can be modified.

  2. Everywhere else: it generates an unnamed array of char with static storage duration, and modifying it gives undefined behavior (UB).

    So when you write:

    char *c = "abc";
    

    This is similar to:

    /* __unnamed is magic because modifying it gives UB. */
    static char __unnamed[] = "abc";
    char *c = __unnamed;
    

    Note the implicit conversion from char[] to char *, which is always legal.

    Then if you modify c[0], you also modify __unnamed, which is UB.

    This is documented at 6.4.5 "String literals":

    5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...]

    6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

6.7.8/32 "Initialization" gives a direct example:

EXAMPLE 8: The declaration

char s[] = "abc", t[3] = "abc";

defines "plain" char array objects s and t whose elements are initialized with character string literals.

This declaration is identical to

char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };

The contents of the arrays are modifiable. On the other hand, the declaration

char *p = "abc";

defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

GCC 4.8 x86-64 ELF implementation

Program:

#include <stdio.h>

int main(void) {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

Compile and disassemble:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

Output contains:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00 
        c: R_X86_64_32S .rodata

Conclusion: for char*, GCC stores the string in the .rodata section, not in .text.

If we do the same for char[]:

 char s[] = "abc";

we obtain:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

so it gets stored on the stack (relative to %rbp).

Note however that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:

readelf -l a.out

which contains:

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata
1
11

You're not allowed to change the contents of a string constant, which is what the first p points to. The second p is an array initialized with a string constant, and you can change its contents.

9

For cases like this, the effect is the same: You end up passing the address of the first character in a string of characters.

The declarations are obviously not the same though.

The following sets aside memory for a string and also a character pointer, and then initializes the pointer to point to the first character in the string.

char *p = "hello";

The following, by contrast, sets aside memory only for the array (10 bytes here) and no separate pointer, so it can actually use less memory.

char p[10] = "hello";
2
  • codeplusplus.blogspot.com/2007/09/… "However, initializing the variable takes a huge performance and space penalty for the array"
    – leef
    Commented Mar 10, 2013 at 14:22
  • @leef: I think that depends where the variable is located. If it's in static memory, I think it is possible for the array and data to be stored in the EXE image and not require any initialization at all. Otherwise, yes, it certainly can be slower if the data has to be allocated and then the static data has to be copied in. Commented Mar 31, 2015 at 15:34
7

From APUE, Section 5.14 :

char    good_template[] = "/tmp/dirXXXXXX"; /* right way */
char    *bad_template = "/tmp/dirXXXXXX";   /* wrong way*/

... For the first template, the name is allocated on the stack, because we use an array variable. For the second name, however, we use a pointer. In this case, only the memory for the pointer itself resides on the stack; the compiler arranges for the string to be stored in the read-only segment of the executable. When the mkstemp function tries to modify the string, a segmentation fault occurs.

The quoted text matches @Ciro Santilli 's explanation.

1
  • It is put on stack only if it is local variable, i,e, declared inside function body, otherwise it will be in .data section. Commented Dec 27, 2023 at 13:44
3

As far as I can remember, array indexing is defined in terms of pointer arithmetic. For example

p[1] == *(p + 1)

is a true statement (note that it is p + 1, not &p + 1, which would step past the whole array).

1
  • 2
    I would describe an array as being a pointer to the address of a block of memory. Hence why *(arr + 1) brings you to the second member of arr. If *(arr) points to a 32-bit memory address, e.g. bfbcdf5e, then *(arr + 1) points to bfbcdf60 (the second byte). Hence why going out of the scope of an array will lead to weird results if the OS doesn't segfault. If int a = 24; is at address bfbcdf62, then accessing arr[2] might return 24, assuming a segfault doesn't happen first. Commented Feb 5, 2014 at 3:05
2

char p[3] = "hello"? It should be char p[6] = "hello"; remember there is a '\0' char at the end of a "string" in C.

Anyway, in most expressions an array in C behaves like a pointer to the first of a run of adjacent objects in memory. The differences are in the semantics: you can change a pointer to point to a different location in memory, whereas an array, once created, always refers to the same location.
Also, when using an array, the allocation and deallocation (the "new" and "delete") are done for you automatically.

0
0

It looks the same but there are subtle differences.

This declaration of a char array:

char p[] = "hello"

It means p is a unmodifiable lvalue, and you can use p as an alias to this array. In fact you can use p just as an address of the array (or the first element of an array): p is an alias for &p[0]

But you can't operate on it like you can on pointers, you cannot change its value:

p++;

But you can get its value:

printf("Address of 3rd character: %p\n", (void *)(p + 2));

Like all initialized (global) variables, it will be put in the .data section of an ELF file:

Hex dump of section '.data':
  0x00004000 00000000 00000000 08400000 00000000 .........@......
  0x00004010 68656c6c 6f00                       hello.

    ...

In this case we declare a char pointer and initialize it too:

    char *p = "hello"

But the string literal used for initialization will be put in the .rodata section, while the char pointer itself will be in the .data section of the ELF file:

Hex dump of section '.rodata':
  0x00002000 01000200 68656c6c 6f00              ....hello.
1
  • "It means p is a unmodifiable lvalue, and you can use p as an alias to this array..... p is an alias for &p[0]." -- This has a couple of problems. First, p is the array, not an alias, and p is not an alias for &p[0]. Rather, p decays to a pointer equivalent to &p[0] in most expressions. p is an lvalue that is not modifiable, as you say, but when p decays to a pointer the resulting pointer is not an lvalue (6.3.2.1 §3). This is the reason that p++ fails to compile; ++ requires an lvalue operand. Commented Jan 10, 2024 at 11:18
0

If you need ROM maps in memory that you will access in different compilation units you can allocate the memory in the first unit:

extern const unsigned char BPG_Arial29x32[] = {
    // Font Info
    0x00,                   // Unknown #1
    0x00,                   // Unknown #2
...
};

In the other unit there are two options to declare in the headers:

extern const unsigned char* BPG_Arial29x32;

OR

extern const unsigned char BPG_Arial29x32[];

The second always works, while the first makes the software 'hang' if you use it like this:

inline static map<const unsigned char*, const char*> fontSoftToString = {
    {BPG_Arial29x32, "BPG_Arial29x32"}
};

inline static map<string, const unsigned char*> stringToSoftFont = {
    {"BPG_Arial29x32", BPG_Arial29x32}
};

But it works if you use it as a function parameter:

declare: SetTextFontRom(const unsigned char* font)
use: SetTextFontRom(BPG_Arial29x32);

Why is this, and why are the two declarations not 'compatible'?

1
  • If you have a new question, please ask it by clicking the Ask Question button. Include a link to this question if it helps provide context. - From Review Commented Nov 13, 2024 at 20:35
