Ex11: Why won't it break?

I’m struggling to break the char array for this one.

My problem is that when I put a non-null char at the end of the array, it simply won’t print the expected garbage after it (I have tried moving the initialisation of the variables around to see if that helps).

My initial thoughts were:

  • I’m getting super lucky with where these variables are stored in memory, and name just so happens to be stored right before a null character each time (e.g. *(&name[3] + 1) == ‘\0’ in memory).
  • the compiler is taking pity on me and adding a null character to the end of the array (seems to go against everything that I’ve been reading online).

I can’t seem to get to the bottom of it from the various forums online, and there are no references to point two anywhere that I can see (my next item to investigate was the cc release notes).

For reference

>> cc --version
>> cc (Ubuntu 9.3.0-10ubuntu2) 9.3.0

I was initially following the book on Windows 10 with Cygwin, up to and including ex11. In case that was causing the problem (or in this case, the lack of a problem!), however unlikely, I took the opportunity to finally install Ubuntu on my machine alongside Windows. I get the same result on both OSes. By my thinking this significantly reduces the likelihood of point one being the cause.

Any thoughts on where I’ve gone wrong?

ex11.c is below, built with the Makefile flags recommended in ex2:

CFLAGS=-Wall -g

#include <stdio.h>

int main(int argc, char *argv[])
{
        int numbers[4] = { 0 };
        char name[4] = { 'a', 'a', 'a', 'a' };

        // first, print them out raw
        printf("numbers: %d %d %d %d\n",
                        numbers[0], numbers[1], numbers[2], numbers[3]);

        printf("name each: %c %c %c %c\n",
                        name[0], name[1], name[2], name[3]);

        printf("name: %s\n", name);

        // set up the numbers
        numbers[0] = 1;
        numbers[1] = 2;
        numbers[2] = 3;
        numbers[3] = 4;

        // set up the name
        name[0] = 'Z';
        name[1] = 'e';
        name[2] = 'd';
        name[3] = 'A';

        // print them out initialised
        printf("numbers: %d %d %d %d\n",
                        numbers[0], numbers[1], numbers[2], numbers[3]);

        printf("name each: %c %c %c %c\n",
                        name[0], name[1], name[2], name[3]);

        printf("name: %s\n", name);

        // another way to use name
        char *another = "Zed";

        printf("another: %s\n", another);

        printf("another each: %c %c %c %c\n",
                        another[0], another[1], another[2], another[3]);

        return 0;
}

This returns

./ex11
numbers: 0 0 0 0
name each: a a a a
name: aaaa
numbers: 1 2 3 4
name each: Z e d A
name: ZedA
another: Zed
another each: Z e d 

I just noticed this is the same topic (maybe worded differently) as a similar post from '18, but that one looks like it's still unresolved.

I don’t have a good answer either, other than that you can’t rely on this behavior. It may be that gcc initializes the stack so it’s all 0.

I suspect that you would be able to break this reliably if you allocate the array on the heap?

Hi florian,

Thanks for the reply. I haven’t played with the heap yet (I presume this means setting up a memory location with malloc?).

I got curious about this again with ex12, which I’ve just written up, so I thought I’d investigate. It looks like the compiler is adding an additional char value onto the end of the array, regardless of whether I add one myself. For example, when I compile and run this code (with the null terminator added) you can see that the byte at the ‘\0’ AND the byte at (&full_name[0] + 12) are both null.

(Note: I’ve wrapped the important parts of the terminal dump with ** for this forum.)

#include <stdio.h>

int main(int argc, char *argv[])
{
        int areas[] = { 10, 12, 13, 14, 20 };
        char name[] = "Zed";
        char full_name[12] = {
                'Z', 'e', 'd',
                ' ', 'A', '.', ' ',
                'S', 'h', 'a', 'w', '\0'
        };

        char *test;

        for (int i = 0; i<18; i++){
                test = &full_name[0] + i;
                printf("%p :: %c\n", (void *)(&full_name[0] + i), *test);
        }

        // WARNING: On some systems you may have to change the
        // %ld in this code to a %u since it will use unsigned ints
        printf("The size of an int: %ld\n", sizeof(int));
        printf("The size of areas (int[]): %ld\n", sizeof(areas));
        printf("The size of ints in areas: %ld\n",
                        sizeof(areas) / sizeof(int));
        printf("The first areas is %d, the 2nd %d,\n", areas[0], areas[1]);

        printf("The size of a char: %ld\n", sizeof(char));
        printf("The size of name (char[]): %ld\n", sizeof(name));
        printf("The size of chars: %ld\n", sizeof(name) / sizeof(char));

        printf("The size of full_name (char[]): %ld\n", sizeof(full_name));
        printf("The number of chars: %ld\n",
                        sizeof(full_name) / sizeof(char));

        printf("name=\"%s\" and full_name=\"%s\"\n", name, full_name);

        return 0;
}

Run #1

./ex12
0x7ffc9b742f18 :: Z
0x7ffc9b742f19 :: e
0x7ffc9b742f1a :: d
0x7ffc9b742f1b ::  
0x7ffc9b742f1c :: A
0x7ffc9b742f1d :: .
0x7ffc9b742f1e ::  
0x7ffc9b742f1f :: S
0x7ffc9b742f20 :: h
0x7ffc9b742f21 :: a
0x7ffc9b742f22 :: w
**0x7ffc9b742f23 :: **
**0x7ffc9b742f24 ::** 
0x7ffc9b742f25 :: �
0x7ffc9b742f26 :: 
0x7ffc9b742f27 :: g
0x7ffc9b742f28 :: �
0x7ffc9b742f29 :: �
The size of an int: 4
The size of areas (int[]): 20
The size of ints in areas: 5
The first areas is 10, the 2nd 12,
The size of a char: 1
The size of name (char[]): 4
The size of chars: 4
The size of full_name (char[]): 12
The number of chars: 12
name="Zed" and full_name="Zed A. Shaw"

Run #2 of the same code

./ex12
0x7fff47ccc2c8 :: Z
0x7fff47ccc2c9 :: e
0x7fff47ccc2ca :: d
0x7fff47ccc2cb ::  
0x7fff47ccc2cc :: A
0x7fff47ccc2cd :: .
0x7fff47ccc2ce ::  
0x7fff47ccc2cf :: S
0x7fff47ccc2d0 :: h
0x7fff47ccc2d1 :: a
0x7fff47ccc2d2 :: w
**0x7fff47ccc2d3 :: **
**0x7fff47ccc2d4 ::** 
0x7fff47ccc2d5 :: �
0x7fff47ccc2d6 :: �
0x7fff47ccc2d7 :: �
0x7fff47ccc2d8 :: �
0x7fff47ccc2d9 :: �
The size of an int: 4
The size of areas (int[]): 20
The size of ints in areas: 5
The first areas is 10, the 2nd 12,
The size of a char: 1
The size of name (char[]): 4
The size of chars: 4
The size of full_name (char[]): 12
The number of chars: 12
name="Zed" and full_name="Zed A. Shaw"

Replacing the ‘\0’ at the end of the full_name initialiser with a letter (‘s’ in this case) simply removes our null value, but the ‘auto generated’ one remains.
Running this code consistently ‘prints’ a null terminator at the address (&full_name[0] + 12).

 ./ex12
0x7ffc5bcfa9c8 :: Z
0x7ffc5bcfa9c9 :: e
0x7ffc5bcfa9ca :: d
0x7ffc5bcfa9cb ::  
0x7ffc5bcfa9cc :: A
0x7ffc5bcfa9cd :: .
0x7ffc5bcfa9ce ::  
0x7ffc5bcfa9cf :: S
0x7ffc5bcfa9d0 :: h
0x7ffc5bcfa9d1 :: a
0x7ffc5bcfa9d2 :: w
**0x7ffc5bcfa9d3 :: s**
**0x7ffc5bcfa9d4 ::** 
0x7ffc5bcfa9d5 :: W
0x7ffc5bcfa9d6 :: t
0x7ffc5bcfa9d7 :: �
0x7ffc5bcfa9d8 :: �
0x7ffc5bcfa9d9 :: �
The size of an int: 4
The size of areas (int[]): 20
The size of ints in areas: 5
The first areas is 10, the 2nd 12,
The size of a char: 1
The size of name (char[]): 4
The size of chars: 4
The size of full_name (char[]): 12
The number of chars: 12
name="Zed" and full_name="Zed A. Shaws"

My final modification was to overwrite this rogue ‘\0’ inside the for loop and see what happens in the subsequent print statements. It looks like I found the issue but not the cause: I ended up “stack smashing”, which sounds far from pleasant (I also just found :set number in vim!).

        for (int i = 0; i<18; i++){
                test = &full_name[0] + i;
                printf("%p :: %c\n", (&full_name[0] + i), *test);
                if (i == 12){
                        *test = 's';
                        printf("CHANGED %p :: %c\n", (&full_name[0] + i), *test);
                }
        }

Result:

./ex12
0x7ffe3698134c :: Z
0x7ffe3698134d :: e
0x7ffe3698134e :: d
0x7ffe3698134f ::  
0x7ffe36981350 :: A
0x7ffe36981351 :: .
0x7ffe36981352 ::  
0x7ffe36981353 :: S
0x7ffe36981354 :: h
0x7ffe36981355 :: a
0x7ffe36981356 :: w
0x7ffe36981357 :: s
**0x7ffe36981358 :: **
**CHANGED 0x7ffe36981358 :: s**
0x7ffe36981359 :: 
0x7ffe3698135a ::  
0x7ffe3698135b :: L
0x7ffe3698135c :: I
0x7ffe3698135d :: 
The size of an int: 4
The size of areas (int[]): 20
The size of ints in areas: 5
The first areas is 10, the 2nd 12,
The size of a char: 1
The size of name (char[]): 4
The size of chars: 4
The size of full_name (char[]): 12
The number of chars: 12
**name="Zed" and full_name="Zed A. Shawss LI��"**
***** stack smashing detected ***: terminated**
**Aborted (core dumped)**

Is my interpretation of what is going on correct? If so is this a change to the compiler since the authoring of the book? And would we ever need to initialise an array explicitly without the null terminator? If so how?

Good question and I would like to know the answer. I can’t see why we would need to initialize it without the ‘\0’.

If you’re dealing with arrays, I don’t see why you would care about what comes directly after it in memory so long as the content of the array is what you expect. Which is the case here.

If you’re dealing with strings you absolutely positively do need to make sure you’ve got one \0 at the end. If the compiler adds another one, again, who cares so long as the string itself is intact?

It’s fantastic that you’re so curious, but in the case of a C compiler… don’t rack your brain. Some things are supposed to remain unknown to mere mortals. :wink:

If I compile your program with:
clang -Wall -g hi.c -o hi
then I do not get garbage output either.

However, if I compile with:
clang -Wall -g -O hi.c -o hi
then I get garbage output.

If you’re making a string then you have to include the ‘\0’ but otherwise it just depends on what’s in the array. Almost always you’ll want to initialize it with 0, which should happen when you do something like:

int stuff[1000] = {0};

That one 0 sets the first element explicitly, and the C standard requires the remaining 999 integers to be zero-initialized as well. Where this gets tricky is that some compilers will see you made a small random error in your code (such as reading past the end of an array), then go crazy wrecking everything for “speed”.