Double free or corruption - ex17.c simulation

Dylan · September 8, 2018, 5:33am

So, I recently posted about ex17.c: Ex17 Seg fault when setting new entries

I’m having difficulty creating a dynamic array.

I started working on a project which does something very similar. I think I may be able to solve the issues with ex17.c if I get this working, but I’m having an issue freeing the memory. It’s pretty messy code, but I have the initialisation working. When I attempt to free the memory, I get double free or corruption (out), and the program aborts.

The offending function is this one:

void Close_dir(struct Directory *dir) {
  if (dir) {
    if (dir->files) {
      int i = 0;
      for (i = 0; i < dir->num_files; i++) {
	Close_file(dir->files + i);
      }
    }
    Close_file(dir->directory_attributes);
    free(dir);
  }
}

followed by this one:

void Close_file(struct File *file) {
if (file != NULL) {
 if (file->owner != NULL)
 free(file->owner);
 if (file->group != NULL)
 free(file->group);
 if (file->name != NULL)
 free(file->name);
 free(file);
 }
 }

(Sorry about the formatting, I can’t seem to get the lines of code to display on seperate lines. Let me know if you know a way and I will fix it.)

It seems that my call to free each file is not correctly incrementing, and the Close_file() function is freeing the same file every time. I ran it under GDB and the abort did not occur until it tried to free on the second run of the for loop.

The syntax I am using could be failing to increment the pointer. That was one suggestion I saw online. Every other form of syntax however, including array index syntax, causes a compiler error, as I am dereferencing a struct passing that when a pointer is expected. I tried converting the whole function over to expecting a struct, but then I was passing a struct to free(), which doesn’t work.

This is what I think the problem is, but I can’t find a way to tell the program to free every allocated piece of memory in the array while adhering to syntax and correctly increment the pointer. Can anyone give me any advice on how this is done?

Thanks.

zedshaw · September 8, 2018, 2:18pm

Part of the reason you have trouble formatting your posts is you don’t hit the ENTER key enough. I went in and edited your post to show you, but you have to hit enter 2 times between paragraphs, and enter after you type [code] or [/code] so they’re on their own line.

Is there a reason you aren’t hitting ENTER and adding more spacing? It’s not really saving you much and just makes it difficult to read what you post, but if there’s some kind of limitation you’re having I can probably help you get around it.

So the first problem is this C code isn’t correct:

if (file->;owner != NULL)
 free(file->;owner);
 if (file->;group != NULL)
 free(file->;group);
 if (file->;name != NULL)
 free(file->;name);
 free(file);

You have file->;group but that semi-colon ( shouldn’t be there. I suspect you didn’t actually copy paste this in, or if that is your real code then it doesn’t compile.

Looking at your pastebin it seems correct, so that means you’re calling double free on one of those. The problem is, you can’t really solve it with only the information “I get a double free”. A double free…where? What line? Without more information you’ll never figure this out.

So, how do you get more information? You run the program under a debugger (gdb or lldb if you’re on a mac), and then when the error happens you’ll get the abort to stop right where the error is and can dump the program and see what’s going on. I have a video for that, but let me know if you can’t find it.

Dylan · September 8, 2018, 3:40pm

Thanks Zed. I apologise for the lack of clarity in my first post and for the poor formatting. I actually pasted the code from pastebin, as the original is written in Emacs on the command line and I wasn’t sure how well it would paste. It probably would have been fine in hindsight. When I pasted it, the preview showed it all on one line, but hitting enter repeatedly didn’t fix that. I’ll try again next time. Thanks for reformatting it.
The semi-colons are definitely not in my code. I have edited the post to remove them.
I traced the process through GDB, and the double free occurred on line 75, so free(file);, on the second iteration of the for loop. The for loop in the Close_dir() function calls Close_file(dir->files + i); where i = 0, the first time, which runs without a problem. Then I get right through the second iteration, up to line 75, where i = 1, and as it attempts to free the file it aborts. This suggests that it is still freeing the elements of the struct successfully, but not the struct itself. That doesn’t make much sense to me, but that’s what happens under GDB. Are you able to replicate that? I achieved this by placing a break at line 57, where the for loop starts. I then simply run next until the for loop starts again. In the second iteration the calls to free() run successfully right up to the last one. When I get to free(file);, I hit next, then I get

double free or corruption (out)

Program received signal SIGABRT, Aborted.

That’s what makes me think that the program is trying to free the pointer to the first struct, instead of the second one, but the fact that it still frees the elements of the struct seems inconsistent. Do you have any thoughts?
Thanks.

zedshaw · September 9, 2018, 1:06am

Alright, so you now know where it is, you have to start tracing how many times you call free(file) and where they happen. You just need to print out right before then and trace it.

Also, when you have File *files you don’t need to do files + i. Just treat it like an array and use files[i]. You’re less likely to get it wrong then.

The other piece of advice is to dump the stack at the point that you get the double free abort inside GDB. That’ll tell you how you got to that point, so you can trace it easier.

Another thing, it might not be that you’re calling free more than once. It could be that you didn’t really allocate that array for files like you thought.

Next, try setting the pointers to NULL after you free them. The free function should know to do nothing when handed a NULL.

Finally, see if you can install a tool called valgrind. That will actually trace all this memory access for you.

Dylan · September 9, 2018, 4:22am

Thanks Zed. I’m going to try those techniques now, but I thought I would just ask for clarification with the array index. I remember you saying in the lesson on pointers that something files[i] should work the same way as files + i. The reason I have not used that syntax is that it gives me a compiler error. Just now I changed it back to files[i]. I didn’t change anything else, but I got the following error:

attributes.c: In function ‘Close_dir’:
attributes.c:58:13: error: incompatible type for argument 1 of ‘Close_file’
  Close_file(dir->files[i]);
             ^~~
attributes.c:23:6: note: expected ‘struct File *’ but argument is of type ‘struct File’
 void Close_file(struct File *file);
      ^~~~~~~~~~
make: *** [<builtin>: attributes] Error 1

It appears to me that what is happening when I change this is that the array index syntax is dereferencing the pointer. As soon as I change it back to files + i the code compiles without an issue. Is it possible that I have made an error in setting up the struct that causes this to happen? It does seem strange based on what you’ve said about pointers, and from my own experience working with them.

Dylan · September 9, 2018, 5:17am

NOTE: This post is a long one, documenting the steps I have taken to debug this code. As I got near to the end, I realised what the problem might be. Please scroll to the end of this post to see what I am now attempting. I have included the information about my debugging process, as it may still be useful to look at. I will post again when I have some results.

Alright, so I’ve just gone through GDB again. I ran through the loop. The first time, everything executes successfully. The second time, it aborts on the call to free(file);. The memory address of file within the Close_file() function is different, so a different pointer has been passed to the function. My original theory that it was not incrementing is incorrect. I tried setting file to NULL after freeing. Of course, since it is actually incrementing, this made no difference.
So this makes your other suggestion, that the memory has not in fact been allocated, sound more likely. However, when I run the program it is printing out each of these files. I changed the program after uploading it, so that on line 99, when each file is initialised, the size variable, at the start, is set to 10 + i. This variable does not mean much, but it is a way of differentiating them. The program the prints out these values, and an incremented number is shown in this field. This suggests to me that seperate files are in fact being created. I realise that they could be using illegal memory, but that seems unlikely for 10 different structs, when I have run this repeatedly without seeing an issue there like a segfault.
It actually points to invalid reads on each line of the function, starting at line 69, where I check if (file->owner != NULL). So the file object passed to it has no value. These checks run without a problem in GDB, but Valgrind is reporting errors. Perhaps the invalid reads are tolerated, but then attempting to free the entire pointer causes the abort signal.
The backtrace from GDB is this:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff7a4ba87 in __GI_abort () at abort.c:90
#2  0x00007ffff7a90a38 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7ba0f52 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff7a98478 in malloc_printerr (str=str@entry=0x7ffff7ba2ad8 "double free or corruption (out)") at malloc.c:5368
#4  0x00007ffff7a9a410 in _int_free (av=0x7ffff7dd2c20 <main_arena>, p=0x5555557579f0, have_lock=<optimized out>) at malloc.c:4285
#5  0x0000555555554a1b in Close_file (file=0x555555757a00) at attributes.c:75
#6  0x0000555555554977 in Close_dir (dir=0x555555757730) at attributes.c:58
#7  0x0000555555554cfa in main (argc=1, argv=0x7fffffffde48) at attributes.c:129

This also traces the problem to line 75, but no mention of an issue on the previous lines. Valgrind only complains of invalid reads on the other lines, but complains about an invalid free on line 75. I’m not sure why there is a difference, but clearly this call to free() is more significant.
The amount of memory left not freed is 610 bytes in 10 blocks. This corresponds exactly to the value of file_size allocated in the Create_file() function called on line 39, which is what the valgrind output shows on line 124 of the pastebin. That value is created on line 110 of the program. I ran it with --leak-check=full. However, since it does seem to have freed the first pointer passed to it, I would expect only 9 blocks to be showing instead of 10. I would also expect about 30 additional blocks to show up for the 3 string pointers in each struct, as the loop has not gone through freeing each of these due to the abort signal. So again it seems like the memory may not be allocated even though everything is printing without a problem.
When I look online, everything points to the fact that these invalid reads and frees are caused by either freeing something and then attempting to access it, or not allocating it in the first place. This makes perfect sense. So I’ve had a look at line 99 of the program, where I run the for loop to allocate the returned pointer to a struct for new_dir->files[i][/code], but I can’t see a problem here. That’s where I call Create_file, which allocates memory for the file struct, then initialises each variable of the struct, including allocating memory for the char * variables. Then it returns a pointer to file. The debuggers don’t report any issues here. I checked the code at line 99 under GDB, and the address being passed is updating, and the correct values show up in each part of the struct after the function has been run.

I just had a closer look at line 99, and saw that the address returned by Create_file() is the same each time. The address of new_dir->files[i] is different each time, but each time it is a pointer to the same object. Perhaps I should restructure this function to take the pointer as an argument, and to directly assign everything to that. I’m not sure why I didn’t do this when I started writing the program, but that seems to me like it might fix the issue. I will attempt that this afternoon and post again when I get some different results.

Dylan · September 9, 2018, 5:19am

I wanted to provide a link to the Valgrind output, but this site is not allowing me to link to that host now. If you want to see the Valgrind output, I can paste it in another post.
My first post was also hidden, I suspect this was due to the pastebin link. I removed this link. If I need to provide the full code again I will post it. Is there a reason why pastebin is being flagged? Is there a problem with using it? I’m happy to paste the full code on the forum, I just found that it makes things a bit tidier if I don’t stick 150 lines of code in the middle of the post.

zedshaw · September 9, 2018, 5:19pm

No idea why it flagged you but I removed it. Alright, I’m going to suggest that you continue to work on it rather than trying to post everything you find. I believe you have the solution but that you stop to come here to ask for help every time you find more information. You should take that information and simply try to solve it rather than asking for help. Valgrind is giving you great info, don’t post it here in its entirety. Just post relevant parts.

Basically, go do some more work. You’re almost there and don’t need to ask for help at every step.

Dylan · September 10, 2018, 4:09am

Thanks Zed, you’re right. I actually fixed the problem last night, but I didn’t have time to post about it. I changed the struct within the Directory struct to a two-dimensional pointer, and that got rid of the issues with access dereferencing it. After that I modified a few things and it worked. I also made a lot of progress with ex17. I have to read and write every element of every struct one by one in a loop.
I’m more accustomed to the practice in a place like Stack Overflow, or the Linux forums, where you post a problem and people give you the answer. I don’t think that’s necessarily a good thing for people who are trying to learn, but obviously those forums are for a different purpose anyway.
You forced me to dig a lot further into debugging, and I learned a lot. Thank you.

zedshaw · September 10, 2018, 1:28pm

Awesome. Also remember you can take the address of anything with &: &files[i].

And yes, I try to make you do the work to solve it since that’s what’s expected when you actually code. SO is fine for things like terrible error messages or quick fixes, but for real problems like what you have they’re useless.