Ex34 DArray: why bother defining element_size?

florian · December 13, 2019, 3:25pm

In the DArray implementation in the book, why do you bother specifying element_size although the struct contains only pointers?

If we allocate memory via DArray_new we get a chunk of that size, but we can just as well push a pointer to an arbitrarily sized chunk. There’s no safeguard against wrongly sized elements.
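
To make the question concrete, here is a minimal sketch of the pattern being described. It is not the book's code: `MiniArray`, `mini_create`, `mini_new`, `mini_push`, and `mini_destroy` are hypothetical names, and the fixed capacity is a simplification. The allocator helper uses `element_size`, while the push function accepts any pointer and cannot check what it points to.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a pointer-based dynamic array. */
typedef struct MiniArray {
    size_t count;        /* slots currently in use */
    size_t capacity;     /* slots allocated */
    size_t element_size; /* bytes per element handed out by mini_new */
    void **contents;     /* the array itself stores only pointers */
} MiniArray;

MiniArray *mini_create(size_t element_size, size_t capacity)
{
    if (element_size == 0 || capacity == 0) return NULL;

    MiniArray *a = malloc(sizeof(*a));
    if (!a) return NULL;

    a->contents = calloc(capacity, sizeof(void *));
    if (!a->contents) {
        free(a);
        return NULL;
    }

    a->count = 0;
    a->capacity = capacity;
    a->element_size = element_size;
    return a;
}

/* Allocate one zeroed element of the size the array was created with. */
void *mini_new(const MiniArray *a)
{
    if (!a) return NULL;
    return calloc(1, a->element_size);
}

/* Store a pointer. Nothing here can verify how large the pointee is. */
int mini_push(MiniArray *a, void *el)
{
    if (!a || a->count >= a->capacity) return -1; /* no growth in this sketch */
    a->contents[a->count++] = el;
    return 0;
}

/* Free the stored elements (assumed heap-allocated) and the array itself. */
void mini_destroy(MiniArray *a)
{
    if (!a) return;
    for (size_t i = 0; i < a->count; i++) free(a->contents[i]);
    free(a->contents);
    free(a);
}
```

Elements obtained through `mini_new` always have the recorded size; elements pushed from anywhere else only have it by convention.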

zedshaw · December 15, 2019, 10:56am

I don’t have the code in front of me, but in general, if you don’t include the size of everything in C, you end up with serious bugs. One fallacy common among C programmers is this:

1. C allows a programmer to do almost anything.
2. A rogue programmer writing new code in my project can therefore violate any protections I use.
3. Therefore I should not bother with any defensive programming.

The fallacy is in #2. By assuming that the only people able to cause violations are rogue programmers with direct access to the code, you ignore the vast array of people who can alter a program from the outside (aka hackers). You could say this is kind of an inverse No True Scotsman, or maybe a slippery slope fallacy, or possibly a strange tautology.

The problem with C (and it took me close to 25 years to see this) is that its memory model, as justified by the vast number of undefined behaviors around memory, is externally exploitable by an end user, while it is simultaneously defended as valid because only rogue programmers are used to justify the undefined behaviors.

For example, if you say that C is a terrible language because \0-terminated strings make buffer overflows easy, you’ll find C enthusiasts will bust out some code sample that shows you can never defend against it anyway…against a rogue programmer though. They never write a code sample that is externally exploitable through a string exploit, because that would prove the point. Instead, they say:

1. Your complaint about C strings is undefined behavior.
2. This undefined behavior is correct because a rogue programmer can do whatever they want.
3. Therefore it is also correct for external input users (implying they can also do whatever they want).

This then demonstrates the actual flaw in C’s design: It gives everyone the power of a rogue programmer.

In short, C’s design makes it possible to exploit memory externally through undefined behaviors, and no C program can defend against these attacks, because the sheer volume of undefined behaviors compounds exponentially in a code base that assumes nobody should defend against any exploits since a rogue programmer can cause UB anyway.

And that’s why I try my best to stop externally accessible exploits by including the size of everything, while ignoring all arguments about rogue programmers doing the hacking, because that’s a straw man argument used to justify shitty code that allows undefined behavior to be exploited externally.
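
As a rough illustration of what "including the size of everything" can look like for external input, here is a minimal sketch. The helper `bounded_copy` and its parameters are hypothetical, not something from the book; it assumes the caller knows both the destination capacity and the input length, and it chooses to reject oversized input rather than truncate it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src_len bytes into dst only if they fit
 * together with a terminating '\0'. Returns 0 on success, -1 otherwise.
 */
int bounded_copy(char *dst, size_t dst_size, const char *src, size_t src_len)
{
    if (!dst || !src || dst_size == 0) return -1;

    if (src_len >= dst_size) {
        dst[0] = '\0'; /* leave dst as a valid empty string */
        return -1;     /* reject oversized input instead of overflowing */
    }

    memcpy(dst, src, src_len);
    dst[src_len] = '\0';
    return 0;
}
```

Because both sizes travel with the data, the function never has to trust that the input is terminated or that it fits.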

florian · December 15, 2019, 11:39am

OK, that makes sense. Thanks for taking the time to write this!

Now, just to make sure I really get it: In the case of a data structure like this you would always provide functions that do work safely, knowing that you can’t stop other programmers from bypassing those functions? Then anyone using the structure in the right way can write programs that are safe against attacks from the outside? (EDIT: Well, somewhat safer, I mean…)
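
For what it's worth, the kind of "functions that do work safely" meant here could look roughly like the sketch below. `PtrArray`, `ptr_array_get`, and `ptr_array_set` are hypothetical names for illustration only. Every access goes through a bounds check, although code that touches `contents` directly can still bypass it.

```c
#include <stddef.h>

/* Hypothetical pointer array that carries its own capacity. */
typedef struct PtrArray {
    size_t capacity;  /* number of slots in contents */
    void **contents;  /* storage for the element pointers */
} PtrArray;

/* Return the pointer at index i, or NULL if the index is out of range. */
void *ptr_array_get(const PtrArray *a, size_t i)
{
    if (!a || !a->contents || i >= a->capacity) return NULL;
    return a->contents[i];
}

/* Store el at index i. Returns 0 on success, -1 if out of range. */
int ptr_array_set(PtrArray *a, size_t i, void *el)
{
    if (!a || !a->contents || i >= a->capacity) return -1;
    a->contents[i] = el;
    return 0;
}
```

An index that arrives from outside the program (a file, a socket, user input) is rejected here instead of reading or writing past the end of the allocation.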