A Low Level Curriculum for C and C++

Nov. 11, 2011

[In this reprinted #altdevblogaday-opinion piece, Gamer Camp's Alex Darby explains the idea of a low level programming curriculum, and why the disassembly window is "only daunting if you let it be".]

Background

In my last post, Why I became an Educator, I was bemoaning the lack of focus on low level understanding that seems to have afflicted Computer Science degree courses in recent times (at least in the UK…). As a result, I received a comment from someone called Travis Woodward that said:

There are plenty of students out there who are more than willing to dive into low level stuff, but it's hard to know where to start or even what to learn (the old 'you don't know what you don't know' problem). I've looked around for something approaching a low level curriculum, but they tend to just be lists of topics which aren't actually that helpful without context and suggested resources to start you off. The best intro I've found so far is a course called CS107: Programming Paradigms from Stanford on iTunesU, which has a good section on how C and C++ look to a compiler. So if any low level programmers want to put together a low level curriculum with suggested resources, then please do! :)

This is of course a commendable idea, and so I decided to get started on it…

Low Level Curriculum?

Before I go any further, I'd like to clarify what I mean by "Low Level Curriculum". During my time in the industry, I've helped architect and build a multi-platform once-Next-Gen-now-current-gen engine and tool chain, I've written plenty of shaders, tracked down countless hideous bugs by looking at disassembly and memory windows, hunted down the odd submission blocking threaded race condition, and on several occasions had to hand-unpick the broken stacks of core dumps from PS3 / X360 test decks to find bugs that only occur "in the wild".

But that doesn't make me a low level programmer – this is the kind of thing I'd expect anyone with my sort of experience to have done. I've never sat for hours poring over GPad or Pix captures, I've never really had to worry about stuff like patching fragment shaders or batched physics calculations on SPUs, or how to get the most from my AltiVecs, and I've certainly never had to re-code large chunks of code in assembler taking advantage of caching or sneaky DMA modes to get a few extra FPS out of anything. That is what low level programmers do: write platform specific, hardware optimised code, usually to get the best performance out of a machine.

This curriculum is not about learning to be a low level programmer. What it is about is gaining a solid understanding of the low level implementational underpinnings of C and C++ * – an understanding that I strongly feel should be the baseline for any programmer working in games. Over the course of however many posts this eventually takes up I'll be covering:

  1. Data types

  2. Pointers, Arrays, and References

  3. Functions and the Stack

  4. The C++ object model (several posts)

  5. Memory (again, several posts)

  6. Caches

Assuming you read and understand all of the posts in this series – and that I manage to communicate the information effectively – you should end up in a place where for any given "foible" of the language you understand not only that it exists but also – and most crucially – why it exists. For example, you may (or may not) know that virtual function calls don't behave virtually in constructors; before the end of this series of posts you will understand why they can't.

Again, just to be clear, I'm not necessarily talking about the same level of understanding of this as someone who writes compilers for their day job; but certainly a level of understanding that gives you a much better idea of what is likely to be going on at the level of the underlying engine that C++ sits on top of, and which consequently enables you to far better understand the implications of the code you write.

* N.B. to be 100% clear, C will be covered strictly as a subset of C++.

There Is No Source Code Available For The Current Location

Aieeee! Spare me the hexadecimal!

I'm sure the vast majority of programmers who use Visual Studio freak out the first few times they see this dialog. I know I did.

I learned to program primarily in a green screen (or orange, if the green screens were taken) dumb terminal Unix mainframe environment. You know, like they have in old films like Alien. The second- and third-year students had priority use of the XWindows machines (and the few Silicon Graphics workstations were for third-year graphics projects only), so dumb terminals were where I learned my trade. Even on the XWindows machines there was no programming IDE that I was aware of – I used EMACS and GNU make files, and the only debugger I had use of was command line GDB, which is not what you'd call user-friendly. I got by with std::cout.

When I graduated, I went from this world of bakelite keyboards, screen burn, and command lines into the bright world of Windows 95 development using Visual Studio 4 (slightly before DirectX and hardware accelerated graphics). When I first saw this dialog box, you can bet your life I freaked out – and why wouldn't I? Thanks to the language syntax and code architecture focused high level teaching methods employed by my university, I had no more idea of what went on behind the veil of the compiler than my brief forays into debugging with GDB had afforded me. I'd just got a degree from a well-respected University where they had altogether avoided teaching me about assembler as part of the main syllabus, and I had assumed it was because they were worried it was too much for my puny mind to deal with.

Suffice to say, I got over the freaking out part – but I still saw this dialog as a "No Entry" sign for far longer than I'd like to admit. I only really started to get over it a few years later, when I was working closely with someone who had got a job in games on the strength of their assembler programming. I had a crash, and they just casually leant over and clicked the "Show Disassembly" button. Then, equally casually, they showed me exactly why my code was crashing – explaining it in terms of how C++ maps to assembler – and told me how to fix it. This blew my mindgaskets three times because:

  1. this person was so casual about it

  2. disassembly clearly wasn't the black magic it appeared to be

  3. given it was so simple, I couldn't believe I hadn't been taught about the low level innards of C++ at University

Rending The Veil Of Disassembly

I really didn't realise how incredibly important this was until I had the pleasure of meeting a guy called Andy Yelland. If you already know Andy, then you will know exactly what I mean, but for those of you who have not met him, I will explain. Andy is one of those people who changes your perspective. He is more or less the polar opposite of the stereotypical ninja-level video game programmer: well dressed, professional, endlessly well-informed, friendly, funny, and socially adept.

However, the most amazing thing about Andy is the speed with which he can dissect a console core dump. He just sits there and calmly unpicks the stack, occasionally keeping a few notes about which register some value is in, or looking up the address of a function in a symbol file as he goes, and then in somewhere between 5 minutes and a few hours (depending on how tricky the issue is), he'll turn around and tell you exactly what the problem was. Not only that, but he'll happily do this in a codebase he's never even seen before – and even better, he's totally happy to sit and explain it all to you as he does it.

After sitting with Andy for a few dissections, I realised that whilst what he does seems like black magic, it is in fact anything but. It's about having an expert understanding of how C++ works at the assembly level, and bloody-mindedly applying that knowledge to reverse engineer the state of the system backwards from the current stack state (i.e. when the crash happened) to the point where the bad data was introduced.

Clearly this takes a lot of practice, and to get anywhere near as good as Andy at it will take anyone (who isn't Rain Man) years of their life. I'm not saying that I think everyone should be able to casually decipher disassembly representing code they didn't write – I certainly can't do that. What I
