Comments on a C Project

A Diverse Project Running Over Several Years

By Jordan Hrycaj
Date: Mon, Sep 14, 2015 (updated Thu, Sep 1, 2016)
Category: Architecture
Tags: c malloc programming threads

How I Got There

For the last eight-odd years (as of ~2015) I worked on a set of forensic tools where we amassed considerably more than a million lines of production code, mostly in C and supporting languages, and learned a lot.

Writing forensic software comes with additional boundary conditions which are uncommon in regular consumer tools. Yet I see that these additional requirements might be beneficial for a more general development discipline.

Boundary conditions

I developed PCI DSS scanners for credit cards and other data objects that ran against live machines on a variety of current and historic :) operating systems such as Win2k, Win7/8, Linux, Mac, Sparc, UHC, SCO. This led to the following requirements:

  1. Code base for compiling and running on a variety of OSes
  2. The tool chain must work on target systems (unless cross compiling)
  3. No installation (just run), small tools wanted

Changing file system access time stamps was, in general, not allowed. Nor must any trace of e.g. credit card numbers or other sensitive data remain in memory. This led to the following memory management and decoder programme architecture:

  4. Memory destructor sanitises data
  5. Programmes must work on data dumps the same as on live file systems

For a while during programme evolution, there was no restriction on data presentation. The tools ran on the CLI successfully for several years and most probably still do. When we needed a graphical user interface I decided to have it presented in the web browser, yet with a native look & feel. This led to:

  6. Embedded HTTP server
  7. State-of-the-art (~2011) web tools like JQuery/plugins
  8. Multi threading

Items 4 and 8 are the ones I find particularly interesting and worth discussing in some more detail below.

Approach

The implementation tool-chain was set up on a Linux cross-compile system using GCC/Clang/MinGW with an extra generic layer supporting Win/VS development for debugging. Distributed Git became a life saver when it came to test automation and target system assembly. As for the size, the CLI tools include a 500k BIN list when uncompressed and were about 350k in size. The graphical boolean expression scanner only just exceeded 1.4m.

A consequence of using C for portability was that I missed some niceties of functional programming like proper closures and transparent memory management which had to be sort of emulated.

For part 4, auto-cleaning memory after use seemed at first to be an innocent step. Apart from having to guard against C optimisers, teaching them not to optimise out the auto-clean code, the consequence for a system-wide solution became something like replacing malloc() and free() with

xmalloc (size) -> pointer     // allocate data block, initialise w/zero
xfree (pointer, size) -> void // sanitise/overwrite data block, free()

Now, the size argument of xfree() seemed to be a nuisance because the data size could have been recovered by other means. Incidentally, it led to a much better memory usage discipline even though the size was just (sizeof(*pointer)) in most cases. In debugging mode it was useful to check that the size argument passed was what it should be. On final destruction, the memory area of the data block was overwritten with repeated characteristic data patterns.
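A minimal sketch of what such a pair could look like is given here. It is not the project's code: the header layout, the helper wipe() and the fill pattern are assumptions made for illustration, and the size header is always kept rather than only in debugging mode. Writing through a volatile pointer is one common way to discourage the optimiser from dropping the overwrite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fill pattern; the text does not say which bytes were used. */
static const unsigned char XFREE_PATTERN[] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Size header stored in front of each block (an assumption of this sketch).
   The union keeps the user data suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} xhdr;

/* Overwrite through a volatile pointer so the stores are not treated
   as dead writes to memory that is about to be freed. */
static void wipe(void *p, size_t size)
{
    volatile unsigned char *v = p;
    for (size_t i = 0; i < size; i++)
        v[i] = XFREE_PATTERN[i % sizeof XFREE_PATTERN];
}

/* Allocate a zero-initialised block; returns NULL on failure. */
void *xmalloc(size_t size)
{
    xhdr *h = calloc(1, sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;               /* user data starts after the header */
}

/* Sanitise the block, then release it. The caller states the size. */
void xfree(void *p, size_t size)
{
    if (p == NULL)
        return;
    xhdr *h = (xhdr *)p - 1;
    assert(h->size == size);    /* debug check, compiled out with NDEBUG */
    wipe(h, sizeof *h + size);
    free(h);
}
```

A typical call pair would be `rec = xmalloc(sizeof *rec)` followed later by `xfree(rec, sizeof *rec)`, which is where the size discipline mentioned above comes from.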

All this helped eliminate most stale/unallocated memory problems because a stale data area is easily identified in a debugger by its characteristic pattern. The ease of debugging was unexpected. Only a few pathological cases needed to be resolved with valgrind.

Memory leaks had to be hunted differently, by maintaining a list of active data blocks when in debugging mode. Identifying data blocks by serial numbers helped to overcome memory address variations between test runs.
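The bookkeeping described here could be sketched roughly as follows. All names (track_add, track_remove, track_report) are hypothetical, and a real allocator would call them from its allocate and free routines in debugging builds only. The serial number simply counts allocations, so the same block gets the same number in every run of a deterministic test, whatever its address.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One entry per live data block (debug builds only in a real system). */
typedef struct track_node {
    struct track_node *next;
    void *block;
    size_t size;
    unsigned long serial;       /* stable across runs, unlike the address */
} track_node;

static track_node *track_head;
static unsigned long track_serial;

/* Record a new block; returns its serial number, or 0 on failure. */
unsigned long track_add(void *block, size_t size)
{
    assert(block != NULL);
    track_node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->block = block;
    n->size = size;
    n->serial = ++track_serial;
    n->next = track_head;
    track_head = n;
    return n->serial;
}

/* Forget a block; returns 1 if it was registered, 0 otherwise. */
int track_remove(void *block)
{
    for (track_node **pp = &track_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->block == block) {
            track_node *n = *pp;
            *pp = n->next;
            free(n);
            return 1;
        }
    }
    return 0;
}

/* List the blocks that are still alive; returns how many there are. */
size_t track_report(FILE *out)
{
    size_t count = 0;
    for (track_node *n = track_head; n != NULL; n = n->next, count++)
        fprintf(out, "leak: block #%lu, %zu bytes\n", n->serial, n->size);
    return count;
}
```

Calling track_report() at programme exit then lists whatever was never freed, and a serial number from the report can be used as a breakpoint condition on the next run.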

For part 8 mentioned above I needed a portable multi threaded solution. There was no such necessity with the CLI tools as the programmes were mainly single purposed by design. A few call back function slots were enough for showing keep-alive and state messages.

For the GUI system, I chose a state-driven solution in order to avoid bare threads & locks. I developed an embedded HTTP server with an event-driven call back interface for handling communication events (pages or page fragments, say). So I employed the select() function from the Berkeley socket suite as a non-preemptive task scheduler. It became the top-level scheduler for an otherwise idle system waiting for instructions.
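To illustrate the idea of select() as a non-preemptive scheduler, here is a small POSIX-only sketch. The names sched_register, sched_step and count_bytes, the fixed slot table and the restriction to read events are all assumptions of this example, not the interface of the server described above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_SLOTS 16

typedef void (*event_cb)(int fd, void *ctx);

typedef struct { int fd; event_cb cb; void *ctx; } slot;

static slot slots[MAX_SLOTS];
static int nslots;

/* Attach a call-back to a descriptor; returns 0 on success, -1 on error. */
int sched_register(int fd, event_cb cb, void *ctx)
{
    assert(cb != NULL);
    if (nslots >= MAX_SLOTS || fd < 0 || fd >= FD_SETSIZE)
        return -1;
    slots[nslots++] = (slot){ fd, cb, ctx };
    return 0;
}

/* One scheduler step: wait until a descriptor is readable, then run the
   matching call-backs one after another. A NULL timeout blocks.
   Returns the number of call-backs run, 0 on timeout, -1 on error. */
int sched_step(struct timeval *timeout)
{
    fd_set rd;
    int maxfd = -1;

    FD_ZERO(&rd);
    for (int i = 0; i < nslots; i++) {
        FD_SET(slots[i].fd, &rd);
        if (slots[i].fd > maxfd)
            maxfd = slots[i].fd;
    }
    int n = select(maxfd + 1, &rd, NULL, NULL, timeout);
    if (n <= 0)
        return n;

    int ran = 0;
    for (int i = 0; i < nslots; i++) {
        if (FD_ISSET(slots[i].fd, &rd)) {
            slots[i].cb(slots[i].fd, slots[i].ctx);
            ran++;
        }
    }
    return ran;
}

/* Example call-back: consume pending bytes and add their count to *ctx. */
static void count_bytes(int fd, void *ctx)
{
    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0)
        *(size_t *)ctx += (size_t)n;
}
```

The top-level loop of an idle system is then little more than `for (;;) sched_step(NULL);`. Since each call-back runs to completion before the next one starts, no locks are needed between them.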

In the case where a background action (e.g. a forensic scanner) was running, a simple call-back API, regularly polled for actions, turned the embedded HTTP server’s select() loop into a second-level scheduler and now allowed other services to run while the background service was suspended (as calling function or instance).
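A rough sketch of this polling arrangement, under the assumption that the background job calls a hook after every few units of work: the names scan_blocks, service_poll and service are hypothetical, and a single descriptor with a one-byte 'q' stop command stands in for the HTTP server's event handling.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Poll hook: returns non-zero to ask the background job to stop. */
typedef int (*poll_cb)(void *ctx);

/* Hypothetical long-running job; it hands control to the hook after
   every poll_every blocks. Returns the number of blocks processed. */
size_t scan_blocks(size_t nblocks, size_t poll_every, poll_cb poll, void *ctx)
{
    assert(poll_every > 0);
    size_t done = 0;
    while (done < nblocks) {
        /* ... scan one block here ... */
        done++;
        if (poll != NULL && done % poll_every == 0 && poll(ctx) != 0)
            break;
    }
    return done;
}

typedef struct { int fd; int polls; } service;

/* Second-level scheduler step: a non-blocking select(), so pending
   requests are served while the scanner is suspended in this call. */
static int service_poll(void *ctx)
{
    service *s = ctx;
    fd_set rd;
    struct timeval zero = { 0, 0 };   /* return at once if nothing is pending */

    s->polls++;
    FD_ZERO(&rd);
    FD_SET(s->fd, &rd);
    if (select(s->fd + 1, &rd, NULL, NULL, &zero) <= 0)
        return 0;

    char cmd;
    if (read(s->fd, &cmd, 1) == 1 && cmd == 'q')
        return 1;                     /* stop request */
    return 0;
}
```

The scanner never sees a thread or a lock. It only knows that it has to call the hook often enough for the user interface to stay responsive.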

I found it more effective to work with the absolutely deterministic event/call back interface than to debug synchronisation points with opaque threading. Additional call back hooks were interspersed where needed.

In the past, neither POSIX nor Windows threading had convinced me much, because complexity increased with the size of the code base. Once I understood the state machine for my particular problem I was always ready to re-factor without side considerations in order to regain full control. This might not hold in general though and applies only to the particular challenges discussed here.

Finally

I had many helpers running this project, people as well as ideas. The paradigms used, such as MVC and functional programming, go back well into the ‘70s.

The way we debugged memory might not be useful for languages with GC available, but there is considerable demand for programming under time/memory restrictions without the luxury of a GC. Also, state machine programming vs. threads & locks might be worth considering in many cases.

If I had the choice today of a programming system regardless of legacy system portability I would probably choose Go(lang). What I wrote and implemented in C is much more elegantly and naturally stated there in terms of interfaces, streams/data channels, threads, and first class functions without giving up the possibility of efficiently programming data and network protocols. Also, Go(lang) comes with GC and manual memory management.

On the other hand Go(lang) is only a prominent species of a newer generation of programming systems competing with C/C++. I just happen to have played with this new species more than with others. I also played with C++ but soon gave up on it because of legacy system portability and the additional effort needed to keep it a small system built from scratch. Aiming big from the start, though, might have led me to use bare-metal techniques (controlling the linker) to un-bloat the binaries to make it work.