Subtlety in C

By Colin Walls

Embedded Software Technologist

April 16, 2020

I feel very strongly that a software developer has a great responsibility, which should be manifest in a priority: write code that is easy to understand. My reasoning is that clear code is generally less likely to have errors. More importantly, more time is spent maintaining code than writing it in the first place, and clear code is more maintainable. The question of execution efficiency cannot be forgotten, but for the most part, modern development tools do a great job of attending to that matter.

It is very easy to write convoluted code in C. It is a very expressive language, so the possibilities for obfuscation are endless. Equally, it is a small, compact language that makes writing clear code straightforward. I will explore some possibilities.

It is often the case that a particular operation may be expressed in multiple ways in C. For example:

x = x + 1;

is exactly equivalent to:

x += 1;

or

x++;

It is most likely that a modern compiler will generate exactly the same instructions for all three. It is a matter of choosing which is clearest, and I would probably choose the last, unless the increment could conceivably be something other than 1 (even though it is 1 on this occasion), in which case I would choose the second possibility.
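As a minimal sketch of that equivalence, the helpers below (the function names are mine, purely for illustration) apply each of the three forms to a copy of the same value and compare the results. Note that the equivalence holds because each form is used as a complete statement; if the value of the expression were used, x++ would yield the old value of x, whereas the other two yield the new one.

```c
/* Each helper adds 1 to its argument using one of the three forms.
 * The names are hypothetical, chosen only for this illustration. */
static int inc_long(int x)     { x = x + 1; return x; }
static int inc_compound(int x) { x += 1;    return x; }
static int inc_postfix(int x)  { x++;       return x; }

/* Returns 1 if all three forms produce the same result for 'start'.
 * 'start' is assumed to be below INT_MAX so the addition cannot overflow. */
int forms_agree(int start)
{
    return inc_long(start) == inc_compound(start)
        && inc_compound(start) == inc_postfix(start);
}
```

Calling forms_agree() with any value below INT_MAX should return 1.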

There are times when apparently equivalent code may harbor subtle differences. For example, something as simple as assigning a value to a variable may have pitfalls. We could write:

alpha = 99;

beta = 99;

gamma = 99;

Of course, this might be written more compactly like this:

alpha = beta = gamma = 99;

These appear to be completely equivalent; all three variables are set to the value 99. However, some more thought can reveal differences that might make one preferable over the other.

Firstly, a substandard compiler might generate code for the first construct like this:

mov r0, #99

mov alpha, r0

mov r0, #99

mov beta, r0

mov r0, #99

mov gamma, r0

Clearly R0 only needed to be loaded once. A good compiler would realize this and eliminate the redundant code. The second construct gives this hint, but a modern compiler should not need it.

There are three reasons why I strongly favor the first construct with the separate assignments:

1) These assignments should (each) be commented.

2) Although all three variables are being assigned the same value, it is possible that, in a future version of the code, this will not be the case. Separate assignments would thus be more maintainable.

3) The order in which the assignments are done is very clear: alpha is first and gamma last. The second construct is interpreted by a compiler as (alpha = (beta = (gamma = 99))); which in practice reverses the order. This would not matter for ordinary variables, but, if these were actually device registers, the order may be very significant.
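To illustrate these three points together, here is a minimal sketch of the separate-assignment style applied to device registers. The device_regs structure, the INIT_VALUE macro and the device_init() function are hypothetical names invented for this example; on real hardware the register block would sit at a fixed address taken from the device documentation.

```c
#include <stdint.h>

/* Hypothetical register block standing in for a real device.
 * 'volatile' tells the compiler that every write must actually happen. */
typedef struct {
    volatile uint32_t alpha;
    volatile uint32_t beta;
    volatile uint32_t gamma;
} device_regs;

#define INIT_VALUE 99u  /* same value for all three, for now */

/* One statement per register: each write can carry its own comment,
 * each value can change independently later, and the order of the
 * writes is exactly the order in which the statements appear. */
void device_init(device_regs *dev)
{
    dev->alpha = INIT_VALUE;  /* written first */
    dev->beta  = INIT_VALUE;  /* written second */
    dev->gamma = INIT_VALUE;  /* written last */
}
```

Because each volatile write is a separate statement, the compiler has to perform them in the order written, which is the property that matters when the targets are registers rather than ordinary variables.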

The bottom line is that developers should always endeavor to write code that is clear, unambiguous and maintainable.

I have heard it said that you should imagine that the person maintaining your code is a psychopath who knows your home address. Worse: that person might be you.

About the Author

Colin Walls is an Embedded Software Technologist in Mentor Graphics’ Embedded Software Division.

My work in the electronics industry spans nearly 40 years, almost exclusively with embedded software. I began developing software and managing teams of developers. Then, I moved to customer roles, including pre- and post-sales technical support, sales management and marketing. I have presented at numerous conferences, including Design West, Design East, Embedded World, ARM TechCon, and my work frequently appears on Embedded.com.
