How do you know when code is good or bad? Or complicated or simple? Chances are you know it when you see it, but that isn’t very quantifiable.

I find that some simple software metrics can help with this. In particular, metrics can help identify the most complicated areas of the code. This is useful in a few different situations:

When I’m writing code, I can highlight complicated sections that should be refactored.
When doing code reviews, I can find the most complicated sections to focus on so that my review provides the most value.
When I’m exploring a new codebase, metrics are a tool for helping me navigate. For example, where are the complicated sections and how is the code distributed — i.e. is it split into many files, or just a few?

You can get this information with just a few simple metrics.

File count

How many files is this code spread across? If it’s one, that’s not so great. If it’s one thousand, that’s also probably not so great.

Lines of code

Lines of code is pretty straightforward. This is just the number of non-comment lines in each file. The cloc tool is really great for counting this in a variety of languages. It also works well on mixed-language source code repositories.
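To make the definition concrete, here is a minimal Python sketch of counting non-comment lines in C source. It is not how cloc works. The function name count_sloc is made up for illustration, blank lines are skipped as well (an assumption beyond the definition above), and the sketch assumes comment markers never appear inside string literals.

```python
import re


def count_sloc(source):
    """Count lines of C source that are neither blank nor comment-only.

    Rough sketch: comment markers inside string literals are not handled.
    """
    def keep_newlines(match):
        # Replace a block comment with just its newlines so that code
        # before and after it stays on separate lines.
        return "\n" * match.group(0).count("\n")

    without_block = re.sub(r"/\*.*?\*/", keep_newlines, source, flags=re.DOTALL)
    without_line = re.sub(r"//[^\n]*", "", without_block)
    return sum(1 for line in without_line.splitlines() if line.strip())


sample = (
    "/* header */\n"
    "int x; // counter\n"
    "\n"
    "/* multi\n"
    "   line */\n"
    "int main(void) {\n"
    "    return x;\n"
    "}\n"
)
print(count_sloc(sample))  # 4
```

For real projects a dedicated tool such as cloc is the better choice, since it handles the edge cases this sketch ignores.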

Complexity

Measuring the complexity of code is helpful because more complex code is more difficult to read, understand, and maintain. In general, you want to keep your code from being too complex. There are many ways to calculate complexity.
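One common approach is cyclomatic complexity, which is roughly one plus the number of decision points in a function. The strict variant (the SCC used in the KSF below) also counts each && and || inside a condition, as far as I understand it. The Python sketch below is only a rough token-counting approximation under that assumption: the function name is hypothetical, and it does not strip comments or string literals first, so a real tool will give more trustworthy numbers.

```python
import re

# Tokens treated as decision points in this approximation. Counting the
# boolean operators is what makes it "strict" under the stated assumption.
DECISION_PATTERN = re.compile(r"\b(?:if|for|while|case)\b|&&|\|\||\?")


def approx_strict_complexity(function_body):
    """Approximate strict cyclomatic complexity of one C function body."""
    return 1 + len(DECISION_PATTERN.findall(function_body))


body = """
    if (a && b) { x = 1; }
    else if (c || d) { x = 2; }
    for (i = 0; i < n; i++) { total += i; }
"""
# Decision points: if, &&, if, ||, for -> 5, plus 1 for the function itself.
print(approx_strict_complexity(body))  # 6
```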

Global count

Globals are your worst enemy, even more than function complexity. Globals are like complexity multipliers. When you’re reading code and you come across a global, the code becomes much harder to understand: its behavior can’t be determined without finding all the other uses of the global and understanding how it is used in those places. So counting up the globals, and where they are used, can be really effective in assessing complexity.

Koopman Spaghetti Factor (KSF)

Phil Koopman is an embedded systems expert focused on “dependable embedded systems.” This is something I can get behind.

He has an article where he “proposes” a new complexity metric called the “spaghetti factor.” The Koopman Spaghetti Factor is computed like this:

KSF = SCC + (Globals * 5) + (SLOC / 20)

KSF = Koopman Spaghetti Factor
SCC = The Strict Cyclomatic Complexity
Globals = The global variable count
SLOC = The number of non-comment source lines of code

This is to be computed for each module (source file) of the code. I really like this metric because it combines a bunch of different metrics – complexity, global count and lines of code – into a single metric.

The KSF also weights those different metrics according to importance (or impact on complexity). Globals are the most sinister, so they get a 5x multiplier. Complexity (SCC) is next, so it gets no multiplier. Lines of code are the least significant, so that count is divided by 20.
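As a minimal sketch, the formula can be written as a small Python function. The function and parameter names are hypothetical, and using integer division for the lines-of-code term is an assumption, since the formula itself does not say how to round.

```python
def spaghetti_factor(scc, global_count, sloc):
    """Koopman Spaghetti Factor for one module (source file).

    scc          -- strict cyclomatic complexity of the module
    global_count -- number of global variables
    sloc         -- non-comment source lines of code
    Integer division for sloc / 20 is an assumption of this sketch.
    """
    return scc + (global_count * 5) + (sloc // 20)


# A module with complexity 19, 2 globals, and 243 lines:
# 19 + (2 * 5) + (243 // 20) = 19 + 10 + 12
print(spaghetti_factor(scc=19, global_count=2, sloc=243))  # 41
```

Notice how heavily the weighting penalizes globals: a tiny 26-line file with 2 globals and a complexity of 2 still scores 13, most of it from the globals.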

Ravioli

I like this KSF metric so much that I wrote a tool -- named ravioli -- to help me calculate it. Ravioli is a simple-to-use tool for calculating complexity metrics — including the Koopman Spaghetti Factor (KSF) — on C source code.

If you run it in your project folder it will calculate the KSF on all of the C files it can find and give you the results sorted by complexity. Here is some example output:

> ravioli .
-------------------------------------------------------------------------------
File                                         complexity   globals   lines   ksf
-------------------------------------------------------------------------------
motobox\Sources\FreeRTOS\tasks.c                     12         0    1387    81
motobox\Sources\datapage.c                            1         0    1242    63
motobox\Sources\FreeRTOS\queue.c                     15         0     930    61
motobox\Sources\command_processor.c                  19         2     243    41
motobox\Sources\rtos.c                                5         6     135    41
motobox\Sources\vehicle_comm.c                        8         1     432    34
motobox\Sources\vehicle_comm_sim.c                   11         0     373    29
motobox\Sources\Start12.c                             1         1     337    22
motobox\Sources\can.c                                 7         0     289    21
motobox\Sources\iso15765.c                           12         0     187    21
motobox\Sources\flash.c                               7         0     268    20
motobox\Sources\j1979.c                              10         0     201    20
motobox\Sources\Cpu.C                                 2         2      40    14
motobox\Sources\leds.c                                2         2      26    13
motobox\Sources\log.c                                 3         1     117    13
motobox\Sources\rti.c                                 2         2      23    13

During calculation, the SCC is computed for each function. To determine the complexity for an entire file, the maximum value of all the functions in that file is used. To see which functions are the real culprits, use the -f option to see the complexity of all the functions in your project.
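The "maximum over functions" rule described above can be sketched in a few lines of Python. This is an illustration only, not ravioli's actual code; the function names and the per-function numbers are invented.

```python
def file_complexity(function_complexities):
    """File-level complexity: the largest SCC among the file's functions."""
    return max(function_complexities.values(), default=0)


def worst_functions(function_complexities, limit=3):
    """Return (name, scc) pairs, most complex first."""
    ranked = sorted(
        function_complexities.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:limit]


# Hypothetical per-function SCC values for a single source file.
example = {"parse_command": 19, "send_reply": 4, "reset_state": 2}

print(file_complexity(example))      # 19
print(worst_functions(example, 2))   # [('parse_command', 19), ('send_reply', 4)]
```

Using the maximum rather than a sum or an average means one badly tangled function is enough to flag the whole file, even when everything else in it is simple.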

Ravioli is designed especially for the C code used in embedded systems because it doesn't try to compile the C, and so won't get tripped up by non-standard extensions.

Be careful out there

I find these metrics to be good tools for identifying – and quantifying – complex code. This allows me to focus on these particular modules during development or a code review as potential areas for problems and refactoring.

But metrics are just tools. It’s easy to go crazy and get too focused on the numbers. Use some common sense and use the metrics to try and make the code better. The future developers working on your code (maybe you!) will appreciate it.