Pimp My Exercise Bike!

Background

A few months ago my legs started objecting to my running routine and I decided to purchase an exercise bike. Never having owned such a device in the past I wasn’t sure what to look for or what to expect. Somewhat naively I assumed that the bike would be similar to a treadmill, with various exercise programs to alleviate boredom and provide some variety.

As it turns out, the bike I purchased is as simple as they come. The lacklustre monochrome display shows the current speed, distance and estimated calories burnt. The resistance is controlled via a knob connected to a screw, which in turn pushes down on a magnet bar hovering over the flywheel. An exercise program is therefore a completely manual affair – use a timer on your watch/smartphone to beep after a certain interval and then rotate the knob. Not cool.

Recovering from my disappointment I decided to turn this bike into what I expected it to be.

Hardware

The exercise bike is a fairly simple device – there is just one “input” (the rotational speed) and one “output” (the resistance). The first task was to figure out how the input worked. Two wires come out from the pedal area and connect to the console. An oscilloscope showed that the signal carried by these wires is digital rather than analog – it drops to 0 every time the pedals reach a certain position, marking a single rotation. The rotational speed thus needs to be calculated by measuring the time it takes to complete a single rotation. With only one reading per rotation this is not very accurate, and it poses problems when stopping, starting and changing speeds. The distance and linear speed can also be calculated from the number of rotations and the time, though these measurements are just an extrapolation of how far and how fast a real bicycle would go under similar conditions. I followed the same approximation used by the console that came with the bike, which translates 4 RPM to 1 km/h. On the plus side, the simplicity of the input means that it only needs two pins on the GPIO header: one standard GPIO configured as an input, and one ground.
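As a rough sketch of that calculation, assuming the GPIO driver can provide the time in microseconds between two consecutive pulses: the function names and the 3-second "stopped" timeout below are illustrative assumptions, not taken from the project's code.

```c
#include <stdint.h>

/* Ratio used by the bike's original console: 4 RPM corresponds to 1 km/h. */
#define RPM_PER_KMH 4.0

/* Assumed timeout: with no pulse for this long, treat the pedals as stopped. */
#define STOP_TIMEOUT_US 3000000u

/*
 * Rotational speed from the time between two consecutive pulses.
 * One pulse marks one full rotation, so RPM = 60 seconds / period.
 */
double rpm_from_period(uint64_t period_us)
{
    if (period_us == 0 || period_us > STOP_TIMEOUT_US) {
        return 0.0;
    }
    return 60.0 * 1000000.0 / (double)period_us;
}

/* Equivalent linear speed of a "real" bicycle, using the console's ratio. */
double kmh_from_rpm(double rpm)
{
    return rpm / RPM_PER_KMH;
}

/*
 * Distance for a number of rotations. At R RPM the bike covers R/4 km in an
 * hour, which takes 60*R rotations, so each rotation is worth 1/240 km.
 */
double km_from_rotations(uint64_t rotations)
{
    return (double)rotations / (RPM_PER_KMH * 60.0);
}
```

A period of one second per rotation gives 60 RPM, which the console's ratio turns into 15 km/h. The timeout is one simple way of handling the "stopping" problem: without it the last measured speed would be reported forever once the pulses cease.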

Controlling the resistance of the bike turned out to be a much harder problem to solve. I spent a lot of time trying to mimic the manual operation of the screw that pushes down on the magnet bar and relies on a powerful spring to bring the bar back up when the screw is rotated the other way. After many failed attempts I decided to do something else. The rig consists of

  • 3/8″ threaded rod, slotted so that it doesn’t turn;
  • 3D-printed hinge connecting the rod to the magnet bar;
  • 3D-printed bevel gear which turns a 3/8″ nut, causing the threaded rod to move up and down;
  • 12 V low-speed, high-torque geared motor;
  • L298 H-bridge to change the direction of the motor.

Slotted threaded rod, with the hinge to connect to the magnet bar.

Since the motor cannot be controlled accurately, I made two holes in the gear and positioned an infrared LED below it and an infrared receiver above it. This lets the controller know when the gear has completed half a turn, which is taken to be a single level of resistance.
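The control logic this enables can be sketched roughly as follows. This is not the project's code: the type and function names are made up for illustration, and the sketch assumes that turning the motor one way raises the level and the other way lowers it.

```c
/* Direction the H-bridge should drive the motor in (illustrative names). */
typedef enum { MOTOR_STOP, MOTOR_RAISE, MOTOR_LOWER } motor_dir_t;

typedef struct {
    int current_level;  /* level the rig is believed to be at */
    int target_level;   /* level requested by the program or the buttons */
} resistance_t;

/* Decide which way the motor has to turn to reach the target level. */
motor_dir_t resistance_direction(resistance_t const * const r)
{
    if (r->current_level < r->target_level) {
        return MOTOR_RAISE;
    }
    if (r->current_level > r->target_level) {
        return MOTOR_LOWER;
    }
    return MOTOR_STOP;
}

/*
 * Called each time the IR receiver sees the LED through a hole in the gear,
 * i.e. after every half turn. Updates the level according to the direction
 * the motor was turning in, and returns the direction it should turn in next
 * (MOTOR_STOP once the target has been reached).
 */
motor_dir_t resistance_on_half_turn(resistance_t * const r)
{
    switch (resistance_direction(r)) {
    case MOTOR_RAISE:
        r->current_level++;
        break;
    case MOTOR_LOWER:
        r->current_level--;
        break;
    case MOTOR_STOP:
        break;
    }
    return resistance_direction(r);
}
```

Going from level 3 to level 5, for example, takes two detections: after the first the motor keeps turning, after the second it is told to stop.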

The resistance control rig.

The next step was to replace the original console. The bike is controlled by a RaspberryPi 4B with 4GB of RAM, which is probably overkill for what is required in this case. The display is a 7″ 1024×600 LCD. Both are mounted on a piece of plywood attached to the bike. The display is covered by a sheet of Lexan, which also holds the buttons used to select, start and stop an exercise, as well as to manually change the resistance level if needed.

The console.

Software

The RaspberryPi is powered by QNX, with drivers I wrote for GPIOs, the frame buffer, and button controls. The bike-specific software is divided into two components:

  • The controller resource manager
  • The console user interface

This division allows for a different UI to be used instead of the default one, and I have a plan (or, rather, my son does) to program games that use the input from the bike for the controls.

The resource manager has one thread for handling input (e.g., read sensor information) and output (e.g., set resistance level) messages, and a high priority thread for acting on the various GPIOs.

The user interface program is based on SDL, hacked to use the frame buffer driver for the display, and the four buttons from the bike’s console for the “keyboard”. It loads exercise programs from the SD card. Each program is a text file with a simple representation of the exercise as a set of intervals, each consisting of a duration and a resistance level.
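To give an idea of how little is needed to read such a file, here is a minimal sketch. The file format is an assumption made for the example (one interval per line, duration in seconds followed by the resistance level); the actual format used by the project may differ, and the names are hypothetical.

```c
#include <stdio.h>

typedef struct {
    unsigned duration_s;  /* length of the interval, in seconds */
    unsigned level;       /* resistance level for the interval */
} interval_t;

/*
 * Parse one line of the assumed format "<seconds> <level>".
 * Returns 1 on success and 0 for anything that does not start with two
 * unsigned numbers (blank lines, comments, malformed input).
 */
int parse_interval(char const * const line, interval_t * const iv)
{
    return sscanf(line, "%u %u", &iv->duration_s, &iv->level) == 2;
}

/*
 * Read up to max intervals from the file at path.
 * Returns the number of intervals read, or -1 if the file cannot be opened.
 */
int load_program(char const * const path, interval_t * const ivs, int const max)
{
    FILE * const fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    char line[64];
    int count = 0;
    while (count < max && fgets(line, sizeof(line), fp) != NULL) {
        if (parse_interval(line, &ivs[count])) {
            count++;
        }
    }

    fclose(fp);
    return count;
}
```

With this format a line such as "300 5" would mean five minutes at resistance level 5.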

The code for both components is available here.

Conclusion

To say that this was a fun project would be a serious understatement. I tremendously enjoyed every part of it, from milling the threaded rod (donations to buy a real milling machine would be appreciated) to printing the gears, experimenting with the IR LED and receiver, and writing the software. The fact that it works wonderfully well as a programmable exercise bike is an added bonus.

The complete project, with a sheet metal cover for the control rig.

Stop checking for NULL pointers!

One day you are tasked with writing a function called dup_to_upper(). The function takes a NUL-terminated C string, and returns a newly-allocated string with a copy of the original (similar to strdup()), but in which all lower-case letters have been converted to their upper-case counterparts (the function is essential for implementing a shouting mode plugin for some social media website).

The task is pretty straightforward, and in little time you come up with a first version:

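#include <ctype.h>
#include <stdlib.h>
#include <string.h>

char *
dup_to_upper(char const * const src)
{
    /* Include the terminating NUL in the length. */
    size_t len = strlen(src) + 1;
    char * const dst = malloc(len);

    if (dst == NULL) {
        return NULL;
    }

    for (size_t i = 0; i < len; i++) {
        /* toupper() requires a value representable as unsigned char. */
        dst[i] = (char)toupper((unsigned char)src[i]);
    }

    return dst;
}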

You take pride in your work, and congratulate yourself on checking malloc() for an out-of-memory error. You’ve done well.

But then it strikes you that there is another error condition you hadn’t covered: what if the string passed to the function is a NULL pointer? Sooner or later the function is going to dereference that pointer and most likely crash the program that called the function. You quickly add some code to address this concern:

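#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

char *
dup_to_upper(char const * const src)
{
    /* The newly added "defensive" check. */
    if (src == NULL) {
        errno = EINVAL;
        return NULL;
    }

    /* Include the terminating NUL in the length. */
    size_t len = strlen(src) + 1;
    char * const dst = malloc(len);
    if (dst == NULL) {
        return NULL;
    }

    for (size_t i = 0; i < len; i++) {
        /* toupper() requires a value representable as unsigned char. */
        dst[i] = (char)toupper((unsigned char)src[i]);
    }

    return dst;
}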

Surely, you have done very well now, and overall increased the average quality of C code in the world. But have you?

Let’s take a closer look at the newly added check. Why is the function testing the pointer against NULL? Some functions check for NULL because the interface specifies it explicitly as a possible pointer value, often as a way to tell the implementation not to look at the value. For example, the sigaction() function can have either its act argument as NULL, in which case a new action should not be installed, or its oact argument as NULL, to indicate that the caller is not interested in knowing what the old action was.

This is not the case here, though: NULL has no special semantic meaning for dup_to_upper(). Our coder added the check not because it was specified by the interface, but because it is considered by some to be good defensive programming. The check for NULL was added in order to catch an invalid pointer. I contend that such a check is wrong for two reasons:

  1. NULL is far from the only invalid pointer.
  2. NULL is not always an invalid pointer.

Let’s start by examining what it means for a pointer to be invalid. The question can be answered according to multiple criteria:

  1. Non-canonical addresses: Many architectures, especially in the 64-bit world, divide the range of addresses into “canonical” and “non-canonical”. A non-canonical address is one that the architecture cannot handle. For example, in the x86-64 architecture, only 48 bits out of the possible 64 address bits can be used to generate a canonical address. The range is split in two, which means that canonical addresses are those in the ranges 0x0000000000000000-0x00007fffffffffff and 0xffff800000000000-0xffffffffffffffff.
  2. Unmapped addresses: not all canonical addresses have a mapping installed in the process’ address space, and those that do can be subjected to access restrictions. When a process starts, mappings are created for each code and data segment required by the binary and any linked libraries. Further calls to map memory (such as using the POSIX mmap() function) map more ranges into the address space. It is rare, however, for a process to use all of the available addresses in its address space. Any attempt to read from, write to or execute an unmapped address results in a translation failure, which typically ends up aborting the process.
  3. Inaccessible addresses: these are addresses for which there is a mapping in the address space, but the page tables enforce some form of restriction on access. Restrictions can prevent an address from being accessible at all (neither read, nor write), allow it to be read but not written, or prevent it from being executed. Additionally, most architectures provide a way to designate addresses as only accessible from higher privilege levels.
  4. Semantically-wrong addresses: an address is used to identify an object in memory by its location. When a function is called with a pointer as an argument the caller intends for the function to work on that object. But the same coder that made the mistake of calling the function with a NULL pointer can make the mistake of calling the function with a pointer to a different object in memory.
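To make the first category concrete, here is a minimal sketch of a canonical-address test for the 48-bit x86-64 case described above. The function name is made up for the example, and the check only applies to that particular address width.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With 48 usable address bits, an address is canonical when bits 63..47 are
 * all copies of bit 47, i.e. the top 17 bits are either all 0 or all 1.
 */
bool is_canonical_48(uint64_t const addr)
{
    uint64_t const top = addr >> 47;
    return top == 0 || top == UINT64_C(0x1ffff);
}
```

Note that address 0 passes this test, while a pointer such as 0x0000800000000000 does not. A check against NULL says nothing about either.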

The last category is the most interesting one. When I call dup_to_upper() I am expected to provide it with a pointer to a C string for which I want an upper-case copy. For example, I can ask the user for a string and then convert it:

char *str = NULL;
size_t len = 0;
getline(&str, &len, stdin);
char * const upper = dup_to_upper(str);

But what if I write instead

char * const upper = dup_to_upper(&str);

The pointer I gave no longer identifies the string I intended to pass to the function, but an object in memory corresponding to the address of that pointer. In this particular case the compiler will probably complain, but since C is weakly-typed it is easy to make such mistakes, for example with functions that take void * arguments.

Here is another example:

char *str1 = NULL; 
char *str2; 
size_t len = 0; 
getline(&str1, &len, stdin); 
char * const upper = dup_to_upper(str2);

Since str2 was not initialized, it may point anywhere. If I’m lucky it holds an address that is not canonical, not mapped or not accessible. If I’m unlucky it points at some arbitrary mapped address (at least dup_to_upper() doesn’t change the contents of the memory at which str2 points!). Again, the compiler may complain here, if it can detect that str2 was not initialized, but how about:

char *str = NULL; 
char *password = "Password1!"; 
size_t len = 0; 
getline(&str, &len, stdin); 
char * const upper = dup_to_upper(password);

There is no reason for the compiler to complain now – the argument passed to the function is syntactically correct. It’s just not what I wanted. You may claim that I have a bug in my code, but that claim applies equally well to passing a NULL pointer as the argument.

So what makes NULL special in this case? Some people who argue for NULL checks contend that it is a common-enough mistake to deserve special handling. It is true that static variables in C are initialized to 0 if no other value is given, and that malloc() returns NULL if it fails (and most programmers do not check for malloc() failures). On the other hand, automatic variables in C hold arbitrary values if not initialized, and the mmap() call returns MAP_FAILED on failure. This value, as will be explained below, is not a NULL pointer. Even when using static variables, or memory allocated with malloc(), invalid pointers can often have non-NULL values, e.g.:

struct foo_s {
    int a;
    int b;
};

struct foo_s * const foo = malloc(sizeof(*foo));
get_number(&foo->b);

If malloc() fails, then &foo->b is the address 0x4 (assuming a 4-byte int). This may seem like a contrived example, but this pattern happens quite often in the real world.
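The arithmetic behind that address can be shown without actually dereferencing anything. The helper below is purely illustrative: it computes what typical implementations produce for &foo->b given the numeric value of foo (formally, evaluating the expression on a NULL pointer is undefined behaviour, which is why the sketch uses offsetof() instead).

```c
#include <stddef.h>
#include <stdint.h>

struct foo_s {
    int a;
    int b;
};

/*
 * Numeric address of the member b for a structure located at base:
 * the base address plus the offset of the member within the structure.
 */
uintptr_t address_of_b(uintptr_t const base)
{
    return base + offsetof(struct foo_s, b);
}
```

For a base of 0 the result is sizeof(int), which is neither NULL nor a usable address, so a callee that tests its argument against NULL will happily go on to use it.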

But surely, you now say, there is nothing wrong with the added check for NULL? Granted, it doesn’t catch all cases of invalid pointers, but it at least catches some.

The problem with the extra check is that it creates a contract in the API that the function cannot abide by. When it comes to documenting the new function the documentation will likely have a section about error codes, in which the author will say that the function returns NULL and sets errno to EINVAL if the argument is invalid. But, as we have just seen, that is only true for a very small subset of invalid arguments. The function will not return NULL and set errno to EINVAL if the given argument is 0x4, MAP_FAILED, an address obtained from mmap() with PROT_NONE or an address in the privileged range of the address space (many operating systems keep a range of the address space for use by the kernel). In all of these cases the function will likely result in an access violation and terminate the process. Not only does the function not fulfill its contract with the caller, it behaves differently for different values of invalid pointers: a NULL pointer causes it to return with an error, a non-canonical/unmapped/inaccessible pointer causes it to abort the process, and a semantically-invalid address may cause it to return a string with unexpected content.
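The gap between the documented contract and the actual behaviour can be illustrated with a few of the invalid values mentioned above. The helper names are hypothetical; the first function stands for the only validation the revised dup_to_upper() performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* The only test dup_to_upper() applies to its argument. */
bool passes_null_check(void const * const ptr)
{
    return ptr != NULL;
}

/*
 * Count how many of a few sample invalid pointers get past the check:
 * NULL itself, the 0x4 produced by &foo->b on a failed malloc(), and the
 * error value returned by mmap().
 */
size_t count_undetected(void)
{
    void const * const invalid[] = {
        NULL,
        (void const *)(uintptr_t)0x4,
        MAP_FAILED,
    };
    size_t undetected = 0;

    for (size_t i = 0; i < sizeof(invalid) / sizeof(invalid[0]); i++) {
        if (passes_null_check(invalid[i])) {
            undetected++;
        }
    }

    return undetected;
}
```

Only the first of the three sample values is rejected. The other two would be handed to strlen() as if they were perfectly good strings.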

The discussion so far assumed that NULL is always an invalid pointer. But is it? In C, a null pointer is obtained by converting the constant 0 to a pointer type, and on practically all current platforms it is represented by address 0. Earlier I mentioned that the mmap() function returns the constant MAP_FAILED in case of an error, and that this constant is not 0. There is a very good reason for that: mmap() should be able to return the address 0, as this address may, in fact, be valid. It is not common for address 0 to be valid, but it can happen. For example:

  1. Some boards may have RAM starting at physical address 0. Before the MMU is turned on, or if the board is running with the MMU disabled (or is a micro-controller without an MMU) then software needs to be able to access that address. For example, it should be possible to use memset() to initialize the memory at that address.
  2. I once implemented a system in which a second copy of the operating system was started on one core after the first system booted. This system ran on an ARMv7 board without a hypervisor. In order to initialize the second instance it was necessary to place an exception vector at a location that did not conflict with the exception vector of the first instance. In the ARMv7 architecture the exception vector can be placed at one of two addresses, one of which is 0. Consequently, the process that started the second instance had to map virtual address 0 in order to place the exception vector there, prior to handing over control to the second instance’s kernel.

In conclusion, when passing a pointer to a C function, there is only one semantically-correct value for that pointer, as opposed to 2^64-1 invalid values. NULL may or may not be one of these invalid values. Let the MMU and operating system handle it.

RGB Matrix: Part 2 – Using a Timer Interrupt

In Part 1 we used an infinite loop to draw images on the RGB matrix. The time between drawing one pair of lines and the next was filled by spinning, which is rarely a good option. Instead of spinning, we can use a timer interrupt to draw each pair of lines at regular intervals, saving on power and allowing the CPU either to sleep or to do something else while it is not drawing.

The code for Part 2 uses the Raspberry Pi’s system timer, which works at a frequency of 1MHz, to generate a timer interrupt every 100us (the value can, of course, be modified as necessary). Note that the system timer is a separate module from the architecture-defined counter on the ARM core, and is owned by the Video Core. Also note that, while the documentation specifies that the system timer generates interrupt 1, on the Raspberry Pi 4 this is actually interrupt 97, as the Video Core peripheral IRQs feed into the GIC starting at ID 96.

After mapping the system timer registers and registering an interrupt handler, the timer is programmed to expire 100us in the future. The timer interrupt service routine performs the following actions:

  1. Sets the address lines to the current row number. The row number is kept as a static variable in the scope of the ISR.
  2. Advances the row number.
  3. Draws the two lines with the data matching the new row number, as described in Part 1.
  4. Programs the timer to 100us in the future.
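The structure of such an ISR can be sketched as follows. This is not the code from the repository: the hardware accesses are replaced by stand-in functions that merely record what they were asked to do, and all names are assumptions made for the example.

```c
#include <stdint.h>

#define NUM_ROW_PAIRS   8u    /* a 32x16 matrix is drawn as 8 pairs of rows */
#define INTERVAL_US     100u  /* time between two timer interrupts */

/* Simulated hardware state, standing in for GPIO and timer registers. */
static unsigned address_lines;
static unsigned last_drawn_row;
static uint32_t timer_counter_us;
static uint32_t timer_compare_us;

/* Stand-in: drive the A, B and C pins with the row number. */
static void set_address_lines(unsigned const row)
{
    address_lines = row;
}

/* Stand-in: clock out and latch the pixel data for a pair of rows. */
static void draw_row_pair(unsigned const row)
{
    last_drawn_row = row;
}

/* Stand-in: set the compare register relative to the free-running counter. */
static void program_timer(uint32_t const delta_us)
{
    timer_compare_us = timer_counter_us + delta_us;
}

void timer_isr(void)
{
    /* Row whose data was clocked in during the previous interrupt. */
    static unsigned row;

    set_address_lines(row);           /* 1. select the current row      */
    row = (row + 1) % NUM_ROW_PAIRS;  /* 2. advance the row number      */
    draw_row_pair(row);               /* 3. clock in the next row's data */
    program_timer(INTERVAL_US);       /* 4. re-arm the timer             */
}
```

On real hardware the ISR would also need to acknowledge the interrupt in the timer's control/status register before returning; that step is omitted from the sketch.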

Setting the address lines to the current row number before advancing it is a trick picked up from the reference library. It allows us to eliminate spinning for the time required between drawing a line and setting the row address, taking advantage of the fact that the same image is drawn over and over at the current refresh rate.

Since the main loop is no longer responsible for drawing the images, it just sleeps for 1 second before copying the next image into the buffer used by the ISR to draw the pixels.

The code for Part 2 is available here.

RGB Matrix: Part 1 – The Basics

I have recently finished scratch-building a RaspberryPi/QNX-powered portable game console with my son. The “screen” for the console consists of a 16×8 red LED matrix, which obviously limits the game options for the console.

For our next project we wanted a better screen, which led me to buy this 32×16 RGB LED matrix. Purchasing most things nowadays is the easy part, though. Making use of the purchase is another matter. On the web site for the LED matrix the provider makes the following statement:

Of course, we wouldn’t leave you with a datasheet and a “good luck!” We have a full wiring diagrams and working Arduino library code with examples from drawing pixels, lines, rectangles, circles and text. You’ll get your color blasting within the hour!

The statement is technically true: the link sends you to a nice tutorial on how to hook up the matrix to GPIO pins, and proceeds to point to a library that can be used to drive the matrix on various boards. However, at no time are there any simple explanations as to how to use the matrix if you want to do it yourself, rather than rely on a library for a specific set of boards and specific use cases. The code itself is of little help as a tutorial, being far too complicated, board-specific and suffering from a mild case of ifdef hell. I hope to rectify that with a few blog posts that will show how to work with the LED matrix in simple, incremental steps.

Let’s start from the beginning. The most basic use case is to light up various pixels on the matrix using the basic colours. Since each pixel consists of red, green and blue components, there are 8 such basic colours. It is possible to generate other colours using software PWM, but that will be left for another post.

The first step is to follow the instructions here on how to connect the RGB matrix to your board. Other than a few ground connections, you need a GPIO for each of the following inputs on the matrix (assuming the 32×16 variant):

  1. R1: Red component for the first line
  2. G1: Green component for the first line
  3. B1: Blue component for the first line
  4. R2: Red component for the second line
  5. G2: Green component for the second line
  6. B2: Blue component for the second line
  7. A: First address line
  8. B: Second address line
  9. C: Third address line
  10. CLK: Clock, used to transmit per-pixel data
  11. LAT: Latch, used to start and stop the transmission of per-line data
  12. OE: Output enable, switches between input and display modes

A couple of things to note:

  • All of these connections can go to standard GPIO pins.
  • There are three address lines to drive 16 rows. When selecting a row with these address lines, you actually select two rows, one in the range 0-7 and one in the range 8-15 (row n and row n+8). This is why there are two sets of colour component inputs.

To draw two rows of pixels on the LED matrix we need to take the following steps:

  1. Set the latch pin to 0
  2. For every column:
    1. Set the clock pin to 0
    2. Set each of the R1, G1, B1, R2, G2, B2 pins to the desired colour value
    3. Set the clock pin to 1
  3. Set the latch pin to 1
  4. Set the A, B and C address pins to the desired row number

To display a complete image we repeat these steps for each row over and over.

Further notes:

  • There needs to be some delay after changing the latch and clock pins. You can experiment with different values, but 200 nanoseconds seems to work.
  • The address lines need to be set after the data has been provided.
  • When displaying a complete image I found that there was a “bleeding” effect, where the next row would display a faint shade of the previous one. From experimentation, setting the OE pin to 0 before step 1 and to 1 after step 3 fixes the problem, but I cannot say why.
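Put together, the steps and notes above can be sketched as a single function. The sketch is not the example code linked below: the GPIO layer is simulated with an array so that the sequence can be followed (and tested) without hardware, the pin names are illustrative, and the encoding of a pixel as a 3-bit value (bit 0 red, bit 1 green, bit 2 blue) is an assumption made for the example.

```c
#include <stdint.h>

#define NUM_COLUMNS 32u

/* Illustrative pin identifiers; real GPIO numbers depend on the wiring. */
enum pin {
    PIN_R1, PIN_G1, PIN_B1, PIN_R2, PIN_G2, PIN_B2,
    PIN_A, PIN_B, PIN_C, PIN_CLK, PIN_LAT, PIN_OE, PIN_COUNT
};

/* Simulated GPIO state, standing in for a real GPIO driver. */
static int pin_state[PIN_COUNT];
static unsigned clock_rising_edges;

static void gpio_write(enum pin const pin, int const value)
{
    if (pin == PIN_CLK && value && !pin_state[PIN_CLK]) {
        clock_rising_edges++;
    }
    pin_state[pin] = value;
}

/* Placeholder: on real hardware, wait roughly this many nanoseconds. */
static void delay_ns(unsigned const ns)
{
    (void)ns;
}

/* Draw rows `row` (0-7) and `row + 8`, one 3-bit colour per pixel. */
void draw_row_pair(unsigned const row,
                   uint8_t const top[NUM_COLUMNS],
                   uint8_t const bottom[NUM_COLUMNS])
{
    gpio_write(PIN_OE, 0);              /* avoids the "bleeding" effect */
    gpio_write(PIN_LAT, 0);             /* step 1 */
    delay_ns(200);

    for (unsigned col = 0; col < NUM_COLUMNS; col++) {   /* step 2 */
        gpio_write(PIN_CLK, 0);
        delay_ns(200);
        gpio_write(PIN_R1, top[col] & 1);
        gpio_write(PIN_G1, (top[col] >> 1) & 1);
        gpio_write(PIN_B1, (top[col] >> 2) & 1);
        gpio_write(PIN_R2, bottom[col] & 1);
        gpio_write(PIN_G2, (bottom[col] >> 1) & 1);
        gpio_write(PIN_B2, (bottom[col] >> 2) & 1);
        gpio_write(PIN_CLK, 1);
        delay_ns(200);
    }

    gpio_write(PIN_LAT, 1);             /* step 3 */
    delay_ns(200);
    gpio_write(PIN_OE, 1);

    gpio_write(PIN_A, row & 1);         /* step 4: address set last */
    gpio_write(PIN_B, (row >> 1) & 1);
    gpio_write(PIN_C, (row >> 2) & 1);
}
```

Calling draw_row_pair() for rows 0 through 7 in a loop, over and over, displays a complete image. On a real board gpio_write() and delay_ns() would be replaced by accesses to the GPIO registers and a short calibrated wait.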

You are now set to display static images using 8 basic colours on the LED matrix.

Example code for QNX on RaspberryPi is available here.