Tutorial :Why is Read-Modify-Write necessary for registers on embedded systems?


I was reading http://embeddedgurus.com/embedded-bridge/2010/03/different-bit-types-in-different-registers/, which said:

With read/write bits, firmware sets and clears bits when needed. It typically first reads the register, modifies the desired bit, then writes the modified value back out

and I have run into that consrtuct while maintaining some production code coded by old salt embedded guys here. I don't understand why this is necessary.

When I want to set/clear a bit, I always just or/nand with a bitmask. To my mind, this solves any threadsafe problems, since I assume setting (either by assignment or oring with a mask) a register only takes one cycle. On the other hand, if you first read the register, then modify, then write, an interrupt happening between the read and write may result in writing an old value to the register.

So why read-modify-write? Is it still necessary?


This depends somewhat on the architecture of your particular embedded device. I'll give three examples that cover the common cases. The basic gist of it, however, is that fundamentally the CPU core cannot operate directly on the I/O devices' registers, except to read and write them in a byte- or even word-wise fashion.

1) 68HC08 series, an 8-bit self-contained microcontroller.

This includes a "bit set" and a "bit clear" instruction. These, if you read the manual carefully, actually internally perform a read-modify-write cycle by themselves. They do have the advantage of being atomic operations, since as single instructions they cannot be interrupted.

You will also notice that they take longer than individual read or write instructions, but less time than using three instructions for the job (see below).

2) ARM or PowerPC, conventional 32-bit RISC CPUs (often found in high-end microcontrollers too).

These do not include any instructions which can both access memory and perform a computation (the and/or) at once. If you write in C:

*register |= 0x40;

it turns into the folowing assembly (for this PowerPC example, r8 contains the register address):

LBZ r4,r8  ORI r4,r4,#0x40  STB r4,r8  

Because this is multiple instructions, it is NOT atomic, and it can be interrupted. Making it atomic or even SMP-safe is beyond the scope of this answer - there are special instructions and techniques for it.

3) IA32 (x86) and AMD64. Why you would use these for "embedded" is beyond me, but they are a half-way house between the other two examples.

I forget whether there is a single-instruction in-memory bit-set and bit-clear on x86. If not, then see the RISC section above, it just takes only two instructions instead of three because x86 can load and modify in one instruction.

Assuming there are such instructions, they also need to internally load and store the register as well as modifying it. Modern versions will explcitly break the instruction into the three RISC-like operations internally.

The oddity is that x86 (unlike the HC08) can be interrupted on the memory bus in mid-transaction by a bus master, not just by a conventional CPU interrupt. So you can manually add a LOCK prefix to an instruction that needs to do multiple memory cycles to complete, as in this case. You won't get this from plain C though.


The thing is if you don't want to modify the other bits in the register you have to know what they are before you write something it. Hence the read/modiy/write. Note that if you use a C statement like:

*pRegister |= SOME_BIT;  

Event though that might look like a simple write operation at first glace, the compiler must perform a read first in order to preserve the other bits in the value (this is generally true, even if you're not talking about hardware registers, unless the compiler is able to use other knowledge about the value to optimize the read away).

Note that memory-mapped hardware registers are generally marked volatile specifically so that these optimizations cannot take place (otherwise many hardware register routines wouldn't work properly).

Finally, sometimes there is hardware support for registers that specifically set or clear bits in the hardware without requiring a read/modify/write sequence. Some Atmel ARM microcontrollers I've worked with have this with specific registers that clear or set bits in hardware only those bits that are set when you write to the register (leaving any unset bit alone). Also the Cortex M3 ARM CPU supports accessing a single bit (for read or write) in memory or in hardware registers this through accessing a specific address space with a technique they call 'bit-banding'. The bit-banding algorithm looks complex at first glance, but it's really just some simple arithmetic to map the offset of a bit in one address to another 'bit-specific' address.

Anyway, the bottom line is that there are some processors where you can get away without a read/modify/write series, but that's by no means universally true.


If you have to modify a subset of the bits in a word, and the architecture only supports word level read/write, you have to read the bits that must not change to know what to write back so that they are not modified.

Some architectures support bit level memory access either globally or for specific regions of memory. But even then when modifying multiple bits, read-modify-write many result in fewer instructions. In multi-threaded systems care must be taken to ensure that two threads cannot perform this non-atomic action on the same word concurrently.


Modern processors can either set or clear bits with single instruction. However, these instructions can not both set and clear at the same time. There are instances when some of bits of an IO port must all change together and not affect other bits. As long as the sequence of read-modify-write can not be broken, there is no problem.

The situation where the r-m-w can become a problem requires three conditions.

  1. The variable must be globally accessible such as an IO port or special function register or globally defined variable.

  2. The global variable can be modified in a function that can be preempted.

  3. The same global variable is modified while servicing a preemption.

The only way to resolve multiple bit modifications using a r-m-w non-atomic sequence is to protect the sequence of instructions by disabling the interrupt for the interrupt service routine that can also modify the variable or register. This is similar to digine exclusive access to resources such as LCD or serial ports.


When I want to set/clear a bit, I always just or/nand with a bitmask.

For some registers that's good enough. In such a case the CPU's firmware is still going to do a read-modify-write.

To my mind, this solves any threadsafe problems, since I assume setting (either by assignment or oring with a mask) a register only takes one cycle.

If you let the CPU's firmware do the read-modify-write for you, obviously it's going to include at least a read cycle and a write cycle. Now, most CPUs won't interrupt this instruction in the middle, so your thread will execute this entire instruction before your thread's CPU checks for interrupts, but if you haven't locked the bus then other CPUs can modify the same register. Your thread and other threads can still walk all over each other.

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »