Inline Assembly in Arduino sketches 

 July 4, 2023

By  Peter

Inline assembly is a powerful tool in the Arduino toolkit that can lead to more efficient and compact code.

You’ve probably spent most of your Arduino life writing in C/C++. It’s a versatile language, user-friendly, and gets the job done. But what if there was a way to get even more control over your Arduino’s hardware? Enter inline assembly.

Inline assembly lets you embed assembly language, the low-level language whose instructions map almost one-to-one onto machine code, in your C/C++ code. It’s like having a secret passageway into the nuts and bolts of your Arduino. You decide exactly which instructions the microcontroller executes, allowing you to optimize your code and push your Arduino’s limits.

But fair warning – it’s not for the faint of heart. It’s like working without a safety net. You’ve got all the power, but it’s all on you if something goes wrong.

Getting Started with Inline Assembly

So, you’re ready to give it a go? Here’s how you do it. When you’re writing your C/C++ code, use the keyword “asm” or “__asm__” to introduce your assembly instructions.

Here’s a “simple” example where we’re blinking the built-in LED:

void setup() {
  asm volatile (
    "sbi %0, %1 \n\t"           //pinMode(13, OUTPUT);
    :: "I" (_SFR_IO_ADDR(DDRB)), "I" (DDB5)
  );
}

void loop() {
  asm volatile (
     "sbi %0, %1 \n\t"          //LED on
     "call OneSecondDelay \n\t" //delay
     "cbi %0, %1 \n\t"          //LED off
     "call OneSecondDelay \n\t" //delay
     "rjmp 4f \n\t"             //exit
  "OneSecondDelay: \n\t"
     "ldi r18, 0 \n\t"          //delay 1 second
     "ldi r20, 0 \n\t"
     "ldi r21, 0 \n\t"
  "1: ldi r24, lo8(400) \n\t"
     "ldi r25, hi8(400) \n\t"
  "2: sbiw r24, 1 \n\t"         //10x around this loop = 1ms
     "brne 2b \n\t"
     "inc r18 \n\t"
     "cpi r18, 10 \n\t"
     "brne 1b \n\t"
     "subi r20, 0xff \n\t"      //1000 x 1ms = 1 second
     "sbci r21, 0xff \n\t"
     "ldi r24, hi8(1000) \n\t"
     "cpi r20, lo8(1000) \n\t"
     "cpc r21, r24 \n\t"
     "breq 3f \n\t"
     "ldi r18, 0 \n\t"
     "rjmp 1b \n\t"
  "3: \n\t"
     "ret \n\t"
  "4: \n\t"                     //exit
     :: "I" (_SFR_IO_ADDR(PORTB)), "I" (PORTB5)
     : "r18", "r20", "r21", "r24", "r25"
  );
}

You can copy this code into your Arduino IDE, compile it, and upload it to your Arduino Uno as you would with a regular Arduino sketch.

In this example, we’ve used “asm volatile”. The ‘volatile’ keyword matters here. It tells the compiler not to delete your assembly code during optimization. Without it, the optimizer may decide that an asm statement whose results appear unused is unnecessary and cut it out!
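There is more to the statement than the keyword. As a rough sketch of its general shape: GCC's extended asm takes an instruction template followed by up to three colon-separated lists (outputs, inputs, clobbers), which is why the Blink example writes :: to skip its empty output list. The function below is a hypothetical illustration (passThroughAsm does not appear in the sketch above). It uses an empty template, so it should build with GCC or Clang on a desktop as well as on AVR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns its argument unchanged, but routes it through
// an asm statement so the optimizer has to treat the value as opaque.
uint8_t passThroughAsm(uint8_t value) {
  asm volatile(
      ""             // template: the instructions (empty here, so no CPU-specific code)
      : "+r"(value)  // outputs: value is both read and written, held in a register
      :              // inputs: none
      : "memory"     // clobbers: tell the compiler that memory may have changed
  );
  return value;
}
```

Calling passThroughAsm(42) simply hands 42 back. The point is the layout of the four sections, not the result.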

Decoding Inline Assembly: A Deep Dive

A great way to wrap your head around inline assembly is to dissect an existing piece of code. So, let’s break down the blink example from earlier, which turns the LED on and off with a delay of about 1 second between states.

void setup()

void setup() sets the mode of pin 13 (PB5) to OUTPUT:

  • "sbi %0, %1 \n\t": Here, sbi is the Set Bit in I/O Register instruction. The %0 is replaced by _SFR_IO_ADDR(DDRB) which gets the address of the Data Direction Register for Port B (DDRB). The %1 is replaced by DDB5 (pin 5 of Port B, which is digital pin 13 on the Arduino UNO). So, this command sets bit 5 of DDRB, making pin 13 an output pin.

void loop()

In void loop(), the code turns the LED on, waits 1 second, turns the LED off, and waits another second. This cycle repeats:

  • "sbi %0, %1 \n\t": This sets bit 5 of PORTB high, which turns on the LED.
  • "call OneSecondDelay \n\t": This calls the OneSecondDelay subroutine, which creates a delay of approximately 1 second.
  • "cbi %0, %1 \n\t": This clears bit 5 of PORTB, which turns off the LED.
  • "call OneSecondDelay \n\t": Again, the OneSecondDelay subroutine is called for another 1-second delay.
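  • "rjmp 4f \n\t": This relative jump skips forward to the exit point marked with 4:, so execution does not fall through into the OneSecondDelay subroutine that sits between the jump and the label.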
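To see what sbi and cbi do to the register contents, here is a small desktop C++ model (not Arduino code). The names setBit, clearBit, and kBit5 are made up for this illustration. The only hardware fact it relies on is that DDB5 and PORTB5 are both bit 5 on the Uno's ATmega328P.

```cpp
#include <cassert>
#include <cstdint>

// Bit position of DDB5 / PORTB5 on the ATmega328P (digital pin 13 on the Uno).
constexpr uint8_t kBit5 = 5;

// Models "sbi reg, bit": set one bit and leave the other seven untouched.
constexpr uint8_t setBit(uint8_t reg, uint8_t bit) {
  return static_cast<uint8_t>(reg | (1u << bit));
}

// Models "cbi reg, bit": clear one bit and leave the other seven untouched.
constexpr uint8_t clearBit(uint8_t reg, uint8_t bit) {
  return static_cast<uint8_t>(reg & ~(1u << bit));
}
```

Setting bit 5 of an all-zero register gives 0x20, and clearing it again restores the original value. On the Arduino itself, the usual C equivalents are PORTB |= (1 << PORTB5); and PORTB &= ~(1 << PORTB5);, which avr-gcc typically reduces to a single sbi or cbi when optimization is enabled.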

The OneSecondDelay subroutine creates a delay of approximately 1 second:

  • The registers r18, r20, and r21 are initialized to 0.
  • The subroutine runs two nested loops. The inner loop (label 2) iterates 400 times, and the outer loop (label 1) runs it 10 times. Those 4,000 iterations take roughly 1 ms, and the whole set is repeated 1,000 times to approximate a 1-second delay.
  • "ldi r24, lo8(400) \n\t" and "ldi r25, hi8(400) \n\t": These load the low and high 8 bits of 400 into r24 and r25 respectively.
  • "sbiw r24, 1 \n\t": This subtracts 1 from the 16-bit word in r24:r25.
  • "brne 2b \n\t": This branches back to the start of the inner loop if the result of the previous operation wasn’t zero.
  • After the inner loop has run 400 times, r18 is incremented. When r18 reaches 10, r20 and r21 are incremented (simulating a 16-bit increment operation).
  • "ldi r24, hi8(1000) \n\t", "cpi r20, lo8(1000) \n\t" and "cpc r21, r24 \n\t": Together these compare the 16-bit value in r21:r20 with 1000. The cpi instruction checks the low byte, and cpc checks the high byte against r24, which has just been loaded with hi8(1000). If the values are equal, "breq 3f \n\t" leaves the loop so the subroutine can return. If they’re not equal, the program resets r18 and jumps back to the start of the outer loop with the "rjmp 1b \n\t" command.
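Two details in the list above are easy to miss: how lo8() and hi8() split a 16-bit constant, and why subtracting 0xff counts upward. AVR has no add-immediate instruction for a single 8-bit register, so subtracting 0xff (which is -1 in 8-bit two's complement) is the common way to add 1. The desktop C++ model below is only a sketch of that arithmetic. The function names are stand-ins chosen for this example, with lo playing the role of r20 and hi the role of r21.

```cpp
#include <cassert>
#include <cstdint>

// Desktop stand-ins for the assembler's lo8() and hi8() operators.
constexpr uint8_t lo8(uint16_t value) { return static_cast<uint8_t>(value & 0xFF); }
constexpr uint8_t hi8(uint16_t value) { return static_cast<uint8_t>(value >> 8); }

// Models "subi r20, 0xff" followed by "sbci r21, 0xff" on a 16-bit counter.
uint16_t incrementViaSubtract(uint16_t counter) {
  uint8_t lo = lo8(counter);
  uint8_t hi = hi8(counter);

  // subi: lo = lo - 0xFF. The carry flag records a borrow, which occurs
  // whenever lo is smaller than 0xFF.
  const uint8_t borrow = (lo < 0xFF) ? 1 : 0;
  lo = static_cast<uint8_t>(lo - 0xFF);

  // sbci: hi = hi - 0xFF - borrow, so hi only changes when lo wrapped to 0.
  hi = static_cast<uint8_t>(hi - 0xFF - borrow);

  return static_cast<uint16_t>((hi << 8) | lo);
}
```

For example, 400 splits into hi8 = 0x01 and lo8 = 0x90, and incrementViaSubtract(0x00FF) yields 0x0100, showing the carry into the high byte.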

And finally, "ret \n\t" returns from the OneSecondDelay subroutine to the point where it was called.
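The "about 1 second" figure can be checked with some quick arithmetic. The desktop C++ sketch below assumes a 16 MHz clock (the standard Arduino Uno) and counts only the inner loop, where sbiw takes 2 cycles and a taken brne takes 2 more. The constant and function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kClockHz = 16000000;           // assumption: 16 MHz Arduino Uno
constexpr uint64_t kCyclesPerInnerIteration = 4;  // sbiw (2) + taken brne (2)
constexpr uint64_t kInnerIterations = 400;        // value loaded into r25:r24
constexpr uint64_t kOuterIterations = 10;         // r18 counts up to 10
constexpr uint64_t kMillisecondPasses = 1000;     // r21:r20 counts up to 1000

// Cycles spent in the inner loop over the whole delay.
constexpr uint64_t nominalDelayCycles() {
  return kCyclesPerInnerIteration * kInnerIterations *
         kOuterIterations * kMillisecondPasses;
}

// Converts a cycle count to seconds at the assumed clock speed.
constexpr double cyclesToSeconds(uint64_t cycles) {
  return static_cast<double>(cycles) / static_cast<double>(kClockHz);
}
```

The nominal total is 16,000,000 cycles, or exactly 1 second at 16 MHz. The real delay runs slightly longer because the bookkeeping instructions around the inner loop (ldi, inc, cpi and so on) also cost cycles, which is why the delay is described as approximate.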

Now you can see how this inline assembly code performs the same job as the standard C/C++ Blink sketch without calling pinMode, digitalWrite, or delay. This is the power and efficiency that inline assembly offers you and why it’s worth your time to understand and utilize.

A glossary of assembly commands

Here’s a short description of each command used in the inline assembly code:

  • sbi: Set Bit in I/O Register. This command sets a specific bit in an I/O register to 1. The command takes two arguments: the address of the I/O register and the bit position.
  • call: Call Subroutine. This command is used to call a subroutine at the specified label. The return address (the address of the instruction following the call) is stored on the Stack.
  • cbi: Clear Bit in I/O Register. This command clears a specific bit in an I/O register, setting it to 0. Like sbi, this command takes two arguments: the address of the I/O register and the bit position.
  • rjmp: Relative Jump. This command jumps to a different part of the code within roughly ±2K instruction words of the current instruction.
  • ldi: Load Immediate. This command loads an 8-bit constant directly into one of the registers r16 to r31.
  • sbiw: Subtract Immediate from Word. This command subtracts a small constant (0 to 63) from a 16-bit register pair and places the result in the register pair. It works only on the upper register pairs that start at r24, r26, r28, and r30.
  • brne: Branch if Not Equal. This command branches to a relative address if the Zero flag in the Status Register is not set.
  • inc: Increment. This command adds one to the content of the specified register.
  • cpi: Compare with Immediate. This command compares a constant with the content of a register.
  • subi: Subtract Immediate. This command subtracts a constant from the contents of a register.
  • sbci: Subtract Immediate with Carry. This command subtracts a constant and the Carry flag from a register.
  • cpc: Compare with Carry. This command compares the contents of two registers and considers the Carry flag.
  • breq: Branch if Equal. This command branches to a relative address if the Zero flag in the Status Register is set.
  • ret: Return from Subroutine. This command returns the program execution to the instruction following the original call instruction.
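The branch commands in this glossary act on flags left behind by an earlier instruction. As a rough desktop C++ model (not AVR code), the sketch below shows how cpi and cpc cooperate to compare a 16-bit counter with 1000 before breq tests the Zero flag. The Flags struct and all function names are hypothetical, and only the Zero and Carry flags are modelled.

```cpp
#include <cassert>
#include <cstdint>

// Only the two status flags that matter for this comparison.
struct Flags {
  bool zero;
  bool carry;
};

// Models "cpi Rd, K": computes Rd - K and keeps only the flags.
Flags compareImmediate(uint8_t rd, uint8_t k) {
  return {static_cast<uint8_t>(rd - k) == 0, k > rd};
}

// Models "cpc Rd, Rr": computes Rd - Rr - Carry. Zero stays set only if it
// was already set, which is what chains the two bytes into one comparison.
Flags compareWithCarry(uint8_t rd, uint8_t rr, Flags previous) {
  const unsigned subtrahend = rr + (previous.carry ? 1u : 0u);
  const uint8_t result = static_cast<uint8_t>(rd - subtrahend);
  return {result == 0 && previous.zero, subtrahend > rd};
}

// True exactly when "breq" would branch after the cpi/cpc pair.
bool counterEquals1000(uint16_t counter) {
  const uint8_t lo = static_cast<uint8_t>(counter & 0xFF);  // r20
  const uint8_t hi = static_cast<uint8_t>(counter >> 8);    // r21
  Flags flags = compareImmediate(lo, 0xE8);      // lo8(1000)
  flags = compareWithCarry(hi, 0x03, flags);     // hi8(1000), held in r24
  return flags.zero;
}
```

Only a counter value of exactly 1000 leaves the Zero flag set after both comparisons. A value such as 0x04E8 matches on the low byte but fails on the high byte, so brne would still be taken.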

Each command provides powerful, low-level control over the microcontroller’s operation in an inline assembly context.

When to Use Inline Assembly

You’re not going to use inline assembly for everything, of course. It’s not as readable or portable as C/C++, and it can make debugging a nightmare. But there are times when it really shines.

Let’s say you need to perform a certain operation super fast. For instance, you’re responding to an external interrupt. You can’t afford the run-time overhead of a digitalWrite call. You need that pin to go HIGH now. That’s where inline assembly can save the day.

Think about it. The “digitalWrite” function is fantastic but has a lot of overhead. It takes about 50 clock cycles to execute. With inline assembly, a single sbi or cbi instruction does the same job in two cycles on the Uno’s ATmega328P.
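To put those cycle counts into wall-clock terms, here is a small desktop C++ calculation. It assumes a 16 MHz clock, as on a standard Arduino Uno, and takes the roughly 50-cycle figure for digitalWrite from the paragraph above rather than from a measurement. The names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kCpuHz = 16000000;  // assumption: 16 MHz Arduino Uno

// Converts a number of CPU cycles to nanoseconds at the assumed clock speed.
constexpr uint64_t cyclesToNanoseconds(uint64_t cycles) {
  return cycles * 1000000000ULL / kCpuHz;
}
```

At 16 MHz, 50 cycles come to 3,125 ns (about 3 µs), while the 2 cycles of an sbi instruction on the ATmega328P come to 125 ns.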

Expanding Your Inline Assembly Knowledge

Okay, you’ve had a taste, and you want more? You should start by taking a dive into the AVR Libc Reference Manual for the lowdown on inline assembly.

You should also check out the inline assembly questions and answers on the Arduino Forum.

Lastly, don’t miss out on the goldmine that is the TechExplorations Blog. You’ll find dozens of articles that dive deep into Arduino topics.

Inline assembly is like the secret weapon for Arduino enthusiasts who want to push their creativity and their Arduino capabilities. It’s a bit of a wild ride, but if you’re up for the challenge, it can open up a whole new world of Arduino programming.


