Running 57 Threads At Once On The Arduino Uno

When one thinks of the Arduino Uno, one thinks of a capable 8-bit microcontroller platform that nonetheless doesn’t set the world alight with its performance. Unlike more modern parts like the ESP32, it has just a single core and no real multitasking abilities. But what if one wanted to run many threads on an Uno all at once? [Adam] whipped up some code to do just that.

Threads are useful when you have multiple jobs that need to be done at the same time without interfering with each other. The magic of [Adam]’s ThreadHandler library is that it’s designed to run many threads and do so in real time, with priority management as well. On the Arduino Uno, certainly no speed demon, it can run up to 57 threads concurrently at 6 ms intervals with a minimum timing error of 556 µs and a maximum of 952 µs. With a more reasonable number of 7 threads, the minimum error drops to just 120 µs. Each thread comes with an estimated overhead of 1.3% CPU load and 26 bytes of RAM usage.

While we struggle to think of what we could do with more than a handful of threads on an Arduino Uno, we’re sure you might have some ideas – sound off in the comments. ThreadHandler is available for your perusal here, and runs on SAMD21 boards as well as any AVR-based boards that are compatible with TimerOne. We’ve seen other work in the same space before, such as ChibiOS for the Arduino platform. Video after the break.

Posted in Arduino Hacks. Tagged arduino, Arduino Library, Arduino Uno, MULTITHREADING, threading

28 thoughts on “Running 57 Threads At Once On The Arduino Uno”

Bob says:

March 17, 2021 at 11:37 am

Rodney Brooks’ subsumption programming for robots.

1. Multiple Thread Microcontroller Nomenclature says:
  
  March 17, 2021 at 12:50 pm
  
  Finally moving away from AVR architecture. It’s about time given how the price has stayed absurd and the speed and memory of basically every processor has also stayed low.
  
  Would be amusing to see how this type of use extends to what seem like upcoming (eventually) RP2742 and up style Raspberry Pi microcontroller devices which already literally have two processors for their microcontrollers and actual memory (up to 16 MB already!) and decent CPU speeds and actually not too bad 40 nanometer CPU manufacturing as well.
  
  Multiple PIO microcontrollers with decent memory available? Yes, please! I wonder how many total threads you can get when your microcontrollers run at 800 MHz or more per processor with potentially 4 or even more processors per chip and everything being actually reasonably priced, robust, supports anything from (some but quite robust) low level assembly, direct C/C++ as well as more higher level MicroPython and CircuitPython (basically Python but for microcontrollers) as well?
  
  https://www.cnx-software.com/wp-content/uploads/2021/01/Raspberry-Pi-MCU-Nomenclature.jpg
  
Severe Tire Damage says:

March 17, 2021 at 11:57 am

I despise and thumb my nose at the AVR architecture every chance I get, but ….

It is all about what the threads are supposed to do. I am not sure what the magic “57” is about, but the story with any RTOS with threads is that the threads are all normally blocked waiting for their purpose in life to happen. So it is more a matter of the size of the table to handle them than the compute horsepower of the host if they are all (as they should be) sitting there blocked waiting for their cue to come on stage.

1. Ezra Thomas says:
  
  March 17, 2021 at 12:17 pm
  
  Other than being an older architecture, what don’t you like about it? I’m designing a custom CPU on an FPGA and so your insight on this would be particularly interesting to me.
  
  1. tekkieneet says:
    
    March 17, 2021 at 3:18 pm
    
    Not OP, but I have a few beefs with the CPU core.
    – I like a unified memory model, e.g. Arm, STM8. I/O is memory mapped, so any memory can be used for code or data without requiring special instructions for data in code space or for I/O, i.e. you can download code and run it directly from RAM. You don’t need a stupid macro in the compiler to access data in code space.
    – Silly things like “fuses” to set the clock source that can’t be controlled in firmware at run time. STM8 has fuses, but they can be reprogrammed from firmware.
    – Atmel didn’t have a hardware debugger at first. They still haven’t published the protocol for chips that have it.
    
    AVR peripherals haven’t kept up
    – Even Chinese clones have 12-bit ADCs these days. :P
    
    Old chip process. So you pay a lot more for memory and peripherals than Arm chips.
    
    1. Grawp says:
      
      March 17, 2021 at 7:09 pm
      
      This! Each and every point you’ve mentioned except for the peripherals is a reason why the AVR architecture is so, well, hardly usable. Taking into account that nowadays it is also pricier by any reasonable metric compared to ARM (I’m giving ARM just as an example of an architecture I’m familiar with) there’s no reason whatsoever to use it in new designs, even for one-time hobby projects.
      
2. przemek.klosowski says:
  
  March 17, 2021 at 1:07 pm
  
  Apparently each thread uses 26 bytes, and there’s 2048 bytes of SRAM; 26*57=1482, so that’s probably what limits it. Of course the actual thread code has to fit somewhere too.
  
  1. Paul LeBlanc says:
    
    March 17, 2021 at 2:25 pm
    
    The thread code won’t use any RAM itself, since it runs from flash memory. But there’s got to be memory somewhere for the global stack and whatever global variables the task manager might need. Plus there’s the heap for memory allocation.
    
    However, one would think that each thread would have to have its own stack or you couldn’t have pre-emptive task switching without severely limiting what each thread is able to do, so that brings into question that “26 bytes per thread” number.
    
    Then there are the 32 8-bit general purpose registers (32 bytes), some if not all of which would need to be saved and reloaded during a thread switch – again, without severely limiting what each thread is able to do. I don’t know what the AVR compilers do specifically, but most compilers will put as many local variables as possible into general-purpose registers in order to improve operation (both execution speed and code size reduction, although in the AVR architecture the former might not count for much) – either that or they go on the stack. The registers could be saved to the thread stack, but to save them all would be a minimum of 32 bytes. Plus the stack pointer, plus the program counter (two bytes each) and you’re up to 36 bytes minimum (*57 = 2052).
    
    Something’s got to give here, because that doesn’t fit.
    
    1. bobby says:
      
      March 17, 2021 at 6:56 pm
      
      There is no stack for each “thread”, only a single stack. As I understand it, the whole system is effectively the same as nested interrupts, where each “thread” is called from the timer ISR function. Threads must return, otherwise they block lower-priority threads (therefore blocking functions require a “thread” to be made up of multiple functions). Have a look at the readme on the git repo
      
    2. bobby says:
      
      March 17, 2021 at 7:01 pm
      
      W.r.t. your memory requirements comment, you only need enough stack to save the registers once for each priority level running concurrently, plus whatever local variables are used in those concurrent “threads”
      
3. Daniel Dunn says:
  
  March 17, 2021 at 3:59 pm
  
  The AVR has the really nice feature of being super standard and having clones that are usually good enough. They’re very simple and rugged.
  
  What I really don’t like is overuse of low end ARM. If you need that level of power, your application could probably benefit from connectivity, so why not go for ESP?
  
  1. Somun says:
    
    March 17, 2021 at 4:49 pm
    
    Which AVR clones?
    
    Many reasons to choose a Cortex-M over the ESP32. Power consumption, huge selection of peripherals, multiple vendors… And having a recent GCC available is another advantage.
    
    Connectivity doesn’t always mean WiFi.
    
  2. paulvdh says:
    
    March 17, 2021 at 5:24 pm
    
    Huh, does ESP have built-in USB?
    That’s what I want for my connectivity.
    
Jay says:

March 17, 2021 at 12:05 pm

This sounds like a great application for the Controllino to use with safety. Constantly checking the safety circuit and when triggered interrupts main loop to actually shut machine down…

1. fiddlingjunky says:
  
  March 17, 2021 at 1:42 pm
  
  Safety critical features seem much better suited to an actual interrupt. Timer-based interrupts for checking state or hardware interrupts directly off a comparator or some such, which could call a safe-shutdown function if necessary. There’s a lot of room for introducing error in a thread-based safety loop, and it would likely take more development and take up more overhead.
  
2. Steven Naslund says:
  
  March 17, 2021 at 2:00 pm
  
  Seems to me like a good purpose for these threads could be like I/O sampling loops in industrial control where you want to read an input and make a control output decision. Kind of like how you would use a PLC. First thing that came to my mind. A safety interrupt could be part of that if the timing is reasonable. For example, a temperature limit hit a couple times a second is probably good enough. An operator e-stop circuit probably needs to be a hard interrupt since someone pushed the “oh, no” button.
  
  1. Steven Naslund says:
    
    March 17, 2021 at 2:05 pm
    
    The number of threads you can manage also has a lot to do with how tight the loop is. In industrial control, say I am reading a temperature and turning a heater on and off. That is very tight deterministic code. If you are sitting around waiting for an asynchronous event, that is a bigger problem since you can’t determine how long that thread will run.
    
    The architecture needs to have enough memory to store your thread and the thread switching requires time and processing power as well. A lot of devices can be made to handle threaded applications but certain devices are hardware optimized to deal with certain numbers of threads.
    
3. Jii says:
  
  March 17, 2021 at 2:14 pm
  
  That’s not going to fly on an actual safety application. Microchip does have microcontrollers with certified safety functionality, but you are going to have to use their special compilers to program that.
  
Rob says:

March 17, 2021 at 1:33 pm

Heinz (baked beans) had 57 varieties.

1. Paul LeBlanc says:
  
  March 17, 2021 at 2:32 pm
  
  Not all of those varieties were baked beans. And they actually had 60 different products when they came up with the slogan.
  
Ken Bloom says:

March 17, 2021 at 2:15 pm

I don’t understand where this attitude comes from: “Unlike more modern parts like the ESP32, it has just a single core and no real multitasking abilities.”

I remember the days when we did multitasking on Macintosh System 7 without multiple cores, without protected memory between processes, and without preemptive multitasking driven by a timer (you had to call SystemTask() or GetNextEvent() to give other processes a turn.) Why do people think you need multiple cores to do multitasking today? All you need is to understand your stack layout and your processor’s register saving convention when servicing an interrupt. (And *maybe* you want a clock interrupt to do preemption.)

Implementing multithreading on a 68HC11 microcontroller was a popular instructional exercise 20 years ago when I was in college. (I haven’t done it myself, though.)

1. RÖB says:
  
  March 17, 2021 at 6:59 pm
  
  Writing a time division CPU share (multi-task) core is still a very good educational exercise. Especially so in ASM.
  
Greg A says:

March 17, 2021 at 3:24 pm

i’m sure it’s an elegant hack but when you’re writing for a microcontroller i really think all these things should be thought of as what they are, rather than abstracted into something big-computer-sounding. each thread should be an ISR and maybe a bottom half (a portion that operates in the main loop, draining a queue or state machine triggered by the ISR). you should think of the intersection between the requirements of your design and the abilities of the processor in question.

and on the flipside, when you’re writing for a big computer (like raspberry pi or whatever), you shouldn’t be messing around with these timing-sensitive things, they should be offloaded entirely to a peripheral or co-processor.

if you want to use a general purpose processor as a I/O co-processor, propeller is the architecture that mastered that. tacking multithreading onto an AVR isn’t going to accomplish any of these things very well.

anyways that’s just my philosophy

1. pelrun says:
  
  March 17, 2021 at 9:10 pm
  
  It’s not a “hack”, it’s a piece of engineering.
  
  And proper RTOSes are vital for many embedded applications. A trivial interrupt handler setup is not going to keep up when you need processes running on differing timescales and priorities, with complex locking interactions between them, and needing to hit timing guarantees.
  
  Sure, you can write a giant state machine by hand and spend all your time debugging subtle edge cases and failures, but who voluntarily does that?
  
OldSurferDude says:

March 17, 2021 at 4:10 pm

Here’s an application: NTP server

In my implementation there are two functions. When available, read the time from the GPS module and set the Arduino time with it. When there is data on the ethernet port, check that it is a time request and return the Arduino time. Essentially two threads. Right now I have two calls in “loop”. Yeah, they could be an ISR, but you asked for an application.

1. smellsofbikes says:
  
  March 17, 2021 at 7:14 pm
  
  I wrote a much less capable version of something similar, that was a tick-based scheduler with priority. I did it for a stepper motor driver that needed very regular ticks for the stepper, but also needed to poll several user pushbuttons and display some stuff to an lcd. It works really well as a single axis driver for my lathe feed screw. It’s a decent use for an atmega.
  
Harvie.CZ says:

March 17, 2021 at 4:52 pm

Are these actual threads or just a bunch of timers with some callback array? The difference is that threads can run several concurrent infinite loops without blocking the whole core, while in a timer ISR you cannot do that.

1. pelrun says:
  
  March 17, 2021 at 9:02 pm
  
  There’s no effective difference – this implementation just puts the infinite loop outside the thread’s run() function, rather than expecting the thread to do it manually.
  