Have you ever felt elation when your design finally worked, followed by horror the next morning because it had morphed into a brick? This happened to me: no response to commands on the user interface and no indication of what caused it to die. Intermittent design “seizures” were my most feared failure mode. Or what about an intermittent failure that is actually caused by a chain of two or more design flaws in the system? Here’s my story: a snapshot of embedded development and the valuable lessons a young engineer (me) learned the hard way.
In 1988 I was a freshly minted firmware engineer at a company that developed and sold high-speed satellite modems, earth stations, and redundancy switches. I was responsible for maintaining an earlier model modem. The code was written in assembly language. This was back when embedded software was beginning the move from assembly to “C”. Reading assembly code was an art, and documentation and modularity were key to being able to follow the logic. This code had neither, so debugging it was a challenge, to say the least. I should mention that the firmware engineer who created this code base was in hot demand and apparently didn’t have the luxury of documenting his code or making it modular. He would write code as quickly as he could, get it functioning, and then move on to the next project. This is one of the classic tradeoffs in writing code that I learned early on: How much documentation is enough to stay on schedule but still be able to remember what you did or easily hand off to a sustaining engineer?
The engine running this modem was a classic 8-bit 8051. Believe it or not, in this age of sophisticated ARM microcontrollers, 8051s are still around and used for new embedded designs. Back then, an engineer didn’t have sophisticated debugging tools like IAR’s Embedded Workbench with JTAG controllers. I had to use a ROM monitor, oscilloscope, logic analyzer, EPROM (Erasable Programmable Read Only Memory), and a UV EPROM eraser.
Figure 1: Typical EPROM (source: Wikimedia Commons)
EPROMs were an interesting part of firmware development because non-volatile memory was rudimentary back then and there was very little capacity. We had about 16 to 128 KB of program memory to control our complex products (and we walked to work in 3 feet of snow, uphill both ways). The particular satellite modem I was working on had a 27C128 EPROM, which holds only 16 KB of program memory. On-chip RAM was limited to 128 bytes. This is why assembly language code was used. There were no “C” cross compilers efficient enough for the job, and insufficient memory was precisely what was holding back “C” language adoption. So the setup on my desk was a UV eraser, piles of EPROMs, little rectangular white sticky labels detailing software version and date, and an EPROM programmer. Right in the middle of my lab desk was an 8 MHz 386-based PC clone computer with an RS-232 cable connected to the test satellite modem.
These EPROMs could be reprogrammed only after 20 to 30 minutes of UV exposure through the little glass window on the chip had erased them (see figure 1). I would lay them on a conductive-foam-lined tray and put them into the UV box. But they could only be programmed so many times before they started to fail. I knew they were failing when I’d get programming errors or when they would program successfully but then fail in the system. Fortunately, if they were going bad, the system wouldn’t even boot up. So it was basically lather, rinse, and repeat each time I needed to update software on the board. By the way, the little white labels also covered up the little window on the chip to keep it from erasing under the fluorescent lights above me. As you can imagine, I made as many changes as I could before I went through this programming cycle. But there was a tradeoff: if too many changes were made in one pass, it would be more difficult to debug. So a lot of checking and rechecking of logic went on in my head as I reviewed my changes. Today, it is simple and comparatively fast to make a change, recompile, download, and run code.
Figure 2: UV EPROM Programmer (source: BK Precision)
A ROM monitor was an interesting and crucial component of firmware debugging. It was basically a piece of code that lived in a certain section of the EPROM and allowed me to set breakpoints, single-step through code, and view memory. There was a companion application on the computer that facilitated some common debugging actions. This solution was a long way off from the sophisticated on-chip debugging capabilities of today’s microcontrollers, like the NXP Kinetis MK70FX512VMJ12, which has a high-speed JTAG controller (instead of a dedicated RS-232 port that operated at 9600 bits/sec).
So the first challenge in troubleshooting design flaws back then was the tools. In this particular case, I spent roughly 3 weeks setting breakpoints throughout the code, and I could capture only a small amount of trace information compared to today’s standards. The first problem turned out to be a timing issue in the setting of a variable: the interrupt handler was changing it to a different value. That variable happened to be a pointer, so when the code hit a branch instruction, it went off to nowhere. The second problem was a bug in the trace function of the ROM monitor itself. It was the combination of these two failures that made the problem so difficult to solve. From this I learned three lessons:
1. Never assume there is only one design flaw at work in a failure mode. There could be several interacting with each other.
2. Look solely at the evidence when troubleshooting and don’t make any assumptions without proving them out. Don’t assume that the instrumentation you are using is always correct.
3. Don’t hesitate out of pride to bring in a fresh perspective. In this case I finally had another, more experienced engineer look at it and he found the pointer problem. From that I came to realize that there was also a problem with the ROM monitor.
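The story does not give the actual code, so the following is only a sketch of one common way this class of pointer bug shows up on an 8-bit CPU. It assumes a 16-bit jump target that is read one byte at a time and a simulated interrupt that rewrites it between the two reads. The addresses, `fake_isr`, and the `irq_enabled`/`irq_pending` flags are all hypothetical stand-ins, not the modem's firmware.

```c
#include <stdint.h>

/* Hypothetical handler addresses, chosen only for illustration. */
#define HANDLER_A 0x12F0u
#define HANDLER_B 0x1305u

/* A 16-bit jump target stored as two bytes, the way an 8-bit CPU sees it. */
static volatile uint8_t target_lo, target_hi;

/* Simulated interrupt state (stand-ins for real interrupt hardware). */
static int irq_enabled = 1;
static int irq_pending = 0;

void set_target(uint16_t addr) {
    target_lo = (uint8_t)(addr & 0xFFu);
    target_hi = (uint8_t)(addr >> 8);
}

/* Simulated interrupt handler: retargets the pointer to HANDLER_B. */
void fake_isr(void) {
    set_target(HANDLER_B);
}

/* Runs the handler only if an interrupt is pending and not masked. */
void maybe_interrupt(void) {
    if (irq_pending && irq_enabled) {
        irq_pending = 0;
        fake_isr();
    }
}

/* Unsafe: the interrupt can land between the two byte reads. */
uint16_t read_target_unsafe(void) {
    uint8_t lo = target_lo;
    maybe_interrupt();              /* pointer may change here */
    uint8_t hi = target_hi;
    return (uint16_t)((hi << 8) | lo);
}

/* Safer: mask interrupts so both bytes come from the same value. */
uint16_t read_target_safe(void) {
    irq_enabled = 0;                /* enter critical section */
    uint8_t lo = target_lo;
    maybe_interrupt();              /* masked, so nothing happens */
    uint8_t hi = target_hi;
    irq_enabled = 1;                /* leave critical section */
    maybe_interrupt();              /* deferred interrupt runs now */
    return (uint16_t)((hi << 8) | lo);
}
```

With the target set to `HANDLER_A` and an interrupt pending, `read_target_unsafe()` returns `0x13F0`: the low byte of one address glued to the high byte of the other, which is an address that neither side ever wrote. The same setup with `read_target_safe()` returns `HANDLER_A`, and the update to `HANDLER_B` takes effect only afterward. A branch through the torn value is exactly the kind of jump that "goes off to nowhere," and because it depends on interrupt timing, it fails only intermittently.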
Technology has evolved so much since 1988. The tools continue to get better, but the complexity of the technology continues to increase. Regardless, these three rules I learned early on have always held true for me as I improved my troubleshooting skills along the way.
Jim Yastic, Principal, Embedded Horizons LLP, has over 30 years of experience in hardware and software engineering, product marketing, and technical sales. Jim holds degrees in Electrical Engineering and Computer Science, and an MBA in Finance. His experience includes working in embedded systems, software, networking, and communications across a number of industries, from military/aerospace, satellite communications, and semiconductors to telecommunications and more. Jim grew up in Texas and lives on a farm just west of Austin, where his current passion is keeping deer out of his garden.
Copyright ©2022 Mouser Electronics, Inc.