ChipCenter Questlink

Shotgun Wedding
by Darren Ashby

Have you ever had a problem with a circuit that you just couldn't figure out? One reader did. He sent me a very specific example, then concluded with a very general question that inspired this article. After I responded, I wondered whether I could come up with a troubleshooting methodology worth passing on to my fellow engineers. I present my answer to that question after the letter. (If you want to read the methodology and skip the letter and my response, jump ahead to the section that follows my reply.)

Question:

Hi Darren,

I have a general question inspired by a very particular incident.

The company I work for makes industrial controllers (PID) and indicators used to monitor inputs such as thermocouples, 0-5V, 4-20mA, 0-20mA, etc. Units are assembled and placed for a minimum of 48 hours in a burn-in chamber set to 60 degrees C. After removal from the burn-in chamber, units are tested and calibrated in a homemade station.

Here is what has happened to me. A software bug was corrected, new code flashed into a micro and then tested in a calibration stand. This was done 32 times and all 32 times the unit passed calibration. Now onto production. Guess what? Many units are failing calibration.

So this is the question. In general, in your experience, which PARTS (ICs or transistors or discretes) are MOST LIKELY to be the PROBLEM?

Appreciate your help.

Michael

Answer:

Hello Mike,

From your description, it sounds like the temperature increase caused the problem, so I would suspect capacitors first if they are part of the input circuitry. (They would be an unlikely cause if they are in the power supply, however.) Next I would suspect the microcontroller, as it uses some type of oscillator circuit that could change with temperature. The dies in transistors and diodes are usually good to 125C, so I would not suspect those unless they are carrying enough current to heat themselves the additional 65C above the 60C chamber. Are the units permanently damaged, or do they recover if left to cool down? Don't overlook the PCB, however. It can expand and contract with temperature, affecting traces or solder joints, which could be a problem if the circuit design is sensitive to that change.

In a more general light, I think engineers often overlook specific parameters of discrete parts. (Parts usually aren't as perfect as we are taught to believe in school.) I think analog engineers who know their transistors inside and out are a dying breed, so if a circuit uses a transistor as anything more than a switch, I would suspect that circuit. ICs are really just a bucket of transistors that were made to be easy to use, so I would look at them last.

I guess it's hard to make a blanket statement on what is likely to fail, as there are often many small clues to a particular problem. To complicate matters, it is likely that a combination of two or more factors is causing the problem. Sometimes the clues seem insignificant. Early in my career we had a problem with some displays we were producing: a percentage of them were failing, and I was assigned to find out why. When I took a unit apart, it would function correctly. When I put it back together, it would fail again. I looked for hours for pinched wires and cold solder joints, to no avail. So I sat there and stared at the PCB for a while. As I did, I noticed two small marks on a resistor. I wondered where they came from, because I hadn't scratched anything. After some examination I discovered a screw head that would contact this resistor when the PCB was installed. When I removed the screw, the console worked correctly after assembly.

My only rule of thumb is don't discount a theory, no matter how obvious or ridiculous it may seem. Try to prove it right or wrong by experiment and then move on to the next idea. And one more thing: start with the simple things first. (If you want to know what I mean, read "Ohm's Law Still Works" in my archives.)

Darren

I often see engineers having immense difficulty diagnosing the cause of a problem when a lowly tech can identify the bad part right away. Sometimes a tech will struggle for days and the engineer will take one look at the schematic and say, "There is your problem." Some people have trouble with troubleshooting.

First, let's categorize the different types of problems as well as the different methods of troubleshooting.

Design problem: This is the most common mistake and the easiest to find, as it is generally repeatable and consistent.

Tolerance problem: This is really a design problem, but I give it its own category because it is typically inconsistent and difficult to repeat. Environmental effects commonly aggravate this type of problem.

EMI problem: This can also be difficult to repeat. Who knows when EMI is going to hit? It will often trip up the most competent engineers.

Software problem: So many products today use some type of software or firmware. I have seen software exhibit all of the symptoms above, and I have also seen it used to correct one of the above problems even though the problem was really a hardware issue. It gets its own category for that reason. Here is a philosophical question: if you can fix a hardware problem with software, was it really a software problem in the first place?

So what types of troubleshooting methods are there? I like to group them into two categories.

Scientific method
Do what any good detective would do: look at all the clues you have been given and deduce what the problem might be based on experience and knowledge. Advantage: eventually you will identify the problem. Disadvantage: it takes a lot of patience.

Shotgun method
Take a shot at as many possibilities as you can and hope you get a hit. Sometimes you get lucky and solve the problem fast. But if you aren't careful, you can easily chase yourself in circles.

Scientific Shotgun Method
Would it surprise you if I said that you can solve all of the above problems with a combination, a marriage if you will, of the shotgun method and the scientific method? I will elaborate. When a problem first comes to your attention, sit down and write down all the things you think it might be. Use your intuition as well as your experience in this exercise. Speaking metaphorically, get out the shotgun, take aim, and fire. Now comes the second part. Let the scientific method kick in: figure out a way to evaluate each item on your list to prove or disprove it. Then have at it.
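To make the two steps concrete, here is a minimal Python sketch of the idea, not a tool described in this column. The `Hypothesis` class, the `run_scientific_shotgun` function, the `cost` ranking, and the sample checks are all hypothetical names and values chosen for illustration. The list is the shotgun blast; each `test` callable stands in for an experiment you would actually run on the bench. Every guess is tested, cheapest first, because more than one factor may be at work.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """One guess from the shotgun list (hypothetical structure)."""
    description: str
    cost: int                      # rough effort to test; 1 = trivial
    test: Callable[[], bool]       # returns True if the experiment confirms the guess
    confirmed: Optional[bool] = None


def run_scientific_shotgun(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    """Test every hypothesis, simplest first, and return the confirmed ones."""
    for h in sorted(hypotheses, key=lambda h: h.cost):
        h.confirmed = h.test()     # prove or disprove by experiment
    return [h for h in hypotheses if h.confirmed]


if __name__ == "__main__":
    # The lambdas below are placeholders for real bench experiments.
    checklist = [
        Hypothesis("Input capacitor drifts after burn-in", 3, lambda: False),
        Hypothesis("Old firmware version was flashed", 1, lambda: True),
        Hypothesis("Oscillator frequency shifts with temperature", 2, lambda: False),
    ]
    for h in run_scientific_shotgun(checklist):
        print("Confirmed:", h.description)
```

Sorting by cost is one way to honor "start with the simple things first"; a real checklist might rank by likelihood instead.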

I typically see these results: 7 out of 10 times it was something stupid that the shotgun method caught easily and quickly, such as an old software version in use, a component that wasn't stuffed, or a burned-out fuse. About 2 out of 10 times something more subtle was found that took some trial and error and required new data to be gathered and evaluated until the problem was solved. 1 out of 10 times the solution took longer but was eventually found by repeated applications of both methods, where the shotgun approach opened up new areas of research that scientifically led to the resolution. In aggregate, problems are typically solved quickly, with a minimum of running in circles, when the scientific shotgun approach is used. (Did you ever think you would see those two words together as something meaningful?) This is a real boon in a consumer product world where shipping that new design on time is all-important.

As for Mike's problem, I should have taken my own advice and suggested checking the software version first, since there had been a problem with previous versions. He reported in a later email that they were burning the old version of software. Go figure. I could have used a little shotgun wedding on this one.
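A software version check is cheap enough to automate. The sketch below is only an illustration under assumptions: the column does not describe Mike's production setup, so the function names and the sample byte strings are made up, and it presumes the firmware image is available as raw bytes. It compares a SHA-256 digest of the image about to be programmed against the digest of the released build.

```python
import hashlib


def image_digest(image: bytes) -> str:
    """Return the SHA-256 digest of a firmware image as a hex string."""
    return hashlib.sha256(image).hexdigest()


def is_expected_firmware(image: bytes, expected_digest: str) -> bool:
    """True if the image matches the digest recorded for the released build."""
    return image_digest(image) == expected_digest.lower()


if __name__ == "__main__":
    # Stand-in byte strings; real images would be read from the programmer file.
    released = b"\x01\x02corrected build"
    old = b"\x01\x02previous build"
    expected = image_digest(released)

    print(is_expected_firmware(released, expected))  # True
    print(is_expected_firmware(old, expected))       # False
```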

Copyright © 2003 ChipCenter-QuestLink