Needle in the Haystack

Finding the best answer is not always straightforward.

Scientists are not programmers. Repeat that after me: scientists are not programmers. It’s not their fault; it’s just a lack of proper training. If you are implementing some algorithm given to you by a scientist, it’s important to know this and account for it.

Certain algorithms are not direct – most often because the underlying process is not easily reversible. For example, I was given the task of implementing a way of finding the Wet-Bulb temperature, given the Dewpoint temperature, the Dry-Bulb temperature, and the Barometric Pressure. Accompanying this task was some code, written by a scientist, in some form of BASIC.

To accomplish this, they started with an estimate of the wet-bulb temperature (using the dewpoint temperature as the starting guess) and worked forward, using the known equations to convert wet-bulb temp into dewpoint temp, then compared that result to the known dewpoint (Tdew). If the result was less than the known dewpoint, they added a constant 0.05 degrees to the estimate and tried again. When the result exceeded the dewpoint, they called it good and returned the latest estimate as the final answer.
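To make that concrete, here is a minimal sketch of the fixed-step loop as I understand it. It’s Python rather than the scientist’s original BASIC, and forward() is a hypothetical stand-in for the known equations that turn a wet-bulb estimate into a dewpoint (with the dry-bulb temperature and pressure already baked in):

    def wetbulb_fixed_step(forward, t_dew, step=0.05):
        """Walk forward from the dewpoint in constant 0.05-degree steps.

        forward(t_wb) -- stand-in for the known equations: converts a
        wet-bulb estimate into the dewpoint it implies.
        """
        estimate = t_dew                      # start the guess at the known dewpoint
        while forward(estimate) < t_dew:      # result still below the known dewpoint
            estimate += step                  # constant step: somebody's wild guess
        return estimate                       # first estimate whose result reaches t_dew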

Scientists are not programmers. If you ask them about this, they will say that it gets the right answer. If you ask them how they came to choose 0.05 as the step size, after the blank stare (while they think about it), you will get an answer something like “Well, that’s the tolerance I want”. If you really press them, they will come up with “Well, any smaller and it’ll take too long – any larger and it’ll not be correct enough”, which is exactly true. That step size is somebody’s wild guess.

Being the obsessive speed freak that I am, I figured there was a better way. What the scientist didn’t realize is that you don’t have to use a constant step size. With a modicum of further effort, you can adjust the step size dynamically and get to the final answer much more quickly.

Simply start with a relatively large positive step, and do your estimates as before. After each estimate, make a decision – if you haven’t exceeded your target, step again in the same direction. If you have exceeded the target, don’t simply quit and call it good; REDUCE and REVERSE your step size and go again. Now you’re heading negative. When you go BELOW your target, REDUCE and REVERSE your step size again. Repeat this until the absolute value of your step size is below your tolerance.
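Here is a sketch of that reduce-and-reverse search, under the same assumptions as the sketch above: forward() stands in for the known equations and is assumed to increase as the estimate increases. The default step, factor, and tolerance are just plausible starting values, not magic constants:

    def wetbulb_dynamic_step(forward, target, start, step=1.0, factor=-0.25,
                             tolerance=0.05, max_iterations=1000):
        """Step toward the target, reducing and reversing on every overshoot.

        factor    -- reduce-and-reverse multiplier, somewhere in -0.1 to -0.4
        tolerance -- stop once the step size has shrunk below this value
        """
        estimate = start
        for _ in range(max_iterations):       # iteration counter as a safety net
            if abs(step) < tolerance:
                return estimate               # the step is finer than we care about
            value = forward(estimate)
            # We overshot if we passed the target in the direction we were moving.
            if (value > target) == (step > 0):
                step *= factor                # REDUCE and REVERSE
            estimate += step
        raise RuntimeError("did not converge within max_iterations")

Passing forward() in as a callable keeps the routine generic (it works for any monotonic, hard-to-invert calculation), and the max_iterations cap keeps a bad factor or tolerance from looping forever.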

In certain cases, this will take LONGER, but in the vast majority of cases where a fine tolerance is needed, this will get a more accurate answer in FEWER iterations.

You of course need to test this against your particular case, and use an iteration counter as a safety cap. You always want to reduce your step size when you reverse it: a factor of -1 would never converge, and a factor near -1 would converge slowly. But a factor of -0.01 shrinks the step so fast that it drops below your tolerance while the estimate is still far from the answer, effectively loosening your tolerance. Best to use a factor of -0.1 to -0.4.

I have seen reductions of 100:1 in iteration counts between the original method and this improved search. In cases where it was worse, it was around 15 vs. 10 iterations; in cases where it was better, it was around 30 vs. 500 iterations.

Use your common sense, and don’t hold it against them.  Scientists are not programmers.

2 Responses to “Needle in the Haystack”

  1. Adam says:

    Have you thought of trying to do the calculation using a binary search algorithm instead?

    The method you’re describing above sounds very similar to the original implementation of an algorithm I made for calculating the 1dB compression point in an RF test. It had a dynamic step size that reduced every time it overshot the desired value (within given tolerances).

    Then I read about how some Agilent guys did it using a binary search. They were talking about pure code implementations of RF measurements, which makes tests more portable because not every instrument on a test station will have the same fancy options!

    I rewrote my algorithm using a binary search approach, and it came out much simpler and cleaner from a coding standpoint. Unfortunately it was roughly the same speed (on average), but at least it was far less dependent on carefully choosing the initial step size and reduction factor, which made it easier for other people to use.

    I thought you might be interested in trying the alternate approach out… from one speed freak to another ;) I’d be curious to see how it compared for you, performance wise.

  2. Steve says:

    I finally got around to thinking about your question.

    I suppose the binary search algorithm would choose an initial large step, and then step until you exceed the target.

    At that point, instead of reversing and reducing, you would simply reject the last trial, start from the previous trial and reduce the step by HALF. You would step either one or two steps before exceeding the target again, and then reject the latest and reduce the step by HALF.

    The downside that I see is that you (could) end up calculating the same point again and again.

    If an input of 6 produces a value of 10 and an input of 7 produces a value of 11 (which exceeds the target), you back off and try 6.5, which produces a value of 10.7. But if you add another 0.5, you’re back at 7, recomputing the same point that gave you 11.

    You must mean something different from that.

    I think that if you use a step modifier of -0.5 in my original plan, then you in fact have a binary search algorithm. Any lower and you get speed at the expense of accuracy, any higher and you get accuracy at the expense of speed.
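    For what it’s worth, here is a rough sketch of the bracket-then-bisect approach as I understand Adam’s suggestion (Python, with forward() standing in for the measurement or conversion and assumed to increase with its input; it’s an illustration, not anyone’s production code). Because each bisection evaluates the midpoint of the current bracket, it never recomputes a point it has already tried:

        def binary_search_for_target(forward, target, start, step=1.0,
                                     tolerance=0.05, max_iterations=1000):
            """Walk forward until the target is bracketed, then bisect."""
            low = start
            for _ in range(max_iterations):
                high = low + step
                if forward(high) > target:
                    break                     # the target now lies between low and high
                low = high
            else:
                raise RuntimeError("never exceeded the target")

            while (high - low) > tolerance:   # shrink the bracket by halves
                mid = (low + high) / 2.0
                if forward(mid) > target:
                    high = mid
                else:
                    low = mid
            return low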
