Needle in the Haystack

Finding the best answer is not always straightforward.

Scientists are not programmers. Repeat that after me: scientists are not programmers. It’s not their fault; it’s just a lack of proper training. If you are implementing some algorithm given to you by a scientist, it’s important to know this and account for it.

Certain algorithms are not direct – most often for some process which is not easily reversible.  For example, I was given the task of implementing a way of finding the Wet-Bulb temperature, given the Dewpoint temperature, the Dry-Bulb temperature, and the Barometric Pressure.  Accompanying this task was some code, written by a scientist, in some form of BASIC.

To accomplish this, they started with an estimate (the DewPoint Temp) and worked forward, using the known equations to convert wet-bulb temp into dewpoint temp, then compared that result to the known dewpoint (Tdew).  If the result was less than the known dewpoint, they added a constant 0.05 degrees to the estimate, and tried again. When the result exceeded the dewpoint, they called it good and returned the latest estimate as the final answer.

Scientists are not programmers. If you ask them about this, they will say that it gets the right answer. If you ask them how they came to choose 0.05 as the step size, after the blank stare (while they think about it), you will get an answer something like “Well, that’s the tolerance I want”. If you really press them, they will come up with “Well, any smaller and it’ll take too long – any larger and it’ll not be correct enough”, which is exactly true. That step size is somebody’s wild guess.

Being the obsessive speed freak that I am, I figured out a better way. What the scientist didn’t realize is that you don’t have to have a constant step size. With a modicum of further effort, you can adjust the step size dynamically, and get to the final answer much more quickly.

Simply start with a relatively large positive step, and do your estimates as before.  Afterward, make a decision – if you haven’t exceeded your target, step again in the same direction.  If you exceeded the target, don’t simply quit and call it good, REDUCE and REVERSE your step size and go again.  Now you’re heading negative. When you go BELOW your target, REDUCE and REVERSE your step size. Repeat this until the absolute value of your step size is below your tolerance.

In certain cases, this will take LONGER, but in the vast majority of cases where a fine tolerance is needed, this will get a more accurate answer in FEWER iterations.

You of course need to check things out and match your particular case. Use an iteration counter. You always want to reduce your step size when you reverse it: a factor of -1 would never converge and a factor near -1 would converge slowly. But a factor near zero, such as -0.01, shrinks the step so abruptly that it can fall below your tolerance before you are actually close to the target. Best to use a factor of -0.1 to -0.4.
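As a rough illustration of the two searches, here is a minimal Python sketch. It is not the original BASIC or LabVIEW code: `forward` is a made-up monotonic stand-in for the real wet-bulb-to-dewpoint equations, and the function names, default step, and default factor are assumptions chosen for the example. The adaptive version stops at the first crossing made with a step no larger than the tolerance, which is one reasonable reading of the stopping rule described above.

```python
def fixed_step_search(forward, target, start, step=0.05, max_iter=100_000):
    """Original approach: creep upward by a constant step until the target is reached."""
    estimate = start
    for evaluations in range(1, max_iter + 1):
        if forward(estimate) >= target:
            return estimate, evaluations
        estimate += step
    raise RuntimeError("no convergence within max_iter evaluations")


def adaptive_step_search(forward, target, start, step=1.0, factor=-0.25,
                         tolerance=0.05, max_iter=1000):
    """Start with a large step; REDUCE and REVERSE it each time the target is passed."""
    estimate = start
    for evaluations in range(1, max_iter + 1):  # the iteration counter
        result = forward(estimate)
        # "Passed" depends on which direction we are currently travelling.
        passed = result >= target if step > 0 else result <= target
        if passed:
            if abs(step) <= tolerance:
                # The last crossing used a step within tolerance: good enough.
                return estimate, evaluations
            step *= factor  # negative factor: smaller step, opposite direction
        estimate += step
    raise RuntimeError("no convergence within max_iter evaluations")


def forward(x):
    """Hypothetical monotonic stand-in for the real forward equations."""
    return x + 0.02 * x * x


target = forward(12.34)  # pretend the true answer is 12.34
fixed_estimate, fixed_count = fixed_step_search(forward, target, start=0.0)
adaptive_estimate, adaptive_count = adaptive_step_search(forward, target, start=0.0)
print(fixed_estimate, fixed_count)
print(adaptive_estimate, adaptive_count)
```

Both functions return the estimate together with the number of forward evaluations, so the two counts can be compared directly for your own equations and starting point.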

I have seen reductions of 100:1 in iteration counts between the original method and this improved search. In cases where it was worse, it was around 15 vs. 10 iterations; in cases where it was better, it was around 30 vs. 500 iterations.

Use your common sense, and don’t hold it against them.  Scientists are not programmers.

Beware Simplicity

Simpler ≠ faster : you still have to know what happens “under the hood”.

If you read the post about en masse operations, you might remember that I pointed out that you should know what is happening behind the scenes. Here is a particular case where what looks like simpler code actually takes longer to execute.  If you don’t take the time to think about what is actually going on, then you might be fooled.

Consider a pair of signals, each around 12000 samples. Regulations state that I am allowed to drop (delete) certain samples from those signals before performing statistical operations on them. The number of points to be dropped might be 2-10%, or up to 1200 of the points. I have the indexes to be dropped in a third array. For graphing purposes, I need to keep the dropped points in separate arrays.

Now every programmer worth his salt has fallen into the trap of deleting elements 3, 5, and 8 from an array: if you try the straightforward way, you find out that after you delete element 3, element 5 is not in the same place it was before! So you either have to delete element 8 BEFORE you delete element 5 and then 3, or you have to delete element (3-0), then element (5-1), and then element (8-2).

Having fallen into that pothole my share of times many years ago, I avoided it this time by doing the reversing trick: My list of points to drop was known to be in ascending order, so I reversed it, and then did the deletions.  Because I needed the deleted points in proper order, I had to reverse those after the deletion.  Here’s the code:

[Image: Deletions with reversal]

That worked fine for some time, but while revisiting this code, it occurred to me that it might be faster to manipulate the index while deleting, and avoid the reversals and speed things up.  Here’s the code:

That’s certainly simpler, right? As one should always do, I applied a Timing Measurement to it. And I was surprised. I created two arrays of 12000 numbers and an array of 1200 random (0..11999) indexes. The simpler way consistently took 5-6% MORE TIME.

But if you stop and think about what’s going on, the reason is clear. Suppose your signal array contains [0, 1, 2, 3, 4, 5] and you want to delete elements [1, 3, 4].

Using method A you reverse the list to get [4, 3, 1].
You delete element 4.  {that moves element 5 down – 1 move}
You delete element 3.  {that moves element 5 down – 1 more move}
You delete element 1. {that moves elements 2, 5 down – 2 more moves}
That’s 4 moves that were made in the shuffling process.

Now consider the “simpler” method:
You delete element [1-0].  {that moves elements 2,3,4,5 down = 4 moves }
You delete element [3-1].  {that moves elements 4,5 down = 2 moves}
You delete element [4-2].  {that moves element 5 down = 1 move }
That’s a total of SEVEN moves that were made.

So even though we eliminated three REVERSAL operations, we actually take LONGER because we are doing more work.  The increased amount of data-shuffling was enough to overcome the benefit of removing the reversals.
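The move counts in that walkthrough can be reproduced with a short Python sketch. This is an illustration only, not the LabVIEW code from the post; the function names are made up, and the index list is assumed to be ascending with no duplicates, as described above.

```python
def delete_descending(signal, drop_indexes):
    """Method A: walk the ascending index list backwards, then reverse the dropped points."""
    kept = list(signal)
    dropped = []
    moves = 0
    for index in reversed(drop_indexes):
        moves += len(kept) - index - 1  # elements that shift down after this deletion
        dropped.append(kept.pop(index))
    dropped.reverse()  # put the dropped points back in their original order
    return kept, dropped, moves


def delete_ascending(signal, drop_indexes):
    """The "simpler" method: delete in ascending order, adjusting each index."""
    kept = list(signal)
    dropped = []
    moves = 0
    for already_deleted, index in enumerate(drop_indexes):
        adjusted = index - already_deleted  # (1-0), (3-1), (4-2), ...
        moves += len(kept) - adjusted - 1
        dropped.append(kept.pop(adjusted))
    return kept, dropped, moves


signal = [0, 1, 2, 3, 4, 5]
print(delete_descending(signal, [1, 3, 4]))  # ([0, 2, 5], [1, 3, 4], 4)
print(delete_ascending(signal, [1, 3, 4]))   # ([0, 2, 5], [1, 3, 4], 7)
```

Both methods leave the same kept and dropped arrays; only the amount of shuffling differs, because deleting from the high end first means each later deletion has fewer elements above it to move.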

This was done using a random list of indexes to drop; I’d bet that there are possible scenarios where this wouldn’t hold true (for example if the points to drop were few, and confined to the end of the signal), but given that neither of those will be true in my case, I’m sticking with the original plan – on average it will be faster.

But don’t assume that fewer operations on the diagram means less work!

Terminator 2: the Sequel

Make sure that quitting time is followed by happy hour.

As mentioned earlier, a compiled LabVIEW application behaves similarly to the development system when terminating.  Namely, it leaves the main window on the screen, waiting for you to close it.  That’s handy in the DevSys, because you usually want to work some more on the program after quitting.

But in an executable, it’s not so good, because the user doesn’t understand why the window hangs around.

The earlier article offered a way to have it both ways by simply detecting whether or not you were running with the main VI from an LLB, or something else, and performing a QUIT LABVIEW if it was something else.

With the advent of LabVIEW 2009, the scheme of detecting whether you were in an LLB or not was broken, because LV2009 started putting VIs into an EXE using the folder structure that they came from.  Before 2009, an EXE was a container for ALL VIs in the program, regardless of their folder structure on disk.  It was like having one folder.

Under that scheme, THIS VI’s PATH would point to one of those VIs, stripping it ONCE would point to the container, and stripping it TWICE would point to the containing folder.

With LV2009 and later, we can no longer use that logic.  What we do instead is to examine the path to MAIN:

  • If we find an “.EXE” in it, then we are in an executable, and we strip twice to get the containing folder
  • else if we find “.LLB” in it, we are in a library, and we strip twice (from the point of the LLB) to get the containing folder
  • else if we find “.VI” in it, we are in a stand-alone VI and strip ONCE to get to the containing folder
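The three cases above can be sketched in Python with plain path manipulation. This is only an illustration of the decision logic, not the contents of the attached VI: the function name, the example paths, and the returned "not in library" flag are assumptions, and the LLB case is interpreted here as "take the folder that holds the LLB".

```python
from pathlib import PureWindowsPath


def root_folder(main_vi_path):
    """Return (containing folder, not_in_library) for the path to the main VI."""
    path = PureWindowsPath(main_vi_path)
    lowered = [part.lower() for part in path.parts]

    if any(part.endswith(".exe") for part in lowered):
        # Executable: strip twice (main VI, then the EXE container).
        return str(path.parent.parent), True

    for position, part in enumerate(lowered):
        if part.endswith(".llb"):
            # Library: keep everything before the LLB itself.
            return str(PureWindowsPath(*path.parts[:position])), False

    if lowered[-1].endswith(".vi"):
        # Stand-alone VI: strip once.
        return str(path.parent), True

    raise ValueError("path does not look like a VI, LLB, or EXE path")


print(root_folder(r"C:\App\MyApp.exe\Main.vi"))
print(root_folder(r"C:\Proj\Tools.llb\Main.vi"))
print(root_folder(r"C:\Proj\Main.vi"))
```

The order of the checks matters: a path inside an executable or a library also ends in “.VI”, so the stand-alone case has to be tested last.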

The attached VI is a replacement for the ROOT FOLDER vi mentioned in the earlier article, and is in LV 2009 format (works in LV2010, too).

Use it when you’re ready to quit – see this snippet:

[Image: QUIT if not in LLB]

Click to download the Root Folder.vi.

By using the NOT IN LIBRARY signal as an input for QUIT LABVIEW, you can run the same code unmodified in an app, or in the DevSys, and it does the right thing either way.

Virtual Devices

When you don’t have the DAQ hardware you need…

Any version of NI-DAQ and the Measurement and Automation Explorer (MAX) released recently has provisions for “simulated” devices. You choose which devices you want, and then NI-DAQ will pretend those devices are actually installed on your system; any calls to DAQ functions concerning that device will succeed (or fail) just as if a real device were installed.

This lets you simulate a client’s setup without having their hardware shipped to you and do most (if not all) of the programming on your own terms without being at their site. The data produced is, of course, simulated data. For an analog input channel it’s a sine wave; for a digital port, it’s a counting pattern. It’s enough for you to tell if your software is working correctly with NI-DAQ.

With their hardware simulated on your machine, you can handle the basic communication part to get data in and out. Then you can install conditional-compilation pieces to substitute more realistic data for your particular situation if you need to.

You can be reasonably confident that the DAQ part of a program you develop this way will work on the real hardware, the same as it did on your simulated hardware.  Of course, for any extreme cases (high sample rate, high channel count), the simulation will be less exact, but it’s a useful feature to develop faster with fewer headaches.

What time is it, again?

The TIMESTAMP indicator is smart enough to get you into trouble.

Just ran into what at first appeared to be a bug, but turned out to be proper, if misunderstood, behavior.

I have a project which records data files. When the actual recording starts, and again when it stops, I remember the time (by using a GET DATE/TIME in SECONDS function) in a TIMESTAMP variable, which is stored in the data file. There might or might not be some calibration activity after the recording has stopped. When the DONE button is finally clicked, I record the current time using a FORMAT DATE and TIME STRING function, into a separate string field called “TEST TIME”.

I have a viewer which examines the data files and reports various info about them. The indicator that shows the TEST TIME is on a different window from the one that shows the START TIME and END TIME.

If no CAL operations are done, then the TEST TIME has always been just a few seconds later than the STOP time (enough time to react to the test being done and click the DONE button); otherwise it could be a few minutes later.

However, I recently noticed, on a data file sent to me by my client, that the TEST TIME was almost an hour EARLIER than the STOP time. How could that be? Further rummaging thru other files that I had from him showed the same thing: the TEST TIME was just short of an hour EARLIER than the STOP time. If there were CAL operations done, this difference was 45-55 minutes; if not, it was a few seconds short of an hour.

I have run thousands of tests on my machine without noticing this; my client has also run nearly a thousand, and has never brought it up. Why is that?

Continue reading “What time is it, again?” »

Beating the Jitters

A shortcut to determinism in real-time applications

Determinism in software is the ability to ensure that any and all paths taken through the code take a consistent amount of time to execute.  Most desktop applications have no interest in this consistency because A) it doesn’t affect anything, and B) the existence of interrupts and preemptive multitasking means that you cannot do anything about it anyway.

In an embedded control application, or one running under a real-time OS, however, determinism is often important because it is critical that the control outputs be changed at a consistent time with respect to the control input sampling time. On the input side, when a signal is changing at a relatively rapid rate, any error in the TIME of measurement is just as destructive to the measurement accuracy as an error in AMPLITUDE. For applications such as PID loops, this is even more important, since derivative terms are adversely affected by timing inaccuracies.

Equally important, but not always equally recognized, is the fact that the timing of the output samples has exactly the same effect.  If your control output is not consistently timed, then loop stability is compromised.
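To put a rough number on the idea that a timing error acts like an amplitude error, consider a sine wave: its steepest slope is 2πfA, so a sample taken Δt early or late can be off by at most 2πfA·Δt. The following Python sketch just evaluates that bound; the function name and the example figures are illustrative and do not come from any particular system.

```python
import math


def worst_case_timing_error(amplitude, frequency_hz, jitter_s):
    """Upper bound on the amplitude error from sampling a sine wave
    jitter_s seconds early or late (maximum slope times the time error)."""
    max_slope = 2 * math.pi * frequency_hz * amplitude
    return max_slope * jitter_s


# A 10 Hz sine of amplitude 10 (volts, say) sampled with 1 ms of jitter:
error = worst_case_timing_error(amplitude=10.0, frequency_hz=10.0, jitter_s=0.001)
print(round(error, 3))  # 0.628
```

In this made-up case, one millisecond of jitter can cost roughly 6% of the signal's amplitude, which is far more than the amplitude error of most converters.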

Continue reading “Beating the Jitters” »

Writing Non-Fragile Code

Oooops…. who broke it?

“Fragile” code is code that breaks in one place because of changes you make in some other place. It’s most aggravating when you’re due to ship a new version tomorrow and you need to make one last tweak at 11:30 PM, or your client is looking over your shoulder and this little “harmless” change shows up as a smoldering heap during the demo.

In this case, “break” doesn’t ONLY mean “broken arrow”, or uncompilable code (at least you can chase those down easily enough). Here, “break” also means “operates incorrectly” or “completely wrecks itself like it never did before” or somewhere in between.

These sorts of breaks come from unrecognized dependencies, and they’re all too easy to make: the header size has been 3 for months and months now, so when you add a new function that needs it, it’s easy to stick in a constant 3 and be done with it.

DON’T DO IT.
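One common way to avoid this kind of hidden dependency is to define the value once and have every user refer to that single definition. This is a general practice, not necessarily the remedy the full article goes on to describe. The Python sketch below is a text-language illustration only; `HEADER_SIZE` and both functions are hypothetical names.

```python
HEADER_SIZE = 3  # the one and only place the header size is defined (hypothetical)


def split_record(record):
    """Separate the header from the payload using the shared definition."""
    return record[:HEADER_SIZE], record[HEADER_SIZE:]


def payload_length(record):
    """A later addition that also needs the header size: it reuses the
    shared definition instead of a second, hard-coded 3."""
    return len(record) - HEADER_SIZE


print(split_record([1, 2, 3, 4, 5]))    # ([1, 2, 3], [4, 5])
print(payload_length([1, 2, 3, 4, 5]))  # 2
```

If the header ever grows, only the one definition changes, and both functions follow along.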

Continue reading “Writing Non-Fragile Code” »

Time Alignment of Signals

A picture is worth 1024 words

I discussed the idea of a time-alignment scheme in the article Delays, Delays, Delays. The idea is that signals which have a mechanical delay of some kind (gas transport time, for example) can be time-aligned with signals that have a lesser or no such delay by means of a “logical” delay line inserted into all channels. If the sum of the physical + logical delays is the same constant for every channel, then the final output is time-aligned.

Here is a test of that concept, where three signals are generated, with different phases, and then fed thru that process. The outputs, as you can see, have been aligned.

[Image: TimeAlignment]

You have to know the physical delay time, however; it does not guess. You also have to account for the fact that some data is lost at the beginning. If that’s of concern, then start your recording early enough to account for it.
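A minimal Python sketch of the delay-line idea follows, under some simplifying assumptions: delays are whole numbers of samples, the constant total is taken to be the largest physical delay, and the signals are already-recorded lists rather than a live stream. The function name is hypothetical.

```python
def time_align(signals, physical_delays):
    """Delay each channel so that physical + logical delay is the same for all.

    signals          list of equal-rate sample lists, one per channel
    physical_delays  known physical delay of each channel, in samples
    """
    total_delay = max(physical_delays)
    logical_delays = [total_delay - delay for delay in physical_delays]
    lost = max(logical_delays)  # samples with no delayed data yet, dropped at the start
    length = min(len(signal) for signal in signals)
    aligned = []
    for signal, logical in zip(signals, logical_delays):
        # Output sample n is the input sample from `logical` samples earlier.
        aligned.append([signal[n - logical] for n in range(lost, length)])
    return aligned


def pulse(at, length=20):
    """Test signal: all zeros except a single 1 at index `at`."""
    return [1 if n == at else 0 for n in range(length)]


# One event, seen 0, 2, and 5 samples late by three channels.
channels = [pulse(10), pulse(12), pulse(15)]
aligned = time_align(channels, physical_delays=[0, 2, 5])
print([channel.index(1) for channel in aligned])  # [10, 10, 10]
```

The outputs are shorter than the inputs by the largest logical delay, which is the data lost at the beginning mentioned above.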

Even with those caveats, this scheme allows you to shift signals to account for mechanical delays, EVEN IN REAL TIME, if you need to.

Watch your step

But who’s watching the watchers?

Some development environments have a concept called “watching”, where you choose a variable to watch and you see a continuous display of that variable in some window.  This is very useful during debugging, as you can step through your program and find out where this variable is being changed.

LabVIEW has no such built-in feature, but it doesn’t really need one.  You can construct your own watch windows, have them run independently of your main code and accomplish the same thing.

Simply make a new VI with a WHILE loop and a STOP button.  Add a WAIT for 200 mSec (or something) inside it (so you don’t hog the CPU).  Each time thru the loop, grab your watch variable, process it, and display it.

The “processing” can be unbundling a single item from a complicated cluster, or picking an element out of an array, or anything you need to display the item in question.  Perhaps you need to call a VI to get it. Perhaps you need to query an I/O port, or a TCP instrument. Whatever you need to do to watch your troublesome variable.
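For readers who think in text languages, the structure of such a watch window is roughly the following Python sketch. It only mirrors the shape of the loop described above (grab, process, display, wait, check the STOP button); all four callbacks and the function name are hypothetical stand-ins for whatever your own watch VI does.

```python
import time


def watch(read_value, process, display, should_stop, interval_s=0.2):
    """Poll a value until told to stop, like a WHILE loop with a STOP button."""
    while not should_stop():
        display(process(read_value()))  # grab it, process it, display it
        time.sleep(interval_s)          # wait so the loop does not hog the CPU


# Example: watch one field of a "cluster" (a dict here) for three readings.
readings = iter([{"temp": 20.5}, {"temp": 20.7}, {"temp": 21.0}])
shown = []
watch(
    read_value=lambda: next(readings),
    process=lambda cluster: cluster["temp"],  # unbundle a single item
    display=shown.append,
    should_stop=lambda: len(shown) >= 3,      # stands in for the STOP button
    interval_s=0.0,
)
print(shown)  # [20.5, 20.7, 21.0]
```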

SUGGESTION:  When you’re done with it, save it in a folder called “Miscellaneous Stuff” or something, so you can get at it easy next time.  There will be a next time.
