Wednesday, February 15, 2012

Wed., Feb. 15th

Arrived at the lab a little early today, at 1:20 pm (my nominal arrival time on my aspirational MTWF schedule is 1:30).  Juan was already here waiting for me, and he is now on his laptop working on some LogicLock-related stuff.  He says he can stay until about 5:00 pm.  I set him up with a user account on COSMICi, and he changed his workgroup to APCR-DRDL (I'm not sure which of these was really necessary) so that he could get access to the shared Q:\ network drive over the wireless, which is working now.  But I told him that he should not actually run Quartus in that folder until we get the versions synchronized.  On that topic:

I was considering today installing the service packs for Quartus 9.1 (SP1 and SP2) on COSMICi (my Dell), with the thought that this shouldn't hurt, and it will then let everyone at least use the latest version of 9.1 and work out of the shared Q:\ drive.  (I think) we can always revert to plain 9.1 later if there are any problems with the service pack version.

Also, I was toying with the idea of trying to see if we can actually build in the latest version of Quartus (11-point-something) to see if this improves the as-compiled performance of the design at all, and/or speeds up our compile times.  However, I'm not sure if our existing Altera licenses will actually work with a newer Quartus version.  I also have to be careful not to overwrite the existing database files etc. when trying this; remember when running Quartus v11 to always work in a new "q11" subfolder of FEDM_code instead of the old "q91" folder.

First, let's tackle the service packs.  Step 1:  Download the service-pack installers from the ftp link Juan provided earlier...  Now downloading "91sp1_quartus_windows.exe" and "91sp2_quartus_windows.exe".  Just for reference, my present version of Quartus, according to the About box, is:  "Quartus II Version 9.1 Build 222 10/21/2009 SJ Full Version".

Now backing up the current Q:\ share contents (C:\shared\FEDM_code\q91) to C:\LOCAL\FEDM_code\backup 2012-02-15...  (Just in case the database update in SP2 corrupts anything...)  Done.

Chrome says the service pack downloads will take about 2 hours, so in the meantime, I'll work on something else...

Let's see, looking back through my recent blog posts, at the end of Monday I had just changed the CTU firmware to stop trying to turn on POSHOLD and TRAIM modes on the GPS (since the server will manage GPS initialization, looking forward), and was going to re-test with that change.

Recompiled the new CTU firmware (in Design Tools for Eclipse).  Now recompiling CTU gelware...

It might be a good idea to eventually create a new shared folder C:\shared\COSMICi_devel\ (as a top-level development directory), and share it with a new drive letter (like "T:\" for "top"), and then place individual development folders under there, not only "FEDM_code" but also "CTU_code", "Server_code", "WiFi_code", and whatever else.  No big hurry to do this, though, since no one besides me is working on anything but the FEDM_code.  Also, everything besides the firmware can be shared through Dropbox instead.  I actually made myself a Dropbox folder "COSMICi_devel" and already put my copies of the various shared Dropbox folders in there.  If later someone wants more folders besides the ones they have, I can share the higher-level folder with them.  It is large, though (1.22 GB at present), so sharing the whole thing with everyone may take up too much of some people's Dropbox quota.

OK, now building the EEPROM programming files for the CTU gelware...  Burning onto board...

Note to self:  We really need to order some more power cables for the DE3 board, so we don't have to manually hold the cable in the proper position while using it.  OK, I emailed Terasic to see if two replacement cables (main & spare) could be sent to us free of charge courtesy of the Altera University Program.

OK, let's now look at the output from my test run:  Looking at the end of node0.uart.trnscr:


----------------------------------------------------------------------
At Wed Feb 15 14:43:33 2012 + 405 ms opened node0.uart.trnscr transcript...


Wed Feb 15 14:43:38 2012 + 198 ms: < 
Wed Feb 15 14:43:38 2012 + 198 ms: < HOST_STARTING,CTU_GPS,1.9
Wed Feb 15 14:43:38 2012 + 200 ms: < HOST_READY
Wed Feb 15 14:43:38 2012 + 200 ms: < $PDME,21,OK*1B
Wed Feb 15 14:43:38 2012 + 201 ms: < $PDME,22,OK*18
Wed Feb 15 14:43:38 2012 + 201 ms: < $ACK,*65
Wed Feb 15 14:43:38 2012 + 202 ms: < $ERR,UNK_CMD,*00
Wed Feb 15 14:43:38 2012 + 202 ms: < $ACK,WIFI_STARTING,v0.19*67
Wed Feb 15 14:43:38 2012 + 203 ms: < $ERR,BAD_CHK,[$GPRMC,212115.016,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W
Wed Feb 15 14:43:38 2012 + 203 ms: < $GPRMC,212116.020,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*70]*3D
Wed Feb 15 14:43:38 2012 + 204 ms: < $ERR,BAD_CHK,[$GPGGA,212116.020,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29*65]*0B
Wed Feb 15 14:43:39 2012 + 658 ms: < $ACK,*65
Wed Feb 15 14:43:39 2012 + 659 ms: < $ERR,UNK_CMD,*00
Wed Feb 15 14:43:39 2012 + 659 ms: < $ACK,WIFI_READY*60
Wed Feb 15 14:46:47 2012 + 486 ms: > exit
Wed Feb 15 14:47:35 2012 + 448 ms: < 
Wed Feb 15 14:47:35 2012 + 448 ms: < HOST_STARTING,CTU_GPS,1.9
Wed Feb 15 14:47:35 2012 + 450 ms: < HOST_READY
Wed Feb 15 14:47:35 2012 + 450 ms: < $ACK,*65
Wed Feb 15 14:47:35 2012 + 451 ms: < $ERR,UNK_CMD,*00
Wed Feb 15 14:47:35 2012 + 451 ms: < $ACK,WIFI_STARTING,v0.19*67
Wed Feb 15 14:47:36 2012 + 893 ms: < $ACK,*65
Wed Feb 15 14:47:36 2012 + 894 ms: < $ERR,UNK_CMD,*00
Wed Feb 15 14:47:36 2012 + 895 ms: < $ACK,WIFI_READY*60
Wed Feb 15 14:47:37 2012 + 143 ms: < $GPRMC,212136.045,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*71
Wed Feb 15 14:47:37 2012 + 143 ms: < $GPGGA,212136.045,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*64
Wed Feb 15 14:47:37 2012 + 144 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 14:47:37 2012 + 145 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A
Wed Feb 15 14:47:38 2012 + 149 ms: < $GPRMC,212137.049,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*7C
Wed Feb 15 14:47:38 2012 + 150 ms: < $GPGGA,212137.049,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*69
Wed Feb 15 14:47:38 2012 + 151 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 14:47:38 2012 + 152 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A
... (more of the same) ...
Wed Feb 15 14:47:50 2012 + 101 ms: < $GPRMC,212149.002,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*7A
Wed Feb 15 14:47:50 2012 + 102 ms: < $GPGGA,212149.002,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*6F
Wed Feb 15 14:47:50 2012 + 103 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 14:47:50 2012 + 104 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A
Wed Feb 15 14:47:50 2012 + 376 ms: > MUTE
Wed Feb 15 14:47:51 2012 + 102 ms: < $GPRMC,212150.005,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*75
Wed Feb 15 14:47:51 2012 + 103 ms: < $GPGGA,212150.005,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*60
Wed Feb 15 14:47:51 2012 + 104 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 14:47:51 2012 + 105 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A
Wed Feb 15 14:47:51 2012 + 306 ms: < $ACK,MUTE*6C
Wed Feb 15 14:48:00 2012 + 916 ms: > exit

In between the two runs here is where I power-cycled the DE3 board after reprogramming the new firmware into EEPROM.  (Even though the Wi-Fi board is power-cycling as well, there isn't a new transcript header because the node 0 model/proxy, and its UART BridgeServer instance, with its transcript-writing message handler function, is designed to persist over multiple reconnections.)


OK, now that we've got rid of the confirmations of the now-unnecessary PDME commands, let's look at the next anomaly.  These are the acknowledgement/error sequences for the blank lines sent by the Wi-Fi, which both look like this (but a second or so apart):


Wed Feb 15 14:47:35 2012 + 450 ms: < $ACK,*65
Wed Feb 15 14:47:35 2012 + 451 ms: < $ERR,UNK_CMD,*00


To get rid of this, we could either modify the Wi-Fi script to not send the blank lines, or else modify the CTU firmware to just ignore the blank lines, instead of acknowledging them and trying to interpret them as command lines.  I think I prefer the latter method.

The next anomaly, of course, is that the GPS wakes up thinking that the current date is 161211 = Dec. 16th, 2011 (about two months ago!!), as can be seen in this line a bit later:


Wed Feb 15 14:47:37 2012 + 143 ms: < $GPRMC,212136.045,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*71


This indicates that (of course, as we knew already) the GPS has not yet acquired time info from any satellites, and any internal battery-backup of its internal clock must not be working either.  Probably this particular date is actually the last date at which the module had acquired a time lock.  Of course, this just underscores the need for the server to initialize the date/time on GPS startup - which hopefully will encourage it to acquire the satellites more quickly (maybe once the date is set, it will realize that its almanac and ephemeris are out of date, and proceed to download new ones - if not, we may need to do a cold-start or warm-start command first to force this).  Anyway, lots of server-side coding work is still needed to support this remote initialization, and also to enable the server to recognize when the date/time reported by the GPS module are wrong (based on the server's time, which is at present NTP-sync'd to about +/- 10 ms accuracy), so that it will realize that such initialization is needed, and automatically apply it.
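As a rough illustration of the "recognize when the GPS date/time is wrong" step, here is a minimal Python sketch that pulls the UTC date and time out of a $GPRMC sentence and compares it against the server's clock.  The function names (gprmc_datetime, gps_clock_is_stale) and the 2-second tolerance are hypothetical choices made for this example, not part of the existing server code.

```python
from datetime import datetime, timedelta, timezone

def gprmc_datetime(sentence):
    """Return the UTC datetime carried by a $GPRMC sentence.

    In the RMC field layout, field 1 is hhmmss.sss and field 9 is ddmmyy.
    """
    body = sentence.strip().lstrip("$").split("*")[0]  # drop '$' and checksum
    fields = body.split(",")
    if fields[0] != "GPRMC":
        raise ValueError("not a GPRMC sentence")
    hhmmss = fields[1].split(".")[0]   # ignore fractional seconds
    ddmmyy = fields[9]
    stamp = datetime.strptime(ddmmyy + hhmmss, "%d%m%y%H%M%S")
    return stamp.replace(tzinfo=timezone.utc)

def gps_clock_is_stale(sentence, server_now, tolerance=timedelta(seconds=2)):
    """True if the GPS-reported time differs from server time by more
    than `tolerance` (an assumed threshold; tune as needed)."""
    return abs(server_now - gprmc_datetime(sentence)) > tolerance

line = "$GPRMC,212136.045,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*71"
print(gprmc_datetime(line))
print(gps_clock_is_stale(line, datetime.now(timezone.utc)))
```

Applied to the sentence above, the check reports the clock as stale, which would be the server's cue to send the date/time initialization.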

First, let's work on the easier fix, for the blank lines.  I implemented a quick fix just for the case where the line really has zero characters (not even whitespace) before the terminating newline.  This should be sufficient for now.  Rebuilt the firmware.  Recompiling the Quartus design now...  Converting programming files...  Getting set up to reburn EEPROM...  Reburned & retested.  Now I got:


Wed Feb 15 16:21:16 2012 + 694 ms: < HOST_STARTING,CTU_GPS,1.9
Wed Feb 15 16:21:16 2012 + 695 ms: < HOST_READY
Wed Feb 15 16:21:16 2012 + 695 ms: < $ACK,WIFI_STARTING,v0.19*67
Wed Feb 15 16:21:16 2012 + 696 ms: < $ERR,BAD_CHK,[$GPRMC,212122.019,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W]*42
Wed Feb 15 16:21:18 2012 + 166 ms: < $ACK,WIFI_READY*60
Wed Feb 15 16:21:18 2012 + 905 ms: < $M55356,4.2,00012,1*
Wed Feb 15 16:21:19 2012 + 109 ms: < $AH[G22.5268,04.2,00,.013,]
Wed Feb 15 16:21:19 2012 + 882 ms: < $GPRMC,212300.041,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*72
Wed Feb 15 16:21:19 2012 + 883 ms: < $GPGGA,212300.041,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*67
Wed Feb 15 16:21:19 2012 + 885 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 16:21:19 2012 + 887 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A
Wed Feb 15 16:21:20 2012 + 886 ms: < $GPRMC,212301.042,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*70
Wed Feb 15 16:21:20 2012 + 886 ms: < $GPGGA,212301.042,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*65
Wed Feb 15 16:21:20 2012 + 888 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 16:21:20 2012 + 889 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A

I dunno WHAT is going on with the "$M55356" and "$AH" lines; I think the serial data is getting garbled.  Perhaps I wasn't holding onto the power connector firmly enough?  Anyway, let's try again...  
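Garbled lines like these could also be flagged automatically on the server side.  The $ACK lines in the transcripts are consistent with the standard NMEA-0183 checksum (the XOR of every character between "$" and "*", written as two hex digits), so a minimal Python sketch of such a check might look like the following.  The function names are hypothetical, and this is not the firmware's actual BAD_CHK logic.

```python
HEX_DIGITS = set("0123456789ABCDEFabcdef")

def nmea_checksum(payload):
    """XOR of the character codes of everything between '$' and '*'."""
    value = 0
    for ch in payload:
        value ^= ord(ch)
    return value

def line_is_intact(line):
    """True if `line` has the form $payload*HH and HH matches the payload."""
    line = line.strip()
    if not line.startswith("$") or "*" not in line:
        return False
    # Split on the LAST '*', since $ERR lines can quote a '*' in brackets.
    payload, _, digits = line[1:].rpartition("*")
    if len(digits) != 2 or not set(digits) <= HEX_DIGITS:
        return False
    return nmea_checksum(payload) == int(digits, 16)

for sample in ("$ACK,WIFI_READY*60",
               "$M55356,4.2,00012,1*",
               "$AH[G22.5268,04.2,00,.013,]"):
    print(sample, "->", "ok" if line_is_intact(sample) else "GARBLED")
```

The first sample passes; the other two fail because one has no checksum digits after the "*" and the other has no "*" at all.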

----------------------------------------------------------------------
At Wed Feb 15 16:26:16 2012 +  92 ms opened node0.uart.trnscr transcript...

Wed Feb 15 16:26:20 2012 + 962 ms: < 
Wed Feb 15 16:26:20 2012 + 963 ms: < HOST_STARTING,CTU_GPS,1.9
Wed Feb 15 16:26:20 2012 + 964 ms: < HOST_READY
Wed Feb 15 16:26:20 2012 + 964 ms: < $ACK,WIFI_STARTING,v0.19*67
Wed Feb 15 16:26:22 2012 + 407 ms: < $ACK,WIFI_READY*60
Wed Feb 15 16:26:23 2012 + 254 ms: < $GPRMC,212143.013,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*70
Wed Feb 15 16:26:23 2012 + 255 ms: < $GPGGA,212143.013,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*65
Wed Feb 15 16:26:23 2012 + 255 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 16:26:23 2012 + 256 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A
Wed Feb 15 16:26:24 2012 + 255 ms: < $GPRMC,212144.015,V,3025.676,N,08417.112,W,0.0,0.0,161211,4.1,W*71
Wed Feb 15 16:26:24 2012 + 256 ms: < $GPGGA,212144.015,3025.67587,N,08417.11218,W,0,00,99.0,051.73,M,-29.7,M,,*64
Wed Feb 15 16:26:24 2012 + 257 ms: < $PDMETRAIM,2,0,0.000000000,0,0,0,0,0,0,0,0,0,0,0,0,0*43
Wed Feb 15 16:26:24 2012 + 258 ms: < $PDMEPOSHOLD,0,0000.000,N,00000.000,E,000.00*4A

OK, that looks better....  No errors, no garbled lines...  Time to work on the server-side code again, I guess.  I've done a lot of coding towards the processing of input lines coming in on the uartServer, but I need to go back and see how far we're actually getting into that code...  Probably the easiest way to do this is to just turn on debug logging on the server, and run another test...  Let's first archive the current log files, to Dropbox/COSMICi_devel/Server Code/data/logs 2012-02-15.

When starting after having been off for a while, I got a lot of garbage...  I think what happened there is:  The OCXO draws a lot of current while it is warming up, so the common +5V supply voltage was sagging, which interfered with communication from the GPS...  I worked around it by supplying the GPS module off of USB temporarily.    We still really need to address our system-wide power-supply issues...  Hope Samad gets his new board built and working soon.

OK, now in the server log, with debugging on, we are seeing lines like the following:

2012-02-15 17:27:43,845 | COSMICi.server.comm  |  Thread-19:    node #0  uart0.con0.rcvr   |       communicator.py: 814:_announce           |    DEBUG: Connection._announce():  Announcing incoming message [HOST_STARTING,CTU_GPS,1.9] to a message handler...

but then nothing interesting happens in response to that...  And similarly for all the subsequent messages received... Each message only goes to 2 message handlers - does that even include the one to process commands?  Unclear.  So I need to pick up where I left off, which is:  We need some debug logging to confirm when an incoming command on the UART is dispatched, and to where, and see what is done in response to it...  The place to start, I think, is in the model code, where we register the message handler that is supposed to interpret incoming messages over the UART.  This is done in the private nested class WiFi_module._UART_MsgHandler.  Let's see, its initializer is getting run.  Ah, but then its actual "handle()" method returns early without doing anything, because at the time I hadn't finished implementing the SensorHost class.  I got a little farther after doing that, so it might be worth commenting out the early-return to see if we can get any farther thru the code yet.  First, let's add a debug() line to handle() to make sure we're even getting in there.  We aren't.  Hm...  Maybe we need to say something more informative than "a message handler" in the logging.  When we create a message handler, we can give it a name.

Added "name" argument to the BaseMessageHandler initializer.  Its default value is "generic."  The name is now printed by the above debug message.  The BridgeMsgHandler class now overrides the default name and sets it to "bridge".  And WiFi_Module._UART_MsgHandler gives it the name "Wi-Fi.UART".  Let's try again...
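For reference, the naming scheme amounts to something like the Python sketch below.  This is a simplified stand-in, not the real server code: the class bodies, the UartMsgHandler class (standing in for the nested Wi-Fi UART handler), and the free-standing announce() function are assumptions made for illustration.  Only the "name" argument, its "generic" default, and the "bridge" and "Wi-Fi.UART" names come from the changes described above.

```python
import logging

logger = logging.getLogger("COSMICi.server.comm")

class BaseMessageHandler:
    """Base handler; subclasses pass a descriptive name for log output."""
    def __init__(self, name="generic"):
        self.name = name

    def handle(self, message):
        pass  # the base class ignores messages

class BridgeMsgHandler(BaseMessageHandler):
    def __init__(self):
        super().__init__(name="bridge")

class UartMsgHandler(BaseMessageHandler):
    """Hypothetical stand-in for the nested Wi-Fi UART message handler."""
    def __init__(self):
        super().__init__(name="Wi-Fi.UART")

    def handle(self, message):
        logger.debug("%s: handling [%s]", self.name, message)

def announce(message, handlers):
    """Dispatch `message` to each handler, logging the handler's name.

    Returns the names in dispatch order so the routing is easy to check.
    """
    notified = []
    for handler in handlers:
        logger.debug("Announcing incoming message [%s] to message handler '%s'...",
                     message, handler.name)
        handler.handle(message)
        notified.append(handler.name)
    return notified

logging.basicConfig(level=logging.DEBUG)
print(announce("HOST_STARTING,CTU_GPS,1.9",
               [BridgeMsgHandler(), UartMsgHandler()]))
```

With the name in the debug line, it is immediately visible whether a message ever reaches the Wi-Fi UART handler or only the bridge.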

OK, that is working now, and we gave default names to all the existing message handler classes.  I've confirmed that we're definitely getting into the handle() method, so I'm ready to proceed.  Let's continue working on that Friday, after the team meeting...

At home this evening:  Burned the latest version of the Wi-Fi script (with appropriate site selection) onto the EZURiO board I have at home, to facilitate testing of my upcoming server changes.  I can just manually type mock text lines from the CTU board into UwTerminal to exercise the new server code as it is written.

Because it's annoying to have to keep recompiling/reburning the script to the Wi-Fi module each time a minor network-related or debug parameter changes, I also started writing a config.txt data file that eventually could be read by a new script module to hopefully avoid having to do so many reburns in the future.  However, before the new module can be included in the script I'd have to make room for it by either taking out some existing module(s) or code, or moving more string literals out to strings.txt.  Sigh.  Anyway, this is a fairly low-priority item since now that the new general-purpose HOST command is included, I shouldn't need to make too many more changes to the Wi-Fi script (cross fingers).

However, the "config.txt" facility REALLY needs to be implemented before this system is released in any form to the public, because the present situation where you have to modify the script code and recompile to change any network settings is really awful.  Not a good solution for installation by unskilled users at all.  Anyway, we'll cross that bridge when we come to it, I guess...
