The Cosmic Inquirer: Wi-Finally

Yesterday I got the basic two-way Wi-Fi communication between the board and the server working properly, and now I am just making some minor tweaks, thinking about the startup process, etc. Some points:

I need to test the WIFI_READY message, from the Wi-Fi board to the FEDM. I already verified yesterday that the FEDM_READY message gets propagated correctly, if the Wi-Fi board is already up when the FEDM is started. However, in this scenario, the FEDM won't see the WIFI_READY message, and the Wi-Fi board won't see any messages from the FEDM, unless relayed back by the server (which could certainly be done). So anyway, we need to think more carefully about the turn-on sequence, and what the FEDM will do with its data if/when the server and Wi-Fi board are not yet up and running and ready to receive it. We have enough RAM to buffer up a limited amount of data, several hundred pulses' worth I think (more if compressed), but of course data may still get lost if the server is not up yet. Conceivably, after turning on, the FEDM should always first verify that the server is up and running before it even begins data collection. And the server should perhaps acknowledge receipt of data, and the FEDM could intelligently choke off the data stream if the server fails to respond in time. But, this may all be overkill for what is anyway an abnormal network failure scenario. Anyway, it perhaps deserves some more thought.
At some point, we should perhaps actually power the Wi-Fi board directly from the FEDM board's power supply, so they will turn on at the same time when the FEDM is powered up. Need to look at the voltages/currents, and verify how to do this. At one point, we were thinking of powering the Wi-Fi board through the existing serial cable (since a pin is allocated for that purpose); this needs to be revisited. If not, we can wire up a separate connection.
I briefly considered whether the Wi-Fi board should actually be configured and programmed by the FEDM firmware on startup, rather than independently. (The FEDM could be programmed to start at the faster baud rate that is first expected by the Wi-Fi board on turn-on.) However, there is probably not enough ROM capacity on the FEDM to hold the required files. Also, this would slow down the startup sequence significantly. It probably makes more sense to just assume that the Wi-Fi board will retain its configuration information. This is usually the case, although we might very occasionally see situations where the network goes down temporarily, causing the script to get hung up (unfortunately there was not enough script space to program more robust recovery from network failures). When the script hangs, it must be manually reset or power-cycled. If the FEDM controls its power, possibly the FEDM could cause the reset. However, sometimes on reset, the programming is lost. This is why giving the FEDM the intelligence to reprogram the Wi-Fi board might be desirable. However, it is probably not feasible.
I should probably go ahead soon and write the code to stream data from the FEDM to the server. It would be nice if we had a real multithreaded environment, so that we could just do this in a dedicated thread. However, in the meantime, we could just do it in the main loop. We need some kind of concurrency/locking primitive to safely access a shared data structure, e.g., a circular buffer for pulse data. Actually, one way to accomplish this without an atomic lock operation would be to just have separate read and write pointers into the circular buffer. The writer thread (ISR for the input capture device) updates the write pointer, but conservatively, always making sure the read pointer is far enough ahead to make room for the new data; and only updates the write pointer when it is finished writing the new data. Similarly, the reader thread (main loop) updates the read pointer, but conservatively, always making sure that the write pointer is ahead so that there is data available to read before reading it, and only updating the read pointer after it is finished reading the data.
We should add a command to soft-reboot the FEDM. There must be a HAL macro or function for this, right? Or maybe a newlib routine?
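The separate read/write pointer scheme from the points above can be sketched roughly as follows. This is only a minimal illustration, not the actual firmware: the names PulseRecord, PulseBuffer, BUF_SLOTS, and the pulse_buffer_* functions are hypothetical, and the 100-byte record size is an assumption. It also assumes exactly one writer (the ISR) and one reader (the main loop) on a single in-order CPU where an aligned 32-bit store is atomic; other platforms would need memory barriers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PULSE_BYTES 100u  /* assumed record size */
#define BUF_SLOTS   8u    /* hypothetical capacity; one slot always stays empty */

typedef struct { uint8_t bytes[PULSE_BYTES]; } PulseRecord;  /* placeholder payload */

typedef struct {
    PulseRecord slots[BUF_SLOTS];
    volatile uint32_t write_idx;  /* advanced only by the writer (ISR) */
    volatile uint32_t read_idx;   /* advanced only by the reader (main loop) */
} PulseBuffer;

static void pulse_buffer_init(PulseBuffer *b)
{
    assert(b != NULL);
    b->write_idx = 0;
    b->read_idx = 0;
}

/* Writer side. Returns 1 on success, 0 if the buffer is full (pulse dropped). */
static int pulse_buffer_put(PulseBuffer *b, const PulseRecord *rec)
{
    uint32_t w = b->write_idx;
    uint32_t next = (w + 1u) % BUF_SLOTS;
    if (next == b->read_idx) {
        return 0;              /* conservative: never overrun the reader */
    }
    b->slots[w] = *rec;        /* finish writing the data first... */
    b->write_idx = next;       /* ...and only then publish it */
    return 1;
}

/* Reader side. Returns 1 if a record was copied to *out, 0 if empty. */
static int pulse_buffer_get(PulseBuffer *b, PulseRecord *out)
{
    uint32_t r = b->read_idx;
    if (r == b->write_idx) {
        return 0;              /* nothing available yet */
    }
    *out = b->slots[r];        /* finish reading the data first... */
    b->read_idx = (r + 1u) % BUF_SLOTS;  /* ...and only then free the slot */
    return 1;
}
```

Because each index is written by only one side, no lock is needed; the cost is that one slot is always left unused so that "full" and "empty" can be told apart.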

Verified that the WIFI_READY message is now successfully received by the FEDM when the FEDM is booted before the Wi-Fi module.

Now getting ready to write the code for the data buffer. Before doing this, I want to free up the maximum amount of RAM, so that we can make the buffer as large as possible. (However, it might be a good idea to write the buffer's initialization routine in such a way that it automatically allocates the maximum-sized block of memory that will fit, and reports its size... That way it will automatically adapt as we make other aspects of the code more complex.)
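One possible shape for such a self-sizing initialization routine is sketched below. The function name alloc_largest and its parameters are hypothetical, and the sketch assumes that malloc() returns NULL when the heap is exhausted, which would need to be confirmed for the actual C library and heap configuration in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Try to allocate room for as many records as possible, starting at
 * max_records and stepping down by one record until malloc() succeeds.
 * On return, *got_records holds the number of records that fit
 * (0 if nothing could be allocated, in which case NULL is returned). */
static void *alloc_largest(size_t record_size, size_t max_records,
                           size_t *got_records)
{
    assert(record_size > 0 && got_records != NULL);
    assert(max_records <= SIZE_MAX / record_size);  /* no size overflow */

    for (size_t n = max_records; n > 0; n--) {
        void *p = malloc(n * record_size);
        if (p != NULL) {
            *got_records = n;   /* report the size actually obtained */
            return p;
        }
    }
    *got_records = 0;
    return NULL;
}
```

A caller would pass an optimistic upper bound for max_records and then log the value returned in got_records at startup, so the buffer capacity stays visible as the rest of the firmware grows.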

Relocating extra_RAM to just above program_ROM, to see if this will allow us to put the .bss section there; the point of which is to free up more space in working_memory. (BSS, for "Block Started by Symbol," is the historical name for the space holding statically-allocated writable data that is preinitialized to zero.)

Memory map is currently:

0x00000-0x0ffff: program_ROM (64K) - .text, .rodata
0x10000-0x117ff: extra_RAM (6K) - .bss
0x12800-0x13097: (misc. memory-mapped hardware device registers)
0x20000-0x2ffff: working_memory (64K) - .rwdata, .heap, .stack

System generation was successful; now regenerating the BSP... Having some difficulty... I think some paths got screwed up when I moved from Q:\ to R:\, at C:\LOCAL\FEDM_code\q91 (which I did yesterday so as not to interfere with Darryl who was working in Q:\). Creating a new workspace and new projects from scratch, in R:\eclipse_workspaces\mpf_workspace and R:\software respectively. Moved all the old workspace/project folders into the old_software subfolder.

OK, apparently .bss can't be located at 0x10000 because it can't be referenced as a 16-bit offset from the global pointer which is near the top of working memory. So, let's relocate extra_RAM to ABOVE working_memory. New memory map is as follows:

0x00000-0x0ffff: program_ROM (64K) - .text, .rodata
0x10000-0x1ffff: working_memory (64K) - .rwdata, .heap, .stack
0x20000-0x217ff: extra_RAM (6K) - .bss
0x22800-0x23097: (misc. memory-mapped hardware device registers)
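As a sanity check on the address arithmetic in the map above, the regions can be written down as a table and their sizes and overlaps computed. This is only an illustrative sketch; the Region type and helper names are hypothetical, and the address ranges are copied from the map.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t first;  /* first byte address of the region */
    uint32_t last;   /* last byte address of the region (inclusive) */
} Region;

static const Region memory_map[] = {
    { "program_ROM",    0x00000u, 0x0ffffu },
    { "working_memory", 0x10000u, 0x1ffffu },
    { "extra_RAM",      0x20000u, 0x217ffu },
    { "device_regs",    0x22800u, 0x23097u },
};
#define NUM_REGIONS (sizeof memory_map / sizeof memory_map[0])

/* Size of a region in bytes (both end addresses are inclusive). */
static uint32_t region_bytes(const Region *r)
{
    assert(r->last >= r->first);
    return r->last - r->first + 1u;
}

/* Nonzero if the two inclusive address ranges share any byte. */
static int regions_overlap(const Region *a, const Region *b)
{
    return a->first <= b->last && b->first <= a->last;
}
```

With these ranges, program_ROM and working_memory each come out to 65536 bytes (64K) and extra_RAM to 6144 bytes (6K), matching the labels in the map.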

Regenerating SOPC system, reconfiguring/rebuilding BSP, make clean... Damn, the .bss is 1148 bytes too large to fit in the extra_RAM. Maybe I can make the extra_RAM 2K larger? It would take about 28 more M512s, I calculate, and we supposedly have about 49 left, so theoretically it should fit. Now recompiling Quartus design with 8K extra_RAM (from 0x20000 - 0x21fff).

Incidentally, I should maybe try increasing the working_memory from 64K to somewhat larger (72K), by using a little bit more of the M-RAM, which might be possible if we go from a 32-bit memory word size to a 16-bit word size. I'm not sure SOPC Builder will let me do this without complaining though... Also, even if possible, changing the word size might slow down the code and/or increase the code size significantly. (Well, it might not increase the code size, if the bus handles the transaction width translation transparently from the perspective of the CPU.)

Alright, the fitter completed successfully, so now we are good. Glad to be finally making good use of the M512 memory blocks on the device... Every kilobyte counts! As soon as the Quartus build finishes, I'll retest the system with the memory rearrangements.

Hm, weird, looks like I spoke too soon... The fitter completed, but the assembler appears to have hung. I've never seen that happen before. Killing Quartus and restarting the assembler. Worked fine that time; weird.

Now I'm getting what looks like a stack overflow, even though we have plenty of memory now! What's going on? Doing another "make clean" and recompiling everything from scratch...

All I can think of that's changed is that the BSS area was moved. I may have to move it back...

OK, the Wi-Fi script crashed at some point... Maybe that was the problem... Although I don't exactly understand how that could really have messed up the UART driver at our end badly enough to crash our firmware...

Aha, at some point during my attempted restarts, the Wi-Fi lost the script... Sigh... Reloading it...

I also noticed earlier that something I did (unplugging a cable?) triggered the GPIO interrupt, which caused the script to exit.

Looks like the CPU got wedged to where I can't even reload the firmware; re-programming design... Now reloading/restarting firmware... Still nuthin'... Oh, system IDs aren't matching for some reason... Tell it to ignore that...

Argh, the GPIO crashed the Wi-Fi again... I think I need to disable that particular timer-driven input polling "thread..."

OK, now I am having this really weird thing, where I can communicate just fine over serial between the FEDM <-> PC, and between EZURIO <-> PC, but not between FEDM <-> EZURIO. In the latter case, a null-modem adapter and a gender-bender are inserted. This was working fine yesterday, but isn't today!

Oh, and to make matters weirder, communication works one way and not the other!

On the Wi-Fi board, I added the jumper on J10, between pins 1 and 2, which is supposed to ground the /EN pin of the level-shifter, making sure it is enabled, then reset the board. This seems to have fixed the problem. Not sure why this jumper was missing - I guess because I was manually tying its output to VCC earlier, to disable the level-shifter for purposes of communicating with the DE3 board. Anyway, I guess I was just lucky before that it worked for a while even with the jumper missing (enable pin was floating).

Now, on to the next problem. We have lost the ability to capture pulses ever since I put the new project together (and increased the size of extra_RAM and moved the BSS section in there). Possibly Darryl's changes (adding a reset control throughout the input-capture datapath) will help once they are working, but pulse capture was working fine before, so it's troubling that it has suddenly stopped.

The problem went away after I recompiled the design. Also, when we turn the function generator off and on, it seems to sometimes hose the state of the FPGA; so, let's remember to reload the FPGA after any such changes.

Finally, I can get back to my earlier task: Writing the circular buffer for pulse data.

Let's first calculate the size of the pulse-form data structure. A time_val is 64-bits, i.e., 8 bytes. Thus a RiseFall is 16 bytes. Thus, 96 bytes are needed for the array of 6 rise/fall pairs. Plus 4 more bytes for the integer "nlevels," gives us 100 bytes all together.
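The size calculation above can be written out as a sketch of the data structure. The names time_val, RiseFall, and nlevels come from the text; the struct name PulseForm, the field names rise, fall, and levels, the constant MAX_LEVELS, and the member order are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS 6  /* six rise/fall pairs, per the calculation above */

typedef uint64_t time_val;           /* 64 bits = 8 bytes */

typedef struct {
    time_val rise;                   /* hypothetical field name */
    time_val fall;                   /* hypothetical field name */
} RiseFall;                          /* 2 * 8 = 16 bytes */

static_assert(sizeof(RiseFall) == 16, "RiseFall should be 16 bytes");

typedef struct {
    int32_t  nlevels;                /* 4 bytes */
    RiseFall levels[MAX_LEVELS];     /* 6 * 16 = 96 bytes */
} PulseForm;

/* Sum of the member sizes, ignoring any padding the compiler adds. */
static size_t pulse_form_payload_bytes(void)
{
    return sizeof(int32_t) + MAX_LEVELS * sizeof(RiseFall);
}
```

The payload works out to 100 bytes, but sizeof(PulseForm) can be larger: on an ABI that aligns 64-bit integers to 8 bytes, the compiler pads the struct to 104 bytes. It is worth checking sizeof on the actual target before sizing the buffer.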

Wrote the buffer.h header file; will write the C code tomorrow.


Thursday, August 11, 2011
