Wednesday, August 3, 2011

Blowing my Stack

I suspect we are having stack overflows, so a priority for today is to find a compiler flag that will turn on dynamic stack overflow checking, so we can verify whether this is a problem.  If so, then we'll have to seek another way to reduce memory usage.  Probably, we should start by removing unnecessary features that add a lot of complexity, such as the dual serial line input.  (So sad, since I already had working code for that left over from my other app...)

First, let me check to see whether I got paid...  Yes I did, thank goodness.

Looking at stack-related gcc options:

-fconserve-stack         Tries to minimize stack usage.
-fstack-check              Generates code to dynamically check for stack overflows.

OK then, let's recompile the firmware with both of those options turned on...
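As a cross-check alongside -fstack-check, a common technique on small embedded targets is "stack painting": fill the stack region with a known byte pattern at startup, then later count how much of the pattern is still intact. This is a different technique from the compiler-generated checks, and the sketch below is only a minimal illustration. The names `stack_paint`, `stack_unused`, and the 0xA5 pattern are hypothetical, and it assumes a stack that grows downward, so the untouched bytes sit at the low-address end of the region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT 0xA5u  /* arbitrary fill pattern (assumption) */

/* Fill a stack region with the pattern before the code under test runs. */
void stack_paint(uint8_t *region, size_t size)
{
    assert(region != NULL);
    for (size_t i = 0; i < size; i++)
        region[i] = STACK_PAINT;
}

/* Count the bytes at the low-address end that still hold the pattern.
 * Assuming a downward-growing stack, this is the headroom never used.
 * A result of 0 means the stack reached (or passed) the end of the region. */
size_t stack_unused(const uint8_t *region, size_t size)
{
    assert(region != NULL);
    size_t n = 0;
    while (n < size && region[n] == STACK_PAINT)
        n++;
    return n;
}
```

On real hardware the region would be the actual stack memory as laid out by the linker script; here it is just a byte buffer, so the idea can be tried on a host machine first.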

There is also a BSP checkbox called "Reduced device drivers"; I am going to try that as well.  From the docs (Nios II Software Developer's Handbook), it looks like it just eliminates some drivers that I don't need anyway, and manages I/O via polling rather than interrupts, which is fine.


There is another one called "Small C library" that might be worth looking at.  From the handbook, it eliminates printf() support for floating-point, which we don't really need.  It eliminates scanf(), which we don't need.  But oops, it also eliminates fopen(), which we do need, and input routines, which we do need.  So, that one isn't an option for us.

It also occurred to me that if I compile without debugging support, that should reduce the code size quite a bit.  But, let's not resort to that if we can avoid it, since debugging support will be useful as we continue code development.

cc1 is reporting that the -fconserve-stack option is unrecognized.  cc1 is supposedly gcc, but perhaps it is an older version.  Removing that option.

I'm still having problems (even worse than yesterday's, still!) and nothing I try (in terms of changing how the 64-bit ints are dealt with) seems to help at all.

I'm wondering if the problem might have something to do with this warning I keep getting about the ID and date of the system image not matching between the design loaded onto the FPGA and what the Nios IDE is expecting.  Previously, when this error occurred I told the environment to ignore it, but that was about the time these weird problems started happening (nothing seems to be getting read right).  So, I regenerated the SOPC system, recompiled the firmware from scratch, and am recompiling the gelware.  No, that didn't help.

OK, apparently the re-entrant version of printf() isn't working properly; that's what caused the problem where things were looking even worse than they did yesterday.  Trying now to figure out why...

(hours pass)

I never did figure out why the re-entrant version of printf() wasn't working, but for now I am just using the normal version, since for the time being we only have one thread (the interrupts) actively producing output anyway.  This should probably be revisited later on though.

We noticed that the high word of all the time data values was always 0, even though it should have incremented about every eight seconds.  It occurred to David that the problem might be the use of the integer data type in the cs_combine module.  Sure enough, integer is just a signed 32-bit value in VHDL.  We replaced them with unsigned vectors and that fixed that problem.
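The same 32-bit truncation is easy to reproduce on the firmware side. The sketch below is a C model only, not the VHDL from cs_combine. The function names are hypothetical, and the 500 MHz tick rate used in the usage note is an assumption chosen because it is consistent with the "about every eight seconds" figure.

```c
#include <stdint.h>

/* Combine two 32-bit words (e.g. read from hardware registers) into one
 * 64-bit count. The cast must happen before the shift, otherwise the
 * shift is done in 32 bits and the high word is lost. */
uint64_t combine_words(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* An add done entirely in 32 bits: the carry out of bit 31 is discarded,
 * so a high word fed from this result would never increment. */
uint32_t add_32bit_only(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Seconds between rollovers of the low 32-bit word at a given tick rate. */
double seconds_per_rollover(double tick_hz)
{
    return 4294967296.0 / tick_hz;  /* 2^32 ticks per rollover */
}
```

With an assumed 500 MHz tick, `seconds_per_rollover(500e6)` comes out to roughly 8.6 seconds, which is how often the high word should be seen to change.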

Next, we noticed a lot of data values going haywire.  Perhaps now that we are doing real 64-bit adds in cs_combine, that slowed things down.  Sure enough, we slowed the clock down by a factor of 2 as a test and now the output values are stable.  Probably the 64-bit adds in cs_combine are not fitting within the 20 ns system clock cycle.  But, we could slow the system clock down without slowing down the fast clock for the sampling system.  That might be worth a try.  Or, we could create a pipelined 64-bit adder, or just use the Altera adder megafunction to make one.  Lots of possibilities.  But, let's save 'em for tomorrow...
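To make the pipelined-adder idea concrete, here is a behavioral model in C (not VHDL, and not the Altera megafunction). It splits a 64-bit add into two stages, each of which only has to propagate a carry across 32 bits. The struct and function names are hypothetical; in hardware, the contents of the struct would be a bank of registers between the two stages.

```c
#include <stdint.h>

/* Values that would be registered between the two pipeline stages. */
typedef struct {
    uint32_t sum_lo;  /* low half of the sum */
    uint32_t carry;   /* carry out of bit 31 (0 or 1) */
    uint32_t a_hi;    /* high halves, carried forward to stage 2 */
    uint32_t b_hi;
} add64_stage1_t;

/* Stage 1: add the low 32 bits and capture the carry. */
add64_stage1_t add64_stage1(uint64_t a, uint64_t b)
{
    add64_stage1_t r;
    uint32_t a_lo = (uint32_t)a;
    uint32_t b_lo = (uint32_t)b;
    r.sum_lo = (uint32_t)(a_lo + b_lo);
    r.carry  = (r.sum_lo < a_lo) ? 1u : 0u;  /* unsigned wraparound means carry */
    r.a_hi   = (uint32_t)(a >> 32);
    r.b_hi   = (uint32_t)(b >> 32);
    return r;
}

/* Stage 2: add the high 32 bits plus the registered carry. */
uint64_t add64_stage2(add64_stage1_t s)
{
    uint32_t sum_hi = (uint32_t)(s.a_hi + s.b_hi + s.carry);
    return ((uint64_t)sum_hi << 32) | s.sum_lo;
}
```

The trade-off is one extra clock of latency in exchange for a carry chain half as long per cycle, so whether it helps depends on where the critical path really is.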

For now, I am just checking the timing analysis...  Man, the compile is slow now!  Actually I think it has hung.  I can't even stop it.  I will probably have to just force-quit the application...

Back on the printf() problem... I think the problem is actually with the va_list stuff (from stdarg.h); it doesn't seem to work right if the argument list is at all complicated, even when reentrant mode is off...
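For reference, here is a minimal sketch of a va_list wrapper that follows the standard C rules: every va_start is paired with a va_end, the list is handed to a v-function (vsnprintf) rather than to printf itself, and va_copy is used before walking the same arguments a second time. The helper name `log_format_checked` is hypothetical, and this is not a diagnosis of the problem above, just a known-good pattern to compare against.

```c
#include <stdarg.h>
#include <stdio.h>

/* Format into out[] only if the whole result fits.
 * Returns the number of characters written, or -1 if it would not fit
 * or a formatting error occurred. */
int log_format_checked(char *out, size_t out_size, const char *fmt, ...)
{
    va_list args;
    va_list probe;
    int needed;
    int written = -1;

    va_start(args, fmt);

    /* A va_list may be consumed by the callee, so measure with a copy. */
    va_copy(probe, args);
    needed = vsnprintf(NULL, 0, fmt, probe);
    va_end(probe);

    if (needed >= 0 && (size_t)needed < out_size)
        written = vsnprintf(out, out_size, fmt, args);

    va_end(args);
    return written;
}
```

Calling `log_format_checked(buf, sizeof buf, "%d-%s", 42, "ok")` with a 16-byte buffer writes "42-ok" and returns 5. Reusing a va_list after passing it to a v-function, without va_copy, is a classic source of output that goes wrong only for longer argument lists.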

Anyway, command-line input from the JTAG port works.

According to timing analysis, the system clock is fine; it's the fast clock that has the problem.  Perhaps using all 64 bits slowed down the early-stage modules.  The fast clock certainly would have larger fan-out.

Tried a build with 200 MHz fast clock; outputs are stable except for that annoying threshold 6.  Timing analysis reports 169 MHz max speed.
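The gap between the requested clock and the reported maximum is easier to judge as a period than as a frequency. The helper below is a simplified sketch with hypothetical function names. It treats the reported maximum speed as a single worst-path number, which glosses over what a real timing report contains.

```c
/* Clock period in nanoseconds for a frequency given in MHz. */
double period_ns(double freq_mhz)
{
    return 1000.0 / freq_mhz;
}

/* Approximate slack in ns: the requested period minus the shortest period
 * the design can meet. Negative means the requested clock is too fast. */
double slack_ns(double requested_mhz, double fmax_mhz)
{
    return period_ns(requested_mhz) - period_ns(fmax_mhz);
}
```

For a 200 MHz request against a 169 MHz maximum, `slack_ns(200.0, 169.0)` is a bit under -0.9 ns, so the worst path misses the 5 ns period by nearly a nanosecond.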

Tried 150 MHz; last threshold is still messing up though, even at only 300 ms pulse period.  It may be a separate issue.

We had fully reliable output (in the case of relatively short pulse widths) when the entire system was slowed to half speed (25 MHz system clock, 125 MHz clock for the dual-edge-triggered registers).  That can be our baseline for improvement.

Need to confirm that it works with 50 MHz system clock and 125 MHz fast clock.  If so, then something to consider:  Replace all the pseudo-dual-edge-triggered flip-flops in the design with single-edge-triggered ones clocked at twice the speed (back up to 250 MHz).  That should work, and then we should be able to crank up the speed some more (since the overhead of the PDE registers is eliminated) until they stop working again.  Not sure we can get all the way back up to 500 Msps (500 MHz) but it is worth a shot...
