The Cosmic Inquirer: August 2011

Wednesday, August 31, 2011

We took 8 showers in 17 hours.

Last night's overnight run crashed at about 8 am this morning (about an hour and a half after sunrise). At least we got data that might give us a night/day contrast in the event rate.

Adjusting PMT 1 voltage to try to match PMT 2's event rate. At 1,400 V the PMT1 rate is still slightly less (about 17% lower) than PMT2's rate (at 1,200 V). Above 1,450 V, PMT1 stops producing pulses. Right at 1,450 V, PMT1's rate is about 6% higher than PMT2's rate. At 1,430 V, PMT1's rate is still 10% higher than PMT2's. At 1,410 V, PMT1's rate is about 2% less than PMT2's - about the same within the margin of error.

Of course, this is all dependent on the scintillators' type, and their positioning (due to any shielding and/or radiation sources in their immediate environment). Just now, during that test, PMT1 was sitting behind my keyboard. Let me put it back in the window sill and check again. Now the PMT1 pulse rate is 25% higher than PMT2! Let's go back down to 1,350V. Now the PMT1 rate is about 3% lower than PMT2. I think that's about as close as it's worth trying to get for now. Once we're using the paddles, this kind of calibration will be more meaningful, since at least we'll know we have two scintillators that are the same.

I'm going to let the current run (with the almost-matching pulse rates) continue for a bit. Oh wait, is it worth trying to lower the threshold? We have to be careful, because if we lower it too much, the pulse rate will be too high and could crash the system.

Aha, I wonder if the crashing problem is due to these occasional errors I'm seeing filling up the STDOUT buffer when the nios-terminal is not attached:

ERROR: icdp_handle_have_data(): The HAVE_DATA flag is not up!

These are probably due to the fact that the handler possibly processes multiple pulses per HAVE_DATA event, which could result in another HAVE_DATA interrupt coming in for a pulse that was already handled. Also, we do extra HAVE_DATA handling whenever there is a BUF_FULL event.

I changed these things. This will lower our peak performance a little (since every pulse will now trigger a separate interrupt) but it is a little bit cleaner. If we have throughput problems later, consider changing this back (or else turning the system clock speed back up to 80 MHz again).

OK, we got eight coincidences in the overnight data set:

(#1) 4:33 pm. Neither North nor South:

Tue Aug 30 16:33:27 2011 + 211 ms: < PULSE,1,20748,871599293573,2,(0,(1,6),4)
Tue Aug 30 16:33:27 2011 + 212 ms: < PULSE,2,47445,871599293573,4,(0,(0,(0,(1,8),3),2),2)

(#2) 5:42 pm. From South by 5 ns:

Tue Aug 30 17:42:26 2011 + 460 ms: < PULSE,1,40789,1699440061556,1,(0,7)
Tue Aug 30 17:42:26 2011 + 462 ms: < PULSE,2,92031,1699440061557,5,(0,(1,(0,(0,(1,4),2),2),3),4)

(#3) 6:00 pm. From North by 5 ns:

Tue Aug 30 17:59:43 2011 + 529 ms: < PULSE,2,103061,1906866999930,5,(0,(0,(0,(1,(0,5),2),2),3),3)
Tue Aug 30 17:59:43 2011 + 528 ms: < PULSE,1,45704,1906866999931,1,(0,4)

(#4) 6:17 pm. Neither North nor South:

Tue Aug 30 18:17:12 2011 + 547 ms: < PULSE,1,50695,2116678060181,2,(0,(2,3),4)
Tue Aug 30 18:17:12 2011 + 548 ms: < PULSE,2,114408,2116678060181,3,(0,(0,(1,3),2),2)

(#5) 7:11 pm. From North by 20 ns:

Tue Aug 30 19:11:52 2011 + 126 ms: < PULSE,1,66549,2772581913802,1,(0,4)
Tue Aug 30 19:11:52 2011 + 127 ms: < PULSE,2,149798,2772581913798,4,(0,(0,(1,(0,4),2),3),1)

(#6) 2:15 am. From South by 5 ns:

Wed Aug 31 02:14:45 2011 + 155 ms: < PULSE,1,188649,7847208784469,1,(0,4)
Wed Aug 31 02:14:45 2011 + 157 ms: < PULSE,2,425844,7847208784468,4,(0,(1,(0,(1,3),2),2),2)

(#7) 2:36 am. From South by 20 ns:

Wed Aug 31 02:35:55 2011 + 911 ms: < PULSE,1,194843,8101342573919,2,(0,(1,5),4)
Wed Aug 31 02:35:55 2011 + 911 ms: < PULSE,2,439464,8101342573923,1,(0,4)

(#8) 3:09 am. From North by 15 ns:

Wed Aug 31 03:09:15 2011 + 429 ms: < PULSE,1,204543,8501263144244,1,(0,6)
Wed Aug 31 03:09:15 2011 + 431 ms: < PULSE,2,461019,8501263144241,5,(0,(1,(0,(0,(1,7),3),4),3),21)

So, this time at least we got a good mix of shower orientations (3 from the South side of the sky, 3 from the North side of the sky, and 2 from near the Azimuth/East/West plane, within the resolution limit).
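The North/South labels above can be re-derived from the raw records. Here is a minimal Python sketch, assuming that the fourth comma-separated field of each PULSE record is the arrival timestamp in ticks of the high-speed clock, that one tick is 5 ns (consistent with a 200 MHz capture clock), and that PMT 1 is the southern detector. The function names are made up for illustration and are not from the analysis script.

```python
TICK_NS = 5  # assumed: one timestamp count = 5 ns (200 MHz capture clock)

def parse_pulse(line):
    """Return (channel, timestamp_ticks) from a logged PULSE record."""
    record = line.split("< PULSE,", 1)[1]
    channel, _seq, ticks = record.split(",", 3)[:3]
    return int(channel), int(ticks)

def shower_direction(line_a, line_b):
    """Classify a two-PMT coincidence; assumes PMT 1 sits south of PMT 2."""
    times = dict([parse_pulse(line_a), parse_pulse(line_b)])
    dt_ns = (times[2] - times[1]) * TICK_NS  # > 0 means PMT 1 (south) fired first
    if dt_ns > 0:
        return "South", dt_ns
    if dt_ns < 0:
        return "North", -dt_ns
    return "Neither", 0

# Coincidence #5 from the list above
a = "Tue Aug 30 19:11:52 2011 + 126 ms: < PULSE,1,66549,2772581913802,1,(0,4)"
b = "Tue Aug 30 19:11:52 2011 + 127 ms: < PULSE,2,149798,2772581913798,4,(0,(0,(1,(0,4),2),3),1)"
print(shower_direction(a, b))
```

With a 5 ns tick, any difference smaller than one tick shows up as "Neither", which is the resolution limit mentioned above.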

At 7:11 pm we got a shower from near the North corner of the sky when the supernova was above the horizon - however at 20/25 = 80% of max time difference or (very roughly) 11 degrees away from the north-south PMT axis, it was probably *too* close to the North to have actually been from the supernova.

The Northwestern sky from Tallahassee (sans atmosphere) at 7:12 pm EDT on 8/30/11,
when we detected a cosmic-ray shower from close to North.

I actually had several runs in that data file, so I separated out the largest one, and am now re-running my Scilab analysis script anal-pulses.sce to plot the average pulse rates over that run. Even just this part was still a 17.4-hour run (3:21 pm to 8:44 am), which is why we had so many more coincidences in it, compared to our short run from last Saturday.

Here is the average pulse rate data over the run:

Pulse rate data for PMT #1 (bottom), PMT #2 (middle), and both together (top). The horizontal axis is minutes into the run, and the vertical axis is pulses binned in each minute. The rate for PMT #1 is lower due to a different scintillator material used (we think). This discrepancy will be (approximately) compensated for by increasing the PMT #1 base voltage in our next run.

There is no noticeable change in pulse rate after sunset (which happens about 400 minutes into the run) or sunrise (which happens 100 minutes before the end of the run). It could be that almost all of the pulses that are energetic enough to be picked up here are originating from primary astroparticles having energies much greater than most particles in the solar wind, so that the solar contribution to this data is low. Or, perhaps this data is dominated by terrestrial radiation (since after all, we are pretty close to sea level).

Ray mentioned that powerful (e.g. X-class) solar flares could produce high-energy particles. We are near the sunspot minimum right now, but are on the upswing, so such flares may become increasingly common in the next few years:

14-year history of sunspot activity. From http://www.nwra.com/spawx/ssne-cycle2324.html.

Ray got one of the scintillator paddles ready to hook up:

Scintillator paddle & PMT in gun case.

We hooked it up. The pulses are lower & broader, which makes sense due to the thinner & wider scintillator. We had to adjust the threshold down from 350 mV to about 300 mV to get a reasonable (not too high, not too low) event rate. Tomorrow (or whenever we next work on this) we'll hook up the second paddle and look for coincidences. Hoping we'll see more of them, due to the much larger area.

Tuesday, August 30, 2011

First shower detections!

Stupid firmware is still giving me trouble today. Not seeing any output!

Could it be a baud rate mismatch? I already backed off from the 80 MHz system clock. Regenerating SOPC system files and BSP to make sure those match... Nope, that wasn't it.

After spending a while stepping through the code in the debugger, I realized that I was trying to print the new DAC_LEVELS message before the serial port was even open! So of course it was crashing. DERP... OK, that is fixed now.

Collecting new data. Wrote Scilab script to look for coincidences. Found only 3 coincidences within +/- 30 ns:

Saturday at 7:56 pm: (Coincidence within +10 ns.)

Sat Aug 27 19:55:49 2011 + 109 ms: < PULSE,1,39642,1806343255288,1,(0,8)
Sat Aug 27 19:55:49 2011 + 110 ms: < PULSE,2,99774,1806343255289,1,(0,3)

Saturday at 9:31 pm: (Coincidence within +/- 5 ns.)

Sat Aug 27 21:31:14 2011 + 285 ms: < PULSE,1,64706,2951362942182,3,(0,(1,(1,7),4),13)
Sat Aug 27 21:31:14 2011 + 286 ms: < PULSE,2,163209,2951362942182,5,(0,(1,(0,(1,(0,6),2),3),4),2)

Total width 95 ns.
*******************
****************
************
********
******

Saturday at 9:33 pm: (Coincidence within +10 ns.)

Sat Aug 27 21:33:39 2011 + 949 ms: < PULSE,1,65347,2980500374764,2,(0,(1,7),4)
Sat Aug 27 21:33:39 2011 + 950 ms: < PULSE,2,164778,2980500374765,5,(0,(1,(0,(0,(1,5),2),3),3),3)

******************
**************
***********
********
*****

314937 total pulses

Sat Aug 27 17:25:17 2011
Sat Aug 27 23:05:18 2011

5 hours, 40 minutes = 340 minutes = 20,400 secs.

average pulse rate: 15.44 pulses/sec.

PMT 1: 89,389 pulses ( 4.38 pulses/sec.)
PMT 2: 225,548 pulses (11.06 pulses/sec.)
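
These rates are easy to re-derive from the first and last timestamps and the per-PMT pulse counts. A quick Python sketch (the variable names are mine and not taken from the Scilab script):

```python
from datetime import datetime

# First and last timestamps of Saturday's run, from the log
start = datetime(2011, 8, 27, 17, 25, 17)
end = datetime(2011, 8, 27, 23, 5, 18)
run_seconds = (end - start).total_seconds()

counts = {"PMT 1": 89_389, "PMT 2": 225_548}
rates = {name: n / run_seconds for name, n in counts.items()}
total_rate = sum(counts.values()) / run_seconds

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} pulses/sec")
print(f"total: {total_rate:.2f} pulses/sec")
```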

So, looking at the arrival times, the pulses arrived at the PMTs at the same time within 5 ns, or at PMT1 (which was in the more Southerly location) about 5 ns before PMT2 (in the more Northerly location).

It is interesting that two coincidences arrived within 2 minutes of each other (at about 9:32 pm) that both had similar features: Crossing 2-3 thresholds (550-700 mV) on PMT1, and 5 thresholds (1 V) on PMT2. Perhaps some activity from the center of our galaxy (which was in that general direction)?

Here's what was in the Southern sky as seen from our location at that time:

Southern sky as viewed from Tallahassee at 9:32 pm on Sat. Aug. 27th,
when we saw two candidate showers coming from slightly South of the zenith.

I selected a planetary nebula which could be a potential gamma-ray source.

Anyway, here also is a plot of total pulse rate from both PMTs over the course of Saturday's 5-hour run:

Pulse rates from two test PMTs between 5:25 pm and 11:05 pm on 8/27/11.
Threshold for pulse detection was 400 mV. Horizontal axis is minutes,
vertical axis is number of pulses detected in that minute.

It looks like the pulse rate was fairly constant, apart from white-noise looking fluctuations, which isn't so interesting, but at least we have some real data now!

We can estimate the probability of getting the 3 coincidences we saw by chance. Since the average pulse rate on PMT 2 was 11.06 pulses/sec, the expected number of pulses per nanosecond is 11.06e-9. There were 89,389 pulses on PMT 1 over the run, so the total number of expected pulses on PMT 2 occurring within +/- 5 nanoseconds of pulses on PMT 1 over the course of the run if there were no showers would be 10*(11.06e-9)*89,389 = 0.00988, less than 0.01. Instead we saw 3 such pulses.
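
The same estimate as a small Python sketch, assuming the two PMTs fire independently when there is no shower and that the window is 10 ns wide in total (+/- 5 ns). The function name is just for illustration.

```python
def expected_accidentals(rate_b_hz, n_pulses_a, window_ns):
    """Expected number of chance overlaps: each pulse on A opens a window,
    and an unrelated pulse on B lands in it with probability rate * width."""
    return n_pulses_a * rate_b_hz * window_ns * 1e-9

lam = expected_accidentals(rate_b_hz=11.06, n_pulses_a=89_389, window_ns=10)
print(f"expected accidental coincidences: {lam:.5f}")
```

Widening the window to the full +/- 30 ns search range would scale this number by a factor of 6, which still leaves it well below one.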

We can use the Poisson distribution to estimate the probability of these coincidences occurring by chance: f(k;lambda) = lambda^k exp(-lambda) / k!. Here, lambda = 0.00988 (the expected number of occurrences) and k=3, so k!=6. The calculation gives 1.59e-7. Let's see, how do we convert that to the number of sigmas? A one-sided Gaussian tail probability of about 2.9e-7 corresponds to 5 sigma, so this is a little more than 5 sigmas...
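
For the record, here is a short Python sketch of that calculation. It assumes a one-sided Gaussian convention for the sigma conversion, and it also computes the probability of 3 or more chance coincidences, which is arguably the fairer number to quote. The helper names are hypothetical.

```python
from math import exp, factorial
from statistics import NormalDist

def poisson_pmf(k, lam):
    """Probability of exactly k events when lam are expected."""
    return lam**k * exp(-lam) / factorial(k)

def poisson_tail(k, lam, terms=50):
    """Probability of k or more events (series truncated after `terms` terms)."""
    return sum(poisson_pmf(j, lam) for j in range(k, k + terms))

lam = 0.00988
p_exactly_3 = poisson_pmf(3, lam)
p_3_or_more = poisson_tail(3, lam)

# One-sided conversion: how many standard deviations out is this tail?
sigma = -NormalDist().inv_cdf(p_3_or_more)

print(f"P(k = 3)  = {p_exactly_3:.3g}")
print(f"P(k >= 3) = {p_3_or_more:.3g}")
print(f"about {sigma:.1f} sigma (one-sided)")
```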

I think tomorrow I should turn up the voltage on PMT1 (and if necessary, turn down the voltage on PMT2) until both PMTs produce above-threshold pulses at approximately the same average rate. Then, next time we get a coincidence, I can check to see if the matching pulses are consistently the same size as well. This will be a good test as to whether the coincidences are indeed from particle shower pancakes passing through. That seems consistent with the data for the 3 coincidences above, except that the PMT1 pulses are consistently smaller (perhaps due to a different scintillator material).

I should also try turning the thresholds back down to 350 mV (or maybe even 300 mV) to increase the pulse rate and thus (hopefully) the frequency of coincidences. However, if the resulting pulse rate is too high, I may have to do some other tweaks to the firmware to get the data rate manageable again.

Monday, August 29, 2011

Supernova notes from Saturday

[Drafted this post Saturday but only posting it now]

The news just broke that a Type Ia supernova just exploded this week (yes, I'm using language loosely here) in the Pinwheel galaxy (M101), only 21 million light-years away (close in cosmic terms). As the remnant nebula expands, there should be an increasing flux of high-energy particles coming from that vicinity. If we get our 3-node detector LAN working soon, we might be able to identify an excess of UHECRs originating from that area of the sky! This gives us additional motivation to get busy.

I came in to the lab and hooked up our 2nd test PMT to the board. I can see the pulses on the scope, but the firmware is not reporting them yet. I double-checked the pin assignments for the 2nd channel against the layout, and they are all correct.

I think the PLL frequency was just too high - after turning it down (and turning up the cooling) the 2nd channel started working.

The pulses on the 2nd channel seem larger (and more frequent) than those on the first one. Ray said it might be because that PMT isn't wrapped in felt. But these are too big to be pulses from single leaked-in optical-wavelength photons. The histogram for those peaked at about 40 mV and these are over 300 mV.

I might just be imagining things, but it seems like the total pulse rate (even on PMT 1) is higher today than it was a few days ago. Could it be because of the supernova?

OK, turns out one of the PMTs (#2) had its voltage set much higher than the other one. (12,000 V?) At some point the voltage readout stopped working and I had to switch bases. Now, both PMT bases are set to 1.2V HV readout (1,200 V internal high voltage) and they are generating pulses at about the same rate.

I put one PMT at the far North end of the room, and the other one at the far South end of the room, so that if a significant proportion of UHECR-initiated showers comes from the direction of the supernova (which is in the Big Dipper), then when we get coincidences, we will be able to see an excess of ones coming from the North in the relative time-of-arrival data.

Mad Scramble

We're scrambling like mad to try to get some good data, especially since Ray's NSF report is due in two days. It would be nice to have something for it.

Input channel 2 is still being flaky. Sometimes it sorta works, but with garbled data values. Other times it generates an enormous bolus of pulses that are all too wide and then crashes the firmware.

I wonder if the latter is maybe because drops of water are dripping onto the board from the thermoelectric cooling plate. This is why we need a proper cooling system designed for this board!

I changed the program to tell the initial DAC levels to the server, so at least that critical information will be included in the data logs.

Tried increasing the system clock speed to 80 MHz, in hopes that the hardware FIFO on channel 2 won't fill up as often. Then I realized this is really not a very good idea, because it messes up the DAC controller, which has to go slow enough for the DAC chips to keep up with. But, I can fix that by clocking the DAC controller module directly off of the 50 MHz external clock. It should still be able to be controlled by the dac_cpu_if (running at 80 MHz) because the protocol used between those two modules is asynchronous. First the 72-bit (6x12) parallel data bus is configured, then (a bit later) the dac_go signal is raised. Once it sees this signal go high, dac_control then reads its input bus and takes care of all the rest on its own accord. Due to the delay between setting the data bus and the go signal, there should be no setup time issues.

At some point, I also need to revisit the performance issue for the high-speed parts of the input capture system.

During the most recent round of debugging (after adding the 2nd datapath), I had to lower the high-speed clock from 500 MHz to 200 MHz to avoid violating timing constraints and get both datapaths working reliably. Sometime, we need to figure out how we can crank the speed back up. One approach: Compile the high-speed components in isolation, and then put them into a LogicLock region to prevent them from being relocated in subsequent compiles.

At some point I also need to figure out why the firmware is crashing sometimes. I could do a run under the debugger, and then suspend execution when I get into a hung state and see where we are in the code. I suspect it is the STDIO libraries again, but don't know why they're having problems. Memory issues still?

I added some macros to let me stub out the DEBUG() messages and other diagnostics to save memory when I am not otherwise using them. Basically, in debug.c you now just set MIN_REPLEVEL to the level of the lowest-level diagnostics that you wish to be able to see, dynamically. Any diagnostics below that level will be compiled away.

OK, switching DAC_control back to 50 MHz fixed the DAC levels, at least. But now I'm getting no output! Argh. And now, I'm getting output but the CPU is restarting every 10-15 seconds. But I can prevent it from restarting by pausing the high-speed time counter! Weird. Mebbe I need to start using a Gray counter or something, for reduced noise? Great, a carry-save Gray counter, I don't have that yet! Does it even make sense?

I'm going to back out of the increase of the system clock speed to 80 MHz, as it seems to have only made things worse (although, between the restarts just now, the system was working fine)!

Actually, maybe it was just overheating, b/c I added the cooler back on and now it seems to be working again without resetting itself...

And now, it's *not* working... I'm so confused! I think I need to go more slowly and take more careful and detailed notes about every single little thing I am doing.

OK, what's happening right at the moment is, I'm recompiling the Quartus design with the original clocks again, but with the timer_0 device simplified to a basic periodic timer (since I wasn't using its advanced features anyway).

Oh of course, fatal error in Quartus. Everything is going wrong today! Not to mention that FAMU's Internet connection was down for an hour in the middle of the afternoon.

SUPERNOVA!

About 21 million years ago, a Type Ia supernova exploded in M101 (the Pinwheel Galaxy). Due to the speed-of-light delay, it just became visible to our telescopes this past week. I heard about it Saturday morning, and rushed into the lab to try to get an experiment up and running to see if we could detect any high-energy cosmic rays from it. Ray says the particle flux should increase as the remnant nebula develops, so we are not too late. After some fiddling, I got the system working with raw pulse data streaming from 2 input-capture datapaths. For some reason, the pulse rate in the 2nd PMT is higher than the first. I adjusted the voltages on their HV supplies to be the same (1,200 V) but the difference is persisting. The pulse rate on PMT 2 was so high I kept getting hardware FIFO overflows; I increased the size of the FIFO from 4 to 8 pulses which helped a little. I shifted down the top end of the voltage ladder to -400 mV (from -300 mV) to filter out more pulses, which brought the pulse rate down to a manageable level. Then I set up an experiment with the two test PMTs on opposite sides of the room on a North-South axis, in hopes of gathering evidence for an excess of northerly showers between the hours of 7:30 am and 3:00 am (when the supernova is above our horizon).

Unfortunately, the system crashed after a few hours, and did not keep running overnight. However, we do at least have several hours worth of data that we can analyze looking for coincidences. I tried restarting the run on Sunday, but now the 2nd datapath is not working for some reason (but that PMT is still producing pulses). Could have been a heating problem. Try again with the power on the thermoelectric cooler turned up.

Note to self: Also try turning up the speed of the system clock, to clear the FIFO more quickly. I think fmax for the system clock is something like 80 MHz.

After a quick skim through the data, it looks like the pulse rate was highest when the supernova was highest in the sky. But I'm not sure about this yet; need to do a real plot.

Note to self: Need to log the DAC settings to the server at the start of the run so that information will be included in the data file.

Friday, August 26, 2011

Hanging by a Thread

The stupid firmware is still hanging in the middle of serial output; don't know why... It's difficult to debug this sort of thing! I thought maybe the SOPC config was still not in sync, but I checked all the base addresses, and they are right now. Then I thought, OK, maybe the stack is overflowing again, but I increased the reserved space from 15K to 20K, and that didn't help.

I cleaned up the interrupt code a little, and decreased the code size a little. Something I did worked (not sure what) because stdio is working again. Maybe the code size was just getting too close to the limit and this was causing problems for some reason. However, I'm still not seeing any pulse-capture interrupts.

Maybe it is a performance issue - i.e. the pulse_cap module is not coming close to meeting its timing constraints, and as a result it is never getting to the point of producing output handshakes. Indeed, fmax is reading as only 270 MHz or so, for some reason. I'll cut down the PLL speed to 250 MHz temporarily (from 500 MHz) and see if that helps.

Not only did that not help, but now I'm having the stdio problems again. Sigh...

Shrunk my code again; this time I got one interrupt and part of another before it crashed! Progress...

More shrinking, and turning the reporting level down to warning level, fixed the problem. Lesson learned: The first time anything weird happens, make sure the total code size is somewhat less than the maximum (which seems to be about 73K in the present environment, which makes sense due to our 64K ROM for .text and .rodata, and the 9K RAM for .rwdata). Still unsure, though, why the linker isn't warning me before something goes wrong... Anyway, we're good for now.

Before leaving for the day, I Dremeled down the pins on the 2nd SMA connector and plugged it into the thru-holes in its proper location on the board (PMT2, J32). (Note: Neither connector is actually soldered in yet.) As soon as I find the BNC-to-SMA cables, or a second BNC-to-SMA converter jack, I can hook up the 2nd test PMT, and test the system with two input channels. (It might even work now!)

Mucos Membrain

Spending some time at home reading MicroC/OS-II: The Real-Time Kernel (2nd ed.), by Jean J. Labrosse, because I am thinking of upgrading our firmware to use uCOS.

I tried to port the first example PC program on the CD-ROM from Borland to Visual Studio, but ran into problems. I think I'll try emailing the author to see if someone else has already done it. He replied and said I should look at uC/OS-III instead. I explained that I was stuck with version II due to the Altera EDS. I could consider switching to a Cortex soft core which could run OS-III, but it might not fit on this FPGA.

Thursday, August 25, 2011

Fat Man's Squeeze

Seeing as how the present datapath (with 3 PMT inputs) won't fit on the FPGA without significant recoding, for now I am going to cut the number of PMT inputs down to 2, so that we can at least start doing some work on coincidence detection code. It should be easy enough to write this code in a general sort of way, so that it can work with any number of inputs, so we can test it with 2, and then when we have 3 inputs, it should still work (albeit more selectively).

Latest top-level design is on COSMICi (my work PC) in Q:\COSMICi_FEDM_top9.bdf, v0.11. It has the third instance of the PMT_ic_datapath stubbed out. It uses 18,961/27,104 (70%) of flip-flops (8,143 = 30% unused), 193/202 (96%) of M512 blocks (9 = 4% unused), and 100% of M4K blocks, and the MRAM.

Thinking about the firmware design for coincidence filtering. Really, we need N separate software pulse queues for the N input channels, since (due to the way the interrupt-handler works), multiple pulses from one channel could get read in from the datapath before we turn our attention to another channel on which pulses were being received at the same time. So, to synchronize them in time requires sorting the various pulses received by arrival time, and this is easiest to do if each channel has its own queue of (already-sorted) incoming pulses. Then we just need a selection-sort kind of process to merge them together.
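
To make the merge step concrete, here is a rough sketch in Python (the real firmware is C; this is only to illustrate the idea). It assumes each channel's queue holds timestamps that are already in ascending order, and the name `pop_earliest` is hypothetical.

```python
from collections import deque

def pop_earliest(queues):
    """Pop the earliest pulse among the queue fronts.

    Returns (timestamp, channel_number) with channels numbered from 1,
    or None if every queue is empty.
    """
    fronts = [(q[0], ch) for ch, q in enumerate(queues) if q]
    if not fronts:
        return None
    timestamp, ch = min(fronts)
    queues[ch].popleft()
    return timestamp, ch + 1

# Two per-channel queues, each already sorted by arrival time
q1 = deque([100, 250, 900])
q2 = deque([120, 251, 400, 905])
queues = [q1, q2]

merged = []
while (pulse := pop_earliest(queues)) is not None:
    merged.append(pulse)
print(merged)
```

Each call only looks at the N queue fronts, so the cost per pulse is proportional to the number of channels, which is tiny for 2 to 4 PMTs.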

Thinking now about how the "coincidence groups" will be identified. There is a potential ambiguity if two successive groups of pulses arrive close together in time - which "neighbors" will each pulse be identified with? However, as long as all of the potentially-important data makes it to the server (where it can be re-grouped if necessary by a more sophisticated analysis), it is probably good enough if we just take a "greedy" approach, and just send all pulses that arrive within the window size of any pulse on the other channels.

Pseudo-code for coincidence-detection algorithm:

1. Identify the pulse that is the earliest-arriving out of the ones sitting at the front of all N of the software pulse-input queues (for the N channels).
2. Calculate the absolute time difference between this pulse and the earliest-arriving one at the front of the other (N-1) software input queues.
3. MIN this with the time difference between this pulse and the pulse most recently pulled from the queues.
4. If this time difference is less than the window size, then send this pulse to the server, since it may be part of a group. Otherwise, throw it away.
5. Repeat steps 1-4 until at least one of the N software pulse queues is empty.
6. Then, take care of other tasks until a little bit later.
7. Then start back at step 1.
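
A literal Python rendering of the pseudo-code above, as a sketch only (the firmware version would be C). It assumes at least two channels, timestamps in clock ticks, and queues that are each sorted in ascending order. `filter_coincidences` and its arguments are hypothetical names.

```python
from collections import deque

def filter_coincidences(queues, window, last_time=None):
    """Drain pulses while every queue is non-empty.

    Returns (sent, last_time): the pulses to forward to the server as
    (timestamp, channel_number) pairs, and the timestamp of the pulse
    pulled most recently (pass it back in on the next call).
    """
    sent = []
    while all(queues):  # stop as soon as any queue runs dry
        # Earliest pulse among the queue fronts
        ch = min(range(len(queues)), key=lambda i: queues[i][0])
        t = queues[ch].popleft()
        # Gap to the earliest front pulse on the other channels
        gap = min(q[0] - t for i, q in enumerate(queues) if i != ch)
        # Also consider the pulse pulled just before this one
        if last_time is not None:
            gap = min(gap, t - last_time)
        # Forward the pulse only if it has a neighbor inside the window
        if gap < window:
            sent.append((t, ch + 1))
        last_time = t
    return sent, last_time

q1 = deque([100, 500, 1000])
q2 = deque([102, 700, 1001, 2000])
sent, last = filter_coincidences([q1, q2], window=6)  # 6 ticks = 30 ns at 5 ns/tick
print(sent, last, list(q2))
```

Note that the pulse at 1001 stays in its queue until channel 1 produces more data; on the next call it will be compared against `last_time` and forwarded. Restricting the "previous pulse" comparison to pulses from other channels would be a small change if same-channel neighbors turn out to generate too much traffic.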

Note that this algorithm will send pulse data for all potential coincidences that happen out of even only 2 of the 3 (or 4) PMTs. However, this still should be enough to cut down on the spurious data rate from terrestrial radiation substantially (we'll see how much). Any additional filtering that may be needed (to focus on 3-node or, later, 4-node coincidences) can be done on the server side.

At the end of the day, I was still trying to debug the new firmware (for the new multi-channel datapath). For some reason, we seem to not be receiving any pulse-capture (HAVE_DATA) interrupts. (Or any BUF_FULL interrupts either, for that matter.) I have reviewed the new C code a couple of times and it looks fine. I am thinking that maybe the SOPC system configuration got out of sync between the gelware and the BSP at some point (like, memory-map addresses of the PIOs, since Darryl added & deleted some PIOs). Regenerating SOPC files and BSP files, rebuilding ELF file, recompiling Quartus design. If this doesn't work, I'll just give up and go home for the night and take another stab at it tomorrow.

Wednesday, August 24, 2011

Multiple PMT Inputs

Today, my plan is to finish the gelware+firmware changes needed to process 3 PMT inputs. I discovered yesterday that we can't use all 4 PMT inputs because PMT3 is wired to the timing signal input, so it isn't really an independent input! Asked Ray to ask to join Mentor Graphics' Higher Education program, so that maybe we can get a cheap(ish) full license for PADS, so that we can tweak the design.

I finished entering the pin assignments based on the OrCAD schematic, but I still need to double-check them against the layout.

Design with 3 copies of the datapath is exceeding the available RAM. Cutting down the pulse-form gelware FIFO size in half, from 16 pulses down to 8 pulses. Note to self: If this doesn't work, I could also try putting it in logic instead of RAM - I think we still had plenty of logic cells left on the FPGA. Nope, not enough register cells for that either. Guess I'll have to cut the FIFO size down to 4 pulses per channel.

Changing the step size for the ladder to 100 mV, so that hopefully we can see more thresholds being crossed.

Also disconnected some of the diagnostic outputs from earlier (stopped routing them to the board), so we can see if that helps clean up the noise on the PMT input.

Connected channels 2-4 of the scope to the first 3 thresholds, so that we can see visually on the scope exactly where the PMT pulse is crossing each of these thresholds.

A thought: Should I reprogram the FEDM to do its reporting in the form of LOGMSG messages over the UART/Wi-Fi bridge to the server?

Still no luck with the depth-4 FIFO. Each of the datapaths is using 43 of the M512 blocks currently. Total M512 usage is 283/202. In other words, 81 too many. So, if two of the datapaths didn't use M512s, then maybe we could fit. So, construct an alternate version of the datapath that uses logic instead of M512s, and use two of these, and one of the original? Trying that now...

Another thing: We could do this thing that we've been postponing forever, which is to compress the pulse representation in the pulse-capture datapath, using the assumption that the pulse width is relatively narrow. However, that is pretty involved - it will require making substantial changes to numerous modules - and it is error-prone - a lot of things could go wrong, and so it may be time-consuming to debug. It could be an appropriate job for Darryl or David (if one of them does any work for us in the Fall), since they are more experienced now. Or it could be done by a member of the Senior Design team, if a student who is good enough at VHDL is appointed.

OK, last fit attempt was not successful, we tried to use 29070/27104 register cells. Each of the datapaths that is using individual logic register cells is using about 10k worth of those. More than half of this (5400 cells) is in the FIFO, but almost all the rest is in pulseform_cap. Each of the 6 pulse_prep instances takes about 650 cells and pulse_combine takes 780. Within pulse_prep, 387 cells are in cs_combine and 260 are in se_pulse_cap. This is pretty much unavoidable in the current architecture, due to the need to store 64 bits for the sum and carry bits for both the rising and falling edges (that's 256 bits already).

We should perhaps consider reducing the counter size from 64 bits to, say, 48 bits (a 25% decrease). That still gives about 6.5 days for the length of a run. That seems a bit too short. Or better, 56 bits (a 12.5% decrease) would give us a run length of about 4.5 years - more than enough.
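
A quick check of those run lengths in Python, assuming the counter increments once per cycle of the 500 MHz high-speed clock (the figures above only work out at that rate; at a slower clock the counter lasts proportionally longer):

```python
CLOCK_HZ = 500e6  # assumed: 500 MHz high-speed capture clock

DAY = 86_400          # seconds
YEAR = 365.25 * DAY   # seconds

def rollover_seconds(bits, clock_hz=CLOCK_HZ):
    """Time until a free-running counter of the given width wraps around."""
    return 2**bits / clock_hz

for bits in (48, 56, 64):
    s = rollover_seconds(bits)
    print(f"{bits} bits: {s / DAY:,.1f} days = {s / YEAR:,.2f} years")
```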

Let's see, shrinking the width of each of the datapaths by 12.5% would save about 2500 cells in the two datapaths that are implemented as logic-only (no RAM blocks). This would be more than sufficient to bring our register cell usage back under the limit.
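
The budget arithmetic, as a Python sketch. It uses the rough figure of 10,000 register cells per logic-only datapath from the failed fit, and it optimistically assumes that every one of those cells scales with the counter width, so treat the result as an upper bound on the savings.

```python
LIMIT = 27_104                      # register cells available on the FPGA
USED = 29_070                       # cells requested by the last (failed) fit
CELLS_PER_LOGIC_DATAPATH = 10_000   # rough figure from the fit report
LOGIC_DATAPATHS = 2                 # datapaths built without RAM blocks

def cells_after_shrink(fraction_saved):
    """Estimated register usage after narrowing the logic-only datapaths."""
    saved = fraction_saved * CELLS_PER_LOGIC_DATAPATH * LOGIC_DATAPATHS
    return USED - saved

overshoot = USED - LIMIT
after = cells_after_shrink(0.125)   # 64 -> 56 bits is a 12.5% cut
print(f"over budget by {overshoot} cells; estimate after shrink: {after:.0f} of {LIMIT}")
```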

This change is somewhat easier (and less error-prone) than compressing the pulse representation before buffering. Still, it will require changing numerous modules, so it should perhaps wait for a student to be assigned this task.

Tuesday, August 23, 2011

Tiwaz' Day

See Wikipedia.

What to do today?

[ ] Start on 3-way replication of input-capture datapath with Darryl. He can work on making the gelware changes, while I upgrade the firmware. We should make sure to back up the present version of everything first.
[ ] (Lower priority.) Try to get Nios CPU reset (FEDM user command CPU_RESET) working. Need to learn more about the reset process, loading, etc. It's possible that I just need to set those BSP options (e.g. the one to load .rwdata) and burn the design in EEPROM, in order to reinitialize the .rwdata properly from EEPROM on reset.
[ ] After CPU reset is working, consider adding a watchdog timer, so that if/when the firmware ever crashes, it will automatically reset.
[ ] Diagnose and fix the problem with occasional crashes when using STDIO library from ISR (for sending data either to stdout or to uart_0) when the system is handling high data throughput. One strategy for addressing this: Route all output through main-line code. This will then require a couple of extra software line buffers, to pass lines of text from the ISR to the main execution context. A concern, though: Do I even have enough ROM space left to do this additional coding? Also, the line buffers themselves will cut down somewhat on the amount of RAM available for the pulse buffers.

Darryl is here, and he defined the new icdp_ctrl register format:

Bit #0 - ICDP_RESET (affects all ICDP channels simultaneously)
Bit #1 - ICDP_RUN_PAUSEn (affects all ICDP channels simultaneously)
Bit #2 - ICDP_NEG_INPUT (affects all ICDP channels simultaneously)
ICDP Channel #1:

Bit #4 - PUMP_DATA0 (PUMP_DATA for ICDP #1)
Bit #5 - BUF_FULL0 (BUF_FULL for ICDP #1)
Bit #6 - HAVE_DATA0 (HAVE_DATA for ICDP #1)

ICDP Channel #2:

Bit #7 - PUMP_DATA1 (PUMP_DATA for ICDP #2)
Bit #8 - BUF_FULL1 (BUF_FULL for ICDP #2)
Bit #9 - HAVE_DATA1 (HAVE_DATA for ICDP #2)

ICDP Channel #3:

Bit #10 - PUMP_DATA2 (PUMP_DATA for ICDP #3)
Bit #11 - BUF_FULL2 (BUF_FULL for ICDP #3)
Bit #12 - HAVE_DATA2 (HAVE_DATA for ICDP #3)

My notes on modifying the firmware (C code) to support multiple datapaths:

Change data structure in icdp_driver.h to include the PMT input channel #, and define more bitmasks.
Change the interrupt service routine to scan to see which PMT(s) raised their HAVE_DATA flags, then read from the appropriate one(s).
Change the routine that streams data to the server to include the PMT ID in the PULSE output message.

While coding, I realized the following changes to the register format made sense:

Global control bits:

Bit #0 - ICDP_RESET (output; affects all ICDP channels simultaneously)
Bit #1 - ICDP_RUN_PAUSEn (output; affects all ICDP channels simultaneously)
Bit #2 - ICDP_NEG_INPUT (output; affects all ICDP channels simultaneously)

ICDP Channel Selector:

Bit #3 - ICDP_SEL0 (output; low-order bit of ICDP channel #, 0-3, to select)
Bit #4 - ICDP_SEL1 (output; high-order bit of ICDP channel #, 0-3, to select)

Bit #5 - ICDP_PUMP_DATA (output; controls data pump for the currently selected channel)

Edge-capture interrupt flags (inputs):

Buffer-full flags:

Bit #6 - BUF_FULL0 (BUF_FULL for PMT #1)
Bit #7 - BUF_FULL1 (BUF_FULL for PMT #2)
Bit #8 - BUF_FULL2 (BUF_FULL for PMT #3)

Have-data flags:

Bit #9 - HAVE_DATA0 (HAVE_DATA for PMT #1)
Bit #10 - HAVE_DATA1 (HAVE_DATA for PMT #2)
Bit #11 - HAVE_DATA2 (HAVE_DATA for PMT #3)

This reformatting entailed two major changes:

All edge-capture interrupt flags of a given type were put together, so that simply shifting the mask for channel #0 left by an amount given by the channel number is sufficient to compute the mask for a given channel number.
A two-bit channel select control output was added, so that we would still only need 1 icdp_data PIO to read from any of the (up to 4) different PMT input channels. This controls a MUX for the 32-bit data output words, plus a DEMUX to route the single PUMP_DATA control signal to the currently-selected datapath (we only want to pull data from one datapath at a time).
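
To make the mask arithmetic concrete, here is a rough Python model of the revised register layout (the real firmware is C, so this is only a sketch). The bit positions come from the list above; the helper names buf_full_mask, have_data_mask, select_channel, and channels_with_data are made up for illustration.

```python
# Bit positions taken from the revised icdp_ctrl layout.
ICDP_RESET = 1 << 0
ICDP_RUN_PAUSEn = 1 << 1
ICDP_NEG_INPUT = 1 << 2
ICDP_SEL_SHIFT = 3                      # ICDP_SEL0 is bit 3, ICDP_SEL1 is bit 4
ICDP_SEL_MASK = 0b11 << ICDP_SEL_SHIFT
ICDP_PUMP_DATA = 1 << 5
BUF_FULL0 = 1 << 6
HAVE_DATA0 = 1 << 9
N_CHANNELS = 3

def buf_full_mask(channel):
    """BUF_FULL flag for a channel: the channel-0 mask shifted left by the channel number."""
    return BUF_FULL0 << channel

def have_data_mask(channel):
    """HAVE_DATA flag for a channel, computed the same way."""
    return HAVE_DATA0 << channel

def select_channel(ctrl, channel):
    """Return ctrl with the two-bit selector field set to `channel` (0-3)."""
    return (ctrl & ~ICDP_SEL_MASK) | ((channel & 0b11) << ICDP_SEL_SHIFT)

def channels_with_data(flags):
    """Scan the flag word and list the channels whose HAVE_DATA bit is up."""
    return [ch for ch in range(N_CHANNELS) if flags & have_data_mask(ch)]
```

An interrupt handler built along these lines would call channels_with_data() on the edge-capture word, then for each channel returned, set the selector and pulse ICDP_PUMP_DATA to read that channel's words.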

Monday, August 22, 2011

Moon Day

Plan for today:

Test new FEDM heartbeat function.
When Darryl gets here, have him start work on replicating the datapath.

Had a thought about why the software doesn't start running automatically after board programming: Currently, .rwdata is allocated to the working_memory module, which can't be initialized because it is in M-RAM. Yet, .rwdata may contain global variables that need to be initialized. We could try moving it to the extra_RAM module (in M512 blocks) and see if that fixes the problem. However, depending on how big the .rwdata section is, we might not have room for it...

Alternatively (looking through docs), it looks like turning on the enable_alt_load_copy_rwdata flag in the BSP might help. (This might also solve the current problem that processor reset doesn't work properly.) I'll try that first, and if that doesn't help, I'll try moving .rwdata to extra_RAM.

Currently waiting for Quartus to finish a recompile, with the new timer_0 peripheral added to the SOPC design, and the enable_alt_load_copy_rwdata flag turned on in the new BSP (which I already compiled). At some point, I need to take a look at incremental compilation to speed up Quartus!

OK, enable_alt_load_copy_rwdata broke everything; not sure why. Turned it back off and things worked at first, but then the FW crashed after producing two heartbeats! Weird.

Moved .rwdata to extra_RAM temporarily and rebuilding, so I can assess whether we have enough M512's left to get away with that. Nope, I'm 544 bytes short. But, that's not too bad! According to the fitter report, we have 17 M512s left, which is 32x18x17= 9,792 bits = 1,224 bytes > 1 kB. So we could increase the extra_RAM size by 1K (from 8K to 9K), which should free up enough room for the .rwdata (for now).

Regenerated SOPC system, opened BSP editor, went to Linker Script tab and hit "Restore Defaults..." in the top pane. extra_RAM size now shows as 9216 bytes as expected. Regenerating target files. Rebuilding project.

Argh, now it's complaining it needs another 900 bytes for the .bss section. Wonder if I should just move .bss back to working_memory. I think I saw code that initializes it to 0 on startup anyway. Let's try that... Editing BSP, regenerating files, rebuilding...

Now having a GP offset problem, presumably resulting from moving .rwdata into extra_RAM. Apparently GP points into the .rwdata section, but also needs to point to the start of working_memory (perhaps to initialize .bss), which is now too far away. Relocating the 9K extra_RAM to 0x1c000-0x1e3ff, and the 64K working_memory to 0x20000-0x2ffff, so that the GP (which points to somewhere in extra_RAM) doesn't have to reach out as far to reference the start of working_memory, and the offset will fit in 16 bits.
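
A quick arithmetic check of that relocation, as a Python sketch. The region bases and sizes are the ones given above. Treating "fits in 16 bits" as a signed 16-bit offset is my assumption, and the worst case assumes GP could sit anywhere in extra_RAM, since the actual GP value chosen by the linker is not recorded here.

```python
EXTRA_RAM_BASE, EXTRA_RAM_SIZE = 0x1C000, 9 * 1024
WORKING_MEM_BASE, WORKING_MEM_SIZE = 0x20000, 64 * 1024

def last_address(base, size):
    """Highest byte address covered by a region."""
    return base + size - 1

def fits_signed_16(offset):
    """True if the offset is representable as a signed 16-bit value (assumed limit)."""
    return -0x8000 <= offset <= 0x7FFF

# Largest reach is from the very start of extra_RAM up to the start of working_memory.
worst_offset = WORKING_MEM_BASE - EXTRA_RAM_BASE

print(hex(last_address(EXTRA_RAM_BASE, EXTRA_RAM_SIZE)))      # end of extra_RAM
print(hex(last_address(WORKING_MEM_BASE, WORKING_MEM_SIZE)))  # end of working_memory
print(hex(worst_offset), fits_signed_16(worst_offset))
```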

Saving SOPC system, starting BSP editor, restoring defaults, saving BSP, regenerating BSP makefiles, rebuilding project. It links!

Sweet; by moving .rwdata to extra_RAM we actually increased the available stack+heap space to 62K. Changed the code to use 62-15 = 47KB for the pulse buffer.

Rebuilding Quartus so that (with luck) the system will actually run automatically on startup again now. It does! I should perhaps retry enable_alt_load_copy_rwdata, but maybe it's only supposed to be on when the design is loaded from the EEPROM?

Still having a crash after 2nd heartbeat. Maybe I have to stop the alarm within the callback in order to reuse the same alarm data structure? No, that made things worse. Finally solved the problem by alternating between two different alarm data structures.

Things are running pretty smoothly now. Trying to recompile the present firmware into Quartus so that I can try running it from EEPROM. That works! Next up: Try to get the CPU_RESET command working. (Don't waste *too* much time on it, though...)

Sunday, August 21, 2011

Sol's Day

OK, finally got 9.1sp2 versions of both Quartus and the Nios II EDS successfully installed on my Inspiron at home. For future reference, Google "altera download center" to more quickly find any version of any Altera software.

I determined that, indeed, an interval timer peripheral needs to be added to the SOPC design in order for the alarm function to be available. In our case, this system timer device is called "timer_0".

The new code with the heartbeat feature added is compiled, but the feature still remains to be tested. Copying the code to the FEDM_code v3 folder on Dropbox so I can test it at work tomorrow...

Saturday, August 20, 2011

Saturn's Day

Thinking a little about the heartbeat function. Actually, rather than polling, I could use alt_alarm_start(). Need to figure out if there is a system clock peripheral set up in the BSP. Tried to do this from home, but first I need to install the EDS.

Friday, August 19, 2011

Frigg's Day

See Wikipedia.

Today I'm getting a consistent 175 Hz pulse rate, after turning off INFO messages. If I also turn off warnings, we avoid crashes, and get LOST_PULSES messages that tell us how many pulses were dropped.

Based on the present line length, my calculation is that a max of about 180 Hz is theoretically achievable based on the 115,200 baud rate, 10 bit-clocks per byte, 64 bytes per line. In practice it looks like we can only get about 175 Hz sustainably though. This is not too surprising, since there will inevitably be some amount of dead time in the serial link due to flow control.
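
The same estimate as a small Python sketch, using only the numbers quoted above (115,200 baud, 10 bit-clocks per byte, 64 bytes per line). The function name max_line_rate is hypothetical.

```python
def max_line_rate(baud, bytes_per_line, bits_per_byte=10):
    """Upper bound on lines per second over a serial link with no idle time."""
    bytes_per_second = baud / bits_per_byte
    return bytes_per_second / bytes_per_line

theoretical_hz = max_line_rate(115200, 64)
observed_hz = 175
link_utilization = observed_hz / theoretical_hz

print(f"theoretical max: {theoretical_hz:.0f} Hz")
print(f"observed rate uses {link_utilization:.1%} of the link")
```

Shortening the PULSE line is the cheapest lever here: the bound scales inversely with bytes per line.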

Ray is talking about writing a paper. I need to read the Stanev book, also "Astroparticle Physics" by Grupen, which he says I can get from the library.

On tap for next week (Ray's priorities):

Replicate the input capture datapath 3 times (not 4 yet) for the 3 paddles;
Also generalize the firmware to handle the multiple input sources;
Work on coincidence detection algorithm
Analysis for shower direction triangulation

Also, my own to-dos:

It would be nice to have an occasional "heartbeat" from the FEDM, which will confirm that the main loop of the firmware is still running and hasn't crashed (to distinguish this from a case where the PMT is powered down or malfunctioning). Perhaps a good way to do this would be to just poll the newlib clock() function in the main loop. I need to write a new module heart.{c,h} to do this. Heartbeats should carry a sequence number.
Figure out why firmware isn't running automatically when .sof file is loaded. Somehow it's not getting compiled into the design (I just did a recompile today, still no dice).
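
The polled heartbeat in the first to-do could look roughly like this. It is modeled in Python rather than the firmware's C, with time.monotonic() standing in for clock(); the Heartbeat class and the "HEARTBEAT n" message text are placeholders of mine, not the actual FEDM message format.

```python
import time

class Heartbeat:
    """Emit a numbered heartbeat whenever at least `interval_s` has elapsed."""

    def __init__(self, interval_s, now=None):
        self.interval_s = interval_s
        self.seq = 0  # sequence number carried by each heartbeat
        start = time.monotonic() if now is None else now
        self.next_due = start + interval_s

    def poll(self, now=None):
        """Call once per main-loop pass; returns a message when a beat is due, else None."""
        now = time.monotonic() if now is None else now
        if now < self.next_due:
            return None
        self.seq += 1
        self.next_due = now + self.interval_s  # schedule relative to this beat
        return f"HEARTBEAT {self.seq}"
```

Because the beat is produced by the main loop itself, a gap or a stall in the sequence numbers at the server points at the firmware rather than at a quiet PMT.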

BTW, I confirmed we're still working at 2 ns time resolution, although the timing analyzer reports fmax=480MHz.

Thursday, August 18, 2011

Thor's Day

Still having trouble getting comfortable & getting to sleep at night. The back pain is a little better since I started sleeping on the other side of the bed, as well as taking 2 aspirin & 2 Tylenol before bed, but now the breathing machine is bothering me. Had to sleep without it last night. Actually felt pretty good in the morning though, and was able to get up earlier than usual.

On tap for today:

Test the flyover mode which I implemented last night. If it works, make it the default mode.
Test together: Darryl's ICDP run/pause, new faster baud rate, flow control, flyover. If it all works, see how fast a pulse rate I can get through the wireless; if not, debug.

Placed a homemade bridge on jumper J50, which connects the UART_DTR (data terminal ready) output to ground. This is there to ensure that the Wi-Fi board's autorun script is automatically invoked. The only other DTE (data terminal equipment, host side) outputs from the DE9 port are TxD and RTS, and those are now controlled by the uart_0 device.

Let's now install the new script with flyover support on the WiFi board.

Starting COSMICi server app (C:\Users\Mike\Documents\My Dropbox\Server Code\COSMICi_server.py).

Starting UwTerminal (C:\EZURiO\UwTerminal.exe) at 115,200 baud, with CTS/RTS handshaking on.

Powering up Wi-Fi board... Trying basic commands...

<ENTER> --> 00
at<ENTER> --> 00
at+dir<ENTER> -->

06 factory.def.pc
06 nodeid.txt
06 strings.txt
06 autorun

That all is as expected. Now, let's wipe the board and load the new script...

at & f * -->
01 E03C MEDIA_CORRUPT (this is normal)

Power-cycle board.

at+dir now gives just factory.def.pc

Increase stack sizes:

at+set 42="128"
at+set 40="1000"
at+set 41="1000"

Re-downloaded data files nodeid.txt, strings.txt... Done; they are now in the dir.

Cross-compile and load autorun.uws... Cross-compile succeeded, now loading... Hope there's enough memory! Looks like it. at+dir now shows autorun.

Unchecking DTR checkbox to enable autorun. Hitting reset button...

The Main, Auxio, and UART connection windows popped up as expected. WIFI_READY appears in the UwTerminal. The help menu in the AUXIO window shows the new flyover command.

Typing "loglevel 1" in AUXIO window to turn on INFO messages, so I can see what's happening in more detail when I run the "flyover" command. That worked.

Now typing "flyover" command... It thinks it was successful. Let's test. First, output: Type "hello world" in the UART-bridge window and hit enter. It appeared in the UwTerminal. Next, input: Type "cowabunga dude" in the UwTerminal window (with LineMode off) and hit ctrl-j. It appeared in the UART-bridge window. Great!

Now, on the WiFi board, I'm going to switch off input from the level-shifter, by tying pin 2 of J10 to pins 1 and 2 of JP2.

Now let's power everything up so we can test the FEDM code at the new faster baud rate, and with CTS/RTS. Set the power supply to 6V, and turned it on. Current at 1.928A.

Setting up the chiller; setting 2nd power supply to 5V, connected in parallel to fan & Peltier cooler; current is about 1.2A.

Burning the .sof file for the current project version (New_with_Nios_trim). That is at 100%; no output on the UART bridge yet.

Running a nios2-terminal to see if there is any buffered output on the JTAG serial port. Nope. (I wonder why that stopped working a while back?)

OK, let's run within the IDE. Refreshing JTAG connection, running. OK, got FEDM_STARTING and FEDM_READY messages in the UART-bridge server window. Let's now try sending some pulses.

Turning on function generator, recalling setup 1, which is a 2.6V negative pulse, 1 Hz frequency, 10 us pulse width, 20 ns leading-edge time, and 60 ns trailing-edge time. Turning on scope to view input pulses and data being pumped out of ICDP.

Oops, I realized there is not yet code in main to unpause the ICDP. Added that, re-running.

Typing "HSC_STOP" and "HSC_RESET" commands in the Nios terminal window so I can turn on the pulse generator safely.

Turning on pulse generator. See pulses on scope.

Typing "HSC_GO". Now seeing pulses. I forgot to take away the "+" signs. Doing that now... Oops, fixed the leading-edge ones but not the trailing-edge ones! Correcting...

OK, that's better... Now, let's try increasing the pulse rate.

Tried 1 Hz thru 64 Hz at steps of 2x. 1 Hz through 32 Hz worked fine if jerkily. Flow control seems to be working; the serial errors are far fewer! 64 Hz resulted in a buffer full warning (not unexpected), and then STDOUT hung and the application stopped.

Oh, I need to fix the pulse buffer module which I figured out the other day is not actually thread-safe. However, I figured out how to make it thread-safe: Keep separate n_pulses_written and n_pulses_read variables, each updated only by its respective "thread" (really just execution context), and differentiate between buffer full and empty conditions based on the difference between these. Let me make that change now.
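
Here is a minimal Python model of that counter-based scheme, just to pin down the full/empty logic. The counter names n_pulses_written and n_pulses_read are from the note above; the PulseBuffer class and its methods are illustrative and not the actual pulsebuf.c interface.

```python
class PulseBuffer:
    """Single-writer/single-reader ring buffer using two ever-increasing counters."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.n_pulses_written = 0  # updated only by the writer (ISR context)
        self.n_pulses_read = 0     # updated only by the reader (main loop)

    def is_empty(self):
        return self.n_pulses_written == self.n_pulses_read

    def is_full(self):
        return self.n_pulses_written - self.n_pulses_read == self.capacity

    def put(self, pulse):
        """Writer side: returns False (pulse dropped) if the buffer is full."""
        if self.is_full():
            return False
        self.slots[self.n_pulses_written % self.capacity] = pulse
        self.n_pulses_written += 1  # publish only after the slot is filled
        return True

    def get(self):
        """Reader side: returns None if there is nothing to read."""
        if self.is_empty():
            return None
        pulse = self.slots[self.n_pulses_read % self.capacity]
        self.n_pulses_read += 1  # release the slot only after copying it out
        return pulse
```

Since each counter has exactly one writer, neither context ever does a read-modify-write on a variable the other one updates. In C the counters would be fixed-width unsigned integers; their difference still comes out right across wraparound, and a power-of-two capacity keeps the slot index continuous when a counter wraps.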

OK, that change is made. Trying to run it... Hm, firmware didn't start up properly... Devices could be hosed... Let's try reloading the gelware... OK, that helped.

Tried 50 Hz; this eventually led to a bunch of full buffer warnings. That's more like it... Changed code to continue draining pulses from datapath even if they have to be discarded. I'm torn about whether it's better to do that or just reset the datapath.

Had a run that successfully streamed data at 100 Hz (pulses per second). Tried 150 Hz; this led to a buffer-full event. This seems to be crashing STDIO; possibly it's that the re-entrant output routines are not really safe to use without MicroC/OS-II. I should try it again at a higher reporting level to suppress the STDOUT output, and see if at least we get the lost-pulse reports on the wireless link - that's more important anyway.

Late in the day Ray and I did a successful test with a real PMT. Video follows:

Wednesday, August 17, 2011

Odin's Day

Just learned from Wikipedia that Wednesday derives etymologically from "Woden's day," where Woden is an Old English name for Odin, ruler of Asgard in Norse mythology. You learn something new every day!

Things to do today:

[ ] See what pulse rate I can get at 115,200 baud when connecting directly to the PC.
[,] Implement flyover mode on Wi-Fi board and see what pulse rate I can get with that.
[,] Look at hardware flow control (CTS/RTS).
[ ] Maybe design a solution for adjusting the baud rate in software.

I did a quick check on the flow control issue... Although it appears to be supported both by the Altera UART and by the EZURiO module, of course (duh, I forgot) there are no pins for it in the Sparkfun level shifter module, so there is no way to get it working at the moment, unless we build our own new level shifter.

We went to Radio Shack and got components to breadboard a new level shifter that will handle the hardware handshaking signals. However, first I might try just bypassing the RS-232 levels entirely, like I did when communicating with the DE3 board and the Wi-Fi module.

I'll need to solder a new DE9 (D-sub 9-pin) connector: A female connector, with the following pins connected by medium-length wires to pins inserted in the complementary expansion-header through-holes of one of the Wi-Fi boards:

J2 pin 8 (M_DCD) --> DE9 pin 1 (DCD)
J2 pin 21 (M_TX) --> DE9 pin 2 (RxD) (Needed)
J2 pin 25 (M_RX) <-- DE9 pin 3 (TxD) (Needed)
J2 pin 10 (M_DSR) <-- DE9 pin 4 (DTR) (Needed - Autorun signal)
J2 pin 38 (GND) --- DE9 pin 5 (GND) (Needed)
J2 pin 12 (M_DTR) --> DE9 pin 6 (DSR)
J2 pin 19 (M_CTS) <-- DE9 pin 7 (RTS) (Needed for flow control)
J2 pin 25 (M_RTS) --> DE9 pin 8 (CTS) (Needed for flow control)
J2 pin 6 (M_RI) --> DE9 pin 9 (RI)

Actually, before I go to all that trouble, let me first just try using the one that I already made earlier for use with the DE3 board (GPS app). We will need to build a new one later though.

As a bonus, with this approach, we can monitor a duplicate copy of the Wi-Fi board's serial output in a UwTerminal window on the PC, using the normal serial port (which is still enabled for output only).

Installing the latest autorun file on that Wi-Fi board. It works fine for communications with the PC. Now regenerating the SOPC system with UART flow control support turned on. I imagine I will probably also have to regenerate the BSP and recompile. Doing that now. It compiles fine, but I'm still not sure if the STDIO library is really using those signals. We'll find out...

After we got back from Radio Shack, Darryl made the changes to add the enable signal (RUN_PAUSEn) throughout the input capture datapath, and got it to compile. I'm combining that with my changes to add the CTS/RTS signals, and recompiling.

I think I'll wait to actually test these changes until tomorrow, since it's getting late... Heading home now.

At home, did some work to prepare for setting up flyover mode in the script. Added support for it to modules\network\bridges.uwi, so that we can now just do goto_mode(BM_FLYOVER) to enter flyover mode, and we can leave flyover mode by just calling goto_mode() with any other bridge mode.

Added a "flyover" script command in modules\commands\cmd{parsing,handlers}.uwi.

After we've verified that the "flyover" command works, we can change 4_main.uwi to make it the default bridge mode (change the docmd_trefoil() call to docmd_flyover()).

Regarding baud rate setting: It would be easy enough to add commands to the FEDM and WiFi board's command-line interpreters to change the baud rate, but we have to be careful... After changing the FEDM's baud rate, communication to it will be cut off until the WiFi board's baud rate is also changed. The command to the FEDM can be called "BAUD", while the one to the WiFi board can be "baud". In either case, its single argument is one of the supported baud rates, as a decimal integer. It would be good after changing the baud rate on both sides to do a PING to the FEDM to verify that communication is still functioning.

But it's getting late now, so I think I'll wait to do those particular changes until tomorrow...

Also, note to self: Take the "+" signs out of the PULSE message to save bandwidth.

Tuesday, August 16, 2011

Tuesday Bluesday

I'm having a lot of back pain, night-time headaches, difficulty getting up in the morning. Sigh. I don't know how I am going to survive the next year.

It's possible the re-entrant versions of the stdio routines actually aren't appropriate to use when MicroC/OS-II (the embedded real-time operating system) isn't running. So, maybe they are just deadlocking when the ISR tries to print while the main thread is printing, whereas with a uC/OS2 based design, there would be a timer-induced preemptive context-switch between threads that would break the deadlock. Anyway, I could try using the RTOS features, but I don't know whether there is enough memory on the board for it... I should try the "Hello World" example. Anyway, I ordered a book on the OS so I can learn more about it. (Altera's documentation on it is pretty sparse.)
Another approach, short of installing a whole RTOS, would be to serialize the output myself using my own line-buffered output stream combinator. However, this would take more memory. Also, when this buffer fills up, how can a warning message about this situation be printed? And, what is it that is really hanging, output to the stdout console via JTAG, or output to the serial UART? (I suspect it must be one of these.)
I should run the firmware under the JTAG debugger, and, when it hangs, stop and see where it is. This might tell me something.

Also, need to back up the current design, and move it back to the shared drive.

I backed up the contents of R:\ (C:\LOCAL\FEDM_code\q91) on Dropbox in "FEDM_design\FEDM_code v2", and also copied it to the Q:\ network drive (C:\SHARED\FEDM_code\q91). The old contents of the Q:\ drive were moved to C:\SHARED\FEDM_code_v3.

I am now working in Q:\ (shared) again, instead of in R:\ (not shared).

I am recompiling the Quartus design with the latest firmware integrated in.

Had a weird problem where the debugger crashed... Hmm.

I just looked at the disassembly in FEDM_ctrl_fw.objdump, and it appears that our attempt at semaphores by doing ++ and -- on global variables isn't really working, because those operations aren't atomic; the adds are done in registers, rather than directly on memory locations (RISC architecture).
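
To spell out the failure, here is one unlucky interleaving replayed step by step in Python. The variable name lock_count is hypothetical, and the load/add/store split mirrors what the note above describes in the disassembly; this is a deterministic replay, not real concurrency.

```python
memory = {"lock_count": 0}  # hypothetical shared global

def load(name):
    return memory[name]

def store(name, value):
    memory[name] = value

# Main context starts lock_count++ and gets as far as the load.
main_reg = load("lock_count")       # main register holds 0

# Interrupt arrives here; the ISR runs its own complete lock_count++.
isr_reg = load("lock_count")
isr_reg += 1
store("lock_count", isr_reg)        # memory now holds 1

# ISR returns; main resumes with its stale register value.
main_reg += 1
store("lock_count", main_reg)       # memory holds 1 again, not 2

increments_attempted = 2
lost_updates = increments_attempted - memory["lock_count"]
print("lost updates:", lost_updates)
```

The ISR's increment is silently overwritten, which is why a counter used as a semaphore this way cannot be trusted.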

Added a sequence number on output pulses, and noticed something interesting - some of the output is apparently getting lost somewhere in transit, because there is a jump in the sequence numbers of the lines arriving at the server.
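
On the server side, spotting those jumps could be as simple as the sketch below. It assumes the sequence numbers have already been parsed out of the incoming lines into integers; find_gaps is a name I made up, not something in the existing server code.

```python
def find_gaps(sequence_numbers):
    """Return (first_missing, last_missing) pairs for each jump in the sequence."""
    gaps = []
    previous = None
    for seq in sequence_numbers:
        if previous is not None and seq > previous + 1:
            gaps.append((previous + 1, seq - 1))
        previous = seq
    return gaps

received = [1, 2, 3, 7, 8, 10]
for first, last in find_gaps(received):
    print(f"missing {last - first + 1} line(s): {first}..{last}")
```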

I wonder if the writes to the serial port are being done in nonblocking mode, in such a way that if the serial output buffer is full, data just gets lost?

Anyway, the problem (or at least, the most immediate problem) is NOT in the pulse buffer.

The problem (hanging of the serial output) seems to only happen for pulse rates over about 30 Hz or so.

Here's an idea: Let's send less data for each pulse, and see if it decreases the problem. This might be further evidence that it is tied to serial output buffering.

Another idea: Increase the serial data rate.

Decreasing the length of the lines seemed to help. Let's now try upping the baud rate to 115,200 baud. Changed it in SOPC builder; now regenerating... And rebuilding firmware... Also had to change the Wi-Fi script, reloading that... Also recompiling the Quartus design...

Oops, just noticed that the IDE is still referencing the R:\ drive... Had to spend a while fixing that.

Still having problems at 40 Hz pulse rate, even with the faster baud rate. And now, mysteriously, serial input isn't working.

While testing output from the Wi-Fi board, I noticed the Python server isn't treating Control-M (CR) like end-of-line; it is waiting for the next character first. This is suboptimal. However, it is difficult to fix since I am using the existing .readline() method. I'd have to modify how that facility works to fix it. I think I actually noticed this problem before. Oh well, it doesn't matter too much since Control-J (LF) still works.
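
If this ever becomes worth fixing, one possible approach (a sketch only, not how the server currently works) is to split lines by hand instead of relying on .readline(): end the line as soon as a CR arrives, and quietly swallow an LF if it turns out to be the next byte. The LineSplitter class below is hypothetical.

```python
class LineSplitter:
    """Incrementally split a byte stream on CR, LF, or CRLF without waiting after CR."""

    CR, LF = 0x0D, 0x0A

    def __init__(self):
        self._partial = bytearray()
        self._skip_lf = False  # True right after a CR has ended a line

    def feed(self, data):
        """Accept a chunk of bytes and return the list of lines it completes."""
        lines = []
        for byte in data:
            if self._skip_lf:
                self._skip_lf = False
                if byte == self.LF:
                    continue  # second half of a CRLF pair; line was already emitted
            if byte == self.CR:
                lines.append(bytes(self._partial))
                self._partial.clear()
                self._skip_lf = True
            elif byte == self.LF:
                lines.append(bytes(self._partial))
                self._partial.clear()
            else:
                self._partial.append(byte)
        return lines
```

The trade-off is that the server would then have to read raw chunks from the socket and feed them through this splitter itself.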

Can't figure out why the UART is ignoring serial input at this baud rate. Looks like we will have to revert to the slower baud rate.

I think we should cut our losses WRT trying to solve this problem. We'll just have to be satisfied with an event rate of no more than about 30 pulses per second (sustained average), and occasional dropped pulses when this rate is exceeded. I was hoping we could do better than that... But if the stdio library is too crappy... Although to be fair, the problem could be with limitations in the Wi-Fi board's bridge facility instead, like inadequate buffer sizes... Since there's no flow control, if the Wi-Fi network isn't keeping up with the data rate, there's really nothing it can do, if it runs out of places to put serial data...

This brings up another question: Can we use CTS/RTS flow control? Check on this tomorrow.

Another thought: We should perhaps try the (reportedly faster) flyover mode of the Wi-Fi board. This will require some extensive changes to that script though.

Well, I took the baud rate back down to 57,600 and serial input is still not working! Connected directly to the PC serial port - it works from there! So I don't know WTF is up with that. In positive news, though, when streaming directly to the PC serial port, we were able to sustain pulse rates around 70 Hz (theoretical maximum). 100 Hz caused the buffer to overflow, as expected.

I should perhaps try re-upping the baud rate to 115,200 and checking again to see what sustained data rate I can get without the Wi-Fi in the loop. I should be able to get about 140 pulses per second, at least.

However, I eventually still need to fix the Wi-Fi input & data rate problems.

Note added later: Tuesday stands for Tiwaz' day.

Monday, August 15, 2011

Just another manic Monday

Today, starting on my To-Do list from Sunday.

BTW, I brought in a M-M DE9 cable I happened to have at home, so that I could get rid of the gender bender in the FPGA<-->WiFi serial link. Hopefully, there are no wires crossed in the new cable (although if there are, I might be able to get rid of the null modem adapter as well).

First up: Change data output format. This involves changing the server_stream_pulses() routine in server.c.

While looking at the newlib docs, I realized why my earlier conditionally-reentrant PRINTF() macro didn't work: I wasn't using the "v" versions of the formatted output routines. Fixed that, trying again. OK, that is working now.

Some temporary glitches I ran into while starting things up:

Power-cycled Wi-Fi and it lost its settings.
Re-running our firmware within IDE didn't work - had to re-program board in Quartus.

Next: Add reporting levels. Did that.

Darryl came by and we spent some time fixing various compile errors in his modifications to implement the RESET control for the IC datapath, then tested it. It seems to be working. The RESET user command is now implemented.

At the end of the day, we were trying to debug an issue I have been seeing for a couple of days where the firmware seems to hang up when it is consistently getting too high a rate of events. If it can't keep up, it should drop pulses, but it shouldn't get choked up! Need to debug...

Senior Discount

It looks like we'll probably be able to get a senior design team to work on the project this year, in exchange for academic credit. Here are some parts of the system that they can perhaps take on and be responsible for:

Electrical engineers: Power supplies, pulse stretcher/amplifier (?), optoelectronics. Simplified board design?
Mechanical engineers: Enclosures for electronics, mounts for optical components, mirror assemblies. Mounting hardware for ceiling assembly.
Computer engineers: Timing sync input capture datapath, ADC interface (?), server-side features (data analysis, visualization, mapping, database storage). (Although really, CS students might be better for the pure programming aspects.)

Sunday, August 14, 2011

Look Ma, no wires!

Just taking some brief notes on Sunday to prepare for the week. On Friday I got buffered pulse data transmitting successfully over the wireless network to the server. Some things to do this week:

Make the data format a little more concise, so we can handle faster pulse rates - just send the starting time and the deltas between steps. Maybe do some more error checking - e.g. make sure all deltas are positive and not overly large.
Add reporting levels to firmware diagnostics, to suppress debug output when not needed. Consider adding a network logging capability for remote diagnostic purposes.
Write a little visualization widget on the server, so we can actually see the pulse shapes.
Ray is getting the physical scintillator/PMT apparatus ready, and then we can connect it to the front-end board, and send real data to the server.
We can actually go ahead & start logging pulse data on the server, although not yet with the accurate time values.
We can start work on integrating the GPS app into the system. However, the new datapath for time sync pulses still needs to be written (although in the meantime, we could maybe just use another copy of the existing datapath temporarily).
We can replicate the datapath 4 times for the 4 different PMT inputs, & dremel down the other 4 SMA connectors and solder them all to the board.
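
For the first item (starting time plus deltas, with error checking), a rough Python sketch of the encoding might look like this. The MAX_DELTA limit and the function names are placeholders; the real limit would depend on the longest plausible gap between steps of a pulse.

```python
MAX_DELTA = 1 << 16  # hypothetical sanity limit, in clock ticks

def encode_pulse(times, max_delta=MAX_DELTA):
    """Turn absolute step times into (start_time, deltas), validating each delta."""
    if not times:
        raise ValueError("pulse has no time values")
    deltas = [later - earlier for earlier, later in zip(times, times[1:])]
    for delta in deltas:
        if delta <= 0:
            raise ValueError(f"non-positive delta: {delta}")
        if delta > max_delta:
            raise ValueError(f"delta too large: {delta}")
    return times[0], deltas

def decode_pulse(start, deltas):
    """Rebuild the absolute times from the starting time and the deltas."""
    times = [start]
    for delta in deltas:
        times.append(times[-1] + delta)
    return times
```

The saving comes from the deltas being small numbers that print in a few digits, whereas every absolute time needs the full counter width.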

Friday, August 12, 2011

Running in Circles

I had to go over to the main office today to sign the Fall employment paperwork, but she wasn't there, so I had to go back over again later in the day... Anyway, that's done now.

Today, my plan is to implement the circular buffer for pulse data, the purpose of which is to allow us to respond in a timely manner to input-capture interrupts (to prevent pulses from getting lost in the datapath), while also streaming data out to the server over the network at a manageable rate.

Yesterday I wrote the buffer.h file defining the interface, and today I just need to add one function to it (pb_init()), and then write the buffer.c implementation, and test.

Today I also want to test Darryl's ICDP reset code, which he finished yesterday.

One more little thing: Have the BUF_FULL interrupt request the main loop to stream out a notification of this event to the Wi-Fi port. That's done... I created a new module server.h to centralize the code to send messages out to the server.

At some point, we need to create a debug_level variable to allow us to suppress unwanted diagnostics from being sent to the JTAG debug port, since this slows things down. We can have a standard set of levels such as DEBUG, INFO, NORMAL, WARNING, ERROR, MUTE (from most to least information). Write a new module debug.{c,h} to keep track of that information.
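
The level scheme could be modeled like this (Python sketch; the firmware version would be C). The level names and their ordering are from the paragraph above, while the numeric values and the should_emit helper are my own assumptions.

```python
from enum import IntEnum

class Level(IntEnum):
    """Reporting levels, ordered from most to least information."""
    DEBUG = 0
    INFO = 1
    NORMAL = 2
    WARNING = 3
    ERROR = 4
    MUTE = 5

def should_emit(message_level, debug_level):
    """A message goes out if it is at least as severe as the current setting."""
    if debug_level == Level.MUTE:
        return False  # MUTE suppresses everything, even errors
    return message_level >= debug_level

debug_level = Level.WARNING
for level in (Level.DEBUG, Level.INFO, Level.WARNING, Level.ERROR):
    print(level.name, should_emit(level, debug_level))
```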

Spent most of the day writing pulsebuf.{c,h} (new name of module). Got it working right at the end of the day. Now that it is working, I need to tweak a few things:

* Suppress the verbose pulse data diagnostics to STDOUT, to improve performance.
* See what the maximum throughput of pulses I can achieve is, without the buffer filling up.
* Write some server-side code to visualize the pulse data... :)

This is a good stopping point for this week!

Thursday, August 11, 2011

Wi-Finally

Yesterday I got the basic two-way Wi-Fi communication between the board and the server working properly, and now I am just making some minor tweaks, thinking about the startup process, etc. Some points:

* I need to test the WIFI_READY message, from the Wi-Fi board to the FEDM. I already verified yesterday that the FEDM_READY message gets propagated correctly, if the Wi-Fi board is already up when the FEDM is started. However, in this scenario, the FEDM won't see the WIFI_READY message, and the Wi-Fi board won't see any messages from the FEDM, unless relayed back by the server (which could certainly be done). So anyway, we need to think more carefully about the turn-on sequence, and what the FEDM will do with its data if/when the server and Wi-Fi board are not yet up and running and ready to receive it. We have enough RAM to buffer up a limited amount of data, several hundred pulses' worth I think (more if compressed), but of course data may still get lost if the server is not up yet. Conceivably, after turning on, the FEDM should always first verify that the server is up and running before it even begins data collection. And the server should perhaps acknowledge receipt of data, and the FEDM could intelligently choke off the data stream if the server fails to respond in time. But, this may all be overkill for what is anyway an abnormal network failure scenario. Anyway, it perhaps deserves some more thought.
* At some point, we should perhaps actually power the Wi-Fi board directly from the FEDM board's power supply, so they will turn on at the same time when the FEDM is powered up. Need to look at the voltages/currents, and verify how to do this. At one point, we were thinking of powering the Wi-Fi board through the existing serial cable (since a pin is allocated for that purpose); this needs to be revisited. If not, we can wire up a separate connection.
* I briefly considered whether the Wi-Fi board should actually be configured and programmed by the FEDM firmware on startup, rather than independently. (The FEDM could be programmed to start at the faster baud rate that is first expected by the Wi-Fi board on turn-on.) However, there is probably not enough ROM capacity on the FEDM to hold the required files. Also, this would slow down the startup sequence significantly. It probably makes more sense to just assume that the Wi-Fi board will retain its configuration information. This is usually the case, although we might very occasionally see situations where the network goes down temporarily, causing the script to get hung up (unfortunately there was not enough script space to program more robust recovery from network failures). When the script hangs, it must be manually reset or power-cycled. If the FEDM controls its power, possibly the FEDM could cause the reset. However, sometimes on reset, the programming is lost. This is why giving the FEDM the intelligence to reprogram the Wi-Fi board might be desirable. However, it is probably not feasible.
* I should probably go ahead soon and write the code to stream data from the FEDM to the server. It would be nice if we had a real multithreaded environment, so that we could just do this in a dedicated thread. However, in the meantime, we could just do it in the main loop. We need some kind of concurrency/locking primitive to safely access a shared data structure, e.g., a circular buffer for pulse data. Actually, one way to accomplish this without an atomic lock operation would be to just have separate read and write pointers into the circular buffer. The writer thread (ISR for the input capture device) updates the write pointer, but conservatively, always making sure the read pointer is far enough ahead to make room for the new data; and only updates the write pointer when it is finished writing the new data. Similarly, the reader thread (main loop) updates the read pointer, but conservatively, always making sure that the write pointer is ahead so that there is data available to read before reading it, and only updating the read pointer after it is finished reading the data.
* We should add a command to soft-reboot the FEDM. There must be a HAL macro or function for this, right? Or maybe a newlib routine?
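
The two-pointer scheme from the circular-buffer point above could be sketched roughly as follows, with one producer (the ISR) and one consumer (the main loop). This is only an illustration under assumptions: all names (`pb_put`, `pb_get`, `PB_SLOTS`, `pb_item_t`) are made up, it uses array indices rather than raw pointers, and it keeps one slot permanently empty so that "full" and "empty" can be told apart. It also assumes a single core where the ISR simply preempts the main loop, so `volatile` indices are enough; other platforms would likely need memory barriers or atomics.

```c
#include <assert.h>
#include <stdint.h>

#define PB_SLOTS 8u                  /* hypothetical capacity; one slot always stays empty */

typedef struct { uint32_t payload; } pb_item_t;   /* stand-in for the real pulse record */

static pb_item_t pb_data[PB_SLOTS];
static volatile uint32_t pb_wr = 0;  /* written only by the producer (ISR) */
static volatile uint32_t pb_rd = 0;  /* written only by the consumer (main loop) */

/* Producer side: returns 0 and drops the item if the buffer is full. */
int pb_put(const pb_item_t *item)
{
    uint32_t next = (pb_wr + 1u) % PB_SLOTS;
    if (next == pb_rd)
        return 0;                    /* no room: reader has not caught up */
    pb_data[pb_wr] = *item;          /* fill the slot first... */
    pb_wr = next;                    /* ...then publish it */
    return 1;
}

/* Consumer side: returns 0 if nothing is available. */
int pb_get(pb_item_t *out)
{
    assert(out);
    if (pb_rd == pb_wr)
        return 0;                    /* empty */
    *out = pb_data[pb_rd];           /* copy out first... */
    pb_rd = (pb_rd + 1u) % PB_SLOTS; /* ...then release the slot */
    return 1;
}
```

Each index has exactly one writer, and each side updates its index only after it has finished touching the slot, which is what lets this work without a lock.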

Verified that the WIFI_READY message is now received correctly by the FEDM when the FEDM is booted before the Wi-Fi module.

Now getting ready to write the code for the data buffer. Before doing this, I want to free up the maximum amount of RAM, so that we can make the buffer as large as possible. (However, it might be a good idea to write the buffer's initialization routine in such a way that it automatically allocates the maximum-sized block of memory that will fit, and reports its size... That way it will automatically adapt as we make other aspects of the code more complex.)
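
One possible shape for such a self-sizing initialization routine, assuming the buffer is taken from the C heap with `malloc()` (the notes don't say whether it will be). The function name `alloc_largest` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Try to allocate room for as many records as possible, starting at max_count
 * and backing off one record at a time until malloc() succeeds. Returns the
 * block (or NULL) and stores the number of records that fit in *count_out. */
void *alloc_largest(size_t record_size, size_t max_count, size_t *count_out)
{
    assert(record_size > 0 && count_out);
    for (size_t n = max_count; n > 0; n--) {
        void *p = malloc(n * record_size);
        if (p) {
            *count_out = n;          /* report the size actually obtained */
            return p;
        }
    }
    *count_out = 0;
    return NULL;
}
```

A caveat with this approach: grabbing every last byte of heap leaves nothing for later allocations, so a real version would probably want to hold some back.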

Relocating extra_RAM to just above program_ROM, to see if this will allow us to put the .bss section there; the point of which is to free up more space in working_memory. (BSS, for "Block Started by Symbol," is the historical name for the space holding statically-allocated writable data that is initialized to zero.)

Memory map is currently:

0x00000-0x0ffff: program_ROM (64K) - .text, .rodata
0x10000-0x117ff: extra_RAM (6K) - .bss
0x12800-0x13097: (misc. memory-mapped hardware device registers)
0x20000-0x2ffff: working_memory (64K) - .rwdata, .heap, .stack

System generation was successful; now regenerating the BSP... Having some difficulty... I think some paths got screwed up when I moved from Q:\ to R:\, at C:\LOCAL\FEDM_code\q91 (which I did yesterday so as not to interfere with Darryl who was working in Q:\). Creating a new workspace and new projects from scratch, in R:\eclipse_workspaces\mpf_workspace and R:\software respectively. Moved all the old workspace/project folders into the old_software subfolder.

OK, apparently .bss can't be located at 0x10000 because it can't be referenced as a 16-bit offset from the global pointer which is near the top of working memory. So, let's relocate extra_RAM to ABOVE working_memory. New memory map is as follows:

0x00000-0x0ffff: program_ROM (64K) - .text, .rodata
0x10000-0x1ffff: working_memory (64K) - .rwdata, .heap, .stack
0x20000-0x217ff: extra_RAM (6K) - .bss
0x22800-0x23097: (misc. memory-mapped hardware device registers)

Regenerating SOPC system, reconfiguring/rebuilding BSP, make clean... Damn, the .bss is 1148 bytes too large to fit in the extra_RAM. Maybe I can make the extra_RAM 2K larger? It would take about 28 more M512s, I calculate, and we supposedly have about 49 left, so theoretically it should fit. Now recompiling Quartus design with 8K extra_RAM (from 0x20000 - 0x21fff).

Incidentally, I should maybe try increasing the working_memory from 64K to somewhat larger (72K), by using a little bit more of the M-RAM, which might be possible if we go from a 32-bit memory word size to a 16-bit word size. I'm not sure SOPC Builder will let me do this without complaining though... Also, even if possible, changing the word size might slow down the code and/or increase the code size significantly. (Well, it might not increase the code size, if the bus handles the transaction width translation transparently from the perspective of the CPU.)

Alright, the fitter completed successfully, so now we are good. Glad to be finally making good use of the M512 memory blocks on the device... Every kilobyte counts! As soon as the Quartus build finishes, I'll retest the system with the memory rearrangements.

Hm, weird, looks like I spoke too soon... The fitter completed, but the assembler appears to have hung. I've never seen that happen before. Killing Quartus and restarting the assembler. Worked fine that time; weird.

Now I'm getting what looks like a stack overflow, even though we have plenty of memory now! What's going on? Doing another "make clean" and recompiling everything from scratch...

All I can think of that's changed is that the BSS area was moved. I may have to move it back...

OK, the Wi-Fi script crashed at some point... Maybe that was the problem... Although I don't exactly understand how that could really have messed up the UART driver at our end badly enough to crash our firmware...

Aha, at some point during my attempted restarts, the Wi-Fi lost the script... Sigh... Reloading it...

I also noticed earlier that something I did (unplugging a cable?) triggered the GPIO interrupt, which caused the script to exit.

Looks like the CPU got wedged to where I can't even reload the firmware; re-programming design... Now reloading/restarting firmware... Still nuthin'... Oh, system IDs aren't matching for some reason... Tell it to ignore that...

Argh, the GPIO crashed the Wi-Fi again... I think I need to disable that particular timer-driven input polling "thread..."

OK, now I am having this really weird thing, where I can communicate just fine over serial between the FEDM <-> PC, and between EZURIO <-> PC, but not between FEDM <-> EZURIO. In the latter case, a null-modem adapter and a gender-bender are inserted. This was working fine yesterday, but isn't today!

Oh, and to make matters weirder, communication works one way and not the other!

On the Wi-Fi board, I added the jumper on J10, between pins 1 and 2, which is supposed to ground the /EN pin of the level-shifter, making sure it is enabled, then reset the board. This seems to have fixed the problem. Not sure why this jumper was missing - I guess because I was manually tying its output to VCC earlier, to disable the level-shifter for purposes of communicating with the DE3 board. Anyway, I guess I was just lucky before that it worked for a while even with the jumper missing (enable pin was floating).

Now, on to the next problem. We have lost the ability to capture pulses, ever since I put the new project together (and increased the size of extra_RAM and moved the BSS section in there). Possibly Darryl's changes (adding a reset control throughout the input-capture datapath) will help, once that's working, but things were working fine before, so it's troubling that suddenly it's not working.

The problem went away after I recompiled the design. Also, when we turn the function generator off and on, it seems to sometimes hose the state of the FPGA; so, let's remember to reload the FPGA after any such changes.

Finally, I can get back to my earlier task: Writing the circular buffer for pulse data.

Let's first calculate the size of the pulse-form data structure. A time_val is 64 bits, i.e., 8 bytes. Thus a RiseFall is 16 bytes. Thus, 96 bytes are needed for the array of 6 rise/fall pairs. Plus 4 more bytes for the integer "nlevels," which gives us 100 bytes altogether.
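
As a sanity check on that arithmetic, here is a guess at what the declarations might look like. The names `time_val`, `RiseFall`, and `nlevels` are from the calculation above; `PulseForm`, `MAX_LEVELS`, and the field names `rise`/`fall` are invented for the sketch. Note that `sizeof(PulseForm)` can come out larger than 100 if the compiler pads the struct to the alignment of the 64-bit members, so the payload size is computed from the members instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS 6                 /* six rise/fall pairs, per the calculation above */

typedef uint64_t time_val;           /* 64-bit timestamp: 8 bytes */

typedef struct {
    time_val rise;                   /* field names are guesses */
    time_val fall;
} RiseFall;                          /* 2 x 8 = 16 bytes */

typedef struct {
    RiseFall levels[MAX_LEVELS];     /* 6 x 16 = 96 bytes */
    int32_t  nlevels;                /* 4 bytes */
} PulseForm;                         /* 100 bytes of payload */

/* Sum of the member sizes; sizeof(PulseForm) may be larger if the compiler pads. */
size_t pulseform_payload_bytes(void)
{
    return sizeof(((PulseForm *)0)->levels) + sizeof(((PulseForm *)0)->nlevels);
}

/* How many records fit in 'bytes' bytes of buffer, using the real (padded) size. */
size_t pulseform_capacity(size_t bytes)
{
    return bytes / sizeof(PulseForm);
}

/* Check the arithmetic above; could be called once at startup. */
void pulseform_check(void)
{
    assert(sizeof(time_val) == 8);
    assert(sizeof(RiseFall) == 16);
    assert(pulseform_payload_bytes() == 100);
}
```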

Wrote the buffer.h header file; I'll write the C code tomorrow.

Wednesday, August 10, 2011

Negative Feedback

Things to do...

(1) Today, 1st order of business is to debug Darryl's negative-pulse-detection configuration code - it wasn't working yesterday when I tested it. It occurred to me that possibly he used the wrong bit - need to check.
(2) In other matters, I need to test the wireless communication - yesterday I verified both halves of this (communication from the FPGA board to a UwTerminal on the PC over the serial port, and communication from the server to the Wi-Fi board over the serial port), and then I plugged the FPGA board into the Wi-Fi board via a null modem adapter (since the level shifters at both ends are configured as DCEs; they both use pin 2 of the DE9 to transmit and pin 3 to receive), and so it is ready to test.
(3) It also occurred to me that the Wi-Fi board ought to generate a "WIFI_READY" message to the FEDM after it finishes powering up & connecting to the server and is ready to relay messages. Need to add that to the autorun script... And add a command handler to the firmware to recognize that change of state (since there's no point in sending stuff out the serial port before then).
(4) After that, I can just continue going through the list of bullet points from Friday, adding new features...

Let's get to work...

(1) Debug NEG_INPUT:

Checking top-level schematic... Darryl used bit 2 from the icdp_ctrl PIO bus to control the input-capture datapath. This was not a good choice, since bit 2 (on the input side) is already allocated to the HAVE_DATA input. Although it is perhaps possible to use a given PIO bit for both input and output, for different purposes, it seems to me to be unnecessarily confusing. And it might conceivably cause problems.

Checking the C code (icdp_driver.h)... He does define NEG_INPUT_MASK as (1<<2), so at least that is consistent with his choice in the schematic.

Changed the NEG_INPUT signal from bit 2 to bit 4.
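
For reference, the bit assignments after this change could be written down roughly like this. `NEG_INPUT_MASK` is the name from icdp_driver.h; `HAVE_DATA_MASK` and the two helper functions are illustrative names only, and any other bits on the icdp_ctrl bus are left out because they aren't described here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit assignments on the icdp_ctrl PIO bus, per the notes above. */
#define HAVE_DATA_MASK  (1u << 2)    /* input side: datapath has a pulse ready */
#define NEG_INPUT_MASK  (1u << 4)    /* output side: invert the input pulses (moved from bit 2) */

/* True if two masks share any bit (the situation the change avoids). */
int masks_overlap(uint32_t a, uint32_t b)
{
    return (a & b) != 0;
}

/* Return the control word with NEG_INPUT set or cleared, leaving other bits alone. */
uint32_t icdp_ctrl_set_neg_input(uint32_t ctrl, int negate)
{
    assert(!masks_overlap(HAVE_DATA_MASK, NEG_INPUT_MASK));
    return negate ? (ctrl | NEG_INPUT_MASK) : (ctrl & ~NEG_INPUT_MASK);
}
```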

Let's modify the schematic to check the output of the XOR gates as well, so we can make sure that is working.

OK, the input pulses are getting negated properly, but we're not capturing them for some reason... It could be that during startup the datapath is getting corrupted.

This is a good reason to go ahead and add the RESET control signal to all the input-capture datapath modules. That way in icdp_init(), we can ensure that the IC datapath is all cleared out and ready before we turn on the pulse generator for testing (with everything properly configured).

Aha, we are having stack overflow problems again. Fixed that by moving the stack from the 6K extra_RAM (M512 blocks) to the 64K working_memory (M-RAM block).

Now we are having apparent timing issues; recompiling at 400 MHz. That fixed it. Revisit performance later. Seems like fmax changes a lot (up or down!) every time we make any little tweak to the design.

(2) & (3) - Wi-Fi stuff.

Wi-Fi communication with the server is working fine. I set up the Wi-Fi script to send the FEDM a "WIFI_READY" message when it is about to enter its main loop, and likewise the FEDM sends a "FEDM_READY" message when it is about to enter its main loop. For some reason, the FEDM isn't picking up the WIFI_READY.

First thing tomorrow: Re-burn the script.

Tuesday, August 9, 2011

Serial Killer Returns

Today I need to work on solving my serial input woes. The problems seem to have to do with my using the "Reduced Device Drivers" option, yet, without that option, I seem to be having problems with stack overflows. But I've maxed out the FPGA's M4Ks, so I can't make the RAM any bigger. It is 68KB currently. The RAM is using 557,056 M4K bits. However, I have lots of M-RAM and M512 RAM still available. I think what I need to do is make new RAM units that use those.

The M-RAM is 4Kx144 bits, which is 589,824 bits or 73,728 bytes or 72 KB.

There are 202 M512s, each 32x18=576 bits, for 116,352 bits, 14,544 bytes, or about 14.2 KB.
165 of the M512s are available; i.e., about 11.6 KB.
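
These capacity figures are easy to get wrong by hand, so here is a small helper for redoing the arithmetic. The block geometries are the ones quoted above (parity bits counted as usable storage); the function and macro names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Block geometries as quoted above. */
#define MRAM_WORDS      4096u
#define MRAM_WIDTH_BITS 144u
#define M512_BITS       (32u * 18u)  /* 576 bits per M512 */

/* Capacity in bytes of 'blocks' memory blocks of 'bits_per_block' bits each. */
uint32_t blocks_to_bytes(uint32_t blocks, uint32_t bits_per_block)
{
    assert(bits_per_block % 8u == 0u);   /* all geometries above are whole bytes */
    return blocks * (bits_per_block / 8u);
}
```

For example, `blocks_to_bytes(1, MRAM_WORDS * MRAM_WIDTH_BITS)` gives 73,728 bytes (72 KB), and `blocks_to_bytes(165, M512_BITS)` gives 11,880 bytes (about 11.6 KB).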

Then we have the 68KB worth of M4K blocks we are using; I think this is 120 out of the 144 M4K blocks, and the rest of the M4Ks are going to miscellaneous other things.

The M4Ks are faster than the other memory block types, but they are all plenty fast enough (420-550 MHz) considering our system clock is only 50 MHz.

So, what we could do is, say, use the M-RAM as a repository for program code - it is plenty big enough for that, since our program size is only 58 KB. We could create say a 10 KB working memory. And then we still have 68KB worth of M4Ks, which could be used to increase the size of the FIFOs, support all 4 input capture datapaths, and so forth. So, we're good...

Starting work on that now.

Oops, apparently M-RAM blocks cannot be initialized (at FPGA programming time I guess); so, I guess we'll have to use the M-RAM for our working memory, rather than for the program data. 72 KB of working memory sounds just dandy. :)

So, here is our new selection of memory modules:

name "program_ROM", type M4K, read-only, size 64KB. (can increase back to 68KB if needed)
name "working_memory", type M-RAM, writable, size 72KB. Program heap can go here.
name "extra_RAM", type M512, writable, size 10KB. This could be used for the stack.

Since this releases some pressure, let's try upgrading the processor back to the Nios II/f. This gives us back hardware multiply, and gives us an overall faster architecture.

Clicking "generate." After that's done, we will have to rebuild/reconfigure the BSP to use the new memories. Oh, but first we have to recompile in Quartus, to make sure it fits.

Updated project files to add the new SOPC modules, and remove old ones that are no longer used (old memories I experimented with).

Apparently we've got 46 too many M512 blocks to fit, probably due to the cache RAM in the Nios II/f. Let's see, that's about 3KB+ too much memory usage. Let's reduce the "extra_RAM" size to 6K. That's still large enough to serve as a stack region, most likely.

Now, for some reason, it thinks it needs 2 M-RAMs. Perhaps it can't use the full 144 bits of width, since we set the memory word size at 32 bits (16 bits could maybe have fit better, since 144 = 9x16). If we use only 128 bits of width, that cuts the M-RAM size to 64KB. Still plenty. Let's try that. Regenerated; recompiling now... The fitter has made it through placement without errors, which looks promising...

Compilation was successful, and slow model fmax was back up to 481 MHz for some reason! Don't look a gift horse in the mouth...

I'm getting better at using the BSP Editor; all I had to do in the "Linker Script" tab was hit "Restore Defaults..." in the top (linker regions) and bottom (linker sections) areas, then tweak the linker sections with the pulldown menu, to locate the .rodata section in the program ROM, and put the .stack section in the extra RAM.

Fixed a mangled debug flag option in the BSP settings...

OK, serial input from both sources is working now! However, I still have an issue where the line buffer module doesn't respond right away to carriage return. I need to modify it to treat carriage return as end-of-line, and go into a mode where it will ignore a subsequent line feed.
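
The carriage-return handling described above might look something like this. It is a simplified stand-in rather than the actual line buffer module; `linebuf_t`, `linebuf_feed`, and `LB_MAX` are hypothetical names, and the caller is assumed to zero-initialize the struct.

```c
#include <assert.h>
#include <stddef.h>

#define LB_MAX 80                    /* hypothetical line length limit */

typedef struct {
    char   buf[LB_MAX + 1];
    size_t len;
    int    skip_lf;                  /* set after a CR so a following LF is ignored */
} linebuf_t;

/* Feed one received character. Returns 1 when buf holds a complete line
 * (NUL-terminated, without the line ending), 0 otherwise. */
int linebuf_feed(linebuf_t *lb, char c)
{
    assert(lb);
    if (c == '\n' && lb->skip_lf) {  /* second half of CR LF: swallow it */
        lb->skip_lf = 0;
        return 0;
    }
    lb->skip_lf = (c == '\r');
    if (c == '\r' || c == '\n') {    /* either one alone ends the line */
        lb->buf[lb->len] = '\0';
        lb->len = 0;
        return 1;
    }
    if (lb->len < LB_MAX)            /* silently drop characters that do not fit */
        lb->buf[lb->len++] = c;
    return 0;
}
```

This way a line ending in CR, LF, or CR LF produces exactly one end-of-line event, and the response to CR is immediate.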

OK, that change is implemented now. Perhaps it is a good time to try connecting through the Wi-Fi board to the server?

OK, got the Wi-Fi board booting and connecting to the server and opening up all its terminal windows (MAIN, AUXIO, UART). It wasn't responding to commands properly, and then I remembered that I had forgotten the various AT+SET commands that are necessary. I modified the command parser to recognize all the new FEDM commands (REINT, HSC_{RESTART,RESET,STOP,GO}) and tested that with UwTerminal.

Now, I've realized that the null modem adapter isn't needed when connecting from FEDM<->UwTerminal directly (since the pins can just be flipped in Quartus); but when talking to the Wi-Fi board, it is needed, because (here's the right way to think about it), the level-shifter is designed to go at the DCE end of the cable, but we're using it at the DTE end (because the FPGA board is acting as the host).

Darryl came in and wrote the NEG_INPUT code. We're trying to run his changes but seem to be having stack overflows. I'm thinking that maybe the section assignments aren't having the effect one would expect. Oh, I think I found the problem - text section in the wrong region.

Got negative input pulses from the pulse generator.