Tuesday, January 17, 2012

Tue., Jan. 17th

The Quartus compile I started Friday finished after I left.  First thing I did today:  Burned it into the FEDM.  Next major thing on agenda:  Debug the timing input capture datapath (which is still producing no output with stream_pulse_out_test).

David texted that he is working from home on the paper today (and that this is OK with Dr. O'Neal).

OK, let's debug.  First, let's look at the HAVE_DATA signal from the timing-edge input capture datapath; this is HD_3.  If it's not rising, then that would explain why nothing else is working.

The HD_3 node is already tapped out to J55 pin 2.  Let's take a look at it...  Let's use channel 3 (previously used to monitor the threshold level from DAC#6).  Current scope configuration is:
  • Channel 4 (green) - J60 pin 1 - PMT_3 node (TimingSig)
  • Channel 2 (blue) - J79 pin 1 - pf3[5] - Output of comparator between TimingSig and DAC#6
  • Channel 3 (pink) - J55 pin 2 - HD_3 - HAVE_DATA output from tsedge_datapath_v1_56
Pink stays low.  So now, we're going to have to dig into the internals of the datapath to figure out why it's hosed.

First, let's see if the DP is reporting that it's stalled.  Hooking up the sync_error status output flag to pin T7 --> J46 pin 2 (behind J49, the DE9 header).  Since the input signal TimingSig is a known factor now (coming from the Waveform generator), we'll trigger off of the blue comparator output instead, and cannibalize the green probe (channel 4) to monitor sync_error.

Current scope configuration is:
  • Channel 2 (blue) - J79 pin 1 - node pf3[5] - Output of comparator between TimingSig and DAC#6
  • Channel 3 (pink) - J55 pin 2 - node HD_3 - HAVE_DATA output from tsedge_datapath_v1_56
  • Channel 4 (green) - J46 pin 2 - node sync_error - status flag output from tsedge_datapath_v1_56
At this point we have to recompile in Quartus, since sync_error wasn't already tapped out.

OK, that compile finished and I burned it to the board.  sync_error is staying low, so tsedge_datapath_v1_56 isn't getting stalled (or at least, isn't detecting that it's stalled).  Let's crack open tsedge_datapath_v1_56.  The first part of it (which produces sync_error) is pulseform_cap_tsedge_56.  Let's look at its handshake output, hs_prod.  Let's tap it out.

Calling it int_hsprod_tap (output port of tsedge_datapath_v1_56) --> int_hsp_debug (top-level node) --> PIN_E12 --> J48 pin 2 (behind DE9 connector).

Let's cannibalize scope channel 3 (pink) to inspect it.  Current scope config is therefore:
  • Channel 2 (blue) - J79 pin 1 - node pf3[5] - Output of comparator between TimingSig and DAC#6  (Triggering on this one.)
  • Channel 4 (green) - J46 pin 2 - node sync_error - status flag output from tsedge_datapath_v1_56
  • Channel 3 (pink) - J48 pin 2 - node int_hsp_debug - Tap out of internal producer handshake from inside tsedge_datapath_v1_56.
OK, nothing from that either.  Therefore, we're going to have to crack open pulseform_cap_tsedge_56, to figure out why it isn't generating its producer handshake output (port hs_prod).  The internal node is named "prod".  It comes from pulse_combine_tsedge_56.  However, before this point is another, internal handshake, the hs_datarec output from pulse_prep_tsedge_56, which feeds into the hs_data1 input of pulse_combine_tsedge_56.  Let's tap this out and inspect it.  I'll just use the same output path as before.  Recompiling now...

Nope, nothing there either.  Now we have to crack open pulse_prep_tsedge_56.  Inside there is another internal handshake, between the hs_prod output of se_pulse_cap_tsedge_56 and the hs_prod input of cs_combine_tsedge_56.  Let's call that node int_prod_hs, and tap it out as port int_phs_tap, and again we'll reuse the same output pathway as before.  Again, it should come out on the pink trace.

Still nada!  Let's look inside se_pulse_cap_tsedge_56.vhd.  Let's tap out the current-state bits.  We'll put them on digital input bus 2 (B2) on the scope, taking them out on pin 1 of J76 & J77 (replacing pf3[1..0] which we aren't using).
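To keep track of where we are in the hierarchy, here is a small bookkeeping sketch of the taps probed so far, outermost first.  The module and signal names are the ones from the notes above; the tuple layout and the helper function are just made up for illustration.

```python
# Handshake taps probed so far, outermost module first.
# Names come from the notes; the third field records what the scope showed
# (False = the tap stayed low).  The structure itself is only illustrative.
taps = [
    ("tsedge_datapath_v1_56", "HD_3 (HAVE_DATA)", False),
    ("pulseform_cap_tsedge_56", "hs_prod", False),
    ("pulse_prep_tsedge_56", "hs_datarec", False),
    ("se_pulse_cap_tsedge_56", "hs_prod (node int_prod_hs)", False),
]

def deepest_dead_tap(tap_list):
    """Return the innermost tap that showed no activity, or None if all were live."""
    dead = [t for t in tap_list if not t[2]]
    return dead[-1] if dead else None

module, signal, _ = deepest_dead_tap(taps)
print(f"Fault is at or upstream of {signal} in {module}")
```

Since even the innermost handshake is dead, the next thing to look at is the state machine inside se_pulse_cap_tsedge_56, which is what the current-state tap is for.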

Doing the Quartus recompile... I have to leave now for the EEP workshop; we'll have to continue this tomorrow.

Friday, January 13, 2012

Fri., Jan. 13th

Today both the CTU and the FEDM were acting pretty flaky.  I couldn't get the CTU to connect properly to the server so we set up the Tektronix as an input stub to the FEDM.  Then the FEDM wasn't reliably doing the threshold comparison on the input, and the input node was sitting at the wrong level.  Also the FEDM kept rebooting itself.  David and I spent a long time fiddling around and doing various tests.  Finally at one point David noticed a little fleck of metal sitting across two pins of one of the DACs.  After getting rid of that (and actually even before then!), we no longer had the 10 ohm short of PMT_3 to +2.5VCC.  However, we still had an unexplained 500 ohm short from there to ground.  So, we went ahead and biased PMT_3 to GND and reintroduced the DAC#6 setting at +300 mV.  Also, David noticed that the cooling plate wasn't seating properly on the metal block.  Now we finally have a nice reliable input pulse, and no more resets, but still no output from stream_pulse_out_test.  We'll have to finish debugging that next week.

Some goals for the coming months, from group meeting:
  • By end of January, have timing sync datapath tested, debugged, and validated, and a complete set of all data (timing data plus pulse data from all 3 detectors) streaming to server.  (Still at 200 MHz.)
  • By end of February, have the LogicLock work finished, and 500 MHz data streaming to server.  Also have the mechanical hardware substantially completed (for the midterm hardware demo).
  • By April have all hardware ready to install in CLC (if not installed), and have the server software in pretty good shape at least - improvements to it can always be made later.

Wednesday, January 11, 2012

Wed., Jan. 11th

We finally have the authorization codes from Mentor.  Late this afternoon, at lab, I will start downloading & installing the software, so we can begin designing new boards and/or making changes to Sachin's board.

We have the workshop scheduled from 2:00-4:00 pm today.  Need to find out from Aarmondas whether a room was reserved, and if so where.  I've loaded the presentation onto my iPad.  During my lunch hour I'm planning to go to Best Buy and buy the little VGA converter dongle, so I can hopefully give the presentation directly from the iPad.

Once I'm back at the lab, I also need to install and test the Quartus compile I started last night before I left.

Aarmondas and I reserved B202B, the small conference room next to the Dean's conference room, for the workshop.  I asked Aarmondas to call & email everyone to let them know the location.

Gave the workshop.  The ECE senior design students, as well as Brian Kirkland and Michael Sprouse, were in attendance.  Going to post the slides on the group blog.

Burned the latest gelware (expecting a negated timing sync input which is compared with the first threshold) onto the FEDM board.  Now need to test with the scope.

No data yet from stream_pulse_out_test.  Let's examine the comparator output pf3[0].  Hooked it up to J76 pin 1.  Now doing a Quartus compile.

OK, pf3[0] is going to 0 shortly after PMT_3 (TimingSig) pulse crosses below the VTH1 threshold, as expected.  I measured the period of pf3[0], and it is 409.6 us, as expected, so any noise on the input node is not enough to cause glitching of the comparator output.  So why am I getting no data from stream_pulse_out_test?  Need to test some more signals tomorrow.

Tuesday, January 10, 2012

Tue., Jan. 10th

To do today: Modify firmware to set the last DAC level to +300 mV for detecting the timing-sync edge crossing.

David is out sick, but Darryl is here.  He is looking at some Altera online courses.

Have to leave a little early this evening to go to the entrepreneur workshop.  I guess.  (Not very excited about going.)

License server is not running, or not serving our floating licenses.  Starting LMTOOLS and re-reading license file.  Now the license server is up.

Examined the layout and the board carefully looking for things that might account for DAC #2 failing as well as the 10 ohm short between +2.5VCC and PMT_3.  I noticed on the layout that PMT_3 passes underneath the chip for DAC #2.  Also, it crosses right underneath a pad of C139 which is part of the +2.5VCC node.  A hole between layers in either location might account for the 10 ohm short (since DAC #2 is powered by the +2.5VCC supply).  However, peering closely at both parts thru multiple magnifying lenses, I didn't see anything that was clearly suspicious. You can't really see underneath the parts, anyway.  However, just in case, I blew on both parts with the dust remover spray (1,1-difluoroethane, from Radio Shack) - who knows, this might get rid of a bit of grit wedged underneath the chip.  Obviously this is just a desperation maneuver, and I don't expect it will necessarily help.

Tidied up a couple of slides for the workshop, which is tomorrow at 2.  Earlier today I emailed Aarmondas asking him to reserve a room.

I'm now modifying the init_dacs() function in dac_driver.c, to set the last threshold to +300 mV.  The others are still arranged in a logarithmic ramp from -200 mV to -1V, although now with one fewer step.
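For reference, a logarithmic ramp just means the levels are spaced by a constant ratio.  Here is a rough sketch of the arithmetic (not the actual init_dacs() code from dac_driver.c).  The step count of 4 is a guess based on "one fewer step" than the earlier five, and the function and variable names are hypothetical.

```python
# Sketch of a logarithmic (constant-ratio) threshold ramp plus a separate
# timing-sync threshold.  This is NOT the real init_dacs() from dac_driver.c.
# Assumption: 4 ramp steps (one fewer than the earlier five).

def log_ramp_mv(start_mv, stop_mv, steps):
    """Return `steps` levels from start_mv to stop_mv spaced by a constant ratio."""
    if steps < 2:
        raise ValueError("need at least two steps")
    ratio = (stop_mv / start_mv) ** (1.0 / (steps - 1))
    return [start_mv * ratio ** i for i in range(steps)]

N_RAMP_STEPS = 4             # hypothetical step count
SYNC_THRESHOLD_MV = 300.0    # from the notes: last threshold at +300 mV

ramp_levels = log_ramp_mv(-200.0, -1000.0, N_RAMP_STEPS)
levels = ramp_levels + [SYNC_THRESHOLD_MV]
for i, mv in enumerate(levels):
    print(f"threshold {i}: {mv:+.0f} mV")
```

With both endpoints negative, the ratio stop/start is positive, so the fractional power is well defined and every ramp level stays negative.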

Compiled new code in Eclipse.  Compiling it into Quartus design.

CTU is connecting/running fine today, aside from no satellites acquired (unsurprising since GPS is cold-booting).  We probably really need to get a new GPS module that can connect more quickly.

Ah, I just remembered, due to the 10 ohm short between PMT_3 and +2.5VCC, when the power is on, that node floats at +2.5V instead of at GND.  Therefore, the timing sync pulse has to be a negative pulse.  That is accomplished easily enough by a NOT on CLK_OUT in the Quartus design for the CTU.

This means (on the good side) that we can go back to the 5 thresholds we had previously, and just re-use the first (-200 mV) threshold for the timing sync input.

Made those changes, now doing the Quartus recompile.  Have to leave now though.

Monday, January 9, 2012

Mon., Jan. 9th, '12

Plan for today:  Remove the surface-mount chip capacitor C148 (next to chip resistor R91) using a relatively sharp-tipped soldering iron, by going back and forth between the two pads heating them.  OK, we did that.

David and I measured the strength of the pulse on the PMT_3 (TimingSig) internal node at 360-400 mV (equilibrium "high" pulse level, although there is ringing up to about +900 mV with these probe cables).  This is, I think, a bit better than before the capacitor was removed.  But it's still not high enough to allow us to leave the current threshold levels unchanged.  We'll have to modify the firmware to set the last threshold level to about 300 mV (as opposed to the +1.5V where it is currently).  I am a little bit concerned that this 300 mV level is so low that we may see occasional noise pulses if we don't actively filter them out.  We can filter out noise pulses in software, except for the occasional one that is close in time to the expected timing pulses.
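A minimal sketch of what that software filtering could look like, assuming the server sees a list of edge timestamps in microseconds and that the first one is a genuine sync edge.  The function name, the tolerance, and the sample data are all hypothetical; the 409.6 us period is the one measured on pf3[0] elsewhere in these notes.

```python
# Sketch of software-side rejection of noise pulses on the timing-sync input.
# Assumptions (not from the firmware): timestamps are in microseconds, the
# nominal period is known, and the first timestamp is a genuine sync edge.

def filter_sync_edges(timestamps_us, period_us, tolerance_us):
    """Keep only edges that land within tolerance of an expected sync time."""
    if not timestamps_us:
        return []
    accepted = [timestamps_us[0]]
    for t in timestamps_us[1:]:
        elapsed = t - accepted[-1]
        # Nearest whole number of periods since the last accepted edge, so a
        # missed sync pulse does not cause later genuine ones to be rejected.
        n_periods = round(elapsed / period_us)
        if n_periods >= 1 and abs(elapsed - n_periods * period_us) <= tolerance_us:
            accepted.append(t)
    return accepted

# Made-up example data: genuine edges near multiples of 409.6 us plus two noise hits.
edges = [0.0, 130.2, 409.6, 819.3, 1000.0, 1228.8]
print(filter_sync_edges(edges, period_us=409.6, tolerance_us=0.5))
```

A noise pulse that happens to fall inside the tolerance window around an expected edge still gets through, which is exactly the leftover case mentioned above.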

We also re-checked the resistance between node PMT_3 and the +2.5VCC node.  It is still only 10 ohms, which is probably still contributing to our problems.  I went through the layout & schematic and checked the nominal resistance of all the resistors connected to the +2.5VCC node - they are all high, and many of them only connect to disconnected paths, so the problem is still unexplained.  I also checked the R between +2.5VCC and GND, it is 365 ohms, so it doesn't account for the 10 ohms.  I'm about ready to give up on trying to track down this problem.  The only way to make progress on it at this point might be to get some kind of IR imaging camera and try to find an unexpected hot-spot on the board.  2.5V across 10 ohms is 250 mA, times 2.5V makes 625 mW; although this is less than a watt, it still is possibly enough to produce some observable local heating.
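The heating estimate is just Ohm's-law arithmetic.  A quick sketch, assuming (as the estimate does) that the full 2.5 V really drops across the measured 10 ohm path; the names are only for illustration.

```python
# Rough dissipation estimate for the unexplained short.
# Assumption: the full +2.5VCC rail voltage appears across the 10 ohm path.
V_RAIL = 2.5      # volts
R_SHORT = 10.0    # ohms

def short_dissipation(volts, ohms):
    """Return (current in A, power in W) for a purely resistive path."""
    current = volts / ohms
    power = volts * current
    return current, power

current_a, power_w = short_dissipation(V_RAIL, R_SHORT)
print(f"I = {current_a * 1e3:.0f} mA, P = {power_w * 1e3:.0f} mW")
```

If the node on the far side of the short is not actually held at GND, the real voltage drop (and so the heating) would be smaller than this worst-case figure.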

I took Samad's power distribution board out of the loop for now, because it is too difficult to maintain a reliable power connection to the CTU with it in place.  This problem needs to be addressed sometime.  For now, we will just power the CTU directly from the power supply.

The CTU's Wi-Fi module is not reliably connecting to the server today for some reason.  It did manage to connect once, but it didn't feed through any data.  Need to run some more tests sometime to try to track that problem down.

I had a minor syntax bug in the new appdefs.py file, which was quickly fixed.

Wednesday, January 4, 2012

Wed., Jan. 4th

David texted and said he'd come in on Monday.

Called Sonja about the timesheets today, she said they'll be ready soon...

Thought I'd add a little more material to the presentation today to help the students get started on the Python coding.

FUCK DAMMIT, it looks like my fricken' computer rebooted itself overnight (power outage? Windows update?) and I lost all (or most) of the work I did yesterday on the presentation.  I must have forgotten to save the file.  Goddammit!

Checked OpenOffice's backups folder, and it wasn't there either.  Why didn't it autosave, or offer recovery, or something?!?!?!?!?  Bleargh.  I hate free software sometimes.

Tuesday, January 3, 2012

Tue., Jan. 3rd

First day back since break.

I don't seem to have my Spring semester timesheets yet.  Emailed Sonja to see if they are ready.

Some facilities people came by and cut away the loose pieces of flooring.

Bumped into Aarmondas this AM and reminded him to schedule the Python workshop soon.

Re: The Python presentation.  I found a printout of my old class inheritance diagram for the communication classes.  I couldn't find the file - it may be on my computer at home.  I scanned the printout so I could include this in the presentation.  Continuing work on presentation.  Also added a class hierarchy diagram for the major classes in the communication subsystem (communicator, bridge, mainserver).

Brought in my pack of jumper bridges.  Used them to replace the jumper-wire loops we had previously on a couple of boards.

I also need to work on recommendations ASAP.