Thursday, June 30, 2011

FI-FO, FI-FO, the buffer's ready to go!

So, we are finished testing pulseform_cap (last thing Friday I changed it to 1/2 sec. between pulses) and so I suggested to the guys that they go ahead and hook it up to the FIFO and check the output of the stream_pulse_data.  We got nuthin'!

Then we looked at the handshaking signals to our_FIFO - they seem correct (toggling every 1/2 sec. with current pulse generator) but they aren't stalling after 16 cycles (when the FIFO fills up) as expected!

Then we looked at the raw signal to the FIFO megafunction, and it looks like the write signal is never being asserted.

Mike inspected his logic in his FIFO_writer module, and identified one possible bug.  We are recompiling now.  Still doesn't seem to be stalling correctly...  We did manage to find the write pulse though, after coaxing the scope into triggering properly.

OK, we are inspecting some more signals now.  The 2-bit state of the FIFO_writer module, and the 4-bit count from the FIFO of the number of stored entries.

OK, apparently FIFO_writer is actually working correctly now, the problem is that something is reading out the data shortly after it's added.  The bug could be in FIFO_reader!

Aha, saw from looking at the FIFO_reader state that it was getting stuck for 25 cycles, suggesting that the 25 words of data were getting streamed despite the absence of the pump signal.  Looked at the stream_pulse_data module, and indeed, the wait_pump_low state forgot to actually wait for the pump signal to go low before advancing the index variable to go to the next word.  Fixed that, and now the output freezes after 16 pulses (after the FIFO fills up) as expected.  We can see the number of words in the FIFO increasing.
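For the record, here is a rough behavioral model of the pump handshake as it should work after the fix. This is only a Python sketch, not the actual VHDL: the class name, the state names other than wait_pump_low, and the one-call-per-clock structure are all made up for illustration.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()            # nothing left to stream (hypothetical name)
    WAIT_PUMP_HIGH = auto()  # word presented, waiting for pump to rise (hypothetical name)
    WAIT_PUMP_LOW = auto()   # waiting for pump to fall before advancing


class StreamPulseModel:
    """Toy cycle-level model of the word-streaming handshake (not the real module)."""

    def __init__(self, words):
        self.words = list(words)
        self.index = 0
        self.state = State.WAIT_PUMP_HIGH if self.words else State.IDLE

    @property
    def have_data(self):
        return self.state is not State.IDLE

    def clock(self, pump):
        """Advance one clock cycle, given the current level of the pump signal."""
        if self.state is State.WAIT_PUMP_HIGH:
            if pump:
                self.state = State.WAIT_PUMP_LOW
        elif self.state is State.WAIT_PUMP_LOW:
            # The fix: only advance the index once pump has actually gone low.
            if not pump:
                self.index += 1
                if self.index < len(self.words):
                    self.state = State.WAIT_PUMP_HIGH
                else:
                    self.state = State.IDLE
```

With pump held low, this model never advances, which matches the behavior we now see on the scope (the FIFO fills up and stalls instead of draining by itself).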

So anyway, we have pretty good confidence that the FIFO is working now, and (guys) it is now ready for stream_pulse_data to be tested more thoroughly by the new stub module, which will operate the pump handle as required and send output data to our scope bus.  The output words are 32 bits long, and you can ignore alternate ones (since they are just the upper 32-bits of the 64-bit words).  You might want to take just the low bytes of the other words, and do subtractions like in the other stub module, so that the time values in your output are all relative to the time of the first edge (leading edge of threshold 1).
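Here is a sketch of the post-processing suggested above, written in Python just to show the arithmetic (the stub itself would of course be VHDL). The function name is made up, and whether the low or the high half of each 64-bit value comes out first is an assumption, so it is left as a parameter.

```python
def relative_edge_times(words32, low_word_first=True):
    """Keep the low byte of each low 32-bit word, relative to the first edge.

    Every other word (the upper 32 bits of each 64-bit value) is skipped.
    Differences are taken modulo 256, which is only valid if the whole
    pulse spans fewer than 256 time counts (an assumption).
    """
    start = 0 if low_word_first else 1
    low_bytes = [w & 0xFF for w in words32[start::2]]
    if not low_bytes:
        return []
    first = low_bytes[0]
    return [(b - first) & 0xFF for b in low_bytes]


# Example with made-up time values: the low bytes F0, F5, 03 wrap around,
# but the modulo-256 subtraction still recovers the true differences.
example = [0x000012F0, 0, 0x000012F5, 0, 0x00001303, 0]
print(relative_edge_times(example))  # [0, 5, 19]
```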

Please feel free to text me with any questions you may have.

Wednesday, June 29, 2011

Climbing the Pyramid

Today, we successfully got our test bench working for the pulse waveform input capture circuit, at a PLL clock speed of 250 MHz (which gives us a sampling rate of 500 Msps with the dual-edge triggered front-end modules).  This achieves a time resolution of 2 ns, which meets our goal of a time uncertainty (imprecision) of within +/- 1 ns (just barely!).

The logic was actually already correct at the end of the day yesterday.  Today, we just homed in empirically on the maximum PLL frequency still producing reliable results (with no glitches caused by register setup time requirements not being met).  Also, Mike rewrote David's input stub to make it easier to modify the delay between pulses.

Here is a video showing correct results for "fake" comparator outputs modeling sawtooth-shaped input pulses crossing anywhere from 1 to 6 threshold levels.

Tuesday, June 28, 2011

Firming up the Warez...

Today I want to start putting together the firmware.

David is here and compiling the code Darrel wrote last night.  Darrel isn't going to be here today.  We don't know about Tyler.

Output was improved (only one batch of data per synthesized input pulse) but still had numerous problems.

First we set the proper threshold levels for the digital scope inputs, to get rid of the glitching.

We fixed several errors in the program logic (control flow) and a couple of places where signals were used where variables should have been used.

Mike added a state to give input data time to settle.  Still having a problem with one output value though (in the 6-threshold case).  Finish tomorrow.

Monday, June 27, 2011

Being Rumplestiltskin...

I haven't really been asleep for 20 years, just the last 2 weeks... Back to work today.

Trying to get reoriented...

Darrel & Tyler are here; giving them some tips on the VHDL coding for the output stub module for the pulse form digitizer...

I am currently adding the following PIOs to the Nios SOPC system in my test project in C:\LOCAL\FEDM_gelware\COSMICi-mods\Stratix-TDC-V1:
  • icdp_ctrl (Input Capture Datapath Control) - 8 bits, output.  For controlling the datapath.

    - Bit #0:  RUN_PAUSEn - Pause the whole datapath (0), or let it run normally (1).
    - Bit #1:  PUMP_DATA - Rise = pull next word.  Fall = done with word.
    - Bits #2-7:  Reserved for possible future use.

  • icdp_stat (Input Capture Datapath Status) - 8 bits, input, with synchronous rising-edge capture IRQs with single-bit reset for the edge capture register.  This will allow the CPU to independently manage interrupts on each of the kinds of events that can occur.

    - Bit #0:  BUF_FULL - Datapath is stalled b/c FIFO buffer is full (1), or not (0).
    - Bit #1:  HAVE_DATA - Data is available to be pulled (1), or none is left (0).
    - Bits #2-7:  Reserved for possible future use.

  • icdp_data (Input Capture Datapath Data) - 32 bits, input.  No interrupts.

Re-generated the design files.  Now working on re-wiring the top-level schematic to add the new Nios system, and wire it up to the PMT_IC_Datapath module.  That's done.
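The bit assignments for icdp_ctrl and icdp_stat listed above can be written down as masks. This is a Python sketch for checking values by hand; the real firmware will be C, and the helper function names here are made up.

```python
# icdp_ctrl bits (output PIO)
RUN_PAUSEN = 1 << 0  # 0 = pause the whole datapath, 1 = run normally
PUMP_DATA = 1 << 1   # rise = pull next word, fall = done with word

# icdp_stat bits (input PIO)
BUF_FULL = 1 << 0    # 1 = datapath stalled because the FIFO is full
HAVE_DATA = 1 << 1   # 1 = data is available to be pulled


def make_ctrl(run, pump):
    """Build the 8-bit control value; reserved bits 2-7 stay at 0."""
    return (RUN_PAUSEN if run else 0) | (PUMP_DATA if pump else 0)


def decode_status(stat):
    """Split the 8-bit status value into named flags, ignoring reserved bits."""
    return {
        "BUF_FULL": bool(stat & BUF_FULL),
        "HAVE_DATA": bool(stat & HAVE_DATA),
    }
```

For example, make_ctrl(run=True, pump=True) gives 0x03, and a status byte of 0x02 decodes as data available with the buffer not full.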


The guys got their output module to the point of testing, and identified another 6 output pins to use so that they can display the input pulse alongside the output data.

There is some unexpected high-frequency glitching of the data from the output stub; not yet diagnosed.

There was a misunderstanding about what I wanted WRT the output data; that is being fixed.

For me to do tomorrow: 
  1. Import their (the students') input stub (which is working) to my top-level schematic, to feed test inputs into the input capture datapath;
  2. Start slapping together the C code to drive the PMT input capture datapath, and output diagnostic information to the JTAG stdout, and formatted data to the serial port.

    Thursday, June 9, 2011

    Testing, 1, 2, 3...

    Today, one thing I want to do is to come up with a big list of things for the students to do, so that they can work a little more independently of me over the next couple of weeks.

    First, I need to give them accounts on COSMICi, since they will need to be able to log into it to burn designs through the ByteBlaster cable. (Either that, or we need to get the XP installation on the Acer PC working again.) And I need to show them how to start up the Altera license server.

    Here are some things the students can work on:
    1. Write a test rig for the module they just finished (pulseform_cap). This can include the following:

      (a) An input stub that periodically generates a "fake" set of waveforms for all 6 thresholds, as if there was a triangle pulse coming in from the PMT input. It should scan through the number of threshold crossings: e.g., first generate a 1-threshold pulse, then a 2-threshold pulse, ..., up to a 6-threshold pulse, and then back down to 1. This will ensure pulseform_cap is exercised in all these test cases.

      (b) An output stub that sends the data packet from pulseform_cap sequentially, in 16-bit chunks, say, out to the digital scope. (They will need to identify 8 more output pins, or else just send the data in 8-bit chunks.) Send data most-significant word first, so that all 4 words (4 hex digits each) making up each 64-bit value will be in the same order on the scope as if they were written on paper (this will make them easier to check). Another tweak: If the leading-edge time for crossing threshold 1 is subtracted from all the values before outputting, then all the higher bits will be stable (at 0), and so the results will be easier to read, and in fact, we could then get by with outputting only the lower bits - maybe just 1 byte for each threshold-crossing.

      Then, they just need to capture a sequence of data values on the scope for each of the 6 test cases, and verify by hand that all the values returned make sense given the timing and height of the fake input waveform that was generated.

    2. When that's finished & working, they can then work on testing with "real" analog pulses from outside. That will include the following steps:

      (a) Put Sachin's code back in - we will need it to set the thresholds in Labview.

      (b) Run Labview on PC (we'll need XP again for this), set thresholds, test with multimeter.

      (c) Connect an external source (e.g., waveform generator, generating triangular pulses) to one of the PMT inputs of the board (I suggest PMT_1). Identify the 6 LVDS input pins that serve as the comparators to detect threshold crossings for this input.

      (d) Verify that the sequence of values returned by the test stub makes sense, given the parameters of the external pulse. Vary the pulse height and width using knobs, and check that the output data varies accordingly.

    3. Finally, they can test their stuff together with the FIFO and the Nios firmware (C code) that I (Mike) will be working on in the meantime.
    My plan for this at the moment is:
    1. To reduce the number of PIOs that will be needed, define a new sequential module to break the data into 32-bit chunks, and send them, one at a time, to a single data-input PIO. The CPU can toggle a control signal to "pump" the data out of my sequential module, one word at a time. For a pulse that did not even cross all the thresholds, we can just send data for the thresholds that were crossed.

    2. Modify the Nios system design in SOPC builder to include the needed PIOs. These will include (at this point, in the envisioned design), for each PMT input:

      (a) An 8-bit output-only PIO, used bitwise for control of the datapath. Bits to include:

      * RUN_PAUSEn - Suspends the entire datapath when asserted.
      * PUMP_DATA - CPU raises this to cause the datapath to transmit the next word of data to the data-input PIO. CPU lowers this when it is finished reading that data word.

      (b) An 8-bit input-only PIO, used bitwise to receive various status & interrupt signals from the datapath.

      * BUF_FULL - The FIFO is full, and so some pulse data may be being lost.
      * HAVE_DATA - Asserted when there is data for the current pulse left to stream. De-asserted when done.

      (c) A 32-bit input-only PIO, used for streaming of pulse data from the datapath, one 32-bit chunk at a time.

    3. Write the C code to receive the pulse data. This will include the following tasks:

      (a) Write the code to set up and receive the interrupt, which will get triggered when the producer handshaking signal (HAVE_DATA) from the IC datapath (indicating new pulse data is available to stream) is asserted.

      (b) Write the code to pump the PUMP_DATA signal, read the words from the datapath, and put them together. (The code can verify that after the data for all the levels is received, HAVE_DATA goes low shortly after.)

      (c) For testing purposes, we can output the PMT data words in decimal (ASCII) to the serial port.
    2:59 pm - I finished step 1 above, i.e., created a VHDL module "stream_pulse_data" that will stream the pulse data to the CPU one word at a time. Next, I need to add the appropriate PIOs to the Nios system ("FEDM_NiosSys") in SOPC Builder.
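To make step 3(b) of the plan above concrete, here is a sketch of the pump loop, run against a fake datapath so the logic can be checked on a PC. It is Python rather than the C that will run on the Nios, and FakeDatapath, read_pulse and combine_pairs are made-up names. The fake drops HAVE_DATA as soon as its last word has been pulled, and the (low, high) ordering of the 32-bit halves is assumed; the real datapath may differ on both points.

```python
RUN_PAUSEN = 0x01  # control bit 0: 1 = run
PUMP_DATA = 0x02   # control bit 1: rise = pull next word, fall = done
HAVE_DATA = 0x02   # status bit 1: data left to stream


class FakeDatapath:
    """Stand-in for the three PIOs; presents one word per rising edge of PUMP_DATA."""

    def __init__(self, words):
        self._words = list(words)
        self._ctrl = 0
        self._data = 0

    def read_stat(self):
        return HAVE_DATA if self._words else 0

    def write_ctrl(self, value):
        rising = bool(value & PUMP_DATA) and not (self._ctrl & PUMP_DATA)
        if rising and self._words:
            self._data = self._words.pop(0)
        self._ctrl = value

    def read_data(self):
        return self._data


def read_pulse(dp):
    """Pump words out one at a time for as long as HAVE_DATA is asserted."""
    words = []
    while dp.read_stat() & HAVE_DATA:
        dp.write_ctrl(RUN_PAUSEN | PUMP_DATA)  # raise pump: pull next word
        words.append(dp.read_data())
        dp.write_ctrl(RUN_PAUSEN)              # lower pump: done with word
    return words


def combine_pairs(words32):
    """Join consecutive (low, high) 32-bit words into 64-bit values (order assumed)."""
    return [lo | (hi << 32) for lo, hi in zip(words32[0::2], words32[1::2])]
```

In the real firmware the loop would be kicked off from the HAVE_DATA interrupt rather than by polling, but the pump sequence (raise, read, lower) should be the same.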

    Wednesday, June 8, 2011

    Path the Data, pleath...

    Yesterday, David worked some more on the pulseform_cap code, and I worked a little on the back end of the datapath.

    Today: Continue that work.

    On my way in this morning, I picked up the highest-end Radeon graphics card that they had at the local Best Buy. It was only $170, and I calculate it should pay for itself by mining bitcoins within 2 or 3 weeks. After that, it should help me make it through the summer...

    In actual work news, today the guys finished writing their pulseform_cap module, integrated it into the PMT_IC_datapath module, and got that to compile. Next up: testing.

    Tuesday, June 7, 2011

    Gold Finger

    Today, I sold my MIT class ring and my wedding ring, so that I can survive the next 10 days without starving (and so that the check for the down payment on the car doesn't bounce). Boy, those retail gold buyers sure are stingy about their payouts. But, I didn't have much choice. I'm looking forward to building up my future assets in the Bitcoin economy instead, so that the transaction fees to quickly convert to/from other virtual currencies (like USD) will be lower... Cut out all those damn greedy middlemen (retail shop, gold refinery, jeweler, etc.). After my next paycheck would be a good time for me to plunk for a nice fast ATI card for my mining rig...

    Stuff to do today:
    1. Wire the FIFO up to some PIOs in SOPC builder to control & read out data from one PMT input capture datapath.

    2. Write a stub module (test rig), to generate fake data to input to the datapath for testing purposes.

    3. Start writing & debugging C code to accept data through the datapath (from the test rig).

    4. Meanwhile, the students are working on their pulseform_cap module. When they finish it, have them start creating a test rig for it. This plus pulseform_cap can then be used as another test rig for the FIFO.

    Monday, June 6, 2011

    The Title Goes Here

    Still wrestling with car transition issues. I called SunTrust, and they are going to mail me a printout of the image of the paper title of my old car from their archives. It should be here by the end of the week. DMV could give me a duplicate title right away, but they charge $85 for the same-day service, so I may be better off waiting. However, actually making it through the end of the week may be difficult; I am dead broke until I get the old car sold. In the meantime, I might need to pawn my class ring, or something.

    I started mining for Bitcoins at home (a block of 50 is worth more than $800 now!), but with those old computers, I'm not sure it's even worth the cost of the electricity. I'd probably be better off buying a nice fast ATI graphics card (once I get some money again!) and running off the GPU instead. E.g., I could buy an ATI Radeon HD 5770 for $124 and it could do about 200 Mhash/s, for a cost-efficiency of 1.61 Mhash/s/$ (not including power). Mhash/J is about 1.45, or in other words power consumption is about 138 W. That costs about $0.35/day or $2.45/wk in terms of electricity. According to the Bitcoin Mining Calculator, right now 200 Mhash/s will earn about $48.60/wk., so in other words after paying for the electricity, this card will earn about $46/wk. Seems well worth it! Anyway, we'll worry about that later...
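Just to double-check the arithmetic above, here it is as a few lines of Python. All the inputs are the figures quoted in the paragraph; the implied electricity rate and the payback time are derived from them, not independently known.

```python
card_price_usd = 124.0
hash_rate_mhash_s = 200.0
efficiency_mhash_per_joule = 1.45
electricity_usd_per_day = 0.35
gross_usd_per_week = 48.60

cost_efficiency = hash_rate_mhash_s / card_price_usd          # Mhash/s per dollar
power_watts = hash_rate_mhash_s / efficiency_mhash_per_joule  # (Mhash/s) / (Mhash/J) = W
electricity_usd_per_week = electricity_usd_per_day * 7
net_usd_per_week = gross_usd_per_week - electricity_usd_per_week

# Derived quantities (not stated above): the electricity rate implied by
# $0.35/day at this power draw, and how long the card takes to pay for itself.
implied_usd_per_kwh = electricity_usd_per_day / (power_watts * 24 / 1000)
payback_weeks = card_price_usd / net_usd_per_week

print(f"{cost_efficiency:.2f} Mhash/s/$, {power_watts:.0f} W")
print(f"net ${net_usd_per_week:.2f}/wk, payback in {payback_weeks:.1f} weeks")
```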

    As for real work... For today, here are some things for us to work on this week:

    1) Inspect technology map of pulse_cap circuit, look for more pipelining opportunities. Try increasing speed further.

    2) Write pulseform_cap module, or (probably better) have the students write it, as an exercise.

    3) Finish writing FIFO_reader module, start writing C code to test it.

    Tyler and David are here today. Tyler spent a little time looking through the Technology Map to see if there were any more two-LUT combinational bottlenecks, but he didn't see any. We should probably not spend too much more time right now on performance optimization. Let's go ahead and start writing additional modules.

    I explained to them the expectations for the input to the our_FIFO module, and asked them to write the pulseform_cap module to combine the data from the 6 thresholds. Meanwhile, I will work on finishing the FIFO_reader module which I started writing last week.

    Friday, June 3, 2011

    Rebecca Black Day

    It's a furlough day, and my usual day off this summer, but I was in town anyway to pick up my new (used) car from the transmission shop and run some other errands, so I thought I'd stop by the lab for a bit.

    Regarding my old car, I need to call my last note holder before 5 and see if they still have the title, and have them send it to me if so; if not, I need to spend some time digging around in my storage room this weekend, hoping to find it. I will need it in order to sell the car for parts. Looks like I can get at least $389 for it. That'll at least help to cover the cost of the last (fruitless) repair job that was done on it.

    Meanwhile, I am almost broke again now, so I need to come up with a way to earn some extra money over the next month or two. Thinking of posting a profile on oDesk.

    Later, if I have time to do some work before heading home, here are some things I can work on:
    1. Try other waveform generator, for shorter pulse widths.
    2. Begin working on hand-optimizing pulse-cap module for increased speed.
    3. Finish writing FIFO_READER module.
    The 4-ns pulse reads properly as ~1-2 delays of 2.5-ns. Trying now to see if I can get circuit working reliably at 250 MHz (500 Msps). Adding an input buffer to pcaptest to delay the producer handshake for at least 10 ns. This is to make sure the input data bits all have plenty of time to settle down before we latch them. OK, that didn't help.

    Now adding a pipeline buffer to the output of the logic that computes when to enable the rise/fall time capture registers. This will effectively increase all time values captured by 1, but will not otherwise affect the results. That fixed the problem! Running reliably at 2 ns now (PLL clock multiplier factor = 5x). Trying 6x = 1.67 ns - that works too. Trying 7x = 1.43 ns - no dice! (No handshake.) Trying (20/3)x = 1.5 ns... (333 MHz, 667 Msps). That gives good values some of the time, but not very reliably.
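The multiplier-to-resolution figures above follow from the dual-edge sampling (two samples per PLL clock cycle). A quick Python sketch of the calculation is below; the 50 MHz base clock is an assumption, inferred from 5x giving 2 ns, and the function name is made up.

```python
from fractions import Fraction

BASE_CLOCK_MHZ = 50  # assumed PLL input clock, inferred from 5x -> 250 MHz -> 2 ns


def dual_edge_timing(multiplier):
    """Return (PLL MHz, sample rate in Msps, time resolution in ns)."""
    pll_mhz = BASE_CLOCK_MHZ * Fraction(multiplier)
    sample_rate_msps = 2 * pll_mhz             # both clock edges are used
    resolution_ns = 1000 / sample_rate_msps    # one sample period, in ns
    return float(pll_mhz), float(sample_rate_msps), float(resolution_ns)


for mult in (5, 6, Fraction(20, 3), 7):
    pll, msps, res = dual_edge_timing(mult)
    print(f"{float(mult):.2f}x: {pll:.1f} MHz, {msps:.1f} Msps, {res:.2f} ns")
```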

    OK, I have to leave now, but I can check early next week to see if there are any more opportunities to improve performance of pulse_cap module through pipelining. It should be easy - just look for signal paths that pass through more than 1 LUT.

    Thursday, June 2, 2011

    Thirsty Thursday

    Boy it is hot out!

    I think I have my car found. I just need to wrap up a couple of things first - give them my proof of income, and take the car to AAA-1 to have the dripping transmission pan re-sealed - I asked the dealer to drop it off for me.

    David was out today due to car problems of his own.

    Today, Darrel and Tyler and I hooked up the external input from the pulse generator, and did some more testing of the input capture circuit. We reduced the ringing in the input signal significantly by reducing the size of the spaghetti loop between signal and ground inputs from the pulse generator. We had to turn the clock speed down to 200 MHz (400 Msps) to get good reliability. But, we verified correct pulse width measurements (to 2.5ns resolution) throughout the range from 12 ns (minimum pulse width from the pulser) up to 635 ns (above which the 8-bit output bus rolled over). Here is a video.
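The rollover is what you would expect from an 8-bit count of 2.5-ns samples. Here is a small Python sketch of that; the function name is made up, and the truncation toward zero is an assumption about how the hardware count relates to the true pulse width.

```python
def bus_reading(width_ns, resolution_ns=2.5, bus_bits=8):
    """Pulse width as a sample count, truncated to what fits on the output bus."""
    count = int(width_ns // resolution_ns)
    return count & ((1 << bus_bits) - 1)


print(bus_reading(12))   # shortest pulse from the pulser: 4 counts
print(bus_reading(635))  # 254 counts, still fits in 8 bits
print(bus_reading(640))  # 256 counts wraps around to 0
```

The largest count that fits is 255, i.e. 637.5 ns, which is consistent with the bus rolling over just above the 635 ns we measured.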

    Wednesday, June 1, 2011

    June Bug

    I found a used car I think I might buy, if I can get the financing approved... I will broach that subject with the dealer tomorrow.

    It's a little difficult to concentrate on work right now, what with my worries about my transportation, summer finances, and Fall employment. Still no word on my pending job inquiry. The guy I am supposed to talk to was not there again today... (Turns out he is out shopping around for a minivan, just like me!)

    I posted a "seeking work" Tweet with a link to my CV... Perhaps, I will try some freelancing site, elance or odesk or some such.

    ACTUAL WORK:

    Mike started writing the FIFO_read module.

    David was here and he and Mike spent some time debugging. We lowered the PLL speed to 250 MHz and verified that the counter (sum bits, at least) is still working, and that the rise-time output from the PULSE_CAP module is working. Then we tested the handshake signals and realized that the producer handshake was not working. Mike looked back at his code and realized that he neglected to put an explicit dual-edged DFF driving the producer handshake output, so it was trying (and of course failing) to shake the producer's hand continuously during the RAISE_HS state, instead of toggling it just once. Mike fixed that, and now the test works - the time-difference output from the PCAPTEST module holds steady at hex 0A (10 decimal). Now trying to ratchet the clock back up. Unfortunately, at multiples of 50 MHz, the highest speed at which the PULSE_CAP module seems to be working reasonably reliably is only 250 MHz. Still, with the dual-edged architecture, that gives us 2-ns resolution, so at least we're within striking range of the 1-ns target. But, further optimization of the PULSE_CAP module may be necessary.

    Scope trace showing (bottom to top): (Yellow) Internally generated test pulse (20 ns width); (light blue) Producer handshake signal from pulse-capture module (toggles on data ready); (pink) Consumer handshake signal from pulse-capture test module (toggles on data received); (blue/green lines) low 8 bits of time difference, in 2 ns half-cycles, between rising and falling edge times, (purple) binary value of bits = hex 0A = decimal 10.

    Some other next steps: (1) Test with externally-supplied input pulses from Ray's pulse generator again (our test pulses are kinda "cheating" since they are actually synchronous with our clock); (2) start writing the module to gather up the rise/fall times from all 6 comparators.