checkTransaction failure

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component/s: FW, SW
    • None
    • Environment:

      On the SMuRF test crate at SLAC b33, using the development version of the pysmurf code which is being upgraded to support rogue 6.

      rogue 6.8.0

      FPGA FW git hash: 208e799

      the system is started using the smurf-streamer from SO

      When running through the tuning procedure with the pysmurf code upgraded to support rogue 6, I have occasionally seen one of the steps fail with an error in `checkTransaction`. The failure is not deterministic and is relatively infrequent, but I've seen it in the `SerialGradientDescent` function, failing on the `etaI` register at a specific index, and in the `SerialEtaScan` function when setting the array `feedbackEnable`.

      I've collected some information on two of these instances below, although this has occurred on many more occasions.

      Failure in SerialGradientDescent

      Relevant logs:

      [2025-12-12 03:35:18,513] ERROR:pyrogue.Device.SerialGradientDescent.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.SerialGradientDescent: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.etaI with address 0x81310000. Byte: 62. Got: 0x00, Exp: 0x81, Mask: 0xff
      Traceback (most recent call last):
        File "/usr/local/src/rogue/python/pyrogue/_Process.py", line 140, in _run
          self._process()
        File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialGradientDescent.py", line 135, in _process
          complete, f = gradDescent(channel)
                        ^^^^^^^^^^^^^^^^^^^^
        File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialGradientDescent.py", line 106, in gradDescent
          g = calcGrad(channel, f, step, numAverages)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialGradientDescent.py", line 45, in calcGrad
          etaPhaseVar.set(0)
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1689, in set
          raise e
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1685, in set
          self._linkedSetWrap(function=self._linkedSet, dev=self.parent, var=self, value=value, write=write, index=index, verify=verify, check=check)
        File "<string>", line 1, in <lambda>
        File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_CryoChannel.py", line 52, in <lambda>
          linkedSet    = lambda value, write: parent.etaPhase.set(value=value*np.pi/180, write=write, index=index),
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1689, in set
          raise e
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1685, in set
          self._linkedSetWrap(function=self._linkedSet, dev=self.parent, var=self, value=value, write=write, index=index, verify=verify, check=check)
        File "<string>", line 1, in <lambda>
        File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_CryoChannels.py", line 208, in <lambda>
          linkedSet    = lambda value, write, index: self.setEtaPhase(value=value, write=write, index=index),
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_CryoChannels.py", line 514, in setEtaPhase
          self.etaI.set(value=etaI, write=write, index=index)
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1250, in set
          raise e
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1245, in set
          self._parent.checkBlocks(recurse=False, variable=self)
        File "/usr/local/src/rogue/python/pyrogue/_Device.py", line 638, in checkBlocks
          pr.checkTransaction(variable._block, **kwargs)
        File "/usr/local/src/rogue/python/pyrogue/_Block.py", line 71, in checkTransaction
          block._checkTransaction()
      rogue.GeneralError: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.etaI with address 0x81310000. Byte: 62. Got: 0x00, Exp: 0x81, Mask: 0xff 

      In this case, it occurs within the `SerialGradientDescent` function of `cryo-det`, where the phase of the `eta` parameter is being set repeatedly to measure the real and imaginary parts of the response for a given channel, iterating until it converges on a minimum. Each channel is fitted this way in sequence.
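As a rough illustration of the phase update performed on each iteration (the points that follow describe the real call chain), here is a minimal sketch that keeps the magnitude of the complex eta fixed and rotates it to a requested phase. `rotate_eta` is a hypothetical helper, not the `cryo-det` implementation; the degree-to-radian conversion mirrors the `value*np.pi/180` seen in the traceback, and any fixed-point scaling the firmware registers may need is ignored.

```python
import math

def rotate_eta(eta_i, eta_q, phase_deg):
    """Hypothetical helper: keep |eta| fixed and rotate it to phase_deg."""
    magnitude = math.hypot(eta_i, eta_q)   # |eta| from the current I and Q
    phase_rad = math.radians(phase_deg)    # same as phase_deg * pi / 180
    return magnitude * math.cos(phase_rad), magnitude * math.sin(phase_rad)

# Rotating to phase 0 puts all of the magnitude into I and leaves Q at zero.
new_i, new_q = rotate_eta(3.0, 4.0, 0.0)
print(new_i, new_q)  # 5.0 0.0
```

Under these assumptions, a request for phase 0 should always leave the Q component at zero, whatever the starting values were.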

      • it fails on `etaPhaseVar.set(0)`. This is a `LinkVariable` that refers to the registers `etaI` and `etaQ` at a specific index.
      • `set` is linked to `setEtaPhase`, which computes the magnitude of the complex eta from the current `etaI` and `etaQ`, and then rotates them to the requested phase. Notably, it gets `etaI` and `etaQ` with `read=False` set when computing the magnitude, so it uses the in-memory values rather than reading the hardware
      • it fails when setting `etaI` to its new value (`etaI.set(value=etaI, write=write, index=index)`)
      • this triggers the pyrogue function `Variable.set`, which calls, in order:
        • `self._set` – set the local in-memory representation
        • `self._parent.writeBlocks(force=True, recurse=False, variable=self, index=index)` – register a transaction to write to hardware for the block containing this index
        • `self._parent.verifyBlocks(recurse=False, variable=self)` – read back the registers at the address last written to
        • `self._parent.checkBlocks(recurse=False, variable=self)` – compare the verification read to the in-memory value
      • the last step is where it fails: `Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.etaI with address 0x81310000. Byte: 62. Got: 0x00, Exp: 0x81, Mask: 0xff`
      • `0x81310000` is the correct address for the band 3 `CryoChannels`
      • `etaI/etaQ` are stored as an interleaved array of 16bit words, so byte 62 corresponds *to the 16th entry of `etaQ`*
      • the verification read returned `0x00`, but `0x81` was expected. *This is strange*
      • When we set `etaPhase` to 0, we expect `etaQ=0`, which matches the register value. *The question is why the expected value is `0x81`.*
        I don't know for sure which index of `etaPhase` it is trying to write to, but the minimum access size in this case is 4B, so it shouldn't need to read/write more than the single relevant `etaI/etaQ` index.
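The byte-offset arithmetic in the points above can be checked with a short sketch. It assumes each channel occupies one 4B word holding a 16-bit `etaI` followed by a 16-bit `etaQ`, and that byte offsets count from the start of the block; the ordering within the pair is an assumption consistent with the conclusion above, not something confirmed from the firmware. `locate_eta_byte` is a hypothetical helper.

```python
def locate_eta_byte(byte_offset, word_bytes=2, order=("etaI", "etaQ")):
    """Hypothetical helper: map a block byte offset to (register, index, byte in word).

    Assumes 16-bit words interleaved as etaI[0], etaQ[0], etaI[1], etaQ[1], ...
    """
    word = byte_offset // word_bytes         # which 16-bit word the byte falls in
    index, slot = divmod(word, len(order))   # channel index, position in the I/Q pair
    return order[slot], index, byte_offset % word_bytes

print(locate_eta_byte(62))  # ('etaQ', 15, 0)
```

With zero-based indexing, index 15 is the 16th entry, so under these assumptions byte 62 is the first byte of `etaQ[15]`.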

      Failure in SerialEtaScan

      Relevant logs:

      [2025-11-17 23:13:41,396] ERROR:pyrogue.Variable.RemoteVariable.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
      Traceback (most recent call last):
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1245, in set
          self._parent.checkBlocks(recurse=False, variable=self)
        File "/usr/local/src/rogue/python/pyrogue/_Device.py", line 638, in checkBlocks
          pr.checkTransaction(variable._block, **kwargs)
        File "/usr/local/src/rogue/python/pyrogue/_Block.py", line 71, in checkTransaction
          block._checkTransaction()
      rogue.GeneralError: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
      [2025-11-17 23:13:41,401] ERROR:pyrogue.Variable.RemoteVariable.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable: Error setting value '[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]' to variable 'AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable' with type uint32(512,). Exception=Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
      [2025-11-17 23:13:41,403] ERROR:pyrogue.Device.SerialEtaScan.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.SerialEtaScan: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
      Traceback (most recent call last):
        File "/usr/local/src/rogue/python/pyrogue/_Process.py", line 140, in _run
          self._process()
        File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialEtaScan.py", line 37, in _process
          self.parent.feedbackEnable.set(np.zeros(512, dtype=np.uint))
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1250, in set
          raise e
        File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1245, in set
          self._parent.checkBlocks(recurse=False, variable=self)
        File "/usr/local/src/rogue/python/pyrogue/_Device.py", line 638, in checkBlocks
          pr.checkTransaction(variable._block, **kwargs)
        File "/usr/local/src/rogue/python/pyrogue/_Block.py", line 71, in checkTransaction
          block._checkTransaction()
      rogue.GeneralError: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff 

      In this case, it looks like both the in-memory (expected) value and the verification read are wrong: the intended write was an array of zeros, yet `0x9d` was expected and `0x74` was read back.
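To make that comparison concrete, the sketch below packs the intended array of 512 zeros and looks up the byte named in the error. It assumes `feedbackEnable` is laid out as little-endian 32-bit elements with a 4B stride, which matches the `uint32(512,)` type in the log but has not been checked against the register map. It uses only the standard library and does not touch pyrogue.

```python
import struct

N_CHANNELS = 512        # from the log: uint32(512,)
BYTES_PER_ELEMENT = 4   # assumption: 4-byte stride, little-endian

# Byte image of the intended write: 512 zero-valued uint32 elements.
intended = struct.pack(f"<{N_CHANNELS}I", *([0] * N_CHANNELS))

byte_offset = 40                      # "Byte: 40" from the error message
element, byte_in_element = divmod(byte_offset, BYTES_PER_ELEMENT)
intended_byte = intended[byte_offset]

reported_got, reported_exp = 0x74, 0x9D  # "Got" and "Exp" from the error message

print(element, byte_in_element, hex(intended_byte))  # 10 0 0x0
print(intended_byte == reported_exp, intended_byte == reported_got)  # False False
```

Under these assumptions, byte 40 is the first byte of element 10, and neither the expected nor the read-back value matches the zero that was meant to be written there.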

              Assignee:
              Ryan Herbst
              Reporter:
              Pinsonneault-Marotte, Tristan