Type: Bug
Resolution: Unresolved
Priority: Major
Environment:
On the SMuRF test crate at SLAC b33, using the development version of the pysmurf code which is being upgraded to support rogue 6.
rogue 6.8.0
FPGA FW git hash: 208e799
the system is started using the smurf-streamer from SO
When running through the tuning procedure with the pysmurf code upgraded to support rogue 6, I have occasionally noticed that one of the steps fails with an error in `checkTransaction`. The failure is not deterministic and is relatively infrequent, but I've seen it in the `SerialGradientDescent` function, failing on the `etaI` register at a specific index, and in the `SerialEtaScan` function when setting the array `feedbackEnable`.
I've collected some information on two of these instances below, although this has occurred on many more occasions.
Failure in SerialGradientDescent
Relevant logs:
[2025-12-12 03:35:18,513] ERROR:pyrogue.Device.SerialGradientDescent.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.SerialGradientDescent: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.etaI with address 0x81310000. Byte: 62. Got: 0x00, Exp: 0x81, Mask: 0xff
Traceback (most recent call last):
  File "/usr/local/src/rogue/python/pyrogue/_Process.py", line 140, in _run
    self._process()
  File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialGradientDescent.py", line 135, in _process
    complete, f = gradDescent(channel)
                  ^^^^^^^^^^^^^^^^^^^^
  File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialGradientDescent.py", line 106, in gradDescent
    g = calcGrad(channel, f, step, numAverages)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialGradientDescent.py", line 45, in calcGrad
    etaPhaseVar.set(0)
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1689, in set
    raise e
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1685, in set
    self._linkedSetWrap(function=self._linkedSet, dev=self.parent, var=self, value=value, write=write, index=index, verify=verify, check=check)
  File "<string>", line 1, in <lambda>
  File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_CryoChannel.py", line 52, in <lambda>
    linkedSet = lambda value, write: parent.etaPhase.set(value=value*np.pi/180, write=write, index=index),
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1689, in set
    raise e
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1685, in set
    self._linkedSetWrap(function=self._linkedSet, dev=self.parent, var=self, value=value, write=write, index=index, verify=verify, check=check)
  File "<string>", line 1, in <lambda>
  File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_CryoChannels.py", line 208, in <lambda>
    linkedSet = lambda value, write, index: self.setEtaPhase(value=value, write=write, index=index),
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_CryoChannels.py", line 514, in setEtaPhase
    self.etaI.set(value=etaI, write=write, index=index)
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1250, in set
    raise e
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1245, in set
    self._parent.checkBlocks(recurse=False, variable=self)
  File "/usr/local/src/rogue/python/pyrogue/_Device.py", line 638, in checkBlocks
    pr.checkTransaction(variable._block, **kwargs)
  File "/usr/local/src/rogue/python/pyrogue/_Block.py", line 71, in checkTransaction
    block._checkTransaction()
rogue.GeneralError: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.etaI with address 0x81310000. Byte: 62. Got: 0x00, Exp: 0x81, Mask: 0xff
In this case, it occurs within the `SerialGradientDescent` function of `cryo-det`, where the phase of the `eta` parameter is being set repeatedly to measure the real and imaginary parts of the response for a given channel, iterating until it converges on a minimum. Each channel is fitted this way in sequence.
- it fails on `etaPhaseVar.set(0)`. This is a `LinkVariable` that refers to the registers `etaI` and `etaQ` at a specific index.
- `set` is linked to `setEtaPhase`, which computes the magnitude of the complex eta from the current `etaI` and `etaQ`, and then rotates them to the requested phase. Notably, it gets `etaI` and `etaQ` with `read=False` set when computing the magnitude.
- it fails when setting `etaI` to its new value (`etaI.set(value=etaI, write=write, index=index)`)
- this triggers the pyrogue function `Variable.set`, which calls
  - `self._set` – set the local in-memory representation
  - `self._parent.writeBlocks(force=True, recurse=False, variable=self, index=index)` – register transaction to write to hardware for the block containing this index
  - `self._parent.verifyBlocks(recurse=False, variable=self)` – read back the registers at the address last written to
  - `self._parent.checkBlocks(recurse=False, variable=self)` – compare the verification read to the in-memory value
- the last step is where it fails: `Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.etaI with address 0x81310000. Byte: 62. Got: 0x00, Exp: 0x81, Mask: 0xff`
  - `0x81310000` is the correct address for the band 3 `CryoChannels`
  - `etaI/etaQ` are stored as an interleaved array of 16-bit words, so byte 62 corresponds to *the 16th entry of `etaQ`* (index 15)
  - the verification read found `0x00`, but expected `0x81`. *This is strange.*
  - When we set `etaPhase` to 0, we expect `etaQ=0`, which matches the register value. *The question is why the expected value is `0x81`.*
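The byte-offset arithmetic behind that last point can be checked with a short sketch. It assumes that each channel occupies one 4-byte pair of 16-bit words and that `etaI` sits at the lower address of each pair; the ordering is an assumption on my part, not something confirmed from the firmware. The helper name `locate_eta_byte` is made up for illustration.

```python
WORD_BYTES = 2               # etaI and etaQ are 16-bit words
PAIR_BYTES = 2 * WORD_BYTES  # one interleaved (etaI, etaQ) pair per channel


def locate_eta_byte(byte_offset, i_first=True):
    """Map a byte offset in the interleaved block to (field, index, byte_in_word).

    i_first=True assumes etaI is stored at the lower address of each pair.
    This ordering is an assumption, not taken from the firmware.
    """
    index, within_pair = divmod(byte_offset, PAIR_BYTES)
    word, byte_in_word = divmod(within_pair, WORD_BYTES)
    fields = ("etaI", "etaQ") if i_first else ("etaQ", "etaI")
    return fields[word], index, byte_in_word


field, index, byte_in_word = locate_eta_byte(62)
print(field, index, byte_in_word)  # etaQ 15 0
```

Under these assumptions, byte 62 is the first byte of `etaQ` at index 15, i.e. the 16th entry. If the ordering were reversed, the same offset would land on `etaI[15]` instead, so the index is robust but the field depends on the layout.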
I don't know for sure what index of `etaPhase` it is trying to write to, but the minimum access size in this case is 4B, so it shouldn't need to read/write more than the single relevant `etaI/etaQ` index.
Failure in SerialEtaScan
Relevant logs:
[2025-11-17 23:13:41,396] ERROR:pyrogue.Variable.RemoteVariable.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
Traceback (most recent call last):
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1245, in set
    self._parent.checkBlocks(recurse=False, variable=self)
  File "/usr/local/src/rogue/python/pyrogue/_Device.py", line 638, in checkBlocks
    pr.checkTransaction(variable._block, **kwargs)
  File "/usr/local/src/rogue/python/pyrogue/_Block.py", line 71, in checkTransaction
    block._checkTransaction()
rogue.GeneralError: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
[2025-11-17 23:13:41,401] ERROR:pyrogue.Variable.RemoteVariable.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable: Error setting value '[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]' to variable 'AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable' with type uint32(512,). Exception=Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
[2025-11-17 23:13:41,403] ERROR:pyrogue.Device.SerialEtaScan.AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.SerialEtaScan: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
Traceback (most recent call last):
  File "/usr/local/src/rogue/python/pyrogue/_Process.py", line 140, in _run
    self._process()
  File "/tmp/fw/cryo-det/firmware/python/CryoDet/DspCoreLib/CryoDetCmbHcd/_SerialEtaScan.py", line 37, in _process
    self.parent.feedbackEnable.set(np.zeros(512, dtype=np.uint))
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1250, in set
    raise e
  File "/usr/local/src/rogue/python/pyrogue/_Variable.py", line 1245, in set
    self._parent.checkBlocks(recurse=False, variable=self)
  File "/usr/local/src/rogue/python/pyrogue/_Device.py", line 638, in checkBlocks
    pr.checkTransaction(variable._block, **kwargs)
  File "/usr/local/src/rogue/python/pyrogue/_Block.py", line 71, in checkTransaction
    block._checkTransaction()
rogue.GeneralError: Block::checkTransaction: General Error: Verify error for block AMCc.FpgaTopLevel.AppTop.AppCore.SysgenCryo.Base[3].CryoChannels.feedbackEnable with address 0x81310800. Byte: 40. Got: 0x74, Exp: 0x9d, Mask: 0xff
In this case, it looks like both the in-memory value and the verification read are wrong: the intended write was an array of zeros, yet the expected byte (`0x9d`) and the byte read back (`0x74`) are both nonzero.
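To make that expectation concrete, here is a toy model of a masked byte-wise verify. It is not rogue's implementation, only the comparison implied by the `Got / Exp / Mask` message. It assumes each `uint32` entry occupies four bytes of the block, and the readback buffer is hypothetical: all zeros except for the one byte reported in the log. The helper `first_mismatch` is made up for illustration.

```python
ELEMENT_BYTES = 4   # assumed: each uint32 entry occupies four bytes of the block
NUM_ELEMENTS = 512  # from the logged type uint32(512,)


def first_mismatch(expected, readback, mask=0xFF):
    """Return (byte_offset, got, exp) for the first masked mismatch, or None."""
    for offset, (exp, got) in enumerate(zip(expected, readback)):
        if (exp & mask) != (got & mask):
            return offset, got, exp
    return None


# Bytes that an all-zeros write should leave in the in-memory block.
expected = bytes(NUM_ELEMENTS * ELEMENT_BYTES)

# Hypothetical readback: identical except for the byte reported in the log.
readback = bytearray(expected)
readback[40] = 0x74

offset, got, exp = first_mismatch(expected, readback)
element = offset // ELEMENT_BYTES
print(f"Byte: {offset}. Got: 0x{got:02x}, Exp: 0x{exp:02x} (element {element})")
# Byte: 40. Got: 0x74, Exp: 0x00 (element 10)
```

With an all-zeros write, this model can only ever report `Exp: 0x00`, so the logged `Exp: 0x9d` suggests the in-memory block did not hold the zeros that were passed to `set` at the time of the check. Under the 4-byte-per-element assumption, byte 40 belongs to element 10.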