-
Bug
-
Resolution: resolved
-
Major
-
None
-
None
Using the latest aes-stream-driver (5.19.3), I have observed this bug on both Ubuntu 20.04 (kernel 5) and Ubuntu 22.04 (kernel 6), with BIOS versions ranging from about one year old to the latest release on both systems.
When the system boots up, I get the following dmesg:
[ 20.568305] datadev: Init
[ 20.568336] datadev 0000:01:00.0: enabling device (0000 -> 0002)
[ 20.568420] (NULL device *): Init: Mapping Register space 0xf3000000 with size 0x1000000.
[ 20.568507] (NULL device *): Init: Mapped to 0xffffbe4947000000.
[ 20.568509] datadev 0000:01:00.0: Init: Setting user reset
[ 20.568511] datadev 0000:01:00.0: Init: Clearing user reset
[ 20.568513] datadev 0000:01:00.0: Init: Using 40-bit DMA mask.
[ 20.568515] datadev 0000:01:00.0: Init: Creating device class
[ 20.568581] datadev 0000:01:00.0: Init: Creating 16 TX Buffers. Size=2097152 Bytes. Mode=1.
[ 20.569725] datadev 0000:01:00.0: Init: Created 16 out of 16 TX Buffers. 33554432 Bytes.
[ 20.569773] datadev 0000:01:00.0: Init: Creating 256 RX Buffers. Size=2097152 Bytes. Mode=1.
[ 20.592384] datadev 0000:01:00.0: Init: Created 256 out of 256 RX Buffers. 536870912 Bytes.
[ 20.592492] datadev 0000:01:00.0: Init: Read ring at: sw 0xffffbe4941621000 -> hw 0x86df0000.
[ 20.592494] datadev 0000:01:00.0: Init: Write ring at: sw 0xffffbe4941689000 -> hw 0x86de0000.
[ 20.592567] datadev 0000:01:00.0: Init: Found Version 2 Device. Desc128En=1
[ 20.592568] datadev 0000:01:00.0: Init: IRQ 162
[ 20.592706] datadev 0000:01:00.0: Init: Reg space mapped to 0xffffbe4947000000.
[ 20.592710] datadev 0000:01:00.0: Init: User space mapped to 0xffffbe4947010000 with size 0xff0000.
[ 20.592714] datadev 0000:01:00.0: Init: Top Register = 0x4010101
[ 20.808061] audit: type=1400 audit(1707436728.533:49): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3452 comm="snap-confine" capability=12 capname="net_admin"
[ 20.808066] audit: type=1400 audit(1707436728.533:50): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3452 comm="snap-confine" capability=38 capname="perfmon"
[ 21.930956] rfkill: input handler disabled
[ 114.401790] audit: type=1400 audit(1707436822.423:51): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3827 comm="snap-confine" capability=12 capname="net_admin"
[ 114.401796] audit: type=1400 audit(1707436822.423:52): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3827 comm="snap-confine" capability=38 capname="perfmon"
[ 114.406707] audit: type=1400 audit(1707436822.427:53): apparmor="DENIED" operation="mkdir" class="file" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" name="/u/" pid=3827 comm="snap-confine" requested_mask="c" denied_mask="c" fsuid=9518 ouid=9518
This looks normal. Then I run "cat /proc/datadev_0":
ruckman@rdsrv403:~$ cat /proc/datadev_0
---------- Firmware Axi Version -----------
Firmware Version : 0x2040000
ScratchPad : 0x0
Up Time Count : 148
Git Hash : d3fb5e20e1e283aa1d12ece56c7311a5a5a6f9be
DNA Value : 0x0000000040020000013aca034d40a285
Build String : XilinxVariumC1100Pgp4_6Gbps: Vivado v2023.1, rdsrv403 (Ubuntu 22.04.3 LTS), Built Mon Feb 5 12:35:13 PM PST 2024 by ruckman
---------- DMA Firmware General ----------
Int Req Count : 0
Continue Count : 0
Address Count : 4096
Hw Write Buff Count : 256
Hw Read Buff Count : 0
Cache Config : 0x0
Desc 128 En : 1
Enable Ver : 0x4010101
Driver Load Count : 1
IRQ Hold : 10000
BG Enable : 0x0
-------- DMA Kernel Driver General --------
DMA Driver's Git Version : 5.19.3
DMA Driver's API Version : 0x6
---- Read Buffers (Firmware->Software) ----
Buffer Count : 256
Buffer Size : 2097152
Buffer Mode : 1
Buffers In User : 0
Buffers In Hw : 256
Buffers In Pre-Hw Q : 0
Buffers In Rx Queue : 0
---- Write Buffers (Software->Firmware) ---
Buffer Count : 16
Buffer Size : 2097152
Buffer Mode : 1
Buffers In User : 0
Buffers In Hw : 0
Buffers In Pre-Hw Q : 0
Buffers In Sw Queue : 16
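This also looks normal. However, when I run dmesg again after that "cat /proc/datadev_0", I see the following UBSAN out-of-bounds report and kernel call trace (the system keeps running, so this is a warning rather than a full panic):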
[ 20.568305] datadev: Init
[ 20.568336] datadev 0000:01:00.0: enabling device (0000 -> 0002)
[ 20.568420] (NULL device *): Init: Mapping Register space 0xf3000000 with size 0x1000000.
[ 20.568507] (NULL device *): Init: Mapped to 0xffffbe4947000000.
[ 20.568509] datadev 0000:01:00.0: Init: Setting user reset
[ 20.568511] datadev 0000:01:00.0: Init: Clearing user reset
[ 20.568513] datadev 0000:01:00.0: Init: Using 40-bit DMA mask.
[ 20.568515] datadev 0000:01:00.0: Init: Creating device class
[ 20.568581] datadev 0000:01:00.0: Init: Creating 16 TX Buffers. Size=2097152 Bytes. Mode=1.
[ 20.569725] datadev 0000:01:00.0: Init: Created 16 out of 16 TX Buffers. 33554432 Bytes.
[ 20.569773] datadev 0000:01:00.0: Init: Creating 256 RX Buffers. Size=2097152 Bytes. Mode=1.
[ 20.592384] datadev 0000:01:00.0: Init: Created 256 out of 256 RX Buffers. 536870912 Bytes.
[ 20.592492] datadev 0000:01:00.0: Init: Read ring at: sw 0xffffbe4941621000 -> hw 0x86df0000.
[ 20.592494] datadev 0000:01:00.0: Init: Write ring at: sw 0xffffbe4941689000 -> hw 0x86de0000.
[ 20.592567] datadev 0000:01:00.0: Init: Found Version 2 Device. Desc128En=1
[ 20.592568] datadev 0000:01:00.0: Init: IRQ 162
[ 20.592706] datadev 0000:01:00.0: Init: Reg space mapped to 0xffffbe4947000000.
[ 20.592710] datadev 0000:01:00.0: Init: User space mapped to 0xffffbe4947010000 with size 0xff0000.
[ 20.592714] datadev 0000:01:00.0: Init: Top Register = 0x4010101
[ 20.808061] audit: type=1400 audit(1707436728.533:49): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3452 comm="snap-confine" capability=12 capname="net_admin"
[ 20.808066] audit: type=1400 audit(1707436728.533:50): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3452 comm="snap-confine" capability=38 capname="perfmon"
[ 21.930956] rfkill: input handler disabled
[ 114.401790] audit: type=1400 audit(1707436822.423:51): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3827 comm="snap-confine" capability=12 capname="net_admin"
[ 114.401796] audit: type=1400 audit(1707436822.423:52): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" pid=3827 comm="snap-confine" capability=38 capname="perfmon"
[ 114.406707] audit: type=1400 audit(1707436822.427:53): apparmor="DENIED" operation="mkdir" class="file" profile="/snap/snapd/20671/usr/lib/snapd/snap-confine" name="/u/" pid=3827 comm="snap-confine" requested_mask="c" denied_mask="c" fsuid=9518 ouid=9518
[ 132.656177] ================================================================================
[ 132.656182] UBSAN: array-index-out-of-bounds in /u1/aes-stream-drivers/data_dev/driver/src/axi_version.c:55:63
[ 132.656185] index 41 is out of range for type 'uint32_t [40]'
[ 132.656187] CPU: 5 PID: 3901 Comm: cat Tainted: P OE 6.5.0-17-generic #17~22.04.1-Ubuntu
[ 132.656189] Hardware name: Gigabyte Technology Co., Ltd. X670 AORUS ELITE AX/X670 AORUS ELITE AX, BIOS F22b 02/06/2024
[ 132.656190] Call Trace:
[ 132.656192] <TASK>
[ 132.656195] dump_stack_lvl+0x48/0x70
[ 132.656202] dump_stack+0x10/0x20
[ 132.656204] __ubsan_handle_out_of_bounds+0xc6/0x110
[ 132.656209] AxiVersion_Read+0x18c/0x1c0 [datadev]
[ 132.656214] DataDev_SeqShow+0x52/0xa0 [datadev]
[ 132.656227] Dma_SeqShow+0x2d/0x320 [datadev]
[ 132.656231] seq_read_iter+0x132/0x4a0
[ 132.656234] ? srso_alias_return_thunk+0x5/0x7f
[ 132.656237] ? __mod_lruvec_state+0x36/0x50
[ 132.656241] seq_read+0xcd/0x110
[ 132.656244] proc_reg_read+0x69/0xb0
[ 132.656246] ? srso_alias_return_thunk+0x5/0x7f
[ 132.656248] vfs_read+0xb1/0x360
[ 132.656252] ksys_read+0x73/0x100
[ 132.656254] __x64_sys_read+0x19/0x30
[ 132.656256] do_syscall_64+0x58/0x90
[ 132.656258] ? exit_to_user_mode_prepare+0x30/0xb0
[ 132.656261] ? srso_alias_return_thunk+0x5/0x7f
[ 132.656262] ? irqentry_exit_to_user_mode+0x17/0x20
[ 132.656265] ? srso_alias_return_thunk+0x5/0x7f
[ 132.656266] ? irqentry_exit+0x43/0x50
[ 132.656268] ? srso_alias_return_thunk+0x5/0x7f
[ 132.656270] ? exc_page_fault+0x94/0x1b0
[ 132.656272] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 132.656275] RIP: 0033:0x7fe2cdb147e2
[ 132.656299] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 8a b4 0c 00 e8 a5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 132.656300] RSP: 002b:00007ffc79506a48 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 132.656303] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fe2cdb147e2
[ 132.656304] RDX: 0000000000020000 RSI: 00007fe2cdcff000 RDI: 0000000000000003
[ 132.656305] RBP: 00007fe2cdcff000 R08: 00007fe2cdcfe010 R09: 00007fe2cdcfe010
[ 132.656306] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[ 132.656307] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[ 132.656310] </TASK>
[ 132.656311] ================================================================================
[ 132.656380] seq_file: buggy .next function Dma_SeqNext [datadev] did not update position index
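The UBSAN report says that index 41 was used on an array declared as 'uint32_t [40]' at axi_version.c:55, reached from AxiVersion_Read. I have not confirmed what the loop at that line actually looks like, so the following is only a minimal user-space sketch of the general pattern, not the driver's code. The names fake_regs, HASH_WORDS, ARRAY_LEN and copy_hash_words are hypothetical and exist only for illustration. It shows a copy loop whose bound is clamped to the declared array length, so a caller asking for more words than the array holds cannot index past the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: 40 words, chosen to match the 'uint32_t [40]' in the report. */
#define HASH_WORDS 40

/* Hypothetical stand-in for a register block; not the driver's real layout. */
struct fake_regs {
    uint32_t hash[HASH_WORDS];
};

/* Number of elements in a true array (must not be used on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Copy up to `want` words from regs->hash into dst.
 * The loop bound is clamped to the declared array length, so the index
 * never reaches 40 or beyond even if the caller asks for more.
 * dst must have room for at least min(want, HASH_WORDS) words.
 * Returns the number of words actually copied.
 */
static size_t copy_hash_words(const struct fake_regs *regs,
                              uint32_t *dst, size_t want)
{
    assert(regs != NULL && dst != NULL);

    size_t n = want < ARRAY_LEN(regs->hash) ? want : ARRAY_LEN(regs->hash);

    for (size_t i = 0; i < n; i++)
        dst[i] = regs->hash[i];

    return n;
}
```

With this shape, a request for 64 words copies only 40 and leaves the rest of the destination untouched. Whether the real fix belongs in the loop bound or in the declared size of the array depends on the actual register map, which this sketch does not try to reproduce.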