Linux I/O deadlock after kernel upgrade

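Since upgrading from 4.19 I have been running Linux kernel 5.4.35 (a self-compiled "vanilla" kernel on Debian). A few days (2-3) after the upgrade, the md RAID 0 on the hpsa controller crashes and the array becomes read-only, with I/O denied.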

The SMART statistics do not show any fatal or critical errors.

I am also using six patches for the hpsa HBA, which can be found on GitHub.

The relevant part of the system log follows. The full system log is on Pastebin.

Apr 30 15:58:31 srv381 kernel: [544209.588021] sd 0:0:10:0: [sdj] tag#173 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:58:31 srv381 kernel: [544209.588026] sd 0:0:10:0: [sdj] tag#173 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:58:31 srv381 kernel: [544209.588028] sd 0:0:10:0: [sdj] tag#173 Add. Sense: Record not found
Apr 30 15:58:31 srv381 kernel: [544209.588032] sd 0:0:10:0: [sdj] tag#173 CDB: Write(16) 8a 00 00 00 00 01 91 28 00 00 00 00 01 30 00 00
Apr 30 15:58:31 srv381 kernel: [544209.588035] blk_update_request: critical medium error, dev sdj, sector 6730285056 op 0x1:(WRITE) flags 0x100000 phys_seg 5 prio class 0
Apr 30 15:58:42 srv381 kernel: [544220.603519] sd 0:0:10:0: [sdj] tag#179 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:58:42 srv381 kernel: [544220.603523] sd 0:0:10:0: [sdj] tag#179 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:58:42 srv381 kernel: [544220.603527] sd 0:0:10:0: [sdj] tag#179 Add. Sense: Unrecovered read error
Apr 30 15:58:42 srv381 kernel: [544220.603530] sd 0:0:10:0: [sdj] tag#179 CDB: Read(16) 88 00 00 00 00 00 4a d1 69 b0 00 00 02 50 00 00
Apr 30 15:58:42 srv381 kernel: [544220.603533] blk_update_request: critical medium error, dev sdj, sector 1255238064 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0
Apr 30 15:59:05 srv381 kernel: [544243.400236] XFS (md0p2): writeback error on sector 6730284320
Apr 30 15:59:41 srv381 kernel: [544279.528345] sd 0:0:10:0: [sdj] tag#143 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:59:41 srv381 kernel: [544279.528352] sd 0:0:10:0: [sdj] tag#143 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:59:41 srv381 kernel: [544279.528354] sd 0:0:10:0: [sdj] tag#143 Add. Sense: Record not found
Apr 30 15:59:41 srv381 kernel: [544279.528358] sd 0:0:10:0: [sdj] tag#143 CDB: Write(16) 8a 00 00 00 00 01 91 2c c2 c8 00 00 01 38 00 00
Apr 30 15:59:41 srv381 kernel: [544279.528361] blk_update_request: critical medium error, dev sdj, sector 6730597064 op 0x1:(WRITE) flags 0x100000 phys_seg 20 prio class 0
Apr 30 15:59:41 srv381 kernel: [544279.557380] XFS (md0p2): writeback error on sector 6730597056
Apr 30 16:00:19 srv381 kernel: [544317.433932] hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical  Direct-Access     ATA      TP04000GB        PHYS DRV SSDSmartPathCap- En- Exp=1
Apr 30 16:00:24 srv381 kernel: [544322.470747] hpsa 0000:05:00.0: waiting 2 secs for device to become ready.
Apr 30 16:00:26 srv381 kernel: [544324.497534] hpsa 0000:05:00.0: waiting 4 secs for device to become ready.
Apr 30 16:00:30 srv381 kernel: [544328.529549] hpsa 0000:05:00.0: waiting 8 secs for device to become ready.
Apr 30 16:00:38 srv381 kernel: [544336.721590] hpsa 0000:05:00.0: waiting 16 secs for device to become ready.
Apr 30 16:00:54 srv381 kernel: [544352.849662] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:01:27 srv381 kernel: [544385.617802] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:00 srv381 kernel: [544418.386133] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:32 srv381 kernel: [544451.154095] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:55 srv381 kernel: [544473.682061] INFO: task jbd2/sda2-8:270 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682101]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682164] jbd2/sda2-8     D    0   270      2 0x80004000
Apr 30 16:02:55 srv381 kernel: [544473.682166] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682176]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682178]  ? bit_wait_timeout+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682179]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682181]  io_schedule+0x12/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682182]  bit_wait_io+0xd/0x50
Apr 30 16:02:55 srv381 kernel: [544473.682184]  __wait_on_bit+0x2a/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682186]  out_of_line_wait_on_bit+0x92/0xb0
Apr 30 16:02:55 srv381 kernel: [544473.682190]  ? var_wake_function+0x20/0x20
Apr 30 16:02:55 srv381 kernel: [544473.682198]  jbd2_journal_commit_transaction+0x107c/0x1930 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682203]  ? try_to_del_timer_sync+0x4f/0x80
Apr 30 16:02:55 srv381 kernel: [544473.682208]  kjournald2+0xb7/0x280 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682210]  ? finish_wait+0x80/0x80
Apr 30 16:02:55 srv381 kernel: [544473.682213]  kthread+0xf9/0x130
Apr 30 16:02:55 srv381 kernel: [544473.682217]  ? commit_timeout+0x10/0x10 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682219]  ? kthread_park+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682222]  ret_from_fork+0x35/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682228] INFO: task rs:main Q:Reg:917 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682261]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682288] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682323] rs:main Q:Reg   D    0   917      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682325] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682328]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682329]  ? _cond_resched+0x15/0x30
Apr 30 16:02:55 srv381 kernel: [544473.682331]  ? bit_wait_timeout+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682332]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682334]  io_schedule+0x12/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682335]  bit_wait_io+0xd/0x50
Apr 30 16:02:55 srv381 kernel: [544473.682337]  __wait_on_bit+0x2a/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682338]  out_of_line_wait_on_bit+0x92/0xb0
Apr 30 16:02:55 srv381 kernel: [544473.682340]  ? var_wake_function+0x20/0x20
Apr 30 16:02:55 srv381 kernel: [544473.682345]  do_get_write_access+0x297/0x3e0 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682350]  jbd2_journal_get_write_access+0x5c/0x80 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682372]  __ext4_journal_get_write_access+0x37/0x80 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682385]  ? ext4_dirty_inode+0x44/0x60 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682398]  ext4_reserve_inode_write+0x93/0xc0 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682412]  ext4_mark_inode_dirty+0x51/0x1d0 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682416]  ? jbd2__journal_start+0xdc/0x1e0 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682429]  ext4_dirty_inode+0x44/0x60 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682432]  __mark_inode_dirty+0x262/0x380
Apr 30 16:02:55 srv381 kernel: [544473.682435]  generic_update_time+0x9d/0xc0
Apr 30 16:02:55 srv381 kernel: [544473.682437]  file_update_time+0xeb/0x140
Apr 30 16:02:55 srv381 kernel: [544473.682439]  __generic_file_write_iter+0x96/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682452]  ext4_file_write_iter+0xb6/0x360 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682456]  new_sync_write+0x12d/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682459]  vfs_write+0xb6/0x1a0
Apr 30 16:02:55 srv381 kernel: [544473.682461]  ksys_write+0x5f/0xe0
Apr 30 16:02:55 srv381 kernel: [544473.682465]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682467]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682469] RIP: 0033:0x7ffa65862e0f
Apr 30 16:02:55 srv381 kernel: [544473.682474] Code: Bad RIP value.
Apr 30 16:02:55 srv381 kernel: [544473.682475] RSP: 002b:00007ffa64936860 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682477] RAX: ffffffffffffffda RBX: 00007ffa5c06b7a0 RCX: 00007ffa65862e0f
Apr 30 16:02:55 srv381 kernel: [544473.682478] RDX: 000000000000006d RSI: 00007ffa5c06b7a0 RDI: 000000000000000c
Apr 30 16:02:55 srv381 kernel: [544473.682479] RBP: 00007ffa5c004ea0 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682480] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffa5c00a120
Apr 30 16:02:55 srv381 kernel: [544473.682481] R13: 000000000000006d R14: 0000000000000000 R15: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682491] INFO: task deluged:10450 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682523]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682550] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682585] deluged         D    0 10450      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682587] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682590]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682592]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682595]  rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.682649]  ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682684]  xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682719]  xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682724]  do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682726]  do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.682727]  vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.682731]  ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.682733]  ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.682734]  do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.682737]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682739]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682741] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682743] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.682744] RSP: 002b:00007fc0d5b510f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.682745] RAX: ffffffffffffffda RBX: 00007fc0d5b51190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682746] RDX: 0000000000000001 RSI: 00007fc0d5b51190 RDI: 0000000000001696
Apr 30 16:02:55 srv381 kernel: [544473.682747] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682748] R10: 0000000004dbadbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682749] R13: 0000000000001696 R14: 0000000000000001 R15: 0000000004dbadbe
Apr 30 16:02:55 srv381 kernel: [544473.682751] INFO: task deluged:10452 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682783]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682845] deluged         D    0 10452      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682847] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682849]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682853]  ? enqueue_task_fair+0x8c/0x4c0
Apr 30 16:02:55 srv381 kernel: [544473.682854]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682856]  rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.682894]  ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682928]  xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682963]  xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682967]  do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682969]  do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.682970]  vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.682973]  ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.682975]  ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.682976]  do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.682979]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682981]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682982] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682984] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.682985] RSP: 002b:00007fc0d491f0f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.682986] RAX: ffffffffffffffda RBX: 00007fc0d491f190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682987] RDX: 0000000000000001 RSI: 00007fc0d491f190 RDI: 0000000000001697
Apr 30 16:02:55 srv381 kernel: [544473.682988] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682989] R10: 0000000005e5ccbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682990] R13: 0000000000001697 R14: 0000000000000001 R15: 0000000005e5ccbe
Apr 30 16:02:55 srv381 kernel: [544473.682992] INFO: task deluged:10454 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.683024]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.683051] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.683086] deluged         D    0 10454      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.683088] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.683090]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.683092]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.683094]  rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.683131]  ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683166]  xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683201]  xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683204]  do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.683206]  do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.683208]  vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.683210]  ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.683212]  ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.683214]  do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.683216]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.683218]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.683220] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.683221] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.683222] RSP: 002b:00007fc0cf7f70f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.683224] RAX: ffffffffffffffda RBX: 00007fc0cf7f7190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.683225] RDX: 0000000000000001 RSI: 00007fc0cf7f7190 RDI: 0000000000001697
Apr 30 16:02:55 srv381 kernel: [544473.683226] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.683226] R10: 0000000002e5ccbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.683227] R13: 0000000000001697 R14: 0000000000000001 R15: 0000000002e5ccbe
Apr 30 16:02:55 srv381 kernel: [544473.683235] INFO: task kworker/2:2:21309 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.683268]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.683295] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.683330] kworker/2:2     D    0 21309      2 0x80004000
Apr 30 16:02:55 srv381 kernel: [544473.683370] Workqueue: xfs-sync/md0p2 xfs_log_worker [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683372] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.683375]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.683376]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.683385]  md_flush_request+0xa8/0x1b0 [md_mod]

Answer 1

SMART may show no errors, but the disk sdj does report errors when it is actually used, and these appear to be what is affecting the RAID volume md0p2.
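As a cross-check that the failing commands and the reported sectors refer to the same places on sdj, the CDB bytes in the log can be decoded by hand. The sketch below is not from the original post. It assumes the standard 16-byte Read(16)/Write(16) layout (opcode, flags, 8-byte big-endian LBA, 4-byte transfer length, group, control), and `decode_cdb16` is a hypothetical helper name.

```python
import struct

# Opcodes of the two 16-byte commands that appear in the log.
OPCODES = {0x88: "Read(16)", 0x8A: "Write(16)"}


def decode_cdb16(hex_bytes: str) -> dict:
    """Decode a Read(16)/Write(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 16:
        raise ValueError("expected a 16-byte CDB")
    # byte 0: opcode, byte 1: flags, bytes 2-9: LBA (big-endian),
    # bytes 10-13: transfer length in blocks, byte 14: group, byte 15: control
    opcode, _flags, lba, blocks, _group, _control = struct.unpack(">BBQIBB", cdb)
    return {"op": OPCODES.get(opcode, hex(opcode)), "lba": lba, "blocks": blocks}


if __name__ == "__main__":
    # First failing write from the log above.
    print(decode_cdb16("8a 00 00 00 00 01 91 28 00 00 00 00 01 30 00 00"))
```

For the first Write(16) in the log this yields LBA 6730285056 and a length of 304 blocks. The LBA is the same number as the sector in the blk_update_request line that follows it, which suggests the disk is being addressed in 512-byte logical blocks.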

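After the message

hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical  Direct-Access     ATA      TP04000GB        PHYS DRV SSDSmartPathCap- En- Exp=1

the disk in question appears to have stopped responding entirely. Because this was a writeback error, the kernel had already cached the write and "promised" the userspace application that it would reach the disk, and only now finds that the write cannot actually be done. With RAID 0 there is no redundancy, so the only way to recover is to wait for the disk to start responding again. The alternative would be to knowingly lose data, and that is not a decision the kernel can make on its own.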
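To make the "promise" concrete: a buffered write normally returns as soon as the data is in the page cache, and a device failure during later writeback is only visible to the application if it asks for the data to be flushed. The following is a minimal sketch, not taken from the original post; `durable_write` is a hypothetical helper, and exactly when and how an error is reported depends on the kernel version and filesystem.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and wait until the kernel reports it as flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # write() usually returns once the data sits in the page cache,
            # long before the disk has seen it.
            written = os.write(fd, view)
            view = view[written:]
        # fsync() blocks until writeback has finished. An I/O error from the
        # device is raised here as OSError instead of at write() time.
        os.fsync(fd)
    finally:
        os.close(fd)
```

An application that never calls fsync() can believe its data is safe while the kernel is still holding it in memory, which is the situation the XFS writeback errors in the log describe.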

At 16:00:19 on April 30, the kernel issued a reset command to the disk in an attempt to recover from the errors, but the disk apparently never completed it.

Based on the system log, I am prepared to pronounce the disk dead. Time of death: around 16:00:24 on April 30.
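The sequence of events is easier to see with the hung-task noise stripped out. Here is a small sketch, not from the original post, that pulls only the medium errors and the hpsa reset out of syslog lines in the format shown in the question. The regular expressions are written against those specific lines and `timeline` is a hypothetical helper; other kernel versions may word the messages differently.

```python
import re

STAMP = r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) .*"

# e.g. "... critical medium error, dev sdj, sector 6730285056 op 0x1:(WRITE) ..."
MEDIUM_ERROR = re.compile(
    STAMP
    + r"critical medium error, dev (?P<dev>\w+), sector (?P<sector>\d+) "
    + r"op 0x[0-9a-f]+:\((?P<op>\w+)\)"
)
# e.g. "... hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical ..."
RESET = re.compile(STAMP + r"hpsa .*resetting physical")


def timeline(lines):
    """Return (timestamp, description) tuples for medium errors and resets."""
    events = []
    for line in lines:
        m = MEDIUM_ERROR.search(line)
        if m:
            events.append(
                (m["stamp"], f"{m['op']} error on {m['dev']} sector {m['sector']}")
            )
            continue
        m = RESET.search(line)
        if m:
            events.append((m["stamp"], "hpsa device reset"))
    return events


if __name__ == "__main__":
    import sys

    # Usage: python timeline.py < /var/log/kern.log  (log path is an assumption)
    for stamp, what in timeline(sys.stdin):
        print(stamp, what)
```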

If the disk comes back after a power cycle, back up its contents as soon as possible, before doing anything else.
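For the backup itself a dedicated tool such as GNU ddrescue is the usual choice. Purely to illustrate the principle (keep going past unreadable areas instead of aborting on the first error), here is a rough sketch that is not from the original post. `rescue_copy` is a hypothetical helper, the source and destination paths are whatever applies to your system, and a real tool would retry failed regions with smaller reads rather than zero-filling a whole chunk.

```python
import os


def rescue_copy(src: str, dst: str, chunk: int = 1 << 20):
    """Copy src to dst, zero-filling regions that cannot be read.

    Returns a list of (offset, length) ranges that were not read.
    """
    bad = []
    in_fd = os.open(src, os.O_RDONLY)
    out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Seeking to the end gives the size of a regular file or block device.
        size = os.lseek(in_fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk, size - offset)
            try:
                data = os.pread(in_fd, want, offset)
            except OSError:
                data = b""  # read failed, treat the whole chunk as unreadable
            if len(data) < want:
                # Record the missing range and pad it with zeros.
                bad.append((offset + len(data), want - len(data)))
                data += b"\0" * (want - len(data))
            written = 0
            while written < len(data):
                written += os.pwrite(out_fd, data[written:], offset + written)
            offset += want
        os.fsync(out_fd)  # make sure the image is on stable storage
    finally:
        os.close(in_fd)
        os.close(out_fd)
    return bad
```

Write the image to a different, healthy disk, and keep the returned list of bad ranges so you know which parts of the copy are zeros rather than real data.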
