I cannot reach one of my servers because of a kernel error. I have tried every kernel version listed below, but none of them fixed the problem.
How can I resolve this?
Ubuntu version: Ubuntu 16.04.3 LTS
Kernel versions:
- 4.13.0
- 4.14.17
- 4.15.2
- 4.15.3
Network card:
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
Subsystem: Fujitsu Technology Solutions Ethernet Connection (2) I219-LM
Kernel driver in use: e1000e
Kernel modules: e1000e
System log:
Feb 16 09:26:19 foxtrot kernel: [ 6315.103309] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Feb 16 09:26:46 foxtrot kernel: [ 6341.860523] e1000e 0000:00:1f.6 eth0: Reset adapter unexpectedly
Feb 16 09:26:46 foxtrot kernel: [ 6341.880459] ------------[ cut here ]------------
Feb 16 09:26:46 foxtrot kernel: [ 6341.880461] kernel BUG at /home/kernel/COD/linux/drivers/net/ethernet/intel/e1000e/netdev.c:3836!
Feb 16 09:26:46 foxtrot kernel: [ 6341.880609] invalid opcode: 0000 [#1] SMP PTI
Feb 16 09:26:46 foxtrot kernel: [ 6341.880702] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype nf_nat br_netfilter bridge stp llc xt_tcpudp overlay nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf serio_raw intel_pch_thermal mac_hid acpi_pad autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 e1000e psmouse ptp ahci pps_core libahci wmi video
Feb 16 09:26:46 foxtrot kernel: [ 6341.881046] CPU: 7 PID: 72 Comm: kworker/7:1 Tainted: G W 4.15.3-041503-generic #201802120730
Feb 16 09:26:46 foxtrot kernel: [ 6341.881156] Hardware name: FUJITSU /D3401-H2, BIOS V5.0.0.12 R1.8.0 for D3401-H2x 05/15/2017
Feb 16 09:26:46 foxtrot kernel: [ 6341.881275] Workqueue: events e1000_reset_task [e1000e]
Feb 16 09:26:46 foxtrot kernel: [ 6341.881373] RIP: 0010:e1000_flush_desc_rings+0x2cb/0x2e0 [e1000e]
Feb 16 09:26:46 foxtrot kernel: [ 6341.881465] RSP: 0018:ffff9ff6033f3d88 EFLAGS: 00010202
Feb 16 09:26:46 foxtrot kernel: [ 6341.881555] RAX: 00000000000000d3 RBX: ffff8f0d2ee048c0 RCX: 00000000000000e9
Feb 16 09:26:46 foxtrot kernel: [ 6341.881648] RDX: 00000000000000d3 RSI: 0000000000000246 RDI: 0000000000000246
Feb 16 09:26:46 foxtrot kernel: [ 6341.881742] RBP: ffff9ff6033f3dc0 R08: 0000000000000002 R09: ffff9ff6033f3d54
Feb 16 09:26:46 foxtrot kernel: [ 6341.881835] R10: 00000000000000fe R11: 0000000000000000 R12: 000000003103f0fa
Feb 16 09:26:46 foxtrot kernel: [ 6341.881946] R13: ffff8f0d2ee04d78 R14: ffff8f0d39ca9480 R15: 0000000004008000
Feb 16 09:26:46 foxtrot kernel: [ 6341.882071] FS: 0000000000000000(0000) GS:ffff8f0d5e5c0000(0000) knlGS:0000000000000000
Feb 16 09:26:46 foxtrot kernel: [ 6341.882263] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 16 09:26:46 foxtrot kernel: [ 6341.882387] CR2: 00007fd08b9f7fd7 CR3: 0000000700a0a001 CR4: 00000000003606e0
Feb 16 09:26:46 foxtrot kernel: [ 6341.882481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 16 09:26:46 foxtrot kernel: [ 6341.882661] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb 16 09:26:46 foxtrot kernel: [ 6341.882787] Call Trace:
Feb 16 09:26:46 foxtrot kernel: [ 6341.882878] e1000e_reset+0x516/0x760 [e1000e]
Feb 16 09:26:46 foxtrot kernel: [ 6341.882968] e1000e_down+0x1db/0x210 [e1000e]
Feb 16 09:26:46 foxtrot kernel: [ 6341.883064] e1000e_reinit_locked+0x4c/0x70 [e1000e]
Feb 16 09:26:46 foxtrot kernel: [ 6341.883156] e1000_reset_task+0x59/0x60 [e1000e]
Feb 16 09:26:46 foxtrot kernel: [ 6341.883250] process_one_work+0x1ef/0x410
Feb 16 09:26:46 foxtrot kernel: [ 6341.883338] worker_thread+0x32/0x410
Feb 16 09:26:46 foxtrot kernel: [ 6341.883419] kthread+0x121/0x140
Feb 16 09:26:46 foxtrot kernel: [ 6341.883506] ? process_one_work+0x410/0x410
Feb 16 09:26:46 foxtrot kernel: [ 6341.883594] ? kthread_create_worker_on_cpu+0x70/0x70
Feb 16 09:26:46 foxtrot kernel: [ 6341.883685] ret_from_fork+0x35/0x40
Feb 16 09:26:46 foxtrot kernel: [ 6341.883772] Code: e8 fb fc ff ff eb d6 4c 89 ef e8 f1 fc ff ff eb 95 4c 89 ef e8 e7 fc ff ff e9 66 ff ff ff 4c 89 ef e8 da fc ff ff e9 02 ff ff ff <0f> 0b e8 5e fb 13 d8 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00
Feb 16 09:26:46 foxtrot kernel: [ 6341.883949] RIP: e1000_flush_desc_rings+0x2cb/0x2e0 [e1000e] RSP: ffff9ff6033f3d88
Feb 16 09:26:46 foxtrot kernel: [ 6341.884056] ---[ end trace abbf45ab36b73ab9 ]---
Feb 16 09:28:38 foxtrot autossh[1513]: ssh exited with error status 255; restarting ssh
Feb 16 09:28:38 foxtrot autossh[1513]: starting ssh (count 2)
Feb 16 09:28:38 foxtrot autossh[1513]: ssh child pid is 20383
Feb 16 09:28:40 foxtrot autossh[1513]: ssh exited with error status 255; restarting ssh
Feb 16 09:28:40 foxtrot autossh[1513]: starting ssh (count 3)
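Answer 1
I was able to resolve this by disabling TSO, GSO, and GRO with the following command. The setting does not survive a reboot, so either rerun the command after every restart or add it to rc.local.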
ethtool -K eth0 gso off gro off tso off
More than six months after disabling these features, the problem has not recurred.
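To confirm that the offloads are really off after a reboot, the output of `ethtool -k` (lowercase `k`, which lists features) can be filtered. The following is only a minimal sketch: the helper name `offloads_still_on` is hypothetical and not part of the original answer, and it assumes the usual long feature names that `ethtool -k` prints for TSO, GSO, and GRO.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer).
# Reads `ethtool -k <iface>` output on stdin and prints the long names of
# TSO/GSO/GRO for every one of them that is still reported as "on".
# Prints nothing when all three are off.
offloads_still_on() {
    sed -nE 's/^[[:space:]]*(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload): on.*/\1/p'
}

# Example usage on the affected machine (interface name eth0 as in the log):
#   ethtool -k eth0 | offloads_still_on
```

If the command prints nothing, all three features are disabled. Any name it does print means the `ethtool -K` line did not run, or did not take effect, on that boot.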