View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0003382 | OpenFOAM | Bug | public | 2019-11-11 05:24 | 2019-11-14 20:25 |
Reporter | sunil_jain | Assigned To | henry | ||
Priority | normal | Severity | crash | Reproducibility | random |
Status | closed | Resolution | unable to reproduce | ||
Platform | Unix | OS | Other | OS Version | (please specify) |
Summary | 0003382: OpenFOAM software crashes randomly on Intel Skylake processors but runs fine on Sandy Bridge processors. | ||||
Description | OpenFOAM software crash logs:
2018-09-09 17:14:34 [179203.697285] simd exception: 0000 [#1] SMP
2018-09-09 17:14:34 [179203.701527] Modules linked in: squashfs loop 8021q garp mrp stp llc nvidia_uvm(POE) nvidia(POE) xfs skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi irqbypass crc32_pclmul ghash_cl_]
2018-09-09 17:14:34 [179203.784359] CPU: 2 PID: 159455 Comm: shuangTwoPhaseE Tainted: P OE ------------ T 3.10.0-862.9.1.el7.x86_64 #1
2018-09-09 17:14:34 [179203.795389] Hardware name: Dell Inc. PowerEdge R740/06G98X, BIOS 1.4.8 05/21/2018
2018-09-09 17:14:34 [179203.802958] task: ffff995c1aee8fd0 ti: ffff995c1988c000 task.ti: ffff995c1988c000
2018-09-09 17:14:34 [179203.810539] RIP: 0010:[<ffffffffbe121791>] [<ffffffffbe121791>] apic_timer_interrupt+0x141/0x170
2018-09-09 17:14:34 [179203.819515] RSP: 0000:ffff995c1da46200 EFLAGS: 00010082
2018-09-09 17:14:34 [179203.824928] RAX: ffff995c1988ff70 RBX: 0000000001a95e00 RCX: 0000000000000090
2018-09-09 17:14:34 [179203.832146] RDX: 0000000000000000 RSI: ffff995c1da46200 RDI: ffff995c1988ff70
2018-09-09 17:14:34 [179203.839364] RBP: 00007ffd8b8ba848 R08: 0000000000000c40 R09: 0000000000000031
2018-09-09 17:14:34 [179203.846591] R10: 0000000000000000 R11: 0000000000e72148 R12: 0000000001c4e770
2018-09-09 17:14:34 [179203.853827] R13: 0000000000000007 R14: 00000000011935b0 R15: 0000000000000038
2018-09-09 17:14:34 [179203.861040] FS: 00002ad83f7afa00(0000) GS:ffff995c1da40000(0000) knlGS:0000000000000000
2018-09-09 17:14:34 [179203.869213] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2018-09-09 17:14:34 [179203.875042] CR2: 0000000002a18000 CR3: 00000017963f8000 CR4: 00000000007607e0
2018-09-09 17:14:34 [179203.882274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2018-09-09 17:14:34 [179203.889495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
2018-09-09 17:14:34 [179203.896714] PKRU: 55555554
2018-09-09 17:14:34 [179203.899530] Call Trace:
2018-09-09 17:14:34 [179203.902065] Code: 48 39 cc 77 2f 48 8d 81 00 fe ff ff 48 39 e0 77 23 57 48 29 e1 65 48 8b 3c 25 78 0e 01 00 48 83 c7 28 48 29 cf 48 89 f8 48 89 e6 <f3> a4 48 89 c4 5f 48 89 e6 65 ff 04 26
2018-09-09 17:14:34 [179203.922628] RIP [<ffffffffbe121791>] apic_timer_interrupt+0x141/0x170
2018-09-09 17:14:34 [179203.929259] RSP <ffff995c1da46200>
2018-09-09 17:14:34 [179203.933970] ---[ end trace 3912e5e8b3b86da4 ]---
2018-09-09 17:14:34 [179203.984039] Kernel panic - not syncing: Fatal exception
2018-09-09 17:14:34 [179203.989451] Kernel Offset: 0x3ca00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
And a crashdump here:
crash> bt
PID: 138341 TASK: ffff9fd7eb3c6eb0 CPU: 27 COMMAND: "shuangTwoPhaseE"
#0 [ffff9ff02ee6bc38] machine_kexec at ffffffff938629da
#1 [ffff9ff02ee6bc98] __crash_kexec at ffffffff93916692
#2 [ffff9ff02ee6bd68] crash_kexec at ffffffff93916780
#3 [ffff9ff02ee6bd80] oops_end at ffffffff93f1d738
#4 [ffff9ff02ee6bda8] die at ffffffff9382f96b
#5 [ffff9ff02ee6bdd8] math_error at ffffffff9382cca8
#6 [ffff9ff02ee6be98] do_simd_coprocessor_error at ffffffff9382cec8
#7 [ffff9ff02ee6bec0] simd_coprocessor_error at ffffffff93f28c9e
#8 [ffff9ff02ee6bf48] apic_timer_interrupt at ffffffff93f26791
[...] | ||||
Additional Information | This issue is seen on servers with Skylake processors from HP/Dell; the software works fine on Sandy Bridge processors. | ||||
Tags | No tags attached. | ||||
|
We have a dual Xeon Skylake machine here and have used it for more than a year without any problems. Can you provide enough information so that we can reproduce the problem here? |
|
I can also confirm that we have run OpenFOAM on several different Skylakes without issues. |
|
Too little information is provided about the system on which the problem exists or how to reproduce it on another system.
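The notes above ask for more detail about the affected system. As an illustration only, the kind of information that would help could be collected with a short script like the one below. This is a minimal sketch, assuming a Linux host that exposes `/proc/cpuinfo`; the function name `summarize_cpuinfo` and the choice of flag prefixes (`sse`, `avx`, `fma`) are hypothetical and are not part of OpenFOAM or of this report.

```python
import platform
from pathlib import Path


def summarize_cpuinfo(text: str) -> dict:
    """Extract the first CPU model name and SIMD-related flags from /proc/cpuinfo text.

    Hypothetical helper for illustration; the flag prefixes are an assumption
    about which instruction-set extensions are worth reporting.
    """
    info = {"model_name": None, "simd_flags": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between logical CPUs
        key, value = key.strip(), value.strip()
        if key == "model name" and info["model_name"] is None:
            info["model_name"] = value
        elif key == "flags" and not info["simd_flags"]:
            info["simd_flags"] = sorted(
                flag for flag in value.split()
                if flag.startswith(("sse", "avx", "fma"))
            )
    return info


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(summarize_cpuinfo(cpuinfo.read_text()))
    # Kernel release string of the running system
    print("kernel:", platform.release())
```

Running this on both a failing Skylake host and a working Sandy Bridge host, and attaching the output together with the OpenFOAM version, compiler, and case files, would give the maintainers something concrete to compare.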
Date Modified | Username | Field | Change |
---|---|---|---|
2019-11-11 05:24 | sunil_jain | New Issue | |
2019-11-11 08:37 | henry | Note Added: 0010876 | |
2019-11-11 12:54 | tniemi | Note Added: 0010884 | |
2019-11-14 20:25 | henry | Assigned To | => henry |
2019-11-14 20:25 | henry | Status | new => closed |
2019-11-14 20:25 | henry | Resolution | open => unable to reproduce |
2019-11-14 20:25 | henry | Note Added: 0010906 |