View Issue Details

ID: 0003189
Project: OpenFOAM
Category: Bug
View Status: public
Last Update: 2019-03-08 09:19
Reporter: Blumenkind
Assigned To: will
Priority: normal
Severity: crash
Reproducibility: sometimes
Status: resolved
Resolution: fixed
Platform: Linux
OS: Ubuntu
OS Version: 18.04 and 16.04
Product Version: dev
Summary: 0003189: MPIRun with interFoam crashes
Description: The case is a rotating cylinder containing a liquid/air mixture, solved with interFoam.
It runs smoothly in serial and in parallel on OpenFOAM-5.x, but crashes in parallel with OpenFOAM-dev.
The crash usually occurs after 1-3 time steps, at which point the run stops. The last output is usually the "PIMPLE: Converged in ..." line or the "ExecutionTime = ..." line, probably depending on which processor crashes.
I have reproduced this crash on computers running Ubuntu 16.04 and 18.04.

Sometimes it prints an error message like:

PIMPLE: Converged in 3 iterations
ExecutionTime = 0.24 s ClockTime = 0 s

[Kiruna:23922] *** An error occurred in MPI_Recv
[Kiruna:23922] *** reported by process [3668180993,7]
[Kiruna:23922] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[Kiruna:23922] *** MPI_ERR_TRUNCATE: message truncated
[Kiruna:23922] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[Kiruna:23922] *** and potentially your MPI job)
Steps To Reproduce:
copy the tar
extract it
run prepareCase.sh
decomposePar
mpirun -np 8 interFoam -parallel
Tags: No tags attached.

Activities

Blumenkind

2019-03-07 13:51

reporter  

cylinder.tar.gz (5,644 bytes)

will

2019-03-07 15:31

manager   ~0010352

If you do a debug build of libfiniteVolume.so then it fails in serial, too.

The problem is the alpha sub-cycling. It changes the time index, which causes the solver performance data to be cleared. This happens on every PIMPLE iteration. When the second iteration comes along, all the data from the first is missing, and so the attempt to index into it fails.

I'm not sure how to deal with this yet.
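
To make the mechanism concrete, here is a minimal toy model in plain C++. It is not OpenFOAM code: `ToyPerfStore`, `resetIfNewTime` and `entriesAfterStep` are hypothetical names invented for illustration, and the assumption is simply that the stored data is cleared whenever the time index differs from the one it was recorded at.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the solver performance data (not the OpenFOAM class).
struct ToyPerfStore
{
    int storedTimeIndex = -1;
    std::map<std::string, std::vector<double>> residuals;

    // Clear everything if the time index differs from the stored one.
    void resetIfNewTime(int timeIndex)
    {
        if (timeIndex != storedTimeIndex)
        {
            residuals.clear();
            storedTimeIndex = timeIndex;
        }
    }

    // Record one residual for a field, running the reset check first.
    void record(const std::string& field, int timeIndex, double r)
    {
        resetIfNewTime(timeIndex);
        residuals[field].push_back(r);
    }

    std::size_t count(const std::string& field) const
    {
        auto it = residuals.find(field);
        return it == residuals.end() ? 0 : it->second.size();
    }
};

// One outer time step with nCorr PIMPLE iterations and nSub alpha sub-cycles
// per iteration. Returns how many p_rgh entries survive to the end of the step.
std::size_t entriesAfterStep(int nCorr, int nSub)
{
    ToyPerfStore store;
    const int outerIndex = 1;

    for (int corr = 0; corr < nCorr; ++corr)
    {
        for (int sub = 0; sub < nSub; ++sub)
        {
            // Assumption: each sub-cycle runs at a temporarily advanced index.
            store.record("alpha", outerIndex + sub + 1, 1e-8);
        }
        // Sub-cycling has ended, so the outer index is back in effect.
        store.record("p_rgh", outerIndex, 1e-3);
    }
    return store.count("p_rgh");
}
```

In this model `entriesAfterStep(3, 0)` returns 3, while `entriesAfterStep(3, 2)` returns 1: with sub-cycling, only the entry from the last iteration survives, so any code that indexes into data from an earlier iteration reads past the end.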

will

2019-03-08 09:19

manager   ~0010354

I've made the solver data reset check for sub-cycling, and use the index of the outer time state if necessary. This resolves the issue. See:

https://github.com/OpenFOAM/OpenFOAM-dev/commit/88bea2740c889b4773b35a0624bea6c4befbfafa
https://github.com/OpenFOAM/OpenFOAM-6/commit/fdfb5b1825a3237bd2a6e0b04530a24947c8a753

Incidentally, the behaviour in 5.x probably isn't correct either. It might not crash, but the reset issue means the residuals used by the pimple corrector convergence control aren't likely to be the right ones.
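
The idea behind the fix can be sketched with a toy model in plain C++. This is an illustration under stated assumptions, not the code from the commits above: `ToyTime`, `ToyPerfData`, `resetIndex` and `entriesWithFix` are hypothetical names, and the only behaviour modelled is that the reset check uses the outer time state's index while sub-cycling is active.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical time object that remembers the outer index while sub-cycling.
struct ToyTime
{
    int index = 0;
    int outerIndex = 0;
    bool subCycling = false;

    void beginSubCycle() { outerIndex = index; subCycling = true; }
    void advanceSubCycle() { ++index; }
    void endSubCycle() { index = outerIndex; subCycling = false; }

    // Index used for the reset check: the outer one while sub-cycling.
    int resetIndex() const { return subCycling ? outerIndex : index; }
};

// Hypothetical store for the p_rgh residuals of the current time step.
struct ToyPerfData
{
    int storedIndex = -1;
    std::vector<double> residuals;

    // Clear the data only when the (outer) time index has really changed.
    void check(const ToyTime& t)
    {
        const int idx = t.resetIndex();
        if (idx != storedIndex)
        {
            residuals.clear();
            storedIndex = idx;
        }
    }

    void record(const ToyTime& t, double r)
    {
        check(t);
        residuals.push_back(r);
    }
};

// One outer time step with nCorr PIMPLE iterations and nSub alpha sub-cycles
// per iteration. Returns how many p_rgh entries survive to the end of the step.
std::size_t entriesWithFix(int nCorr, int nSub)
{
    ToyTime t;
    t.index = 1;
    ToyPerfData data;

    for (int corr = 0; corr < nCorr; ++corr)
    {
        t.beginSubCycle();
        for (int sub = 0; sub < nSub; ++sub)
        {
            t.advanceSubCycle();
            data.check(t);  // an alpha solve triggers the reset check
        }
        t.endSubCycle();
        data.record(t, 1e-3);
    }
    return data.residuals.size();
}
```

Here `entriesWithFix(3, 2)` returns 3, the same as `entriesWithFix(3, 0)`: the data from every PIMPLE iteration is kept, which is also what the convergence control needs in order to see the right residuals.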

Issue History

Date Modified Username Field Change
2019-03-07 13:51 Blumenkind New Issue
2019-03-07 13:51 Blumenkind File Added: cylinder.tar.gz
2019-03-07 15:31 will Note Added: 0010352
2019-03-08 09:19 will Assigned To => will
2019-03-08 09:19 will Status new => resolved
2019-03-08 09:19 will Resolution open => fixed
2019-03-08 09:19 will Fixed in Version => 6
2019-03-08 09:19 will Note Added: 0010354