View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0002462 | OpenFOAM | Bug | public | 2017-02-17 08:39 | 2017-03-05 15:08 |
Reporter | fsch1 | Assigned To | wyldckat | ||
Priority | normal | Severity | crash | Reproducibility | sometimes |
Status | closed | Resolution | unable to reproduce | ||
Summary | 0002462: particleCollector crashes sometimes while writing when run with mpirun | ||||
Description | The particleCollector in the cloudFunctions crashes sometimes when run in parallel with mpirun. I can't reproduce it every time, but when it crashes, it does so while writing, after the second write interval is completed and the state should be saved. It crashed with 12 cores on one computer, and didn't crash with 16...

The log file of reactingParcelFilmFoam:

```
Courant Number mean: 0.001309656919 max: 0.337031983
Film max Courant number: 1.32441787
deltaT = 1.738564331e-05
Time = 0.2

Solving 3-D cloud reactingCloud1
Cloud: reactingCloud1
    Current number of parcels       = 242
    Current mass in system          = 0.0008483755848
    Linear momentum                 = (0.0002180805787 -1.519212948e-05 0.0003142857867)
   |Linear momentum|                = 0.0003828387328
    Linear kinetic energy           = 0.01112753596
    model1:
        number of parcels added     = 42
        mass introduced             = 0.0003045100188
    Parcels absorbed into film      = 115
    New film detached parcels       = 283
    New film splash parcels         = 70
    Parcel fate (number, mass)
      - escape                      = 3, 2.317773034e-05
      - stick                       = 0, 0
    Temperature min/max             = 300, 300
    Mass transfer phase change      = 0
    particleCollector output:
(END)
```

And the error:

```
mpirun --np 12 --hostfile AllMachines reactingParcelFilmFoam -parallel > Log

[1]
[5]
[5]
[5] --> FOAM FATAL IO ERROR:
[5] wrong token type - expected Scalar, found on line 0 the punctuation token '-'
[5]
[5] file: IOstream.cloudFunctionObject.particleCollector1.massFlowRate at line
[1]
[1] --> FOAM FATAL IO ERROR:
[1] wrong token type - expected Scalar, found on line 0 the punctuation token '-'
[1]
[1] file: IOstream.cloudFunctionObject.particleCollector1.massFlowRate at line 0.
[1]
[1]     From function Foam::Istream& Foam::operator>>(Foam::Istream&, Foam::doubleScalar&)
[1]     in file lnInclude/Scalar.C at line 93.
[1] FOAM parallel run exiting
[1] 0.
[5]
[5]     From function Foam::Istream& Foam::operator>>(Foam::Istream&, Foam::doubleScalar&)
[5]     in file lnInclude/Scalar.C at line 93.
[5] FOAM parallel run exiting
[5]
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 5 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 5 with PID 12421 on
node oita exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[asdf:09226] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[asdf:09226] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
``` | ||||
Additional Information | My definition of the cloudFunctions. log yes/no or resetOnWrite yes/no doesn't seem to change anything:

```
cloudFunctions
{
    // delete all particles that pass this rectangle
    particleCollector1
    {
        type            particleCollector;
        mode            polygon;
        polygons
        (
            (
                ($xPos_deleteParcels -2.0 0)
                ($xPos_deleteParcels -2.0 2)
                ($xPos_deleteParcels 2 2)
                ($xPos_deleteParcels 2 0)
            )
        );
        normal          (1 0 0);
        negateParcelsOppositeNormal no;
        removeCollected yes;
        surfaceFormat   vtk;
        resetOnWrite    no;
        log             no;
    }
}
``` | ||||
Tags | cloudFunctions, particleCollector | ||||
(0007777) wyldckat | 2017-02-20 23:20 |
Can you please provide a test case with which we can replicate this error? I ask because I can't find any indication in your report of which OpenFOAM tutorial case can be used to replicate the issue. |
(0007818) fsch1 | 2017-02-23 13:11 |
It's a case comparable to the hotBoxes tutorial. It happened twice, but I can't reproduce it right now :/ You could close the report, and I would reopen it if I can provide a case where it doesn't happen only randomly... |
(0007859) wyldckat | 2017-03-05 15:08 |
Closing it for now, since it's very hard to reproduce. |
Date Modified | Username | Field | Change |
---|---|---|---|
2017-02-17 08:39 | fsch1 | New Issue | |
2017-02-17 08:39 | fsch1 | Tag Attached: cloudFunctions | |
2017-02-17 08:39 | fsch1 | Tag Attached: particleCollector | |
2017-02-20 23:20 | wyldckat | Note Added: 0007777 | |
2017-02-23 13:11 | fsch1 | Note Added: 0007818 | |
2017-03-05 15:08 | wyldckat | Assigned To | => wyldckat |
2017-03-05 15:08 | wyldckat | Status | new => closed |
2017-03-05 15:08 | wyldckat | Resolution | open => unable to reproduce |
2017-03-05 15:08 | wyldckat | Note Added: 0007859 |