View Issue Details

ID:              0002462
Project:         OpenFOAM
Category:        Bug
View Status:     public
Last Update:     2017-03-05 15:08
Reporter:        fsch1
Assigned To:     wyldckat
Priority:        normal
Severity:        crash
Reproducibility: sometimes
Status:          closed
Resolution:      unable to reproduce
Summary:         0002462: particleCollector crashes sometimes while writing when run with mpirun
Description: The particleCollector in cloudFunctions sometimes crashes when run in parallel with mpirun.
I can't reproduce it every time. When it does crash, it crashes during writing, at the point where the second write interval is completed and the state should be saved.
It crashed with 12 cores on one computer, but didn't crash with 16...
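For context, the writes in question are the regular time writes controlled by system/controlDict. A hypothetical excerpt (values purely illustrative, not copied from my case) under which the second write, and hence the crash, would fall near the t = 0.2 s seen in the log below:

    writeControl    adjustableRunTime;
    writeInterval   0.1;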

The log file of reactingParcelFilmFoam:

Courant Number mean: 0.001309656919 max: 0.337031983
Film max Courant number: 1.32441787
deltaT = 1.738564331e-05
Time = 0.2


Solving 3-D cloud reactingCloud1
Cloud: reactingCloud1
    Current number of parcels = 242
    Current mass in system = 0.0008483755848
    Linear momentum = (0.0002180805787 -1.519212948e-05 0.0003142857867)
   |Linear momentum| = 0.0003828387328
    Linear kinetic energy = 0.01112753596
    model1:
        number of parcels added = 42
        mass introduced = 0.0003045100188
    Parcels absorbed into film = 115
    New film detached parcels = 283
    New film splash parcels = 70
    Parcel fate (number, mass)
      - escape = 3, 2.317773034e-05
      - stick = 0, 0
    Temperature min/max = 300, 300
    Mass transfer phase change = 0

particleCollector output:


And the error:

mpirun --np 12 --hostfile AllMachines reactingParcelFilmFoam -parallel > Log
[1] [5]
[5]
[5] --> FOAM FATAL IO ERROR:
[5] wrong token type - expected Scalar, found on line 0 the punctuation token '-'
[5]
[5] file: IOstream.cloudFunctionObject.particleCollector1.massFlowRate at line
[1]
[1] --> FOAM FATAL IO ERROR:
[1] wrong token type - expected Scalar, found on line 0 the punctuation token '-'
[1]
[1] file: IOstream.cloudFunctionObject.particleCollector1.massFlowRate at line 0.
[1]
[1] From function Foam::Istream& Foam::operator>>(Foam::Istream&, Foam::doubleScalar&)
[1] in file lnInclude/Scalar.C at line 93.
[1]
FOAM parallel run exiting
[1]
0.
[5]
[5] From function Foam::Istream& Foam::operator>>(Foam::Istream&, Foam::doubleScalar&)
[5] in file lnInclude/Scalar.C at line 93.
[5]
FOAM parallel run exiting
[5]
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 5 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 5 with PID 12421 on
node oita exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[asdf:09226] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[asdf:09226] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
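The FOAM error itself is a read failure: while parsing the particleCollector state entry cloudFunctionObject.particleCollector1.massFlowRate from a stream, the parser finds a lone '-' punctuation token where a scalar is expected. One way such an entry could look (the entry name is taken from the error message above; the value is made up purely for illustration) is a negative scalar that lost its digits, e.g. cut off mid-write:

    massFlowRate    -;    // instead of e.g. massFlowRate -1.519e-05;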





Additional Information: My definition of the cloudFunctions. Setting log yes/no or resetOnWrite yes/no doesn't seem to change anything.

cloudFunctions
{
    // delete all particles that pass through this rectangle
    particleCollector1
    {
        type particleCollector;

        mode polygon;
        polygons
        (

            (
                ($xPos_deleteParcels -2.0 0)
                ($xPos_deleteParcels -2.0 2)
                ($xPos_deleteParcels 2 2)
                ($xPos_deleteParcels 2 0)
            )
        );
        normal (1 0 0);

        negateParcelsOppositeNormal no;
        removeCollected yes;
        surfaceFormat vtk;
        resetOnWrite no;
        log no;
    }
}
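Note: $xPos_deleteParcels is an ordinary dictionary macro that is substituted when the dictionary is read; it is defined elsewhere in my case. A hypothetical definition (the actual value is not relevant to the crash) would simply be:

    xPos_deleteParcels    0.5;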
Tags: cloudFunctions, particleCollector

Activities

wyldckat

2017-02-20 23:20

updater   ~0007777

Can you please provide a test case with which we can replicate this error?
I ask because I can't find any indication in your report of which OpenFOAM tutorial case can be used to replicate this issue.

fsch1

2017-02-23 13:11

reporter   ~0007818

It's a case comparable to the hotBoxes tutorial. It happened twice, but I can't reproduce it right now :/
You could close the report and I would reopen it if I can provide a case where it doesn't happen only randomly...

wyldckat

2017-03-05 15:08

updater   ~0007859

Closing it for now, since it's very hard to reproduce.

Issue History

Date Modified Username Field Change
2017-02-17 08:39 fsch1 New Issue
2017-02-17 08:39 fsch1 Tag Attached: cloudFunctions
2017-02-17 08:39 fsch1 Tag Attached: particleCollector
2017-02-20 23:20 wyldckat Note Added: 0007777
2017-02-23 13:11 fsch1 Note Added: 0007818
2017-03-05 15:08 wyldckat Assigned To => wyldckat
2017-03-05 15:08 wyldckat Status new => closed
2017-03-05 15:08 wyldckat Resolution open => unable to reproduce
2017-03-05 15:08 wyldckat Note Added: 0007859