2017-06-25 06:13 BST

View Issue Details
ID: 0002463
Project: OpenFOAM
Category: [All Projects] Bug
View Status: public
Last Update: 2017-03-22 22:41
Reporter: fsalmon
Assigned To: wyldckat
Priority: normal
Severity: major
Reproducibility: always
Status: closed
Resolution: no change required
Platform:
OS: Linux Mint
OS Version: 17.1
Product Version: dev
Target Version:
Fixed in Version:
Summary: 0002463: Bad performance in parallel
Description: I am working on a cluster and I get very bad performance when I use more than one processor, except in a few cases. For example, the tutorial smallPoolFire3D (fireFoam) runs faster in serial. With 1M cells, 4 processors are faster than serial, but performance becomes very bad when I increase the number of cells to 4M. For that number of cells, a decomposition over 2 processors performs better.
To sum up, I do not think a single processor should be able to give the best performance for a structured mesh of 1M cells. I do not understand this.

The SGI UV 2000 cluster works fine with other software. The processors are Sandy Bridge E5-4640 @ 2.40GHz. If you want more information on the cluster, just ask.
Additional Information: I have tested all the decomposition methods. There are differences between them, but performance is still very bad. I have also tried changing the solvers. Again, there are some differences, but it is still very slow.
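
Comparisons like the ones described above are usually expressed as speedup (serial time divided by parallel time) and parallel efficiency (speedup divided by the number of processes). The following is only a minimal sketch of that arithmetic in shell; the function name `parallel_efficiency` and the timings are hypothetical and are not taken from this report.

```shell
# Usage: parallel_efficiency SERIAL_SECONDS PARALLEL_SECONDS NUM_PROCESSES
# Prints "<speedup> <efficiency>", where
#   speedup    = serial time / parallel time
#   efficiency = speedup / number of processes
parallel_efficiency() {
    awk -v ts="$1" -v tp="$2" -v n="$3" 'BEGIN {
        # Refuse non-positive inputs that would make the ratios meaningless.
        if (tp <= 0 || n <= 0) { exit 1 }
        s = ts / tp
        printf "%.2f %.2f\n", s, s / n
    }'
}

# Hypothetical timings: 1200 s in serial, 400 s on 4 processes.
parallel_efficiency 1200 400 4    # prints "3.00 0.75"
```

A speedup below 1 means the parallel run is slower than the serial one, which is the situation reported here.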
Tags: No tags attached.
Attached Files
  • log.make (204,491 bytes) 2017-02-22 09:19


Notes

~0007767

wyldckat (updater)

At the risk of us going through a query process similar to a support session, first it's necessary to assess if this is a bug/limitation in OpenFOAM or a set-up/installation problem.

Therefore, I have several questions to ask you:


1. Have you built OpenFOAM-dev from source code or are you using the weekly Deb packages for Ubuntu?

I ask this because the Deb packages may not have the best performance for the installation that you're using, given that the compiler can optimize the build for the processor/architecture it's being built on.



2. If you built from source code, which installation instructions did you follow? Furthermore:

  a. Did you do anything differently from those installation instructions?

  b. Which compiler and version did you use to build?

  c. Which precision and label size did you use? Was it the default 'DP' (Double Precision) and 32-bit labels?



3. The cluster may have its own MPI software stack that must be used in order for it to work properly. So the question here is: Did you build OpenFOAM with the required MPI software?



4. You mentioned that the performance varies depending on the case. Knowing which cases scale properly on the cluster you're using would make it a lot easier to pinpoint the cause of the problem.



5. What does the solver header at the start of the run report? Specifically, I'm looking for the following information:

    Pstream initialized with:
        floatTransfer : 0
        nProcsSimpleSum : 0
        commsType : nonBlocking
        polling iterations : 0
    sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
    fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)



6. Given that the term "processor" depends on the context, we need to clarify which context you're referring to. This is because there are 2-3 contexts that may be discussed here:

  i. Each sub-domain that is created with 'decomposePar' is assigned to a folder/directory named "processor??".

  ii. A single E5-4640 has 8 cores. This can be referred to as 1 CPU (processor) with 8 cores, or referred to as 8 logical processors.

  iii. The E5-4640 model can be placed on the same motherboard as 3 other E5-4640 units, which means that each E5-4640 can be referred to as a processor, hence there being 4 E5-4640 units that could be referred to as "processors".

Furthermore, it's useful to know if HyperThreading is turned on in the machines of the cluster, because when it is, the operating system reports twice as many logical cores as there are real cores. Hence the terminology for "real cores" vs "logical cores".



7. Are you using the correct command workflow to launch the applications in parallel on that cluster, with the correct shell environment loaded?

For example, if the wrong MPI settings are loaded in, you could be using the wrong memory/network interconnect.



8. When you perform these tests, are the machines in use for your run being shared with any other runs?
This could also affect performance, so I prefer to confirm with you if you're keeping track of this.



9. Have you tried running those cases on any other machines, outside of the cluster, to assess whether the problem happens only on the cluster?
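
To make the terminology in question 6 concrete, the socket, real-core and logical-core counts can be derived from the output of `lscpu`. The following is a minimal sketch; the helper name `core_summary` is made up for illustration, and the sample input describes a hypothetical machine with four 8-core sockets and HyperThreading enabled, not the reporter's actual cluster.

```shell
# Reads `lscpu`-style text on stdin and prints
# "<sockets> <real cores> <logical cores>".
core_summary() {
    awk -F: '
        /Socket\(s\)/          { sockets = $2 + 0 }   # number of CPU packages
        /Core\(s\) per socket/ { cores   = $2 + 0 }   # real cores per package
        /Thread\(s\) per core/ { threads = $2 + 0 }   # 2 when HyperThreading is on
        END { print sockets, sockets * cores, sockets * cores * threads }
    '
}

# Hypothetical sample input. On a live machine one would run:
#   lscpu | core_summary
printf '%s\n' \
    'Thread(s) per core:    2' \
    'Core(s) per socket:    8' \
    'Socket(s):             4' | core_summary    # prints "4 32 64"
```

In this sample, decomposing a case into more than 32 sub-domains would place more than one MPI process on each real core.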

~0007768

fsalmon (reporter)

Thank you for your reply.

1) From source code. I followed these instructions: https://openfoamwiki.net/index.php/Installation/Linux/OpenFOAM-dev/Ubuntu

2) Here is what I have in my bashrc file:
WM_COMPILER_TYPE=ThirdParty WM_COMPILER=Gcc48 WM_MPLIB=OPENMPI FOAMY_HEX_MESH=yes

Just two things that could perhaps be the cause: I moved "#include <mpi.h>" out of the extern "C" block, next to the other includes, in the file src/parallel/decompose/ptscotchDecomp/ptscotchDecomp.C, and I ran "export WM_MPLIB=SYSTEMOPENMPI" in the terminal just before running the installation of OpenFOAM. It did not work without doing that.

3) Well, I do not think so.

4) I said it depends on the case, but the results are always bad. As far as I remember, they are better (but still bad) with a structured mesh.

5)
Pstream initialized with:
    floatTransfer : 0
    nProcsSimpleSum : 0
    commsType : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

6) I am not an expert in this, so I will give you exactly what is written; I hope it helps:

432 cores Sandy Bridge E5-4640 @ 2.40GHz
1.7 TB of RAM

So it says "core" rather than "processor". When I run a calculation, I can run it on 8 cores for example, so I do not think "processor" is really the right word.

7) I use the usual " runParallel `getApplication` ". Do I have to use mpirun ... ? I can try it, and if it works I will let you know.

8) Yes, the cluster is shared, so there are other calculations from other people.

9) Yes, I tried on my own computer and it works well: the more processors I used, the faster it ran. But that installation went without any problem, following exactly the link in the first point.

~0007776

wyldckat (updater)

Many thanks for the details. The diagnosis based on your feedback is as follows:

Regarding items 1 and 2)
Linux Mint 17.1 is based on Ubuntu 14.04, as indicated here: https://en.wikipedia.org/wiki/Linux_Mint_version_history

And based on the information that you provided regarding the installation instructions, it looks like you followed the wrong instructions, because the line that you reported is for Ubuntu 12.04 and not 14.04. That's why you had to change the "WM_MPLIB" shell variable export.

That said, the installation instructions you indicated were not (yet) designed to be used on a cluster; they are unofficial, community-written instructions. Furthermore, those instructions explicitly ask readers to raise problems with them at the associated forum; they do not say to ask about them here on this bug tracker :(

Therefore, since you reported this issue here, please follow the official installation instructions: https://openfoam.org/download/source/



Regarding item 3)
Such a fairly expensive cluster should have specific instructions that should be followed for using the correct MPI toolbox. Not following the instructions provided by the system administrator of that cluster is the most likely reason why you're having problems using OpenFOAM on it.



Regarding item 7)
That is not good news. I suspect that the problem has to do with not following the instructions provided by the cluster administrator.
What I mean is that, since you did not state all of the steps you took to launch the case in parallel, I suspect that the case was not launched correctly onto the cluster and was instead launched on the access terminal, which is likely a restricted-access interface with only 1 or 2 cores available.
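
One quick way to test this suspicion is to compare the number of MPI processes requested with the number of cores actually available to the shell the job is launched from. A minimal sketch follows; `fits_available_cores` is a hypothetical helper, and the `mpirun` line in the comment assumes a plain MPI launch rather than the cluster's own job submission procedure, which takes precedence.

```shell
# Usage: fits_available_cores REQUESTED AVAILABLE
# Returns 0 if the requested number of MPI processes fits in the
# available cores, 1 otherwise, and prints a one-line verdict.
fits_available_cores() {
    requested=$1
    available=$2
    if [ "$requested" -gt "$available" ]; then
        echo "oversubscribed: $requested processes on $available cores"
        return 1
    fi
    echo "ok: $requested processes on $available cores"
}

# On a live system, `nproc` reports the cores available to the current
# process, so one might write (assumption: plain mpirun launch):
#   fits_available_cores 8 "$(nproc)" && mpirun -np 8 fireFoam -parallel
fits_available_cores 8 2 || true   # prints "oversubscribed: 8 processes on 2 cores"
```

If this check reports oversubscription on the machine where the run is started, the processes are sharing cores and the run will be slower than in serial.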


Regarding items 8 and 9)
With these last two details and taking into account the details above, all signs point to this being a support related issue and not a bug.


Therefore, I'll be closing this issue for now, as "unable to reproduce". Please re-open this report if you have problems when using the official installation instructions: https://openfoam.org/download/source/
And please use the same procedure you use for running other applications in parallel on the cluster. If you do not follow the same procedure (which should have been provided by your cluster administrator), then it's only natural that OpenFOAM is not working properly.

~0007797

fsalmon (reporter)

Thanks very much for your time.

I did what you said and I still have the same error; I have a problem with WM_MPLIB. This time, whatever I use as WM_MPLIB (SYSTEMOPENMPI by default), I always get the same error. I can write 'export WM_MPLIB=Hello' and the problem is the same as with WM_MPLIB=SYSTEMOPENMPI or WM_MPLIB=OPENMPI, as if I did not have Open MPI on the cluster.

When I type mpi-selector-menu, I have

Current system default: openmpi-1.8.1
Current user default: <none>

    "u" and "s" modifiers can be added to numeric and "U"
    commands to specify "user" or "system-wide".

1. openmpi-1.8.1
U. Unset default
Q. Quit


I am not sure this is right.

The error that I have is:

platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so: undefined reference to `MPI_Waitall'

I get a long list of such errors, in which only the symbol name ('MPI_Waitall' here) changes.
I will try to attach the log file.
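
For a long list of linker errors like this, it can help to reduce the log to the set of distinct missing symbols. The following is a minimal sketch, assuming the log lines have the same shape as the error quoted above; `undefined_symbols` is a hypothetical helper, and the `MPI_Init` line in the sample is invented for illustration.

```shell
# Reads a build log on stdin and prints each distinct symbol named in an
# "undefined reference to `symbol'" message, sorted, one per line.
undefined_symbols() {
    sed -n "s/.*undefined reference to \`\([^']*\)'.*/\1/p" | sort -u
}

# Sample input: the first line is the error from this report, the second
# is invented. With the attached file one would run:
#   undefined_symbols < log.make
undefined_symbols <<'EOF'
platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so: undefined reference to `MPI_Waitall'
platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so: undefined reference to `MPI_Init'
platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so: undefined reference to `MPI_Waitall'
EOF
# prints:
#   MPI_Init
#   MPI_Waitall

# With the OpenFOAM environment loaded, the MPI library that libPstream.so
# resolves to can be inspected with:
#   ldd "$FOAM_LIBBIN/$FOAM_MPI/libPstream.so" | grep -i mpi
```

If every name in the result starts with `MPI_`, that suggests libPstream.so was not linked against a usable MPI library, rather than a problem in OpenFOAM's own code.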

Sorry to reopen this thread.

~0007822

fsalmon (reporter)

Problem solved. You can close the issue.

The problem was MPI. I had to build OpenFOAM with SGIMPI on the cluster. I do not think it is the same for all clusters, but using SYSTEMOPENMPI was not the right choice for me.
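
For readers on a similar system, the change amounts to selecting SGIMPI instead of the default SYSTEMOPENMPI when the OpenFOAM environment is loaded. The fragment below is only a sketch under assumptions: the MPI_ROOT path is a placeholder, the OpenFOAM location is the one used by the wiki instructions mentioned earlier, and as far as I am aware the SGIMPI setting expects MPI_ROOT to point at the SGI MPT installation. The cluster administrators' instructions take precedence.

```shell
# Placeholder path: ask the cluster administrators where SGI MPT lives.
export MPI_ROOT=/path/to/sgi/mpt

# Load the OpenFOAM environment with SGIMPI selected, passing the setting
# as an argument in the same way as the WM_MPLIB=OPENMPI setting shown in
# the reporter's bashrc line (assumed install location):
#   . "$HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc" WM_MPLIB=SGIMPI
#
# Alternatively, edit the default in etc/bashrc itself:
#   export WM_MPLIB=SGIMPI
```

After changing the MPI selection, the MPI-dependent libraries need to be rebuilt, for example by rerunning ./Allwmake from the OpenFOAM directory.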

~0007824

wyldckat (updater)

Many thanks for the feedback!

I did write in a previous comment of mine:

  I suspect that the problem has to do with not following the instructions provided by the cluster administrator.


Which does lead me to these questions:
1. How exactly did you find out that SGIMPI was the MPI you had to use on that cluster?
2. Did you get the information from the cluster administrator or did you search for the toolbox within the cluster?

I ask this because if you searched for the MPI on your own, knowing the searching strategy could help us write better instructions for these scenarios here on the openfoam.org website.

~0007831

fsalmon (reporter)

I asked another user, and SGIMPI is in fact what we use on the cluster. I thought that SYSTEMOPENMPI would use the system's Open MPI, and I think it did, but it does not work with Open MPI; I do not know why.

I cannot help more; it was just a guess that the problem could come from that, and it turned out to be the case.

Also, we have to change it in the bashrc file (it is SYSTEMOPENMPI by default), but I think that is already written in the instructions.

~0007975

wyldckat (updater)

Thanks for the feedback! It took me a while to look into this again, and it looks like there isn't much more we can improve for now.

I'm closing this as "no change required".

Issue History
Date Modified     Username  Field                 Change
2017-02-17 10:33  fsalmon   New Issue
2017-02-19 16:00  wyldckat  Note Added: 0007767
2017-02-20 08:42  fsalmon   Note Added: 0007768
2017-02-20 23:17  wyldckat  Note Added: 0007776
2017-02-20 23:17  wyldckat  Assigned To           => wyldckat
2017-02-20 23:17  wyldckat  Status                new => closed
2017-02-20 23:17  wyldckat  Resolution            open => unable to reproduce
2017-02-22 09:19  fsalmon   Status                closed => feedback
2017-02-22 09:19  fsalmon   Resolution            unable to reproduce => reopened
2017-02-22 09:19  fsalmon   Note Added: 0007797
2017-02-22 09:19  fsalmon   File Added: log.make
2017-02-28 10:35  fsalmon   Note Added: 0007822
2017-02-28 10:35  fsalmon   Status                feedback => assigned
2017-02-28 11:01  wyldckat  Note Added: 0007824
2017-02-28 16:23  fsalmon   Note Added: 0007831
2017-03-22 22:41  wyldckat  Status                assigned => closed
2017-03-22 22:41  wyldckat  Resolution            reopened => no change required
2017-03-22 22:41  wyldckat  Note Added: 0007975