View Issue Details

IDProjectCategoryView StatusLast Update
0002097OpenFOAMContributionpublic2016-08-04 10:06
Reportera.weber Assigned Tohenry  
PrioritynormalSeveritytweakReproducibilityalways
Status resolvedResolutionfixed 
PlatformGNU/LinuxOSUbuntuOS Version14.10
Summary0002097: Major speedup for stochasticCollision
DescriptionI recomment using a better comparison algorithm for all stochasticCollsision algorithms.
Right now, all parcels are compared to each other needing n(n-1)/2 comparison steps. In a second step, it is checked if the parcels belong to the same cell for further computation.
A better approch is to turn the steps around, making a parcel list for each cell at first and then do the comparisation for particles belonging to the same cell.
Steps To ReproduceAny lagrangian simulation using stochasticCollision
Additional Informationmy implementation looks like this (ORourkeCollision.C):

old -----------
    label i = 0;
    forAllIter(typename CloudType, this->owner(), iter1)
    {
        label j = 0;
        forAllIter(typename CloudType, this->owner(), iter2)
        {
            if (j > i)
            {
             .......


 new ----------
   std::vector<parcelType*> pInCell [this->owner().mesh().nCells()];

   forAllIter(typename CloudType, this->owner(), iter1)
   {
       pInCell[iter1().cell()].push_back(&iter1());
   }

    //---- end
    for(label iter_cell(0);iter_cell<this->owner().mesh().nCells();iter_cell++)
    {
        if(pInCell[iter_cell].size() >= 2)
        {
            for(label i(0);i<label(pInCell[iter_cell].size());++i)
            {
                for(label j(0);j<label(pInCell[iter_cell].size());++j)
                {
                    if (j > i)
                    {
                .....

Some Info about this speeding up the computation can be found in
doi:10.1006/jcph.2001.6858
TagsNo tags attached.

Activities

henry

2016-05-23 15:18

manager   ~0006311

Thanks for the proposed optimization, I would be happy to include it but first I need to convert it into standard OpenFOAM form.

Can you propose a suitable test-case to validate the change and demonstrate the speed-up?

a.weber

2016-05-31 12:59

reporter  

newORourke.zip (285,518 bytes)

a.weber

2016-05-31 13:06

reporter   ~0006342

I did some test lately using my own solver for simulating a bubblecolumn. unfortunately the case data wouldnt fit to the standard sprayFoam anymore. You can still have a look at the case and the first CPU time comparison.
I therefore added CPU time checking only for the collide step including collideParcels substep in the code.

Downside of the testcase is, that it does not have a uniformly distribution of parcels in the domain, which will of corse influence the CPU time. Most parcels will accumulate in the middle (bubble plume).

henry

2016-05-31 13:33

manager   ~0006344

I have tried to run the case you provided with the solver specified: icoUncoupledKinematicParcelFoam but it fails to read transportProperties.

I need a case to test the changes to ORourkeCollision to ensure that I do not make any mistakes in updating the code you have provided. It can be any case which uses the ORourkeCollision model so I can make sure that the behavior of the model before and after the change is the same.

a.weber

2016-05-31 15:08

reporter  

newORourke2.zip (269,301 bytes)

a.weber

2016-05-31 15:10

reporter   ~0006346

I uploaded a case which should work with the standard sprayFoam solver. Nevertheless, there is a bug in ORourkeCollision wich will cause it to crash:


// interpolate to find average surface tension
    scalar sigmaAve = sigma1 + (sigma2 - sigma1)*(Tave - T1)/(T2 - T1);

solution:
scalar sigmaAve = sigma1 + (sigma2 - sigma1)*(Tave - T1)/max(ROOTVSMALL,(T2 - T1));

because T1 and T2 can be the same value

a.weber

2016-06-01 14:09

reporter   ~0006357

Found another major bug in ORourkeCollision:
    ...
    scalar collProb = exp(-nu);
    scalar xx = this->owner().rndGen().template sample01<scalar>();

    // collision occurs
    if (xx > collProb)

---- > has to be switched to:
    if (xx < collProb)

henry

2016-06-01 14:58

manager   ~0006358

Thank you for spending the time to sort-out the ORourkeCollision model, it has had very little testing so far and not validated.

henry

2016-06-02 10:53

manager   ~0006359

I don't think max(ROOTVSMALL,(T2 - T1)) is the appropriate stabilization, T2 might be less than T1.

henry

2016-06-04 11:53

manager   ~0006370

I could not get the code snippet you provided to compile and found that there were some important details missing, in particular regarding access to the particles and changes to i and j. In the end I had to write it from scratch using OpenFOAM classes rather than STL classes adding the missing components and updating some of the other functions consistently. Along the way I found some other bugs in the code which were causing spurious crashes and also corrected the stabilization of the division by (T2 -T1).

All of these changes are included in OpenFOAM-dev:
commit 91e84b9004c4a9be2c135465617783f21d28dc71

Could you test these changes and let me know if this model is now working correctly?

a.weber

2016-06-06 12:43

reporter   ~0006382

Last edited: 2016-06-06 13:06

It all seems to work perfectly. I'll convert all my other coalescence models accordingly and will have them tested.

I'll also do some more CPU time testing, I'll let you know about the results.

a.weber

2016-06-07 14:39

reporter  

CPUtime.png (7,394 bytes)   
CPUtime.png (7,394 bytes)   

a.weber

2016-06-07 14:51

reporter   ~0006405

CPU time testing done:
Clearly a faster code with the new algorithm.

I think it is the usage of the CompactListList which makes it a little bit slower when having a particle number under 300, but it is much more important do have no memory reallocation ( i haven't thought about that at first). Anyhow, a normal simulation (in my case) should easyly consist of over 300 parcels.

henry

2016-06-07 15:58

manager   ~0006406

Do you have the equivalent graph of the version with the List of DynamicList or your original version using std::vector? I would guess that using CompactListList will be slightly more expensive for very small numbers of parcels because the parcels have to be scanned twice but cheaper for larger numbers because there is only a single allocation for the ListList.

a.weber

2016-06-10 12:53

reporter  

CPUtime_detail.png (7,060 bytes)   
CPUtime_detail.png (7,060 bytes)   

a.weber

2016-06-10 12:55

reporter   ~0006422

Last edited: 2016-06-10 12:58

There you go.
For a low number of parcels, it seems that the compactListList is still better than using the std::vector.
dynamicList has only a slightly faster computation.

I've tested up to 10^5 particles, where there is no real difference between the three "new" algorithms.

henry

2016-06-10 13:29

manager   ~0006423

It is interesting that the DynamicList version is slightly faster but I think that using CompactListList is likely to give good bahavior on a wider range of hardware and we should stick with it.

henry

2016-06-10 13:30

manager   ~0006424

Resolved by commit 91e84b9004c4a9be2c135465617783f21d28dc71

a.weber

2016-08-04 09:25

reporter   ~0006649

I had a closer look again recently:

    // collision occurs
    if (xx > collProb)

is actually correct, sorry.

It works because it is not the direct propability nu0 but the exp(-nu)

    scalar collProb = exp(-nu);
    scalar xx = this->owner().rndGen().template sample01<scalar>();

henry

2016-08-04 10:06

manager   ~0006651

Resolved in OpenFOAM-4.x by commit 7dfa780c481b8b79b1ee4d5bcf3e6b839a5ef017
Resolved in OpenFOAM-dev by commit cc8781f47d11d4ee6a760bf356253b638851977d

Issue History

Date Modified Username Field Change
2016-05-23 15:04 a.weber New Issue
2016-05-23 15:18 henry Note Added: 0006311
2016-05-31 12:59 a.weber File Added: newORourke.zip
2016-05-31 13:06 a.weber Note Added: 0006342
2016-05-31 13:33 henry Note Added: 0006344
2016-05-31 15:08 a.weber File Added: newORourke2.zip
2016-05-31 15:10 a.weber Note Added: 0006346
2016-06-01 14:09 a.weber Note Added: 0006357
2016-06-01 14:58 henry Note Added: 0006358
2016-06-02 10:53 henry Note Added: 0006359
2016-06-04 11:53 henry Note Added: 0006370
2016-06-06 12:43 a.weber Note Added: 0006382
2016-06-06 13:06 a.weber Note Edited: 0006382
2016-06-07 14:39 a.weber File Added: CPUtime.png
2016-06-07 14:51 a.weber Note Added: 0006405
2016-06-07 15:58 henry Note Added: 0006406
2016-06-10 12:53 a.weber File Added: CPUtime_detail.png
2016-06-10 12:55 a.weber Note Added: 0006422
2016-06-10 12:58 a.weber Note Edited: 0006422
2016-06-10 12:58 a.weber Note Edited: 0006422
2016-06-10 13:29 henry Note Added: 0006423
2016-06-10 13:30 henry Note Added: 0006424
2016-06-10 13:30 henry Status new => resolved
2016-06-10 13:30 henry Fixed in Version => dev
2016-06-10 13:30 henry Resolution open => fixed
2016-06-10 13:30 henry Assigned To => henry
2016-08-04 09:25 a.weber Note Added: 0006649
2016-08-04 09:25 a.weber Status resolved => feedback
2016-08-04 09:25 a.weber Resolution fixed => reopened
2016-08-04 10:06 henry Note Added: 0006651
2016-08-04 10:06 henry Status feedback => resolved
2016-08-04 10:06 henry Fixed in Version dev => 4.x
2016-08-04 10:06 henry Resolution reopened => fixed