View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0003898 | OpenFOAM | Bug | public | 2022-10-07 16:15 | 2022-10-27 10:36 |
Reporter | wyldckat | Assigned To | wyldckat | ||
Priority | normal | Severity | crash | Reproducibility | sometimes |
Status | closed | Resolution | no change required | ||
Product Version | 10 | ||||
Summary | 0003898: Segmentation fault when using function entries due to incorrect global initialization | ||||
Description | I was intrigued by the conclusion of issue #3887 that this was a bug in CentOS 7.9, so I dug a bit further into it. The diagnosis is that in the file "src/OpenFOAM/db/dictionary/entry/entryIO.C", line 159: && functionName == functionEntries::inputSyntaxEntry::typeName is what triggers the segmentation fault whenever '#includeFunc' or a similar function entry is parsed, whether running 'blockMesh' or even 'foamDictionary'. The reason seems to be that the static variable "functionEntries::inputSyntaxEntry::typeName" is not always initialized in time with certain GCC builds, as is the case with the GCC builds provided by SCL. Changing that line to: && functionName == functionEntries::inputSyntaxEntry::typeName_() solves the issue. The annoying part now is figuring out why exactly the initializations happen out of the expected order, when it works just fine with the many GCC builds that people have been using. My best guess is that some GCC feature is turned on by default in the SCL builds that is not commonly enabled in other builds of GCC. | ||||
Steps To Reproduce | 1. Built OpenFOAM 10 from a git clone of the source code, as a normal user on CentOS 7.9, using GCC 11 from SCL. 2. Copied the tutorial case 'tutorials/heatTransfer/chtMultiRegionFoam/reverseBurner' to the run folder. 3. Ran: gdb blockMesh 4. Then, inside gdb: run, followed by: bt 5. That did not give enough information, so I added '-g' to 'src/OpenFOAM/Make/options' and ran 'wmake -j' within 'src/OpenFOAM'. 6. Repeated steps 3 and 4. 7. The backtrace indicated that the crash originated from line 159. | ||||
Tags | No tags attached. | ||||
|
This reminded me of this old bug report: https://bugs.openfoam.org/view.php?id=2645 Taking into account that CentOS tries to be very backwards compatible and ancient, it is possible that their gcc still processes the files in reverse order. If you want to investigate this, you could try swapping $(functionEntries)/inputSyntaxEntry/inputSyntaxEntry.C so that it occurs before $(entry)/entryIO.C in src/OpenFOAM/Make/files and see if it makes a difference. Generally speaking, the order of static initialization is not strictly defined across files/translation units, so problems like this can be hit occasionally. |
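The general hazard described in this note (a static object being used before its own initializer has run) is commonly sidestepped with the construct-on-first-use idiom. What follows is only a minimal sketch, assuming a conforming C++11 toolchain; `InputSyntaxEntryDemo` and `EarlyUser` are hypothetical stand-ins rather than OpenFOAM classes, and the string "inputSyntax" is just an illustrative value.

```cpp
#include <string>

// Hypothetical stand-in for a class that exposes a static type name.
// This is NOT OpenFOAM code, only an illustration of the idiom.
struct InputSyntaxEntryDemo
{
    // Returns a string literal, which lives in static storage and
    // needs no dynamic initialization at all.
    static const char* typeName_() { return "inputSyntax"; }

    // Construct-on-first-use: the std::string is created the first time
    // this function is called, so callers never see it uninitialized,
    // no matter which translation unit is initialized first.
    static const std::string& typeName()
    {
        static const std::string name(typeName_());
        return name;
    }
};

// Hypothetical namespace-scope object whose constructor needs the name
// during its own dynamic initialization.
struct EarlyUser
{
    bool matched;

    explicit EarlyUser(const std::string& functionName)
    :
        matched(functionName == InputSyntaxEntryDemo::typeName())
    {}
};

// Safe even if this object is initialized before other statics,
// because typeName() builds its string on demand.
static const EarlyUser earlyUser("inputSyntax");
```

The design choice here is to trade a plain static data member for a function call, so that initialization happens on demand instead of depending on link or file order.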
|
@tniemi: Many thanks! I've tried it just now, but it still gives the same issue. I've checked the wmake output to confirm that I changed the order correctly, and the build order is now the one you indicated. |
|
Ok. As a matter of fact, in this case the default order (entryIO.C before inputSyntaxEntry.C) means that a compiler following the current convention would always process entryIO.C first and then inputSyntaxEntry.C, so there should be problems everywhere. Yet these do not happen... Your proposed change only needs data from inputSyntaxEntry.H, which gets included into entryIO.C, so it should be safer. If it works with other compilers as well, it would probably be a good change. |
|
Quoting from file 'src/OpenFOAM/db/dictionary/functionEntries/inputSyntaxEntry/inputSyntaxEntry.C' - can be seen here https://github.com/OpenFOAM/OpenFOAM-10/blob/fdc522691c352db24630e49c2bf7decfbe8316b2/src/OpenFOAM/db/dictionary/functionEntries/inputSyntaxEntry/inputSyntaxEntry.C#L31 const Foam::word Foam::functionEntries::inputSyntaxEntry::typeName ( Foam::functionEntries::inputSyntaxEntry::typeName_() ); I'm simply unrolling manually the origin of the value defined in 'typeName', so it should still work as intended with all compilers. The downside is the very slight drop in performance when parsing through that line in 'entryIO.C'. |
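To illustrate the trade-off mentioned in this note: comparing against the `const char*` returned by a function needs no dynamically initialized object, at the cost of measuring the C string's length on every comparison. A minimal sketch, using `std::string` in place of `Foam::word`; the free functions `typeNameDemo_()` and `isInputSyntax()` are hypothetical names invented for this example, and "inputSyntax" is only an illustrative value.

```cpp
#include <string>

// Hypothetical equivalent of a typeName_() accessor: it returns a string
// literal, so there is nothing that could be "not yet initialized".
inline const char* typeNameDemo_()
{
    return "inputSyntax";
}

// Compares a parsed function name against the literal directly.
// operator==(const std::string&, const char*) has to determine the length
// of the C string on each call, which is the small overhead referred to.
inline bool isInputSyntax(const std::string& functionName)
{
    return functionName == typeNameDemo_();
}
```

For example, `isInputSyntax("includeFunc")` evaluates to false, while `isInputSyntax("inputSyntax")` evaluates to true.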
|
Try moving the Foam::functionEntries::inputSyntaxEntry::typeName definition into entryIO.C |
|
@Henry: Unfortunately that did not fix the issue. On Friday I had tried placing that definition in the 'global.Cver' file and running "wmake" within "src/OpenFOAM", but the same issue occurred. I've built the OpenFOAM library again with '-g' and the current gdb backtrace gives me: #0 std::operator==<char> (__lhs="includeFunc", __rhs=<error reading variable: Cannot access memory at address 0xffffffffffffffe8>, __lhs="includeFunc", __rhs=<error reading variable: Cannot access memory at address 0xffffffffffffffe8>) at /opt/rh/devtoolset-11/root/usr/include/c++/11/bits/basic_string.h:3963 #1 Foam::entry::New (parentDict=..., is=...) at db/dictionary/entry/entryIO.C:167 It seems more and more like this issue occurs due to default GCC flags or the absence of specific default flags... I'm now testing GCC 11 from SCL with the following default flags from GCC 9 on Ubuntu 20.04: -fasynchronous-unwind-tables -fstack-protector-strong -Wformat -Wformat-security -fstack-clash-protection -fcf-protection I'll update here as soon as I have results. |
|
So I have some good news and some annoying news: 1- It's not a compiler bug, in itself. 2- It is somewhat of a bug in the headers provided in SCL's GCC folder "/opt/rh/devtoolset-11/root/usr/include/c++/11/". 3- Therefore there is no clear way to fix this, unless it is fixed upstream in the GCC builds provided by SCL. They seem to apply certain patches that are not part of GCC's own stack, one of which breaks this. For now there is a workaround, as proposed in the Description of this report, but it does indeed seem to be some kind of bug in SCL's GCC patching. I'll try to report this to them. @Henry: Please feel free to undo the change made in commit 56023e97fb7d2f40b136e0d340c35a9d36e49640 at OpenFOAM-dev: https://github.com/OpenFOAM/OpenFOAM-dev/commit/56023e97fb7d2f40b136e0d340c35a9d36e49640 |
|
Good that you were able to figure this out. I was also just testing this with a CentOS docker image, and the behavior of the compiler is bizarre. Your workaround works, but if I e.g. try to define a local static function which returns functionEntries::inputSyntaxEntry::typeName_(), it crashes. To me it seems that the SCL toolset is clearly broken. |
|
@tniemi: I didn't go into much detail for my previous comment, but essentially I used the GCC main 'include' folder for 11.2.1 from SCL, in a custom build of GCC 11.3.0, and reproduced the same exact error. If I use the custom GCC 11.3.0 default 'include' folder, everything works just fine. SCL is using some patches of their own that clearly are attempting something specific, either for security or for something else, but I have yet to find proper records for those patches. I'm reporting this issue now at https://bugzilla.redhat.com - so that they can revise said patches. |
|
Reported at https://bugzilla.redhat.com/show_bug.cgi?id=2133780 |
|
It is possible that this problem is related to backward compatibility issues and if so, it might be unfixable in CentOS 7. Googling reveals that there are various issues related to C++11 conformance and string/list handling: https://stackoverflow.com/questions/73842752/is-this-a-bug-in-gcc-11-2-1-related-to-glibcxx-use-cxx11-abi https://bugzilla.redhat.com/show_bug.cgi?id=1546704 Basically what I gather is that SCL compilers (no matter the version) don't offer full C++11 support in RHEL7 in order to support RHEL6. So I guess the recommendation would be to always use custom built gcc's with CentOS 7. |
|
I tried setting -D_GLIBCXX_USE_CXX11_ABI=0 with the Ubuntu gcc to replicate the behaviour of CentOS 7, and I got the segmentation fault. So it seems like this is the issue: SCL compilers force the use of the pre-C++11 ABI, and this does not play well with OF. For this specific case the problem can be avoided by calling "functionEntries::inputSyntaxEntry::typeName_()", but it is unclear whether this is the only problematic place. |
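For anyone wanting to confirm which libstdc++ ABI a given compiler selects, the `_GLIBCXX_USE_CXX11_ABI` macro can be inspected after including any standard library header. A minimal sketch; the helper names `usesCxx11Abi()` and `abiDescription()` are hypothetical, and on standard libraries other than libstdc++ the macro is simply not defined, which this sketch reports as the "old or unknown" case.

```cpp
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// True when libstdc++ is configured for the new (C++11) std::string/std::list ABI.
constexpr bool usesCxx11Abi()
{
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return true;
#else
    return false;
#endif
}

// Human-readable summary of the result above.
inline const char* abiDescription()
{
    return usesCxx11Abi()
      ? "new (C++11) libstdc++ ABI"
      : "old (pre-C++11) libstdc++ ABI, or not libstdc++";
}
```

Printing `abiDescription()` together with `sizeof(std::string)` from a small test program, compiled once with and once without -D_GLIBCXX_USE_CXX11_ABI=0, is one way to see whether the flag takes effect on a given toolchain.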
|
They have now closed the bug report in the Red Hat bugzilla, because the C++11 ABI won't be supported on CentOS 7. It is not clear to me why using the old ABI causes problems, but somehow it does in this case. It might be related to relying on some undefined behaviour regarding static initialization order, but considering that normal gccs and clang work fine, I don't think there is an issue. I also see no reason why somebody would use the pre-C++11 ABI these days unless specifically linking against ancient binaries. So this is pretty much confined to CentOS 7 / RHEL 7 and their pre-built compilers. If the code is changed as suggested in the description, the segfault goes away and SCL compilers might work fine. However, I would personally prefer to use a self-built compiler on CentOS 7. |
|
Building a custom GCC isn't always the simplest thing to do, hence wanting to use SCL's GCC. Having the workaround documented for those who might want this capability should be good enough. It seems very strange to me that something as simple as an initialization would fail due to an ABI issue... but everything points to GCC's own stack not properly supporting this specific usage scenario, which isn't commonly triggered unless forced to go through the olden ways... Anyway, closing this as no fix required in OpenFOAM itself. |
Date Modified | Username | Field | Change |
---|---|---|---|
2022-10-07 16:15 | wyldckat | New Issue | |
2022-10-07 17:48 | tniemi | Note Added: 0012767 | |
2022-10-07 18:12 | wyldckat | Note Added: 0012768 | |
2022-10-07 18:29 | tniemi | Note Added: 0012769 | |
2022-10-07 18:43 | wyldckat | Note Added: 0012770 | |
2022-10-07 19:04 | henry | Note Added: 0012771 | |
2022-10-10 13:55 | wyldckat | Note Added: 0012784 | |
2022-10-11 11:19 | wyldckat | Relationship added | related to 0003887 |
2022-10-11 11:27 | wyldckat | Note Added: 0012787 | |
2022-10-11 11:40 | tniemi | Note Added: 0012789 | |
2022-10-11 12:01 | wyldckat | Note Added: 0012790 | |
2022-10-11 12:23 | wyldckat | Note Added: 0012791 | |
2022-10-11 13:07 | tniemi | Note Added: 0012792 | |
2022-10-13 12:23 | tniemi | Note Added: 0012806 | |
2022-10-26 21:25 | tniemi | Note Added: 0012835 | |
2022-10-27 10:36 | wyldckat | Note Added: 0012836 | |
2022-10-27 10:36 | wyldckat | Assigned To | => wyldckat |
2022-10-27 10:36 | wyldckat | Status | new => closed |
2022-10-27 10:36 | wyldckat | Resolution | open => no change required |