View Issue Details

IDProjectCategoryView StatusLast Update
0003550OpenFOAMPatchpublic2020-09-18 14:54
Reporterwyldckat Assigned Tohenry  
PrioritylowSeverityminorReproducibilitysometimes
Status resolvedResolutionfixed 
Product Versiondev 
Fixed in Versiondev 
Summary0003550: 'malloc' in 'wmkdep.l' assumes memory preset to zero, only one situation where it can cause issues
DescriptionIn the file 'wmake/src/wmkdep.l', these lines of code are risky and can cause issues in some situations:

        sourcePath = (char*)malloc(basePos - sourceFile + 1);
        strncpy(sourcePath, sourceFile, basePos - sourceFile);

In essence, the risk here is that 'sourcePath' gets a partial copy of the path on the left of the source file string, but does not assign the null character at the end of the copy.
Here is an example, namely if:

   sourceFile = "singlePhaseTransportModel/singlePhaseTransportModel.H";

then the result will be:

   sourcePath = "singlePhaseTransportModel?";

the '?' refers to the last byte of the 'sourcePath' char array, which is undefined when using 'strncpy', which quoting from here: http://www.cplusplus.com/reference/cstring/strncpy/

    No null-character is implicitly appended at the end of destination if source is longer than num. Thus, in this case, destination shall not be considered a null terminated C string (reading it as such would overflow).

This is usually not a problem in the majority of situations, at least on Linux, since the majority of situations the memory is already all preset to NULL... however, there are a few instances where this isn't true, even on Linux. In my situation, this was triggered on Windows, which usually initializes RAM with non-NULL values (allegedly to assist in catching memory leaks), which revealed this issue.

The proposed solution is to use the attached changes:

  - 'patch_wmkdep_v1.patch' shows the proposed change, namely to use 'calloc', which does initialize memory to zeros of the new array.

  - 'wmkdep.l' for replacing 'wmake/src/wmkdep.l' in OpenFOAM-dev (commit c4f98e7835ce, Sep 17 10:51:29 2020)

I've tested this on Windows and Linux and it does not have any other issues. All other 'malloc's in this code seem to be being used safely, since they are followed by string copy functions that always copy/include the null termination character.
TagsNo tags attached.

Activities

wyldckat

2020-09-18 13:29

updater  

patch_wmkdep_v1.patch (424 bytes)   
diff --git a/wmake/src/wmkdep.l b/wmake/src/wmkdep.l
index 166a219..b159ee2 100644
--- a/wmake/src/wmkdep.l
+++ b/wmake/src/wmkdep.l
@@ -144,7 +144,7 @@ int main(int argc, char* argv[])
     }
     else
     {
-        sourcePath = (char*)malloc(basePos - sourceFile + 1);
+        sourcePath = (char*)calloc(basePos - sourceFile + 1, sizeof(char*));
         strncpy(sourcePath, sourceFile, basePos - sourceFile);
     }
 
patch_wmkdep_v1.patch (424 bytes)   
wmkdep.l (12,169 bytes)   
%{
/*---------------------------------*- C -*-----------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Copyright (C) 2011-2020 OpenFOAM Foundation
     \\/     M anipulation  |
-------------------------------------------------------------------------------
License
    This file is part of OpenFOAM.

    OpenFOAM is free software: you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    OpenFOAM is distributed in the hope that it will be useful, but WITHOUT
    ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
    FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
    for more details.

    You should have received a copy of the GNU General Public License
    along with OpenFOAM.  If not, see <http://www.gnu.org/licenses/>.

Application
    wmkdep

Description
    A fast dependency list generator that emulates the behaviour and the
    output of cpp -M. However, the output contains no duplicates and
    is approx. 40% faster than cpp.

    The algorithm uses flex to scan the directories specified with '-Idir'
    options for include files and searches the files found.  Each file is
    entered into a hash table so that files are scanned only once.  The
    resulting file paths are added to the dependencies file after replacing
    strings specified with the '-R string replacement' options.

Usage
    wmkdep [ -R string replacement ... -R string replacement ] \
        [ -Idir ... -Idir ] <source file> <dependencies file>

\*---------------------------------------------------------------------------*/

#define HASH_TABLE_SIZE 500
#define REPLACEMENTS_SIZE 10
#define INITIAL_MAX_N_FILES 1000

#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <dirent.h>
#include <errno.h>

/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

/* char* entry in hash table */
struct HashEntry
{
    char* name;
    struct HashEntry* next;
};

/* String search/replace pair */
struct searchReplace
{
    char* search;
    size_t searchLen;
    char* replace;
    size_t replaceLen;
};

int nDirectories = 0;
char** directories;
char* sourceFile = NULL;
char* sourcePath = NULL;
char* depFilePath = NULL;
char* depFileName = NULL;

int nReplacements = 0;
struct searchReplace replacements[REPLACEMENTS_SIZE];

/* Set of files already visited */
struct HashEntry* visitedFiles[HASH_TABLE_SIZE];

/* List of dependency files */
int nFiles = 0;
int currentFile = 0;
int maxNfiles = INITIAL_MAX_N_FILES;
char** files;

/* Current path of the dependency file */
const char* currentPath = NULL;

void nextFile(const char* fileName);
int lookUp(struct HashEntry** hashTable, const char* p);
void addFile(char* filePath);
void openFile(const char* filePath);
char* strRep(char* str, struct searchReplace* sr);
char* substitutePath(char* filePath);
void printFile(FILE* file, const char* filePath);

/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

#undef yywrap        /* Sometimes a macro by default */

%}

%x CMNT CFNAME
%%

"//".*\n                  ;               /* Remove c++ style line comments */

"/*"                      BEGIN(CMNT);    /* Start removing c style comment */
<CMNT>.|\n                ;
<CMNT>"*/"                BEGIN(INITIAL); /* End removing c style comment */

^[ \t]*#[ \t]*include[ \t]+\"   BEGIN(CFNAME);  /* c-file name */
<CFNAME>[^"\n ]*        { BEGIN(INITIAL); nextFile(yytext); } /*"*/

.|\t|\n                   ;

%%


/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

int main(int argc, char* argv[])
{
    if (argc == 1)
    {
        fprintf(stderr, "input file not supplied\n");
        exit(1);
    }

    sourceFile = strdup(argv[argc-2]);
    depFilePath = strdup(argv[argc-1]);

    char* basePos = strrchr(sourceFile, '/');

    if (basePos == NULL)
    {
        sourcePath = strdup(".");
    }
    else
    {
        sourcePath = (char*)calloc(basePos - sourceFile + 1, sizeof(char*));
        strncpy(sourcePath, sourceFile, basePos - sourceFile);
    }

    if (basePos == NULL)
    {
        basePos = sourceFile;
    }
    else
    {
        basePos++;
    }

    char* dotPos = strrchr(sourceFile, '.');
    if (dotPos == NULL || dotPos < basePos)
    {
        fprintf
        (
            stderr,
            "cannot find extension in source file name %s\n",
            sourceFile
        );
        exit(1);
    }

    /* Build list of string replacements */
    int i;
    for (i = 1; i < argc - 1; i++)
    {
        if (strncmp(argv[i], "-R", 2) == 0)
        {
            replacements[nReplacements].search = strdup(argv[++i]);
            replacements[nReplacements].searchLen =
                strlen(replacements[nReplacements].search);

            replacements[nReplacements].replace = strdup(argv[++i]);
            replacements[nReplacements].replaceLen =
                strlen(replacements[nReplacements].replace);

            nReplacements++;
        }
    }

    /* Count number of -I directories */
    nDirectories = 1;
    for (i = 1; i < argc - 1; i++)
    {
        if (strncmp(argv[i], "-I", 2) == 0)
        {
            if (strlen(argv[i]) > 2)
            {
                nDirectories++;
            }
        }
    }

    directories = (char**)malloc(sizeof(char*)*nDirectories);

    // Insert the source directory as the first directory searched
    directories[0] = strdup(sourcePath);

    /* Build list of -I directories */
    nDirectories = 1;
    for (i = 1; i < argc - 1; i++)
    {
        if (strncmp(argv[i], "-I", 2) == 0)
        {
            if (strlen(argv[i]) > 2)
            {
                directories[nDirectories++] = strdup(argv[i] + 2);
            }
        }
    }

    depFileName = substitutePath(strdup(depFilePath));

    /* Initialise storage for the list of the dependencies */
    files = (char**)malloc(sizeof(char*)*maxNfiles);

    openFile(sourceFile);
    yylex();


    /* Write dependencies file */

    FILE* depFile;
    if (!(depFile = fopen(depFilePath, "w")))
    {
        fprintf
        (
            stderr,
            "could not open dependencies file %s "
            "for source file %s due to %s\n",
            depFilePath, sourceFile, strerror(errno)
        );
        exit(1);
    }

    fprintf(depFile, "%s: \\\n", depFileName);
    printFile(depFile, sourceFile);
    for (i = 0; i < nFiles; i++)
    {
        files[i] = substitutePath(files[i]);
        printFile(depFile, files[i]);
    }

    fputs("\n", depFile);

    /* Write the dummy rules for the dependencies */
    for (i = 0; i < nFiles; i++)
    {
        fprintf(depFile, "%s :\n", files[i]);
    }

    fputs("\n", depFile);

    /* Clean-up storage */

    for (i = 0; i < nDirectories; i++)
    {
        free(directories[i]);
    }
    free(directories);

    free(sourceFile);
    free(depFilePath);
    free(depFileName);

    for (i = 0; i < nFiles; i++)
    {
        free(files[i]);
    }
    free(files);

    return 0;
}


/* Add a directory name to the file name */
char* addDirectoryName(const char* dirName, const char* fileName)
{
    char* filePath = (char*)malloc(strlen(dirName) + strlen(fileName) + 2);
    strcpy(filePath, dirName);

    if (dirName[strlen(dirName)-1] != '/')
    {
        strcat(filePath, "/");
    }

    strcat(filePath, fileName);

    return filePath;
}


/* Find path to next file and add it to the list */
void nextFile(const char* fileName)
{
    if (lookUp(visitedFiles, fileName))
    {
        return;   /* Already existed (did not insert) */
    }

    if (access(fileName, R_OK ) != -1)
    {
        addFile(strdup(fileName));
        currentPath = NULL;
    }
    else
    {
        int i;
        for (i = 0; i < nDirectories; i++)
        {
            char* filePath = addDirectoryName(directories[i], fileName);

            if (access(filePath, R_OK ) != -1)
            {
                addFile(filePath);
                currentPath = directories[i];

                return;
            }

            free(filePath);
        }

        if (nDirectories == 0)
        {
            fprintf
            (
                stderr,
                "could not open file %s for source file %s\n",
                fileName, sourceFile
            );
        }
        else
        {
            fprintf
            (
                stderr,
                "could not open file %s for source file %s due to %s\n",
                fileName, sourceFile, strerror(errno)
            );
        }

        /* Only report error on the first occurrence */
        lookUp(visitedFiles, fileName);
    }
}


/* Lookup name in hash table.
   If found - return 1
   If not found - insert in table and return 0
*/
int lookUp(struct HashEntry** hashTable, const char* p)
{
    int ii = 0;
    struct HashEntry* n;
    struct HashEntry* nn;

    /* Hash */
    const char* pp = p;
    while (*pp) ii = ii<<1 ^ *pp++;
    if (ii < 0) ii = -ii;
    ii %= HASH_TABLE_SIZE;

    /* Search */
    for (n = hashTable[ii]; n; n = n->next)
    {
        if (strcmp(p, n->name) == 0)
        {
            /* Entry found so return true */
            return 1;
        }
    }

    /* Insert */
    nn = (struct HashEntry*)malloc(sizeof(struct HashEntry));
    nn->name = strdup(p);
    nn->next = hashTable[ii];
    hashTable[ii] = nn;

    /* Entry not found, and therefore added. return false */
    return 0;
}


/* Add file to list */
void addFile(char* filePath)
{
    if (nFiles == maxNfiles - 1)
    {
        maxNfiles *= 2;
        files = (char**)realloc(files, sizeof(char*)*maxNfiles);
    }

    files[nFiles++] = filePath;
}


/* Open file and set yyin */
void openFile(const char* filePath)
{
    if (!(yyin = fopen(filePath, "r")))
    {
        fprintf
        (
            stderr,
            "could not open file %s for source file %s due to %s\n",
            filePath, sourceFile, strerror(errno)
        );
    }
}


/* String search/replace */
char* strRep(char* str, struct searchReplace* sr)
{
    char* searchStart = strstr(str, sr->search);

    if (searchStart  != NULL)
    {
        if (sr->replaceLen > sr->searchLen)
        {
            const size_t start = str - searchStart;
            str = realloc
            (
                str,
                strlen(str) + sr->replaceLen - sr->searchLen + 1
            );
            searchStart = str + start;
        }

        const size_t tailLen = strlen(searchStart + sr->searchLen);

        memmove
        (
            searchStart + sr->replaceLen,
            searchStart + sr->searchLen,
            tailLen + 1
        );

        memcpy(searchStart, sr->replace, sr->replaceLen);
    }

    return str;
}


/* Substitute path components with command-line replacements */
char* substitutePath(char* filePath)
{
    if (nReplacements)
    {
        int i;
        for (i = 0; i < nReplacements; i++)
        {
            filePath = strRep(filePath, &replacements[i]);
        }
    }

    return filePath;
}


/* Print file path to the dependencies file */
void printFile(FILE* file, const char* filePath)
{
    fprintf(file, "%s \\\n", filePath);
}


/* The lexer calls yywrap to handle EOF conditions */
int yywrap()
{
    /* Close the current file which has just reached EOF */
    fclose(yyin);

    if (currentFile == nFiles) /* There are no more files in the list */
    {
        /* Return 1 to inform lex finish now that all files have been read */
        return 1;
    }
    else
    {
        /* Open the next file on the list */
        openFile(files[currentFile++]);

        /* Return to the normal state for the next file */
        BEGIN(INITIAL);

        /* Return 0 to inform lex to continue reading */
        return 0;
    }
}


/*****************************************************************************/
wmkdep.l (12,169 bytes)   

henry

2020-09-18 13:36

manager   ~0011515

Using the the fact that calloc initialises the memory is a bit of a hack, wouldn't it be better to set the null character explicitly so that the intent is clear in the code rather than using a side-effect?

wyldckat

2020-09-18 13:42

updater   ~0011516

That's what I had initially thought of proposing, namely:

   sourcePath[basePos - sourceFile] = 0;

but then this felt more like a hack than using calloc... even though calloc likely ends up being slower.

There is also the function 'strncpy_s', which is standard as of C11: https://en.cppreference.com/w/c/string/byte/strncpy - quoting:

    Although truncation to fit the destination buffer is a security risk and therefore a runtime constraints violation for strncpy_s, it is possible to get the truncating behavior by specifying count equal to the size of the destination array minus one: it will copy the first count bytes and append the null terminator as always: strncpy_s(dst, sizeof dst, src, (sizeof dst)-1);

Although its usage is a bit more cumbersome to the eye and wasn't standard years ago...

henry

2020-09-18 13:56

manager   ~0011517

sourcePath[basePos - sourceFile] = '\0';

would be even more explicit, see

commit 067ab99f0fc238adca948b39687e4301b57c10d4

wyldckat

2020-09-18 14:54

updater   ~0011518

@Henry: Perfect, I've tested it on Windows and works as intended.

Marking this issue as resolved in the commit above.

PS: I'm getting rusty... I had forgotten about '\0' being the null character.

Issue History

Date Modified Username Field Change
2020-09-18 13:29 wyldckat New Issue
2020-09-18 13:29 wyldckat File Added: patch_wmkdep_v1.patch
2020-09-18 13:29 wyldckat File Added: wmkdep.l
2020-09-18 13:36 henry Note Added: 0011515
2020-09-18 13:42 wyldckat Note Added: 0011516
2020-09-18 13:56 henry Note Added: 0011517
2020-09-18 14:54 wyldckat Assigned To => henry
2020-09-18 14:54 wyldckat Status new => resolved
2020-09-18 14:54 wyldckat Resolution open => fixed
2020-09-18 14:54 wyldckat Fixed in Version => dev
2020-09-18 14:54 wyldckat Note Added: 0011518