Thursday, October 11, 2018

A trivial note on std::vector::erase()

Recently, I ran into a problem with the std::vector::erase() function. Consider the code below.

#include <vector>
#include <string>
#include <iostream>

struct SpookyStruct
{
    int k = 10;
    std::string blank = "blank";

    void operator=(const SpookyStruct& rhs)
    {
        k = rhs.k;
        blank = blank;
    }
};

std::vector<SpookyStruct> spookyVector =
{
    {1, "This"},
    {2, "is"},
    {3, "Disaster"},
    {4, "You"},
    {5, "Know"}
};

int main(int argc, char** argv)
{
    for (auto entry = spookyVector.begin(); entry != spookyVector.end(); ++entry)
    {
        std::cout << "Address of iterator before deletion : " << &(*entry) << std::endl;
    }

    for (auto entry = spookyVector.begin(); entry != spookyVector.end();)
    {
        if (entry->k % 2)
        {
            ++entry;
            continue;
        }

        entry = spookyVector.erase(entry);
    }

    for (auto entry = spookyVector.begin(); entry != spookyVector.end(); ++entry)
    {
        std::cout << "Address of iterator after deletion : " << &(*entry) << std::endl;
    }

    for (const auto& oddOnly : spookyVector)
    {
        std::cout << oddOnly.k << std::endl;
        std::cout << oddOnly.blank << std::endl;
        std::cout << std::endl;
    }

    std::cout << spookyVector.capacity() << std::endl;
}

What output should we expect? Let's run the program.

nanda@nanda-MS-7640:/media/nanda/PERSONAL/c14_code/vec_erase$ g++ -std=c++14 SpookyVector.cpp 
nanda@nanda-MS-7640:/media/nanda/PERSONAL/c14_code/vec_erase$ ./a.out 

1
This

3
is

5
Disaster

Yes, the result is a disaster. This is not the output we are expecting. So what's wrong with the code?

void operator=(const SpookyStruct& rhs)
{
    k = rhs.k;
    blank = blank;
}

Yes! The second line assigns blank to itself instead of copying rhs.blank. It is a trivial bug, but it can be hard to debug if you don't know what happens inside the erase function.

What is happening during erase?

You know that one overload of erase removes the element at the given position and returns an iterator to the element that follows it. The erased slot itself is not freed! Instead, vector::erase shifts every following element one position towards the front and then destroys the last element. Each shift goes through the element's assignment operator. Because SpookyStruct declares its own copy assignment operator, the compiler generates no move assignment, so every shift uses our overload. Here is the catch: the assignment operator must copy every member correctly, or else the results are absurd and hard to debug quickly. Our assignment implementation copies k but not blank!

Let's look at the element addresses before and after deletion.

Address of iterator before deletion : 0x55586894be70
Address of iterator before deletion : 0x55586894be98
Address of iterator before deletion : 0x55586894bec0
Address of iterator before deletion : 0x55586894bee8
Address of iterator before deletion : 0x55586894bf10

Address of iterator after deletion : 0x55586894be70
Address of iterator after deletion : 0x55586894be98
Address of iterator after deletion : 0x55586894bec0

Yes, the addresses remain the same because the trailing elements are copied one position up. Conceptually, this is as simple as

copyElements(deletedElemPtr, nextElemPtr, nextElemPtr + numofElementsFromNextElementPtr);

During the copy, the assignment operator is invoked, and our assignment operator has buggy logic. It keeps the "blank" of the overwritten slot instead of taking the one from the incoming element.

The Details!

First the 2nd element is erased and the 3rd element moves to the 2nd position. Its key arrives but its string does not, so the 2nd position now holds key 3 with the string "is". Similarly the 4th element moves to the 3rd position, whose string stays "Disaster", while the 5th moves to the 4th position with the string "You". In the next iteration the 3rd element (key 4) is erased, so the 4th moves to the 3rd position and its string becomes "Disaster" again [both in the output and for the coder :-) ]

What's the capacity of vector at the end of exercise?

5

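Yes, it's still 5. erase does not release memory: the vector keeps its storage on the assumption that the space will be filled again soon. This is an optimization strategy employed by the library writers. If you want the memory back, you can ask with shrink_to_fit(), although that request is non-binding.

After correcting the assignment operator, we come to the end of the disaster and it's time to recoup!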

void operator=(const SpookyStruct& rhs)
{
    k = rhs.k;
    blank = rhs.blank;
}

Here is the desired output:

1
This

3
Disaster

5
Know


Hope you learned something today!

PS:

Yeah, if you define a copy assignment operator, you should normally define a copy constructor as well (the rule of three). Leaving it out is usually a bug, but it's not a bug here. I just left it out for brevity.

Wednesday, September 19, 2018

The delete operator rant

I have seen many instances of the code below (assuming an older standard, before nullptr).

if (myAllocatedObject != NULL)
{
    delete myAllocatedObject;
}

This is absolutely ridiculous. The C++ standard guarantees that delete on a null pointer is harmless and has no effect. Of course, a non-null pointer passed to delete must have come from new. Releasing memory with delete when it was not allocated by new is a bug (undefined behavior). This in fact applies to every allocator/deallocator combination.

If you use your own allocator, you must supply the matching deallocator as well. It would be absurd to supply only the allocator.

In conclusion, the redundant NULL check is worthless. The code also looks downright ugly.

How does libstdc++ behave?

Peeking into the source tree, you find this in del_op.cc

_GLIBCXX_WEAK_DEFINITION void
operator delete(void* ptr) _GLIBCXX_USE_NOEXCEPT
{
     std::free(ptr);
}

And std::free leads to <cstdlib>, which in turn includes glibc's stdlib.h, and from there to

void
__libc_free (void *mem)


if (mem == 0)                              /* free(0) has no effect */
    return; 

Phew! We landed and also freed safely :-)

PS: I have heard that a few static analyzers complain if you don't wrap the delete in a NULL check. I cannot vouch for this as of now, since I have never come across such an instance.

Saturday, April 14, 2018

C++ : Tampering with the private class variables

I have a lot of things to write about in the C++/OS/network arena. I don't feel the urge to write until I have gained clarity on the subject. Thanks for waiting! Hopefully the silence is now broken and more technical topics will follow in the coming days.
For the past 3 weeks, I have been intensely pondering rvalue/lvalue semantics, and to a large extent I could comprehend them. In the process of learning, I accidentally wrote the code below. Hey! You cannot modify a private member from outside the class in C++, but we can trick our way in via pointers. If you argue that C++ has references instead, take a look at the code below.


#include <iostream>
#include <string>

class PrivateTest
{
    public:
        PrivateTest() = default;
        std::string& getMessage() { return message; }

    private:
        std::string message = "Nanda";
};

int main()
{
    PrivateTest test;

    std::cout << test.getMessage() << std::endl;
    std::string& corrupted = test.getMessage();
    corrupted = "gotcha";
    std::cout << test.getMessage() << std::endl;
}


Let's compile and execute!

nandakumar@heramba ~ $ g++ -std=c++14 PrivateTest.cpp
nandakumar@heramba ~ $ ./a.out
Nanda
gotcha


The private member is exposed :-). Perhaps this is an example of bad/vulnerable C++ class design. The encapsulation needs to be stronger so that private access cannot be bypassed: getMessage() hands out a non-const reference to the member. The code is much like the pointer version, with the pointer replaced by a reference. Ultimately, both boil down to an address in the generated assembly. The distinction between pointers and references exists only at the compiler level and has little significance at the assembly level.

This is nothing new, but for a few coders the revelation comes more slowly ;-)