Thursday, October 11, 2018

A trivial note on Vector.erase()

Recently, I encountered a problem with Vector.erase() function. Consider the below code.

#include <vector>
#include <iostream>

struct SpookyStruct
{
    int k = 10;
    std::string blank = "blank";

    void operator=(const SpookyStruct& rhs)
    {
        k = rhs.k;
        blank = blank;
    }
};

std::vector<SpookyStruct> spookyVector =
{
    {1, "This"},
    {2, "is"},
    {3, "Disaster"},
    {4, "You"},
    {5, "Know"}
};

int main(int argc, char** argv)
{
    for (auto entry = spookyVector.begin(); entry != spookyVector.end(); ++entry)
    {
        std::cout << "Address of iterator before deletion : " << &(*entry) << std::endl;
    }

    for (auto entry = spookyVector.begin(); entry != spookyVector.end();)
    {
        if (entry->k % 2)
        {
            ++entry;
            continue;
        }

        entry = spookyVector.erase(entry);
    }

    for (auto entry = spookyVector.begin(); entry != spookyVector.end(); ++entry)
    {
        std::cout << "Address of iterator after deletion : " << &(*entry) << std::endl;
    }

    for (const auto& oddOnly : spookyVector)
    {
        std::cout << oddOnly.k << std::endl;
        std::cout << oddOnly.blank << std::endl;
        std::cout << std::endl;
    }

    std::cout << spookyVector.capacity() << std::endl;
}

What should be expected output? Let's run the program

nanda@nanda-MS-7640:/media/nanda/PERSONAL/c14_code/vec_erase$ g++ -std=c++14 SpookyVector.cpp 
nanda@nanda-MS-7640:/media/nanda/PERSONAL/c14_code/vec_erase$ ./a.out 

1
This

3
is

5
Disaster

Yes, the result is a disaster. This is not the output we are expecting. So what's wrong with the code?

void operator=(const SpookyStruct& rhs)
 {
    k = rhs.k;
    blank = blank;
}

Yes! It is a trivial bug but sometimes hard to debug if you don't know what is happening inside erase function.

What is happening during erase?

You know that one of the overloaded erase deletes the data at the given position and returns the next valid location in vector. Basically the data is not deleted but invalidated! The vector.erase invalidates the current entry and copies the following entries to the top. In the process, the object's assignment overloaded operator invoked if present. Here is catch, we must be careful to write the accurate assignment overload. Or else, the results could be absurd and hard to debug fast. Our assignment implementation is partially tainted!

Lets look at the iterator address before and after deletion.

Address of iterator before deletion : 0x55586894be70
Address of iterator before deletion : 0x55586894be98
Address of iterator before deletion : 0x55586894bec0
Address of iterator before deletion : 0x55586894bee8
Address of iterator before deletion : 0x55586894bf10

Address of iterator after deletion : 0x55586894be70
Address of iterator after deletion : 0x55586894be98
Address of iterator after deletion : 0x55586894bec0

Yes it remains same because, the trailing elements are copied to one level up. This should be simple as

copyElements(deletedElemPtr, nextElemPtr, nextElemPtr + numofElementsFromNextElementPtr);
In process of copy, the assignment operator is invoked and our assignment operator has buggy logic. So it copies the "blank" from invalidated entry than the actual entry.
The Details!
Here 2nd element is deleted and 3rd element takes second position. Hence the string value of 3rd element overridden as "is". Similarly 4th element takes 3rd position whose string becomes "Disaster" while 5th becomes fourth with string value as "You". In next iteration, 3rd element is deleted and hence 4th moves to 3rd in which case, string value becomes "Disaster" again [both the output and for coder :-) ]

What's the capacity of vector at the end of exercise?

5

Yes, its still 5. The memory is not deleted assuming that space would be filled soon! This is optimization strategy employed by the library writers.

After correcting the assignment operator, we come to end of disaster and its time to recoup!

void operator=(const SpookyStruct& rhs)
{
    k = rhs.k;
    blank = rhs.blank;
}

Here is desired output:

1
This

3
Disaster

5
Know


Hope you learned something today!

PS:

Yeah, if you have copy assignment, you need to have copy constructor as well. That's bug in standard. But its not bug here. I just left it out for brevity.