C++ Async Development (not only) for C# Developers Part II: Lambda functions – the most common pitfalls

Warning: Some examples in this post belong to the “don’t do this at home” category.

In the first part of C++ Async Development (not only) for C# Developers we looked at lambda functions in C++. They are cool, but as you can probably guess there are some “gotchas” you need to be aware of to avoid undefined behavior and crashes. Like many problems in C++, these are related to object lifetimes – a variable being used after it has already been destroyed. For instance, what do you think will be printed (if anything) if you run the following code:

#include <functional>
#include <iostream>

std::function<int()> get_func()
{
    auto x = 42;

    // BUG (intentional): x is captured by reference and dies when get_func() returns
    return [&x]() { return x; };
}

int main()
{
    std::cout << get_func()() << std::endl;
    return 0;
}

?

On my box it prints -858993460, but it could print anything, or it could just as well crash. I believe a crash would actually have been better, because then I would not have carried on with a garbage result. The reason is simple. The variable x was captured by reference, meaning the code inside the lambda referred to the actual local variable. The lambda was returned from the get_func() function, which extended the lifetime of the closure. However, the local variable x went out of scope when we left get_func(), so the reference captured in the closure no longer referred to x – it referred to a location in memory that had already been reclaimed and possibly reused. A variation of this problem can also be encountered when working with classes. Take a look at this code:

#include <functional>
#include <iostream>

class widget
{
public:
    std::function<int()> get_func()
    {
        // this is captured by value, i.e. the lambda stores a copy of the pointer
        return [this]() { return m_x; };
    }

private:
    int m_x = 42;
};

std::function<int()> get_func()
{
    // the temporary widget is destroyed here, leaving the captured this dangling
    return widget{}.get_func();
}

int main()
{
    std::cout << get_func()() << std::endl;
    return 0;
}

It basically has the same flaw as the previous example. As you probably remember from the post about lambdas, the this pointer is always captured by value. So, the lambda captures a copy of the this pointer to be able to access class members, but by the time the lambda is actually invoked the object is gone, so the pointer points to God knows what…
Now let’s try a little game. Find one difference between this class:

class widget
{
public:
    std::function<int()> get_func()
    {
        return [=]() { return m_x; };
    }

private:
    int m_x = 42;
};

and the class from the previous example. If you have not found it yet, look at how we capture variables. Before, we captured the this pointer explicitly; now we don’t. So we should be good, right? Not so fast. If you look at the lambda body you will see that we are still accessing the m_x variable. This is a class variable, and class variables cannot be accessed without an instance of the class they belong to. So, what happened is that when we captured variables implicitly, the compiler captured the this pointer because we used a class variable in the lambda body. Since the this pointer is always captured by value, the new class has the same problem as the previous one. To make things even more interesting, the result would be exactly the same if we captured variables implicitly by reference (i.e. if we used [&] to capture variables). This is because the compiler notices that we use class variables in our lambda and therefore the this pointer needs to be captured. However, the this pointer is always captured by value (notice that [&this] won’t compile), so even though we used [&] to request capturing variables by reference, the this pointer was still captured by value.
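Here is a minimal sketch of that trap (the widget class mirrors the one above, and the broken call in main is intentional):

#include <functional>
#include <iostream>

class widget
{
public:
    std::function<int()> get_func()
    {
        // [&] looks like "capture everything by reference", but m_x is reached
        // through this, and this is always captured by value
        return [&]() { return m_x; };
    }

private:
    int m_x = 42;
};

int main()
{
    // the temporary widget dies at the end of this statement, so f holds a
    // dangling this pointer - the same bug as with [this] or [=]
    auto f = widget{}.get_func();
    std::cout << f() << std::endl; // undefined behavior
    return 0;
}
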
If you think that the scenarios shown above rarely happen in real life you might be right. However the “real life” is a bit different when doing things asynchronously. In this case lambdas represent tasks that are to be run sometime in the future. In general you have little or no control over how or when these tasks will be run (there are some advanced settings you can pass when scheduling a task which may give you some control – I have not investigated them yet). Basically, when a thread becomes available a scheduled task will be run on this thread. This means that the task (i.e. the lambda) will run completely independently on the code that scheduled it and there is no guarantees that the variables captured when creating a task will be around when the task actually runs. As a result you can encounter scenarios similar to the ones I described above quite often. So, what to do then? Over the past few months I developed a few rules and patterns which are helpful in these situations.
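To make this concrete, here is a sketch using std::thread as a stand-in for a task scheduler (the standard library is used here only to illustrate the lifetime problem – and again, don’t do this at home):

#include <chrono>
#include <iostream>
#include <thread>

void schedule_work()
{
    auto x = 42;

    // the "task" captures x by reference and runs at some later point...
    std::thread([&x]()
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        std::cout << x << std::endl; // ...by which time x is long gone
    }).detach();

    // schedule_work() returns immediately and x is destroyed here
}

int main()
{
    schedule_work();
    std::this_thread::sleep_for(std::chrono::seconds(1)); // let the detached thread finish
    return 0;
}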

Do not capture implicitly
This is probably a little controversial, but I like to capture variables explicitly by listing them in the capture list. This prevents me from introducing a bug where I capture the this pointer implicitly and use it even though I have no guarantee that it will still be valid when the lambda is invoked. Unfortunately, this rule has one drawback – sometimes after refactoring the code I forget to remove from the capture list a variable that is no longer used in the lambda. In general, if you understand when not to use the this pointer in your lambda, you should be fine capturing implicitly.
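A small sketch of the explicit style (make_adder is just a made-up example):

#include <functional>

std::function<int()> make_adder(int a, int b)
{
    // only a and b can be used inside the lambda, and a stray this cannot sneak in;
    // the drawback mentioned above: if a refactoring stops using b in the body,
    // it is easy to forget to remove it from the capture list
    return [a, b]() { return a + b; };
}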

Try to capture variables by value if possible/feasible (if you can’t guarantee the lifetime)
If you look at the first case, we captured the local variable x by reference, but it was not necessary. Had we captured it by value we would have avoided all the problems. For instance, we could do this:

std::function<int()> get_func()
{
    auto x = 42;

    return [x]() { return x; };
}

Sometimes you may want to capture by reference purely to avoid copying. That is a valid reason, and it is fine as long as you can guarantee that the variable captured in the closure outlives the closure itself. Otherwise it is better to be slower and correct than faster but returning invalid results or crashing.
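For example, capturing by reference is perfectly fine when the closure never outlives the current scope, as in this sketch (count_above is a made-up helper):

#include <algorithm>
#include <vector>

int count_above(const std::vector<int>& values, int threshold)
{
    // threshold outlives the closure because the closure never leaves this function
    return static_cast<int>(
        std::count_if(values.begin(), values.end(),
                      [&threshold](int v) { return v > threshold; }));
}
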
Capturing class variables is no different – if you can capture them by value, do it. In the second case we could change the code as follows:

class widget
{
public:
    std::function<int()> get_func()
    {
        auto x = m_x;
        return [x]() { return x; };
    }

private:
    int m_x = 42;
};

Now, instead of capturing the this pointer, we copy the value of the class variable to a local variable and capture that local by value.
This works in simple cases where we don’t modify the value in the lambda and don’t expect the variable to be modified between the time it was captured and the time the lambda is invoked. Often, however, this is not enough – we want to be able to modify the value of the variable captured in the closure and to observe the new value. What do we do in these cases?
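One thing that does not help here (a side note of mine, not part of the original example) is marking the lambda mutable – that only lets the lambda modify its own copy of the captured value, which the outside world never sees:

#include <iostream>

int main()
{
    auto x = 42;
    auto f = [x]() mutable { return ++x; }; // increments the closure's own copy

    std::cout << f() << std::endl; // 43
    std::cout << f() << std::endl; // 44
    std::cout << x << std::endl;   // still 42 - the original variable is untouched
    return 0;
}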

Use (smart) pointers
The problems outlined above were caused by automatic variables whose lifetime is tied to their scope. Once they go out of scope they are destroyed and must no longer be used. However, if we create the variable on the heap, we control its lifetime. Since this is no longer 1998 we will not use raw pointers but smart pointers. Actually, there is an even more important reason not to use raw pointers – since we allocated the memory, we have to release it. In simple cases this may not be difficult: you just delete the variable in the lambda and you are done. More often than not it is not that simple – you may want to capture the same pointer in multiple closures, someone else may still want to use it after the lambda was invoked, you may want to invoke the lambda multiple times, etc. In these cases the simplistic approach won’t work, and even if the code originally worked correctly, it is very easy to inadvertently introduce issues like these later during maintenance. Using smart pointers makes life much easier and safer. So, yes, we will use smart pointers. For local variables it is as easy as:

std::function<int()> get_func()
{
    auto x = std::make_shared<int>(42);

    // the shared_ptr is captured by value, so the closure shares ownership of the int
    return [x]() { return *x; };
}

Notice that we create a shared_ptr variable and capture it by value. When a shared_ptr is passed by value, its internal ref count is incremented, which prevents the value the pointer points to from being deleted. When the pointer goes out of scope the ref count is decremented, and once the ref count reaches 0 the pointer deletes the value it tracks. In our case we create a closure (the ref count is incremented) and return it to the caller. The caller may extend its lifetime (e.g. by returning it to its own caller) or may just invoke it and let it go out of scope. One way or another the closure will eventually go out of scope, and when this happens the ref count will be decremented and – if no one else is using the shared_ptr – the memory will be released. One thing worth noting is how nicely shared_ptrs play with the previous rule about capturing variables by value.
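Here is a small sketch (my own illustration) of the ref count at work when a closure captures a shared_ptr by value:

#include <functional>
#include <iostream>
#include <memory>

int main()
{
    auto x = std::make_shared<int>(42);
    std::cout << x.use_count() << std::endl; // 1 - only the local x

    std::function<int()> f = [x]() { return *x; };
    std::cout << x.use_count() << std::endl; // 2 - the closure stored in f holds a copy

    x.reset();                     // the local gives up its reference...
    std::cout << f() << std::endl; // ...but the closure still keeps the int alive
    return 0;
}
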
We can also use shared_ptrs to capture the current instance of a class as a substitute for capturing the this pointer. It is, however, a bit more involved than it was for local variables. First, we cannot just do shared_ptr(this). Doing so could create multiple independent shared_ptr instances for a single object (e.g. when a function creating a shared_ptr this way is called multiple times), which would lead to invoking the destructor of the tracked object multiple times – not only undefined behavior, but also a source of dangling pointers. (Note that this is not specific to the this pointer; it applies to any raw pointer.) The way around it is to derive the class from the enable_shared_from_this<T> type. Unfortunately, classes derived from enable_shared_from_this<T> are not very safe, and using them incorrectly may put you into the realm of undefined behavior very quickly. The biggest issue is that to be able to create a shared_ptr from the this pointer of a class derived from enable_shared_from_this<T>, the instance must already be owned by a shared_ptr. The easiest way to enforce this is probably to not let the user create instances of the class directly – make all the constructors (including the ones generated by the compiler) private and expose a static factory method that creates a new instance and returns a shared_ptr owning it. This is how code capturing a shared_ptr tracking this looks:

#include <functional>
#include <iostream>
#include <memory>

class widget : public std::enable_shared_from_this<widget>
{
public:

    // the only way to create a widget - guarantees the instance is owned by a shared_ptr
    static std::shared_ptr<widget> create()
    {
        return std::shared_ptr<widget>(new widget);
    }

    std::function<int()> get_func()
    {
        // w shares ownership with the shared_ptr created in create()
        auto w = shared_from_this();

        return [w]() { return w->m_x; };
    }

private:

    widget()
    { }

    widget(const widget&) = delete;

    int m_x = 42;
};

std::function<int()> get_func()
{
    // the closure keeps the widget alive even after the shared_ptr returned
    // by create() goes out of scope
    return widget::create()->get_func();
}

int main()
{
    std::cout << get_func()() << std::endl;
    return 0;
}

A word of precaution. One general thing you need to be aware of when using shared_ptrs is circular dependencies. If you create a circular dependency between two (or more) shared_ptrs, their ref counts will never reach 0 because each shared_ptr is always being referenced by another shared_ptr. As a result, the instances the shared_ptrs point to will never be released, causing a memory leak. If you cannot avoid a dependency like this, you can replace one of the shared_ptrs with a weak_ptr, which will break the cycle. weak_ptrs, similarly to shared_ptrs, can be passed by value, so they are easy to use with lambda functions. If you have a weak_ptr and you need a shared_ptr, you just call the weak_ptr::lock() function. Note that you need to check the value returned by weak_ptr::lock(), since it will return an empty shared_ptr if the instance has already been deleted because all the strong references were removed in the meantime.
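Here is a sketch of that pattern – a closure capturing a weak_ptr by value and checking the result of weak_ptr::lock() before using it:

#include <functional>
#include <iostream>
#include <memory>

int main()
{
    auto value = std::make_shared<int>(42);
    std::weak_ptr<int> weak = value;

    // the closure captures the weak_ptr by value, so it does not keep the int alive
    std::function<void()> print = [weak]()
    {
        if (auto strong = weak.lock()) // empty shared_ptr if the int is already gone
        {
            std::cout << *strong << std::endl;
        }
        else
        {
            std::cout << "value is gone" << std::endl;
        }
    };

    print();        // prints 42 - the shared_ptr still owns the int
    value.reset();  // the last strong reference is released, the int is destroyed
    print();        // prints "value is gone"
    return 0;
}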

This post was quite long and heavy, but knowing this stuff will be very helpful later when we dive into actual async programming, which I think we will start looking at in the next part.


