Asynchrony in C++

2025-12-24

With the Execution control library on track to become part of the C++26 standard via P2300, this felt like the right moment to finally sit down and understand it properly.

At a high level, the Execution library introduces a set of abstractions—senders, receivers, schedulers, and operation states—for describing how work and data flow through a program. These descriptions can then be executed asynchronously by an appropriate execution context. While that characterization is technically correct, it is also abstract enough to be difficult to internalize from the specification alone. Rather than reason about the model purely on paper, I wanted to understand how it behaves when used to build something concrete. To that end, I decided to implement a simple, single-threaded HTTP server capable of handling multiple connections asynchronously.

An HTTP server is a natural fit for this kind of exploration. It is predominantly I/O-bound, heavily state-driven, and requires careful handling of lifetimes, cancellation, and backpressure—exactly the areas the Execution library is designed to address. There is also prior art in this space: several excellent talks demonstrate how the Senders/Receivers model can be applied to HTTP servers, including Dietmar Kühl’s ACCU and MUC++ presentations, which were invaluable references while working through this project.

Along the way, I took a few detours. In particular, I spent time understanding Linux’s io_uring interface for asynchronous I/O, and how it fits into a modern C++ execution model. I also explored how C++20 coroutines can be layered on top of senders to express asynchronous logic in a more linear, readable style.

The site you are currently reading is served by this HTTP server.

Before diving into implementation, however, I needed a solid grasp of the Execution library’s core abstractions. Reading P2300 or the draft standard end-to-end proved overwhelming, so I instead relied on the following resources, which I found especially effective at building intuition:

This blog documents what I learned while turning those abstractions into a working system, and what became clearer or more subtle once they were exercised in real code.

Status of `std::execution`

As of December 2025, the Execution Control library hasn't made its way into either libstdc++ or libc++, the two most widely used C++ standard library implementations. However, the reference implementation, NVIDIA/Eric Niebler's stdexec, aims to provide a conforming implementation of the standard library's execution components. We'll be using this library for our HTTP server implementation.

Show me the code: Code walkthrough of the core components of the HTTP server

You can find the complete source code for this blog post in stdexec-server.

The main function

This is how the HTTP server's main function is structured:


int main()
{
    uint16_t port{8080};   // Listen on port 8080
    std::string root{"."}; // Serve files from current directory
    UringContext uring_ctx = UringContext(1024);
    auto server = Server(port, root);
    uring_ctx.run(listen(server, uring_ctx));
}

This should look fairly straightforward. We set up some configuration parameters, create a UringContext, which will be our asynchronous I/O context, create a Server object that represents our HTTP server, and finally call uring_ctx.run(...) to start the server's event loop. But wait, where is the asynchronous code? The listen(...) function is where the magic happens. It sets up the server to listen for incoming connections and handle them asynchronously using the Execution library's abstractions.

UringContext is a thin wrapper around the APIs provided by Jens Axboe's liburing, which offers a higher-level interface to io_uring, a Linux kernel interface for asynchronous I/O operations. io_uring is one of several asynchronous I/O interfaces that the Linux kernel provides. It tries to address the shortcomings of older interfaces like epoll and aio by providing a more efficient and flexible way to perform asynchronous I/O. For a simple application like this toy HTTP server, io_uring might be overkill, but I wanted to use it to explore how the Execution library can work with modern asynchronous I/O interfaces.

The Server struct has a few members that store the server's configuration and state, such as the listening socket and the root directory for serving files.

The listen(...) function is where the core logic of the server resides. It sets up a loop that continuously accepts incoming connections and handles them asynchronously. I previously called listen a function. That isn't quite right, but from the call site alone, without the full implementation, one might assume it is. In reality, listen is a coroutine. Coroutines are a powerful feature added in C++20 that lets you write asynchronous code in a more synchronous style. They generalize functions in that they can be suspended and resumed. Coroutines in C++ are a complex topic; Lewis Baker has a great series of blog posts that explain them in depth and help build intuition around them. Writing custom coroutine types is already a non-trivial undertaking, and manually adapting them to work with stdexec would add significant additional complexity. Fortunately, stdexec provides the necessary machinery to bridge this gap.

The `listen(...)` coroutine

Here's the listen(...) function/coroutine:

kev::task<void> listen(Server &server, UringContext &uring_ctx)
{
    exec::async_scope scope{};
    while (true)
    {
        RawFileDescriptor fd{-1};
        try
        {
            fd = co_await uring_ctx.async_accept(server.m_server_fd.get());
        }
        catch (const std::exception &e)
        {
            std::println(std::cerr, "Error accepting connection: {}", e.what());
            break;
        }
        auto client_pipeline = 
          handle_connection_coroutine(FileDescriptor(fd),
                                      server.m_context,
                                      uring_ctx) |
            stdexec::upon_error([](std::exception_ptr ptr) {
              try
              {
                  if (ptr)
                  {
                    std::rethrow_exception(ptr);
                  }
              }
              catch (const std::exception &e)
              {
                std::println(std::cerr, "Error in client connection: {}", e.what());
              }
            });
        scope.spawn(std::move(client_pipeline));
    }
}

Let's break down what's happening here. The giveaway that this is a coroutine is the co_await keyword. This indicates that execution can be suspended at that point and then resumed later.

We first set up an exec::async_scope, which allows us to manage the lifetime of the multiple asynchronous operations we spawn. We'll dig into the details of how this works later.

Inside the infinite loop we co_await uring_ctx.async_accept(...) to accept incoming connections. This function is also a coroutine that performs the asynchronous accept operation using io_uring. When the event loop resumes the coroutine, the accepted file descriptor is returned as the result of the co_await expression and assigned to fd. Note that if an error occurs during the accept operation, an exception is thrown, which we catch and log before breaking out of the loop to stop the server.

Once we have a valid file descriptor for the accepted connection, we "call" a new coroutine, handle_connection_coroutine(...), to handle the client connection. This coroutine is responsible for reading the HTTP request, processing it, and sending back the appropriate response. We also attach an error handler using stdexec::upon_error(...) to log any exceptions that might occur during the handling of the connection. The full declaration of handle_connection_coroutine is:

kev::task<void> handle_connection_coroutine(
    FileDescriptor client_fd,
    const ServerContext &server_context,
    UringContext &uring_ctx);

handle_connection_coroutine has the return type kev::task<void>. This return type is special in that it is both awaitable and conforms to the Sender concept from the stdexec library. This means that it can be composed with other senders, allowing us to build complex asynchronous workflows in a modular way. In our case we use the pipe operator (|) to attach an error handler to the coroutine (or sender), which will be invoked if any exceptions are thrown during its execution. If any client were to error out, the error handler would log the error message to standard error but would not affect the handling of other clients.

Another point is that the underlying promise_type of kev::task<void> returns std::suspend_always from its initial suspension point. This means that the coroutine is lazily evaluated and will not start executing immediately. This is explained really well in CppCon 2016: James McNellis “Introduction to C++ Coroutines”.

Finally, we spawn the client pipeline coroutine within the async_scope, allowing it to run concurrently with other client connections.

Using coroutines in conjunction with async_scope simplifies the management of multiple concurrent operations. The async_scope tracks every operation spawned into it, so its owner can wait for all of them to complete (for example by awaiting the scope's on_empty() sender) before the scope is destroyed, preventing resource leaks or dangling operations. This is particularly useful in our HTTP server scenario, where multiple client connections can be active simultaneously. See async_scope – Creating scopes for non-sequential concurrency.

The first version of this implementation didn't use coroutines at all. I was constructing sender pipelines and firing them off using `stdexec::start_detached(...)`. While this approach worked, managing the lifetimes of the various objects quickly became unwieldy. The only clear-cut solution at the time was to resort to std::shared_ptr for shared state, which felt inelegant. Switching to coroutines made the code much cleaner and easier to reason about, because local objects live in the coroutine frame and are released when the coroutine finishes.

The `handle_connection_coroutine(...)` coroutine

kev::task<void> handle_connection_coroutine(FileDescriptor client_fd, ServerContext &ctx, UringContext &uring_ctx)
{
    // client lives in the coroutine frame, so its lifetime is managed automatically.
    // Passing it by reference to read_requests and handle_requests is safe because
    // both are awaited from within this coroutine and thus cannot outlive it.
    auto client = Client(std::move(client_fd), uring_ctx);
    while (!client.m_disconnect_requested)
    {
        RequestList requests = co_await read_requests(client);
        co_await handle_requests(client, ctx, std::move(requests));
    }
}

The handle_connection_coroutine(...) coroutine is responsible for reading the HTTP request, processing it, and sending back the appropriate response.

We first create a Client object, which represents the client connection. Its constructor takes ownership of the file descriptor and a reference to the UringContext, which is used to perform asynchronous I/O operations.

We then loop until the client disconnects. On each iteration we read one or more HTTP requests, process them, and send back the appropriate responses.

The read_requests(...) coroutine is responsible for reading the HTTP request from the client.

The handle_requests(...) coroutine is responsible for processing the HTTP request and sending back the appropriate response.

If at any stage an exception is thrown, it will be caught and logged to standard error output. This is done using the stdexec::upon_error(...) operator that's attached to the client pipeline coroutine in the listen(...) coroutine.

The `read_requests(...)` coroutine

kev::task<RequestList> read_requests(Client &client)
{
    using namespace stdexec;

    auto &read_buffer = client.m_buffer;
    while (client.m_parser.get_number_of_completed_requests() == 0 && !client.m_disconnect_requested)
    {
        size_t bytes_read = co_await client.m_uring_ctx.async_read(client.m_fd.get(),
                                                                   std::span(read_buffer.data(), read_buffer.size()));
        if (bytes_read == 0)
        {
            client.m_disconnect_requested = true;
            break;
        }

        auto const data = std::span(read_buffer.data(), bytes_read);
        auto result = client.m_parser.parse(data);
        if (!result.has_value())
        {
            std::println(std::cerr, "Error parsing HTTP request. Message: {}", result.error());
            std::println("Resetting parser state.");
            client.m_parser.reset();
            throw std::runtime_error("Error parsing HTTP request");
        }
    }
    RequestList requests;
    client.m_parser.get_completed_requests(requests);
    co_return requests;
}

The read_requests(...) coroutine is responsible for reading the HTTP request from the client.

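We first get a reference to the read buffer from the client. We then enter a loop that terminates when the HTTP parser has at least one complete request or the client disconnects. A read of zero bytes means the peer closed the connection, so we mark the client as disconnected and leave the loop. If parsing fails, we reset the parser and throw.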

Once we exit out of the loop we call the get_completed_requests(...) function on the HTTP parser to get the completed requests. We then return the completed requests.
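
The "keep reading until the parser has a complete request" pattern can be tried out without any sockets. The sketch below is an assumption-laden toy: toy_parser and chunks_until_request are hypothetical names, the parser only looks for the blank line that ends an HTTP header block, and a vector of strings stands in for successive reads. The real parser in the server does considerably more.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical parser: a request is complete once a blank line ("\r\n\r\n") is seen.
struct toy_parser {
    std::string pending;                 // bytes that do not yet form a full request
    std::vector<std::string> completed;  // fully received requests

    void parse(std::string_view chunk) {
        pending.append(chunk);
        constexpr std::string_view terminator = "\r\n\r\n";
        for (auto end = pending.find(terminator); end != std::string::npos;
             end = pending.find(terminator)) {
            completed.push_back(pending.substr(0, end + terminator.size()));
            pending.erase(0, end + terminator.size());
        }
    }
};

// Feeds chunks (standing in for successive reads) until one request completes.
// Returns the number of chunks consumed.
std::size_t chunks_until_request(toy_parser& parser,
                                 const std::vector<std::string>& chunks) {
    std::size_t used = 0;
    while (parser.completed.empty() && used < chunks.size()) {
        parser.parse(chunks[used]);
        ++used;
    }
    return used;
}
```

Note that bytes belonging to the next request stay in pending. This is why the parser state lives on the client rather than inside a single call.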

The `handle_requests(...)` coroutine

kev::task<void> handle_requests(Client const &client, ServerContext &server_context, RequestList requests)
{
    using namespace stdexec;

    std::vector<HttpResponse> responses = generate_responses(requests, server_context.m_root_directory);

    std::vector<std::byte> bytes_to_write;
    bytes_to_write.reserve(4096);
    for (auto const &response : responses)
    {
        response.serialize_into(bytes_to_write);
    }

    co_await client.m_uring_ctx.async_write_all(
        client.m_fd.get(), std::span<const std::byte>(bytes_to_write.data(), bytes_to_write.size()));
}

The handle_requests(...) coroutine is responsible for processing the HTTP request and sending back the appropriate response. This coroutine does the following:

Generates the responses for the requests.
Serializes the responses into a vector of bytes.
Writes the responses to the client using the async_write_all(...) coroutine.
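
The "serialize everything into one byte vector" step can be sketched as follows. toy_response, append_bytes, serialize_all, and as_text are hypothetical names for this illustration; the server's HttpResponse may carry more headers and look quite different. The point is only that several responses can be flattened into one contiguous buffer so that a single write call can send them all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Appends the characters of `text` to `out` as raw bytes.
void append_bytes(std::vector<std::byte>& out, std::string_view text) {
    for (char c : text) {
        out.push_back(static_cast<std::byte>(c));
    }
}

// Hypothetical response type with just a status line, one header, and a body.
struct toy_response {
    int status = 200;
    std::string reason = "OK";
    std::string body;

    // Appends the wire format of this response to `out`.
    void serialize_into(std::vector<std::byte>& out) const {
        append_bytes(out, "HTTP/1.1 " + std::to_string(status) + " " + reason + "\r\n");
        append_bytes(out, "Content-Length: " + std::to_string(body.size()) + "\r\n\r\n");
        append_bytes(out, body);
    }
};

// Flattens all responses, in order, into one contiguous buffer.
std::vector<std::byte> serialize_all(const std::vector<toy_response>& responses) {
    std::vector<std::byte> bytes;
    bytes.reserve(4096);
    for (const auto& response : responses) {
        response.serialize_into(bytes);
    }
    return bytes;
}

// Test helper: view the serialized bytes as text again.
std::string as_text(const std::vector<std::byte>& bytes) {
    std::string text;
    text.reserve(bytes.size());
    for (std::byte b : bytes) {
        text.push_back(static_cast<char>(b));
    }
    return text;
}
```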

And with that we've covered the basic scaffolding of the HTTP server. In the next section we'll dig into the details of:

The kev::task coroutine type and how it bridges the gap between coroutines and the rest of the machinery provided by stdexec.
The UringContext class and how it's used to perform asynchronous I/O operations.
We'll also cover some of the Sender types we had to author to represent some of the asynchronous operations we need to perform.

The `kev::task` coroutine type

stdexec does provide a more complete task type in the form of exec::task, but I wanted to build up my understanding from first principles, so I decided to author my own task type called kev::task. Please be warned that a production-ready task type would have to deal with a lot more use cases and would be significantly more complex than what is presented here. This is the basic outline of kev::task:

namespace kev {
    template <typename T>
    struct task {
        // Opt into the stdexec sender model
        using sender_concept = stdexec::sender_t;
        using promise_type  = task_promise<T>;
        using handle_type   = std::coroutine_handle<promise_type>;

        // Awaiting a task transfers ownership of the coroutine
        auto operator co_await() && {
            return task_awaiter<T>(std::exchange(handle, {}));
        }

        // Tasks are move-only: a coroutine has a single consumer
        task(task&& other) noexcept
            : handle(std::exchange(other.handle, {})) {}

        ~task() {
            if (handle) handle.destroy();
        }

    private:
        friend promise_type;

        explicit task(handle_type h) : handle(h) {}
        handle_type handle{};
    };    
}

Key characteristics of `kev::task`

kev::task<T> represents an asynchronous computation. It is the return type of a coroutine and owns the coroutine frame. T is the type of the value that the coroutine will eventually produce.

kev::task<size_t> UringContext::async_read(RawFileDescriptor fd, std::span<std::byte> buffer)
{
    co_return co_await ReadSender{std::move(fd), *this, buffer};
}

In the above example, async_read(...) is a coroutine that returns a kev::task<size_t>, indicating that it will eventually produce a size_t value representing the number of bytes read. ReadSender is a custom sender type that encapsulates the asynchronous read operation. Since the promise_type of kev::task extends from stdexec::with_awaitable_senders<task_promise<T>>, we can co_await on senders directly within the coroutine body.

When the coroutine is invoked, it does not start executing immediately. Instead, it returns a kev::task object that represents the coroutine. The coroutine starts executing when the task is awaited. On reaching the co_await expression, the coroutine is suspended and the work encapsulated by ReadSender is scheduled for execution. In this case that means submitting a read request to the io_uring instance managed by UringContext. In the event loop of UringContext, when the read operation completes, the receiver associated with ReadSender is invoked, which resumes the coroutine. The result of the read operation (the number of bytes read) is returned from the co_await expression and then co_return'ed from the coroutine.

Caller
 |
 | call async_read(fd, buffer)
 v
 +------------------------------+
 | kev::task<size_t>            |
 | (coroutine created,          |
 |  not yet executing)          |
 +------------------------------+
 |
 | co_await task
 v
 +------------------------------+
 | Coroutine starts             |
 | executing                    |
 +------------------------------+
 |
 | co_await ReadSender{...}
 v
 +------------------------------+      submit read    +-------------------+
 | ReadSender                   | ------------------> | io_uring          |
 | (sender + receiver pair)     |                     | (kernel async I/O)|
 +------------------------------+                     +-------------------+
                                                                 |
                                                  read completes |
                                                                 v
 +------------------------------+      resume coroutine  +-------------------+
 | Receiver invoked             | <-------------------- | io_uring CQE      |
 +------------------------------+                        +-------------------+
 |
 | result (size_t bytes read)
 v
 +------------------------------+
 | Coroutine resumes            |
 | co_await yields size_t       |
 +------------------------------+
 |
 | co_return size_t
 v
 +------------------------------+
 | kev::task completes          |
 | result delivered to caller   |
 +------------------------------+

It is a stdexec sender. The sender_concept alias opts the type into the Senders/Receivers model, allowing it to compose with stdexec algorithms.
```
using namespace stdexec;
kev::task<int> simple_task() {
  co_return 42;
}

auto pipeline = simple_task() 
    | stdexec::then([](int result) {
        std::println("Task completed with result: {}", result);
    });
stdexec::sync_wait(std::move(pipeline));
```
In the above example, simple_task is a coroutine that returns a kev::task<int>, which will eventually produce an int value. We then create a sender pipeline by piping the task into stdexec::then(...), which attaches a continuation that will be invoked when the task completes. Finally, we use stdexec::sync_wait(...) to synchronously wait for the entire pipeline to complete, which will print the result of the task.
It is move-only. Moving a task transfers ownership of the underlying coroutine frame.
Operator co_await consumes the task. Awaiting a task transfers ownership of the coroutine to a task_awaiter, ensuring it can only be awaited once.
Unconsumed tasks clean up after themselves. If a task is never awaited, its destructor destroys the coroutine frame.

The full implementation of kev::task along with its associated promise_type and awaiter can be found here.

The `UringContext` class

At a high level, UringContext is responsible for:

Owning and managing an io_uring instance
Translating io_uring completions into coroutine resumptions
Providing high-level async operations (accept, read, write)
Running a cooperative event loop

The context owns an io_uring instance (m_ring) for its entire lifetime. Copying is explicitly disallowed, which avoids accidental sharing of a kernel ring across independent execution contexts. Move semantics are supported, allowing a context to be transferred without tearing down and rebuilding the underlying ring.
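
The ownership rules in this paragraph follow a common C++ pattern, sketched below with a counter instead of a real kernel ring so that it runs anywhere. toy_ring, ring_owner, and live_after_move are hypothetical names; the real UringContext initialises and tears down an actual io_uring instance where this sketch increments and decrements live_instances.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical resource bookkeeping, standing in for a kernel ring.
struct toy_ring {
    static inline int live_instances = 0;
};

class ring_owner {
public:
    ring_owner() : owns_(true) { ++toy_ring::live_instances; }

    // Copying would leave two owners tearing down the same ring.
    ring_owner(const ring_owner&) = delete;
    ring_owner& operator=(const ring_owner&) = delete;

    // Moving transfers ownership; the source no longer tears anything down.
    ring_owner(ring_owner&& other) noexcept
        : owns_(std::exchange(other.owns_, false)) {}
    ring_owner& operator=(ring_owner&& other) noexcept {
        if (this != &other) {
            release();
            owns_ = std::exchange(other.owns_, false);
        }
        return *this;
    }

    ~ring_owner() { release(); }

    bool owns() const noexcept { return owns_; }

private:
    void release() noexcept {
        if (owns_) {
            --toy_ring::live_instances;
            owns_ = false;
        }
    }
    bool owns_ = false;
};

static_assert(!std::is_copy_constructible_v<ring_owner>);
static_assert(std::is_nothrow_move_constructible_v<ring_owner>);

// Returns the number of live rings observed right after a move: still exactly one.
int live_after_move() {
    ring_owner a;
    ring_owner b = std::move(a);
    return toy_ring::live_instances;
}
```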

Asynchronous I/O Primitives

kev::task<int> UringContext::async_accept(RawFileDescriptor server_fd);
kev::task<size_t> UringContext::async_read(RawFileDescriptor fd, std::span<std::byte> buffer);
kev::task<void> UringContext::async_write_all(RawFileDescriptor fd, std::span<const std::byte> data);

Each of these functions returns a kev::task, meaning:

They are coroutine-based
They represent lazy asynchronous computations
Execution does not begin until the task is awaited or otherwise started

Internally, these operations submit an SQE to io_uring and suspend the coroutine. When the kernel posts a corresponding completion queue event (CQE), the coroutine is resumed and the task completes with the appropriate value. async_write_all is intentionally higher-level than a raw write: it guarantees that the entire buffer is written, retrying internally as needed. The lower-level async_write helper (private) exposes the single-write semantics and returns the number of bytes written.

Event Loop Integration via Senders

The most important—and most subtle—part of UringContext is the run function:

template <stdexec::sender Sender> void UringContext::run(Sender sender)
{
    struct receiver
    {
        std::atomic<bool> *done;
        stdexec::inline_scheduler scheduler{};

        using is_receiver = void;
        static_assert(std::is_same_v<is_receiver, void>);

        auto get_env() const noexcept
        {
            return exec::make_env(exec::with(stdexec::get_scheduler, scheduler));
        }

        void set_value() noexcept
        {
            done->store(true, std::memory_order_release);
        }

        void set_error(std::exception_ptr) noexcept
        {
            done->store(true, std::memory_order_release);
        }

        void set_stopped() noexcept
        {
            done->store(true, std::memory_order_release);
        }
    };
    std::atomic<bool> done{false};
    auto op = stdexec::connect(std::move(sender), receiver{&done});
    stdexec::start(op);
    while (!done.load(std::memory_order_acquire))
    {
        run_once();
    }
}

Internally, run:

Constructs a small receiver that:

Stores a pointer to an atomic done flag. This flag need not be atomic in a single-threaded context, but using atomic operations ensures correctness even if the code is later adapted to a multi-threaded environment.
Publishes an inline_scheduler via get_env.
Connects the sender to this receiver.
Starts the resulting operation.
Pumps the io_uring event loop until the sender signals completion.

while (!done.load(std::memory_order_acquire)) {
    run_once();
}
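
The connect, start, and pump sequence does not depend on stdexec specifically. The sketch below reproduces the same shape with hand-rolled types so it compiles with the standard library alone. toy_loop, toy_sender, toy_operation, flag_receiver, and run_to_completion are all hypothetical names; they only mimic the roles of the event loop, sender, operation state, and receiver, and they implement none of the stdexec concepts.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded event loop: a queue of pending callbacks.
struct toy_loop {
    std::deque<std::function<void()>> pending;

    // Run one pending callback, if any (the analogue of run_once).
    void run_once() {
        if (pending.empty()) return;
        auto fn = std::move(pending.front());
        pending.pop_front();
        fn();
    }
};

// Receiver: where the result of the operation is delivered.
struct flag_receiver {
    bool* done;
    int* result;
    void set_value(int v) noexcept {
        *result = v;
        *done = true;
    }
};

// Operation state: holds everything one in-flight operation needs.
template <typename Receiver>
struct toy_operation {
    toy_loop* loop;
    int value;
    Receiver receiver;

    // Starting only enqueues work; completion happens later inside the loop.
    void start() {
        loop->pending.push_back([this] { receiver.set_value(value); });
    }
};

// Sender: a description of work that does nothing until connected and started.
struct toy_sender {
    toy_loop* loop;
    int value;

    template <typename Receiver>
    toy_operation<Receiver> connect(Receiver r) const {
        return toy_operation<Receiver>{loop, value, std::move(r)};
    }
};

// Mirrors the shape of run(): connect, start, then pump until completion.
int run_to_completion(toy_loop& loop, toy_sender sender) {
    bool done = false;
    int result = 0;
    auto op = sender.connect(flag_receiver{&done, &result});
    op.start();
    while (!done) {
        loop.run_once();
    }
    return result;
}
```

The operation state is a local variable of run_to_completion, so its address stays stable for as long as the loop is being pumped. That is the same reason run keeps op on its own stack.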

Single-Step Progress: run_once

The run_once function is responsible for making progress on the io_uring event loop. It does this by:

Waiting for CQEs to become available (blocking if necessary)

  int ret = io_uring_wait_cqe(&m_ring, &cqe);
  if (ret < 0)
  {
      std::println(std::cerr, "io_uring_wait_cqe failed with error: {}", strerror(-ret));
      return;
  }
  Event *event = static_cast<Event *>(io_uring_cqe_get_data(cqe));

For each CQE:

Retrieves the associated Event, which contains the event type and a pointer to some user data.
Based on the event type, it resumes the appropriate coroutine by invoking the stored continuation/callback.

    switch (event->type)
    {
    case UringOpType::Accept: {
        auto *user_data = static_cast<AcceptOperationData *>(event->data);
        assert(user_data != nullptr);
        assert(user_data->completion_handler != nullptr);
        user_data->completion_handler(user_data->op_state_ptr, cqe->res);
        break;
    }
    case UringOpType::Read: {
        auto *user_data = static_cast<ReadOperationData *>(event->data);
        assert(user_data != nullptr);
        assert(user_data->completion_handler != nullptr);
        user_data->completion_handler(user_data->op_state_ptr, static_cast<ssize_t>(cqe->res));
        break;
    }
    case UringOpType::Write: {
        auto *user_data = static_cast<WriteOperationData *>(event->data);
        assert(user_data != nullptr);
        assert(user_data->completion_handler != nullptr);
        user_data->completion_handler(user_data->op_state_ptr, static_cast<ssize_t>(cqe->res));
        break;
    }

Finally, it marks the CQE as seen so that io_uring can reuse it.

    io_uring_cqe_seen(&m_ring, cqe);

In essence, run_once bridges the gap between the low-level io_uring interface and the high-level coroutine-based asynchronous programming model provided by stdexec. It ensures that as I/O operations complete, the corresponding coroutines are resumed with the appropriate results, allowing the program to continue executing asynchronously.
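
The completion_handler and op_state_ptr pair used in the dispatch code is a classic type-erasure trick: the event loop stores a void pointer plus a plain function pointer that knows how to cast it back. Here is a minimal sketch of that idea. completion_record, read_operation, and dispatch are hypothetical names, and the real operation-data structs in the server carry more fields than this.

```cpp
#include <cassert>

// Hypothetical record stored alongside a submitted request.
struct completion_record {
    void* op_state_ptr = nullptr;                       // the concrete operation state
    void (*completion_handler)(void*, int) = nullptr;   // knows the concrete type
};

// Hypothetical operation state that remembers the result it was given.
struct read_operation {
    int result = -1;
    bool completed = false;

    void on_complete(int res) noexcept {
        result = res;
        completed = true;
    }

    // Type-erased trampoline: recover the concrete type and forward the result.
    static void complete_erased(void* self, int res) noexcept {
        static_cast<read_operation*>(self)->on_complete(res);
    }

    completion_record make_record() noexcept {
        return completion_record{this, &read_operation::complete_erased};
    }
};

// What the event loop does: no knowledge of read_operation is required here.
void dispatch(const completion_record& record, int res) {
    assert(record.op_state_ptr != nullptr);
    assert(record.completion_handler != nullptr);
    record.completion_handler(record.op_state_ptr, res);
}
```

Because the loop only sees void* and a function pointer, new operation types can be added without touching the loop's dispatch code beyond a new case.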

Custom Sender Types

To perform asynchronous operations using io_uring, we need to define custom sender types that encapsulate the logic for submitting requests and handling completions. Below are the key sender types used in our HTTP server implementation:

AcceptSender: Encapsulates the logic for accepting incoming connections asynchronously.
ReadSender: Encapsulates the logic for reading data from a file descriptor asynchronously.
WriteSender: Encapsulates the logic for writing data to a file descriptor asynchronously.

Let's take a closer look at ReadSender as an example. ReadSender is the concrete adapter that turns a Linux io_uring read operation into a first-class stdexec sender. It is the missing link between kernel-level asynchronous I/O and high-level coroutine code that simply writes:

size_t bytes_read = co_await uring_ctx.async_read(client.m_fd.get(),
                                                   std::span(read_buffer.data(), read_buffer.size()));

At a conceptual level, ReadSender does three things:

Captures the parameters of an asynchronous read (fd, buffer, and context)
When started, submits an io_uring read request.
When the read completes, invokes the receiver's set_value/set_error/set_stopped methods.

Sender Metadata: Concept and Completion Signatures

struct ReadSender {
    using sender_concept = stdexec::sender_t;

    template <typename Env>
    using completion_signatures = stdexec::completion_signatures<
        stdexec::set_value_t(size_t), // on success, deliver number of bytes read
        stdexec::set_error_t(std::exception_ptr), // on error, deliver exception
        stdexec::set_stopped_t() // on cancellation
    >;
};

The sender_concept alias opts the type into the Senders/Receivers model. The completion_signatures template defines the possible ways this sender can complete:

set_value(size_t): Indicates successful completion, delivering the number of bytes read.
set_error(std::exception_ptr): Indicates an error occurred, delivering an exception pointer.
set_stopped(): Indicates the operation was cancelled.

When the sender is connected to a receiver, it materializes an operation state that owns all execution-critical state for exactly one read. In its start() function, this operation state submits a read request to the io_uring submission queue and then returns, leaving the awaiting coroutine suspended. Completion is driven entirely by the kernel: when io_uring produces a completion entry, the event loop dispatches it back to the operation state via a small, type-erased callback. That callback translates the raw kernel result into the sender/receiver contract, calling set_value with the number of bytes read on success or set_error with an exception on failure, thereby resuming the awaiting coroutine.
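
The translation step in that callback can be sketched in isolation. io_uring reports failures as a negative errno value in the completion's result field, so the mapping comes down to a sign check. recording_receiver, complete_read, and error_code_of are hypothetical names for this illustration, and wrapping the errno in std::system_error is my assumption about a reasonable exception type, not necessarily what the server does.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <system_error>
#include <utility>

// Hypothetical receiver that records which completion channel was used.
struct recording_receiver {
    std::size_t bytes = 0;
    bool got_value = false;
    std::exception_ptr error;

    void set_value(std::size_t n) noexcept {
        bytes = n;
        got_value = true;
    }
    void set_error(std::exception_ptr e) noexcept { error = std::move(e); }
};

// Maps a raw completion result onto the value or error channel.
// A negative result is treated as -errno.
template <typename Receiver>
void complete_read(Receiver& receiver, int res) {
    if (res >= 0) {
        receiver.set_value(static_cast<std::size_t>(res));
    } else {
        receiver.set_error(std::make_exception_ptr(
            std::system_error(-res, std::generic_category(), "read failed")));
    }
}

// Helper for checking the error path: extracts the errno carried by the exception.
int error_code_of(const std::exception_ptr& e) {
    try {
        std::rethrow_exception(e);
    } catch (const std::system_error& se) {
        return se.code().value();
    } catch (...) {
        return 0;
    }
}
```

Exactly one of the two channels is used per completion, which is the contract the completion signatures above promise to the receiver.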

The key design point is that ReadSender cleanly separates concerns. It does not manage an event loop, scheduling policy, retries, or buffering strategy; it only bridges kernel completion semantics into stdexec completion signals. Because kev::task supports awaiting senders, UringContext::async_read can simply co_await a ReadSender, making asynchronous I/O appear synchronous at the call site while remaining fully non-blocking and driven by io_uring underneath.

The full implementation of ReadSender, along with AcceptSender and WriteSender, can be found here.

Closing thoughts

Building this HTTP server using the Senders/Receivers model provided by the Execution library has been an enlightening experience. It has deepened my understanding of asynchronous programming in C++ and demonstrated the power of composable abstractions for managing concurrency. The combination of coroutines and senders allows for writing asynchronous code that is both expressive and efficient, making it easier to reason about complex workflows.

Comments or Corrections?

If you have any comments or corrections, please create an issue in the blog repository.
