Asynchrony in C++

2025-12-24

With the Execution control library on track to become part of the C++26 standard via P2300, this felt like the right moment to finally sit down and understand it properly.

At a high level, the Execution library introduces a set of abstractions—senders, receivers, schedulers, and operation states—for describing how work and data flow through a program. These descriptions can then be executed asynchronously by an appropriate execution context. While that characterization is technically correct, it is also abstract enough to be difficult to internalize from the specification alone. Rather than reason about the model purely on paper, I wanted to understand how it behaves when used to build something concrete. To that end, I decided to implement a simple, single-threaded HTTP server capable of handling multiple connections asynchronously.

An HTTP server is a natural fit for this kind of exploration. It is predominantly I/O-bound, heavily state-driven, and requires careful handling of lifetimes, cancellation, and backpressure—exactly the areas the Execution library is designed to address. There is also prior art in this space: several excellent talks demonstrate how the Senders/Receivers model can be applied to HTTP servers, including Dietmar Kühl’s ACCU and MUC++ presentations, which were invaluable references while working through this project.

Along the way, I took a few detours. In particular, I spent time understanding Linux’s io_uring interface for asynchronous I/O, and how it fits into a modern C++ execution model. I also explored how C++20 coroutines can be layered on top of senders to express asynchronous logic in a more linear, readable style.

The site you are currently reading is served by this HTTP server.

Before diving into implementation, however, I needed a solid grasp of the Execution library’s core abstractions. Reading P2300 or the draft standard end-to-end proved overwhelming, so I instead relied on the following resources, which I found especially effective at building intuition:

This blog documents what I learned while turning those abstractions into a working system, and what became clearer or more subtle once they were exercised in real code.


Status of std::execution

As of December 2025, the Execution control library hasn't made its way into either libstdc++ or libc++, the two most widely used C++ standard library implementations. However, the reference implementation, NVIDIA/Eric Niebler's stdexec, aims to provide a conforming implementation of the standard library's execution components. We'll be using this library for our HTTP server implementation.


Show me the code: Code walkthrough of the core components of the HTTP server

You can find the complete source code for this blog post in stdexec-server.

The main function

This is how the HTTP server's main function is structured:


int main()
{
    uint16_t port{8080};   // Listen on port 8080
    std::string root{"."}; // Serve files from current directory
    UringContext uring_ctx = UringContext(1024);
    auto server = Server(port, root);
    uring_ctx.run(listen(server, uring_ctx));
}

This should look fairly straightforward. We set up some configuration parameters, create a UringContext which will be our asynchronous I/O context, create a Server object that represents our HTTP server, and finally call uring_ctx.run(...) to start the server's event loop. But wait, where is the asynchronous code? The listen(...) function is where the magic happens. It sets up the server to listen for incoming connections and handles them asynchronously using the Execution library's abstractions.

The listen(...) coroutine

Here's the listen(...) function/coroutine:

kev::task<void> listen(Server &server, UringContext &uring_ctx)
{
    exec::async_scope scope{};
    while (true)
    {
        RawFileDescriptor fd{-1};
        try
        {
            fd = co_await uring_ctx.async_accept(server.m_server_fd.get());
        }
        catch (const std::exception &e)
        {
            std::println(std::cerr, "Error accepting connection: {}", e.what());
            break;
        }
        auto client_pipeline = 
          handle_connection_coroutine(FileDescriptor(fd),
                                      server.m_context,
                                      uring_ctx) |
            stdexec::upon_error([](std::exception_ptr ptr) {
              try
              {
                  if (ptr)
                  {
                    std::rethrow_exception(ptr);
                  }
              }
              catch (const std::exception &e)
              {
                std::println(std::cerr, "Error in client connection: {}", e.what());
              }
            });
        scope.spawn(std::move(client_pipeline));
    }
}

Let's break down what's happening here. The giveaway that this is a coroutine is the co_await keyword. This indicates that execution can be suspended at that point and then resumed later.

We first set up an exec::async_scope, which allows us to manage the lifetime of the multiple asynchronous operations we spawn. We'll dig into the details of how this works later.

Inside the infinite loop we co_await uring_ctx.async_accept(...) to accept incoming connections. This function is also a coroutine that performs the asynchronous accept operation using io_uring. When the underlying event loop resumes the continuation, the accepted file descriptor becomes the result of the co_await expression and is assigned to fd. If the accept operation fails, an exception is thrown, which we catch and log before breaking out of the loop to stop the server.

Once we have a valid file descriptor for the accepted connection, we "call" a new coroutine handle_connection_coroutine(...) to handle the client connection. This coroutine is responsible for reading the HTTP request, processing it, and sending back the appropriate response. We also attach an error handler using stdexec::upon_error(...) to log any exceptions that might occur during the handling of the connection. The full declaration of handle_connection_coroutine is:

kev::task<void> handle_connection_coroutine(
    FileDescriptor client_fd,
    ServerContext &server_context,
    UringContext &uring_ctx);

handle_connection_coroutine returns kev::task<void>. This return type is special in that it is both awaitable and conforms to the Sender concept from the stdexec library, so it can be composed with other senders to build complex asynchronous workflows in a modular way. In our case we use the pipe operator (|) to attach an error handler to the coroutine (or sender), which is invoked if an exception is thrown during its execution. If any client were to error out, the error handler would log the error message to standard error, but the handling of other clients would not be affected.
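The handler passed to upon_error receives a std::exception_ptr, which cannot be inspected directly; it has to be rethrown and caught. The following is a minimal standalone sketch of that step. The helper name describe is hypothetical and not part of the server or of stdexec, and the catch-all branch is an addition for exceptions that do not derive from std::exception.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: turn an exception_ptr into a printable message.
std::string describe(std::exception_ptr ptr)
{
    if (!ptr)
    {
        return "no exception";
    }
    try
    {
        // The only portable way to look inside an exception_ptr.
        std::rethrow_exception(ptr);
    }
    catch (const std::exception &e)
    {
        return e.what();
    }
    catch (...)
    {
        // Anything that does not derive from std::exception.
        return "unknown exception type";
    }
}
```

A logging handler could then print describe(ptr) instead of repeating the try/catch at every call site.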

Another point is that the underlying promise_type of kev::task<void> returns std::suspend_always from its initial suspension point. This means that the coroutine is lazily evaluated and will not start executing immediately. This is explained really well in CppCon 2016: James McNellis “Introduction to C++ Coroutines”.

Finally, we spawn the client pipeline coroutine within the async_scope, allowing it to run concurrently with other client connections.

Using coroutines in conjunction with async_scope simplifies the management of multiple concurrent operations. The async_scope ensures that all spawned operations are properly completed before the scope is destroyed, preventing potential resource leaks or dangling operations. This is particularly useful in our HTTP server scenario, where multiple client connections can be active simultaneously. See async_scope – Creating scopes for non-sequential concurrency.

The first version of this implementation didn't use coroutines at all. I was constructing sender pipelines and firing them off using stdexec::start_detached(...). While this approach worked, it quickly became unwieldy to manage the lifetimes of the various objects. The only clear-cut solution at the time was to use std::shared_ptr for shared state, which felt inelegant. Switching to coroutines made the code much cleaner and easier to reason about: objects local to a coroutine live in its frame, so resources are released properly when the coroutine finishes.

The handle_connection_coroutine(...) coroutine

kev::task<void> handle_connection_coroutine(FileDescriptor client_fd, ServerContext &ctx, UringContext &uring_ctx)
{
    // client is stored in the coroutine frame, so its lifetime is managed automatically.
    // Passing it by reference to read_requests and handle_requests is safe because
    // they are awaited from within this coroutine and thus cannot outlive it.
    auto client = Client(std::move(client_fd), uring_ctx);
    while (!client.m_disconnect_requested)
    {
        RequestList requests = co_await read_requests(client);
        co_await handle_requests(client, ctx, std::move(requests));
    }
}

The handle_connection_coroutine(...) coroutine is responsible for reading the HTTP request, processing it, and sending back the appropriate response.

We first create a Client object which represents the client connection. Its constructor takes ownership of the file descriptor and a reference to the UringContext which is used to perform asynchronous I/O operations.
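The post does not show how FileDescriptor is implemented, so the following is only a sketch of what an owning, move-only descriptor wrapper typically looks like on POSIX systems. The name OwnedFd and its interface are assumptions made for this example.

```cpp
#include <cassert>
#include <unistd.h>
#include <utility>

// Hypothetical stand-in for the post's FileDescriptor: owns one fd, closes it once.
class OwnedFd
{
  public:
    OwnedFd() = default;
    explicit OwnedFd(int fd) noexcept : m_fd(fd) {}

    // Copying would lead to a double close, so only moves are allowed.
    OwnedFd(const OwnedFd &) = delete;
    OwnedFd &operator=(const OwnedFd &) = delete;

    OwnedFd(OwnedFd &&other) noexcept : m_fd(std::exchange(other.m_fd, -1)) {}
    OwnedFd &operator=(OwnedFd &&other) noexcept
    {
        if (this != &other)
        {
            reset();
            m_fd = std::exchange(other.m_fd, -1);
        }
        return *this;
    }

    ~OwnedFd() { reset(); }

    int get() const noexcept { return m_fd; }
    bool valid() const noexcept { return m_fd >= 0; }

    // Close the descriptor (if any) and return to the empty state.
    void reset() noexcept
    {
        if (m_fd >= 0)
        {
            ::close(m_fd);
            m_fd = -1;
        }
    }

  private:
    int m_fd{-1};
};

// Demonstrates that ownership follows the move and the fd is closed at scope exit.
bool closes_on_scope_exit()
{
    int fds[2];
    if (::pipe(fds) != 0) return false;
    {
        OwnedFd a(fds[0]);
        OwnedFd b(std::move(a));
        assert(!a.valid()); // the moved-from wrapper no longer owns anything
        assert(b.get() == fds[0]);
    } // b closes fds[0] here
    const bool already_closed = (::close(fds[0]) == -1);
    ::close(fds[1]);
    return already_closed;
}
```

With a wrapper like this stored inside Client, and Client stored in the coroutine frame, the socket is closed when the connection coroutine finishes, whether it returns normally or exits through an exception.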

We then loop until the client disconnects: on each iteration we read a batch of HTTP requests, process them, and send back the appropriate responses.

The read_requests(...) coroutine is responsible for reading the HTTP request from the client.

The handle_requests(...) coroutine is responsible for processing the HTTP request and sending back the appropriate response.

If at any stage an exception is thrown, it will be caught and logged to standard error output. This is done using the stdexec::upon_error(...) operator that's attached to the client pipeline coroutine in the listen(...) coroutine.

The read_requests(...) coroutine

kev::task<RequestList> read_requests(Client &client)
{
    using namespace stdexec;

    auto &read_buffer = client.m_buffer;
    while (client.m_parser.get_number_of_completed_requests() == 0 && !client.m_disconnect_requested)
    {
        size_t bytes_read = co_await client.m_uring_ctx.async_read(client.m_fd.get(),
                                                                   std::span(read_buffer.data(), read_buffer.size()));
        if (bytes_read == 0)
        {
            client.m_disconnect_requested = true;
            break;
        }

        auto const data = std::span(read_buffer.data(), bytes_read);
        auto result = client.m_parser.parse(data);
        if (!result.has_value())
        {
            std::println(std::cerr, "Error parsing HTTP request. Message: {}", result.error());
            std::println("Resetting parser state.");
            client.m_parser.reset();
            throw std::runtime_error("Error parsing HTTP request");
        }
    }
    RequestList requests;
    client.m_parser.get_completed_requests(requests);
    co_return requests;
}

The read_requests(...) coroutine is responsible for reading the HTTP request from the client.

We first get a reference to the read buffer from the client. We then enter a loop that terminates once the HTTP parser has at least one complete request or the client has disconnected.

Once we exit out of the loop we call the get_completed_requests(...) function on the HTTP parser to get the completed requests. We then return the completed requests.

The handle_requests(...) coroutine

kev::task<void> handle_requests(Client const &client, ServerContext &server_context, RequestList requests)
{
    using namespace stdexec;

    std::vector<HttpResponse> responses = generate_responses(requests, server_context.m_root_directory);

    std::vector<std::byte> bytes_to_write;
    bytes_to_write.reserve(4096);
    for (auto const &response : responses)
    {
        response.serialize_into(bytes_to_write);
    }

    co_await client.m_uring_ctx.async_write_all(
        client.m_fd.get(), std::span<const std::byte>(bytes_to_write.data(), bytes_to_write.size()));
}

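The handle_requests(...) coroutine is responsible for processing the HTTP requests and sending back the appropriate responses. It generates a response for each parsed request, serializes all of the responses into a single byte buffer, and writes that buffer to the client with async_write_all.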

And with that we've covered the basic scaffolding of the HTTP server. In the next sections we'll dig into the details of the kev::task coroutine type, the UringContext class, and the custom sender types.


The kev::task coroutine type

stdexec does provide a more complete task type in the form of exec::task, but I wanted to build up my understanding from first principles, so I decided to write my own task type called kev::task. Be warned that a production-ready task type would have to deal with many more use cases and would be significantly more complex than what is presented here. This is the basic outline of kev::task:

namespace kev {
    template <typename T>
    struct task {
        // Opt into the stdexec sender model
        using sender_concept = stdexec::sender_t;
        using promise_type  = task_promise<T>;
        using handle_type   = std::coroutine_handle<promise_type>;

        // Awaiting a task transfers ownership of the coroutine
        auto operator co_await() && {
            return task_awaiter<T>(std::exchange(handle, {}));
        }

        // Tasks are move-only: a coroutine has a single consumer
        task(task&& other) noexcept
            : handle(std::exchange(other.handle, {})) {}

        ~task() {
            if (handle) handle.destroy();
        }

    private:
        friend promise_type;

        explicit task(handle_type h) : handle(h) {}
        handle_type handle{};
    };    
}

Key characteristics of kev::task

The full implementation of kev::task along with its associated promise_type and awaiter can be found here.


The UringContext class

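At a high level, UringContext is responsible for owning the io_uring instance, exposing asynchronous I/O primitives as awaitable tasks, and driving the event loop that resumes them.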

The context owns an io_uring instance (m_ring) for its entire lifetime. Copying is explicitly disallowed, which avoids accidental sharing of a kernel ring across independent execution contexts. Move semantics are supported, allowing a context to be transferred without tearing down and rebuilding the underlying ring.

Asynchronous I/O Primitives

kev::task<int> UringContext::async_accept(RawFileDescriptor server_fd);
kev::task<size_t> UringContext::async_read(RawFileDescriptor fd, std::span<std::byte> buffer);
kev::task<void> UringContext::async_write_all(RawFileDescriptor fd, std::span<const std::byte> data);

Each of these functions returns a kev::task, meaning the result can be awaited with co_await from another coroutine or composed with other senders.

Internally, these operations submit an SQE to io_uring and suspend the coroutine. When the kernel posts a corresponding completion queue event (CQE), the coroutine is resumed and the task completes with the appropriate value. async_write_all is intentionally higher-level than a raw write: it guarantees that the entire buffer is written, retrying internally as needed. The lower-level async_write helper (private) exposes the single-write semantics and returns the number of bytes written.

Event Loop Integration via Senders

The most important—and most subtle—part of UringContext is the run function:

template <stdexec::sender Sender> void UringContext::run(Sender sender)
{
    struct receiver
    {
        std::atomic<bool> *done;
        stdexec::inline_scheduler scheduler{};

        using is_receiver = void;
        static_assert(std::is_same_v<is_receiver, void>);

        auto get_env() const noexcept
        {
            return exec::make_env(exec::with(stdexec::get_scheduler, scheduler));
        }

        void set_value() noexcept
        {
            done->store(true, std::memory_order_release);
        }

        void set_error(std::exception_ptr) noexcept
        {
            done->store(true, std::memory_order_release);
        }

        void set_stopped() noexcept
        {
            done->store(true, std::memory_order_release);
        }
    };
    std::atomic<bool> done{false};
    auto op = stdexec::connect(std::move(sender), receiver{&done});
    stdexec::start(op);
    while (!done.load(std::memory_order_acquire))
    {
        run_once();
    }
}

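Internally, run constructs a small receiver that stores true into the done flag on any of the three completion signals (value, error, or stopped) and advertises an inline_scheduler through its environment. It then connects the sender to that receiver, starts the resulting operation state, and drives the event loop until the flag is set: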

while (!done.load(std::memory_order_acquire)) {
    run_once();
}
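The shape of run (start the work, then pump the loop until a completion flips a flag) does not depend on io_uring or stdexec. The toy below models it with a queue of callbacks in place of kernel completions. ToyLoop and demo_run are hypothetical names, and the callback passed to start only plays the role that the receiver plays in the real code.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <stdexcept>
#include <utility>

// Toy stand-in for UringContext: completions are queued callbacks instead of CQEs.
class ToyLoop
{
  public:
    // Queue a completion to be delivered by a later run_once() call.
    void post(std::function<void()> completion)
    {
        assert(completion);
        m_ready.push_back(std::move(completion));
    }

    // Deliver exactly one queued completion.
    void run_once()
    {
        if (m_ready.empty())
        {
            // A real io_uring loop would block in the kernel here; the toy cannot.
            throw std::logic_error("no pending completions");
        }
        auto completion = std::move(m_ready.front());
        m_ready.pop_front();
        completion();
    }

    // Start the work, then pump the loop until the work signals completion.
    // Returns the number of loop iterations, purely for illustration.
    template <typename Start>
    int run(Start &&start)
    {
        bool done = false;
        start([&done] { done = true; }); // plays the role of the receiver
        int iterations = 0;
        while (!done)
        {
            run_once();
            ++iterations;
        }
        return iterations;
    }

  private:
    std::deque<std::function<void()>> m_ready;
};

// Work made of two asynchronous steps: the first schedules the second,
// and the second signals completion.
int demo_run()
{
    ToyLoop loop;
    return loop.run([&loop](std::function<void()> on_done) {
        loop.post([&loop, on_done] { loop.post(on_done); });
    });
}
```

demo_run() returns 2: one iteration delivers the first step, the next delivers the completion signal, and the loop exits. The flag is only ever touched from the loop's own thread in this toy, so a plain bool is enough here.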

Single-Step Progress: run_once

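The run_once function is responsible for making progress on the io_uring event loop. It waits for a completion queue entry, dispatches it to the operation that submitted the request, and then marks the entry as consumed: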

    io_uring_cqe_seen(&m_ring, cqe);

In essence, run_once bridges the gap between the low-level io_uring interface and the high-level coroutine-based asynchronous programming model provided by stdexec. It ensures that as I/O operations complete, the corresponding coroutines are resumed with the appropriate results, allowing the program to continue executing asynchronously.


Custom Sender Types

To perform asynchronous operations using io_uring, we need to define custom sender types that encapsulate the logic for submitting requests and handling completions. The key sender types used in our HTTP server implementation are ReadSender, AcceptSender, and WriteSender.

Let's take a closer look at ReadSender as an example. ReadSender is the concrete adapter that turns a Linux io_uring read operation into a first-class stdexec sender. It is the missing link between kernel-level asynchronous I/O and high-level coroutine code that simply writes:

size_t bytes_read = co_await uring_ctx.async_read(client.m_fd.get(),
                                                   std::span(read_buffer.data(), read_buffer.size()));

At a conceptual level, ReadSender does three things:

  1. Captures the parameters of an asynchronous read (fd, buffer, and context).
  2. When started, submits an io_uring read request.
  3. When the read completes, invokes the receiver's set_value/set_error/set_stopped methods.

Sender Metadata: Concept and Completion Signatures

struct ReadSender {
    using sender_concept = stdexec::sender_t;

    template <typename Env>
    using completion_signatures = stdexec::completion_signatures<
        stdexec::set_value_t(size_t), // on success, deliver number of bytes read
        stdexec::set_error_t(std::exception_ptr), // on error, deliver exception
        stdexec::set_stopped_t() // on cancellation
    >;
};

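The sender_concept alias opts the type into the Senders/Receivers model. The completion_signatures template defines the possible ways this sender can complete: with set_value carrying the number of bytes read on success, with set_error carrying a std::exception_ptr on failure, or with set_stopped on cancellation.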

When the sender is connected to a receiver, it materializes an operation state that owns all execution-critical state for exactly one read. This operation state submits a read request to the io_uring submission queue in its start() function and then returns. Completion is driven entirely by the kernel: when io_uring produces a completion entry, the event loop dispatches it back to the operation state via a small, type-erased callback. That callback translates the raw kernel result into the sender/receiver contract, calling set_value with the number of bytes read on success or set_error with an exception on failure, thereby resuming the awaiting coroutine.
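The type-erased callback can be sketched without io_uring or stdexec. The idea assumed here is that the address of a small base struct travels through the 64-bit user_data field of the submission and comes back with the completion, whose result is a byte count on success or a negated errno on failure. The names Completion, ReadOperation, RecordingReceiver, and dispatch are hypothetical; the server's actual types may differ.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <system_error>
#include <utility>

// Type-erased base: the event loop only ever sees this.
struct Completion
{
    void (*on_complete)(Completion *self, int res) noexcept;
};

// Toy receiver that records what it was told (stand-in for a real receiver).
struct RecordingReceiver
{
    std::size_t *bytes;
    std::exception_ptr *error;
    void set_value(std::size_t n) noexcept { *bytes = n; }
    void set_error(std::exception_ptr e) noexcept { *error = std::move(e); }
};

// Operation state for one read; knows the concrete receiver type.
template <typename Receiver>
struct ReadOperation : Completion
{
    Receiver receiver;

    explicit ReadOperation(Receiver r)
        : Completion{&ReadOperation::complete}, receiver(std::move(r))
    {
    }

    // Translate the raw kernel result into a completion signal.
    static void complete(Completion *self, int res) noexcept
    {
        auto *op = static_cast<ReadOperation *>(self);
        if (res < 0)
        {
            // Failure is reported as a negated errno value.
            op->receiver.set_error(
                std::make_exception_ptr(std::system_error(-res, std::system_category())));
        }
        else
        {
            op->receiver.set_value(static_cast<std::size_t>(res));
        }
    }
};

// What an event loop would do per completion: recover the pointer, then dispatch.
inline void dispatch(std::uint64_t user_data, int res) noexcept
{
    auto *completion = reinterpret_cast<Completion *>(static_cast<std::uintptr_t>(user_data));
    completion->on_complete(completion, res);
}

// Helper for the example: the value that would be stored in user_data.
inline std::uint64_t as_user_data(Completion *c) noexcept
{
    return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(c));
}
```

The event loop never needs to know the receiver type; it only calls through the function pointer. The operation state must stay at a stable address from submission until its completion has been dispatched, which is exactly the guarantee an operation state provides once it has been started.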

The key design point is that ReadSender cleanly separates concerns. It does not manage an event loop, scheduling policy, retries, or buffering strategy; it only bridges kernel completion semantics into stdexec completion signals. Because kev::task supports awaiting senders, UringContext::async_read can simply co_await a ReadSender, making asynchronous I/O appear synchronous at the call site while remaining fully non-blocking and driven by io_uring underneath.

The full implementation of ReadSender, along with AcceptSender and WriteSender, can be found here.


Closing thoughts

Building this HTTP server using the Senders/Receivers model provided by the Execution library has been an enlightening experience. It has deepened my understanding of asynchronous programming in C++ and demonstrated the power of composable abstractions for managing concurrency. The combination of coroutines and senders allows for writing asynchronous code that is both expressive and efficient, making it easier to reason about complex workflows.


Comments or Corrections?

If you have any comments or corrections, please create an issue in the blog repository.