Trip Report: Summer ISO C++ Meeting in St. Louis, USA

Two weeks ago, I attended the summer 2024 meeting of the ISO C++ standardization committee in St. Louis, USA. This was the fourth meeting for the upcoming C++26 standard. For an overview of all the papers that made progress, check out the collaborative trip report on Reddit. As usual, I spent most of my time in library evolution, where we discuss proposals for the C++ standard library, and I chaired SG 9, the study group that gives early feedback on std::ranges proposals. This time, I also spent an entire day in SG 23, the study group that focuses on safety and security. Most of the discussions I participated in were uncontroversial (or extremely controversial but ultimately not that important, like naming things), but I have some thoughts on senders/receivers, reflection, and borrow checking in C++.

Senders/receivers

P2300 is a big proposal that adds a new way of writing concurrent programs. Just like std::ranges changed the way we write range algorithms, std::execution is going to change the way we write concurrent code. The idea is to combine senders, which are essentially lazy futures that will eventually send some values, with receivers, which are just fancy callbacks, to describe the data flow. Special senders can then be used to transfer execution to different threads or to schedule work on thread pools. Just like composing views, composing senders is purely declarative, and execution only starts when you need the value. This allows you to build the execution graph without worrying about synchronization and execute it later on. It is a brilliant design.
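To make the "lazy future plus callback" idea concrete, here is a toy sketch in plain C++17. It is not the std::execution API: the namespace toy, the types just_sender, then_sender and store_receiver, and the function run_example are all made up for illustration, and the sketch ignores errors, cancellation, and threads entirely. It only shows that composing senders runs nothing until the resulting operation is started.

```cpp
#include <cassert>
#include <utility>

namespace toy {  // hypothetical names, not the std::execution API

// A sender that will eventually send one stored value.
template <class T>
struct just_sender {
    T value;

    // The operation state produced by connecting the sender to a receiver.
    template <class Receiver>
    struct operation {
        T value;
        Receiver receiver;
        void start() { receiver.set_value(std::move(value)); }
    };

    template <class Receiver>
    operation<Receiver> connect(Receiver r) const {
        return {value, std::move(r)};
    }
};

template <class T>
just_sender<T> just(T value) { return {std::move(value)}; }

// A sender adaptor: applies f to the value sent by the upstream sender.
template <class Upstream, class F>
struct then_sender {
    Upstream upstream;
    F f;

    // The receiver handed to the upstream sender; it forwards f's result.
    template <class Receiver>
    struct then_receiver {
        F f;
        Receiver downstream;
        template <class V>
        void set_value(V&& v) { downstream.set_value(f(std::forward<V>(v))); }
    };

    template <class Receiver>
    auto connect(Receiver r) const {
        return upstream.connect(then_receiver<Receiver>{f, std::move(r)});
    }
};

template <class Upstream, class F>
then_sender<Upstream, F> then(Upstream s, F f) {
    return {std::move(s), std::move(f)};
}

// A receiver is just a callback object; this one stores the result.
struct store_receiver {
    int* out;
    void set_value(int v) { *out = v; }
};

}  // namespace toy

inline int run_example() {
    int calls = 0;
    // Composition is purely declarative: the lambda has not run yet.
    auto graph = toy::then(toy::just(20), [&calls](int x) { ++calls; return x * 2 + 2; });
    assert(calls == 0);

    int result = 0;
    auto op = graph.connect(toy::store_receiver{&result});
    assert(calls == 0);  // connecting still runs nothing
    op.start();          // only now does the value flow through the graph
    return result;
}
```

Calling run_example() returns 42, and the two asserts inside it document that neither composing nor connecting executes the lambda.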

It's just that the details involve a lot of complexity.

Being a typical C++ proposal, it aims to be a 100% solution for every use case. This requires a whole bunch of customization mechanisms to adapt algorithms for specific execution contexts, a whole lot of metaprogramming to compute the completion signature, and many new concepts and utility functions. If an end user wants to compose existing senders, that is simple enough by using the | operator or writing a coroutine and co_awaiting a sender, but the implementation underneath is complex and writing your own sender is non-trivial. Similar to std::ranges, this could lead to higher compile times, more work for the optimizer to eliminate all those abstractions, and unreadable error messages if you do something wrong.

One particular complexity I don't like is the idea of environments. A sender can have an associated environment, which is essentially a compile-time key-value store in the form of a named tuple of properties. Senders can add new properties to the environment or query existing ones, and composition preserves those changes. That way, you can implement a form of dependency injection. Let's say you have a third-party library that implements a sender-based algorithm. You use it by composing it with your own sender that sends the input values, e.g. by reading them from a file:

std::execution::sender auto some_algorithm(std::execution::sender auto input); // third-party
std::execution::sender auto read_file(std::string_view path); // your own code

int main() {
	auto input = read_file(path);
	auto value = some_algorithm(input);
	...
}
Composing two senders to build an algorithm

Suppose that both read_file and some_algorithm need to allocate memory. Since you have written read_file yourself, you use your fancy allocator to do that, and would like some_algorithm to do that as well. This can be done using the environment: If read_file puts the allocator in the environment, the implementation of some_algorithm can query the environment for an allocator and use one if provided. That way, you can customize the behavior without requiring additional parameters to some_algorithm; they are injected by the earlier senders. If there are a lot of parameters, it can be convenient.
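The injection described above can be sketched without any sender machinery. In the following toy model, everything is an assumption made for illustration: the key get_allocator_t, the environments empty_env and allocator_env, the counting_allocator, and the simplification that the environment is passed to some_algorithm as an ordinary argument instead of travelling along the sender chain. The point is only the pattern "query the environment, use the allocator if one is provided, otherwise fall back".

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

namespace toy {  // hypothetical names, not the std::execution API

struct get_allocator_t {};  // the property key

// An environment that provides no properties.
struct empty_env {};

// An environment that answers the allocator query.
template <class Alloc>
struct allocator_env {
    Alloc alloc;
    Alloc query(get_allocator_t) const { return alloc; }
};

// Detects at compile time whether Env answers a query for Key.
template <class Env, class Key, class = void>
struct has_property : std::false_type {};

template <class Env, class Key>
struct has_property<
    Env, Key,
    std::void_t<decltype(std::declval<const Env&>().query(Key{}))>>
    : std::true_type {};

// Stand-in for "your fancy allocator": counts how often it is used.
struct counting_allocator {
    int* allocations;
    void* allocate(std::size_t n) { ++*allocations; return ::operator new(n); }
    void deallocate(void* p) { ::operator delete(p); }
};

// Stand-in for the third-party algorithm. Returns true if it used an
// injected allocator and false if it fell back to ::operator new.
template <class Env>
bool some_algorithm(const Env& env) {
    if constexpr (has_property<Env, get_allocator_t>::value) {
        auto alloc = env.query(get_allocator_t{});
        void* scratch = alloc.allocate(64);
        alloc.deallocate(scratch);
        return true;
    } else {
        (void)env;  // nothing to query in this environment
        void* scratch = ::operator new(64);
        ::operator delete(scratch);
        return false;
    }
}

}  // namespace toy

inline int run_injection_example() {
    int count = 0;
    toy::allocator_env<toy::counting_allocator> env{toy::counting_allocator{&count}};
    bool injected = toy::some_algorithm(env);               // uses our allocator
    bool fallback = toy::some_algorithm(toy::empty_env{});  // uses the default
    assert(injected && !fallback);
    return count;  // number of allocations that went through our allocator
}
```

Note that some_algorithm has no allocator parameter; whether it uses one is decided entirely by what the environment happens to contain.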

In addition, the environment can be used to query information about the current sender chain itself. For example, a sender that schedules work on a thread pool can put the thread pool in the environment. The senders scheduled on it can then access the pool to schedule more work on the same pool they're executing on. In a way, the environment is the thread_local equivalent for senders/receivers.

However, environments also introduce the idea of environment-dependent senders. The prime example is std::execution::read_env(property). This is a sender that reads a specified property from the environment and sends its value. What is the type of the value it sends? Well, that depends on the environment in question. So, in general, the full completion signature is not known until all senders have been composed and we're ready to execute work. This can happen very far away from the actual site of the composition. Let's say that we accidentally associate a std::pmr::memory_resource* with the allocator property in our implementation of read_file, and not an Allocator. However, the implementation of some_algorithm expects an allocator, so when reading the property, it gets the wrong type, which will cause a compiler error. This error is only detected when we compose them in main. If we compose two senders from different libraries in a third library using the environment from a fourth place, it can require a lot of digging around to figure out what exactly went wrong.
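The type dependence can be demonstrated with a few lines of metaprogramming. This is again a toy model with made-up names (env_with, read_env_result_t, is_allocator_like, composes_with_some_algorithm), and the "allocator" requirement is simplified to "has a member allocate(n)". It shows that the type a read_env-style sender would send is computed from the environment, so the mismatch between a std::pmr::memory_resource* and an allocator only becomes visible once a concrete environment is plugged in.

```cpp
#include <cstddef>
#include <memory_resource>
#include <type_traits>
#include <utility>

namespace toy {  // hypothetical names, not the std::execution API

struct get_allocator_t {};  // the property key

// An environment that associates a value of any type with the key.
template <class Value>
struct env_with {
    Value value;
    Value query(get_allocator_t) const { return value; }
};

// The type a read_env-like sender would send: unknown until Env is known.
template <class Env>
using read_env_result_t =
    decltype(std::declval<const Env&>().query(get_allocator_t{}));

// Simplified stand-in for what some_algorithm expects from the property.
template <class T, class = void>
struct is_allocator_like : std::false_type {};

template <class T>
struct is_allocator_like<
    T, std::void_t<decltype(std::declval<T&>().allocate(std::size_t{1}))>>
    : std::true_type {};

// Whether some_algorithm could be composed with a given environment.
template <class Env>
constexpr bool composes_with_some_algorithm =
    is_allocator_like<read_env_result_t<Env>>::value;

using good_env = env_with<std::pmr::polymorphic_allocator<char>>;
using bad_env = env_with<std::pmr::memory_resource*>;  // the accidental pointer

// The same query yields a different type for each environment.
static_assert(std::is_same_v<read_env_result_t<bad_env>,
                             std::pmr::memory_resource*>);
static_assert(composes_with_some_algorithm<good_env>);
static_assert(!composes_with_some_algorithm<bad_env>);

}  // namespace toy
```

In this sketch the check is a readable boolean. In real code, the failure would surface as a template error at whatever point the final composition happens, which is the diagnostic problem described above.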

In St. Louis, Eric Niebler presented P3164, which is a great paper that aims to improve compiler error messages when misusing senders/receivers. It has to go through a lot of trouble to handle environment-dependent senders, which made me wonder whether environments are really worth it.

In addition to the inherent complexity of senders/receivers, there are also procedural concerns. P2300 is a big, complicated paper with a lot of trusted authors. Concerns were raised that maybe it wasn't reviewed properly, as committee members were not able to fully understand the intricate design details and instead just trusted the authors to have done a good enough job. I am also personally concerned about the large number of in-flight papers that modify the main proposal in some way. I don't want to end up in a situation where some parts of senders/receivers are in, but crucial modifications (such as the diagnostic improvements) miss the C++26 deadline and cannot be applied later on, when breaking changes are no longer possible.

So, while P2300 was adopted in the plenary vote and is now a part of the working draft which will become the C++26 standard, it was a very narrow vote with 1/3 voting against adoption. I expect some controversy to continue into the next meetings.

Reflection

P2996 adds reflection to C++. It features the ability to reflect on an entity, which results in a std::meta::info object. You can then use a standard library API to query information about it, create new std::meta::info objects and turn them back into code, or generate entirely new code. This can then be used to finally implement enum-to-string conversions (among other things).

On the language side, it has passed the evolution subcommittee and is currently undergoing review of the specification. However, the selected syntax for reflection (using a prefix ^ operator as in ^foo) can lead to parsing ambiguities with Objective-C blocks, which are supported in C++ by a Clang compiler extension. The author's second choice, a prefix % operator as in %foo, is also not ideal because you might want to pass the resulting std::meta::info object as a template parameter, writing code like some_template<%foo>. Unfortunately, <% is a digraph...
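In case you have never seen digraphs in the wild: they are alternative spellings for some punctuation tokens, and <% is the alternative spelling of {. The following snippet is valid standard C++ today (the function names are made up for illustration), which is why the lexer would read some_template<%foo> as some_template{foo> and not as a template argument list.

```cpp
// `<%` and `%>` are alternative tokens for `{` and `}`.
constexpr int answer() <%
    return 42;
%>

// `<:` and `:>` are alternative tokens for `[` and `]`.
constexpr int first_of_three() <%
    int values<:3:> = <% 7, 8, 9 %>;
    return values<:0:>;
%>

static_assert(answer() == 42);
static_assert(first_of_three() == 7);
```

A space, as in some_template< %foo>, would avoid the digraph, but requiring that of every user is not a great look for new syntax.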

Meanwhile, library evolution reviewed the proposed library API to query information about the std::meta::info objects. Essentially, a std::meta::info object is an opaque handle to the internal compiler AST, and there are a bunch of functions to get information. Note that it doesn't matter what you reflect: a function, a type, an expression, and a namespace all result in the same type, std::meta::info. This is to ensure that future changes in C++ would not cause API or ABI breaks in the reflection API. As such, the compiler cannot detect nonsensical function calls using the type system. For example, it does not make sense to call std::meta::name_of on the reflection of the integer 42, but the call compiles. We spent a long time discussing whether that should be a precondition violation or return an empty string (it is obviously a precondition violation). We spent even longer discussing the behavior of std::meta::name_of(reflection_of_anonymous_namespace), but ultimately (correctly?) decided that it should be an empty string, since an anonymous namespace is an entity that could have a name but just doesn't, unlike the integer 42, where it does not make sense to ask for a name. The API review is far from done, but it is going to continue on telecons, to ensure that reflection becomes a part of C++26.
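Here is a toy model of the two cases we discussed. It has nothing to do with how a compiler represents std::meta::info; the namespace toy_meta, the kind enum, and the fields of info are invented purely to illustrate the distinction between "has no name" (empty string) and "cannot have a name" (precondition violation) when everything shares a single handle type.

```cpp
#include <cassert>
#include <string_view>

namespace toy_meta {  // hypothetical model, not the P2996 API

enum class kind { type, function, namespace_, value };

// One handle type for everything, so the type system cannot tell a
// reflection of a namespace apart from a reflection of the integer 42.
struct info {
    kind what;
    std::string_view name;  // empty if the entity has no name
};

constexpr bool can_have_name(info r) { return r.what != kind::value; }

// Precondition: r reflects something that could have a name.
// An unnamed entity (e.g. an anonymous namespace) yields an empty string.
constexpr std::string_view name_of(info r) {
    assert(can_have_name(r));  // asking for the name of 42 is a bug
    return r.name;
}

constexpr info reflection_of_vector{kind::type, "vector"};
constexpr info reflection_of_anonymous_namespace{kind::namespace_, ""};
constexpr info reflection_of_42{kind::value, ""};

static_assert(name_of(reflection_of_vector) == "vector");
static_assert(name_of(reflection_of_anonymous_namespace).empty());
// name_of(reflection_of_42) compiles as a call, but violates the precondition.

}  // namespace toy_meta
```

The call on reflection_of_42 type-checks just like the other two, which is exactly the issue: only the precondition separates the sensible calls from the nonsensical one.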

My only remaining concern with it is the coupling with the rest of the standard library. For example, std::meta::members_of returns a compile-time std::vector<std::meta::info> of the members of a class. I don't like that. Philosophically, a low-level API to query information from the compiler should not depend on the standard library. Many projects don't use std::vector, since they can implement it better. However, they cannot implement a reflection API; only the compiler can. So using the existing API forces everybody to depend on std::vector. Furthermore, unless you use modules, including <vector> and <string_view> (for std::meta::name_of) can significantly bloat compile times if you have carefully avoided the use of standard library headers. As reflection code has to live in a header file and cannot be hidden away in a .cpp, this is a problem. Implementers also told me that it is really expensive to construct a std::vector at compile-time as opposed to some custom built-in types that the compiler is aware of. As such, I will write a paper to propose a decoupling by switching to e.g. std::meta::info_array and const char* instead of std::vector and std::string_view.

I am nevertheless excited to have reflection in C++26.

C++ borrow checking

Due to the increased push towards safe programming languages, the C++ committee has established study group SG 23 to discuss safety and security proposals. I attended their meetings on Tuesday.

Bjarne Stroustrup presented P3274, his proposal to add safety profiles. The idea is to selectively opt in to compile-time and runtime checks in parts of your code using attributes. For example, you could opt in to a profile that does range checks to prevent out-of-bounds range access. It's certainly the only way we can adopt more checks into C++ without breaking existing code, but the paper itself does not yet propose concrete profiles in much detail.

Meanwhile, Sean Baxter has implemented the entirety of Rust's borrow checker in his Circle C++ compiler. It introduces borrow-checked references with lifetime annotations, which are then checked just like in Rust. This is really impressive work, and shows that there is no technical problem with adding a safe subset of C++. However, there is a problem of adoption. Such an invasive change in the type system means that you will not have borrow checking when calling library functions that still use regular references (which is all of them). For his demos, he had to write his own std2::string_view and std2::vector to get the benefits. This essentially forks the language into safe and unsafe libraries, and collectively everything has to be rewritten to the safe version. You also cannot mark a function as safe unless all called functions are safe (or you use unsafe blocks). This requires a bottom-up (partial) rewrite. However, if you're rewriting code anyway, why not rewrite it in Rust and get all the other benefits of adopting the Rust ecosystem for free?

Ultimately, I don't think C++ will, or needs to, become a safe language. We already have a safe systems programming language: Rust. If you're writing code where safety is important, you should use Rust. Instead, we should focus our efforts on improving interop with Rust. That way, C++ can easily use safe modules for critical code like networking or parsing, and Rust can access the vast amount of legacy C++ code.

— by Jonathan Müller

Do you have feedback? Send us a message at devblog@think-cell.com !
