class: middle # From Iterators To Ranges ## The Upcoming Evolution Of the Standard Library --- # Questions - Who knows Boost.Range? - Who knows Eric Niebler's Range V3 library? - Who uses ranges in everyday programming? --- # Why Ranges? ```cpp std::vector
vec=...; std::sort( vec.begin(), vec.end() ); vec.erase( std::unique( vec.begin, vec.end() ), vec.end() ); ``` How often do we have to mention `vec`? Pairs of iterators belong together -> put into one object! ```cpp tc::unique_inplace(tc::sort(vec)); ``` Much nicer! --- # Why do I think I know something about ranges? - think-cell has a range library * evolved from Boost.Range - 1 million lines of production code use it - Library and production code evolve together * ready to change library and production code anytime * no obstacle to library design changes * large code base to try them out --- # Ranges in C++11 - range-based for ```cpp for( int& i :
) { ... } ``` - universal access to begin/end ```cpp std::begin/end(
) ``` -- - that's all ... :-( --- #Future of Ranges - Eric Niebler's pet project - Ranges TS ("Technical Specification") * `
` supports ranges ```cpp namespace ranges { template< typename Rng, typename What > decltype(auto) find( Rng && rng, What const& what ) { return std::find( ranges::begin(rng), ranges::end(rng), what ); } } ``` - Range V3 * preview of what Eric wants to standardize --- #What are Ranges? - Containers ``` vector string list ``` * own elements * deep copying * copying copies elements in O(N) * deep constness * `const` objects implies `const` elements -- - Views ``` Range /\ / \ / \ Container View ``` --- #Views ```cpp template
struct iterator_range { It m_itBegin; It m_itEnd; It begin() const { return m_itBegin; } It end() const { return m_itEnd; } }; ``` - reference elements - shallow copying * copying copies reference in O(1) - shallow constness * view object `const` independent of element `const` --- #More Interesting Views: Range Adaptors ```cpp std::vector
v; auto it=ranges::find( v, 4 ); // first element of value 4. ``` vs. ```cpp struct A { int id; double data; }; std::vector
v={...}; auto it=ranges::find_if( v, [](A const& a){ return a.id==4; } // first element of value 4 in id ); ``` - Similar in semantics - Not at all similar in syntax --- #Transform Adaptor ```cpp std::vector
v; auto it=ranges::find( v, 4 ); // first element of value 4. ``` vs. ```cpp struct A { int id; double data; }; std::vector
v={...}; auto it=ranges::find( tc::transform(v,std::mem_fn(&A::id)), 4 ); // first element of value 4 in id ``` --- #Transform Adaptor (2) ```cpp struct A { int id; double data; }; std::vector
v={...}; auto it=ranges::find( tc::transform(v,std::mem_fn(&A::id)), 4 ); // first element of value 4 in id ``` What is `it` pointing to? -- - `int`! -- What if I want `it` to point to `A`? -- ```cpp auto it=ranges::find( tc::transform(v,std::mem_fn(&A::id)), 4 ).base(); ``` --- #Transform Adaptor Implementation ```cpp template
struct transform_range { struct iterator { private: Func m_func; // in every iterator, hmmm... decltype( tc::begin(std::declval
()) ) m_it; public: decltype(auto) operator*() const { return m_func(*m_it); } decltype(auto) base() const { return (m_it); } ... }; }; ``` --- #Filter Adaptor Range of all `a` with `a.id==4`? ```cpp tc::filter( v, [](A const& a){ return 4==a.id; } ); ``` * Lazy! Filter executed while iterating --- #Filter Adaptor Implementation ```cpp template
struct filter_range { struct iterator { private: Func m_func; // functor and TWO iterators! decltype( ranges::begin(std::declval
()) ) m_it; decltype( ranges::begin(std::declval
()) ) m_itEnd; public: iterator& operator++() { ++m_it; while( m_it!=m_itEnd && !static_cast
(m_func(*m_it)) ) ++m_it; // why static_cast
? return *this; } ... }; }; ``` --- #How does iterator look like of `tc::filter(tc::filter(tc::filter(...)))` ? --- ```cpp m_func3 m_it3 m_func2 m_it2 m_func1 m_it1; m_itEnd1; m_itEnd2 m_func1 m_itEnd1; m_itEnd1; m_itEnd3 m_func2 m_it2 m_func1 m_itEnd1; m_itEnd1; m_itEnd2 m_func1 m_itEnd1; m_itEnd1; ``` Boost.Range does this! ARGH! --- #More Efficient Range Adaptors Must keep iterators small Idea: adaptor object carries everything that is common for all iterators ```cpp m_func m_itEnd ``` Iterators carry reference to adaptor object (for common stuff) and base iterator ```cpp *m_rng m_it ``` -- - Iterator cannot outlive its range --- #Again: How does iterator look like of ```cpp tc::filter(tc::filter(tc::filter(...))) ? ``` ```cpp m_rng3 m_it3 m_rng2 m_it2 m_rng1 m_it1 ``` - Range V3 State of The Art - Still not insanely great... --- #Index Concept Index - Like iterator - But all operations require its range object ```cpp template
struct index_range { ... using Index=...; Index begin_index() const; Index end_index() const; void increment_index( Index& idx ) const; void decrement_index( Index& idx ) const; reference dereference( Index& idx ) const; ... }; ``` --- #Index-Iterator Compatibility - Index from Iterator * `using Index = Iterator` * Index operations = Iterator operations - Iterator from Index ```cpp template
struct iterator_for_index { IndexRng* m_rng typename IndexRng::Index m_idx; iterator& operator++() { m_rng.increment_index(m_idx); return *this; } ... }; ``` --- #Super-Efficient Range Adaptors With Indices Index-based filter_range ```cpp template
struct filter_range { Func m_func; Base& m_base; using Index=typename Base::Index; void increment_index( Index& idx ) const { do { m_base.increment_index(idx); } while( idx!=m_base.end_index() && !m_func(m_base.dereference_index(idx)) ); } }; ``` --- #Super-Efficient Range Adaptors With Indices Index-based filter_range ```cpp template
struct filter_range { Func m_func; Base& m_base; using Index=typename Base::Index; ... ``` ```cpp template
struct iterator_for_index { IndexRng* m_rng typename IndexRng::Index m_idx; ... ``` - All iterators are two pointers * irrespective of stacking depth --- #Range V3 and rvalue containers If adaptor input is lvalue container - `view::filter` creates view - view is reference, O(1) copy, shallow constness etc. ```cpp auto rng=view::filter(vec,pred1); bool b=ranges::any_of(rng,pred2); ``` --- #Range V3 and rvalue containers If adaptor input is rvalue container - `view::filter` cannot create view - view would hold dangling reference to rvalue ```cpp auto rng=view::filter(create_vector(),pred1); // DOES NOT COMPILE bool b=ranges::any_of(rng,pred2); ``` --- #Range V3 and rvalue containers If adaptor input is rvalue container - `view::filter` cannot create view - view would hold dangling reference to rvalue ```cpp auto rng=action::filter(create_vector(),pred1); // COMPILES bool b=ranges::any_of(rng,pred2); ``` -- Big Trap - not lazy anymore! --- #think-cell and rvalue containers If adaptor input is lvalue container - `tc::filter` creates view - view is reference, O(1) copy, shallow constness etc. If adaptor input is rvalue container - `tc::filter` creates container - aggregates rvalue container, deep copy, deep constness etc. Always lazy - Laziness and container-ness are orthogonal concepts ```cpp auto rng=tc::filter(vec,pred1); bool b=ranges::any_of(rng,pred2); ``` ```cpp auto rng=tc::filter(creates_vector(),pred1); bool b=ranges::any_of(rng,pred2); ``` --- ##Beyond Range V3: #More Flexible Algorithm Returns ```cpp template< typename Rng, typename What > decltype(auto) find( Rng && rng, What const& what ) { auto const itEnd=ranges::end(rng); for( auto it=ranges::begin(rng); it!=itEnd; ++it ) if( *it==what ) return it; return itEnd; } ``` --- #More Flexible Algorithm Returns (2) ```cpp template< typename Pack, typename Rng, typename What > decltype(auto) find( Rng && rng, What const& what ) { auto const itEnd=ranges::end(rng); for( auto it=ranges::begin(rng); it!=itEnd; ++it ) if( *it==what ) return Pack::pack(it,rng); return Pack::pack_singleton(rng); } ``` ```cpp struct return_iterator_or_end { static auto pack(auto it, auto&& rng) { return it; } static auto pack_singleton(auto && rng) { return ranges::end(rng); } } ``` ```cpp auto it=find
(...) ``` --- #More Flexible Algorithm Returns (3) ```cpp template< typename Pack, typename Rng, typename What > decltype(auto) find( Rng && rng, What const& what ) { auto const itEnd=ranges::end(rng); for( auto it=ranges::begin(rng); it!=itEnd; ++it ) if( *it==what ) return Pack::pack(it,rng); return Pack::pack_singleton(rng); } ``` ```cpp *struct return_iterator { static auto pack(auto it, auto&& rng) { return it; } static auto pack_singleton(auto && rng) { * std::assert(false); return ranges::end(rng); } } ``` ```cpp auto it=find
(...) ``` --- #More Flexible Algorithm Returns (4) ```cpp template< typename Pack, typename Rng, typename What > decltype(auto) find( Rng && rng, What const& what ) { auto const itEnd=ranges::end(rng); for( auto it=ranges::begin(rng); it!=itEnd; ++it ) if( *it==what ) return Pack::pack(it,rng); return Pack::pack_singleton(rng); } ``` ```cpp *struct return_head { static auto pack(auto it, auto&& rng) { * return tc::take( std::forward
(rng), it); } static auto pack_singleton(auto&& rng) { std::assert(false); return tc::take( std::forward
(rng), ranges::begin(rng)); } } ``` ```cpp auto it=find
(...) ``` --- #Generator Ranges ```cpp template
void traverse_widgets( Func func ) { if( window1 ) { window1->traverse_widgets(std::ref(func)); } func(button1); func(listbox1); if( window2 ) { window2->traverse_widgets(std::ref(func)); } } ``` - like range of widgets - but no iterators --- #Generator Ranges ```cpp template
void traverse_widgets( Func func ) { if( window1 ) { window1->traverse_widgets(std::ref(func)); } func(button1); func(listbox1); if( window2 ) { window2->traverse_widgets(std::ref(func)); } } ``` ```cpp mouse_hit_any_widget=tc::any_of( [](auto func){ traverse_widgets(func); }, [](auto const& widget) { return widget.mouse_hit(); } ); ``` --- #External Iteration * Consumer calls producer to get new element * example: C++ iterators ``` ^ | Stack Producer Producer | / \ / \ Consumer Consumer Consumer ``` - Consumer is at bottom of stack - Producer is at top of stack --- #External iteration (2) Consumer is at bottom of stack * contiguous code path for whole range * easier to write * better performance * state encoded in instruction pointer * no limit for stack memory Producer is at top of stack * contiguous code path for each item * harder to write * worse performance * single entry point, must restore state * fixed amount of memory or go to heap --- #Internal Iteration * Producer calls consumer to offer new element * example: EnumThreadWindows ``` ^ | Stack Consumer Consumer | / \ / \ Producer Producer Producer ``` Producer is at bottom of stack * ... all the advantages of being bottom of stack ... Consumer is at top of stack * ... all the disadvantages of being top of stack ... --- #Coroutines Can both consumer and producer be bottom-of-stack? * Yes, with coroutines ```cpp // does not compile, conceptual generator
traverse_widgets() { if( window1 ) { window1->traverse_widgets(); } yield button1; yield listbox1; if( window2 ) { window2->traverse_widgets(); } } ``` --- #Coroutines (2) - Stackful * use two stacks and switch between them * very expensive - implemented as OS fibers - 1 MB of virtual memory per coroutine - Stackless (N4402) * can only yield in top-most function ```cpp // does not compile, conceptual generator
traverse_widgets() { if( window1 ) { * window1->traverse_widgets(); } yield button1; yield listbox1; if( window2 ) { * window2->traverse_widgets(); } } ``` --- #Coroutines (2) - Stackful * use two stacks and switch between them * very expensive - implemented as OS fibers - 1 MB of virtual memory per coroutine - Stackless (N4402) * can only yield in top-most function * still a bit expensive - dynamic jump to resume point - save/restore some registers - no aggressive inlining --- #Internal Iteration often good enough Algorithm | Internal Iteration? ----------------------- | ------------------- find | no (single pass iterators) binary_search | no (random access iterators) -- for_each | yes accumulate | yes all_of | yes any_of | yes none_of | yes ... | -- Adaptor | Internal Iteration? ----------------------- | ------------------- tc::filter | yes tc::transform | yes So allow ranges that support only internal iteration! --- #any_of implementation ```cpp namespace tc { template< typename Rng > bool any_of( Rng const& rng ) { bool bResult=false; tc::enumerate( rng, [&](bool_context b){ bResult=bResult || b; } ); return bResult; } } ``` - `tc::enumerate` is common interface for iterator, index and generator ranges - Ok? -- * std::any_of stops when true is encountered! --- #Interruptable Generator Ranges First idea: exception! -- - too slow:-( -- Second idea: ```cpp enum break_or_continue { break_, continue_ }; ``` ```cpp template< typename Rng > bool any_of( Rng const& rng ) { bool bResult=false; tc::enumerate( rng, [&](bool_context b){ bResult=bResult || b; return bResult ? break_ : continue_; } ); return bResult; } ``` --- #Interruptable Generator Ranges (2) - Generator Range can elide `break_` check * If functor returns `break_or_continue`, * break if `break_` is returned. * If functor returns anything else, * nothing to check, always continue --- #concat ```cpp std::list
lst; std::vector
vec; std::for_each( tc::concat(lst,vec), [](int i) { ... }); ``` --- #concat implementation with indices ```cpp template
struct concat_range { private: using Index1=typename range_index
::type; using Index2=typename range_index
::type; Rng1& m_rng1; Rng2& m_rng2; using index=std::variant
; public: ... ``` --- #concat implementation with indices (2) ```cpp ... void increment_index(index& idx) { idx.switch( [&](Index1& idx1){ m_rng1.increment_index(idx1); if (m_rng1.at_end_index(idx1)) { idx=m_rng2.begin_index(); } }, [&](Index2& idx2){ m_rng2.increment_index(idx2); } ); } ... ``` - Branch for each increment! --- #concat implementation with indices (3) ```cpp ... auto dereference_index(index const& idx) const { return idx.switch( [&](Index1 const& idx1){ return m_rng1.dereference(idx1); }, [&](Index2 const& idx2){ return m_rng2.dereference(idx2); } ); } ... }; ``` - Branch for each dereference! - How avoid all these branches? -- * With Generator Ranges! --- #concat implementation as generator range ```cpp template
struct concat_range { private: Rng1 m_rng1; Rng2 m_rng2; public: ... // version for non-breaking func template
void operator()(Func func) { tc::fn_enumerate(m_rng1, func); tc::fn_enumerate(m_rng2, func); } }; ``` - Even iterator-based ranges sometimes perform better with generator interface! --- Now that we have all this range stuff - URL of our range library: https://github.com/think-cell/think-cell-library -- #I hate the range-based for loop! -- because it encourages people to write this ```cpp bool b=false; for( int n : rng ) { if( is_prime(n) ) { b=true; break; } } ``` -- instead of this ```cpp bool b=tc::any_of( rng, is_prime ); ``` THANK YOU!
Compartilhar
E-mail
Facebook
LinkedIn
Twitter
XING
×