Implement P0040R3 parallel specialized memory algorithms #3145

frederick-vs-ja · 2022-10-08T04:06:52Z

Fixes #525.

I decided to parallelize the following algorithms:

destroy
destroy_n
uninitialized_default_construct
uninitialized_default_construct_n
uninitialized_value_construct
uninitialized_value_construct_n

because IMO they are similar enough to for_each and for_each_n.
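For readers unfamiliar with the execution-policy overloads from P0040R3, here is a minimal sketch of how the parallelized functions are called. `Tracked` and `construct_then_destroy` are hypothetical names invented for this illustration; they are not part of the PR or of the STL.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <execution>
#include <memory>

// Hypothetical element type: counts live objects so that construction and
// destruction can be observed from outside.
struct Tracked {
    static inline std::atomic<int> live{0};
    int value{};
    Tracked() { ++live; }
    ~Tracked() { --live; }
};

// Hypothetical helper: value-constructs `count` objects in raw storage using
// the given execution policy, destroys them again with the same policy, and
// returns how many objects were alive in between.
template <class ExecutionPolicy>
int construct_then_destroy(ExecutionPolicy&& policy, const std::size_t count) {
    std::allocator<Tracked> alloc;
    Tracked* const storage = alloc.allocate(count);

    // Execution-policy overload; returns the end of the constructed range.
    Tracked* const last = std::uninitialized_value_construct_n(policy, storage, count);
    assert(last == storage + count);
    const int live_after_construct = Tracked::live.load();

    // Execution-policy overload of destroy over the same range.
    std::destroy(policy, storage, last);
    alloc.deallocate(storage, count);
    return live_after_construct;
}
```

Calling `construct_then_destroy(std::execution::par, n)` exercises the overloads this PR parallelizes; passing `std::execution::seq` runs the same code sequentially.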

The following algorithms are currently not parallelized:

uninitialized_copy
uninitialized_copy_n
uninitialized_fill
uninitialized_fill_n
uninitialized_move
uninitialized_move_n

because the corresponding ordinary algorithms in <algorithm> are not parallelized either.

I doubt that the copy/move/fill families are really not worth parallelizing, but perhaps we should address that in a separate issue.


AlexGuteniev · 2022-10-09T11:14:17Z

Do we want a benchmark to show that parallelizing these actually works?
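A rough sketch of what such a benchmark could look like, assuming a simple wall-clock comparison is enough. `Payload` and `time_construct_ms` are hypothetical names for this illustration; the PR itself does not include a benchmark, and no timings are claimed here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <execution>
#include <memory>

// Hypothetical element type with a non-trivial default constructor, so that
// construction cannot be reduced to a no-op.
struct Payload {
    double value;
    Payload() : value(1.5) {}
};

// Hypothetical helper: returns the milliseconds spent default-constructing
// `count` objects in raw storage under the given execution policy.
template <class ExecutionPolicy>
double time_construct_ms(ExecutionPolicy&& policy, const std::size_t count) {
    std::allocator<Payload> alloc;
    Payload* const storage = alloc.allocate(count);

    const auto start = std::chrono::steady_clock::now();
    std::uninitialized_default_construct_n(policy, storage, count);
    const auto stop = std::chrono::steady_clock::now();

    // Read a result back so the constructed objects are actually observed.
    if (count != 0) {
        assert(storage[count - 1].value == 1.5);
    }

    std::destroy_n(storage, count);
    alloc.deallocate(storage, count);
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Comparing `time_construct_ms(std::execution::seq, n)` with `time_construct_ms(std::execution::par, n)` over several sizes and repeated runs would give a first impression; a serious benchmark would also need warm-up runs and a variety of element types.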


CaseyCarter · 2023-01-05T00:03:33Z

tests/std/tests/P0040R3_parallel_memory_algorithms/test.cpp

+struct test_case_uninitialized_value_construct_unwrap_parallel {
+    template <class ExecutionPolicy>
+    void operator()(const size_t testSize, const ExecutionPolicy& exec) {
+        auto vec            = vector<int>(testSize, bad_int);


To other reviewers: I'm not complaining about the use of AAA here, because I appreciate how it increases consistency in these test cases. I prefer auto x = f(args...); and auto y = T(args...); to auto x = f(args...); and T y(args...);. YMMV. (Also on 208.) (No change requested.)

I'm (reluctantly) okay with this rationale.
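To make the style point concrete, the two spellings below build the same vector. This is only an illustration: the value given to `bad_int` here is an assumption (the test's actual definition is not shown in this thread), and `make_aaa` and `make_direct` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed sentinel value, for illustration only.
inline constexpr int bad_int = -1;

// "Almost always auto" spelling, as used in the test under review.
inline std::vector<int> make_aaa(const std::size_t testSize) {
    auto vec = std::vector<int>(testSize, bad_int);
    return vec;
}

// Traditional direct-initialization spelling.
inline std::vector<int> make_direct(const std::size_t testSize) {
    std::vector<int> vec(testSize, bad_int);
    return vec;
}

// Both spellings produce identical contents.
inline void check_equivalent(const std::size_t testSize) {
    assert(make_aaa(testSize) == make_direct(testSize));
}
```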

StephanTLavavej · 2023-01-20T02:40:22Z

stl/inc/execution

+        for (; _Count > 0; --_Count, (void) ++_UFirst) {
+            _STD _Construct_in_place(*_UFirst);
+        }
+        _STD _Seek_wrapped(_First, _UFirst);


No change requested: I observe that we could _STD _Seek_wrapped(_First, _Operation._Basis._Populate(_Operation._Team, _UFirst)); above (right line 5186), then exhaust parallelism resources and fall through to here, where we _Seek_wrapped again. However, this is correct because _Seek_wrapped is a "seek to"/assignment, not a "seek by", and there are no other uses of _First that could be damaged by it having been unexpectedly updated.
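The "seek to" versus "seek by" distinction can be sketched with a simplified stand-in. `Wrapped`, `seek_to`, and `zero_fill_with_fallback` are hypothetical names; this is not the STL's `_Seek_wrapped`, only a model of why assigning the position twice is harmless.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a wrapped (checked) iterator.
struct Wrapped {
    int* ptr;
};

// "Seek to": assigns an absolute position, so repeating it is harmless.
inline void seek_to(Wrapped& it, int* const target) {
    it.ptr = target;
}

// Models the control flow described above: an attempted first path updates
// `first`, then the serial fallback runs from the unwrapped start and
// updates `first` again.
inline void zero_fill_with_fallback(Wrapped& first, const std::size_t count) {
    int* ufirst = first.ptr; // unwrap once, before anything modifies `first`

    // Pretend the parallel attempt computed the end, updated `first`,
    // and then ran out of resources.
    int* const attempted_end = ufirst + count;
    seek_to(first, attempted_end);

    // The serial fallback iterates from `ufirst`, never from `first`.
    for (std::size_t remaining = count; remaining > 0; --remaining, ++ufirst) {
        *ufirst = 0;
    }

    seek_to(first, ufirst); // second assignment lands on the same position
    assert(first.ptr == attempted_end);
}
```

Had the second call been a "seek by" (advancing `first` by `count` from its current position), the earlier update would have pushed it past the end of the range.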

StephanTLavavej · 2023-01-20T03:02:22Z

tests/std/tests/P0040R3_parallel_memory_algorithms/test.cpp

+    unique_ptr<T, deallocating_only_deleter<T>> up{al.allocate(n), deallocating_only_deleter<T>{n}};
+    for (size_t i = 0; i != n; ++i) {
+        allocator_traits<allocator<T>>::construct(al, up.get() + i);
+    }


No change requested: it seems unusual that an array of elements is being stored in unique_ptr<T, DEL> instead of unique_ptr<T[], DEL>. However, I am unsure of the implications of making such a change (some are positive; unique_ptr<T[]> disables conversions that are bogus for arrays). This is only test code, so it's allowed to be somewhat squirrelly. 🐿️
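For comparison, a sketch of what the array form might look like. The definition of `deallocating_only_deleter` below is an assumption reconstructed from how it is used in the snippet above (the test's real definition is not shown here), and `make_constructed_array`, `Base`, and `Derived` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Assumed shape of the deleter: it only returns the storage to the allocator
// and never runs destructors (the algorithm under test is expected to do that).
template <class T>
struct deallocating_only_deleter {
    std::size_t count;
    void operator()(T* const ptr) const noexcept {
        std::allocator<T>{}.deallocate(ptr, count);
    }
};

// Same construction loop as in the test, but owning the storage through
// unique_ptr<T[], D>, which also provides operator[].
template <class T>
std::unique_ptr<T[], deallocating_only_deleter<T>> make_constructed_array(const std::size_t n) {
    std::allocator<T> al;
    std::unique_ptr<T[], deallocating_only_deleter<T>> up{al.allocate(n), deallocating_only_deleter<T>{n}};
    for (std::size_t i = 0; i != n; ++i) {
        std::allocator_traits<std::allocator<T>>::construct(al, up.get() + i);
    }
    return up;
}

// The array form rejects the derived-to-base conversion that the
// single-object form permits.
struct Base {};
struct Derived : Base {};
static_assert(std::is_convertible_v<std::unique_ptr<Derived>, std::unique_ptr<Base>>);
static_assert(!std::is_convertible_v<std::unique_ptr<Derived[]>, std::unique_ptr<Base[]>>);
```

With this form, `make_constructed_array<int>(8)` yields eight value-initialized `int`s reachable as `up[0]` through `up[7]`.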

Changes requested in Nov 2022 have been made, purr!

StephanTLavavej · 2023-01-21T05:05:36Z

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

StephanTLavavej · 2023-01-21T09:47:52Z

I pushed a conflict-free merge with main, followed by a commit to disable the test for /clr due to VSO-1664463 "/clr C++20 alignas emits error C3821 'managed type or function cannot be used in an unmanaged function' instead of falling back to native codegen".

StephanTLavavej · 2023-01-22T05:46:32Z

Thanks for truly finishing C++17! 😻 💯 🎉

frederick-vs-ja added 3 commits October 8, 2022 11:46

Test files for parallel memory algorithms

2ab5b2d

Add P0040R3_parallel_memory_algorithms to list

bb38f91

frederick-vs-ja requested a review from a team as a code owner October 8, 2022 04:06

frederick-vs-ja added 3 commits October 8, 2022 12:47

Remove UTF-8 BOM

df9f83e

Why can't we just ignore it?

Fix uninitialized_move_n and uninitialized_fill

03d5cbb

Fix the test file

f12b6be

CaseyCarter added the performance Must go faster label Oct 10, 2022

CaseyCarter added bug Something isn't working and removed performance Must go faster labels Oct 10, 2022

StephanTLavavej self-assigned this Oct 12, 2022

StephanTLavavej assigned barcharcraz Nov 2, 2022

barcharcraz previously requested changes Nov 9, 2022


StephanTLavavej unassigned barcharcraz and StephanTLavavej Nov 9, 2022

frederick-vs-ja added 9 commits November 21, 2022 01:46

Address @barcharcraz's review comments

c387462

Address @barcharcraz's review comments (<execution>)

c48f9d7

Add memset test cases and check the returned iterators

97c4f54

Merge branch 'microsoft:main' into parallel-memory-algorithms

aa22fc2

Clang-format

d96ccc3

Clang-format

e963846

Use usual_17_matrix.lst for the test

4aa74a8

Fix missing _REQUIRE_CPP17_MUTABLE_LVALUE_ITERATOR

9294b4f

Remove DUPLICATED destroy!

c4d8a34

StephanTLavavej assigned barcharcraz Nov 30, 2022

Merge branch 'microsoft:main' into parallel-memory-algorithms

54404af

frederick-vs-ja and others added 2 commits December 17, 2022 01:43

Drop const for parameter objects in forward declarations

c02455d

Revert Casey's reinterpret_cast suggestion

300d413

CaseyCarter self-requested a review December 16, 2022 19:44

Consistency pass

f2ae6e7

CaseyCarter approved these changes Dec 16, 2022


frederick-vs-ja added 2 commits December 17, 2022 16:20

Test coverage for wrapped iterators

9760926

Eliminate possibly problematic iterator arithmetic

a8c8d54

StephanTLavavej requested a review from CaseyCarter January 4, 2023 22:28

StephanTLavavej assigned CaseyCarter and StephanTLavavej Jan 4, 2023

CaseyCarter approved these changes Jan 5, 2023


CaseyCarter removed their assignment Jan 5, 2023

StephanTLavavej approved these changes Jan 20, 2023


StephanTLavavej removed their assignment Jan 20, 2023

StephanTLavavej self-assigned this Jan 21, 2023

StephanTLavavej added 2 commits January 21, 2023 01:45

Merge branch 'main' into parallel-memory-algorithms

455223d

Fix /clr.

ca84b3f

StephanTLavavej approved these changes Jan 21, 2023


CaseyCarter approved these changes Jan 21, 2023


StephanTLavavej merged commit c42483d into microsoft:main Jan 22, 2023

StephanTLavavej mentioned this pull request Jan 22, 2023

ASAN tests are sporadically failing #2908

Closed

frederick-vs-ja deleted the parallel-memory-algorithms branch January 22, 2023 06:57
