`<charconv>`: `_Divide` can trim `_Big_integer_flt` more efficiently #2623

StephanTLavavej · 2022-03-29T02:52:39Z

We have a comment explaining how _Big_integer_flt accesses only [0, _Myused) and tolerates all other elements being garbage:

STL/stl/inc/charconv

Lines 366 to 369 in 2787718

    
           // PERFORMANCE NOTE: We intentionally do not initialize the _Mydata array when a _Big_integer_flt object is constructed.
 
           // Profiling showed that zero-initialization caused a substantial performance hit. Initialization of the _Mydata
 
           // array is not necessary: all operations on the _Big_integer_flt type are carefully written to only access elements at
 
           // indices [0, _Myused), and all operations correctly update _Myused as the utilized size increases.

Now, look at the following code at the end of _Divide:

STL/stl/inc/charconv

Lines 1034 to 1045 in 2787718

    
           // Trim the remainder:
 
           for (uint32_t _Ix = _Max_numerator_element_index + 1; _Ix < _Numerator._Myused; ++_Ix) {
 
               _Numerator._Mydata[_Ix] = 0;
 
           }
 
           uint32_t _Used = _Max_numerator_element_index + 1;
 
           while (_Used != 0 && _Numerator._Mydata[_Used - 1] == 0) {
 
               --_Used;
 
           }
 
           _Numerator._Myused = _Used;

I claim that the first loop is unnecessary. It's zeroing out elements in the range [_Max_numerator_element_index + 1, original _Myused). Then, the following code updates _Myused to at most _Max_numerator_element_index + 1. (Possibly less, if we discover that there are more zero elements at the end of this range.)

According to _Big_integer_flt's invariants, the _Myused update already makes the contents of the range [_Max_numerator_element_index + 1, original _Myused) irrelevant - thus we don't need to spend extra time zeroing them out.

Noticed while working on #2366.

…mes garbage elements.

StephanTLavavej · 2022-04-01T20:20:34Z

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

This loop in _Divide is unnecessary because _Big_integer_flt assu…

16a4e7d

…mes garbage elements.

StephanTLavavej added the performance Must go faster label Mar 29, 2022

StephanTLavavej requested a review from a team as a code owner March 29, 2022 02:52

CaseyCarter approved these changes Mar 30, 2022

View reviewed changes

StephanTLavavej self-assigned this Apr 1, 2022

StephanTLavavej merged commit 57b28c7 into microsoft:main Apr 4, 2022

StephanTLavavej deleted the unnecessary-trimming branch April 4, 2022 20:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`<charconv>`: `_Divide` can trim `_Big_integer_flt` more efficiently #2623

`<charconv>`: `_Divide` can trim `_Big_integer_flt` more efficiently #2623

StephanTLavavej commented Mar 29, 2022

StephanTLavavej commented Apr 1, 2022

	// PERFORMANCE NOTE: We intentionally do not initialize the _Mydata array when a _Big_integer_flt object is constructed.
	// Profiling showed that zero-initialization caused a substantial performance hit. Initialization of the _Mydata
	// array is not necessary: all operations on the _Big_integer_flt type are carefully written to only access elements at
	// indices [0, _Myused), and all operations correctly update _Myused as the utilized size increases.

	// Trim the remainder:
	for (uint32_t _Ix = _Max_numerator_element_index + 1; _Ix < _Numerator._Myused; ++_Ix) {
	_Numerator._Mydata[_Ix] = 0;
	}

	uint32_t _Used = _Max_numerator_element_index + 1;

	while (_Used != 0 && _Numerator._Mydata[_Used - 1] == 0) {
	--_Used;
	}

	_Numerator._Myused = _Used;

<charconv>: _Divide can trim _Big_integer_flt more efficiently #2623

<charconv>: _Divide can trim _Big_integer_flt more efficiently #2623

Conversation

StephanTLavavej commented Mar 29, 2022

StephanTLavavej commented Apr 1, 2022

`<charconv>`: `_Divide` can trim `_Big_integer_flt` more efficiently #2623

`<charconv>`: `_Divide` can trim `_Big_integer_flt` more efficiently #2623