Print Page - small refactoring in globals.cpp

Title: small refactoring in globals.cpp
Post by: frithjofh on January 29, 2016, 05:44:37 PM

hi everybody,

took a look into file globals.cpp and I prose some refactoring of three functions:

- general code clean up
- improved readability
- simplification of logic
- removed some duplication
- speed up measured to be about 14% due, I suppose, to more cache locality (all is relative: on my machine, with my test case,... )

regards

frithjof

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on January 29, 2016, 08:31:55 PM

GetVectorFromString should not call wxArrayString, because there is an additional copy!

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on January 29, 2016, 09:51:11 PM

because first I have a const wxArrayString and then return a std::vector ?

my thoughts were: in the original function I have two vector's, one for the values in the function and the other as a copy during the return.

Now I have one wxArrayString and because the function is so "short" the unnamed object which is returned gets the benefit of RVO. so there are at most the same number of containers, at best even one less. the original function cannot count on RVO imho.

but maybe I get something wrong here...

thx for he feedback

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on January 29, 2016, 10:03:26 PM

nah... I think you are right... there is an extra copy of the array... damm

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on January 29, 2016, 10:11:19 PM

so maybe make a template function out of it in order to not textually duplicate the code... what do you think?

never mind... i am too tired to think strait

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on January 29, 2016, 10:23:25 PM

A template is fine, but generally doesn't make much difference.

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on January 29, 2016, 11:07:28 PM

not for the size of executable, but to avoid the duplicate

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on January 30, 2016, 09:43:49 AM

factory recall !

found a strange error in my patch. sorry, haven't tested it enough, I think.

will post new patch these days

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on January 30, 2016, 11:12:04 AM

here the new version of the patch.

discovered, that assign() for wxArrayString does something else than it should. it appends the given values instead of replacing the contents of the wxArrayString...

if anyone is interested, here my measuring in msec for the functions returning an array/vector from a string:

item count -	current arry -	new array -	current std::vec -	new std::vec
512	1	0	1	0
1.200	7	0	7	0
2.500	24	0	26	1
5.000	93	2	106	3
10.000	402	2	395	6

regards

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 04, 2016, 09:15:02 AM

reworked the next function: EscapeSpaces

speed up in msec:

words in text \|	current version \|	new version
32	0	0
64	1	0
128	6	0
256	28	0
512	97	0
1024	388	0
2048	1570	0
4096	6264	1

removed the comment about speeding this function up.

some minor changes

regards

Title: Re: small refactoring in globals.cpp
Post by: Alpha on February 04, 2016, 10:13:45 PM

Have not tested, but it looks like GetStringFromArray() will have problems with an empty array.

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on February 05, 2016, 01:07:41 AM

Why are you doing 3 iterations on the array in GetArrayFromString?
Why are you doing additional copy of the whole array?
And why are you doing Mid(a, npos-something) (hint happens when the array doesn't have a separator at the end)?

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on February 05, 2016, 01:14:06 AM

Code (diff) Select


-    if (!ret.IsEmpty() && ret[0] != _T('"') && ret[0] != _T('\''))
+    if (str.IsEmpty() || str.StartsWith(_T("\"")) || str.StartsWith(_T("\'")))

Why is this change needed? Why have you replaced the operator[] with startswith?

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 05, 2016, 10:43:12 AM

@alpha

thx, you are right. I will add a test for empty array.

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 05, 2016, 11:41:39 AM

@obfuscated

doing three separate iterations on the array results in the code running much faster due to more cache locality. my measurements confirmed the speedup. the original version continuously jumped from one context to another. putting each action in a separate loop is just faster.

for instance, the original version repeated the test for the value of trimspaces in each iteration. now it is done only once.

same for the third iteration of removing empty entries.

the part of the extra whole copy I don't understand. do you mean in the return statement? if so, that would happen in the original version too. or is there another extra copy that slipped me? are you looking at the last version of the patch? the one before had the problem of the extra copy in the std:vector version of the function, but I already changed that.

and if there would be an extra copy, wouldn't the gained speed justify it?

as to your last point: hit and sunk. my functions have a problem with the case where there is no separator at the end. it just omits the part after the last separator. I will correct it, slipped me, thx for the find.

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 05, 2016, 11:45:35 AM

@obfuscated

the change to StartsWith() was merely visual. i tried both versions and the StartsWith() seemed to me to result in a cleaner image to the eye, making it easier and faster to grasp that there are three separate conditions...

as I said, only a personal impression... blame my eyesight ;)

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 05, 2016, 12:51:20 PM

please revise and test attached corrected patch...

Title: Re: small refactoring in globals.cpp
Post by: MortenMacFly on February 05, 2016, 01:42:54 PM

Quote from: frithjofh on February 05, 2016, 12:51:20 PM
please revise and test attached corrected patch...

Sorry, but for me, this patch does not apply on SVN head, unfortunately.

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 05, 2016, 01:48:33 PM

i will correct it...

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on February 05, 2016, 09:00:33 PM

Quote from: frithjofh on February 05, 2016, 11:41:39 AM
doing three separate iterations on the array results in the code running much faster due to more cache locality.

The code runs faster because you've removed the calls to wxString::Remove.
This call is removing characters from the beginning of the string and copying the rest of the string to the front.
So the improvement is not because of better cache locality, but because there is less copying.

Quote from: frithjofh on February 05, 2016, 11:41:39 AM
for instance, the original version repeated the test for the value of trimspaces in each iteration. now it is done only once.

Probably the compiler has re-written the loop and has extracted the test outside of it.
I'm sure that if you remove the check from the original version it won't change its performance!

Quote from: frithjofh on February 05, 2016, 11:41:39 AM
same for the third iteration of removing empty entries.

Why do you need to add elements and then remove them?

Quote from: frithjofh on February 05, 2016, 11:41:39 AM
the part of the extra whole copy I don't understand. do you mean in the return statement?

The return statement is doing a redundant copy.

p.p. this post is made before I've looked at the latest patch.

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 05, 2016, 09:25:27 PM

here the corrected patch. search and destroy please... :)

@obfuscated:

QuoteThe code runs faster because you've removed the calls to wxString::Remove.
This call is removing characters from the beginning of the string and copying the rest of the string to the front.
So the improvement is not because of better cache locality, but because there is less copying.

agree about the removing of wxString::Remove. the most gain lies here in avoiding the copying and also in the avoiding of modification of the wxString in place. building the output piece by piece is faster. as to cache locality: I was just making an assumption, nothing more. I am no professional programmer, I did not measure cache misses.

QuoteProbably the compiler has re-written the loop and has extracted the test outside of it.
I'm sure that if you remove the check from the original version it won't change its performance!

I wouldn't know, as I did not measure this particular part of the code before and after. perhaps the compiler does this optimization on its own, perhaps not, I didn't look at the generated machine code. if there is a performance gain here, i think it will stand back behind the gain from avoiding the copying above anyway. by lengths. I find the new version easier to read though and easier to reason about.

QuoteWhy do you need to add elements and then remove them?

because as above, I think it makes it a lot easier to read and reason about. and strictly spoken, the elements are not removed from the container locale to the function, just copied around (and in some future only moved around). the return value is an unnamed object constructed directly from the resulting new range of valid entries.

QuoteThe return statement is doing a redundant copy.

correct, that is what some former version of the patch did. I took up the hint given and changed it accordingly.

@all:

thx for all the input. thx for not ignoring my efforts all together. apologies for my beginner mistakes.

please give me more bashing, so I will eventually learn and improve ... ;)

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on February 06, 2016, 04:47:00 PM

I've done some benchmarking and profiling.
orig - current version in codeblocks
forum - your version + a fix when find returns npos
mine - obviously this is my implementation

The results printed by google's benchmark lib:

Code Select


> bin/release/wx_routine_bench
Success: 13 tests passed.
Test time: 0.00 seconds.
Run on (4 X 3599.88 MHz CPU s)
2016-02-06 17:15:06
Benchmark                    Time(ns)    CPU(ns) Iterations
-----------------------------------------------------------
BM_orig/1                         742        741     958904
BM_orig/8                        7456       7457      93333
BM_orig/64                      85563      85504       8140
BM_orig/512                   2997799    3000000        233
BM_orig/4k                  173072815  173000000          4
BM_orig/32k                28505832434 28504000000          1
BM_orig/64k                139696512699 139686000000          1
BM_forum/1                        774        774     886076
BM_forum/8                       5031       5029     140000
BM_forum/64                     33001      32993      21277
BM_forum/512                   256982     256831       2745
BM_forum/4k                   2011854    2014368        348
BM_forum/32k                 19614743   19611111         36
BM_forum/64k                 43754816   43764706         17
BM_mine/1                         246        246    2834008
BM_mine/8                        1321       1320     530303
BM_mine/64                       9476       9481      74468
BM_mine/512                     69313      69295      10145
BM_mine/4k                     537183     536715       1321
BM_mine/32k                   6133614    6135135        111
BM_mine/64k                  12912220   12907407         54
BM_mine_vector/1                  217        217    3225806
BM_mine_vector/8                 1447       1445     479452
BM_mine_vector/64                9060       9064      77778
BM_mine_vector/512              68974      68972      10294
BM_mine_vector/4k              539424     538986       1321
BM_mine_vector/32k            6113907    6115044        113
BM_mine_vector/64k           13649380   13666667         51
BM_no_trim_orig/1                 383        383    1861702
BM_no_trim_orig/8                4165       4166     168269
BM_no_trim_orig/64              61783      61825      11290
BM_no_trim_orig/512           2833573    2829960        247
BM_no_trim_orig/4k          167948902  168000000          4
BM_no_trim_orig/32k        29410523891 29410000000          1
BM_no_trim_orig/64k        137438740492 137432000000          1
BM_no_trim_forum/1                357        357    1949861
BM_no_trim_forum/8               1751       1753     400000
BM_no_trim_forum/64             11392      11384      61404
BM_no_trim_forum/512            85726      85749       8140
BM_no_trim_forum/4k            676759     676385       1029
BM_no_trim_forum/32k          8598947    8597561         82
BM_no_trim_forum/64k         19704975   19714286         35
BM_no_trim_mine/1                 250        250    2788845
BM_no_trim_mine/8                1527       1528     457516
BM_no_trim_mine/64              10476      10470      66667
BM_no_trim_mine/512             78441      78449       8974
BM_no_trim_mine/4k             606494     606684       1167
BM_no_trim_mine/32k           7209706    7218750         96
BM_no_trim_mine/64k          15034847   15021739         46
BM_no_trim_mine_vector/1          224        224    3139013
BM_no_trim_mine_vector/8         1649       1649     419162
BM_no_trim_mine_vector/64        9681       9672      72165
BM_no_trim_mine_vector/512      74088      74147       9589
BM_no_trim_mine_vector/4k      585973     585750       1207
BM_no_trim_mine_vector/32k    7304332    7305263         95
BM_no_trim_mine_vector/64k   15878234   15883721         43

See the attached file for the source code.
Notes:
1. There are cases where the original version is faster than the forum version.
2. Mine version is always faster.
3. When doing trimming it is 3x or so faster. :)
4. Obviously faster code is uglier :)

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 06, 2016, 05:43:46 PM

wow, you really took this to heart...

well, yes, you are right, there is one case, where the original version is faster... slipped me... ;)

and yes, your version is faster, in every regard. it is nice to see what a real programmer can do. a little bit of competitive spirit always works wonders... three iterations over the vector == three times slower than only one :o

I tried to think about the sense of my changes to this particular function when benchmarking. I wondered, if testing the function for 5.000 entries would really make sense, or even be realistic. perhaps the interesting range is from 64 to 4.000 entries, at least if the string that gets tokenized is some normal file. but I don't really know.

I like your version. post a patch. it is a bit a pain in the neck to read, and it replaces 2 functions with 6, but the speed gain is great and those helper functions surely could find use in other functions too...

have you checked the user functions too which I changed with the last patch?

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on February 06, 2016, 06:18:08 PM

Quote from: frithjofh on February 06, 2016, 05:43:46 PM
wow, you really took this to heart...

Not really, I'm just playing with perf and also wanted to play with google.benchmark a bit.
And for sure I wanted to verify that what I'm talking about is correct.
And it turned out that the difference is not that big that I thought it will be.
The reason is that most of the time is wasted constructing wxString/std::string objects.
BTW the wx3.0 version is quite a lot slower, because it uses std::string

Quote from: frithjofh on February 06, 2016, 05:43:46 PM
have you checked the user functions too which I changed with the last patch?

Not yet.
For the getstringfromarray one my guess is that there won't be any difference:)

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on February 06, 2016, 06:33:37 PM

Heh, it turned out that your version of GetStringFromArray is a lot slower than the original, because you're using operator+. ::)

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 06, 2016, 07:03:39 PM

QuoteHeh, it turned out that your version of GetStringFromArray is a lot slower than the original, because you're using operator+. ::)

instead of two times append or += ?

corrected...

QuoteAnd it turned out that the difference is not that big that I thought it will be.

the difference between which versions? original and my version? my version and your version? I mean between your version and my version there is a factor of three, I think that is much, although in absolute numbers it is only msec scale.

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 07, 2016, 12:19:05 PM

corrected patch. still contains my version of GetStringFromArray() until better version from obfuscated arrives, or what ever the decision may be...

please test and comment on EscapeSpaces()

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on February 07, 2016, 02:43:42 PM

I'm talking about the non-trimming case. With my version it is slightly faster than your version.
The trimming version is way faster, because it doesn't do trimming, which is slow.

I've done more testing on the EscapeSpaces function and it seems that your fix is an improvement only in wx2.8. With wx3.0 it doesn't make a difference.

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on February 07, 2016, 04:34:04 PM

ok. so just leave it as is, right?

Title: Re: small refactoring in globals.cpp
Post by: frithjofh on March 14, 2016, 11:54:33 AM

@obfuscated

made another attempt on GetArrayFromString(). if you don't mind, could you give it a run in your testing harness to see its performance?

Code Select

wxArrayString GetArrayFromString_3(const wxString& text, const wxString& separator, bool trimSpaces)
{
    if ( text.empty() )
        return wxArrayString();

    const size_t seplen = separator.length();
    size_t start_pos = 0;
    size_t sep_pos = 0;
    wxArrayString out;

    if (trimSpaces)
    {
        while (sep_pos != wxString::npos)
        {
            start_pos = text.find_first_not_of(_T(" \t\n\r"), start_pos);
            if (start_pos == wxString::npos)
                break;
            sep_pos = text.find(separator, start_pos);
            if (sep_pos != start_pos)
            {
                const size_t end_pos = text.find_last_not_of(_T(" \t\n\r"), (sep_pos == wxString::npos) ? wxString::npos : sep_pos - 1);
                out.push_back(text.substr(start_pos, end_pos - start_pos + 1));
            }
            start_pos = sep_pos + seplen;
        }
    }
    else
    {
        while (sep_pos != wxString::npos)
        {
            sep_pos = text.find(separator, start_pos);
            if ( sep_pos != start_pos )
                out.push_back(text.substr(start_pos, (sep_pos == wxString::npos) ? wxString::npos : sep_pos - start_pos));
            start_pos = sep_pos + seplen;
            if (start_pos >= text.length())
                break;
        }
    }

    return out;
}

Title: Re: small refactoring in globals.cpp
Post by: oBFusCATed on March 14, 2016, 11:21:23 PM

No time now. You can try it for yourself -
https://github.com/google/benchmark
https://youtu.be/nXaxk27zwlk?t=691

Code::Blocks Forums

Developer forums (C::B DEVELOPMENT STRICTLY!) => Development => Topic started by: frithjofh on January 29, 2016, 05:44:37 PM