News:

As usual while waiting for the next release - don't forget to check the nightly builds in the forum.

Main Menu

Several improvements to Code Completion plugin

Started by Huki, September 06, 2013, 10:03:35 PM

Previous topic - Next topic

ollydbg

Today, I have tested the patch: cc_includes_parsing.patch
But I found it cause some GUI hang problem, see my report question about running parser in thread, I guess the reason is:

If you recursively parse the #include directive.


                if (!locked)
                    CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex)
                //this is a recursive call to Parse() function????
                result = thread->Parse();
                delete thread;

                if (!locked)
                    CC_LOCKER_TRACK_TT_MTX_UNLOCK(s_TokenTreeMutex)

So, you hold the s_TokenTreeMutex for a very long time. Some GUI function may try to query information from Tokentree, thus they get hangs.
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Folco

What is approximatively the size of the data that you lock ? If it is not too huge, would it be a solution to lock a mutex, then copy the data that you need, then unlock the mutex ? It could be really faster that what you are currently doing.

I have this code in one of my projects :
        SDL_mutexP (rsd->RenderingMutex);
        rgd = rsd->rgd;
        SDL_mutexV (rsd->RenderingMutex);

This code is run in a rendering thread, which grabs only the data that it needs to draw, then unlock data for the main thread to continue. Perhaps something like that would remove some CC hangs and wouldn't freeze the GUI anymore.
Kernel Extremist - PedroM power ©

ollydbg

Hi, thanks Folco for the help.
QuoteWhat is approximatively the size of the data that you lock ?
The tokentree, which holds all the tokens(symbols), you can regard it as symbol database.
When we parse a file, you mainly do syntax analysis and add symbols one by one.

In the current implementation, we lock the tokentree when starting a single file, after a file is parsed, the locker is released. Something like this:

while (file queue is not empty)
{
    get one file from the queue
    lock the tokentree
    while (parsing file not finished)
    {
        syntax analysis
        add symbol to tokentree
        if (meet a #include directive)
        {
            add file to the queue(for later parsing)
        }
    }
    release the locker of tokentree
    remove the file from the queue
}


So, I think it is not good idea to put the locker around "add symbol to tokentree".

while (file queue is not empty)
{
    get one file from the queue
    while (parsing file not finished)
    {
        syntax analysis
        lock the tokentree
        add symbol to tokentree
        release the locker of tokentree
        if (meet a #include directive)
        {
            add file to the queue(for later parsing)
        }
    }
    remove the file from the queue
}

Because lock and release will be called too frequently.


With Huki's patch, the code becomes:

while (file queue is not empty)
{
    get one file from the queue
    lock the tokentree
    while (parsing file not finished)
    {
        syntax analysis
        add symbol to tokentree
        if (meet a #include directive)
        {
            recursive parse on the included source file
        }
    }
    release the locker of tokentree
    remove the file from the queue
}


So, you can get the correct token, as the #include directive is correctly expanded(it does the same thing as C preprocessor do), the time of the locker becomes longer.

I have no idea what is the correct idea, I think the GUI should not access to the tokentree when the parsing is not finished.

If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Folco

#48
And something like that :

lock the tokentree
copy the token tree
release the tokentree

while (file queue not empty)
{
    analysis / add symbol / fill queue with include directives, modifying copied_tokentree

    lock the tokentree, update tokentree with copied_tokentree, then release it
}


Note that the token tree should contain a var saying "I am (not) in a writable state/copiable state". I don't know what use it.

This should have 2 consequences :
- the token tree is locked during short peroid of time, without risk of infinite loop (recursive inclusion or other funky trick)
- nobody has to wait for the parser to finish its task. Even if a parser would crash, the GUI would keep responsive.
Kernel Extremist - PedroM power ©

ollydbg

Quote from: Folco on January 04, 2014, 02:14:31 PM
And something like that :

lock the tokentree
copy the token tree
release the tokentree

while (file queue not empty)
{
    analysis / add symbol / fill queue with include directives, modifying copied_tokentree

    lock the tokentree, update tokentree with copied_tokentree, then release it
}


What does the above pseudo code used for? For parser? the copied tokentree are used for GUI?
Note A Tokentree is a very big object, so copy tokentree takes a lot of time. So, it is a shared object, can either write or read at a time.

I think if the parser is running, I mean the tokentree should not be accessed by GUI thread, the GUI should show some words like: parsing is still running. Or delay the operation until tokentree stabilized (parsing finished).



If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Folco

Quote from: ollydbg on January 04, 2014, 02:42:23 PM
Quote from: Folco on January 04, 2014, 02:14:31 PM
And something like that :

lock the tokentree
copy the token tree
release the tokentree

while (file queue not empty)
{
    analysis / add symbol / fill queue with include directives, modifying copied_tokentree

    lock the tokentree, update tokentree with copied_tokentree, then release it
}


What does the above pseudo code used for? For parser?
Yes. First the parser create a copy, then it modifies its copy when parsing files, and once a file is parsed, it can update the token tree used by the GUI or other parts of CB.
Quotethe copied tokentree are used for GUI?
No. It uses the TokenTree which is updated after each file by the parser thread.

QuoteNote A Tokentree is a very big object, so copy tokentree takes a lot of time. So, it is a shared object, can either write or read at a time.
Ah. That is why my very first question was about the size of the token tree. Do you have any idea of the size in KB/MB ?

Of course, my idea would not be usable it the token tree was too big... The aim of what I propose is to avoid long periods of locking for the Token Tree.
Kernel Extremist - PedroM power ©

ollydbg

Quote from: Folco on January 04, 2014, 03:04:47 PM
Quote from: ollydbg on January 04, 2014, 02:42:23 PM
Note A Tokentree is a very big object, so copy tokentree takes a lot of time. So, it is a shared object, can either write or read at a time.
Ah. That is why my very first question was about the size of the token tree. Do you have any idea of the size in KB/MB ?

Of course, my idea would not be usable it the token tree was too big... The aim of what I propose is to avoid long periods of locking for the Token Tree.
Thanks for all your reply.
A tokentree is very big, for example for a large project (Codeblocks.cbp), its size if about 100MByte or 200MByte or even bigger. ;D
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Folco

Erf :D Too large to perform a complete copy.
Nevertheless it could still be possible with a differential copy. I don't know if it would be a good solution, I never worked on such big data.
Kernel Extremist - PedroM power ©

ollydbg

Quote from: Folco on January 04, 2014, 03:29:48 PM
Erf :D Too large to perform a complete copy.
Nevertheless it could still be possible with a differential copy. I don't know if it would be a good solution, I never worked on such big data.
What do you mean by "differential copy"?
Partial copy? I think partial copy of the tokentree is also not possible, because symbols in Tokentree are referenced each other.
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Folco

Sorry, my english is very bad. I think that : "push to TokenTree the parts of the modified TokenTree which have been modified". But this seems hard if symbols refers to other ones.
Kernel Extremist - PedroM power ©

dekar

Is there some technique to make  Code Completion interpret defines?
for example:
((Some_struct*) someAddr)-> completing correctly. But
#define AVAR ((Some_struct*) someAddr)
AVAR->
  does not wokr.
It is rely annoying with embedded programming.

ollydbg

Quote from: ollydbg on January 04, 2014, 10:38:49 AM
Today, I have tested the patch: cc_includes_parsing.patch
But I found it cause some GUI hang problem, see my report question about running parser in thread, I guess the reason is:

If you recursively parse the #include directive.


                if (!locked)
                    CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex)
                //this is a recursive call to Parse() function????
                result = thread->Parse();
                delete thread;

                if (!locked)
                    CC_LOCKER_TRACK_TT_MTX_UNLOCK(s_TokenTreeMutex)

So, you hold the s_TokenTreeMutex for a very long time. Some GUI function may try to query information from Tokentree, thus they get hangs.
I have a improvement patch to handle this (see attachment). The parser only parses cpp files in the project, recursive call the parse function when it meet the #include directive, the result is quite good.

The only issue about this is that if the project contains some header files(like a header only library project) which does not used by any cpp file, they will be left. I don't think this is a very big issue, because a user project will include those headers, then it still get parsed correctly.
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

ollydbg

Quote from: dekar on January 20, 2014, 01:18:14 PM
Is there some technique to make  Code Completion interpret defines?
for example:
((Some_struct*) someAddr)-> completing correctly. But
#define AVAR ((Some_struct*) someAddr)
AVAR->
  does not wokr.
It is rely annoying with embedded programming.
I think currently it is not, so it should be a feature request.
We need to do Macro substation before we breakup the statement string before the caret..... Patches are welcome. ;)
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Huki

Hi, been busy for a while...

Quote from: ollydbg on January 28, 2014, 10:28:03 AM
Quote from: ollydbg on January 04, 2014, 10:38:49 AM
Today, I have tested the patch: cc_includes_parsing.patch
But I found it cause some GUI hang problem, see my report question about running parser in thread, I guess the reason is:

If you recursively parse the #include directive.


               if (!locked)
                   CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex)
               //this is a recursive call to Parse() function????
               result = thread->Parse();
               delete thread;

               if (!locked)
                   CC_LOCKER_TRACK_TT_MTX_UNLOCK(s_TokenTreeMutex)

So, you hold the s_TokenTreeMutex for a very long time. Some GUI function may try to query information from Tokentree, thus they get hangs.
I have a improvement patch to handle this (see attachment). The parser only parses cpp files in the project, recursive call the parse function when it meet the #include directive, the result is quite good.

Yes, my cc_includes_parsing patch did cause the UI to hang-up a bit because of the recursive parsing. I just tried your patch which frees up the mutex for a while before recursing, and the result is indeed quite good. Thanks!

Huki

Any news on my cc_enum_values.patch? I'm attaching it again and the supported test cases in cc_enum_test.patch.
- Uses expression solver to calculate the enum value, expand macro assignments.
- Supports enum assignment to a previous enum (checks under the correct parent).
- If the expression cannot be evaluated (eg, unknown macro), leave it and the next enums blank and don't reset it to zero.