Several improvements to Code Completion plugin

ollydbg · January 04, 2014, 10:38:49 AM

Today, I have tested the patch: cc_includes_parsing.patch
But I found it cause some GUI hang problem, see my report question about running parser in thread, I guess the reason is:

If you recursively parse the #include directive.

Code Select


                if (!locked)
                    CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex)
                //this is a recursive call to Parse() function????
                result = thread->Parse();
                delete thread;

                if (!locked)
                    CC_LOCKER_TRACK_TT_MTX_UNLOCK(s_TokenTreeMutex)

So, you hold the s_TokenTreeMutex for a very long time. Some GUI function may try to query information from Tokentree, thus they get hangs.

Folco · January 04, 2014, 11:45:03 AM

What is approximatively the size of the data that you lock ? If it is not too huge, would it be a solution to lock a mutex, then copy the data that you need, then unlock the mutex ? It could be really faster that what you are currently doing.

I have this code in one of my projects :

Code Select

        SDL_mutexP (rsd->RenderingMutex);
        rgd = rsd->rgd;
        SDL_mutexV (rsd->RenderingMutex);

This code is run in a rendering thread, which grabs only the data that it needs to draw, then unlock data for the main thread to continue. Perhaps something like that would remove some CC hangs and wouldn't freeze the GUI anymore.

ollydbg · January 04, 2014, 01:44:48 PM

Hi, thanks Folco for the help.

QuoteWhat is approximatively the size of the data that you lock ?

The tokentree, which holds all the tokens(symbols), you can regard it as symbol database.
When we parse a file, you mainly do syntax analysis and add symbols one by one.

In the current implementation, we lock the tokentree when starting a single file, after a file is parsed, the locker is released. Something like this:

Code Select

while (file queue is not empty)
{
    get one file from the queue
    lock the tokentree
    while (parsing file not finished)
    {
        syntax analysis
        add symbol to tokentree
        if (meet a #include directive)
        {
            add file to the queue(for later parsing)
        }
    }
    release the locker of tokentree
    remove the file from the queue
}

So, I think it is not good idea to put the locker around "add symbol to tokentree".

Code Select

while (file queue is not empty)
{
    get one file from the queue
    while (parsing file not finished)
    {
        syntax analysis
        lock the tokentree
        add symbol to tokentree
        release the locker of tokentree
        if (meet a #include directive)
        {
            add file to the queue(for later parsing)
        }
    }
    remove the file from the queue
}

Because lock and release will be called too frequently.

With Huki's patch, the code becomes:

Code Select

while (file queue is not empty)
{
    get one file from the queue
    lock the tokentree
    while (parsing file not finished)
    {
        syntax analysis
        add symbol to tokentree
        if (meet a #include directive)
        {
            recursive parse on the included source file
        }
    }
    release the locker of tokentree
    remove the file from the queue
}

So, you can get the correct token, as the #include directive is correctly expanded(it does the same thing as C preprocessor do), the time of the locker becomes longer.

I have no idea what is the correct idea, I think the GUI should not access to the tokentree when the parsing is not finished.

Folco · January 04, 2014, 02:14:31 PM

And something like that :

Code Select


lock the tokentree
copy the token tree
release the tokentree

while (file queue not empty)
{
    analysis / add symbol / fill queue with include directives, modifying copied_tokentree

    lock the tokentree, update tokentree with copied_tokentree, then release it
}

Note that the token tree should contain a var saying "I am (not) in a writable state/copiable state". I don't know what use it.

This should have 2 consequences :
- the token tree is locked during short peroid of time, without risk of infinite loop (recursive inclusion or other funky trick)
- nobody has to wait for the parser to finish its task. Even if a parser would crash, the GUI would keep responsive.

ollydbg · January 04, 2014, 02:42:23 PM

Quote from: Folco on January 04, 2014, 02:14:31 PM
And something like that :
Code Select Expand
lock the tokentree copy the token tree release the tokentree while (file queue not empty) { analysis / add symbol / fill queue with include directives, modifying copied_tokentree lock the tokentree, update tokentree with copied_tokentree, then release it }

What does the above pseudo code used for? For parser? the copied tokentree are used for GUI?
Note A Tokentree is a very big object, so copy tokentree takes a lot of time. So, it is a shared object, can either write or read at a time.

I think if the parser is running, I mean the tokentree should not be accessed by GUI thread, the GUI should show some words like: parsing is still running. Or delay the operation until tokentree stabilized (parsing finished).

Folco · January 04, 2014, 03:04:47 PM

Quote from: ollydbg on January 04, 2014, 02:42:23 PM
Quote from: Folco on January 04, 2014, 02:14:31 PM
And something like that :
Code Select Expand
lock the tokentree copy the token tree release the tokentree while (file queue not empty) { analysis / add symbol / fill queue with include directives, modifying copied_tokentree lock the tokentree, update tokentree with copied_tokentree, then release it }

What does the above pseudo code used for? For parser?

Yes. First the parser create a copy, then it modifies its copy when parsing files, and once a file is parsed, it can update the token tree used by the GUI or other parts of CB.

Quotethe copied tokentree are used for GUI?

No. It uses the TokenTree which is updated after each file by the parser thread.

QuoteNote A Tokentree is a very big object, so copy tokentree takes a lot of time. So, it is a shared object, can either write or read at a time.

Ah. That is why my very first question was about the size of the token tree. Do you have any idea of the size in KB/MB ?

Of course, my idea would not be usable it the token tree was too big... The aim of what I propose is to avoid long periods of locking for the Token Tree.

ollydbg · January 04, 2014, 03:09:22 PM

Quote from: Folco on January 04, 2014, 03:04:47 PM
Quote from: ollydbg on January 04, 2014, 02:42:23 PM
Note A Tokentree is a very big object, so copy tokentree takes a lot of time. So, it is a shared object, can either write or read at a time.
Ah. That is why my very first question was about the size of the token tree. Do you have any idea of the size in KB/MB ?

Of course, my idea would not be usable it the token tree was too big... The aim of what I propose is to avoid long periods of locking for the Token Tree.

Thanks for all your reply.
A tokentree is very big, for example for a large project (Codeblocks.cbp), its size if about 100MByte or 200MByte or even bigger. ;D

Folco · January 04, 2014, 03:29:48 PM

Erf

Too large to perform a complete copy.
Nevertheless it could still be possible with a differential copy. I don't know if it would be a good solution, I never worked on such big data.

ollydbg · January 04, 2014, 04:09:20 PM

Quote from: Folco on January 04, 2014, 03:29:48 PM
Erf Too large to perform a complete copy.
Nevertheless it could still be possible with a differential copy. I don't know if it would be a good solution, I never worked on such big data.

What do you mean by "differential copy"?
Partial copy? I think partial copy of the tokentree is also not possible, because symbols in Tokentree are referenced each other.

Folco · January 04, 2014, 04:29:17 PM

Sorry, my english is very bad. I think that : "push to TokenTree the parts of the modified TokenTree which have been modified". But this seems hard if symbols refers to other ones.

dekar · January 20, 2014, 01:18:14 PM

Is there some technique to make Code Completion interpret defines?
for example:

Code Select

((Some_struct*) someAddr)-> completing correctly. But

Code Select

#define AVAR ((Some_struct*) someAddr)
AVAR->

does not wokr.
It is rely annoying with embedded programming.

ollydbg · January 28, 2014, 10:28:03 AM

Quote from: ollydbg on January 04, 2014, 10:38:49 AM
Today, I have tested the patch: cc_includes_parsing.patch
But I found it cause some GUI hang problem, see my report question about running parser in thread, I guess the reason is:

If you recursively parse the #include directive.

Code Select Expand
if (!locked) CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex) //this is a recursive call to Parse() function???? result = thread->Parse(); delete thread; if (!locked) CC_LOCKER_TRACK_TT_MTX_UNLOCK(s_TokenTreeMutex)
So, you hold the s_TokenTreeMutex for a very long time. Some GUI function may try to query information from Tokentree, thus they get hangs.

I have a improvement patch to handle this (see attachment). The parser only parses cpp files in the project, recursive call the parse function when it meet the #include directive, the result is quite good.

The only issue about this is that if the project contains some header files(like a header only library project) which does not used by any cpp file, they will be left. I don't think this is a very big issue, because a user project will include those headers, then it still get parsed correctly.

ollydbg · January 28, 2014, 10:33:27 AM

Quote from: dekar on January 20, 2014, 01:18:14 PM
Is there some technique to make Code Completion interpret defines?
for example:
Code Select Expand
((Some_struct*) someAddr)-> completing correctly. But
Code Select Expand
#define AVAR ((Some_struct*) someAddr) AVAR-> does not wokr.
It is rely annoying with embedded programming.

I think currently it is not, so it should be a feature request.
We need to do Macro substation before we breakup the statement string before the caret..... Patches are welcome.

Huki · February 02, 2014, 04:10:04 PM

Hi, been busy for a while...

Quote from: ollydbg on January 28, 2014, 10:28:03 AM
Quote from: ollydbg on January 04, 2014, 10:38:49 AM
Today, I have tested the patch: cc_includes_parsing.patch
But I found it cause some GUI hang problem, see my report question about running parser in thread, I guess the reason is:

If you recursively parse the #include directive.

Code Select Expand
if (!locked) CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex) //this is a recursive call to Parse() function???? result = thread->Parse(); delete thread; if (!locked) CC_LOCKER_TRACK_TT_MTX_UNLOCK(s_TokenTreeMutex)
So, you hold the s_TokenTreeMutex for a very long time. Some GUI function may try to query information from Tokentree, thus they get hangs.
I have a improvement patch to handle this (see attachment). The parser only parses cpp files in the project, recursive call the parse function when it meet the #include directive, the result is quite good.

Yes, my cc_includes_parsing patch did cause the UI to hang-up a bit because of the recursive parsing. I just tried your patch which frees up the mutex for a while before recursing, and the result is indeed quite good. Thanks!

Huki · February 02, 2014, 04:20:34 PM

Any news on my cc_enum_values.patch? I'm attaching it again and the supported test cases in cc_enum_test.patch.
- Uses expression solver to calculate the enum value, expand macro assignments.
- Supports enum assignment to a previous enum (checks under the correct parent).
- If the expression cannot be evaluated (eg, unknown macro), leave it and the next enums blank and don't reset it to zero.

Code::Blocks Forums

News:

Several improvements to Code Completion plugin

ollydbg

Folco

ollydbg

Folco

ollydbg

Folco

ollydbg

Folco

ollydbg

Folco

dekar

ollydbg

ollydbg

Huki

Huki