See: https://github.com/Valloric/YouCompleteMe/blob/master/README.md
Quote
A critical thing to notice is that the completion filtering is NOT based on the input being a string prefix of the completion (but that works too). The input needs to be a subsequence match of a completion. This is a fancy way of saying that any input characters need to be present in a completion string in the order in which they appear in the input. So abc is a subsequence of xaybgc, but not of xbyxaxxc. After the filter, a complicated sorting system ranks the completion strings so that the most relevant ones rise to the top of the menu (so you usually need to press TAB just once).
The animated GIF has a nice little demonstration. Cool feature IMO.
Enjoy attached hack ;) . (To cc_interface branch.)
(Edit: Filtering, but no fancy sorting.)
Nice work! It works nicely with the python cc plugin too!
How much of an issue is the sorting? Does the popup list allow you to control the sort order? The most important thing is that (1) exact prefix matches come first (2) exact infix or suffix matches next and, maybe, (3) other matches by count of neighboring letters that match. Giving priority for the acronym of CamelCase symbols would be a nice touch too. (e.g. return AbstractBaseClass with high priority for abc, and maybe getNextCharacter would be high on the list for gnc)
PS: Seems to be small bug with placement of doc string window. Sometimes it places on left when there clearly isn't room for the whole window on that side. (edited)
Quote from: dmoore on January 18, 2014, 05:05:52 PM
How much of an issue is the sorting? [...]
It is not really an issue; only that a sort function would need to be written.
m_FilteredTokens.erase(remove_if(m_FilteredTokens.begin(), m_FilteredTokens.end(), Matcher), m_FilteredTokens.end());
std::sort(m_FilteredTokens.begin(), m_FilteredTokens.end(), MyFancySort); // add sorting here
That line (plus the definition of the "
MyFancySort" - or whatever name is chosen) should be all that is necessary.
Quote from: dmoore on January 18, 2014, 05:05:52 PM
PS: Seems to be small bug with placement of doc string window. Sometimes it places on left when there clearly isn't room for the whole window.
Well, that was supposed to be a feature (https://github.com/alpha0010/codeblocks_sf/commit/9d177affc9bb4965d703a44db8ae2640a969e713). Does it act unexpectedly?
Quote from: Alpha on January 18, 2014, 07:46:51 PM
It is not really an issue; only that a sort function would need to be written.
Probably it would be better to use some ranking function, converting the string to some integer and then using normal integer sorting.
Probably YouCompleteMe dev won't mind if you use theirs way of sorting.... :)
Quote from: Alpha on January 18, 2014, 07:46:51 PM
Quote from: dmoore on January 18, 2014, 05:05:52 PM
PS: Seems to be small bug with placement of doc string window. Sometimes it places on left when there clearly isn't room for the whole window.
Well, that was supposed to be a feature (https://github.com/alpha0010/codeblocks_sf/commit/9d177affc9bb4965d703a44db8ae2640a969e713). Does it act unexpectedly?
It gets cut off when being displayed on the left side. Maybe the same would have happened on the right side too, but I didn't think so (if that's the case, then maybe the window just needs to be resized). I was using a 1080p display on a dual monitor setup. The 1080p monitor is on the left, the other monitor is something less than 1080p.
Quote from: oBFusCATed on January 18, 2014, 07:57:27 PM
Quote from: Alpha on January 18, 2014, 07:46:51 PM
It is not really an issue; only that a sort function would need to be written.
Probably it would be better to use some ranking function, converting the string to some integer and then using normal integer sorting.
Probably YouCompleteMe dev won't mind if you use theirs way of sorting.... :)
Agree. After all we're using their way of matching. (I am going to steal this for my photo manager's tag completions too!)
This looks to be the relevant code:
https://github.com/Valloric/YouCompleteMe/blob/master/cpp/ycm/Result.cpp#L141
But I think it would be simpler (and almost certainly faster!) to just compute an integer priority for each item as oBFus mentioned.
++hack;
Quote from: dmoore on January 18, 2014, 08:12:31 PM
Quote from: Alpha on January 18, 2014, 07:46:51 PM
Quote from: dmoore on January 18, 2014, 05:05:52 PM
PS: Seems to be small bug with placement of doc string window. Sometimes it places on left when there clearly isn't room for the whole window.
Well, that was supposed to be a feature (https://github.com/alpha0010/codeblocks_sf/commit/9d177affc9bb4965d703a44db8ae2640a969e713). Does it act unexpectedly?
It gets cut off when being displayed on the left side. Maybe the same would have happened on the right side too, but I didn't think so (if that's the case, then maybe the window just needs to be resized). I was using a 1080p display on a dual monitor setup. The 1080p monitor is on the left, the other monitor is something less than 1080p.
Confirmed. (And also confirmed that it is the fault of this hack interacting with another component of CCManager.)
Quote from: Alpha on January 19, 2014, 02:46:52 AM
++hack;
a very nice hack indeed. Works beautifully.
Quote from: Alpha on January 19, 2014, 05:12:36 AM
Quote from: dmoore on January 18, 2014, 08:12:31 PM
Quote from: Alpha on January 18, 2014, 07:46:51 PM
Quote from: dmoore on January 18, 2014, 05:05:52 PM
PS: Seems to be small bug with placement of doc string window. Sometimes it places on left when there clearly isn't room for the whole window.
Well, that was supposed to be a feature (https://github.com/alpha0010/codeblocks_sf/commit/9d177affc9bb4965d703a44db8ae2640a969e713). Does it act unexpectedly?
It gets cut off when being displayed on the left side. Maybe the same would have happened on the right side too, but I didn't think so (if that's the case, then maybe the window just needs to be resized). I was using a 1080p display on a dual monitor setup. The 1080p monitor is on the left, the other monitor is something less than 1080p.
Confirmed. (And also confirmed that it is the fault of this hack interacting with another component of CCManager.)
hopefully a simple fix.
One more issue: in python cc, I think I need to be able to do async docstring calls. otherwise the gui can freeze for quite a while.
Quote from: dmoore on January 19, 2014, 07:13:08 AM
a very nice hack indeed. Works beautifully.
Until you try to complete
wxString. With my clang plugin, it does not even show up until I have typed the full name, and even then, it is the fifth item. ???
(http://www.pasteall.org/pic/show.php?id=65465) (http://www.pasteall.org/pic/show.php?id=65465)
Quote from: dmoore on January 19, 2014, 07:13:08 AM
One more issue: in python cc, I think I need to be able to do async docstring calls. otherwise the gui can freeze for quite a while.
I have been attempting to work on that for a while (with
cbThreadPool), however, I apparently do not understand threading nearly as well as I had thought; each of my tries immediately crashed while updating the GUI from a
cbEVT_THREADTASK_ALLDONE handler. Looks like more reading for me.
Nice catch re wxString, just need to sort alphabetically for equal priority items?
Re threads, need to be careful about passing data that is constructed in the worker thread being deleted before it gets handled on the main thread. There are some posts about certain events being unsafe by (i think) ollydbg.
Quote from: dmoore on January 19, 2014, 10:20:01 PM
Nice catch re wxString, just need to sort alphabetically for equal priority items?
The problem is,
CalcValue() (which is tries to nearly replicate/quantify the algorithm YCM designed) gives
wxString a worse value because its word boundary string is only "ws" (whereas
wxSTRING_REVERSE_ITERATOR for example is "wsri", more closely matching the input). (Items with equal priority *should* already be in alphabetical order, assuming my implementation works the way I think it does.)
Unfortunately, re-balancing the weighting algorithm would probably cause it to lose the benefit it gives in most other cases. (I am tempted to hardcode a small score boost for
wxString, because it is rather common in wxWidgets based code... but that could get even uglier if other similar problem tokens show up.)
Quote from: dmoore on January 19, 2014, 10:20:01 PM
Re threads, need to be careful about passing data that is constructed in the worker thread being deleted before it gets handled on the main thread. There are some posts about certain events being unsafe by (i think) ollydbg.
Thanks, I will look for those.
Quote from: Alpha on January 20, 2014, 06:43:43 AM
Unfortunately, re-balancing the weighting algorithm would probably cause it to lose the benefit it gives in most other cases. (I am tempted to hardcode a small score boost for wxString, because it is rather common in wxWidgets based code... but that could get even uglier if other similar problem tokens show up.)
No hardcoding, please, it is ugly and unprofessional:)
Quote from: Alpha on January 19, 2014, 10:01:31 PM
Quote from: dmoore on January 19, 2014, 07:13:08 AM
One more issue: in python cc, I think I need to be able to do async docstring calls. otherwise the gui can freeze for quite a while.
I have been attempting to work on that for a while (with cbThreadPool), however, I apparently do not understand threading nearly as well as I had thought; each of my tries immediately crashed while updating the GUI from a cbEVT_THREADTASK_ALLDONE handler. Looks like more reading for me.
Have you tried to make the docstring API to be consisted of two steps/calls?
For example:
1. CCManager calls a plugin to to start gathering the data. No ui is shown to the user or empty UI is shown, so he can have a chance to cancel it.
2. The CC plugin has gathered the data and then calls CCManager to show the UI or fill the UI.
1 and 2 happen in the main thread. This way you leave the task of threading to the plugin.
Quote from: Alpha on January 20, 2014, 06:43:43 AM
Quote from: dmoore on January 19, 2014, 10:20:01 PM
Nice catch re wxString, just need to sort alphabetically for equal priority items?
The problem is, CalcValue() (which is tries to nearly replicate/quantify the algorithm YCM designed) gives wxString a worse value because its word boundary string is only "ws" (whereas wxSTRING_REVERSE_ITERATOR for example is "wsri", more closely matching the input). (Items with equal priority *should* already be in alphabetical order, assuming my implementation works the way I think it does.)
Unfortunately, re-balancing the weighting algorithm would probably cause it to lose the benefit it gives in most other cases. (I am tempted to hardcode a small score boost for wxString, because it is rather common in wxWidgets based code... but that could get even uglier if other similar problem tokens show up.)
There are almost certainly going to be similar problems, so don't hardcode. How about doing Acronym matching as a separate, alternative calculation that only gets used if it would create a higher score?
example: user types wxs, candidates (and their acronyms) are
wxString -> wS
wxStringHelper -> wSH
wantsXtraStr ->wxS
wxstring0 -> no acronym
You compute a score for wxString (3 matches at the start), a score for wS (no matches) and take the higher score => 3 prefix matches (I'm abstracting from the exact numeric score you might give this)
You compute a score for wxStringHelper (3 matches at the start), a score for wSH (no matches) and take the higher score => 3 prefix matches
You compute a score for wantsXtraStr (3 partial matches), a score for wxS (3 prefix matches) and take the higher score => 3 prefix matches
You compute a score for wxstring (3 matches at the start), there's no acronym so ignore => 3 prefix matches
So in this case all 3 would have an equal score, so they should appear alphabetically (you could also slightly penalize acronym match, because it isn't as intuitive as a straight prefix match).
Quote from: oBFusCATed on January 20, 2014, 10:32:09 AM
Have you tried to make the docstring API to be consisted of two steps/calls?
For example:
1. CCManager calls a plugin to to start gathering the data. No ui is shown to the user or empty UI is shown, so he can have a chance to cancel it.
2. The CC plugin has gathered the data and then calls CCManager to show the UI or fill the UI.
1 and 2 happen in the main thread. This way you leave the task of threading to the plugin.
Although that would work, I would be hesitant to use it as I have so far designed the interface such that individual CC plugins do not need to know that CCManager exists, nor to talk with it. They only have to return data when requested.
Quote from: dmoore on January 20, 2014, 02:33:55 PM
[...] How about doing Acronym matching as a separate, alternative calculation that only gets used if it would create a higher score?
Hmm, I will experimenting with that...
Quote from: Alpha on January 20, 2014, 11:12:12 PM
Although that would work, I would be hesitant to use it as I have so far designed the interface such that individual CC plugins do not need to know that CCManager exists, nor to talk with it. They only have to return data when requested.
OK you can make a custom event cbEVT_CC_DOC_STRING_READY or something like that...
Quote from: oBFusCATed on January 21, 2014, 12:41:24 AM
OK you can make a custom event cbEVT_CC_DOC_STRING_READY or something like that...
I think that would be a reasonable solution. (I was quite literally just typing that same thought, when I saw you had posted.)
Any progress with this feature? Would like to see a finished version merged into trunk too.
Quote from: dmoore on March 20, 2014, 08:38:00 PM
Any progress with this feature? Would like to see a finished version merged into trunk too.
Unfortunately in the busyness of my life, I have not yet had time to work further on it. Hopefully I will soon though...
I have cleaned up this patch somewhat (attached). No significant functional changes, but it should at least apply to the current trunk.
(Improvements to stability/usability are in progress...)
Settings > Editor > Code Completion
"Display info when hovering mouse over a token in the editor"
Is it possible to have this option, excluding the fuck-with-whatever-you're-typing feature?
(Edit: yes it is, but could you please implement it)
Also, make the tooltip instant. Draw --it-- (because I hope you don't intantiate some object every time you need to render a different tip string) at the cursor's position immediately. I'm capable of moving my cursor if it's obstructive. If you keep it how they stick to a fixed position obtained when the tip becomes visible (which is stupid), then obstruction becomes more frustrating. ;)
pieman:
You're getting too aggressive lately. Mind your words please (don't use the f** one)! Also don't hijack topics. This topics has nothing to do with the things you've suggested.
If you want something implemented now or tomorrow start coding, else try to be polite and convince the current devs that your proposal are worth our time!
Quote from: Alpha on April 11, 2014, 08:49:21 PM
(Improvements to stability/usability are in progress...)
Updated patch (attached) is fully functional, and
stable (as far as I can tell) edit: discovered a crash to work out.
Quote from: Alpha on January 19, 2014, 10:01:31 PM
Quote from: dmoore on January 19, 2014, 07:13:08 AM
a very nice hack indeed. Works beautifully.
Until you try to complete wxString. With my clang plugin, it does not even show up until I have typed the full name, and even then, it is the fifth item. ???
Also included in this patch is a potential solution to this. CCManager records each time you select (hit enter/double click to complete and insert the result) an item that took "too long" to come up in the autocomplete listing. These entries (stored in a small buffer, to allow older, no longer used, entries to expire) are given a score bonus, so they float to the top faster the next time the same query comes up.
After a bit more testing, and adding the appropriate configuration menu, this feature should be ready to commit.
Quote from: Alpha on April 12, 2014, 02:52:32 AM
[...] stable (as far as I can tell) edit: discovered a crash to work out.
Probably working now ::)... (patch attached)
While i can see some utility in this approach, to be honest I would prefer something more deterministic. What is the issue with implementing the algorithm I proposed above?
Quote from: dmoore on April 12, 2014, 05:11:07 AM
What is the issue with implementing the algorithm I proposed above?
Mostly it was that this "adaption" algorithm was much faster for me to get to a working state. (To implement your algorithm, I would need to spend some trial-error time, testing different constants.)
Although determinism is, in general, preferential (to me as well), in using this completion algorithm, it bothers me each time something takes longer than with plain alphabetical. I doubt I could come up with any deterministic algorithm to sufficiently satisfy myself.
That said, there is an easy point in the code to disable adaption, so it can be a configurable option.
Quote from: Alpha on April 12, 2014, 05:58:18 AM
Quote from: dmoore on April 12, 2014, 05:11:07 AM
What is the issue with implementing the algorithm I proposed above?
[...] To implement your algorithm, I would need to spend some trial-error time, testing different constants.
This seems to work reasonably... (I will need to do more testing though.)
static int CalcPrefixValue(wxString query, wxString token)
{
if (query.IsEmpty() || token.IsEmpty())
return 0;
int val = 0;
int shift = sizeof(val) * 8 - 2;
if (!token.Lower().StartsWith(query.Lower()))
val += (1 << shift);
--shift;
if (!token.StartsWith(query))
val += (1 << shift);
--shift;
int idx = token.Lower().Find(query.Lower());
if (idx == wxNOT_FOUND)
val += (1 << shift);
--shift;
if (idx > 6)
val += (1 << shift);
--shift;
if (idx > 3)
val += (1 << shift);
--shift;
idx = token.Find(query);
if (idx == wxNOT_FOUND)
val += (1 << shift);
--shift;
if (idx > 6)
val += (1 << shift);
--shift;
if (idx > 3)
val += (1 << shift);
--shift;
val += (1 << shift);
val *= std::max<int>(1, 4 - query.Length());
return val; // lower is better
}
@alpha: This untested pseudo code (which looks a bit like C/C++) is roughly what I had in mind:
int find(const wxString &str, const wxString &substr)
{
int pos = str.Find(substr);
return pos == wxNOT_FOUND? str.Len() : pos;
}
wxString acronym(const wxString &str)
{
wxString s = str.Mid(0,1);
for (i=1;i<str.Len();++i)
{
n = str.Mid(i,1);
bool underscore = false;
while (n == _T("_"))
{
underscore=true;
++i;
n = str.Mid(i,1);
}
if (n==wxEmptyString)
break;
if (underscore || n == n.upper())
s<<n;
}
return s;
}
int min(int x, int y)
{
return x<y? x:y;
}
int score(const wxString &a,const wxString &value)
{
wxString al = a.Lower();
wxString aa = acronym(a);
wxString aal = aa.Lower();
int pos_a = find(a, value)*4;
int pos_al = find(al, value.Lower())*4+1;
int pos_aa = find(aa, value)*4+2;
int pos_aal = find(aal, value.Lower())*4+3;
return min(min(pos_a,pos_al),min(pos_aa,pos_aal));
}
possible that your code does exactly the same thing, but I had trouble getting my head around the bit shifts. Would want to make sure the equal scored items are alphabetized as well.
I believe your pseudo code is fairly similar to what my code attempts. It is definitely a lot more readable than my code, so if I am able to get it to function similarly, I will try to use it to replace my current implementation.
Sorry about the bit shifts; I used them in CalcPrefixValue() so that the output score would be in the same range as CalcValue(). (Using shifts in CalcValue() was (sort of) necessary because it attempts to quantize different states, and each of which must have greater impact on the output, than the combination of all following tests.)
(I apparently got lost in the bit shifting in my code as well, because the CalcPrefixValue() I posted has a fairly bad overflow error. ... I will try to post a full patch later today containing the combination of ranking systems.)
Assuming the code is working how it is supposed to, alphabetization is already handled; items of equal weight are ordered by index, and the indices are pre-sorted to be alphabetical.
My pseudo code isn't completely right either. The "Find" calls that return wxNOT_FOUND should actually be ignored and not given a (albeit large) positive score.
Attached is the hybrid. To test without adaption, put a return at the beginning of CCManager::DoUpdateAdaption().
What do you think of the current performance? It *feels* acceptable to me right now (but I have not tested on multiple machines).
Another thought: the adaption mechanism can be made more deterministic running it in 'learning mode' for a session. Afterwards (in subsequent sessions), use the results as static data to run the heuristics.
Seems to mostly work, but in python, with completion on "os." then type "mod" I get: "mknod, chmod, fchmod, removedirs" in that order
I expected to only see chmod and fchmod, because the other two aren't exact matches for mod or acronyms that include mod.
Also, with the disappointing lack of useful namespace usage in C++, I can see why you would want the "adaption" algorithm, but at least for python it seems less of a necessity. It would be good to cleanly decouple that functionality from the basic functionality, so that it can be easily switched off. (I haven't read your code closely so maybe that's already done.) Another approach would be to give priority to tokens that are used in the local file, but I'm guessing that would be quite a bit trickier to implement.
Another thing that occurrs to me is that we only use the YCM style filtering to limit the already created list of items that match the typed prefix before CC opens, not what is actually add to the list in the first place. This is probably for the best?
Quote from: Alpha on April 16, 2014, 04:17:07 AM
Another thought: the adaption mechanism can be made more deterministic running it in 'learning mode' for a session. Afterwards (in subsequent sessions), use the results as static data to run the heuristics.
Seems like if you would use it at all that it may as well be always running. Over time your coding evolves... Maybe just need a way of limiting the size of the dictionary, or a way of making sure that stuff that hasn't been used in a while get de-prioritized?
Quote from: dmoore on April 17, 2014, 04:17:15 PM
Seems to mostly work, but in python, with completion on "os." then type "mod" I get: "mknod, chmod, fchmod, removedirs" in that order [...]
"mod" matches all of them because "
mkn
od" and "re
move
dirs". "mknod" is first, because it starts with the same character that you first typed; "chmod" and "fchmod" are next because the contain an exact match of "mod". This the (current) behaviour of the prioritization.
Quote from: dmoore on April 17, 2014, 05:15:31 PM
Also, with the disappointing lack of useful namespace usage in C++, I can see why you would want the "adaption" algorithm, but at least for python it seems less of a necessity. It would be good to cleanly decouple that functionality from the basic functionality, so that it can be easily switched off. (I haven't read your code closely so maybe that's already done.) Another approach would be to give priority to tokens that are used in the local file, but I'm guessing that would be quite a bit trickier to implement.
Not exactly clean :) , but the two easiest ways to disable are either comment out the line that searches the cache, or disable the function that adds items to the cache (or both).
Priority of local tokens would have to be done on the plugin end, however, the infrastructure is already supported. If the plugin marks local tokens with a lower (meaning better) priority, the sorting algorithm boosts their score (assuming that these priority tokens account for less than 1/4 of the total tokens).
Quote from: dmoore on April 17, 2014, 05:15:31 PM
Another thing that occurrs to me is that we only use the YCM style filtering to limit the already created list of items that match the typed prefix before CC opens, not what is actually add to the list in the first place. This is probably for the best?
Sort of; CCManager sends the plugins a "fake" context, that is at most one character long. That seemed to provide an acceptable balance showing applicable results, and cost of building/filtering a large list.
Quote from: dmoore on April 17, 2014, 05:15:31 PM
Seems like if you would use it at all that it may as well be always running. Over time your coding evolves... Maybe just need a way of limiting the size of the dictionary, or a way of making sure that stuff that hasn't been used in a while get de-prioritized?
Currently, the dictionary is not saved on close. However, when making it persistent, I think saving only the 64 (or some other magic constant) most recently (or perhaps frequently?) used entries to the config file, would be a good option.
Quote from: Alpha on April 23, 2014, 04:47:36 PM
"mod" matches all of them because "mknod" and "removedirs". "mknod" is first, because it starts with the same character that you first typed; "chmod" and "fchmod" are next because the contain an exact match of "mod". This the (current) behaviour of the prioritization.
I understood that, I just think it's bad. At a minimum, the exact matches should get priority over inexact matches, but I'm not really convinced that you should match imperfect matches at all unless they fit the acronym rules. (That's what my pseudocode was attempting to do)
What about, for example: clanccgcext -> clang_codeCompleteGetContexts() (from libClang)? Imperfect matches allow typing a longer name by (mentally) skipping through, typing only the parts where it differentiates from the other items listed.
I find my typing habit often is: type a prefix (to bring up autocomp), enter acronym for the remaining parts, (if choice is still "far") enter suffix.
(I still am trying to get used to this; it is a completely different way of thinking.)
Quote from: Alpha on April 23, 2014, 09:17:44 PM
What about, for example: clanccgcext -> clang_codeCompleteGetContexts() (from libClang)? Imperfect matches allow typing a longer name by (mentally) skipping through, typing only the parts where it differentiates from the other items listed.
I find my typing habit often is: type a prefix (to bring up autocomp), enter acronym for the remaining parts, (if choice is still "far") enter suffix.
I guess I can see that, but how about making the inexact matches lowest priority. So an ordering like
exact match (ordered by position of first matching char, then alphabetical)
exact abbreviation match (ordered by position of first matching char, then alphabetical)
inexact matches (ordered by position of first matching char, then alphabetical)
Quote
(I still am trying to get used to this; it is a completely different way of thinking.)
Agree and me too.