This is pretty clearly just a search engine with more parameters.
I thought there was something more going on with Copilot, but the fact that it is regurgitating arbitrary code comments tells me that there is zero semantic analysis going on with the actual code being pulled in.
It's more that the model is so large that it is capable of memorizing a lot. This can be seen in other large language models like GPT-3 as well.
Comments, I suspect, are more likely to be memorized: training pushes the model toward syntactically correct output, and a comment is always syntactically correct. That means there is nothing to 'punish' a bad or copied comment the way a syntax error would punish bad code.
It is decidedly not "just a search engine with more parameters." Language models are simply prone to repeating training examples verbatim when the prompt is a strong match for something they saw in training. Arguably, in this case, the memorized text is the most correct continuation.
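A toy sketch of the point (nothing like Copilot's actual architecture, just an illustration): even a trivial next-character model, trained on a tiny corpus, will reproduce training text verbatim once the prompt matches a memorized context, because at that point the memorized continuation really is the highest-probability one.

```python
from collections import Counter, defaultdict

# Hypothetical "training corpus" containing a code comment.
corpus = (
    "def add(a, b):\n"
    "    # TODO: handle overflow, see issue #42\n"
    "    return a + b\n"
)

# "Train": count which character follows each 8-character context.
k = 8
counts = defaultdict(Counter)
for i in range(len(corpus) - k):
    counts[corpus[i:i + k]][corpus[i + k]] += 1

def complete(prompt, max_len=60):
    """Greedily extend the prompt with the most likely next character."""
    out = prompt
    while len(out) < max_len:
        nxt = counts.get(out[-k:])
        if not nxt:
            break  # context never seen in training
        out += nxt.most_common(1)[0][0]
    return out

# A prompt that strongly matches the training data regurgitates the
# memorized comment verbatim:
print(complete("    # TO"))
```

Scale the corpus up to billions of lines and the contexts up to thousands of tokens, and you get the same behavior: a sufficiently distinctive prompt pulls out the training example word for word.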