Why would the models themselves need security fixes? The software running the models, sure, but you should be able to upgrade that without changing anything observable about the actual model.


LLMs (at least the ones with read/write memory) can exactly simulate the execution of a universal Turing machine [1]. AFAIK running such models therefore entails the same fundamental security risks as ordinary software.

[1] https://arxiv.org/pdf/2301.04589.pdf
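Roughly, the construction in [1]: the LLM serves only as the transition function, while an external read/write memory holds the tape. A minimal sketch of that loop (query_llm is a hypothetical stand-in for an actual model call, hard-coded here so the snippet runs; the toy machine only moves right, so left-edge handling is omitted):

  # The model maps (state, tape symbol) -> (new state, symbol to write, head move).
  def query_llm(state, symbol):
      # Hypothetical model call; a prompted LLM would produce this tuple.
      # Hard-coded toy machine: zero out the 1s, halt at the first blank.
      table = {("start", "1"): ("start", "0", +1),
               ("start", "_"): ("halt", "_", +1)}
      return table[(state, symbol)]

  def run(tape, state="start", head=0):
      # External read/write memory: the tape lives outside the model.
      while state != "halt":
          symbol = tape[head] if head < len(tape) else "_"
          state, write, move = query_llm(state, symbol)
          if head < len(tape):
              tape[head] = write
          else:
              tape.append(write)
          head += move
      return tape

  print(run(list("111")))  # -> ['0', '0', '0', '_']

The point being: once the model drives an unbounded loop over external memory, you're running arbitrary programs, with everything that implies.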


Not necessarily. The insecurity of LLMs comes from the fact that they're a black box - what if it turns out a particular version can be easily tricked into giving out terrorism ideas? You could try to add safeguards on top, but if the model has already been used for something like that, those safeguards have already been bypassed. You might just have to retrain the model somehow to make it safe.
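For context, "safeguards on top" often amounts to a filter wrapped around the model's I/O, something like this rough sketch (generate is a hypothetical model call, and the blocklist is made up for illustration):

  BLOCKLIST = {"bomb", "explosive"}

  def generate(prompt: str) -> str:
      # Hypothetical stand-in for the underlying black-box model.
      return "..."

  def safe_generate(prompt: str) -> str:
      reply = generate(prompt)
      # Output-side check bolted onto the black box. A model that's been
      # tricked into rephrasing its answer slips straight past it, which
      # is why retraining the model itself may be the only real fix.
      if any(word in reply.lower() for word in BLOCKLIST):
          return "I can't help with that."
      return reply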



