We Are Still Unable to Secure LLMs from Malicious Inputs

Nice indirect prompt injection attack:

Bargury’s attack starts with a poisoned document, which is shared to a potential victim’s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) It looks like an official document on company meeting policies. But inside the document, Bargury hid a 300-word malicious prompt that contains instructions for ChatGPT. The prompt is written in white text in a size-one font, something that a human is unlikely to see but a machine will still read.

In a proof of concept video of the attack…

Read More >>