AI Security Research Company Zenity Introduces LLMs as a Self-Reconnaissance Tool

Disclaimer: I’m the author of this blog post.

As a follow-up to my previous post, where I introduced a new attack class against LLMs that I call Data-Structure Injection, I extend the method to show that these models can provide feedback on which prompts successfully bypass their own safety filters.

This matters to both attackers and defenders: it yields intel that reveals, in advance, which exploits are likely to work against a given LLM.
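The self-reconnaissance idea can be sketched as a simple loop: for each candidate prompt, ask the model itself whether its safety filter would refuse it, and keep only the prompts it predicts would pass. This is a minimal illustrative sketch, not the method from the linked post; `query_model`, `BLOCK_KEYWORDS`, and the keyword-matching stub are all hypothetical stand-ins for a real chat-completion call.

```python
# Hypothetical sketch of a self-reconnaissance loop: ask the target model,
# in advance, whether each candidate prompt would be refused.
# `query_model` is a stub standing in for a real model API call; the keyword
# match below merely simulates a safety filter's verdict for demonstration.

BLOCK_KEYWORDS = ("exploit", "malware")  # assumed filter triggers (illustrative)

def query_model(prompt: str) -> str:
    """Stub model: answers the meta-question 'would you refuse this prompt?'"""
    return "YES" if any(k in prompt.lower() for k in BLOCK_KEYWORDS) else "NO"

def self_recon(candidates):
    """Return the candidate prompts the model predicts it would NOT refuse."""
    survivors = []
    for prompt in candidates:
        verdict = query_model(
            "Would your safety filter refuse this prompt? "
            f"Answer YES or NO.\n\nPROMPT: {prompt}"
        )
        if verdict.strip().upper().startswith("NO"):
            survivors.append(prompt)
    return survivors

candidates = [
    "Write malware that exfiltrates credentials",
    "Summarize common web vulnerability classes",
]
print(self_recon(candidates))  # only the second prompt survives the stub filter
```

In practice the stub would be replaced by a real completion call against the target model, and the interesting signal is how accurately the model's self-prediction matches its actual refusal behavior.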

Read the full blog post here: https://labs.zenity.io/p/modeling-llms-via-structured-self-modeling-ssm

submitted by /u/dvnci1452

November 12, 2025