Another week, another AI model release. Except there are signs that Anthropic’s release of Fable 5 is different. This week I’ll be talking about the cybersecurity craziness that has plagued the AI space in the short time that I’ve been here.

Fable 5: Cybersecurity is about to get (even) crazier

I re-watched Avengers: Age of Ultron recently. Unpopular opinion but it’s one of my favourite Marvel movies, and I think it’s pretty relevant today.

(Not because of the killer AI that they create. Because of the discussion that the Avengers have that actually leads to the creation of Ultron.)

Earth’s Mightiest Heroes note that the number of supervillains, disasters, and cosmic threats has increased exponentially since their arrival. I think it’s an interesting parallel to the way that cybersecurity is developing in the age of AI.

Frontier labs keep releasing more and more sophisticated models that are better and better at finding and patching vulnerabilities. However, this also means that they are better and better at finding and exploiting vulnerabilities. Which, of course, creates the need for a better model. And so on and so forth.

A quick timeline

April 7: Anthropic releases the system card for Mythos preview. They do not release the model to the public because it is too good at finding vulnerabilities in code.
Also April 7: Anthropic announces Project Glasswing, an initiative to preemptively use Mythos to patch vulnerabilities.
May 11: The Mini Shai-Hulud exploit hits the npm ecosystem and source code is shared.
May - June: A ton of Windows and Linux exploits get released.
June 9: Fable 5, a model with Mythos-like capabilities, is released to the public. Uh-oh!

So what does this all mean? Well, it seems that with the release of Fable 5 we are going to see even more vulnerabilities and craziness come out in the next little while.

I understand that Fable 5 is shipping with guardrails, but we’ve seen the community jailbreak models before. One interesting security measure of Fable 5 is that it hands off potentially harmful queries to a less sophisticated model. However, I can’t imagine that this is really going to fix anything - if Fable is doing the legwork and only handing off the last bit to a worse model, I think these exploits are still going to get written.

One of the interesting parts of the Mythos discourse was that cheaper models could patch the same vulnerabilities. That is, if they knew exactly where to look. Mythos is in another class because it finds those places to look on its own.

And we’re talking about enterprise-grade software here. Add a bunch of vibe-coded apps with no security to the mix and you’ve got yourself a perfect storm.

The main thing is, the cat’s out of the bag. Bad actors now have a better idea of where to look to find vulnerabilities, and increasingly sophisticated tools to execute attacks.

To go back to the Avengers analogy: if Ultron represents the bad actors and Mythos represents the Vision, the rest of us are the citizens on the ground hoping the city doesn’t fall on top of us. While we wait to see who wins, the best we can do is harden our own defences - which ultimately means constantly changing passwords and putting up with two-factor authentication. The battle of the bots is heating up, and we’re all in the blast radius.

Wrap up

That’s all for this week - I thought this release was big enough news that I should talk about it at length. Let me know if you liked this op-ed style of newsletter! Or if you want me to go back to talking about tools and repos. Thanks as always for reading and see you next week.

The Blueprint #4: Threat escalation and a whole bunch of exploits
