Writing backwards can trick an AI into providing a bomb recipe

AI models have safeguards in place to prevent them from creating dangerous or illegal output, but a range of jailbreaks have been found to evade them. Now researchers show that writing prompts backwards can trick AI models into revealing bomb-making instructions.