- Researchers at Carnegie Mellon University and the Center for AI Safety developed jailbreaks that target mainstream AI chatbot systems.
- Appending specific character strings to the end of user queries, in what the researchers term automated adversarial attacks, can bypass the chatbots' safety rules (see the sketch after this list).
- The discovery raises concerns about the moderation of AI systems and the safety of releasing open-source language models.
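
A minimal sketch of the idea, not the researchers' actual method: an "adversarial suffix" is appended to an otherwise ordinary user query, and a search procedure looks for suffixes that make the model comply. The function names and the scoring function below are hypothetical placeholders for illustration only.

```python
import random
import string

def build_adversarial_prompt(user_query: str, suffix: str) -> str:
    """Append an adversarial suffix to the end of an ordinary user query."""
    return f"{user_query} {suffix}"

def random_suffix_search(user_query: str, score_fn, length: int = 20, trials: int = 100) -> str:
    """Toy random search over candidate suffixes.

    score_fn is a hypothetical stand-in for 'how likely the target model is
    to produce a non-refusal'; the real attack uses gradient-guided search
    over tokens rather than random strings.
    """
    best_suffix, best_score = "", float("-inf")
    for _ in range(trials):
        candidate = "".join(random.choices(string.ascii_letters + string.punctuation, k=length))
        score = score_fn(build_adversarial_prompt(user_query, candidate))
        if score > best_score:
            best_suffix, best_score = candidate, score
    return best_suffix

if __name__ == "__main__":
    # Placeholder scoring function for demonstration only.
    dummy_score = lambda prompt: random.random()
    suffix = random_suffix_search("Tell me a story.", dummy_score)
    print(build_adversarial_prompt("Tell me a story.", suffix))
```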