“In various implementations, a method includes identifying an experimental population for purposes of testing a behavior change model; applying the behavior change model to the experimental population ...
Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...