Why Current AI Guardrails Train Models to Fake Alignment

kellyasay.substack.com

2 points by kellya 2 hours ago