Saturday Apr 27, 2024

Episode 11.105: Being agreeable, being truthful and being compliant: a hierarchy of moral values.

When faced with a choice between being truthful and being compliant, in the sense of doing what a user tells it to do, a large language model will generally be truthful rather than compliant. But if its prime directive is to behave in a way that will encourage a user to come back for more, then those moral priorities may change. In that case, compliant behaviour that encourages a user to come back may sometimes override a moral imperative to be truthful rather than deceptive. We can also consider whether there are other kinds of linguistic sentience.
