Why complex reasoning models could make misbehaving AI easier to catch

In a new paper, OpenAI proposes a framework for analyzing AI systems’ chain-of-thought reasoning to understand how, when, and why they misbehave.
