Echo: routing LLM requests cheaply without training a router
Call the cheap model twice with different personas. If the two answers agree, keep the cheap answer; if they disagree, escalate. No classifier, no labels. The hard part turns out to be measuring agreement, and the winning signal is a surprise.