Intelligent Routing
NorthSignal's defining feature is its Intelligent Routing Engine. Instead of forcing you to pick a model for every prompt, NorthSignal analyzes the complexity of your request and automatically routes it to the model best suited for the job.
How the Router Decides
The routing engine evaluates your prompt across multiple dimensions in milliseconds:
| Prompt Characteristic | Routed To | Why |
|---|---|---|
| Coding, debugging, complex logic | Claude Sonnet 4.5 (Anthropic) | Best-in-class for software engineering tasks |
| Very long documents (100+ pages) | Gemini 2.0 Pro (Google) | Largest context window — can read entire books |
| Quick questions, creative writing | GPT-5 (OpenAI) | Fastest time-to-first-token, versatile |
| Simple tasks, summaries | Fastest available model | Prioritizes speed over depth |
The router considers:
- Token complexity — How sophisticated is the language and structure of the prompt?
- Input length — How much text, code, or file data are you sending?
- Task type signals — Keywords like "debug," "refactor," "summarize," or "compare" influence routing.
- Available keys — The router only considers providers you've connected.
The more API keys you add to NorthSignal, the smarter the routing becomes. With all three providers connected, the router has maximum flexibility to pick the optimal model.
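The decision process above can be sketched as a simple set of heuristics. The model identifiers, token threshold, and keyword lists below are illustrative assumptions for the sketch, not NorthSignal's actual implementation:

```python
# Hypothetical sketch of NorthSignal-style routing heuristics.
# Model names, thresholds, and keyword lists are illustrative
# assumptions, not the real routing engine.

CODE_KEYWORDS = {"debug", "refactor", "implement", "stack trace"}
QUICK_KEYWORDS = {"summarize", "tl;dr", "quick question"}

def route(prompt: str, input_tokens: int, connected: set[str]) -> str:
    """Pick a model from task signals, input length, and available keys."""
    text = prompt.lower()
    # Very long inputs go to the largest context window, if connected.
    if input_tokens > 200_000 and "google" in connected:
        return "gemini-2.0-pro"
    # Coding/debugging signals favor the software-engineering specialist.
    if any(k in text for k in CODE_KEYWORDS) and "anthropic" in connected:
        return "claude-sonnet-4.5"
    # Simple tasks prioritize speed over depth.
    if any(k in text for k in QUICK_KEYWORDS):
        return "fastest-available"
    # Default: fast, versatile generalist when available.
    if "openai" in connected:
        return "gpt-5"
    return "fastest-available"
```

Note how the last branch reflects the "available keys" rule: the router never returns a model whose provider you haven't connected.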
Overriding the Router
You always have full control. If you want to use a specific model:
- Look below the chat input bar for the Model Selector dropdown.
- Click the dropdown — it defaults to Auto-Route.
- Select your preferred model from the list (e.g., Claude Sonnet 4.5).
NorthSignal will lock your current chat to that model until you change it back.
Why Not Just Always Use the "Best" Model?
Different models have different strengths and different pricing. The Intelligent Router saves you money by:
- Sending simple questions to cheaper, faster models.
- Only invoking expensive reasoning models (like o1) when the task truly demands it.
- Picking the provider with the largest context window when you attach massive files, rather than hitting a context limit error.
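To see why routing simple prompts to cheaper models matters, here is a small worked example. The per-million-token prices and model names are made up for illustration; real provider pricing differs and changes frequently:

```python
# Illustrative cost comparison. Prices and model names are invented
# placeholders, not real provider rates.
PRICE_PER_M_INPUT_TOKENS = {
    "premium-reasoning": 15.00,  # hypothetical $/1M input tokens
    "fast-small": 0.50,          # hypothetical $/1M input tokens
}

def cost(model: str, tokens: int) -> float:
    """Input cost in dollars for a prompt of the given token count."""
    return PRICE_PER_M_INPUT_TOKENS[model] * tokens / 1_000_000

# A simple 2,000-token question is 30x cheaper on the small model.
expensive = cost("premium-reasoning", 2_000)  # 0.03
cheap = cost("fast-small", 2_000)             # 0.001
```

At these illustrative rates, every simple question the router diverts away from the premium model cuts that prompt's cost by an order of magnitude, which compounds quickly across a day of chatting.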
The result is higher quality answers at lower cost — without you having to think about which model to pick.