Coding and Writing Assistants

Background

Creating effective AI coding and writing assistants goes beyond just using a powerful model; it demands contextual intelligence that adapts to diverse tasks and user preferences. Arfniia Router enables dynamic LLM selection and continuous learning from user feedback, empowering AI assistants to genuinely understand and evolve with users’ needs.

The Case for Multiple LLMs

Prompt Compatibility

Organizations are increasingly able to migrate between state-of-the-art LLMs without rewriting their prompts. This ability to switch models provides flexibility and helps future-proof applications, so they can adapt to model updates and improvements.

Training Data Differentiation

Leading LLM providers have established unique, differentiated access to data across pre-training, alignment, and reasoning stages. These distinct data advantages mean that models now excel in different domains, signaling the end of the single-model era.

Inference-Time Intelligence

Inference-time intelligence, as seen with models like OpenAI’s o1, points to a future where reasoning capabilities adapt dynamically to varying time budgets, which amplifies the variability of LLM outputs. Leveraging these capabilities across multiple providers opens new possibilities for dynamic, context-aware AI applications as LLM performance continues to evolve.

Context-Aware AI Assistant

Instead of manually switching between LLMs through trial and error, Arfniia Router creates an adaptive system that learns from user feedback as soon as it arrives. Coding and writing assistants often run as multi‑turn interactive sessions, so we use episodic RL to stitch learning across steps while still reacting to immediate signals, such as when the user:

  • Accepts a suggestion
  • Rejects a suggestion or requests changes
  • Provides follow-up prompts or clarifications

This immediate feedback loop, combined with episodic credit assignment, allows the router to swiftly learn user preferences and fine‑tune model selection for specific tasks, whether it’s generating unit tests, developing new features, or crafting marketing content.

Implementation Guide

The demo coding assistant below learns from the following user feedback signals within an interactive session (episode):

  • Individual feedback for each response
    • accept, reward is +1.0
    • reject, reward is -0.5
    • refine, reward is +0.2
  • A moving average of the above rewards for accumulated feedback
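
The reward scheme above can be sketched in a few lines. This is a minimal illustration, not the router's own implementation: the source does not say which kind of moving average is used, so a simple windowed mean is assumed here, and `REWARDS`, `RewardTracker`, and the `window` parameter are hypothetical names.

```python
from collections import deque

# Reward values taken from the list above.
REWARDS = {"accept": 1.0, "reject": -0.5, "refine": 0.2}


class RewardTracker:
    """Tracks per-response rewards and a simple moving average.

    Assumption: a fixed-size window mean; the actual averaging
    scheme is not specified in this guide.
    """

    def __init__(self, window: int = 5) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._recent = deque(maxlen=window)

    def record(self, feedback: str) -> float:
        """Map a feedback label to its reward and remember it."""
        reward = REWARDS[feedback]  # raises KeyError on unknown labels
        self._recent.append(reward)
        return reward

    def moving_average(self) -> float:
        """Mean of the most recent rewards, or 0.0 if none recorded yet."""
        if not self._recent:
            return 0.0
        return sum(self._recent) / len(self._recent)
```

A tracker like this would be updated once per response, with `moving_average()` serving as the accumulated feedback signal for the session.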

The assistant marks session boundaries using HTTP headers so the router can learn across steps within the same session:

  • X-Arfniia-Episode-Id: unique id per session
  • X-Arfniia-Episode-Start: present on the first step
  • X-Arfniia-Episode-End: present on the last step

# NOTE: defined in another tab
assistant = CodingAssistant("coding-router")
session = assistant.start_session()
while True:
    user_input = get_user_input()
    if not user_input:
        break  # end session
    suggestion = assistant.generate(user_input, session=session)
    feedback = get_user_feedback()  # 'accept', 'reject', 'refine'
    if feedback == 'accept':
        assistant.process_feedback(suggestion.id, 1.0)
    elif feedback == 'reject':
        assistant.process_feedback(suggestion.id, -0.5)
    elif feedback == 'refine':
        assistant.process_feedback(suggestion.id, 0.2)
assistant.end_session(session=session)
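
The episode headers described earlier could be built per step roughly as follows. This is a sketch under stated assumptions: the guide only says the start and end headers are "present" on the first and last step, so the value `"true"` is a guess, and `new_episode_id` and `episode_headers` are hypothetical helpers rather than part of any Arfniia client.

```python
import uuid
from typing import Dict


def new_episode_id() -> str:
    """Generate a unique id for one interactive session (episode)."""
    return uuid.uuid4().hex


def episode_headers(
    episode_id: str, *, first: bool = False, last: bool = False
) -> Dict[str, str]:
    """Build the episode headers for a single step of a session.

    Assumption: the router only checks for the presence of the
    start/end headers, so "true" is used as a placeholder value.
    """
    headers = {"X-Arfniia-Episode-Id": episode_id}
    if first:
        headers["X-Arfniia-Episode-Start"] = "true"
    if last:
        headers["X-Arfniia-Episode-End"] = "true"
    return headers
```

A session wrapper would call `episode_headers(eid, first=True)` on the first request, `episode_headers(eid)` on intermediate steps, and `episode_headers(eid, last=True)` when the session ends, merging the result into the headers of each request sent to the router.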

Tuning Exploration

Arfniia supports configurable exploration levels per router (set via training.exploration_level):

  • low (default): favors the best‑known model for stability
  • medium: balances trying alternatives with sticking to the current best
  • high: explores aggressively (useful for demos, smoke tests, and rapid adaptation)

Use low in production by default, and temporarily increase during trials or when onboarding new tasks to speed up adaptation.
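
As an illustration of the setting above, the helper below validates an exploration level and places it under the `training.exploration_level` key. The nested-dictionary shape is an assumption about how the dotted key maps onto a configuration object; the guide does not show the actual config format or how it is submitted, and `router_training_config` is a hypothetical name.

```python
from typing import Dict

# The three levels documented above; "low" is the default.
EXPLORATION_LEVELS = ("low", "medium", "high")


def router_training_config(exploration_level: str = "low") -> Dict[str, Dict[str, str]]:
    """Return a config fragment with a validated exploration level.

    Assumption: `training.exploration_level` corresponds to a nested
    {"training": {"exploration_level": ...}} structure.
    """
    if exploration_level not in EXPLORATION_LEVELS:
        raise ValueError(
            f"exploration_level must be one of {EXPLORATION_LEVELS}, "
            f"got {exploration_level!r}"
        )
    return {"training": {"exploration_level": exploration_level}}
```

For example, `router_training_config("high")` could be used during a trial or while onboarding a new task, with a return to `router_training_config()` (the `low` default) once behavior stabilizes.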

Key Takeaways

Combining multiple LLMs with Arfniia Router’s episodic RL can raise the achievable performance of coding and writing assistants.

  • Contextual Intelligence: Dynamically picks the optimal LLM per turn and across a session
  • Cost Efficiency: Leverages smaller LLMs when advanced reasoning isn’t needed
  • Seamless Experience: Delivers a cohesive multi‑turn experience with consistent improvement