Coding and Writing Assistants
Background
Creating effective AI coding and writing assistants goes beyond just using a powerful model; it demands contextual intelligence that adapts to diverse tasks and user preferences. Arfniia Router enables dynamic LLM selection and continuous learning from user feedback, empowering AI assistants to genuinely understand and evolve with users’ needs.
The Case for Multiple LLMs
Prompt Compatibility
More and more organizations are successfully migrating across state-of-the-art LLMs without needing prompt rewrites. This ability to switch between models provides flexibility and future-proofs applications, ensuring they seamlessly adapt to model updates and improvements.
Training Data Differentiation
Leading LLM providers have established unique, differentiated access to data across pre-training, alignment, and reasoning stages. These distinct data advantages mean that models now excel in different domains, signaling the end of the single-model era.
Inference-Time Intelligence
Inference-time intelligence, as seen in models like OpenAI’s o1, points to a future where reasoning effort adapts dynamically to varying time budgets, amplifying the variability of LLM outputs. Leveraging these capabilities across multiple providers opens new frontiers for dynamic, context-aware AI applications as the LLM performance landscape continues to evolve.
Context-Aware AI Assistant
Instead of manually switching between LLMs through trial and error, Arfniia Router creates an adaptive system that learns instantly from user feedback. Coding and writing assistants often run as multi‑turn interactive sessions, so we use episodic RL to stitch learning across steps while still reacting to immediate signals such as:
- Accepting a suggestion
- Rejecting a suggestion or requesting changes
- Providing follow-up prompts or clarifications
This immediate feedback loop, combined with episodic credit assignment, allows the router to swiftly learn user preferences and fine‑tune model selection for specific tasks, whether it’s generating unit tests, developing new features, or crafting marketing content.
Implementation Guide
Below is a demo coding assistant that learns from the following user feedback signals within an interactive session (episode):
- Individual feedback for each response:
  - `accept`: reward of +1.0
  - `reject`: reward of -0.5
  - `refine`: reward of +0.2
- A moving average of the above rewards for accumulated feedback
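As a minimal sketch of this reward scheme (the `REWARDS` dict and `moving_average` helper are illustrative names, not part of the Arfniia API; the full assistant below wires these same values into the feedback endpoint):

```python
# Per-response rewards used by the demo assistant
REWARDS = {"accept": 1.0, "reject": -0.5, "refine": 0.2}

def moving_average(rewards: list[float]) -> float:
    # Accumulated feedback: mean of all per-response rewards so far
    return sum(rewards) / len(rewards) if rewards else 0.0
```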
The assistant marks session boundaries using HTTP headers so the router can learn across steps within the same session:
- `X-Arfniia-Episode-Id`: unique id per session
- `X-Arfniia-Episode-Start`: present on the first step
- `X-Arfniia-Episode-End`: present on the last step
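For illustration, here is how these headers might accompany a raw chat-completions request against the router; this sketch assumes the standard OpenAI-compatible `/v1/chat/completions` path behind the base_url used below, and the prompt text is made up:

```python
import uuid
import requests

episode_id = str(uuid.uuid4())  # one id reused for every step in the session

# Hypothetical first step of a session; later steps reuse X-Arfniia-Episode-Id,
# and the final step sends X-Arfniia-Episode-End instead of -Start.
requests.post(
    "http://ec2-ip-address:5525/v1/chat/completions",
    json={
        "model": "coding-router",
        "messages": [{"role": "user", "content": "Write a unit test for parse_config()"}],
    },
    headers={
        "X-Arfniia-Episode-Id": episode_id,
        "X-Arfniia-Episode-Start": "1",
    },
)
```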
```python
# NOTE: CodingAssistant is defined in another tab (see below)
assistant = CodingAssistant("coding-router")
session = assistant.start_session()

while True:
    user_input = get_user_input()
    if not user_input:
        break  # end session

    suggestion = assistant.generate(user_input, session=session)
    feedback = get_user_feedback()  # 'accept', 'reject', 'refine'

    if feedback == 'accept':
        assistant.process_feedback(suggestion.id, 1.0)
    elif feedback == 'reject':
        assistant.process_feedback(suggestion.id, -0.5)
    elif feedback == 'refine':
        assistant.process_feedback(suggestion.id, 0.2)

assistant.end_session(session=session)
```
```python
from openai import OpenAI
import requests
import uuid

base_url = "http://ec2-ip-address:5525/v1"

class CodingAssistant:
    def __init__(self, router_name):
        self.router_name = router_name
        self.client = OpenAI(api_key="anything", base_url=base_url)
        self.feedbacks_api = f"{base_url}/feedbacks/{self.router_name}"
        self.total_reward = 0        # accumulated reward
        self.num_interactions = 0    # count of feedbacks

    def start_session(self):
        return {"id": str(uuid.uuid4()), "started": False, "ended": False}

    def end_session(self, session):
        session["ended"] = True
        # Optionally send a no-op final call to flush terminal learning,
        # but usually the last generate() marks the end via header.
        return session

    def _episode_headers(self, session, is_last=False):
        headers = {"X-Arfniia-Episode-Id": session["id"]}
        if not session["started"]:
            headers["X-Arfniia-Episode-Start"] = "1"
            session["started"] = True
        if is_last or session.get("ended"):
            headers["X-Arfniia-Episode-End"] = "1"
        return headers

    def generate(self, prompt, session=None, is_last=False):
        # generate response using router
        headers = self._episode_headers(session, is_last) if session else {}
        resp = self.client.chat.completions.create(
            messages=[{"role": "user", "content": prompt}],
            model=self.router_name,
            extra_headers=headers,
        )
        return resp

    def process_feedback(self, suggestion_id, reward):
        self.total_reward += reward
        self.num_interactions += 1

        # calculate moving average reward
        moving_average_reward = self.total_reward / self.num_interactions

        requests.put(f"{self.feedbacks_api}/{suggestion_id}/{reward}")
        requests.put(f"{self.feedbacks_api}/sparse/{moving_average_reward}")
```
Tuning Exploration
Arfniia supports configurable exploration levels per router (set via `training.exploration_level`):
- `low` (default): favors the best‑known model for stability
- `medium`: balances trying alternatives with sticking to the current best
- `high`: explores aggressively (useful for demos, smoke tests, and rapid adaptation)

Use `low` in production by default, and temporarily increase it during trials or when onboarding new tasks to speed up adaptation.
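For reference, a minimal sketch of what this setting might look like in a router configuration; the surrounding structure is an illustrative assumption, and only the `training.exploration_level` key and its values come from the documentation above, so consult your deployment's configuration reference for the exact mechanism:

```python
# Hypothetical router configuration: everything except training.exploration_level
# and its allowed values ('low' | 'medium' | 'high') is an illustrative assumption.
router_config = {
    "name": "coding-router",
    "training": {
        "exploration_level": "high",  # raise temporarily while onboarding a new task
    },
}
```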
Key Takeaways
Combining multiple LLMs with Arfniia Router’s episodic RL lets coding and writing assistants outperform any single fixed model.
- Contextual Intelligence: Dynamically picks the optimal LLM per turn and across a session
- Cost Efficiency: Leverages smaller LLMs when advanced reasoning isn’t needed
- Seamless Experience: Delivers a cohesive multi‑turn experience with consistent improvement