Disable Learning
Arfniia Router's learning typically converges after a few hundred to a few thousand requests. Once it has stabilized, you can switch to serving-only mode by sending the X-Arfniia-Disable-Learning header with your requests. Disabling learning also speeds up inference.
from openai import OpenAI

base_url = "http://${EC2_IP_ADDR}:5525/v1"
client = OpenAI(api_key="any-text-would-work", base_url=base_url)
resp = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "How many Rs in Strawberry?",
        }
    ],
    model="advanced-reasoning",
    extra_headers={"X-Arfniia-Disable-Learning": "true"},
)
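If you would rather switch to serving-only mode automatically than edit the call by hand, one option is to count requests on the client side and add the header once a warm-up budget is used up. The sketch below is only an illustration: the helper name `arfniia_headers` and the `WARMUP_REQUESTS` cutoff are hypothetical and not part of Arfniia Router; the only thing taken from the example above is the `X-Arfniia-Disable-Learning` header.

```python
from typing import Dict

# Hypothetical cutoff (an assumption, not a documented value): choose it
# based on when routing quality stops improving for your workload.
WARMUP_REQUESTS = 2000


def arfniia_headers(requests_sent: int, warmup: int = WARMUP_REQUESTS) -> Dict[str, str]:
    """Return the extra headers to send with the next request.

    While fewer than `warmup` requests have been sent, no header is added,
    so the router keeps learning. After that, the disable-learning header
    is included on every request.
    """
    if requests_sent < warmup:
        return {}
    return {"X-Arfniia-Disable-Learning": "true"}
```

The result can be passed straight to the call shown above, for example `extra_headers=arfniia_headers(requests_sent)`, where `requests_sent` is a counter your application maintains.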