AI Coding Assistants Integration
Introduction
The CERIT-SC AI infrastructure exposes Large Language Models (LLMs) through standard API protocols, enabling you to integrate powerful AI assistance directly into your local development environment. By connecting your tools to our backend, you can leverage high-performance models (such as qwen3-coder or gpt-oss-120b) for coding tasks without running them on your own hardware or relying on external commercial providers.
This guide explains how to configure several popular tools to communicate with our API.
Prerequisite: Before proceeding, ensure you have generated an API key from the AI Chat WebUI. You will need this key to authenticate your client.
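Since the API is OpenAI-compatible, you can confirm your key works before configuring any tool by listing the available models. A minimal sketch using only the Python standard library (the endpoint URL matches the configuration sections below; replace the key with your own):

```python
import urllib.request

BASE_URL = "https://llm.ai.e-infra.cz/v1"

def build_request(api_key: str) -> urllib.request.Request:
    # The endpoint speaks the standard OpenAI wire protocol, so listing the
    # available models is a plain GET with a Bearer token.
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

# To actually run the check:
#   with urllib.request.urlopen(build_request("sk-...")) as resp:
#       print(resp.read().decode())
```

A 401 response here means the key is wrong or expired; regenerate it in the AI Chat WebUI.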
Claude Code
Claude Code is also integrated into Jupyter Notebook.
Claude Code can be deployed and configured to work with our models by pointing it to our API endpoint.
Installation
Install Claude Code for your operating system by following the official instructions in the upstream repository:
Make sure the claude CLI is available in your $PATH after installation.
Linux Installation (including Windows WSL)
These instructions apply to both native Linux and Windows Subsystem for Linux (WSL).
Install Claude Code
Use the official installation script to install Claude Code:
curl -fsSL https://claude.ai/install.sh | bash

After installation completes successfully, you should see output similar to the following:
Setting up Claude Code...
✔ Claude Code successfully installed!
Version: 2.1.5
Location: ~/.local/bin/claude
Next: Run claude --help to get started
✅ Installation complete!

Start Claude and Exit During Onboarding
Run Claude for the first time:
claude

- Proceed through the syntax scheme selection.
- When you reach the "Select login method" screen, exit the application by pressing Ctrl+C three times.
This step generates the initial configuration file without completing onboarding.
Manually Complete Onboarding
Open the Claude configuration file:
vim ~/.claude.json

At the end of the file, add the following property:

"hasCompletedOnboarding": true

- Ensure the previous last property ends with a comma.
- The JSON must remain valid.
Example of a correctly updated ~/.claude.json file:
{
"installMethod": "native",
"autoUpdates": false,
"cachedGrowthBookFeatures": {
"tengu_1p_event_batch_config": {
"scheduledDelayMillis": 5000,
"maxExportBatchSize": 200,
"maxQueueSize": 8192
},
"tengu_mcp_tool_search": false,
"tengu_scratch": false,
"tengu_log_segment_events": false,
"tengu_log_datadog_events": true,
"tengu_event_sampling_config": {},
"tengu_tool_pear": false,
"tengu_thinkback": false,
"tengu_sumi": false
},
"userID": "xxx",
"firstStartTime": "2026-01-12T12:59:53.117Z",
"sonnet45MigrationComplete": true,
"opus45MigrationComplete": true,
"thinkingMigrationComplete": true,
"changelogLastFetched": 1768222793309,
"autoUpdatesProtectedForNative": true,
"hasCompletedOnboarding": true
}

Save the file and exit the editor.
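If you prefer not to edit the JSON by hand, the same flag can be set with a short script. A minimal sketch (it rewrites the file in place, so keep a backup if you have customized your configuration):

```python
import json
from pathlib import Path

def complete_onboarding(path: Path) -> None:
    # Load the existing config, set the flag, and write valid JSON back,
    # avoiding hand-edited commas entirely.
    cfg = json.loads(path.read_text())
    cfg["hasCompletedOnboarding"] = True
    path.write_text(json.dumps(cfg, indent=2))

# Usage: complete_onboarding(Path.home() / ".claude.json")
```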
Run Claude Normally
Start Claude again:
claude

Claude should now launch without triggering the onboarding flow and run smoothly.
Configuration
Claude Code is configured using environment variables. Export the following variables in your shell:
export ANTHROPIC_BASE_URL="https://llm.ai.e-infra.cz/"
export ANTHROPIC_AUTH_TOKEN="sk-..."
export ANTHROPIC_MODEL="glm-4.7"
export ANTHROPIC_DEFAULT_OPUS_MODEL="glm-4.7"
export ANTHROPIC_DEFAULT_SONNET_MODEL="glm-4.7"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="gpt-oss-120b"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

Alternatively, you can define these environment variables in the settings file ~/.claude/settings.json:
{
"permissions": {
"defaultMode": "acceptEdits"
},
"env": {
"ANTHROPIC_BASE_URL": "https://llm.ai.e-infra.cz/",
"ANTHROPIC_AUTH_TOKEN": "sk-...",
"ANTHROPIC_MODEL": "glm-4.7",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-4.7",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-4.7",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "gpt-oss-120b",
"CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
}
}

Variable description:

- ANTHROPIC_BASE_URL – Base URL of our LLM API.
- ANTHROPIC_AUTH_TOKEN – Your API key obtained from https://chat.ai.e-infra.cz.
- ANTHROPIC_MODEL – Default model used when running Claude Code.
- ANTHROPIC_DEFAULT_OPUS_MODEL – Model used for reasoning and complex tasks.
- ANTHROPIC_DEFAULT_SONNET_MODEL – Model used for reasoning and moderately complex tasks.
- ANTHROPIC_DEFAULT_HAIKU_MODEL – Model used for simple tasks.
- CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC – Disables telemetry and various reporting (not useful with non-Anthropic APIs).
Running Claude Code
Once the environment variables are set, start Claude Code with:
claude [project-dir]

You should now be able to interact with Claude Code using our backend and selected model.
You can choose any of our available models (e.g., qwen3-coder). However, not all models are guaranteed to work correctly with Claude Code.
If Claude Code stops responding or terminates unexpectedly, the most common cause is that the model’s context size has been exceeded. To resolve this, switch to a different model with a larger context window or reduce the amount of text being processed at once.
Codex
Codex can be deployed and configured to work with our models by pointing it to our API endpoint.
Installation
Install Codex for your operating system by following the official instructions in the upstream repository:
Ensure the codex CLI is available in your $PATH after installation.
For Linux or Windows, it is recommended to visit the Releases page and download the appropriate precompiled binary for your platform.
Configuration
Set these environment variables:
export OPENAI_BASE_URL=https://llm.ai.e-infra.cz
export OPENAI_API_KEY=sk-...

Replace sk-... with your API key obtained from https://chat.ai.e-infra.cz.
Run the application with the following command:
codex --model qwen3-coder --full-auto

When prompted to sign in:
- Select Provide your own API key.
- Confirm Use your own OpenAI API key for usage-based billing.
After this, the setup is complete and ready to use.
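Recent Codex releases also read a configuration file at ~/.codex/config.toml, which lets you skip the interactive sign-in on later runs. The following is a sketch assuming the model-provider override schema of current Codex versions; the provider id `cerit` and display name are our own choices, and key names may differ between releases, so check the upstream documentation for your version:

```toml
# ~/.codex/config.toml — sketch; verify key names against your Codex version
model = "qwen3-coder"
model_provider = "cerit"

[model_providers.cerit]
name = "CERIT-SC LLM API"                  # display name, our own choice
base_url = "https://llm.ai.e-infra.cz/v1"
env_key = "OPENAI_API_KEY"                 # env variable holding your sk-... key
```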
Open Code
Open Code can be deployed and configured to work with our models by pointing it to our API endpoint.
Installation
Install Open Code for your operating system by following the official instructions in the upstream repository:
Ensure the opencode CLI is available in your $PATH after installation.
Configuration
- Save the configuration below to the file ~/.config/opencode/opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"litellm": {
"npm": "@ai-sdk/openai-compatible",
"name": "LiteLLM",
"options": {
"baseURL": "https://llm.ai.e-infra.cz/v1"
},
"models": {
"deepseek-r1": {
"name": "deepseek-r1"
},
"gpt-oss-120b": {
"name": "gpt-oss-120b"
},
"qwen3-coder": {
"name": "qwen3-coder"
},
"glm-4.7": {
"name": "glm-4.7"
}
}
}
}
}

- Start the application by running opencode.
- Inside opencode, type /connect.
- From the list of available providers, scroll to the end and select LiteLLM.
- When prompted, paste your API key obtained from https://chat.ai.e-infra.cz.
- Select the model glm-4.7.

Once completed, the setup is ready to use.
- The model gpt-oss-120b has partial support and may not work as expected.
- The model qwen3-coder should work as well as glm-4.7.
Visual Studio Code
Integrate AI chatbots with Visual Studio Code using third-party extensions such as Continue or Roo Code. These extensions enable AI assistance in various roles while coding, including simple chat, agent mode, and autocomplete. Chat provides a familiar conversational interface, agent mode analyzes or edits files in your project, and autocomplete suggests code as you write.
Install the Continue extension
With Visual Studio Code running:
- Open the Extensions tab (Ctrl+Shift+X) and search for Continue
- Click the extension and then click Install
- After installation, access the Continue extension by clicking the Continue icon in the left sidebar
Configure the Continue extension
Configure the Continue extension by editing the config.yaml file:
- Access the Continue extension within Visual Studio Code using the icon in the left sidebar
- Click Open settings in the top-right corner of the Continue extension window
- Click Configs
- Click the Open configuration icon at the end of the line labeled Local Config
- Once the config.yaml is opened, use the following configuration with your own <api-key> (see the guide above):
%YAML 1.1
---
name: Local Assistant
version: 1.0.0
schema: v1
model_defaults: &model_defaults
provider: openai
apiKey: <api-key>
apiBase: https://llm.ai.e-infra.cz/v1
models:
- name: autocomplete-coder
<<: *model_defaults
model: qwen3-coder
promptTemplates:
autocomplete: '<|fim_prefix|>{{{ prefix }}}<|fim_suffix|>{{{ suffix }}}<|fim_middle|>'
autocompleteOptions:
transform: false
defaultCompletionOptions:
temperature: 0.6
maxTokens: 512
roles:
- autocomplete
- name: chat-coder
<<: *model_defaults
model: qwen3-coder
env:
useLegacyCompletionsEndpoint: false
roles:
- chat
- edit
context:
- provider: code
- provider: docs
- provider: diff
- provider: terminal
- provider: problems
- provider: folder
- provider: codebase

- Save the file. The new configuration should apply immediately.
- Verify that FIM autocomplete is being used by checking the Continue button in the Visual Studio Code status bar (bottom-right corner). It should display Continue, not Continue (NE). If Continue (NE) is shown, click this button and select Use FIM autocomplete over Next Edit.
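The autocomplete prompt template in the configuration above is a fill-in-the-middle (FIM) prompt: the editor substitutes the code before and after your cursor, and the model generates what belongs in between. A minimal sketch of how such a template expands (the prefix and suffix strings here are made up, and Python's str.format stands in for Continue's own templating):

```python
FIM_TEMPLATE = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    # prefix = code before the cursor, suffix = code after it; the model
    # is asked to complete the gap marked by <|fim_middle|>.
    return FIM_TEMPLATE.format(prefix=prefix, suffix=suffix)

prompt = build_fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
```

Because the suffix is included, the model can produce completions that fit the code that already follows the cursor, not just continue the text.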
Usage of AI in Visual Studio Code
- Chat: Access chat by clicking the Continue icon in the left sidebar
- Agent mode: In chat, engage agent mode by asking to analyze, explain, or edit files in your current project. This requires additional permissions such as read/write access to related files; you must grant these permissions for the agent to perform requested actions
- Autocomplete: The autocomplete feature continuously suggests new code as you write. Press Tab to accept suggestions. Responsiveness depends on model speed. You can change the model from qwen3-coder to gpt-oss-120b in the autocomplete configuration section for faster responses, though the default model generally performs better on coding tasks
Roo Code
Roo Code is a Visual Studio Code extension offering AI assistance with enhanced agent and code understanding capabilities.
Install the Roo Code extension
With Visual Studio Code running:
- Open the Extensions tab (Ctrl+Shift+X)
- Search for Roo Code
- Select the extension and click Install
- After installation, access Roo Code via its icon in the left sidebar
Configure the Roo Code extension
To use self-hosted models:
- Open Roo Code using the sidebar icon
- Click the ⚙️ Settings icon (top-right corner)
- In the Providers section:
  - API Provider: OpenAI Compatible
  - Base URL: https://llm.ai.e-infra.cz/v1
  - API Key: Your key from AI Chat WebUI
  - Model: qwen3-coder (or alternative)
- Set Context Window Size using this table
- Save settings
Using Roo Code
- Select agents via the button in the extension’s bottom-left corner
- After submitting a prompt, agents work autonomously until intervention is needed
- For advanced usage, consult the Roo Code documentation
Codebase Indexing
Index the codebase using one of our embedding models to help agents better search and understand your codebase.
To store the index, set up a local Qdrant database. Here is an example docker-compose.yaml:
services:
qdrant:
image: qdrant/qdrant
ports:
- "6333:6333"
volumes:
- qdrant_storage:/qdrant/storage
volumes:
qdrant_storage:

Configure codebase indexing by following these steps:
- Launch a new database instance by running docker compose up -d in the directory containing the docker-compose.yaml file.
- In the Roo Code extension interface, click on the database icon in the bottom right corner.
- Open the Setup section.
- Set the following values:
  - Embedder Provider: OpenAI Compatible
  - Base URL: https://llm.ai.e-infra.cz/v1
  - API Key: Your API key from AI Chat WebUI
  - Model: One of our embedding models, e.g., qwen3-embedding-4b
  - Model Dimension: The embedding vector size of the selected model, e.g., 2560 for qwen3-embedding-4b
  - Qdrant URL: http://localhost:6333
  - Qdrant API Key: Leave empty if you used the provided Docker Compose configuration.
- Create the index by clicking on Start Indexing.
- After the indexing is finished, the agents will have access to the database and will be able to better search and understand the codebase.
Caveats
Disable all other Visual Studio Code extensions that provide AI autocomplete features. Otherwise, the Continue extension may not work properly.