AI Coding Assistants Integration
Introduction
The CERIT-SC AI infrastructure exposes Large Language Models (LLMs) through standard API protocols, allowing you to integrate powerful AI assistance directly into your local development environment. By connecting your tools to our backend, you can leverage high-performance models (such as qwen3-coder or gpt-oss-120b) for coding tasks without needing to run them on your own hardware or rely on external commercial providers.
This chapter describes how to configure two popular tools to communicate with our API.
Prerequisite: Before proceeding, ensure you have generated an API key from the AI Chat WebUI. You will need this key to authenticate your client.
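Once you have a key, you can verify it from the command line before configuring any editor. The sketch below assumes the API follows the OpenAI convention and exposes a `/v1/models` listing; the endpoint path is taken from the Continue configuration later in this chapter:

```bash
# Smoke test (assumption: OpenAI-compatible API at https://llm.ai.e-infra.cz/v1).
# Lists the models your key can access.
curl -s https://llm.ai.e-infra.cz/v1/models \
  -H "Authorization: Bearer <api-key>"
```

A successful response is a JSON list of model IDs (e.g., `qwen3-coder`); an authentication error usually means the key was not copied correctly.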
Integrate with Visual Studio Code
The AI chatbot can be integrated with Visual Studio Code through the third-party extension Continue. With this extension, you can engage the AI in several roles that help you while coding: simple chat, agent mode, and autocomplete. While the chat provides a familiar conversation, the agent mode can analyze or edit files in your open project, and the autocomplete role suggests code as you write.
Install the Continue extension
While Visual Studio Code is running, open the Extensions tab (Ctrl+Shift+X) and search for Continue. Click on the extension and then click on the Install button. After the installation is complete, you can access the Continue extension by clicking on the Continue icon in the left sidebar.
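If you prefer the command line, the extension can also be installed with the `code` CLI. This is a sketch assuming the extension's marketplace identifier is `Continue.continue`; verify the ID in the Extensions tab if the command fails:

```bash
# Install the Continue extension from the terminal
# (assumed marketplace ID: Continue.continue).
code --install-extension Continue.continue
```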
Configure the Continue extension
First, the configuration file config.yaml must be edited. Access the Continue extension within Visual Studio Code using the icon in the left sidebar.
- Click `Open settings` in the top-right corner of the `Continue` extension window.
- Click `Configs`.
- Click the `Open configuration` icon at the end of the line `Local Config`.
- Once `config.yaml` is opened, use the following configuration with your own `<api-key>` (guide above):
```yaml
%YAML 1.1
---
name: Local Assistant
version: 1.0.0
schema: v1
model_defaults: &model_defaults
  provider: openai
  apiKey: <api-key>
  apiBase: https://llm.ai.e-infra.cz/v1
models:
  - name: autocomplete-coder
    <<: *model_defaults
    model: qwen3-coder
    promptTemplates:
      autocomplete: '<|fim_prefix|>{{{ prefix }}}<|fim_suffix|>{{{ suffix }}}<|fim_middle|>'
    autocompleteOptions:
      transform: false
    defaultCompletionOptions:
      temperature: 0.6
      maxTokens: 512
    roles:
      - autocomplete
  - name: chat-coder
    <<: *model_defaults
    model: qwen3-coder
    env:
      useLegacyCompletionsEndpoint: false
    roles:
      - chat
      - edit
context:
  - provider: code
  - provider: docs
  - provider: diff
  - provider: terminal
  - provider: problems
  - provider: folder
  - provider: codebase
```

- Save the file. The new configuration should apply immediately.
- Check that the FIM autocomplete is used by looking at the `Continue` button in the Visual Studio Code status bar (in the bottom-right corner). It should display `Continue`, not `Continue (NE)`. If `Continue (NE)` is shown, press this button and select `Use FIM autocomplete over Next Edit`.
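If the extension does not respond, you can rule out backend problems by sending the same kind of request Continue sends for chat. A minimal sketch, assuming the OpenAI-compatible chat completions endpoint (as the `provider: openai` setting suggests):

```bash
# Connectivity check against the chat endpoint configured above.
curl -s https://llm.ai.e-infra.cz/v1/chat/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder",
    "messages": [{"role": "user", "content": "Say hello in one word."}]
  }'
```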
Usage of AI in Visual Studio Code
- The chat can be accessed by pressing the `Continue` icon in the left sidebar.
- In the chat, you can engage the agent mode by asking it to analyze, explain, or edit files in the currently open project. This action requires additional permissions, such as read/write access to the related files, and you need to grant them for the agent to do the job.
- The autocomplete feature continuously suggests new code as you write. Once you see a suggestion, press `Tab` to accept it. The responsiveness of the suggestions depends on the model's speed. You can change the model `qwen3-coder` in the autocomplete config section to `gpt-oss-120b` for somewhat faster responses, but the default model performs better in coding tasks. A raw autocomplete request is sketched below for reference.
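The `promptTemplates` entry above wraps the code around your cursor in fill-in-the-middle (FIM) tokens before sending it to the model. The sketch below reproduces such a request by hand; it assumes the backend also serves the OpenAI-style `/v1/completions` endpoint, which may differ from what Continue actually uses internally:

```bash
# Hand-built FIM request mirroring the autocomplete template:
# {{{ prefix }}} and {{{ suffix }}} are replaced by the code
# before and after the cursor.
curl -s https://llm.ai.e-infra.cz/v1/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder",
    "prompt": "<|fim_prefix|>def add(a, b):\n    <|fim_suffix|>\n<|fim_middle|>",
    "max_tokens": 64,
    "temperature": 0.6
  }'
```

The model should answer with the middle part only, e.g., `return a + b`.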
Caveats
Disable all other Visual Studio Code extensions that provide AI autocomplete features. Otherwise, the Continue extension may not work properly.
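To find potentially conflicting extensions, you can list what is installed and launch Visual Studio Code with a specific extension disabled for one session. `GitHub.copilot` is used here only as an example identifier; replace it with whatever your listing shows:

```bash
# List installed extensions and look for other AI assistants.
code --list-extensions | grep -i -e copilot -e ai

# Launch with a conflicting extension disabled for this session only
# (example ID; substitute the one found above).
code --disable-extension GitHub.copilot
```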
Claude Code
You can deploy Claude Code and configure it to work with our models by pointing it to our API endpoint.
Installation
First, install Claude Code for your operating system by following the official instructions in the upstream repository.
Make sure the `claude` CLI is available in your `$PATH` after installation.
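A quick way to check this (the installation output shown below reports `~/.local/bin/claude`, so that directory is assumed here):

```bash
# Verify the CLI is reachable; if not, add the install directory to PATH.
command -v claude || export PATH="$HOME/.local/bin:$PATH"
claude --version
```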
Linux Installation (including Windows WSL)
These instructions apply to both native Linux and Windows Subsystem for Linux (WSL).
Install Claude Code
Install Claude Code using the official installation script:
```bash
curl -fsSL https://claude.ai/install.sh | bash
```

After installation completes successfully, you should see output similar to the following:
```
Setting up Claude Code...
✔ Claude Code successfully installed!
  Version: 2.1.5
  Location: ~/.local/bin/claude

Next: Run claude --help to get started

✅ Installation complete!
```

Start Claude and Exit During Onboarding
Run Claude for the first time:
```bash
claude
```

- Proceed through the syntax scheme selection.
- When you reach the “Select login method” screen, exit the application by pressing Ctrl+C three times.
This step generates the initial configuration file without completing onboarding.
Manually Complete Onboarding
Open the Claude configuration file:
```bash
vim ~/.claude.json
```

At the end of the file, add the following property:

```json
"hasCompletedOnboarding": true
```

- Ensure the previous last property ends with a comma.
- The JSON must remain valid.
Example of a correctly updated `~/.claude.json` file:

```json
{
  "installMethod": "native",
  "autoUpdates": false,
  "cachedGrowthBookFeatures": {
    "tengu_1p_event_batch_config": {
      "scheduledDelayMillis": 5000,
      "maxExportBatchSize": 200,
      "maxQueueSize": 8192
    },
    "tengu_mcp_tool_search": false,
    "tengu_scratch": false,
    "tengu_log_segment_events": false,
    "tengu_log_datadog_events": true,
    "tengu_event_sampling_config": {},
    "tengu_tool_pear": false,
    "tengu_thinkback": false,
    "tengu_sumi": false
  },
  "userID": "xxx",
  "firstStartTime": "2026-01-12T12:59:53.117Z",
  "sonnet45MigrationComplete": true,
  "opus45MigrationComplete": true,
  "thinkingMigrationComplete": true,
  "changelogLastFetched": 1768222793309,
  "autoUpdatesProtectedForNative": true,
  "hasCompletedOnboarding": true
}
```

Save the file and exit the editor.
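Before relaunching, it is worth confirming the edited file is still valid JSON, for example with Python's built-in validator:

```bash
# Fails with a parse error message if the edit broke the JSON.
python3 -m json.tool ~/.claude.json > /dev/null && echo "JSON is valid"
```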
Run Claude Normally
Start Claude again:
```bash
claude
```

Claude should now launch without triggering the onboarding flow and run smoothly.
Configuration
Claude Code is configured using environment variables. Export the following variables in your shell:
```bash
export ANTHROPIC_BASE_URL="https://llm.ai.e-infra.cz/"
export ANTHROPIC_AUTH_TOKEN="sk-..."
export ANTHROPIC_MODEL="qwen3-coder"
export ANTHROPIC_DEFAULT_OPUS_MODEL="qwen3-coder"
export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3-coder"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="gpt-oss-120b"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```

Alternatively, you can define these environment variables in the settings file `~/.claude/settings.json`:
```json
{
  "permissions": {
    "defaultMode": "acceptEdits"
  },
  "env": {
    "ANTHROPIC_BASE_URL": "https://llm.ai.e-infra.cz/",
    "ANTHROPIC_AUTH_TOKEN": "sk-...",
    "ANTHROPIC_MODEL": "qwen3-coder",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "qwen3-coder",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "qwen3-coder",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "gpt-oss-120b",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
  }
}
```

Variable description:
- `ANTHROPIC_BASE_URL` – Base URL of our LLM API.
- `ANTHROPIC_AUTH_TOKEN` – Your API token obtained from https://chat.ai.e-infra.cz.
- `ANTHROPIC_MODEL` – Default model to use when running Claude Code.
- `ANTHROPIC_DEFAULT_OPUS_MODEL` – Model to use for reasoning and complex tasks.
- `ANTHROPIC_DEFAULT_SONNET_MODEL` – Model to use for reasoning and less complex tasks.
- `ANTHROPIC_DEFAULT_HAIKU_MODEL` – Model to use for simple tasks.
- `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC` – Disables telemetry and various reporting traffic (not used with non-Anthropic APIs).
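Exported variables only live for the current shell session. To make the configuration permanent without using `~/.claude/settings.json`, you can append the exports to your shell startup file (a sketch assuming bash; adjust for your shell):

```bash
# Persist the Claude Code configuration across sessions (bash assumed).
cat >> ~/.bashrc <<'EOF'
export ANTHROPIC_BASE_URL="https://llm.ai.e-infra.cz/"
export ANTHROPIC_AUTH_TOKEN="sk-..."
export ANTHROPIC_MODEL="qwen3-coder"
EOF
source ~/.bashrc
```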
Running Claude Code
Once the environment variables are set, start Claude Code with:
```bash
claude [project-dir]
```

You should now be able to interact with Claude Code using our backend and the selected model.
You can choose any of our available models, e.g., `devstral-2` (a one-off override is sketched below). However, not all models are guaranteed to work correctly with Claude Code; in particular, `DeepSeek-R1` currently does not work and returns the error `Internal server error: can only concatenate str (not "dict") to str`.
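Because the model is picked up from the environment, you can try an alternative model for a single session without touching your configuration (`devstral-2` is the example name from above; any available model ID works the same way):

```bash
# One-off override: run a single session with a different model.
ANTHROPIC_MODEL="devstral-2" claude
```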
If Claude Code stops responding or terminates unexpectedly, the most common cause is that the model’s context size has been exceeded. To resolve this, switch to a different model with a larger context window or reduce the amount of text being processed at once.