AI as a Service

Introduction

About AI as a Service

CERIT-SC, a core component of e-INFRA CZ, is developing an on-premise AI platform that provides researchers with secure, high-performance, and interoperable AI tools. The platform runs on NVIDIA DGX H100/B200/B300-class systems, delivering the compute required for both efficient large-scale model training and high-speed inference. It hosts a curated selection of open large language and generative models, accessible through the Open WebUI interface or standard OpenAI-compatible APIs.

Difference between AI Training and Inference

[Figure: the difference between AI training and inference]

Training is learning, and Inference is applying that knowledge.

| Concept | What It Does | Data Flow | Our Context |
| --- | --- | --- | --- |
| AI Training | The "education" phase: the model is fed massive datasets, learns patterns, adjusts its weights, and minimizes error through backpropagation. | Forward and backward pass (backpropagation). | The work of others who created the open-source trained models we host. |
| AI Inference | The "application" phase: the trained model applies its fixed knowledge to new, unseen data to make real-time predictions or decisions. | Forward pass only. | Our job: providing the models and the applications built on them. |
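
In code, the contrast is simply whether a backward pass and weight update happen. The following PyTorch sketch is purely illustrative (toy model, random data) and only demonstrates the two phases described above.

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 2)                              # toy model, illustrative only
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# --- Training: forward pass + backward pass (backpropagation) updates the weights ---
x_train = torch.randn(8, 16)
y_train = torch.randint(0, 2, (8,))
logits = model(x_train)                               # forward pass
loss = loss_fn(logits, y_train)
loss.backward()                                       # backward pass: compute gradients
optimizer.step()                                      # adjust weights to minimize error
optimizer.zero_grad()

# --- Inference: forward pass only, the trained weights stay fixed ---
model.eval()
with torch.no_grad():
    x_new = torch.randn(1, 16)                        # new, unseen data
    prediction = model(x_new).argmax(dim=1)           # real-time prediction
```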

Key Features (Inference)

  • Secure, on‑premise LLM & generative‑AI platform – Our models run securely on the e-INFRA CZ infrastructure. Queries and responses are not logged by external providers, ensuring that your research and sensitive data remain within our environment. With the exception of internet searching, nothing leaves our local infrastructure.
  • Supports privacy‑sensitive research – the platform complies with institutional and legal requirements, making it suitable for handling sensitive data.

What the Platform Provides

| Category | Highlights |
| --- | --- |
| Compute | NVIDIA DGX H100 (Hopper) and DGX B200 (Blackwell) systems with petaflop-class GPU performance. |
| Key models | Full-sized DeepSeek R1 0528 (685B) – high-performance reasoning model, excellent for complex tasks and the best of our hosted models for Czech language processing; runs without internet access for high security. GPT-OSS-120B – general reasoning model, comparable to OpenAI's "mini"-class models. Qwen3 Coder 480B Instruct – instruction model specialized for programming and code generation. Additional community models can be added on request. |
| Access | MUNI students and employees, and MetaCentrum users. |
| External usage | OpenAI-compatible REST API – use existing `openai` client code, LangChain, or other integrations without modification (a minimal client sketch follows below). |
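
Because the endpoint is OpenAI-compatible, the official `openai` Python client usually needs nothing more than a different base URL and token. The sketch below is a minimal example; the base URL, model identifier, and token are placeholders – take the real values from the API documentation linked in the services table below.

```python
from openai import OpenAI

# Placeholder endpoint and credentials -- substitute the values from the API docs.
client = OpenAI(
    base_url="https://chat.example.e-infra.cz/api/v1",  # hypothetical endpoint URL
    api_key="YOUR_API_TOKEN",                            # personal access token
)

response = client.chat.completions.create(
    model="deepseek-r1",                                 # hypothetical model id
    messages=[
        {"role": "user", "content": "Summarize the difference between AI training and inference."},
    ],
)
print(response.choices[0].message.content)
```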

Key AI Services (Inference)

| Service | Link | Description |
| --- | --- | --- |
| Chat (Open WebUI) | WebUI chat | A full-featured conversational interface (similar to ChatGPT) with advanced features and explicit language-model selection. Text work: translations, summaries, analysis, and program-code generation. Multimodality: image generation (including editing) and content recognition in images (e.g. extracting a serial number from a photo). Tools: internet, GitHub, and arXiv search, plus an in-browser Python sandbox for running code and data analytics. RAG (knowledge): question answering over documents attached to the chat works very well. |
| Call the API | Using AI models – API docs | Use the OpenAI-compatible endpoint to integrate AI into your scripts, pipelines, or services (see the LangChain sketch after this table). |
| DeepSite | DeepSite (vibe-coding) | A generative tool that creates webpages and applications (HTML/CSS/JS) from a simple text description. Excellent for design proposals, mockups, or quick web concepts. |
| AI in Jupyter Notebooks | JupyterHub integration | An AI assistant integrated directly into the JupyterLab environment (Notebook Intelligence). Helps with fixing code, generating new snippets, and conversational assistance within your coding projects. |
| Documentation ChatBot | docs.e-infra.cz and other documentation sites | A Retrieval-Augmented Generation (RAG) system deployed across e-INFRA CZ documentation. Answers specific questions and acts as a problem solver based on our knowledge base (e.g. "How to run MATLAB?"). |
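
Higher-level frameworks work the same way: with LangChain, pointing `ChatOpenAI` at the OpenAI-compatible endpoint is typically the only change needed. The base URL and model id in this sketch are again illustrative placeholders, not the actual service values.

```python
from langchain_openai import ChatOpenAI

# Placeholder endpoint, token, and model id -- replace with values from the API docs.
llm = ChatOpenAI(
    base_url="https://chat.example.e-infra.cz/api/v1",  # hypothetical endpoint URL
    api_key="YOUR_API_TOKEN",
    model="gpt-oss-120b",                               # hypothetical model id
)

print(llm.invoke("Write a haiku about on-premise GPUs.").content)
```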

Reference

Read more details on our e-INFRA Blog at https://blog.e-infra.cz/

Figure adapted from https://blogs.nvidia.com/blog/difference-deep-learning-training-inference-ai/
