
Architecture Design

1. Overview

CSGHub is an open-source, trusted asset management platform for Large Language Models (LLMs). Its architecture is designed with private deployment as the core objective and aims to give users a complete suite of asset management capabilities comparable to those of Hugging Face. It provides full-lifecycle governance for LLM-native assets, including models, datasets, and code. The system adopts a microservice architecture that scales from lightweight single-machine Docker deployments to large-scale Kubernetes clusters, so the same platform can serve deployments of different sizes.

2. Logical Architecture and Components

CSGHub uses a standardized microservice architecture in which each core component has a clear responsibility. How the components are packaged depends on the deployment mode:

  • Docker Deployment: All components run in different processes within the same container, simplifying deployment and management.
  • Kubernetes Deployment: Components run as independent Pods, achieving component isolation, elastic scaling, and high-availability deployment.

2.1 Core Business & Access Layer

  • Portal & Server: The primary entry point for the platform. It provides the Web UI and core business logic APIs, managing metadata for assets like models and datasets.
  • User & Casdoor: Builds a complete identity management system handling registration, login, permission allocation, and multi-tenant OAuth authentication to ensure secure access.
  • Nginx & RProxy: Manages all traffic ingress and dynamic routing. RProxy specifically handles dynamic load requests for "Space" applications, ensuring precise forwarding and load balancing.
  • Notifier: A unified notification service integrating email, webhooks, and system messages to push critical events (task completion, asset updates, alerts) to users.
  • DataViewer: An online dataset preview tool supporting content parsing and visual display for various formats, helping users quickly understand dataset details.

2.2 AI Computing & Orchestration Layer

This layer is responsible for resource allocation, AI task execution, and backend support for code assistants.

  • AI Gateway: The unified entry point for AI services, integrating inference request routing, rate limiting, billing statistics, and security controls.
  • CSGShip: The backend service for code assistants, providing support for the CodeSouler IDE plugin.
  • Runner (Critical Component): A distributed task executor (successor to Space Builder) responsible for compute-intensive tasks like Space app building, model fine-tuning, and general task execution.
  • Dataflow: A data pipeline service focused on cleaning, transforming, and formatting large-scale datasets to support model training and inference.
  • Temporal & Worker: The "brain" of asynchronous task management. It maintains the state machines for long-running tasks such as resource synchronization and image building, so that tasks remain stable and can recover from errors.
  • Accounting: A resource billing system that tracks computing usage, storage occupancy, and API calls for resource control and cost accounting.
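
To illustrate what maintaining a state machine for a long-running task involves, the following is a minimal Python sketch of a task that moves between pending, running, completed, and failed states with a bounded number of retries. It does not use the Temporal SDK and is not CSGHub's implementation; the state names, `run_with_retries`, `max_attempts`, and `flaky_image_build` are hypothetical and only show the idea of recorded, recoverable task state.

```python
from enum import Enum
from typing import Callable, List, Tuple


class TaskState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


def run_with_retries(
    step: Callable[[], None], max_attempts: int = 3
) -> Tuple[TaskState, List[TaskState]]:
    """Run `step`, retrying on exceptions, and record every state transition."""
    history = [TaskState.PENDING]
    for _ in range(max_attempts):
        history.append(TaskState.RUNNING)
        try:
            step()
        except Exception:
            # A failed attempt returns the task to PENDING so it can be retried.
            history.append(TaskState.PENDING)
            continue
        history.append(TaskState.COMPLETED)
        return TaskState.COMPLETED, history
    # Every attempt failed: the final PENDING entry becomes FAILED.
    history[-1] = TaskState.FAILED
    return TaskState.FAILED, history


# Hypothetical step that fails twice with a transient error, then succeeds.
attempts = {"count": 0}


def flaky_image_build() -> None:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")


final_state, history = run_with_retries(flaky_image_build)
print(final_state.value, [state.value for state in history])
```

A workflow engine additionally persists this history so that a task can resume after a process restart; the sketch keeps it in memory only.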

2.3 Asset Storage & Acceleration Layer

This layer handles persistence, versioning, and high-speed transfer of large files.

  • xNet (Core Acceleration): An acceleration engine designed for large files (LFS objects, model weights). It optimizes transmission paths and caching to speed up otherwise slow asset transfers.
  • Gitaly & Gitlab-shell: A high-performance Git storage backend providing version control and SSH access for models, code, and datasets.
  • Mirroring Service: Consists of mirror_repo and mirror_lfs modules to synchronize assets between domestic and international repositories.
  • Object Storage (MinIO) & Registry: The physical storage foundation. MinIO stores model files and datasets, while the Registry manages container images for Spaces.
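
As background on how large files relate to Git storage: with Git LFS, the repository itself holds only a small pointer file, while the file content is stored separately and identified by its SHA-256 digest. The Python sketch below builds such a pointer following the public Git LFS pointer format. That CSGHub keeps the pointer in Gitaly and the content in MinIO is an assumption inferred from the component descriptions above, and `lfs_pointer` is a hypothetical helper name.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Return the Git LFS pointer text for the given file content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


# Stand-in bytes; a real model weight file would be read from disk in chunks.
pointer = lfs_pointer(b"fake model weights")
print(pointer)
```

Because the pointer is only a few lines long regardless of the file size, Git history stays small while the large object can be transferred and cached independently.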

2.4 Infrastructure Layer

  • Databases: Includes PostgreSQL (storing metadata like user info and task configs) and Redis (handling caching and session management).
  • NATS: A high-performance event bus facilitating asynchronous communication and decoupling between microservices.
  • Observability: Integrates Prometheus (metrics) and Loki (centralized logging) for real-time monitoring and troubleshooting.
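
The decoupling that an event bus provides can be illustrated without a running NATS server. The following in-process Python sketch mimics subject-based publish/subscribe. It is not the NATS client API, and the `EventBus` class, the subject name `repo.updated`, and the event fields are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class EventBus:
    """A toy in-process publish/subscribe bus keyed by subject name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, subject: str, handler: Handler) -> None:
        self._subscribers[subject].append(handler)

    def publish(self, subject: str, event: dict) -> int:
        """Deliver `event` to every handler on `subject`; return the delivery count."""
        handlers = self._subscribers.get(subject, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
notifications: List[str] = []
usage_log: List[str] = []

# Two independent consumers react to the same event; the publisher knows neither.
bus.subscribe("repo.updated", lambda e: notifications.append(f"notify: {e['repo']}"))
bus.subscribe("repo.updated", lambda e: usage_log.append(e["repo"]))

delivered = bus.publish("repo.updated", {"repo": "demo/model"})
print(delivered, notifications, usage_log)
```

Unlike this synchronous sketch, a real message bus delivers events asynchronously across processes, so a slow or unavailable consumer does not block the publisher.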

3. Deployment Methods

CSGHub offers multiple deployment options to meet different business needs and environmental constraints:

  • Docker Compose: An "all-in-one" single-image solution. Best for rapid local onboarding, product demos, and developer debugging. It has a minimal delivery process but limited scalability.
  • Kubernetes (Helm): A standardized distributed deployment for production environments. It supports Pod-level elastic scaling, high availability, and enterprise-grade stability.
  • Air-gap Deployment (Coming Soon): Designed for high-security environments without internet access (e.g., finance or government). It uses pre-downloaded image tarballs and internal private registries.
  • Quick Install: An automated script based on K3s. It sets up a lightweight Kubernetes environment and initializes CSGHub in one click, ideal for single-machine environments requiring K8s orchestration.

4. Network Access and Port Specifications

4.1 Docker Compose (Multiple Exposed Ports)

Since all services run in a single container/namespace, multiple ports are mapped to the host:

  • Main Entry: Port 80 (Nginx) for Web and API access.
  • Git SSH: Port 2222 (Git over SSH), chosen to avoid a conflict with the host's own SSH service on Port 22.
  • Identity: Port 8000 (Casdoor) for authentication and SSO.
  • Code Assistant: Ports 8001 (Frontend) and 8002 (API) for CSGShip.
  • Object Storage: Port 9000 (API) and 9001 (Console) for MinIO management.
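
After starting a Docker Compose deployment, it can be useful to confirm that each mapped port is reachable. The Python sketch below records the ports listed above and probes them with a plain TCP connection. The dictionary keys and the `port_is_open` and `check_ports` helpers are hypothetical names, and the port numbers assume the defaults from this section have not been remapped.

```python
import socket
from typing import Dict

# Host ports as listed in this section for the Docker Compose deployment.
DOCKER_COMPOSE_PORTS: Dict[str, int] = {
    "nginx (web/api)": 80,
    "git ssh": 2222,
    "casdoor": 8000,
    "csgship frontend": 8001,
    "csgship api": 8002,
    "minio api": 9000,
    "minio console": 9001,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host: str) -> Dict[str, bool]:
    """Probe every listed port on `host` and report which ones accept connections."""
    return {
        name: port_is_open(host, port)
        for name, port in DOCKER_COMPOSE_PORTS.items()
    }
```

Calling `check_ports("localhost")` on the Docker host returns a mapping from service name to a reachability flag. A successful TCP connection only shows that something is listening on the port, not that the service behind it is healthy.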

4.2 Kubernetes / Air-gap / Quick Install (80/443 Convergence)

In these modes, network access is unified by an Ingress or Envoy-Gateway:

  • Unified Entry: All functions (Web, API, Auth, Inference) are accessed via standard 80 (HTTP) or 443 (HTTPS) ports.
  • Standard Git Access: SSH operations typically use the default Port 22, exposed through a LoadBalancer, so users do not need to specify a custom port in Git URLs.
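
The practical effect of the two port layouts shows up in Git clone URLs. The Python sketch below builds an `ssh://` URL and omits the port when it is the SSH default. The helper name `git_ssh_url`, the `git` user, the host `csghub.example.com`, and the repository path `demo/model` are all hypothetical placeholders, not values confirmed by CSGHub.

```python
def git_ssh_url(host: str, repo_path: str, port: int = 22, user: str = "git") -> str:
    """Build an ssh:// clone URL, omitting the port when it is the SSH default."""
    path = repo_path.strip("/")
    if port == 22:
        return f"ssh://{user}@{host}/{path}.git"
    return f"ssh://{user}@{host}:{port}/{path}.git"


# Kubernetes-style access on the default SSH port.
k8s_url = git_ssh_url("csghub.example.com", "demo/model")

# Docker Compose access, where Git over SSH is mapped to port 2222.
compose_url = git_ssh_url("csghub.example.com", "demo/model", port=2222)

print(k8s_url)
print(compose_url)
```

The `ssh://` form is used here because it accepts an explicit port, which the shorter `user@host:path` form does not.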