Installation Summary

Version History:

  • Starting from v0.9.0, CSGHub no longer supports Gitea as a Git backend.
  • Starting from v1.1.0, the Temporal component is added as an asynchronous/scheduled task executor.
  • Starting from v1.3.0, Gitea is removed from the docker-compose/helm-chart installers.
  • Starting from v1.6.0, Space Builder is removed; its function is taken over by the runner.
  • Starting from v1.7.0, CSGHub integrates Starship internally.
  • Starting from v1.8.0, a new Notification service is added.
  • Starting from v1.9.0, the CE and EE CSGHub Helm charts are merged.
  • Starting from v1.14.0, XNet Storage is available in beta.
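
The version thresholds above can be read as a lookup: given an installed version, which of these changes already apply? The following is a minimal sketch of that idea, not part of CSGHub itself. The names `CHANGES`, `parse_version`, and `changes_in_effect` are hypothetical; only the version numbers and descriptions come from the list above.

```python
# Version thresholds copied from the Version History list above.
# The data structure and helper names are illustrative assumptions.
CHANGES = [
    ((0, 9, 0), "Gitea no longer supported as a Git backend"),
    ((1, 1, 0), "Temporal added as asynchronous/scheduled task executor"),
    ((1, 3, 0), "Gitea removed from docker-compose/helm-chart installers"),
    ((1, 6, 0), "Space Builder removed; runner takes over its function"),
    ((1, 7, 0), "Starship integrated internally"),
    ((1, 8, 0), "Notification service added"),
    ((1, 9, 0), "CE and EE Helm charts merged"),
    ((1, 14, 0), "XNet Storage in beta"),
]


def parse_version(text):
    """Turn 'v1.6.0' or '1.6.0' into the tuple (1, 6, 0)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def changes_in_effect(version):
    """Return the descriptions of all changes introduced at or before `version`."""
    current = parse_version(version)
    # Tuples compare element by element, so (1, 14, 0) > (1, 9, 0) as expected.
    return [description for since, description in CHANGES if current >= since]


if __name__ == "__main__":
    for line in changes_in_effect("v1.6.0"):
        print(line)
```

Comparing integer tuples rather than raw strings avoids the common pitfall where "1.14.0" sorts before "1.9.0".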

Introduction

CSGHub is an open-source, trusted asset management platform for large models. It helps users govern the assets (datasets, model files, code, etc.) involved in the life cycle of LLMs and their applications. With CSGHub, users can upload, download, store, verify, and distribute assets such as model files, datasets, and code through a web interface, the Git command line, or a natural-language chatbot. The platform also provides microservice submodules and standardized APIs so that users can integrate it with their own systems.

CSGHub is committed to providing an asset management platform that is designed natively for large models and can be deployed privately and run offline. It offers functionality similar to a private Hugging Face, managing LLM assets in much the same way that OpenStack Glance manages virtual machine images, Harbor manages container images, and Sonatype Nexus manages artifacts.

For an introduction to CSGHub, please refer to:

Deployment Methods

This project describes the available installation methods for CSGHub.

Currently, CSGHub can be installed using the following methods:

Components Overview

CSGHub is composed of multiple modules, each responsible for a specific part of the system. Together, they form a scalable platform for model and data management. The sections below describe each current component.

Core Frontend & Backend Services

  • portal

    Provides the user interface for interaction and serves as the main web frontend entry.

  • server

    The core backend service that exposes primary APIs and handles requests from the portal and external clients.

  • user

    The user center module responsible for user registration, login, permissions, and authentication logic.

  • rproxy

    Handles reverse proxying and request routing for deployed applications, such as forwarding Space application operations to Knative Serving.

  • nginx

    The unified external entry gateway that handles TLS termination, routing, and static file delivery.

AI Gateway & Content Safety

  • gateway / aigateway

    The unified entrypoint for AI service access, responsible for routing inference requests, rate limiting, circuit breaking, metering, and authorization.

  • moderation

    The sensitive-content detection and safety module. When it is integrated into server or aigateway, it may not run as a standalone component.
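
To make the gateway's rate-limiting responsibility concrete, here is a minimal token-bucket sketch. The source does not say which algorithm the gateway actually uses, so the token bucket is an assumption chosen because it is a common technique. The class name `TokenBucket` and its parameters are hypothetical and not part of any CSGHub API.

```python
class TokenBucket:
    """Illustrative token-bucket rate limiter (an assumed technique, not CSGHub code)."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained request rate
        self.tokens = float(capacity)               # start with a full bucket
        self.last = now                             # time of the last refill

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = max(self.last, now)
        # Each request consumes one token if one is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=2, refill_per_second=1.0)
    print([bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0)])
```

Passing the clock in as an argument, rather than reading it inside the class, keeps the sketch deterministic and easy to test.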

Billing & Notification

  • accounting

    The resource billing system that calculates usage-based costs for model inference, Space applications, and other resources.

  • notifier

    Handles internal and external notifications, including email alerts, system messages, and Webhooks.
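
Usage-based billing, as described for the accounting component, comes down to multiplying metered quantities by unit prices and summing the results. The sketch below illustrates only that arithmetic. The resource kinds, the prices in `UNIT_PRICES`, and the function `usage_cost` are hypothetical and do not reflect CSGHub's real metering schema or price list.

```python
from decimal import Decimal

# Hypothetical unit prices per metered resource; placeholders for illustration only.
UNIT_PRICES = {
    "inference_seconds": Decimal("0.002"),
    "space_hours": Decimal("0.50"),
}


def usage_cost(usage):
    """Sum quantity * unit price over every metered resource in `usage`."""
    total = Decimal("0")
    for kind, quantity in usage.items():
        # Decimal avoids the rounding drift that binary floats introduce in money math.
        total += UNIT_PRICES[kind] * Decimal(str(quantity))
    return total


if __name__ == "__main__":
    print(usage_cost({"inference_seconds": 1000, "space_hours": 3}))
```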

Data & Model Management

  • mirror_repo

    Synchronizes model and dataset repositories from opencsg.com to the local environment.

  • mirror_lfs

    Synchronizes large files (LFS) associated with repositories.

  • dataviewer

    Enables quick dataset preview in the frontend, supporting formats like CSV, images, and more.

Task System (Temporal)

  • temporal

    Manages and schedules long-running or asynchronous tasks such as resource syncing and build jobs.

  • temporal_worker

    Executes the asynchronous tasks defined by Temporal workflows.

  • temporal_ui

    A visual management interface for Temporal, used to inspect workflow states and task details.

Git Storage & Repository Services

  • gitaly

    High-performance Git storage backend responsible for all Git operations.

  • gitlab_shell

    Provides the Git-over-SSH interface, handling secure Git SSH requests.

Infrastructure Services

  • postgresql

    The primary database used to store CSGHub metadata.

  • redis

    Provides caching, temporary data storage, and session storage.

  • minio

    Local object storage used for models, datasets, inference outputs, and other static artifacts.

  • registry

    Container image registry used to store build outputs for Space applications and other services.

  • nats

    A message and event bus enabling efficient asynchronous communication between microservices.

Logging & Monitoring

  • loki

    Centralized log storage and indexing system, typically used with promtail or fluentd.

  • fluentd (optional)

    Log collector used to aggregate logs and forward them to Loki, S3, Elasticsearch, or other storage backends.

  • prometheus

    Metrics collection and storage system used for performance monitoring and alerting.

Network & Internal Components

  • casdoor

    Identity and authentication service that works together with the user module to provide login and OAuth capabilities.

  • xnet

    An internal networking or sidecar service whose exact role depends on your implementation; it is often used for internal traffic routing or unified outbound access.
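
The grouping above can also be captured as a small lookup table, which is handy when checking which part of the system a given container or pod belongs to. This is a sketch only: the component and group names are taken from this overview, while `COMPONENT_GROUPS` and `group_of` are hypothetical helpers, not part of CSGHub.

```python
# Component names and group titles as listed in the overview above.
# "gateway" and "aigateway" are both listed because the overview names both.
COMPONENT_GROUPS = {
    "Core Frontend & Backend Services": ["portal", "server", "user", "rproxy", "nginx"],
    "AI Gateway & Content Safety": ["gateway", "aigateway", "moderation"],
    "Billing & Notification": ["accounting", "notifier"],
    "Data & Model Management": ["mirror_repo", "mirror_lfs", "dataviewer"],
    "Task System (Temporal)": ["temporal", "temporal_worker", "temporal_ui"],
    "Git Storage & Repository Services": ["gitaly", "gitlab_shell"],
    "Infrastructure Services": ["postgresql", "redis", "minio", "registry", "nats"],
    "Logging & Monitoring": ["loki", "fluentd", "prometheus"],
    "Network & Internal Components": ["casdoor", "xnet"],
}


def group_of(component):
    """Return the group title a component belongs to, or None if it is not listed."""
    for group, members in COMPONENT_GROUPS.items():
        if component in members:
            return group
    return None


if __name__ == "__main__":
    for name in ("gitaly", "nats", "unknown"):
        print(name, "->", group_of(name))
```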