Installation Summary
Version History:
- Starting from v0.9.0, CSGHub no longer supports Gitea as a Git backend.
- Starting from v1.1.0, the Temporal component is added as an asynchronous/scheduled task executor.
- Starting from v1.3.0, CSGHub removes Gitea from the docker-compose/helm-chart installer.
- Starting from v1.6.0, Space Builder is removed and its function is taken over by runner.
- Starting from v1.7.0, CSGHub integrates Starship internally.
- Starting from v1.8.0, a new Notification service is added.
- Starting from v1.9.0, the CSGHub CE and EE Helm charts are merged.
- Starting from v1.14.0, XNet Storage is available in beta.
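When planning an upgrade, it can help to list which of these changes fall between the installed version and the target version. The following is a minimal sketch of that lookup; the `VERSION_NOTES` table is transcribed from the list above, while `parse_version` and `changes_between` are hypothetical helper names used only for illustration and are not part of CSGHub.

```python
# Notes transcribed from the Version History list; not an official changelog.
VERSION_NOTES = [
    ("0.9.0", "Gitea is no longer supported as a Git backend"),
    ("1.1.0", "Temporal added as the asynchronous/scheduled task executor"),
    ("1.3.0", "Gitea removed from the docker-compose/helm-chart installer"),
    ("1.6.0", "Space Builder removed; runner takes over its function"),
    ("1.7.0", "Starship integrated internally"),
    ("1.8.0", "Notification service added"),
    ("1.9.0", "CE and EE Helm charts merged"),
    ("1.14.0", "XNet Storage available in beta"),
]


def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v1.14.0' or '1.14.0' into (1, 14, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def changes_between(current: str, target: str) -> list[str]:
    """Hypothetical helper: notes for versions after `current`, up to and including `target`."""
    low, high = parse_version(current), parse_version(target)
    return [
        f"v{version}: {note}"
        for version, note in VERSION_NOTES
        if low < parse_version(version) <= high
    ]


if __name__ == "__main__":
    # Example: which listed changes apply when moving from v1.2.0 to v1.8.0?
    for line in changes_between("v1.2.0", "v1.8.0"):
        print(line)
```

Versions are compared as integer tuples rather than strings, so that v1.14.0 sorts after v1.9.0.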
Introduction
CSGHub is an open source, trusted asset management platform for large models. It helps users govern the assets (datasets, model files, code, and so on) involved in the life cycle of LLMs and their applications. With CSGHub, users can upload, download, store, verify, and distribute model files, datasets, and code through a web interface, the Git command line, or a natural language chatbot. The platform also provides microservice submodules and standardized APIs so that users can integrate it with their own systems.
CSGHub is committed to providing an asset management platform that is designed natively for large models and can be deployed privately and run offline. It offers functionality similar to a private Hugging Face, managing LLM assets much as OpenStack Glance manages virtual machine images, Harbor manages container images, and Sonatype Nexus manages artifacts.
For an introduction to CSGHub, please refer to:
- Portal: https://github.com/OpenCSGs/csghub
- Server: https://github.com/OpenCSGs/csghub-server
- Install: https://github.com/OpenCSGs/csghub-charts
Deployment methods
This project documents the installation methods for CSGHub.
The following installation methods are currently available:
- Docker
- Helm Chart
- VirtualBox (to be removed)
Components Overview
CSGHub is composed of multiple modules, each responsible for a specific part of the system. Together, they form an efficient and scalable platform for model and data management. The current components are described below.
Core Frontend & Backend Services
- portal: Provides the user interface for interaction and serves as the main web frontend entry.
- server: The core backend service that exposes primary APIs and handles requests from the portal and external clients.
- user: The user center module responsible for user registration, login, permissions, and authentication logic.
- rproxy: Handles reverse proxying and request routing for deployed applications, such as forwarding Space application operations to Knative Serving.
- nginx: The unified external entry gateway that handles TLS termination, routing, and static file delivery.
AI Gateway & Content Safety
- gateway / aigateway: The unified entrypoint for AI service access, responsible for routing inference requests, rate limiting, circuit breaking, metering, and authorization.
- moderation: Sensitive content detection and safety module. (If integrated into server or aigateway, it may not run as a standalone component.)
Billing & Notification
- accounting: The resource billing system that calculates usage-based costs for model inference, Space applications, and other resources.
- notifier: Handles internal and external notifications, including email alerts, system messages, and Webhooks.
Data & Model Management
- mirror_repo: Synchronizes model and dataset repositories from opencsg.com to the local environment.
- mirror_lfs: Synchronizes large files (LFS) associated with repositories.
- dataviewer: Enables quick dataset preview in the frontend, supporting formats like CSV, images, and more.
Task System (Temporal)
- temporal: Manages and schedules long-running or asynchronous tasks such as resource syncing and build jobs.
- temporal_worker: Executes the asynchronous tasks defined by Temporal workflows.
- temporal_ui: A visual management interface for Temporal, used to inspect workflow states and task details.
Git Storage & Repository Services
- gitaly: High-performance Git storage backend responsible for all Git operations.
- gitlab_shell: Provides the Git-over-SSH interface, handling secure Git SSH requests.
Infrastructure Services
- postgresql: The primary database used to store CSGHub metadata.
- redis: Provides caching, temporary data storage, and session storage.
- minio: Local object storage used for models, datasets, inference outputs, and other static artifacts.
- registry: Container image registry used to store build outputs for Space applications and other services.
- nats: A message and event bus enabling efficient asynchronous communication between microservices.
Logging & Monitoring
- loki: Centralized log storage and indexing system, typically used with promtail or fluentd.
- fluentd (optional): Log collector used to aggregate logs and forward them to Loki, S3, Elasticsearch, or other storage backends.
- prometheus: Metrics collection and storage system used for performance monitoring and alerting.
Network & Internal Components
- casdoor: Identity and authentication service that works together with the user module to provide login and OAuth capabilities.
- xnet: Internal networking or sidecar service (depends on your implementation), often used for internal traffic routing or unified outbound access.