Configuration Explanation

  • Note: This document explains only the important parameters. Parameters with similar or identical functions are not described individually.

    • All non-third-party services can set <service>.name to change the service name and the domain name exposed by the ingress (except for the portal's main domain).

    • All non-third-party services can define most Deployment- and StatefulSet-related attributes, such as labels, annotations, replicas, serviceAccount, environments, resources, volumeMounts, livenessProbe, readinessProbe, startupProbe, lifecycle, stdin, tty, volumes, nodeSelector, tolerations, affinity, securityContext, etc.

This document explains the important configuration items in the CSGHub Helm Chart, including global configuration, core service configuration, built-in components, and the priority of each parameter. It is suitable for deployment, operation and maintenance, and secondary development personnel.

1. Global Parameter

1.1 Release edition

global:
  edition: "ce" # or "ee"

Specifies the edition to deploy: Community (ce) or Enterprise (ee). The edition affects the image tag, the enabled features, and the dependencies.

1.2 Global Ingress

global:
  ingress:
    domain: "example.com"
    useTop: false
    tls:
      enabled: false
      secretName: "<kubernetes tls secret name>"
    service:
      type: "LoadBalancer" # or "NodePort"

Explanation:

  • domain

    The base domain used by the CSGHub system. All services derive their final access domains from it.

  • useTop

    Whether to directly use the top-level domain.

    • true: The main CSGHub service directly uses the top-level domain (e.g., example.com).

    • false: The main service uses a subdomain (csghub.example.com by default).

  • tls

    • enabled
      • true: Enables HTTPS encrypted access. You must also provide a secretName.
      • false: HTTPS encrypted access is not enabled.
  • service

    • type

      Specifies how the csghub Ingress is exposed externally. Valid values are LoadBalancer and NodePort.

      Note: This value must be specified at deployment time. Changing the Service type after deployment will affect access.

      This field carries a YAML anchor (&type) that is referenced by ingress-nginx.controller.service.type, so its value also determines that setting.

Priority:

global.ingress < ingress < <service>.ingress
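
As an illustration of the settings above, the following override is a minimal sketch that serves CSGHub over HTTPS on the top-level domain. The secret name csghub-tls is a hypothetical placeholder for a TLS secret that already exists in the release namespace; the domain is the example value used above.

```yaml
# Hypothetical override file, supplied to Helm with -f at install time.
global:
  ingress:
    domain: "example.com"
    useTop: true                # the main service answers on example.com itself
    tls:
      enabled: true
      secretName: "csghub-tls"  # assumed name of an existing TLS secret
    service:
      type: "LoadBalancer"      # choose this at deployment time (see the note above)
```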

1.3 Global Image

global:
  image:
    registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
    tag: "v1.11.0"
    pullPolicy: "IfNotPresent"
    pullSecrets:
      - acr-pull-secret
  • registry

    This parameter overrides the image registry for all images in the Helm chart. Modifying it is generally not recommended: when it is not set, each image is pulled from its original registry (for example docker.io). For use in China, it can be set to opencsg-registry.cn-beijing.cr.aliyuncs.com.

  • tag

    Defines the version of the csghub images. If the image namespace is opencsghq, the template automatically appends identifiers such as the edition when the tag follows the expected format. For example, the tag shown above is rendered as v1.11.0-ce or v1.11.0-ee.

  • pullPolicy

    Image pull policy.

  • pullSecrets

    Configures the pull secrets used to pull images from a private image registry.

Priority:

global.image < image < <service>.image
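
The priority chain can be read as "the most specific value wins". The following is a minimal sketch, assuming portal.image accepts the same keys as global.image; the tags v1.11.1 and v1.11.2 are hypothetical and only serve to make the three levels distinguishable.

```yaml
global:
  image:
    tag: "v1.11.0"              # lowest priority: default for every image
    pullPolicy: "IfNotPresent"
image:
  tag: "v1.11.1"                # hypothetical; overrides global.image.tag for the core csghub images
portal:
  image:
    tag: "v1.11.2"              # hypothetical; highest priority, applies to the portal only
```

Under these assumptions the portal would resolve its tag from portal.image, other core services from image, and all remaining images from global.image. pullPolicy is set only at the global level, so every level inherits it.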

1.4 Global Persistence

global:
  persistence:
    storageClass: "hostpath"
    accessModes: ["ReadWriteOnce"]
    size: "10Gi"
  • storageClass

    The default storage class used by all StatefulSets.

  • accessModes

    The default access modes used by all StatefulSets.

  • size

    The default storage volume size used by all StatefulSets when creating PVCs.

Priority:

global.persistence < <service>.persistence
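
A short sketch of a per-service override follows. It assumes that gitaly exposes a persistence block following the <service>.persistence pattern; the 200Gi size is an arbitrary illustrative value.

```yaml
global:
  persistence:
    storageClass: "hostpath"
    accessModes: ["ReadWriteOnce"]
    size: "10Gi"                # default for every StatefulSet
gitaly:
  persistence:
    size: "200Gi"               # illustrative; storageClass and accessModes fall back to the global values
```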

1.5 External Configuration for PostgreSQL, Redis, Mongo, Object Storage, Registry, etc.

Each component allows:

<service>:
  enabled: true # or false
  external: {}
  • enabled

    • true: Enables the built-in service component; in this case, the external configuration will not take effect.
  • external

    When enabled is set to false, the connection information for the corresponding external service component is set via external.

Priority:

global.service < <service>.service
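
For example, switching to an externally managed PostgreSQL might look like the sketch below. The field names under external are an assumption inferred from the portal.postgresql block in section 3.4 (without database); the host, user, and port values are hypothetical placeholders.

```yaml
global:
  postgresql:
    enabled: false                     # do not deploy the built-in PostgreSQL
    external:
      host: "pg.internal.example.com"  # hypothetical host
      port: 5432                       # hypothetical; check the value type the chart expects
      user: "csghub"                   # hypothetical user
      password: "<postgresql password>"
      timezone: "Etc/UTC"
      sslmode: "prefer"
```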

1.6 Global ChartContext

chartContext:
  isBuiltIn: true
  • isBuiltIn

    The default value is true. Its main purpose is to achieve seamless integration of dataflow, runner, and csgship chart, and to indicate whether the chart is deployed independently or bundled with the csghub main service.


2. CSGHub Core Configuration

2.1 Image (Core service image)

image:
  registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
  tag: "v1.11.0"
  pullPolicy: "IfNotPresent"
  pullSecrets:
    - acr-pull-secret

This parameter functions similarly to global.image, but its scope is limited to services started using the csghub-server/csghub-portal image.

Priority:

global.image < image (this section) < <service>.image

2.2 Logging

logging:
  level: "info" # or "warning", "debug", "error"

Sets the log level for all services started from the csghub-server image.

3. Portal

3.1 Image

portal:
  repository: "opencsghq/csghub-portal"

Other configurations can be ignored; they are inherited from global.image and image.

Priority:

global.image < image < portal.image

3.2 Ingress

This works the same way as global.ingress, except that it cannot declare ingress.service.type. All other parameters take priority over global.ingress.

3.3 Docs

portal:
  docs:
    domain: "docs.example.com"

or

portal:
  docs:
    host: "192.168.18.19"
    port: 8003

This configuration links the CSGHub documentation center to an externally deployed documentation instance (CSGHub does not include a built-in documentation center).

Two configuration methods are currently provided (choose one):

  • domain

Specifies the domain name of the deployed external document center instance.

  • host and port

If no domain name is configured, you can directly specify the host and port of the document center instance.

3.4 PostgreSQL

portal:
  postgresql:
    host: "<postgresql host>"
    port: "<postgresql port>"
    database: "<postgresql csghub portal database>"
    user: "<postgresql user>"
    password: "<postgresql password>"
    timezone: "Etc/UTC"
    sslmode: "prefer"

This parameter defines the database connection information for the Portal. Compared to global.postgresql.external, it includes a database parameter. That parameter cannot be specified globally: using the same database for all components is discouraged, and the Helm chart has not been adapted for it.

Standard parameter settings are not detailed here.

Priority:

global.postgresql.external < portal.postgresql

3.5 ObjectStore

portal:
  objectStore:
    endpoint: "<object store endpoint>"
    accessKey: "<object store access key>"
    secretKey: "<object store secret key>"
    bucket: "<object store public bucket>"
    region: "<object store region>"
    secure: "<object store tls>"
    encrypt: "<object store server encrypt>"
    pathStyle: "<object store path style>"

Defines the object storage connection information used by the Portal. Compared to global.objectStore.external, it has an additional bucket parameter. That parameter cannot be specified globally: sharing one bucket across all components is not recommended, and the Helm chart has not been adapted for it.

Priority:

global.objectStore.external < portal.objectStore

4. Server

4.1 gitlabShell

server:
  gitlabShell:
    sshPort: 22

This defines the port number for the SSH service when cloning using git over ssh. The default port is 22 in LoadBalancer mode and 30022 in NodePort mode. Modifying this port is generally not recommended, as it involves adjusting the Ingress Controller's TCP exposure rules.
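
The relationship between the exposure mode and the SSH port can be made explicit in an override. The sketch below only writes out the NodePort-mode default stated above; it is not a recommendation to change the port.

```yaml
global:
  ingress:
    service:
      type: "NodePort"
server:
  gitlabShell:
    sshPort: 30022   # the NodePort-mode default described above, stated explicitly
```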

4.2 multiSync

server:
  multiSync:
    enabled: true
    proxy: "<proxy to connect internet>"
  • enabled

    Defaults to true, indicating that multi-source synchronization is enabled.

  • proxy

    Defaults to nil, used to specify the network proxy used to connect to the Internet during multi-source synchronization.

4.3 SwaggerAPI

server:
  swaggerAPI:
    enabled: false
  • enabled

    The default value is false, which disables the Swagger API helper instance.

5. RProxy

rproxy:
  coredns:
    enabled: true
    image:
      repository: "coredns/coredns"
      tag: "1.11.1"
  nginx:
    enabled: true
    image:
      repository: "nginx"
      tag: "latest"

This section is not explained in detail. CoreDNS and Nginx were used in versions prior to v1.12.0 to assist rproxy with traffic forwarding. Starting with v1.12.0, these two components are deprecated and no longer used.

6. Notifier

6.1 SMTP

notifier:
  smtp:
    host: "<smtp host>"
    port: "<smtp port>"
    username: "<smtp username>"
    password: "<smtp password>"

Configures the mail (SMTP) server used by the notifier.

6.2 FeiShu

notifier:
  feiShu:
    appId: "<feishu app id>"
    appSecret: "<feishu app secret>"

Configures the notifier to send notifications to Feishu (Lark).

7. Runner (Chart built-in)

The configuration in this section is passed directly to the Runner subchart.

  • region

    Default: region-0. Used to identify the cluster where the runner resides, e.g., "cn-north". Custom formats and rules are available.

  • interval

    Default: 60 seconds. The interval at which the Runner reports information to CSGHub.

  • namespace

    Default: spaces. The Kubernetes namespace used for deploying inference, fine-tuning, and application spaces.

  • autoConfigure

    Default: true. Specifies whether to automatically configure dependent components such as Knative Serving, Argo Workflows, and LeaderWorkerSet. These components are essential for inference, fine-tuning, model evaluation, application spaces, and MCP.

  • mergingNamespace

    Default: disable. By default, with autoConfigure enabled, different types of components will automatically create different Kubernetes namespaces. This parameter allows for appropriate namespace merging.

    • disable

      Do not perform any namespace merging.

    • multi

      Merge namespaces appropriately.

    • single

      Merge all resources into a single namespace (not recommended).

  • kymlMode

    Default: create. Used for maintaining resources created by autoConfigure.

    • create

      Create only. Skip if the resource already exists.

    • update

      Update resources using Apply mode.

    • replace

      Force replacement of resources, deleting and then recreating them.

  • userPublicDomain

    Default: true. Specifies the method for accessing inference, fine-tuning, application space, and other instances.

    • true: Indicates using a separate domain name.
    • false: Uses subPath access, which may restrict the use of application space, MCP, and other features.
  • pipIndexUrl

    Default: https://pypi.tuna.tsinghua.edu.cn/simple/. Defines the PyPI index used when building the application space image.

  • extraBuildArgs

    Default: nil. Used to specify more parameters when building images with Kaniko.

  • modelRegistry

    Default: nil. Specifies the container image repository from which to pull images for a specified architecture when starting an inference instance. OpenCSG ACR is used by default.

  • knative.serving.domain

    Default: example.com. Defines the default internal domain name for exposing the ksvc service. No DNS resolution configuration is required; it is used only for internal routing.

  • rbac

    • create

      Default: true. Specifies whether to create the Kubernetes permissions required for runner creation and related resources.

  • logcollector

    • enabled

      Default: false. Specifies whether to enable the logcollector service. This service needs to be enabled if you want to retain ksvc instance logs for the past 7 days.

    • loki.address

      Default: nil. Defines the address of the loki service for storing logs. If not set, the csghub loki instance is used by default.
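
Putting several of the parameters above together, a Runner override might look like the following sketch. The top-level key runner is an assumption about how the subchart values are addressed; the region name is the example given above, and all values are illustrative.

```yaml
runner:                      # assumed top-level key for the Runner subchart values
  region: "cn-north"         # free-form cluster identifier
  namespace: "spaces"
  autoConfigure: true        # install Knative Serving, Argo Workflows, etc. automatically
  mergingNamespace: "multi"  # disable | multi | single
  kymlMode: "update"         # create | update | replace
  rbac:
    create: true
  logcollector:
    enabled: true            # loki.address left unset, so the csghub loki instance is used
```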

8. Dataflow (Chart built-in)

Data processing tool. The Dataflow Helm Chart can be deployed independently or bundled with the CSGHub Helm Chart (by setting .Values.global.chartContext.isBuiltIn: true). This chart includes the following components:

  • dataflow

  • label studio

  • Celery worker

  • PostgreSQL (disabled by default when bundled)

  • Redis (disabled by default when bundled)

  • MongoDB

  • Ingress-nginx (disabled by default when bundled)

  • Prometheus (disabled by default when bundled)

Enabled via dataflow.enabled. It is installed bundled with CSGHub Helm Chart by default and requires no additional configuration. Currently, customizable settings are as follows:

dataflow:
  enabled: true
  dataflow:
    image: {}
    postgresql: {}
    redis: {}
    mongo: {}
    persistence: {}
  labelStudio:
    image: {}
    postgresql: {}
    persistence: {}

All parameter definition rules are the same as those described above.

9. CSGShip (Chart built-in)

Backend service for the AI coding assistant. The CSGShip Helm Chart can be deployed independently or bundled with the CSGHub Helm Chart (by setting .Values.global.chartContext.isBuiltIn: true). This chart contains the following components:

  • agentic
  • billing
  • casdoor
  • frontend
  • megalinter-server
  • megalinter-worker
  • postgresql
  • redis
  • secscan
  • web

All components require no special or additional configuration.

10. Other Services

Besides the csghub-server service, the following derivative services are based on the same image:

  • accounting

  • user

  • dataviewer

  • mirror

  • temporalWorker

  • gateway

Their configuration parameters for image, PostgreSQL, Redis, etc. are nearly identical, and by default they inherit all parameters from csghub-server. Service-specific settings are mostly passed through each service's own environments setting.

11. Third-party Built-In Components

11.1 PostgreSQL

11.1.1 Databases

postgresql:
  databases:
    - "csghub_casdoor"
    - "csghub_temporal"
    - "csghub_server"
    - "csghub_portal"
    - "csghub_dataflow"
    - "csghub_label_studio"
    - "csghub_csgship"

Defines the databases created during database initialization. This setting only takes effect at initialization.

11.1.2 Parameters

postgresql:
  parameters:
    max_connections: 200
    # ......

This parameter is used to customize database parameters. By default, the database starts with all parameters at their default values, but you can optimize them using this parameter.
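
A slightly fuller sketch follows. max_connections comes from the example above; shared_buffers and work_mem are standard PostgreSQL settings added here for illustration, and the values (as well as the assumption that the chart accepts unit strings) should be checked against your workload and chart version.

```yaml
postgresql:
  parameters:
    max_connections: 200
    shared_buffers: "1GB"   # illustrative value
    work_mem: "16MB"        # illustrative value
```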

11.1.3 Other Configuration

The details are not elaborated here, as they are all general configurations.

11.2 Redis

11.2.1 requirePass

redis:
  requirePass: false

This parameter is not enabled by default. It can be enabled if csgship is not running. Currently, csgship does not support password verification for Redis.

11.3 MinIO

11.3.1 Console

minio:
  console:
    enabled: true
    service:
      port: 9001
      protocol: "TCP"

Defines whether the MinIO console UI is enabled, which port it uses, and so on.

11.3.2 Region

minio:
  region: "cn-north-1"

Defines the default region for MinIO.

11.3.3 Buckets

minio:
  buckets:
    - name: "csghub-registry"
      policy: "none" # Access policy: none, download, public
    - name: "csghub-billing"
      policy: "none"
    - name: "csghub-server"
      policy: "none"
    - name: "csghub-portal"
      policy: "none"
    - name: "csghub-portal-public"
      policy: "download"
    - name: "csghub-runner"
      policy: "none"

Defines the buckets to be created. Unlike postgresql.databases, this parameter can be modified after startup; the chart always checks whether each bucket has been created.

  • name

    The bucket name

  • policy

    The bucket access policy.

    • none: Default value, i.e., private

    • download: Allows read-only access

    • public: Allows public read and write access
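
Because Helm replaces list values wholesale rather than merging them, an override that adds a bucket should repeat the default entries. The sketch below adds one hypothetical bucket, team-datasets, with read-only access.

```yaml
minio:
  buckets:
    # Default buckets repeated, since an overriding list replaces the chart's list.
    - name: "csghub-registry"
      policy: "none"
    - name: "csghub-billing"
      policy: "none"
    - name: "csghub-server"
      policy: "none"
    - name: "csghub-portal"
      policy: "none"
    - name: "csghub-portal-public"
      policy: "download"
    - name: "csghub-runner"
      policy: "none"
    # Hypothetical additional bucket.
    - name: "team-datasets"
      policy: "download"
```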

11.3.4 Other Configuration

The rest are all standard configurations, which will not be elaborated here.

11.4 Registry

There are no special configuration requirements, so the settings are not elaborated here.

11.5 Gitaly

11.5.1 Storage

gitaly:
  storage: "default"

The default name for the gitaly storage.

11.5.2 Other Configuration

The rest are all standard configurations, which will not be elaborated here.

11.6 GitlabShell

11.6.1 RBAC

gitlabShell:
  rbac:
    create: true

Whether to create RBAC permissions. They are mainly used to create the SSH key pair that verifies Git-over-SSH operations.

11.6.2 Other Configuration

The rest are all standard configurations, which will not be elaborated here.

11.7 Nats

All settings are standard configurations, which will not be elaborated here.

11.8 Casdoor

All settings are standard configurations, which will not be elaborated here.

11.9 Temporal

11.9.1 Console

temporal:
  enabled: false

Whether to enable the Temporal console UI. It is disabled by default. Because of the OAuth settings, if the UI is enabled while the Casdoor service is not ready, the entire Temporal service will fail to function. If you enable it, make sure your Casdoor service is ready (i.e., accessible via Ingress).

11.9.2 Other Configuration

There are no special configuration requirements, so the settings are not elaborated here.

12. Third-party Dependencies

The following components are for illustrative purposes only. In actual use, they generally do not need to be modified; the default configuration is sufficient.

12.1 ingress-nginx

The default ingress controller. It can be enabled or disabled via ingress-nginx.enabled.

Please do not modify the default configuration, as this may cause service access errors.

12.2 fluentd

Originally the default log collection tool. It is now disabled by default and may be removed in a later release.

12.3 loki

Log storage and query engine. Due to low usage intensity, a minimal deployment was adopted here.

12.3.1 Ingress

loki:
  ingress:
    enabled: false
    basicAuth: {}
      # username: ""
      # password: ""

If loki and the logcollector in the runner chart are not deployed in the same instance, you need to enable the loki ingress. Note that logcollector does not currently support basicAuth authentication.

12.4 Tempo

Tempo is a trace collection tool. Enabling it significantly impacts performance, so it is not recommended unless there is a specific need.

12.5 Prometheus

This is used to collect background analysis data for inference, fine-tuning, and other instances; it is disabled by default.