Configuration Explanation
Note: This document only explains the important parameters. Some parameters, due to their similar or identical functions, are not elaborated upon here.
All non-third-party services can set <service>.name to change the service name and the domain name exposed by the gateway API (except for the portal's main domain).
All non-third-party services can also define most Deployment- and StatefulSet-related attributes, such as labels, annotations, replicas, serviceAccount, environments, resources, volumeMounts, livenessProbe, readinessProbe, startupProbe, lifecycle, stdin, tty, volumes, nodeSelector, tolerations, affinity, securityContext, etc.
This document explains the important configuration items in the CSGHub Helm Chart, including global configuration, core service configuration, built-in components, and the priority of each parameter. It is intended for people who deploy, operate, or extend CSGHub.
1. Global Parameters
1.1 Release edition
global:
  edition: "ce"  # or "ee"
Specifies the edition to deploy: Community (ce) or Enterprise (ee). The edition affects the image tag, the enabled features, and the dependencies.
1.2 Global Gateway API
global:
  gateway:
    external:
      domain: "csghub.example.com"
      # public: "public.example.com"
    tls:
      enabled: false
      secretName: "<kubernetes tls secret name>"
    service:
      type: "LoadBalancer"  # or "NodePort"
      nodePorts:
        http: 30080
        https: 30443
        gitssh: 30022
- gateway
  - external
    - domain
      Default: csghub.example.com. The domain name used for external access to CSGHub. This differs from previous versions; the base domain is no longer used.
      - For example, if CSGHub is ultimately accessed via http://csghub.example.com, configure csghub.example.com here.
      - Other service domains are generated from the base domain of this domain, such as casdoor.example.com.
      - If the specified domain is a second-level domain, such as example.com, a prefix is added automatically and csghub.example.com is used.
    - public
      Default: nil. When instances such as inference, fine-tuning, MCP, and Space are accessed through independent domains, each service needs its own domain name. When unset, <domain> is used. To change it, specify a separate domain name, such as public.example.com; it is ultimately referenced as a wildcard domain. You cannot specify a wildcard domain here directly.
  - tls
    - enabled
      Default: false.
      - true: Enables HTTPS encrypted access. To enable HTTPS, you must provide secretName.
      - false: Disables HTTPS encrypted access.
    - secretName
      Default: nil. Specifies the Kubernetes TLS secret that holds the domain certificate. Ensure the certificate covers at least the public wildcard domain. For example:
      - If public is not specified and domain is csghub.example.com, the certificate must cover at least *.csghub.example.com.
      - If public is specified as public.example.com, the certificate must cover at least *.public.example.com.
  - service
    - type
      Default: LoadBalancer. Specifies how the Gateway API Controller is exposed externally. Valid values are LoadBalancer and NodePort.
      Note: Set this value at deployment time. Changing the Service type after deployment may affect access.
    - nodePorts (only valid for type = NodePort)
      - http
        Default: 30080. Specifies the NodePort corresponding to port 80.
      - https
        Default: 30443. Specifies the NodePort corresponding to port 443.
      - gitssh
        Default: 30022. Specifies the NodePort corresponding to port 22.
Priority:
global.gateway < gateway < <service>.gateway
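As a rough illustration of the gateway parameters above, the following values fragment sketches a NodePort deployment with HTTPS enabled. It uses only the keys listed in this section. The secret name csghub-tls is a hypothetical placeholder, and its certificate is assumed to cover *.public.example.com because public is set.

```yaml
global:
  gateway:
    external:
      domain: "csghub.example.com"
      public: "public.example.com"   # plain domain, not a wildcard
    tls:
      enabled: true
      secretName: "csghub-tls"       # hypothetical name; certificate must cover *.public.example.com
    service:
      type: "NodePort"
      nodePorts:                     # only takes effect because type is NodePort
        http: 30080
        https: 30443
        gitssh: 30022
```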
1.3 Global Image
global:
  imageRegistry: "opencsg-registry.cn-beijing.cr.aliyuncs.com/opencsghq"
  image:
    registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
    tag: "v1.11.0"
    pullPolicy: "IfNotPresent"
    pullSecrets:
      - acr-pull-secret
- imageRegistry
  Default: docker.io. Specifies the image registry used for EnvoyGateway. For use within China, it can be set to opencsg-registry.cn-beijing.cr.aliyuncs.com/opencsghq.
- registry
  Default: docker.io. Overrides the image registry for all images in the Helm chart. Changing it is generally not recommended, because by default each image is pulled from its original registry, such as docker.io. For use within China, it can be set to opencsg-registry.cn-beijing.cr.aliyuncs.com (EnvoyGateway is not affected by this parameter; see imageRegistry).
- tag
  Defines the version of the csghub images. If the image namespace is opencsghq, the template automatically completes identifiers such as the edition, depending on whether the tag is compliant. For example, the tag in the example above is rendered as v1.11.0-ce or v1.11.0-ee.
- pullPolicy
  Default: IfNotPresent. The image pull policy.
- pullSecrets
  Default: nil. The image pull secrets used to pull images from a private image registry.
Priority:
global.image < image < <service>.image
1.4 Global Persistence
global:
  persistence:
    storageClass: "hostpath"
    accessModes: ["ReadWriteOnce"]
    size: "10Gi"
- storageClass
  Default: nil. The default storage class used by all StatefulSets.
- accessModes
  Default: ReadWriteOnce. The default access modes used by all StatefulSets.
- size
  Default: 10Gi. The default volume size used by all StatefulSets when creating PVCs.
Priority:
global.persistence < <service>.persistence
1.5 External Configuration for PostgreSQL, Redis, Mongo, Object Storage, Registry, etc.
Each component allows:
<service>:
  enabled: true  # or false
  external: {}
- enabled
  - true: Enables the built-in service component; in this case, the external configuration does not take effect.
  - false: Disables the built-in service component.
- external
  When enabled is set to false, the connection information for the corresponding external service component is set via external.
Priority:
global.service < <service>.service
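As a sketch of how enabled and external work together, the fragment below turns off the built-in PostgreSQL and points the chart at an external instance. The global.postgresql.external path is the one named in the Portal section; the keys under external are an assumption, taken from portal.postgresql minus database.

```yaml
global:
  postgresql:
    enabled: false                      # disable the built-in component so external takes effect
    external:                           # assumed keys, mirroring portal.postgresql without database
      host: "<postgresql host>"
      port: "<postgresql port>"
      user: "<postgresql user>"
      password: "<postgresql password>"
      timezone: "Etc/UTC"
      sslmode: "prefer"
```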
1.6 Global ChartContext
chartContext:
  isBuiltIn: true
- isBuiltIn
  Default: true. Indicates whether the chart is deployed independently or bundled with the main csghub chart. Its main purpose is to let the dataflow, runner, and csgship charts integrate seamlessly.
2. CSGHub Core Configuration
2.1 Image (Core service image)
image:
  registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
  tag: "v1.11.0"
  pullPolicy: "IfNotPresent"
  pullSecrets:
    - acr-pull-secret
This parameter functions similarly to global.image, but its scope is limited to services started using the csghub-server/csghub-portal image.
Priority:
global.image < image (this section) < <service>.image
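A minimal sketch of this priority rule follows. It assumes that a level which sets a key overrides only that key and inherits the rest; that merge granularity is an assumption and is not stated above. The tag v1.11.1 is hypothetical and chosen only to differ from the global tag.

```yaml
global:
  image:
    registry: "opencsg-registry.cn-beijing.cr.aliyuncs.com"
    tag: "v1.11.0"
    pullPolicy: "IfNotPresent"

# Scope: services started from the csghub-server/csghub-portal image.
image:
  tag: "v1.11.1"   # hypothetical tag; takes priority over global.image.tag
```

Under that assumption, the core services would use tag v1.11.1 while still taking the registry and pull policy from global.image.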
2.2 Logging
logging:
  level: "info"  # or "warning", "debug", "error"
Default: info.
Sets the log level globally for all services started from the csghub-server image.
3. Portal
3.1 Image
portal:
  repository: "opencsghq/csghub-portal"
Other configurations can be ignored; they are inherited from global.image and image.
Priority:
global.image < image < portal.image
3.2 Gateway API
This works the same way as global.gateway and is not described in detail here. The differences are that gateway.service.type cannot be declared, and that all other parameters take priority over global.gateway.
3.3 Docs
portal:
  docs:
    domain: "docs.example.com"
or
portal:
  docs:
    host: "192.168.18.19"
    port: 8003
This configuration links the CSGHub documentation center to an externally deployed documentation instance (CSGHub does not have a built-in documentation center).
Two configuration methods are currently provided (choose one):
- domain
  Specifies the domain name of the externally deployed documentation center instance.
- host and port
  If no domain name is configured, you can specify the host and port of the documentation center instance directly.
3.4 PostgreSQL
portal:
  postgresql:
    host: "<postgresql host>"
    port: "<postgresql port>"
    database: "<postgresql csghub portal database>"
    user: "<postgresql user>"
    password: "<postgresql password>"
    timezone: "Etc/UTC"
    sslmode: "prefer"
This parameter defines the database connection information for the Portal. Compared to global.postgresql.external, it adds a database parameter. That parameter cannot be specified globally, because using the same database for all components is discouraged and the Helm chart does not support it internally.
Standard parameter settings are not detailed here.
Priority:
global.postgresql.external < portal.postgresql
3.5 ObjectStore
portal:
  objectStore:
    endpoint: "<object store endpoint>"
    accessKey: "<object store access key>"
    secretKey: "<object store secret key>"
    bucket: "<object store public bucket>"
    region: "<object store region>"
    secure: "<object store tls>"
    encrypt: "<object store server encrypt>"
    pathStyle: "<object store path style>"
This parameter defines the object storage connection information for the Portal. Compared to global.objectStore.external, it adds a bucket parameter. That parameter cannot be specified globally, because using the same bucket for all components is not recommended and the Helm chart does not support it internally.
Priority:
global.objectStore.external < portal.objectStore
4. Server
4.1 gitlabShell
server:
  gitlabShell:
    sshPort: 22
This defines the port number for the SSH service when cloning using git over ssh. The default port is 22 in LoadBalancer mode and 30022 in NodePort mode. Modifying this port is generally not recommended, as it involves adjusting the gateway API Controller's TCP exposure rules.
4.2 multiSync
server:
  multiSync:
    enabled: true
    proxy: "<proxy to connect internet>"
- enabled
  Default: true, which enables multi-source synchronization.
- proxy
  Default: nil. Specifies the network proxy used to reach the Internet during multi-source synchronization.
4.3 SwaggerAPI
server:
  swaggerAPI:
    enabled: false
- enabled
  Default: false, which disables the Swagger API helper instance.
5. RProxy
rproxy:
  coredns:
    enabled: true
    image:
      repository: "coredns/coredns"
      tag: "1.11.1"
  nginx:
    enabled: true
    image:
      repository: "nginx"
      tag: "latest"
This section is not explained in detail. CoreDNS and Nginx were used in versions prior to v1.12.0 to assist rproxy with traffic forwarding. Starting with v1.12.0, both components are deprecated and no longer used.
6. Notifier
6.1 SMTP
notifier:
  smtp:
    host: "<smtp host>"
    port: "<smtp port>"
    username: "<smtp username>"
    password: "<smtp password>"
Configure the notifier mail server.
6.2 FeiShu
notifier:
  feiShu:
    appId: "<feishu app id>"
    appSecret: "<feishu app secret>"
Configure the notifier to send notifications to Lark.
7. Runner (Chart built-in)
The configuration in this section is passed directly to the Runner sub-chart.
- region
  Default: region-0. Identifies the cluster where the runner resides, e.g., "cn-north". The format and naming rules are up to you.
- interval
  Default: 60 (seconds). The interval at which the Runner reports information to CSGHub.
- namespace
  Default: spaces. The Kubernetes namespace used for deploying inference, fine-tuning, and application spaces.
- autoConfigure (Deprecated)
  Default: true. Specifies whether to automatically configure dependent components such as Knative Serving, Argo Workflow, and LeaderWorkerSet (LWS). These components are essential for inference, fine-tuning, model evaluation, application spaces, and MCP.
- mergingNamespace
  Default: disable. By default, with autoConfigure enabled, different types of components automatically create different Kubernetes namespaces. This parameter allows those namespaces to be merged.
  - disable
    Do not perform any namespace merging.
  - multi
    Merge namespaces appropriately.
  - single
    Merge all resources into a single namespace (not recommended).
- kymlMode (Deprecated)
  Default: create. Controls how resources created by autoConfigure are maintained.
  - create
    Create only. Skip if the resource already exists.
  - update
    Update resources using Apply mode.
  - replace
    Force replacement of resources, deleting and then recreating them.
- userPublicDomain
  Default: true. Specifies how inference, fine-tuning, application space, and other instances are accessed.
  - true: Uses a separate domain name.
  - false: Uses subPath access, which may restrict the use of application space, MCP, and other features.
- pipIndexUrl
  Default: https://pypi.tuna.tsinghua.edu.cn/simple/. Defines the PyPI source used when building the application space image.
- extraBuildArgs
  Default: nil. Specifies additional arguments for building images with Kaniko.
- modelRegistry
  Default: nil. Specifies the container image registry from which images for a given architecture are pulled when starting an inference instance. OpenCSG ACR is used by default.
- knative.serving.domain
  Default: example.com. Defines the default internal domain name for exposing the ksvc service. No DNS resolution needs to be configured; it is used only for internal routing.
- rbac
  - create
    Default: true. Specifies whether to create the Kubernetes permissions the runner requires to create related resources.
- logcollector
  - enabled
    Default: false. Specifies whether to enable the logcollector service. Enable it if you want to retain ksvc instance logs for the past 7 days.
  - loki.address
    Default: nil. Defines the address of the loki service used to store logs. If not set, the csghub loki instance is used.
- gpuModelLabel
  - typeLabel
    Default: nvidia.com/gpu.product. The scheduler can use this label to determine the GPU type of a node and so schedule Pods that require a specific GPU type.
  - capacityLabel
    Default: nvidia.com/gpu. A Pod can reference this label in resources.requests or resources.limits to request GPUs.
- applicationEndpoint
  Default: auto. By default, the program automatically generates http://kourier-internal.kourier-system.svc.cluster.local, the address at which the Knative Serving ingress is exposed.
- networkInterface
  Default: nil. Specifies the network interface used for communication when deploying multi-machine, multi-GPU inference instances with LWS.
- storageClassName
  Default: nil. Specifies the storage class for persistent storage requests, such as those from Space.
- knative.serving.autoscaler
  - enableScaleToZero
    Default: true. Enables automatic shutdown of KSVC service instance Pods.
  - scaleToZeroGracePeriod
    Default: 60m. Sets the grace period before KSVC service instance Pods are automatically shut down.
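The Runner parameters above might be combined as in the following sketch. The top-level runner key is an assumption about where the sub-chart values live in the parent chart, and the nesting of logcollector.loki.address as a mapping is likewise assumed; the parameter names and values themselves come from the list above.

```yaml
runner:                          # assumed top-level key for the Runner sub-chart
  region: "cn-north"             # custom cluster identifier
  interval: 60                   # seconds between reports to CSGHub
  namespace: "spaces"
  mergingNamespace: "multi"      # one of: disable, multi, single
  logcollector:
    enabled: true                # retain ksvc instance logs
    loki:
      address: "<loki address>"  # leave unset to use the csghub loki instance
  knative:
    serving:
      autoscaler:
        enableScaleToZero: true
        scaleToZeroGracePeriod: "60m"
```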
8. Dataflow (Chart built-in)
Dataflow is a data processing tool. The Dataflow Helm Chart can be deployed independently or bundled with the CSGHub Helm Chart (by setting global.chartContext: true). The chart includes the following components:
- dataflow
- label studio
- Celery worker
- PostgreSQL (disabled by default when bundled)
- Redis (disabled by default when bundled)
- MongoDB
- envoyGateway (disabled by default when bundled)
- Prometheus (disabled by default when bundled)
Dataflow is enabled via dataflow.enabled. By default it is installed together with the CSGHub Helm Chart and requires no additional configuration. The following settings can currently be customized:
dataflow:
  enabled: true
  dataflow:
    image: {}
    postgresql: {}
    redis: {}
    mongo: {}
    persistence: {}
  labelStudio:
    image: {}
    postgresql: {}
    persistence: {}
All parameter definition rules are the same as those described above.
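For instance, a persistence override for the dataflow component might look like the sketch below. It assumes that dataflow.dataflow.persistence accepts the same keys as global.persistence and that only the keys set there are overridden; the 50Gi size is an arbitrary illustrative value.

```yaml
global:
  persistence:
    storageClass: "hostpath"
    accessModes: ["ReadWriteOnce"]
    size: "10Gi"

dataflow:
  enabled: true
  dataflow:
    persistence:
      size: "50Gi"   # illustrative value; assumed to take priority over global.persistence.size
```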