Advanced Configuration

1. Configuration Description

By default, the containers started via Docker do not include the Runner or Dataflow services. They can, however, be configured to connect to externally deployed instances of these services.

  • Runner: The deployment executor for CSGHub. It depends on a Kubernetes cluster and can only be deployed within a Kubernetes environment.
  • Dataflow: A dataset processing tool used for generating and cleaning datasets.

2. Configuring Dataflow

This component serves as the backend service for dataset processing. Because of its architectural complexity, it is currently deployed only as a standalone service. You can connect the container to an independently deployed Dataflow instance by setting server.dataflow.address to the address of that instance.
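
Before pointing server.dataflow.address at an external instance, it can help to confirm that the address is well formed and reachable from the host running the container. The following is a minimal sketch using only the Python standard library. The helper names parse_address and is_reachable and the example address are hypothetical and not part of CSGHub, and the sketch assumes the address is given either as host:port or as an http(s) URL.

```python
import socket
from urllib.parse import urlsplit


def parse_address(address: str, default_port: int = 80) -> tuple[str, int]:
    """Split 'host:port' or 'http(s)://host[:port]' into (host, port).

    Hypothetical helper: the accepted address forms are an assumption,
    not something defined by CSGHub.
    """
    # urlsplit only recognises the host when a '//' separator is present.
    if "://" not in address:
        address = "//" + address
    parts = urlsplit(address)
    if not parts.hostname:
        raise ValueError(f"no host found in {address!r}")
    port = parts.port
    if port is None:
        # Fall back to the conventional port for the scheme.
        port = 443 if parts.scheme == "https" else default_port
    return parts.hostname, port


def is_reachable(address: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the address can be opened."""
    host, port = parse_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Example value only; substitute the address of your own Dataflow instance.
    candidate = "dataflow.example.internal:8000"
    print(candidate, "reachable:", is_reachable(candidate))
```

A successful TCP connection only shows that something is listening at that address. It does not verify that the listener is actually a Dataflow service.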

The following deployment options are currently available:

3. Configuring Runner

Because the Runner is tightly coupled to Kubernetes, it is provided only as a Helm Chart deployment.

4. Support and Feedback

Submit your feedback or issues via the project repository: