Downloading Datasets

CSGHub currently supports four ways to download datasets: Git, the web interface, the command line, and the SDK. The steps for each method are described below.

Downloading Datasets Using Git

  • Downloading dataset repositories using HTTP:
git lfs install
git clone https://www.opencsg.com/datasets/demo/test_dataset_1.git
  • Downloading dataset repositories using SSH:
git lfs install
git clone git@hub.opencsg.com:datasets_demo/test_dataset_1.git

You will need to add your SSH public key to your user settings to push changes or access private repositories.
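If the clone needs to run from a script rather than by hand, the two commands above can be wrapped as in the following minimal sketch. The helper names `clone_commands` and `clone_dataset` are hypothetical and not part of CSGHub; the sketch only assumes that `git` and `git-lfs` are installed and on the `PATH`.

```python
import subprocess


def clone_commands(clone_url, dest=None):
    """Return the git commands from the steps above as argument lists."""
    clone = ["git", "clone", clone_url]
    if dest is not None:
        # Optional target directory for the working copy.
        clone.append(dest)
    return [["git", "lfs", "install"], clone]


def clone_dataset(clone_url, dest=None):
    """Run the commands in order; raises CalledProcessError if one fails."""
    for cmd in clone_commands(clone_url, dest):
        subprocess.run(cmd, check=True)
```

Calling `clone_dataset("git@hub.opencsg.com:datasets_demo/test_dataset_1.git")` would then perform the SSH clone shown above; passing the HTTP URL instead works the same way.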

Downloading Files Using Web Interface

Click the download button under the Files tab to download the file directly.

Downloading Files Using Command Line

The csghub-cli command-line tool makes downloading data straightforward. Install it as follows:

pip install csghub-sdk

Here is an example of how to download a dataset:

export CSG_TOKEN=your_access_token

# download dataset
csghub-cli download demo/test_dataset -t dataset

Downloading Files Using SDK

The CSGHub SDK provides a Python library, so you can download files from code.

Here is an example of how to download a dataset:

from pycsghub.snapshot_download import snapshot_download

token = "xxxx"  # your access token
endpoint = "https://hub.opencsg.com"
repo_id = "AIWizards/tmmluplus"
repo_type = "dataset"
cache_dir = "/Users/xiangzhen/Downloads/"  # local download directory; use your own path

# Download the dataset repository into cache_dir
result = snapshot_download(repo_id, repo_type=repo_type, cache_dir=cache_dir, endpoint=endpoint, token=token)
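To avoid hard-coding the token in a script, the call above can be wrapped as in the following minimal sketch. It assumes the token is stored in the `CSG_TOKEN` environment variable, which the command-line example uses; whether the SDK reads that variable on its own is not stated here, so the sketch passes it explicitly. The helpers `build_download_kwargs` and `download_dataset` are hypothetical names, not part of the SDK.

```python
import os

DEFAULT_ENDPOINT = "https://hub.opencsg.com"  # endpoint used in the example above


def build_download_kwargs(repo_id, cache_dir, env=None):
    """Collect the snapshot_download arguments (hypothetical helper).

    Assumes the access token is stored in the CSG_TOKEN environment variable.
    """
    env = os.environ if env is None else env
    token = env.get("CSG_TOKEN")
    if not token:
        # This sketch treats a missing token as an error, which may be
        # stricter than necessary for public datasets.
        raise RuntimeError("CSG_TOKEN is not set")
    return {
        "repo_id": repo_id,
        "repo_type": "dataset",
        "cache_dir": cache_dir,
        "endpoint": DEFAULT_ENDPOINT,
        "token": token,
    }


def download_dataset(repo_id, cache_dir):
    """Download a dataset snapshot using the arguments built above."""
    # Imported lazily so the helper above works without the SDK installed.
    from pycsghub.snapshot_download import snapshot_download

    kwargs = build_download_kwargs(repo_id, cache_dir)
    return snapshot_download(kwargs.pop("repo_id"), **kwargs)
```

With `CSG_TOKEN` exported in the shell, `download_dataset("AIWizards/tmmluplus", "./data")` would make the same call as the example above, using `./data` as the cache directory.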