Downloading Datasets
CSGHub currently supports four ways to download datasets: Git, the web interface, the command line, and the SDK. The steps for each method are described below.
Downloading Datasets Using Git
- Downloading dataset repositories using HTTP:
git lfs install
git clone http://101.200.14.180/datasets/opencsg/dataset123.git
- Downloading dataset repositories using SSH:
git lfs install
git clone ssh://git@localhost:2222/datasets/opencsg/dataset123.git
Note: You need to add your SSH public key to your user settings in order to push changes or access private repositories. Click on "Account Settings" in the top right corner and go to "SSH Keys" to add your public key.
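The two clone URLs above follow the same pattern: a host, the fixed `datasets/` prefix, then the namespace and repository name. As a minimal sketch, the pattern can be reproduced in Python when many repositories need to be cloned from a script. The helper names `dataset_clone_urls` and `clone_dataset` are hypothetical and not part of CSGHub, and the URL layout is inferred only from the two examples above.

```python
import subprocess


def dataset_clone_urls(host, namespace, name, ssh_host=None, ssh_port=None):
    """Build HTTP and SSH clone URLs that mirror the examples above.

    The layout (datasets/<namespace>/<name>.git) is an assumption taken
    from the example URLs, not from a documented CSGHub contract.
    """
    path = f"datasets/{namespace}/{name}.git"
    ssh_authority = ssh_host or host
    if ssh_port is not None:
        ssh_authority = f"{ssh_authority}:{ssh_port}"
    return {
        "http": f"http://{host}/{path}",
        "ssh": f"ssh://git@{ssh_authority}/{path}",
    }


def clone_dataset(url, destination):
    """Run `git lfs install` and then `git clone`, raising on failure."""
    subprocess.run(["git", "lfs", "install"], check=True)
    subprocess.run(["git", "clone", url, destination], check=True)


urls = dataset_clone_urls(
    "101.200.14.180", "opencsg", "dataset123",
    ssh_host="localhost", ssh_port=2222,
)
print(urls["http"])
print(urls["ssh"])
```

Calling `clone_dataset(urls["http"], "dataset123")` would then perform the same two Git commands shown above.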
Downloading Files Using Web Interface
Click the download button under the Files tab to download the file directly.
Downloading Files Using Command Line
Use the command-line tool csghub-cli to download data. Install it as follows:
pip install csghub-sdk
Here is an example of how to download a dataset:
export CSG_TOKEN=your_access_token
# download dataset
csghub-cli download demo/test_dataset -t dataset
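If the download needs to run from a Python script rather than an interactive shell, the same command can be launched with `subprocess`. The following is a minimal sketch that uses only the arguments shown above (`download`, the repository ID, and `-t`). The function names are hypothetical, and passing the token through the `CSG_TOKEN` environment variable simply mirrors the `export` line in the shell example.

```python
import os
import subprocess


def build_download_command(repo_id, repo_type="dataset"):
    """Return the argument list for the csghub-cli call shown above."""
    return ["csghub-cli", "download", repo_id, "-t", repo_type]


def download_with_cli(repo_id, token, repo_type="dataset"):
    """Run csghub-cli with CSG_TOKEN set only for the child process."""
    env = dict(os.environ, CSG_TOKEN=token)
    subprocess.run(build_download_command(repo_id, repo_type), env=env, check=True)


print(build_download_command("demo/test_dataset"))
```

Setting the token in the child environment keeps it out of the command line, where other users on the machine could otherwise see it in the process list.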
Downloading Files Using SDK
The CSGHub SDK provides a Python library, so you can download files from code.
Here is an example of how to download a dataset:
from pycsghub.snapshot_download import snapshot_download
token = "xxxx"  # your access token
endpoint = "https://hub.opencsg.com"
repo_id = "AIWizards/tmmluplus"
repo_type = "dataset"
cache_dir = "/Users/xiangzhen/Downloads/"  # local directory to download into
result = snapshot_download(
    repo_id,
    repo_type=repo_type,
    cache_dir=cache_dir,
    endpoint=endpoint,
    token=token,
)
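The example above hard-codes the token. A variation, sketched below under stated assumptions, reads the token from the `CSG_TOKEN` environment variable and then lists what ended up in the cache directory. `download_dataset` and `list_downloaded_files` are hypothetical helper names; only `snapshot_download` and the keyword arguments shown above come from the SDK example. The sketch lists files by walking `cache_dir` and makes no assumption about what `snapshot_download` returns.

```python
import os


def list_downloaded_files(root):
    """Return all file paths under root, relative to root, sorted."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, root))
    return sorted(found)


def download_dataset(repo_id, cache_dir, endpoint="https://hub.opencsg.com"):
    """Download a dataset with a token taken from CSG_TOKEN, then list the files."""
    # Imported lazily so list_downloaded_files works without the SDK installed.
    from pycsghub.snapshot_download import snapshot_download

    token = os.environ.get("CSG_TOKEN")
    if not token:
        raise RuntimeError("Set the CSG_TOKEN environment variable first")
    snapshot_download(
        repo_id,
        repo_type="dataset",
        cache_dir=cache_dir,
        endpoint=endpoint,
        token=token,
    )
    return list_downloaded_files(cache_dir)
```

For example, `download_dataset("AIWizards/tmmluplus", "/tmp/csghub-cache")` would fetch the dataset used above into a cache directory of your choice.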
Multi-source Synchronization of Datasets
In the open-source version of CSGHub, you can browse datasets from the remote OpenCSG community. Open a project and click the sync button to synchronize the dataset to your local server.
For more details, refer to the Multi-source Synchronization of Models section.
Check the video tutorial for more details: