Create a Personal Dataset
Before using DataFlow for data processing, you need to create a personal dataset on CSGHub. The dataset must be created via the backend API to ensure independent data management and smooth operations.
Register and Log In
Visit CSGHub and click the Login/Register button in the top-right corner to log in or create an account.
Obtain Access Token
Click your avatar, go to Settings, and generate an Access Token for API usage.
Create a Personal Dataset via API
Use Postman or the command line to create a dataset through the API.
Below is an example using the curl command. Replace <Your-Access-Token> and <Your Account Name> in the following command with your own values:
curl --location 'https://hub.opencsg.com/api/v1/datasets' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer <Your-Access-Token>' \
  --data '{
    "default_branch": "main",
    "description": "dataset examples",
    "labels": "a",
    "license": "MIT",
    "name": "dataflow-dataset",
    "namespace": "<Your Account Name>",
    "nickname": "dataflow-dataset",
    "private": false,
    "readme": "dataflow datasets need to be refined"
  }'
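If you create datasets often, the same request can be wrapped in a small script so the token and account name are not pasted into the command each time. The following is a minimal sketch, not an official CSGHub tool: the environment variable names CSGHUB_TOKEN and CSGHUB_NAMESPACE and the function names are assumptions made for illustration, and the payload fields are copied from the example above. It does no JSON escaping, so it assumes the dataset name and namespace contain no quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the curl call shown above.
# CSGHUB_TOKEN and CSGHUB_NAMESPACE are hypothetical variable names.

# Print the JSON request body for the given dataset name and namespace.
# Assumes both values contain no quotes or backslashes (no JSON escaping).
build_dataset_payload() {
  local name="$1" namespace="$2"
  printf '{"default_branch":"main","description":"dataset examples","labels":"a","license":"MIT","name":"%s","namespace":"%s","nickname":"%s","private":false,"readme":"dataflow datasets need to be refined"}' \
    "$name" "$namespace" "$name"
}

# Send the create request. --fail makes curl exit non-zero on HTTP 4xx/5xx,
# so a rejected token or a duplicate name is not silently ignored.
create_dataset() {
  local name="$1"
  curl --fail --silent --show-error \
    --location 'https://hub.opencsg.com/api/v1/datasets' \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer ${CSGHUB_TOKEN:?set CSGHUB_TOKEN first}" \
    --data "$(build_dataset_payload "$name" "${CSGHUB_NAMESPACE:?set CSGHUB_NAMESPACE first}")"
}

# Example (not run automatically):
#   export CSGHUB_TOKEN='<Your-Access-Token>'
#   export CSGHUB_NAMESPACE='<Your Account Name>'
#   create_dataset dataflow-dataset
```

Keeping the token in an environment variable rather than in the command itself also keeps it out of your shell history.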
Upload Dataset Files
After creation, visit your Profile to view your dataset.
From the dataset details page, click the Download Dataset button to clone the repository. Then copy your local dataset files into the repository folder. For example:
cd dataflow-dataset
cp -rf /work/my_dataset_dir/* .
git add .
git commit -m "commit message"
git push origin main
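Note that the `*` glob in the cp command above does not match hidden files (names starting with a dot), so those are not copied. The sketch below shows one way to copy a directory's full contents and to commit only when something changed. The function names stage_dataset_files and push_if_changed are hypothetical, and the sketch assumes the source directory is not itself a Git repository (otherwise its .git folder would be copied as well) and that the branch is main, as in the example above.

```shell
#!/usr/bin/env bash
# Sketch only: stage_dataset_files and push_if_changed are hypothetical helpers.

# Copy everything inside $1 into $2, including hidden files.
# "src/." copies the directory's contents, whereas "src/*" skips dotfiles.
stage_dataset_files() {
  local src="$1" repo="$2"
  [ -d "$src" ] || { echo "source directory not found: $src" >&2; return 1; }
  [ -d "$repo" ] || { echo "repository directory not found: $repo" >&2; return 1; }
  cp -R "$src"/. "$repo"/
}

# Stage all changes, then commit and push only if the working tree changed.
push_if_changed() {
  local repo="$1" message="$2"
  (
    cd "$repo" || exit 1
    git add .
    if [ -n "$(git status --porcelain)" ]; then
      git commit -m "$message" && git push origin main
    else
      echo "nothing to commit"
    fi
  )
}

# Example (not run automatically):
#   stage_dataset_files /work/my_dataset_dir dataflow-dataset
#   push_if_changed dataflow-dataset "add dataset files"
```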