XNet Smart Trunk Accelerator
Introduction to XNet
In the past, we primarily relied on Git LFS (Large File Storage) to manage large files. While it effectively addressed Git’s limitations with big binaries, it also came with clear drawbacks: every read or write operation had to be performed on the entire file, with no ability to load only part of it. As models and datasets continue growing in size, this “all-or-nothing” approach becomes increasingly inefficient and significantly slows down collaboration and iteration.
To better meet the needs of modern AI/ML workflows—characterized by massive models, large-scale datasets, and the need for efficient random access—we introduced XNet, a high-performance storage backend built specifically for contemporary AI development. It supports fine-grained data access, intelligent caching, and parallel transmission. While delivering substantial improvements in storage efficiency and developer experience, it remains fully compatible with existing toolchains. (Note: Git LFS remains supported to ensure a smooth transition for legacy projects.)
Core Capabilities
Unlike traditional large-file solutions such as Git LFS, XNet is designed from the ground up for AI/ML workflows. This brings several significant capability upgrades:
1. Intelligent Deduplication — Saves Space and Accelerates Transfers
- XNet automatically splits files into appropriately sized chunks based on their content and performs platform-wide deduplication. This leads to:
- Extremely small incremental uploads between different versions of a model or dataset
- Significant storage savings when reusing identical data across repositories or projects
- Faster and more efficient uploads, both for initial files and subsequent updates
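XNet's actual chunking algorithm is not described here, but the idea of content-defined chunking plus platform-wide deduplication can be sketched as follows. Everything in this snippet (the toy rolling hash, the `chunk`/`put` names, the in-memory `store`) is illustrative, not XNet's real implementation:

```python
import hashlib

def chunk(data: bytes, mask: int = 0x3FF, min_size: int = 256, max_size: int = 4096):
    """Split data at content-defined boundaries (toy rolling hash, for illustration)."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + b) & 0xFFFFFFFF  # rolling hash over the bytes seen so far
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

store = {}  # hash -> chunk bytes: a stand-in for platform-wide deduplicated storage

def put(data: bytes) -> list[str]:
    """Store a file as chunks; identical chunks are kept only once."""
    parts = chunk(data)
    ids = [hashlib.sha256(c).hexdigest() for c in parts]
    for cid, c in zip(ids, parts):
        store.setdefault(cid, c)  # already-known chunks cost no extra storage
    return ids  # the file is just an ordered list of chunk IDs
```

Because a file is recorded as an ordered list of chunk IDs, re-uploading identical data (within a repo or across repos) adds nothing to the store.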
2. Incremental Updates in Seconds — No More Full-File Reuploads
- Unlike LFS, which requires reuploading the full file, XNet uploads only the chunks that have changed. For frequently updated models or datasets, upload time can be reduced to seconds, dramatically improving development and experimentation efficiency.
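The changed-chunks-only behavior can be sketched like this (fixed-size chunks for brevity; the `server` set and `upload` function are illustrative, not XNet's API):

```python
import hashlib

CHUNK = 1024
server = set()  # chunk hashes the server already holds

def upload(data: bytes) -> int:
    """Send only chunks the server is missing; return how many were sent."""
    sent = 0
    for i in range(0, len(data), CHUNK):
        h = hashlib.sha256(data[i:i + CHUNK]).hexdigest()
        if h not in server:
            server.add(h)  # in reality: transfer the chunk bytes over the network
            sent += 1
    return sent
```

Uploading a 10-chunk file sends 10 chunks; after editing one chunk and uploading again, only the single changed chunk is transferred.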
3. High-Speed Parallel Downloads — Designed for Training and Inference
- With parallel and streaming-based block downloads, XNet fully utilizes available bandwidth. This makes it ideal for training clusters, inference services, and distributed workloads—ensuring data becomes “ready in seconds.”
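Parallel chunk retrieval is straightforward once files are addressed by chunk ID. A minimal sketch (with an in-memory `STORE` standing in for the remote service, and `fetch_chunk` standing in for an HTTP chunk request):

```python
from concurrent.futures import ThreadPoolExecutor

STORE = {"a": b"hello ", "b": b"world"}  # stand-in for remote chunk storage

def fetch_chunk(cid: str) -> bytes:
    """Stand-in for a network request for one chunk."""
    return STORE[cid]

def download(chunk_ids, workers: int = 8) -> bytes:
    """Fetch all chunks in parallel, then reassemble them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(fetch_chunk, chunk_ids))  # map preserves input order
    return b"".join(parts)
```

Because `pool.map` returns results in the order of the input IDs, reassembly is trivial even though the fetches complete out of order.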
4. Reliable, Verifiable, Shareable Data Integrity
- Each chunk is uniquely identified using a cryptographic hash, providing:
- End-to-end integrity checks to prevent silent data corruption
- Efficient local caching and sharing across multiple clients or nodes
- Fast synchronization and strong consistency across multi-region or multi-datacenter environments
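Since a chunk's identifier is a cryptographic hash of its content, any client or cache node can verify a chunk without trusting where it came from. A minimal sketch:

```python
import hashlib

def verify(chunk: bytes, expected_id: str) -> bool:
    """A chunk's ID is the hash of its content, so integrity is self-verifying."""
    return hashlib.sha256(chunk).hexdigest() == expected_id

# Any single-bit corruption changes the hash, so it is detected before use.
good = b"model weights"
good_id = hashlib.sha256(good).hexdigest()
```

This is what makes cross-node caching safe: a chunk fetched from a peer or a local cache passes the same check as one fetched from the origin.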
Together, these capabilities form the core of XNet’s advantage: it is no longer just a “large-file storage solution,” but a new high-performance, highly efficient, and highly reliable infrastructure layer for managing models and data in the AI era.
How to Use XNet
CSGHub continues to maintain compatibility with Hugging Face’s command-line tooling. Users who already have the HF CLI installed can immediately try out XNet features on the OpenCSG platform.
The csghub-sdk CLI is currently under testing and will soon provide full support for XNet storage.
- Install huggingface_hub. To upload and download files with huggingface_hub, first install it:
pip install -U huggingface_hub
- Set the HF_ENDPOINT Environment Variable
export HF_ENDPOINT="https://hub.opencsg.com/hf"
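If you drive huggingface_hub from a Python script rather than the shell, the same endpoint can be set via `os.environ` — set it before importing the library, since the endpoint is read when huggingface_hub is loaded:

```python
import os

# Must be set before `import huggingface_hub` anywhere in the process.
os.environ["HF_ENDPOINT"] = "https://hub.opencsg.com/hf"
```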
- Log In
hf auth login
When prompted for a token, use your Access Token, which can be found in your account settings.
- Upload Files
hf upload {user}/{repoName} # Upload the entire repository
hf upload {user}/{repoName} {fileName} # Upload a single file
Example:
hf upload demo/test1 Electric_Vehicle_Population_Data.zip
- Download Files
hf download {user}/{repoName} # Download the entire repository
hf download {user}/{repoName} {fileName} # Download a single file
Example:
hf download demo/test1
hf download demo/test1 Electric_Vehicle_Population_Data.zip