XNet Smart Trunk Accelerator
Introduction to XNet
In the past, we primarily relied on Git LFS (Large File Storage) to manage large files. While it effectively addressed Git’s limitations with big binaries, it also came with clear drawbacks: every read or write operation had to be performed on the entire file, with no ability to load only part of it. As models and datasets continue growing in size, this “all-or-nothing” approach becomes increasingly inefficient and significantly slows down collaboration and iteration.
To better meet the needs of modern AI/ML workflows—characterized by massive models, large-scale datasets, and the need for efficient random access—we introduced XNet, a high-performance storage backend built specifically for contemporary AI development. It supports fine-grained data access, intelligent caching, and parallel transmission. While delivering substantial improvements in storage efficiency and developer experience, it remains fully compatible with existing toolchains. (Note: Git LFS remains supported to ensure a smooth transition for legacy projects.)
Core Capabilities
Unlike traditional large-file solutions such as Git LFS, XNet is designed from the ground up for AI/ML workflows. This brings several significant capability upgrades:
1. Intelligent Deduplication — Saves Space and Accelerates Transfers
- XNet automatically splits files into appropriately sized chunks based on their content and performs platform-wide deduplication. This leads to:
- Extremely small incremental uploads between different versions of a model or dataset
- Significant storage savings when reusing identical data across repositories or projects
- Faster and more efficient uploads, both for initial files and subsequent updates
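XNet's actual chunking algorithm is not described here, but the idea of content-defined chunking plus platform-wide deduplication can be sketched as follows. Everything in this snippet (the toy rolling hash, the `chunk`/`put` names, the in-memory `store`) is illustrative, not XNet's real implementation:

```python
import hashlib

def chunk(data: bytes, mask: int = 0x3FF, min_size: int = 256, max_size: int = 4096):
    """Split data at content-defined boundaries (toy rolling hash, for illustration)."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + b) & 0xFFFFFFFF  # rolling hash over the bytes seen so far
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

store = {}  # hash -> chunk bytes: a stand-in for platform-wide deduplicated storage

def put(data: bytes) -> list[str]:
    """Store a file as chunks; identical chunks are kept only once."""
    parts = chunk(data)
    ids = [hashlib.sha256(c).hexdigest() for c in parts]
    for cid, c in zip(ids, parts):
        store.setdefault(cid, c)  # already-known chunks cost no extra storage
    return ids  # the file is just an ordered list of chunk IDs
```

Because a file is recorded as an ordered list of chunk IDs, re-uploading identical data (within a repo or across repos) adds nothing to the store.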
2. Incremental Updates in Seconds — No More Full-File Reuploads
- Unlike LFS, which requires reuploading the full file, XNet uploads only the chunks that have changed. For frequently updated models or datasets, upload time can be reduced to seconds, dramatically improving development and experimentation efficiency.
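The changed-chunks-only behavior can be sketched like this (fixed-size chunks for brevity; the `server` set and `upload` function are illustrative, not XNet's API):

```python
import hashlib

CHUNK = 1024
server = set()  # chunk hashes the server already holds

def upload(data: bytes) -> int:
    """Send only chunks the server is missing; return how many were sent."""
    sent = 0
    for i in range(0, len(data), CHUNK):
        h = hashlib.sha256(data[i:i + CHUNK]).hexdigest()
        if h not in server:
            server.add(h)  # in reality: transfer the chunk bytes over the network
            sent += 1
    return sent
```

Uploading a 10-chunk file sends 10 chunks; after editing one chunk and uploading again, only the single changed chunk is transferred.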
3. High-Speed Parallel Downloads — Designed for Training and Inference
- With parallel and streaming-based block downloads, XNet fully utilizes available bandwidth. This makes it ideal for training clusters, inference services, and distributed workloads—ensuring data becomes “ready in seconds.”
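Parallel chunk retrieval is straightforward once files are addressed by chunk ID. A minimal sketch (with an in-memory `STORE` standing in for the remote service, and `fetch_chunk` standing in for an HTTP chunk request):

```python
from concurrent.futures import ThreadPoolExecutor

STORE = {"a": b"hello ", "b": b"world"}  # stand-in for remote chunk storage

def fetch_chunk(cid: str) -> bytes:
    """Stand-in for a network request for one chunk."""
    return STORE[cid]

def download(chunk_ids, workers: int = 8) -> bytes:
    """Fetch all chunks in parallel, then reassemble them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(fetch_chunk, chunk_ids))  # map preserves input order
    return b"".join(parts)
```

Because `pool.map` returns results in the order of the input IDs, reassembly is trivial even though the fetches complete out of order.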
4. Reliable, Verifiable, Shareable Data Integrity
- Each chunk is uniquely identified using a cryptographic hash, providing:
- End-to-end integrity checks to prevent silent data corruption
- Efficient local caching and sharing across multiple clients or nodes
- Fast synchronization and strong consistency across multi-region or multi-datacenter environments
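Since a chunk's identifier is a cryptographic hash of its content, any client or cache node can verify a chunk without trusting where it came from. A minimal sketch:

```python
import hashlib

def verify(chunk: bytes, expected_id: str) -> bool:
    """A chunk's ID is the hash of its content, so integrity is self-verifying."""
    return hashlib.sha256(chunk).hexdigest() == expected_id

# Any single-bit corruption changes the hash, so it is detected before use.
good = b"model weights"
good_id = hashlib.sha256(good).hexdigest()
```

This is what makes cross-node caching safe: a chunk fetched from a peer or a local cache passes the same check as one fetched from the origin.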
Together, these capabilities form the core of XNet’s advantage: it is no longer just a “large-file storage solution,” but a new high-performance, highly efficient, and highly reliable infrastructure layer for managing models and data in the AI era.
How to Use XNet
CSGHub continues to maintain compatibility with Hugging Face’s command-line tooling. Users who already have the HF CLI installed can immediately try out XNet features on the OpenCSG platform.
The csghub-sdk CLI is currently under testing and will soon provide full support for XNet storage.
- Install huggingface_hub. To upload and download files with huggingface_hub, first install it:
pip install -U huggingface_hub
- Set the HF_ENDPOINT Environment Variable
export HF_ENDPOINT="https://hub.opencsg.com/hf"
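If you drive huggingface_hub from a Python script rather than the shell, the same endpoint can be set via `os.environ` — set it before importing the library, since the endpoint is read when huggingface_hub is loaded:

```python
import os

# Must be set before `import huggingface_hub` anywhere in the process.
os.environ["HF_ENDPOINT"] = "https://hub.opencsg.com/hf"
```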
- Log In
hf auth login
When prompted for a token, use your Access Token, which can be found in your account settings.
- Upload Files
hf upload {user}/{repoName} # Upload the entire repository
hf upload {user}/{repoName} {fileName} # Upload a single file
Example:
hf upload demo/test1 Electric_Vehicle_Population_Data.zip
- Download Files
hf download {user}/{repoName} # Download the entire repository
hf download {user}/{repoName} {fileName} # Download a single file
Example:
hf download demo/test1
hf download demo/test1 Electric_Vehicle_Population_Data.zip