PyTorch
We believe there is no need to introduce you to PyTorch.
If you need more information, please refer to the official website.
How to run it
Don't forget to mount your data volume at the start.
The amount of CPU and RAM depends on the type and quantity of the chosen GPU.
It should be at least RAM ⩾ sum(vRAM) + 2 GB
- but remember, this is only a recommendation; you can always start small and grow with your problem.
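The sizing rule above can be sketched as a small helper. The GPU count and vRAM figure below are illustrative assumptions (an 8× A100 80 GB setup), not platform requirements:

```python
# Rough helper for the sizing rule above: RAM >= sum(vRAM) + 2 GB.
def recommended_ram_gb(vram_per_gpu_gb: int, gpu_count: int, headroom_gb: int = 2) -> int:
    """Return the suggested minimum RAM in GB for a compute instance."""
    return vram_per_gpu_gb * gpu_count + headroom_gb

# Example: 8 x A100 80 GB cards -> at least 642 GB of RAM.
print(recommended_ram_gb(80, 8))  # 642
```

Note that the `-m 1024` in the command below comfortably exceeds this minimum.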
cgc compute create --name torch01 -c 40 -m 1024 -g 8 -gt A100 -v data_volume nvidia-pytorch
Shared Memory
Shared memory is a critical parameter when using multiprocessing, particularly with methods like nn.DataParallel that rely on shared memory for inter-process communication.
Default Allocation
By default, the allocated shared memory is set to 64 MB, which might be insufficient when using multiple workers concurrently.
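You can verify the current shared-memory allocation from inside a running container; a minimal sketch using only the standard library (assumes a Linux environment where shared memory is backed by `/dev/shm`):

```python
# Report the size of the shared-memory filesystem (/dev/shm on Linux).
import shutil

usage = shutil.disk_usage("/dev/shm")
print(f"/dev/shm total: {usage.total / 2**20:.0f} MiB, free: {usage.free / 2**20:.0f} MiB")
```

If the total reported here is only 64 MiB, heavy multi-worker data loading may fail with "bus error" or "out of shared memory" style crashes.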
Optimizing for AI Model Training
For training AI models, it is recommended to increase the allocated shared memory. To accomplish this, include the --shm flag when executing the cgc compute create command.
Syntax for Modifying Shared Memory Allocation
The basic syntax to modify the shared memory allocation is as follows:
The --shm parameter requires a value of <size_in_GB> in gigabytes.
cgc compute create --name torch01 -c 40 -m 1024 -g 8 -gt A100 -v data_volume nvidia-pytorch --shm 1
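To illustrate what the extra shared memory is used for, here is a minimal sketch of shared-memory inter-process communication using only Python's standard library; PyTorch worker processes pass tensor data through the same `/dev/shm` mechanism. The segment name and size are illustrative:

```python
# Minimal sketch of shared-memory IPC between two processes (stdlib only).
from multiprocessing import Process, shared_memory

def worker(name: str) -> None:
    # Attach to the existing segment by name and write into it.
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[0] = 42
    shm.close()

if __name__ == "__main__":
    # Parent creates the segment; the child process writes to it.
    shm = shared_memory.SharedMemory(create=True, size=16)
    p = Process(target=worker, args=(shm.name,))
    p.start()
    p.join()
    print(shm.buf[0])  # value written by the child process
    shm.close()
    shm.unlink()
```

Every such segment lives in the container's shared-memory allocation, which is why a 64 MB default is quickly exhausted by multiple data-loading workers.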