mateuf.blogg.se

Shipping container construction carnegie mellon

Containers are widely used for resource management in datacenters. A common practice for supporting deep learning (DL) training in container clouds is to statically bind entire GPUs to containers. Because DL jobs in production have diverse resource demands, a significant number of those GPUs are underutilized. As a result, GPU clusters have low GPU utilization, which leads to long job completion times because of queueing.

We present TGS (Transparent GPU Sharing), a system that provides transparent GPU sharing to DL training in container clouds. In stark contrast to recent application-layer solutions for GPU sharing, TGS operates at the OS layer beneath containers. This transparency lets users develop models and run jobs in their containers with any software they choose. TGS leverages adaptive rate control and transparent unified memory to simultaneously achieve high GPU utilization and performance isolation. It ensures that production jobs are not greatly affected by opportunistic jobs on shared GPUs. We have built TGS and integrated it with Docker and Kubernetes.
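To make the idea of adaptive rate control more concrete, here is a minimal sketch of a feedback loop that throttles an opportunistic job whenever the production job sharing the GPU slows down. It is not TGS's actual algorithm: the additive-increase/multiplicative-decrease policy is a stand-in chosen for illustration, and the class name `RateController`, its parameters, and their default values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RateController:
    """Illustrative controller; every name and default here is an assumption."""

    baseline: float          # production throughput measured when running alone
    tolerance: float = 0.95  # fraction of baseline the production job must keep
    step: float = 0.05       # additive increase applied when production is healthy
    backoff: float = 0.5     # multiplicative decrease applied on interference
    limit: float = 0.0       # share of GPU time granted to the opportunistic job

    def update(self, production_throughput: float) -> float:
        """Adjust the opportunistic job's share from one throughput sample."""
        if production_throughput < self.tolerance * self.baseline:
            # Production job is slowed down: back off quickly.
            self.limit *= self.backoff
        else:
            # Production job is unaffected: probe for more spare capacity.
            self.limit = min(1.0, self.limit + self.step)
        return self.limit


if __name__ == "__main__":
    ctrl = RateController(baseline=100.0)
    for sample in (100.0, 100.0, 80.0):
        print(f"production={sample:.0f} -> opportunistic share={ctrl.update(sample):.3f}")
```

Starting the opportunistic share at zero and growing it only while the production job holds its baseline reflects the stated goal that production jobs are not greatly affected; how a real system measures throughput and enforces the limit is outside the scope of this sketch.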






