aws/deep-learning-containers: AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet.

Stars: 519
Watchers: 37
Forks: 293
Issues: 242

deep-learning-containers Recent Issues

| Issue Title | State | Comments | Created Date | Updated Date |
| --- | --- | --- | --- | --- |
| [feature-request] EKS test failures due to timeouts should report on cluster state and provide more info | open | 0 | 2022-09-01 | 2022-09-21 |
| [feature-request] Add basic input validation to manual scripts used for EKS cluster setup | open | 0 | 2022-09-01 | 2022-09-21 |
| Parameter model_name is required | open | 1 | 2022-09-01 | 2022-09-21 |
| [feature-request] Expose pytest options in testrunner.py as CLI options | open | 0 | 2022-08-31 | 2022-09-21 |
| [bug] Module import error using newer versions of pytest | open | 0 | 2022-08-31 | 2022-09-21 |
| [bug] New PyTorch inference image leads to inference job failure | open | 0 | 2022-08-25 | 2022-09-21 |
| [bug] Local testing in README.md not working with testrunner.py | open | 0 | 2022-08-23 | 2022-09-21 |
| [feature-request] Add tests for running inference on EKS using KServe | open | 0 | 2022-08-22 | 2022-09-21 |
| [feature-request] Add tests for using a single node with the KF training operator | open | 0 | 2022-08-22 | 2022-09-21 |
| [pending-change] Update version of Kubeflow in installation scripts | open | 0 | 2022-08-22 | 2022-09-21 |
| [bug] Huggingface model parallel training not working `smdistributed-modelparallel` | open | 0 | 2022-08-18 | 2022-09-18 |
| [feature-request] Tensorflow2 support on NVIDIA Triton Inference Containers | open | 2 | 2022-08-18 | 2022-09-18 |
| [feature-request] Support Python 3.9 | open | 1 | 2022-08-09 | 2022-09-15 |
| config.properties for hugging face containers | closed | 1 | 2022-08-08 | 2022-09-24 |
| [bug]Tensorflow Framework Container not working on Apple M1 | open | 0 | 2022-08-02 | 2022-09-15 |
| [feature-request] Have `detectron2` managed in the DLC available images | open | 0 | 2022-08-02 | 2022-09-17 |
| [bug] 'ImportError: libtorch_cuda_cu.so' while using detectron2 | open | 0 | 2022-08-02 | 2022-09-15 |
| [bug] Cannot use HuggingFace Neuron with custom script | closed | 1 | 2022-07-29 | 2022-09-15 |
| [bug]Multi-model endpoint creation fails on pytorch GPU inference images | open | 1 | 2022-07-28 | 2022-09-15 |
| [bug] Neuron/Inferencia image does not have drivers | closed | 5 | 2022-07-13 | 2022-09-23 |
| [bug] NaN list response from tensorflow-inference:2.4.1-cpu-py37-ubuntu18.04 | closed | 1 | 2022-06-30 | 2022-09-28 |
| [bug] Mxnet containers have high vulnerabilities found via ECR scanning | closed | 2 | 2022-06-28 | 2022-09-21 |
| [bug] Account-ids with prefix zero's (e.g. 027412998179) are failing in container build | open | 0 | 2022-06-15 | 2022-09-15 |
| Why torchvision without cu113? | closed | 2 | 2022-06-10 | 2022-09-26 |
| Pytorch elastic inference for pytorch >1.5 | open | 0 | 2022-06-09 | 2022-09-25 |
| No module named mpi4py [bug] | open | 0 | 2022-05-30 | 2022-09-28 |
| [bug] TF imagenet performance script does not respect timeout | open | 0 | 2022-05-24 | 2022-09-28 |
| [bug] OOM for GPU while using recommended batch_size | closed | 2 | 2022-05-19 | 2022-09-23 |
| [huggingface_tensorflow] Release Images for TF 2.8 with transformers 4.18.0 | open | 1 | 2022-04-26 | 2022-09-15 |
| [huggingface_tensorflow] Release Images for TF 2.8 | closed | 1 | 2022-04-26 | 2022-09-24 |
| [bug] apt update errors due to failing NVIDIA certificate verification | open | 2 | 2022-04-26 | 2022-09-23 |
| [documentation-request] Missing TF 2.8.0 image in the available DLC list | closed | 1 | 2022-04-20 | 2022-09-23 |
| [feature-request] Enable TorchServe env var overrides in PyTorch DLCs | open | 0 | 2022-04-18 | 2022-09-23 |
| [bug] Missed neuron-start.sh | open | 1 | 2022-04-15 | 2022-09-28 |
| [documentation-request] Suggest to add a note to refer to AWS Doc in the SM Training Compiler containers section | open | 0 | 2022-04-14 | 2022-09-15 |
| [bug] OpenCV is not installed properly for Inference container | open | 1 | 2022-04-13 | 2022-09-23 |
| [bug] Detectron2 errors when installing on PyTorch DLC | open | 9 | 2022-03-24 | 2022-09-23 |
| [bug] Torchaudio 0.10.0+cu113 import fails because libtorch_cuda_cpp.so cannot be found | open | 3 | 2022-03-11 | 2022-09-27 |
| [bug] Torch does not find GPU on pytorch-training:1.10.0-gpu-py38 container | open | 0 | 2022-03-06 | 2022-09-24 |
| [bug] Huggingface Pytorch 1.9.1 inference container not logging to CloudWatch | open | 6 | 2022-03-04 | 2022-09-24 |
| [bug] Docker File not building in scikit_bring_your_own example | open | 0 | 2022-02-17 | 2022-09-19 |
| Update license of tensorflow 2.7->2.8 | open | 1 | 2022-02-11 | 2022-09-22 |
| [feature-request] HuggingFace CPU training image (for convenient local mode testing) | open | 0 | 2022-02-08 | 2022-09-19 |
| [bug] Latest version of mkl and mkl-include do not work with PyTorch 1.9.1 | open | 0 | 2022-02-04 | 2022-09-18 |
| [bug] Tensorflow 2.x Serving / Inference GRPC overhead when transforming proto/tensor into python/numpy array | open | 0 | 2022-01-27 | 2022-09-21 |
| Custom CUDA versions | open | 0 | 2022-01-25 | 2022-09-26 |
| java.lang.IllegalArgumentException on Sagemaker Model/Endpoint | closed | 0 | 2022-01-21 | 2022-09-19 |
| [bug] PT1.10.0 DLC error loading torchvision.ops.nms() | open | 2 | 2022-01-20 | 2022-09-25 |
| [pending-change] Add test to validate DLC major version with production images | open | 0 | 2022-01-11 | 2022-09-23 |
| [bug] Using a Pytorch image seems to be causing an ArgParser bug -- "bash: cannot set terminal process group (-1): Inappropriate ioctl for device" | open | 2 | 2022-01-03 | 2022-09-28 |
