Docker DRL

docker / nvidia-docker install

Install Docker:

Install the NVIDIA driver (see the earlier post).

Install nvidia-docker (required for GPU access): https://github.com/NVIDIA/nvidia-docker

Registry mirrors for faster pulls in China

Mirror speed test: https://github.com/silenceshell/docker_mirror and https://blog.csdn.net/CSDN_duomaomao/article/details/73161076
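A mirror is usually configured through the `registry-mirrors` key in /etc/docker/daemon.json. The sketch below merges a mirror URL into an existing config without dropping other keys. The helper name `add_registry_mirror` and the mirror URL are placeholders for illustration, not values from these notes.

```python
import json


def add_registry_mirror(config_text: str, mirror_url: str) -> str:
    """Return daemon.json content with mirror_url added to registry-mirrors."""
    # An empty or missing file is treated as an empty config.
    config = json.loads(config_text) if config_text.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical mirror URL; substitute the fastest one from the speed test.
    print(add_registry_mirror("{}", "https://mirror.example.com"))
```

Parsing the file with `json.loads` fails loudly on malformed input, which helps because a syntax error in daemon.json can keep the daemon from starting. Docker has to be restarted after the file changes.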

After too many start attempts in a row, Docker reported an error:

See "systemctl status docker.service" and "journalctl -xe" for details.

docker.socket: Failed with result 'service-start-limit-hit'

https://github.com/docker/for-linux/issues/162 Removing /var/lib/docker (rm -rf /var/lib/docker) and restarting Docker solved the problem. Note that this deletes all local images, containers, and volumes.

https://forum.manjaro.org/t/docker-service-cant-start-solved/93410/4

sudo journalctl --no-hostname --no-pager -b -u docker.service

https://forum.manjaro.org/t/cant-start-docker-process/35164/5

pytorch image

https://hub.docker.com/r/pytorch/pytorch/

By default the container only uses the CPU. To use the GPU (https://github.com/NVIDIA/nvidia-docker):

docker run --gpus all
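A quick way to confirm that the flag works is to run a one-line PyTorch check inside the container. The sketch below only builds the command. It assumes the `pytorch/pytorch` image from the Docker Hub page above with its default tag, and the helper name `gpu_check_command` is made up for illustration.

```python
import shlex


def gpu_check_command(image: str = "pytorch/pytorch") -> list:
    """Build a `docker run` command that reports whether PyTorch sees a GPU."""
    check = "import torch; print(torch.cuda.is_available())"
    return [
        "docker", "run", "--rm",   # remove the container when it exits
        "--gpus", "all",           # expose all host GPUs to the container
        image,
        "python", "-c", check,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable shell command; subprocess.run(cmd) would execute it.
    print(shlex.join(gpu_check_command()))
```

If the check prints False, the NVIDIA driver or the nvidia-docker setup on the host is the first thing to inspect.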

tensorflow image

performance test

CPU and GPU timings were almost the same.

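import time

import torch

print(torch.cuda.is_available())
print(torch.__version__)

N = 20000
a = torch.rand(N, N)
b = torch.rand(N, N)

# CPU matrix multiply
t1 = time.time()
torch.mm(a, b)
t2 = time.time()
print(f'mm cost: {t2 - t1}')

# Host-to-device copy
t1 = time.time()
ac = a.cuda()
bc = b.cuda()
torch.cuda.synchronize()  # wait until both copies have finished
t2 = time.time()
print(f'to cuda cost: {t2 - t1}')

# GPU matrix multiply; reset the timer so the copy time is not included
t1 = time.time()
r = torch.mm(ac, bc)
torch.cuda.synchronize()  # CUDA kernels run asynchronously, so wait for the result
t2 = time.time()
print(f'cuda mm cost: {t2 - t1}')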
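Single-shot timings like the ones above can be misleading, since the first CUDA call also pays one-time initialization costs. A minimal sketch of a reusable timer with a warm-up run and an optional synchronization hook follows. `time_it` and its parameters are hypothetical names, not part of PyTorch.

```python
import time


def time_it(fn, sync=None, warmup=1, repeats=3):
    """Return the best wall-clock time in seconds of fn() over `repeats` runs."""
    for _ in range(warmup):
        fn()                      # warm-up run, not timed
    if sync is not None:
        sync()                    # drain pending work before timing starts
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()                # wait for asynchronous work to finish
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # CPU-only demonstration; with PyTorch one could pass
    # fn=lambda: torch.mm(ac, bc) and sync=torch.cuda.synchronize.
    print(f"sum cost: {time_it(lambda: sum(range(100_000))):.6f}")
```

Taking the minimum over a few runs reduces noise from other processes on the machine.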

python

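Install Jupyter

The IP must be specified explicitly. By default Jupyter listens only on localhost, so it is not reachable from outside the container: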

jupyter notebook --port=8889 --ip=0.0.0.0
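To verify that the notebook is reachable, a plain TCP connection test is enough. The sketch below assumes the container port has also been published to the host (for example with `-p 8889:8889` on `docker run`). The helper name `is_port_open` and the address used are placeholders.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # "127.0.0.1" is a placeholder; use the server's address when checking remotely.
    print(is_port_open("127.0.0.1", 8889))
```

A False result from another machine while the check succeeds on the server itself usually points to the bind address, the port mapping, or a firewall.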

Remote access to jupyter notebook

Jupyter extensions

OpenAI Gym

Keeping ssh from disconnecting automatically

Private registry

Warning: It’s not possible to use an insecure registry with basic authentication.

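The filesystem storage path is a directory inside the container, so a host directory still has to be mounted with -v. Otherwise the data is lost when the container is recreated after a restart.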
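The mount can be sketched as follows. This assumes the official `registry` image, which stores its data under /var/lib/registry and listens on port 5000 inside the container. The host directory and the helper name `registry_run_command` are hypothetical.

```python
import shlex


def registry_run_command(host_dir: str, port: int = 5000) -> list:
    """Build a `docker run` command for a registry with persistent storage."""
    return [
        "docker", "run", "-d",
        "-p", f"{port}:5000",                    # host port -> registry port in the container
        "-v", f"{host_dir}:/var/lib/registry",   # keep image data on the host
        "--name", "registry",
        "registry",
    ]


if __name__ == "__main__":
    # Hypothetical host directory; choose one with enough disk space.
    print(shlex.join(registry_run_command("/opt/data/registry")))
```

With the data on the host, the container can be removed and started again without losing pushed images.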

Committing an image

https://yeasy.gitbooks.io/docker_practice/image/commit.html