Deploy deepseek-ocr

deepseek-ocr

Reference Documentation

http://github.com/deepseek-ai/DeepSeek-OCR?tab=readme-ov-file

First clone the project and the model files, then set up the conda environment.

cd /data/tangjianing/.data_mapping
sudo git clone https://github.com/deepseek-ai/DeepSeek-OCR.git
sudo git clone https://www.modelscope.cn/deepseek-ai/DeepSeek-OCR.git /data/tangjianing/.data_mapping/deepseek-ocr
sudo chown -R tangjianing:tangjianing /data/tangjianing/.data_mapping/deepseek-ocr
sudo yum install git-lfs
cd /data/tangjianing/.data_mapping/deepseek-ocr
git lfs pull
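After `git lfs pull` it can be worth checking that the large files were really downloaded. In the container listing in this post, model-00001-of-000001.safetensors is 135 bytes and tokenizer.json is 132 bytes, which suggests they may still be Git LFS pointer stubs rather than real content. The sketch below is one way to check; the function names `is_lfs_pointer` and `list_lfs_pointers` are hypothetical helpers, not part of DeepSeek-OCR or git-lfs.

```shell
# Hypothetical helper: succeeds if the file is still a Git LFS pointer stub.
# Pointer stubs are small text files whose first line names the LFS spec.
is_lfs_pointer() {
    head -c 200 "$1" 2>/dev/null | head -n 1 \
        | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Hypothetical helper: print every pointer stub found directly in a directory.
list_lfs_pointers() {
    lfs_dir=$1
    for lfs_file in "$lfs_dir"/*; do
        [ -f "$lfs_file" ] || continue
        if is_lfs_pointer "$lfs_file"; then
            printf '%s\n' "$lfs_file"
        fi
    done
}

# Usage (path taken from the steps above):
# list_lfs_pointers /data/tangjianing/.data_mapping/deepseek-ocr
```

If the usage line prints any file names, those files have not been fetched yet, and `git lfs pull` should be run again in that directory.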

Create the virtual environment

sudo docker exec -it ubuntu-container-wyq /bin/bash

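conda create -n deepseek-ocr python=3.12.9 -y
conda activate deepseek-ocr
# Alternative Python version: this recreates the same environment, so run only one of the two create commands
conda create -n deepseek-ocr python=3.11 -y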
cd /ubuntu-22.04/deepseek-ocr1/
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu118
pip install vllm-0.8.5+cu118-cp38-abi3-manylinux1_x86_64.whl
pip install -r requirements.txt
pip install flash-attn==2.7.3 --no-build-isolation

Note: a version conflict with CUDA 12.2 is still being resolved (the packages installed above are cu118 builds).
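The post does not say which component conflicts, so the following is only a sketch of how the mismatch could be made visible: it compares the CUDA tag embedded in the wheel filename with a tag derived from the CUDA version. The helper names `wheel_cuda_tag` and `cuda_version_to_tag` are hypothetical, and the value 12.2 is hard-coded from the note above rather than detected.

```shell
# Hypothetical helper: extract the CUDA build tag (e.g. cu118) from a wheel filename.
wheel_cuda_tag() {
    printf '%s\n' "$1" | sed -n 's/.*+\(cu[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: turn a dotted CUDA version such as 12.2 into a tag such as cu122.
cuda_version_to_tag() {
    printf 'cu%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

wheel=vllm-0.8.5+cu118-cp38-abi3-manylinux1_x86_64.whl
wheel_tag=$(wheel_cuda_tag "$wheel")
host_tag=$(cuda_version_to_tag 12.2)   # assumption: 12.2 taken from the note, not detected

if [ "$wheel_tag" != "$host_tag" ]; then
    echo "wheel targets $wheel_tag, host reports $host_tag"
fi
```

A difference between the two tags is not necessarily the cause of the conflict; it only shows where the installed builds and the local CUDA version disagree.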

Issue: requirements.txt is missing inside the container

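Mount the host directory into the Docker container so that files are shared between host and container; changes made on the host show up inside the container immediately. Of the two listings here, the first is the container's view of the directory, which has no requirements.txt, and the second is the host's view, which does.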

(deepseek-ocr) root@dfba29c2bbc6:/data/tangjianing/.data_mapping/deepseek-ocr# ls -l /data/tangjianing/.data_mapping/deepseek-ocr/
total 209248
-rw-r--r--. 1 root root      1064 Nov 21 01:51 LICENSE
-rw-r--r--. 1 root root      6308 Nov 21 01:51 README.md
drwxr-sr-x. 2 root root       114 Nov 21 01:51 assets
-rw-r--r--. 1 root root      2666 Nov 21 01:51 config.json
-rw-r--r--. 1 root root        76 Nov 21 01:51 configuration.json
-rw-r--r--. 1 root root     10646 Nov 21 01:51 configuration_deepseek_v2.py
-rw-r--r--. 1 root root      9253 Nov 21 01:51 conversation.py
-rw-r--r--. 1 root root     38008 Nov 21 01:51 deepencoder.py
-rw-r--r--. 1 root root       135 Nov 21 01:51 model-00001-of-000001.safetensors
-rw-r--r--. 1 root root    246759 Nov 21 01:51 model.safetensors.index.json
-rw-r--r--. 1 root root     40133 Nov 21 01:51 modeling_deepseekocr.py
-rw-r--r--. 1 root root     82224 Nov 21 01:51 modeling_deepseekv2.py
-rw-r--r--. 1 root root       460 Nov 21 01:51 processor_config.json
-rw-r--r--. 1 root root       801 Nov 21 01:51 special_tokens_map.json
-rw-r--r--. 1 root root       132 Nov 21 01:51 tokenizer.json
-rw-r--r--. 1 root root    165938 Nov 21 01:51 tokenizer_config.json
-rw-r--r--. 1 root root 213618745 Apr 28  2025 vllm-0.8.5+cu118-cp38-abi3-manylinux1_x86_64.whl

[tangjianing@localhost ~]$ cd /data/tangjianing/.data_mapping/deepseek-ocr/
[tangjianing@localhost deepseek-ocr]$ ls -l
total 7440
drwxr-xr-x. 2 root root    4096 Nov 24 09:48 assets
drwxr-xr-x. 4 root root    4096 Nov 24 09:48 DeepSeek-OCR-master
-rw-r--r--. 1 root root 7591202 Nov 24 09:48 DeepSeek_OCR_paper.pdf
-rw-r--r--. 1 root root    1065 Nov 24 09:48 LICENSE
-rw-r--r--. 1 root root    7733 Nov 24 09:48 README.md
-rw-r--r--. 1 root root      93 Nov 24 09:48 requirements.txt
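With a bind mount, every path under the host directory has a counterpart under the container directory. The sketch below makes that mapping explicit for the host:container pair used in this post; `to_container_path` is a hypothetical helper written for illustration, not a Docker command.

```shell
# Hypothetical helper: translate a host path into the path seen inside the
# container, given the host side and container side of one bind mount.
to_container_path() {
    host_root=${1%/}
    container_root=${2%/}
    target=$3
    case "$target" in
        "$host_root")
            printf '%s\n' "$container_root"
            ;;
        "$host_root"/*)
            rest=${target#"$host_root"}
            printf '%s\n' "$container_root$rest"
            ;;
        *)
            echo "not under $host_root: $target" >&2
            return 1
            ;;
    esac
}

# Usage with the mapping from this post:
# to_container_path /data/tangjianing/.data_mapping/deepseek-ocr \
#     /ubuntu-22.04/deepseek-ocr1 \
#     /data/tangjianing/.data_mapping/deepseek-ocr/requirements.txt
```

For the usage line, the function prints /ubuntu-22.04/deepseek-ocr1/requirements.txt, which is where the pip install step expects to find the file inside the container.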

Specific solution

sudo vim docker-compose.yaml

# Add this entry under the container's volumes: section
- /data/tangjianing/.data_mapping/deepseek-ocr:/ubuntu-22.04/deepseek-ocr1
# Save and exit
:wq

sudo docker exec -it ubuntu-container-wyq /bin/bash
cd /ubuntu-22.04/
mkdir deepseek-ocr1
# Optionally delete the original directory inside the container
# rm -rf deepseek-ocr/

exit
docker compose up -d
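Once the container has been recreated, a quick check inside it can confirm that the mount now exposes the files the install step needs. This is a minimal sketch; `missing_files` is a hypothetical helper, and the file names passed to it are taken from the host listing in this post.

```shell
# Hypothetical helper: print the names of expected files that are absent from a directory.
missing_files() {
    check_dir=$1
    shift
    for name in "$@"; do
        [ -e "$check_dir/$name" ] || printf '%s\n' "$name"
    done
}

# Usage inside the container (path from the volume mapping above):
# missing=$(missing_files /ubuntu-22.04/deepseek-ocr1 requirements.txt README.md)
# if [ -z "$missing" ]; then echo "mount looks complete"; else echo "missing: $missing"; fi
```

An empty result means every listed file is visible at the mounted path, so `pip install -r requirements.txt` can be retried from that directory.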