About multi-ocr-sdk
Preview
Has everyone tried the recently released PaddleOCR-VL-1.5 yet? The results are truly impressive.
On January 29, 2026, we released PaddleOCR-VL-1.5. PaddleOCR-VL-1.5 not only raised the score on the OmniDocBench v1.5 evaluation set to 94.5%, but also adds support for irregular bounding-box localization, which gives it strong performance in real-world scenarios such as scanned documents, skewed or curved documents, screen captures, and complex lighting conditions. The model also adds seal recognition and joint text detection and recognition, and maintains its lead on key metrics.
Usage Tutorial - PaddleOCR Documentation
Those who haven't tried it yet can try it online: PaddleOCR - Document Parsing and Intelligent Text Recognition | API and MCP Service Support - PaddlePaddle AI Studio community (飞桨星河社区)
The image below is an official example. Recognition accuracy on such heavily distorted images is excellent, so standard images pose no challenge at all.
Accordingly, multi-ocr-sdk (GUI) has rapidly added support for PaddleOCR-VL-1.5.
Usage has been greatly simplified: you only need to specify base_url, api_key, and the file path.
How to Use
First, install:
pip install multi-ocr-sdk
Then use it:
import json
from multi_ocr_sdk import PaddleOCRVLClient
base_url = "http://10.131.101.39:8010"
api_key = "test"
# Default mode: returns only recognized text in Markdown format
client = PaddleOCRVLClient(base_url=base_url, api_key=api_key)
markdown_text = client.parse(r"examples/example_files/DeepSeek_OCR_paper_page1.jpg")
print(markdown_text)
# # Rich-result mode: returns Markdown + layout information per page (bounding box coordinates)
# rich_client = PaddleOCRVLClient(
#     base_url=base_url,
#     api_key=api_key,
#     return_layout_info=True,
# )
# result = rich_client.parse(r"examples/example_files/DeepSeek_OCR_paper_page1.jpg")
# result_dict = result.to_dict()
# print(json.dumps(result_dict, ensure_ascii=False, indent=2))
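Since the two modes return different things (a Markdown string by default, and an object with to_dict() when return_layout_info=True), it can be handy to save either kind of result to disk with one helper. The following is only a minimal sketch and not part of multi-ocr-sdk: the function name save_parse_output and the output paths are hypothetical, and it assumes nothing beyond the return types shown in the snippet above.

```python
import json
from pathlib import Path
from typing import Any, Union


def save_parse_output(output: Any, out_stem: Union[str, Path]) -> Path:
    """Write a parse() result to disk and return the path that was written.

    Assumption: default mode yields a str of Markdown, and rich-result mode
    yields an object exposing to_dict(), as in the snippet above.
    """
    out_stem = Path(out_stem)
    # Create the output directory if it does not exist yet.
    out_stem.parent.mkdir(parents=True, exist_ok=True)

    if isinstance(output, str):
        # Default mode: plain Markdown text goes into a .md file.
        path = out_stem.parent / (out_stem.name + ".md")
        path.write_text(output, encoding="utf-8")
    else:
        # Rich-result mode: serialize the dict form as UTF-8 JSON.
        path = out_stem.parent / (out_stem.name + ".json")
        path.write_text(
            json.dumps(output.to_dict(), ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
    return path


# Hypothetical usage with the client from the snippet above:
# client = PaddleOCRVLClient(base_url=base_url, api_key=api_key)
# save_parse_output(client.parse("page1.jpg"), "out/page1")  # writes out/page1.md
```

Keeping the file writing separate from the client call means the same helper works for both modes, and it can be tried out with a stand-in object before pointing it at a real server.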
Finally
We warmly welcome pull requests!
Project repository: B-Beginner/MULTI-OCR-SDK on GitHub (a simple and efficient Python SDK for Multi OCR API)
