zyddnys/manga-image-translator Docker Image Overview

Manga/Image Translator (English Readme)

Last Updated: 2025/05/10


One-click translation of text in various images
Chinese README | Changelog
Welcome to join our *** <[***]>

This project aims to translate images that are unlikely to be professionally translated, such as comics and images posted in group chats and on image boards, so that Japanese novices like me can understand the content. Japanese is the primary focus, but Simplified and Traditional Chinese, English, and 20 other less widely used languages are also supported, along with image repair (text removal) and typesetting. This project is v2 of Qiú wén zhuǎn yì zhì.

Note: This project is still in the early stages of development and has many shortcomings. We need your help to improve it!

📂 Directory

Showcase
Online Version
Installation
- Local Setup
  - Using Pip/venv (Recommended)
  - Notes for Windows Users
- Docker
  - Run Web Server
    - Using Nvidia GPU
  - Use as CLI
  - Build Locally
Usage
- Local (Batch) Mode
- Web Mode
  - Old UI
  - New UI
- API Mode
  - API Documentation
- Config-help Mode
Option and Configuration
- Recommended Options
  - Tips to Improve Translation Quality
- Command Line Options
  - Basic Options
  - Additional Options
    - Local Mode Options
    - WebSocket Mode Options
    - API Mode Options
    - Web Mode Options
- Configuration File
  - Render Options
  - Upscale Options
  - Translator Options
  - Detector Options
  - Inpainter Options
  - Colorizer Options
  - OCR Options
  - Other Options
- Language Code Reference
- Translator Reference
- Glossary
- Replacement Dictionary
- Environment Variables Summary
- GPT Configuration Reference
- Rendering with Gimp
Future Plans
Support Us
- Thanks to all contributors
Star Growth Curve

Showcase

The following examples may not be frequently updated and may not represent the effect of the current main branch version.

| Original Image | Translated Image |
| --- | --- |
| (Source @09ra_19ra) | (Mask) |
| (Source @VERTIGRIS_ART) | `--detector ctd` (Mask) |
| (Source @hiduki_yayoi) | `--translator none` (Mask) |
| (Source @rikak) | (Mask) |

Online Version

Official demo site (maintained by zyddnys): <[]>
Browser script (maintained by QiroNT): <[]>

Note: If the online version is inaccessible, it might be due to Google GCP restarting the server. Please wait a moment for the service to restart.
The online version uses the latest version from the main branch.

Installation

Local Setup

Using Pip/venv (Recommended)

```bash
# First, ensure you have Python 3.10 or later installed on your machine
# The very latest version of Python might not be compatible with some PyTorch libraries yet
$ python --version
Python 3.10.6

# Clone this repository
$ git clone [***]

# Create a venv (optional, but recommended)
$ python -m venv venv

# Activate the venv
$ source venv/bin/activate

# If you want to use the --use-gpu option, please visit [***] to install PyTorch, which needs to correspond to your CUDA version.
# If you did not use venv to create a virtual environment, you need to add --upgrade --force-reinstall to the pip command to overwrite the currently installed PyTorch version.

# Install dependencies
$ pip install -r requirements.txt
```

Models will be automatically downloaded to the ./models directory at runtime.
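
The Python 3.10 requirement can also be checked from a script before installing dependencies. The snippet below is a minimal sketch; the helper name `meets_minimum` is made up for illustration and is not part of the project.

```python
import sys

MINIMUM = (3, 10)  # minimum Python version stated in the install notes


def meets_minimum(version_info=None, minimum=MINIMUM):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Report whether the interpreter running this script is new enough.
print("Python version OK" if meets_minimum() else "Python 3.10 or later is required")
```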

Notes for Windows Users:

Please install Microsoft C++ Build Tools (Download, Instructions) before performing the pip install, as some pip dependencies need it to compile. (See #114).

To use CUDA on Windows, install the correct PyTorch version as described on <[***]>.

Docker

Requirements:

Docker (19.03+ for CUDA / GPU acceleration)
Docker Compose (Optional, if you want to use the files in demo/doc folder)
Nvidia Container Runtime (Optional, if you want to use CUDA)

This project supports Docker, with the image being zyddnys/manga-image-translator:main. This Docker image contains all the dependencies and models required for the project. Please note that this image is quite large (~15GB).

Run Web Server

You can start the Web Server (CPU) using the following command:

Note that you need to add the required environment variables using -e or --env

```bash
docker run \
  --name manga_image_translator_cpu \
  -p 5003:5003 \
  --ipc=host \
  --entrypoint python \
  --rm \
  -v /demo/doc/../../result:/app/result \
  -v /demo/doc/../../server/main.py:/app/server/main.py \
  -v /demo/doc/../../server/instance.py:/app/server/instance.py \
  -e OPENAI_API_KEY='' \
  -e OPENAI_API_BASE='' \
  -e OPENAI_MODEL='' \
  zyddnys/manga-image-translator:main \
  server/main.py --verbose --start-instance --host=0.0.0.0 --port=5003
```

Or use the compose file

Note that you need to add the required environment variables in the file first

```bash
docker-compose -f demo/doc/docker-compose-web-with-cpu.yml up
```

The Web Server listens on port 8000 by default (the docker run command above overrides this with --port=5003), and the translation results are saved in the /result folder.

Using Nvidia GPU

To use a supported GPU, please read the Docker section above first. You will need some special dependencies.

You can start the Web Server (GPU) using the following command:

Note that you need to add the required environment variables using -e or --env

```bash
docker run \
  --name manga_image_translator_gpu \
  -p 5003:5003 \
  --ipc=host \
  --gpus all \
  --entrypoint python \
  --rm \
  -v /demo/doc/../../result:/app/result \
  -v /demo/doc/../../server/main.py:/app/server/main.py \
  -v /demo/doc/../../server/instance.py:/app/server/instance.py \
  -e OPENAI_API_KEY='' \
  -e OPENAI_API_BASE='' \
  -e OPENAI_MODEL='' \
  -e OPENAI_HTTP_PROXY='' \
  zyddnys/manga-image-translator:main \
  server/main.py --verbose --start-instance --host=0.0.0.0 --port=5003 --use-gpu
```

Or use the compose file (for Web Server + GPU):

Note that you need to add the required environment variables in the file first

```bash
docker-compose -f demo/doc/docker-compose-web-with-gpu.yml up
```
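
The CPU and GPU commands differ only in the container name, the `--gpus all` flag, and the trailing `--use-gpu`. The sketch below assembles either variant as an argument list, which may be handy when launching the container from a script. The function name and the placeholder key value are hypothetical, and the `-v` mounts from the commands above are left out for brevity.

```python
import shlex

IMAGE = "zyddnys/manga-image-translator:main"


def build_web_server_command(use_gpu=False, port=5003, env=None):
    """Assemble a `docker run` argv for the web server (volume mounts omitted)."""
    name = "manga_image_translator_gpu" if use_gpu else "manga_image_translator_cpu"
    cmd = ["docker", "run", "--name", name, "-p", f"{port}:{port}", "--ipc=host"]
    if use_gpu:
        cmd += ["--gpus", "all"]
    cmd += ["--entrypoint", "python", "--rm"]
    # Pass API keys and similar settings as -e NAME=value pairs.
    for key, value in (env or {}).items():
        cmd += ["-e", f"{key}={value}"]
    cmd += [IMAGE, "server/main.py", "--verbose", "--start-instance",
            "--host=0.0.0.0", f"--port={port}"]
    if use_gpu:
        cmd.append("--use-gpu")
    return cmd


# Print a shell-quoted GPU command; the key value is a placeholder, not a real key.
print(shlex.join(build_web_server_command(use_gpu=True, env={"OPENAI_API_KEY": "placeholder"})))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with key values.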

Use as CLI

To use Docker via CLI (i.e., Batch Mode):

Some translation services require API keys to run; pass them to your Docker container as environment variables.

```bash
docker run --env="DEEPL_AUTH_KEY=xxx" -v <targetFolder>:/app/<targetFolder> -v <targetFolder>-translated:/app/<targetFolder>-translated  --ipc=host --rm zyddnys/manga-image-translator:main local -i=/app/<targetFolder> <cli flags>
```

Note: If you need to reference files on your host, you will need to mount the relevant files as volumes into the /app folder inside the container. The CLI paths will need to be the internal Docker path /app/... and not the path on your host.
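
The host-to-container path mapping can be sketched as follows. This is an illustration under one assumption: the folder is mounted under /app using only its last path component. The helper names are hypothetical.

```python
from pathlib import PurePosixPath


def container_path(host_folder):
    """Map a host folder to the path it is mounted at under /app in the container."""
    return f"/app/{PurePosixPath(host_folder).name}"


def volume_args(host_folder):
    """Return the two -v mounts (input and '-translated' output) and the -i flag."""
    inside = container_path(host_folder)
    mounts = [
        "-v", f"{host_folder}:{inside}",
        "-v", f"{host_folder}-translated:{inside}-translated",
    ]
    return mounts, f"-i={inside}"


mounts, input_flag = volume_args("/home/me/manga")
print(mounts, input_flag)
```

Note that the `-i` flag receives the container path, never the host path.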

Build Locally

To build the docker image locally, you can run the following command (you need to have make tool installed on your machine):

```bash
make build-image
```

To test the built image, run the command below. Some translation services require API keys to run; pass them to your Docker container as environment variables, which can be added in the Dockerfile.

```bash
make run-web-server
```

Usage

Local (Batch) Mode

```bash
# Replace <path> with the path to your image folder or file.
$ python -m manga_translator local -v -i <path>
# The results can be found in `<path_to_image_folder>-translated`.
```
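
For scripted batch runs, the command and the default output location can be derived from the input folder. This is a small sketch with hypothetical helper names; it only covers folder inputs and assumes the `-translated` suffix rule described above.

```python
import sys
from pathlib import Path


def translated_dir(input_folder):
    """Return the default output folder: the input folder name plus '-translated'."""
    p = Path(input_folder)
    return p.with_name(p.name + "-translated")


def local_command(input_folder):
    """Build the argv for batch mode, using the current Python interpreter."""
    return [sys.executable, "-m", "manga_translator", "local", "-v", "-i", str(input_folder)]


print(local_command("comics/chapter1"))
print(translated_dir("comics/chapter1"))
```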

Web Mode

Old UI

```bash
# Start a web server.
$ cd server
$ python main.py --use-gpu
# The web demo service address is [***]
```

New UI

Documentation

API Mode

```bash
# Start a web server.
$ cd server
$ python main.py --use-gpu
# The API service address is [***]
```

API Documentation

Read the OpenAPI documentation at 127.0.0.1:8000/docs.
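
The same information can be read programmatically. The sketch below lists the operations in an OpenAPI document. It assumes the server exposes the schema at `/openapi.json`, which is FastAPI's default location but is not confirmed by this README; the function names are hypothetical.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec):
    """Return sorted 'METHOD /path' strings from a parsed OpenAPI document."""
    ops = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                ops.append(f"{method.upper()} {path}")
    return sorted(ops)


def fetch_spec(base_url="http://127.0.0.1:8000"):
    """Download the schema from a running server (assumed location: /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)
```

With the server running, `list_operations(fetch_spec())` gives a quick overview of the available endpoints.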


Config-help Mode

```bash
python -m manga_translator config-help
```

Options and Configuration Description

Recommended Options

Detector:

English: ??
Japanese: ??
Chinese (Simplified): ??
Korean: ??
Using {"detector":{"detector": "ctd"}} can increase the number of text lines detected. Update: in actual testing, the default detector with suitably adjusted parameters works better on black-and-white comics.

OCR:

English: ??
Japanese: 48px
Chinese (Simplified): ??
Korean: 48px

Translator:

Japanese -> English: Sugoi
Chinese (Simplified) -> English: ??
Chinese (Simplified) -> Japanese: ??
Japanese -> Chinese (Simplified): sakura or openai
English -> Japanese: ??
English -> Chinese (Simplified): ??

Inpainter: lama_large

Colorizer: mc2

Tips to Improve Translation Quality

- Low-resolution images can trip up the detector, which is not good at picking up irregular text sizes. To work around this, enable an upscaler by setting upscale_ratio to 2 or another value.
- If the rendered text is too small to read, specify font_size_offset or use the --manga2eng renderer, which tries to fit the detected text bubble rather than the detected text-line area.
- Specify a font with --font-path, for example --font-path fonts/anime_ace_3.ttf.
- Set mask_dilation_offset to 10~30 to enlarge the mask so that it covers the source text more completely.
- Try a different inpainter.
- Increasing box_threshold can help filter out some of the gibberish produced by false OCR detections.
- Use OpenaiTranslator to load the glossary file (custom_openai cannot load it).
- When the image resolution is low, lower detection_size; otherwise some sentences may be missed. The opposite applies when the image resolution is high.
- When the image resolution is high, increase inpainting_size; otherwise the inpainting may not completely cover the mask, and source text leaks through. In other cases, increase kernel_size, which makes text removal less precise but gives the model a larger field of view. To tell whether leaked text is caused by inpainting, compare the source text with the translated text: if they are consistent, inpainting is the cause; otherwise text detection and OCR are.
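
Several of these tips translate into config-file settings. The sketch below builds such a config as a Python dict and prints it as JSON. The `detector` section follows the `{"detector":{"detector": "ctd"}}` form shown earlier; the top-level names `inpainter` and `ocr` are assumed by analogy, and the size and threshold values are illustrative rather than recommended.

```python
import json


def tuned_config(high_resolution):
    """Build a config dict that follows the resolution-related tips above."""
    # Tip: lower detection_size for low-resolution input, raise it for high resolution.
    detection_size = 2560 if high_resolution else 1536
    # Tip: raise inpainting_size for high-resolution input to avoid text leakage.
    inpainting_size = 2560 if high_resolution else 2048
    return {
        "detector": {
            "detector": "default",
            "detection_size": detection_size,
            "box_threshold": 0.8,  # slightly raised to filter OCR gibberish
        },
        # Assumed section names, by analogy with "detector".
        "inpainter": {"inpainter": "lama_large", "inpainting_size": inpainting_size},
        "ocr": {"ocr": "48px"},
    }


print(json.dumps(tuned_config(high_resolution=True), indent=2))
```

Saving the printed JSON to a file and passing it with --config-file would apply the settings in batch mode.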

Command Line Options

Basic Options

```text
-h, --help                     show this help message and exit
-v, --verbose                  print debug messages and save intermediate images in results folder
--attempts ATTEMPTS            Number of attempts when an error occurs. -1 for infinite attempts.
--ignore-errors                Skip images when an error occurs.
--model-dir MODEL_DIR          Model directory (defaults to ./models in the project root)
--use-gpu                      Turns on/off GPU (automatically switches between mps and cuda)
--use-gpu-limited              Turns on/off GPU (excluding offline translators)
--font-path FONT_PATH          Path to the font file
--pre-dict PRE_DICT            Path to the pre-translation replacement dictionary file
--post-dict POST_DICT          Path to the post-translation replacement dictionary file
--kernel-size KERNEL_SIZE      Set the kernel size for the convolution of text erasure area to completely clear residual text
--context-size                 Number of pages of context to use when translating the current page. Currently this only applies to openaitranslator.
```
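
The `--attempts` semantics (a fixed number of tries, or -1 for unlimited retries) can be illustrated with a small retry loop. This is a sketch of the described behavior, not the project's actual implementation; `run_with_attempts` is a hypothetical name.

```python
def run_with_attempts(task, attempts):
    """Call `task()` until it succeeds; an `attempts` value of -1 retries forever."""
    tried = 0
    while True:
        tried += 1
        try:
            return task()
        except Exception:
            # Give up once the budget is spent (never, when attempts == -1).
            if attempts != -1 and tried >= attempts:
                raise
```

Combining a finite attempt count with `--ignore-errors` would then skip an image after the last failed try instead of stopping the whole batch.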

Additional Options

Local Mode Options

```text
local                         run in batch translation mode
-i, --input INPUT [INPUT ...] Image folder path (required)
-o, --dest DEST               Destination folder path for translated images (default: '')
-f, --format FORMAT           Output format for the translation. Options: png, webp, jpg, jpeg, xcf, psd, pdf
--overwrite                   Overwrite already translated images
--skip-no-text                Skip images with no text (won't be saved).
--use-mtpe                    Turn on/off Machine Translation Post-Editing (MTPE) on the command line (currently Linux only)
--save-text                   Save extracted text and translations to a text file.
--load-text                   Load extracted text and translations from a text file.
--save-text-file SAVE_TEXT_FILE  Similar to --save-text, but with a specified file path. (default: '')
--prep-manual                 Prepare for manual typesetting by outputting blanked, inpainted images, and copies of the original image for reference
--save-quality SAVE_QUALITY   Quality of saved JPEG images, from 0 to 100 where 100 is best (default: 100)
--config-file CONFIG_FILE     Path to a configuration file (default: None)
```

WebSocket Mode Options

```text
ws                  run in WebSocket mode
--host HOST         Host of the WebSocket service (default: 127.0.0.1)
--port PORT         Port of the WebSocket service (default: 5003)
--nonce NONCE       Nonce used to secure internal WebSocket communication
--ws-url WS_URL     Server URL for WebSocket mode (default: ws://localhost:5000)
--models-ttl MODELS_TTL  Time in seconds to keep models in memory after last use (0 means forever)
```

API Mode Options

```text
shared              run in API mode
--host HOST         Host of the API service (default: 127.0.0.1)
--port PORT         Port of the API service (default: 5003)
--nonce NONCE       Nonce used to secure internal API server communication
--report REPORT     Report to server to register instance (default: None)
--models-ttl MODELS_TTL  TTL of models in memory in seconds (0 means forever)
```

Web Mode Options (missing some basic options, still needs to be added)

```text
--host HOST           Host address (default: 127.0.0.1)
--port PORT           Port number (default: 8000)
--start-instance      Whether an instance of the translator should be started automatically
--nonce NONCE         Nonce used to secure internal Web Server communication
--models-ttl MODELS_TTL  Time in seconds to keep models in memory after last use (0 means forever)
```
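
The `--models-ttl` option shared by the WebSocket, API, and Web modes describes an idle-timeout cache: a model stays in memory for the given number of seconds after its last use, and 0 disables eviction. The class below is a minimal sketch of that idea with hypothetical names; it is not the project's implementation.

```python
import time


class ModelCache:
    """Keep loaded models in memory; evict those idle longer than `ttl` seconds (0 = never)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._models = {}  # name -> (model, time of last use)

    def get(self, name, loader):
        """Return the cached model, loading it on a miss, and refresh its last-use time."""
        self.evict_expired()
        if name in self._models:
            model, _ = self._models[name]
        else:
            model = loader()
        self._models[name] = (model, self.clock())
        return model

    def evict_expired(self):
        """Drop every model whose idle time exceeds the TTL."""
        if self.ttl == 0:
            return  # 0 means keep forever
        now = self.clock()
        expired = [n for n, (_, used) in self._models.items() if now - used > self.ttl]
        for name in expired:
            del self._models[name]

    def __contains__(self, name):
        return name in self._models
```

A short TTL frees GPU or system memory on a server that is mostly idle, at the cost of reloading models on the next request.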

Configuration File

Run python -m manga_translator config-help >> config-info.json to see the documentation for the JSON schema. An example config file can be found in example/config-example.json.

Full config JSON schema:

```json
{
  "$defs": {
    "Alignment": {
      "enum": [
        "auto",
        "left",
        "center",
        "right"
      ],
      "title": "Alignment",
      "type": "string"
    },
    "Colorizer": {
      "enum": [
        "none",
        "mc2"
      ],
      "title": "Colorizer",
      "type": "string"
    },
    "ColorizerConfig": {
      "properties": {
        "colorization_size": {
          "default": 576,
          "title": "Colorization Size",
          "type": "integer"
        },
        "denoise_sigma": {
          "default": 30,
          "title": "Denoise Sigma",
          "type": "integer"
        },
        "colorizer": {
          "$ref": "#/$defs/Colorizer",
          "default": "none"
        }
      },
      "title": "ColorizerConfig",
      "type": "object"
    },
    "Detector": {
      "enum": [
        "default",
        "dbconvnext",
        "ctd",
        "craft",
        "paddle",
        "none"
      ],
      "title": "Detector",
      "type": "string"
    },
    "DetectorConfig": {
      "properties": {
        "detector": {
          "$ref": "#/$defs/Detector",
          "default": "default"
        },
        "detection_size": {
          "default": 2048,
          "title": "Detection Size",
          "type": "integer"
        },
        "text_threshold": {
          "default": 0.5,
          "title": "Text Threshold",
          "type": "number"
        },
        "det_rotate": {
          "default": false,
          "title": "Det Rotate",
          "type": "boolean"
        },
        "det_auto_rotate": {
          "default": false,
          "title": "Det Auto Rotate",
          "type": "boolean"
        },
        "det_invert": {
          "default": false,
          "title": "Det Invert",
          "type": "boolean"
        },
        "det_gamma_correct": {
          "default": false,
          "title": "Det Gamma Correct",
          "type": "boolean"
        },
        "box_threshold": {
          "default": 0.75,
          "title": "Box Threshold",
          "type": "number"
        },
        "unclip_ratio": {
          "default": 2.3,
          "title": "Unclip Ratio",
          "type": "number"
        }
      },
      "title": "DetectorConfig",
      "type": "object"
    },
    "Direction": {
      "enum": [
        "auto",
        "horizontal",
        "vertical"
      ],
      "title": "Direction",
      "type": "string"
    },
    "InpaintPrecision": {
      "enum": [
        "fp32",
        "fp16",
        "bf16"
      ],
      "title": "InpaintPrecision",
      "type": "string"
    },
    "Inpainter": {
      "enum": [
        "default",
        "lama_large",
        "lama_mpe",
        "sd",
        "none",
        "original"
      ],
      "title": "Inpainter",
      "type": "string"
    },
    "InpainterConfig": {
      "properties": {
        "inpainter": {
          "$ref": "#/$defs/Inpainter",
          "default": "lama_large"
        },
        "inpainting_size": {
          "default": 2048,
          "title": "Inpainting Size",
          "type": "integer"
        },
        "inpainting_precision": {
          "$ref": "#/$defs/InpaintPrecision",
          "default": "bf16"
        }
      },
      "title": "InpainterConfig",
      "type": "object"
    },
    "Ocr": {
      "enum": [
        "32px",
        "48px",
        "48px_ctc",
        "mocr"
      ],
      "title": "Ocr",
      "type": "string"
    },
    "OcrConfig": {
      "properties": {
        "use_mocr_merge": {
          "default": false,
          "title": "Use Mocr Merge",
          "type": "boolean"
        },
        "ocr": {
          "$ref": "#/$defs/Ocr",
          "default": "48px"
        },
        "min_text_length": {
          "default": 0,
          "title": "Min Text Length",
          "type": "integer"
        },
        "ignore_bubble": {
          "default": 0,
          "title": "Ignore Bubble",
          "type": "integer"
        }
      },
      "title": "OcrConfig",
      "type": "object"
    },
    "RenderConfig": {
      "properties": {
        "renderer": {
          "$ref": "#/$defs/Renderer",
          "default": "default"
        },
        "alignment": {
          "$ref": "#/$defs/Alignment",
          "def
```

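Because the schema stores defaults and allowed values in a regular shape, both can be pulled out with a few lines of code. The sketch below works on a parsed schema dict; the helper names are hypothetical, and the embedded excerpt copies only the `Colorizer` entries shown in the schema.

```python
def defaults_for(schema, definition):
    """Collect {property: default} for one entry under "$defs"."""
    props = schema["$defs"][definition].get("properties", {})
    return {name: spec["default"] for name, spec in props.items() if "default" in spec}


def allowed_values(schema, definition, prop):
    """Return the enum a property points to via "$ref", or its own enum, or None."""
    spec = schema["$defs"][definition]["properties"][prop]
    ref = spec.get("$ref")
    if ref is None:
        return spec.get("enum")
    target = ref.rsplit("/", 1)[-1]  # "#/$defs/Colorizer" -> "Colorizer"
    return schema["$defs"][target].get("enum")


# Excerpt of the schema, limited to the colorizer definitions.
schema = {
    "$defs": {
        "Colorizer": {"enum": ["none", "mc2"], "title": "Colorizer", "type": "string"},
        "ColorizerConfig": {
            "properties": {
                "colorization_size": {"default": 576, "type": "integer"},
                "denoise_sigma": {"default": 30, "type": "integer"},
                "colorizer": {"$ref": "#/$defs/Colorizer", "default": "none"},
            },
            "type": "object",
        },
    }
}

print(defaults_for(schema, "ColorizerConfig"))
print(allowed_values(schema, "ColorizerConfig", "colorizer"))
```

The same helpers should work on the full output of config-help once it is loaded with `json.load`.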
zyddnys/manga-image-translator

manga-image-translator from [***]

12 收藏0 次下载activezyddnys镜像

🚀专业版镜像服务，面向生产环境设计

镜像简介版本下载

🚀专业版镜像服务，面向生产环境设计

Manga/Image Translator (English Readme)

Last Updated: 2025/05/10

!Commit activity !Lines of code !License !Contributors ![]([]

One-click translation of text in various images
中文说明 | Changelog
Welcome to join our *** <[***]>

Note: This project is still in the early stages of development and has many shortcomings. We need your help to improve it!

📂 Directory

Showcase
Online Version
Installation
- Local Setup
  - Using Pip/venv (Recommended)
  - Notes for Windows Users
- Docker
  - Run Web Server
    - Using Nvidia GPU
  - Use as CLI
  - Build Locally
Usage
- Local (Batch) Mode
- Web Mode
  - Old UI
  - New UI
- API Mode
  - API Documentation
- Config-help Mode
Option and Configuration
- Recommended Options
  - Tips to Improve Translation Quality
- Command Line Options
  - Basic Options
  - Additional Options
    - Local Mode Options
    - WebSocket Mode Options
    - API Mode Options
    - Web Mode Options
- Configuration File
  - Render Options
  - Upscale Options
  - Translator Options
  - Detector Options
  - Inpainter Options
  - Colorizer Options
  - OCR Options
  - Other Options
- Language Code Reference
- Translator Reference
- Glossary
- Replacement Dictionary
- Environment Variables Summary
- GPT Configuration Reference
- Rendering with Gimp
Future Plans
Support Us
- Thanks to all contributors
Star Growth Curve

Showcase

The following examples may not be frequently updated and may not represent the effect of the current main branch version.

Original Image Translated Image

Original Image	Translated Image
</a> <br /> (Source @09ra_19ra) </td> <td align="center" width="50%"> <a href="[*]"> </a> <br /> (Mask) </td> </tr> <tr> <td align="center" width="50%"> <a href="[]"> </a> <br /> (Source @VERTIGRIS_ART) </td> <td align="center" width="50%"> <a href="[]"> </a> <br /> <code>--detector ctd</code> (Mask) </td> </tr> <tr> <td align="center" width="50%"> <a href="[]"> </a> <br /> (Source @hiduki_yayoi) </td> <td align="center" width="50%"> <a href="[]"> </a> <br /> <code>--translator none</code> (Mask) </td> </tr> <tr> <td align="center" width="50%"> <a href="[]"> </a> <br /> (Source @rikak) </td> <td align="center" width="50%"> <a href="[**]"> </a> <br /> (Mask) </td> </tr>

    </a>
    <br />
    (Source @09ra_19ra)
  </td>
  <td align="center" width="50%">
    <a href="[***]">
      
    </a>
    <br />
    (Mask)
  </td>
</tr>
<tr>
  <td align="center" width="50%">
    <a href="[***]">
      
    </a>
    <br />
    (Source @VERTIGRIS_ART)
  </td>
  <td align="center" width="50%">
    <a href="[***]">
      
    </a>
    <br />
    <code>--detector ctd</code>
    (Mask)
  </td>
</tr>
<tr>
  <td align="center" width="50%">
    <a href="[***]">
      
    </a>
    <br />
    (Source @hiduki_yayoi)
  </td>
  <td align="center" width="50%">
    <a href="[***]">
      
    </a>
    <br />
    <code>--translator none</code>
    (Mask)
  </td>
</tr>
<tr>
  <td align="center" width="50%">
    <a href="[***]">
      
    </a>
    <br />
    (Source @rikak)
  </td>
  <td align="center" width="50%">
    <a href="[***]">
      
    </a>
    <br />
    (Mask)
  </td>
</tr>

Online Version

Official demo site (maintained by zyddnys): <[]>
Browser script (maintained by QiroNT): <[]>

Note: If the online version is inaccessible, it might be due to Google GCP restarting the server. Please wait a moment for the service to restart.
The online version uses the latest version from the main branch.

Installation

Local Setup

Using Pip/venv (Recommended)

bash
# First, ensure you have Python 3.10 or later installed on your machine
# The very latest version of Python might not be compatible with some PyTorch libraries yet
$ python --version
Python 3.10.6

# Clone this repository
$ git clone [***]

# Create a venv (optional, but recommended)
$ python -m venv venv

# Activate the venv
$ source venv/bin/activate

# If you want to use the --use-gpu option, please visit [***] to install PyTorch, which needs to correspond to your CUDA version.
# If you did not use venv to create a virtual environment, you need to add --upgrade --force-reinstall to the pip command to overwrite the currently installed PyTorch version.

# Install dependencies
$ pip install -r requirements.txt

Models will be automatically downloaded to the ./models directory at runtime.

Notes for Windows Users:

Please install Microsoft C++ Build Tools (Download, Instructions) before performing the pip install, as some pip dependencies need it to compile. (See #114).

To use CUDA on Windows, install the correct PyTorch version as described on <[***]>.

Docker

Requirements:

Docker (19.03+ for CUDA / GPU acceleration)
Docker Compose (Optional, if you want to use the files in demo/doc folder)
Nvidia Container Runtime (Optional, if you want to use CUDA)

Run Web Server

You can start the Web Server (CPU) using the following command:

Note that you need to add the required environment variables using -e or --env

bash
docker run \
  --name manga_image_translator_cpu \
  -p 5003:5003 \
  --ipc=host \
  --entrypoint python \
  --rm \
  -v /demo/doc/../../result:/app/result \
  -v /demo/doc/../../server/main.py:/app/server/main.py \
  -v /demo/doc/../../server/instance.py:/app/server/instance.py \
  -e OPENAI_API_KEY='' \
  -e OPENAI_API_BASE='' \
  -e OPENAI_MODEL='' \
  zyddnys/manga-image-translator:main \
  server/main.py --verbose --start-instance --host=0.0.0.0 --port=5003

Or use the compose file

Note that you need to add the required environment variables in the file first

bash
docker-compose -f demo/doc/docker-compose-web-with-cpu.yml up

The Web Server starts on port 8000 by default, and the translation results will be saved in the /result folder.

Using Nvidia GPU

To use a supported GPU, please read the Docker section above first. You will need some special dependencies.

You can start the Web Server (GPU) using the following command:

Note that you need to add the required environment variables using -e or --env

bash
docker run \
  --name manga_image_translator_gpu \
  -p 5003:5003 \
  --ipc=host \
  --gpus all \
  --entrypoint python \
  --rm \
  -v /demo/doc/../../result:/app/result \
  -v /demo/doc/../../server/main.py:/app/server/main.py \
  -v /demo/doc/../../server/instance.py:/app/server/instance.py \
  -e OPENAI_API_KEY='' \
  -e OPENAI_API_BASE='' \
  -e OPENAI_MODEL='' \
  -e OPENAI_HTTP_PROXY='' \
  zyddnys/manga-image-translator:main \
  server/main.py --verbose --start-instance --host=0.0.0.0 --port=5003 --use-gpu

Or use the compose file (for Web Server + GPU):

Note that you need to add the required environment variables in the file first

bash
docker-compose -f demo/doc/docker-compose-web-with-gpu.yml up

Use as CLI

To use Docker via CLI (i.e., Batch Mode):

Some translation services require API keys to run, pass them to your docker container as environment variables.

bash
docker run --env="DEEPL_AUTH_KEY=xxx" -v <targetFolder>:/app/<targetFolder> -v <targetFolder>-translated:/app/<targetFolder>-translated  --ipc=host --rm zyddnys/manga-image-translator:main local -i=/app/<targetFolder> <cli flags>

Build Locally

To build the docker image locally, you can run the following command (you need to have make tool installed on your machine):

bash
make build-image

Then test the built image, run:

Some translation services require API keys to run, pass them to your docker container as environment variables. Add environment variables in the Dockerfile.

bash
make run-web-server

Usage

Local (Batch) Mode

bash
# Replace <path> with the path to your image folder or file.
$ python -m manga_translator local -v -i <path>
# The results can be found in `<path_to_image_folder>-translated`.

Web Mode

Old UI

bash
# Start a web server.
$ cd server
$ python main.py --use-gpu
# The web demo service address is [***]

New UI

Documentation

API Mode

bash
# Start a web server.
$ cd server
$ python main.py --use-gpu
# The API service address is [***]

API Documentation

Read the openapi documentation at: 127.0.0.1:8000/docs

FastAPI-html

Config-help Mode

bash
python -m manga_translator config-help

Options and Configuration Description

Recommended Options

Detector:

English: ??
Japanese: ??
Chinese (Simplified): ??
Korean: ??
Using {"detector":{"detector": "ctd"}} can increase the number of text lines detected Update: Actual testing shows that default works better with related parameter adjustments in black and white comics.

OCR:

English: ??
Japanese: 48px
Chinese (Simplified): ??
Korean: 48px

Translator:

Japanese -> English: Sugoi
Chinese (Simplified) -> English: ??
Chinese (Simplified) -> Japanese: ??
Japanese -> Chinese (Simplified): sakura or opanai
English -> Japanese: ??
English -> Chinese (Simplified): ??

Inpainter: lama_large

Colorizer: mc2

Tips to Improve Translation Quality

Small resolutions can sometimes trip up the detector, which is not so good at picking up irregular text sizes. To circumvent this you can use an upscaler by specifying upscale_ratio 2 or any other value
If the rendered text is too small to read, specify font_size_offset or use the --manga2eng renderer, which will try to fit the detected text bubble rather than detected textline area.
Specify a font with --font-path fonts/anime_ace_3.ttf for example
Set mask_dilation_offset to 10~30 to increase the mask coverage and better wrap the source text
change inpainter.
Increasing the box_threshold can help filter out gibberish from OCR error detection to some extent.
Use OpenaiTranslator to load the glossary file (custom_openai cannot load it)
When the image resolution is low, lower detection_size, otherwise it may cause some sentences to be missed. The opposite is true when the image resolution is high.
When the image resolution is high, increase inpainting_size, otherwise it may not completely cover the mask, resulting in source text leakage. In other cases, you can increase kernel_size to reduce the accuracy of text removal so that the model gets a larger field of view (Note: Judge whether the text leakage is caused by inpainting based on the consistency between the source text and the translated text. If consistent, it is caused by inpainting, otherwise it is caused by text detection and OCR)

Command Line Options

Basic Options

text
-h, --help                     show this help message and exit
-v, --verbose                  print debug messages and save intermediate images in results folder
--attempts ATTEMPTS            Number of attempts when an error occurs. -1 for infinite attempts.
--ignore-errors                Skip images when an error occurs.
--model-dir MODEL_DIR          Model directory (defaults to ./models in the project root)
--use-gpu                      Turns on/off GPU (automatically switches between mps and cuda)
--use-gpu-limited              Turns on/off GPU (excluding offline translators)
--font-path FONT_PATH          Path to the font file
--pre-dict PRE_DICT            Path to the pre-translation replacement dictionary file
--post-dict POST_DICT          Path to the post-translation replacement dictionary file
--kernel-size KERNEL_SIZE      Set the kernel size for the convolution of text erasure area to completely clear residual text
--context-size                 Number of pages of context used when translating the current page. Currently, this only applies to openaitranslator.
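
The interaction between --attempts and --ignore-errors can be pictured with a small retry loop. This is a minimal sketch, not the project's implementation: the helper run_with_attempts is hypothetical, and it assumes that ATTEMPTS counts retries after the first failure, with -1 meaning retry forever.

```python
def run_with_attempts(task, attempts=0, ignore_errors=False):
    """Run task(), retrying on error.

    attempts: number of retries after a failure; -1 retries forever (assumed semantics).
    ignore_errors: when retries are exhausted, return None instead of raising,
    which models skipping the image.
    """
    failures = 0
    while True:
        try:
            return task()
        except Exception:
            failures += 1
            # Keep retrying while the retry budget is not used up.
            if attempts == -1 or failures <= attempts:
                continue
            if ignore_errors:
                return None
            raise


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("temporary failure")
        return "translated"

    print(run_with_attempts(flaky, attempts=2))
```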

Additional Options

Local Mode Options

text
local                         run in batch translation mode
-i, --input INPUT [INPUT ...] Image folder path (required)
-o, --dest DEST               Destination folder path for translated images (default: '')
-f, --format FORMAT           Output format for the translation. Options: png, webp, jpg, jpeg, xcf, psd, pdf
--overwrite                   Overwrite already translated images
--skip-no-text                Skip images with no text (won't be saved).
--use-mtpe                    Turn on/off Machine Translation Post-Editing (MTPE) on the command line (currently Linux only)
--save-text                   Save extracted text and translations to a text file.
--load-text                   Load extracted text and translations from a text file.
--save-text-file SAVE_TEXT_FILE  Similar to --save-text, but with a specified file path. (default: '')
--prep-manual                 Prepare for manual typesetting by outputting blanked, inpainted images, and copies of the original image for reference
--save-quality SAVE_QUALITY   Quality of saved JPEG images, from 0 to 100 where 100 is best (default: 100)
--config-file CONFIG_FILE     Path to a configuration file (default: None)
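
To show how the local mode options combine, here is a hedged sketch that assembles a batch-mode command line from Python. The helper build_local_command is hypothetical; the flags and the format list are taken from the listing above, and the python -m manga_translator local invocation form is assumed from the config-help command shown under Configuration File.

```python
import sys

# Formats listed for -f/--format above.
OUTPUT_FORMATS = ("png", "webp", "jpg", "jpeg", "xcf", "psd", "pdf")


def build_local_command(input_dir, dest="", fmt=None, overwrite=False, config_file=None):
    """Build an argument list for batch translation (does not run anything)."""
    if fmt is not None and fmt not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {fmt!r}")
    cmd = [sys.executable, "-m", "manga_translator", "local", "-i", input_dir]
    if dest:
        cmd += ["--dest", dest]
    if fmt:
        cmd += ["--format", fmt]
    if overwrite:
        cmd.append("--overwrite")
    if config_file:
        cmd += ["--config-file", config_file]
    return cmd


if __name__ == "__main__":
    print(build_local_command("pages", dest="out", fmt="png", overwrite=True))
```

The returned list can be handed to subprocess.run(cmd, check=True) once the project is installed in the same environment.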

WebSocket Mode Options

text
ws                  run in WebSocket mode
--host HOST         Host of the WebSocket service (default: 127.0.0.1)
--port PORT         Port of the WebSocket service (default: 5003)
--nonce NONCE       Nonce used to secure internal WebSocket communication
--ws-url WS_URL     Server URL for WebSocket mode (default: ws://localhost:5000)
--models-ttl MODELS_TTL  Time in seconds to keep models in memory after last use (0 means forever)

API Mode Options

text
shared              run in API mode
--host HOST         Host of the API service (default: 127.0.0.1)
--port PORT         Port of the API service (default: 5003)
--nonce NONCE       Nonce used to secure internal API server communication
--report REPORT     Report to server to register instance (default: None)
--models-ttl MODELS_TTL  TTL of models in memory in seconds (0 means forever)

Web Mode Options (some basic options are still missing from this list)

text
--host HOST           Host address (default: 127.0.0.1)
--port PORT           Port number (default: 8000)
--start-instance      Whether an instance of the translator should be started automatically
--nonce NONCE         Nonce used to secure internal Web Server communication
--models-ttl MODELS_TTL  Time in seconds to keep models in memory after last use (0 means forever)
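
The --models-ttl option, which appears in the WebSocket, API, and Web modes, describes an idle timeout for loaded models. The sketch below illustrates that idea with a hypothetical ModelCache class; it is not the project's code, and it assumes the timer restarts on every use and that a value of 0 disables eviction.

```python
import time


class ModelCache:
    """Keep models in memory until they have been idle longer than ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl          # 0 means keep models forever
        self._clock = clock     # injectable clock, handy for testing
        self._models = {}       # name -> (model, time of last use)

    def evict_expired(self):
        """Drop every model whose idle time exceeds the TTL."""
        if self.ttl == 0:
            return
        now = self._clock()
        expired = [name for name, (_, used) in self._models.items()
                   if now - used > self.ttl]
        for name in expired:
            del self._models[name]

    def get(self, name, loader):
        """Return a cached model, loading it with loader() if it is not cached."""
        self.evict_expired()
        if name in self._models:
            model = self._models[name][0]
        else:
            model = loader()
        # Record the use so the idle timer restarts.
        self._models[name] = (model, self._clock())
        return model


if __name__ == "__main__":
    cache = ModelCache(ttl=60)
    ocr = cache.get("ocr", loader=lambda: object())
    print(cache.get("ocr", loader=lambda: object()) is ocr)
```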

Configuration File

Run python -m manga_translator config-help >> config-info.json to see the documentation for the JSON schema. An example config file can be found in example/config-example.json.
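
Once the schema has been saved, its defaults and allowed values can be listed programmatically. The following is a minimal sketch: the helper describe is hypothetical, and SCHEMA_EXCERPT is a small hand-copied part of the schema shown in this section rather than the full config-help output.

```python
import json

# Hand-copied excerpt of the schema; the real output contains many more entries.
SCHEMA_EXCERPT = json.loads("""
{
  "$defs": {
    "Detector": {
      "enum": ["default", "dbconvnext", "ctd", "craft", "paddle", "none"],
      "type": "string"
    },
    "DetectorConfig": {
      "properties": {
        "detector": {"$ref": "#/$defs/Detector", "default": "default"},
        "detection_size": {"default": 2048, "type": "integer"},
        "box_threshold": {"default": 0.75, "type": "number"}
      },
      "type": "object"
    }
  }
}
""")


def describe(schema, config_name):
    """Map each field of a config object to its default and allowed enum values."""
    defs = schema["$defs"]
    summary = {}
    for field, spec in defs[config_name]["properties"].items():
        allowed = None
        if "$ref" in spec:
            # "#/$defs/Detector" -> "Detector"
            ref_name = spec["$ref"].rsplit("/", 1)[-1]
            allowed = defs[ref_name].get("enum")
        summary[field] = {"default": spec.get("default"), "allowed": allowed}
    return summary


if __name__ == "__main__":
    for field, info in describe(SCHEMA_EXCERPT, "DetectorConfig").items():
        print(field, info)
```

To inspect the real schema, replace SCHEMA_EXCERPT with the result of json.load on the saved config-info.json file.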

Full config JSON schema:

json
{
  "$defs": {
    "Alignment": {
      "enum": [
        "auto",
        "left",
        "center",
        "right"
      ],
      "title": "Alignment",
      "type": "string"
    },
    "Colorizer": {
      "enum": [
        "none",
        "mc2"
      ],
      "title": "Colorizer",
      "type": "string"
    },
    "ColorizerConfig": {
      "properties": {
        "colorization_size": {
          "default": 576,
          "title": "Colorization Size",
          "type": "integer"
        },
        "denoise_sigma": {
          "default": 30,
          "title": "Denoise Sigma",
          "type": "integer"
        },
        "colorizer": {
          "$ref": "#/$defs/Colorizer",
          "default": "none"
        }
      },
      "title": "ColorizerConfig",
      "type": "object"
    },
    "Detector": {
      "enum": [
        "default",
        "dbconvnext",
        "ctd",
        "craft",
        "paddle",
        "none"
      ],
      "title": "Detector",
      "type": "string"
    },
    "DetectorConfig": {
      "properties": {
        "detector": {
          "$ref": "#/$defs/Detector",
          "default": "default"
        },
        "detection_size": {
          "default": 2048,
          "title": "Detection Size",
          "type": "integer"
        },
        "text_threshold": {
          "default": 0.5,
          "title": "Text Threshold",
          "type": "number"
        },
        "det_rotate": {
          "default": false,
          "title": "Det Rotate",
          "type": "boolean"
        },
        "det_auto_rotate": {
          "default": false,
          "title": "Det Auto Rotate",
          "type": "boolean"
        },
        "det_invert": {
          "default": false,
          "title": "Det Invert",
          "type": "boolean"
        },
        "det_gamma_correct": {
          "default": false,
          "title": "Det Gamma Correct",
          "type": "boolean"
        },
        "box_threshold": {
          "default": 0.75,
          "title": "Box Threshold",
          "type": "number"
        },
        "unclip_ratio": {
          "default": 2.3,
          "title": "Unclip Ratio",
          "type": "number"
        }
      },
      "title": "DetectorConfig",
      "type": "object"
    },
    "Direction": {
      "enum": [
        "auto",
        "horizontal",
        "vertical"
      ],
      "title": "Direction",
      "type": "string"
    },
    "InpaintPrecision": {
      "enum": [
        "fp32",
        "fp16",
        "bf16"
      ],
      "title": "InpaintPrecision",
      "type": "string"
    },
    "Inpainter": {
      "enum": [
        "default",
        "lama_large",
        "lama_mpe",
        "sd",
        "none",
        "original"
      ],
      "title": "Inpainter",
      "type": "string"
    },
    "InpainterConfig": {
      "properties": {
        "inpainter": {
          "$ref": "#/$defs/Inpainter",
          "default": "lama_large"
        },
        "inpainting_size": {
          "default": 2048,
          "title": "Inpainting Size",
          "type": "integer"
        },
        "inpainting_precision": {
          "$ref": "#/$defs/InpaintPrecision",
          "default": "bf16"
        }
      },
      "title": "InpainterConfig",
      "type": "object"
    },
    "Ocr": {
      "enum": [
        "32px",
        "48px",
        "48px_ctc",
        "mocr"
      ],
      "title": "Ocr",
      "type": "string"
    },
    "OcrConfig": {
      "properties": {
        "use_mocr_merge": {
          "default": false,
          "title": "Use Mocr Merge",
          "type": "boolean"
        },
        "ocr": {
          "$ref": "#/$defs/Ocr",
          "default": "48px"
        },
        "min_text_length": {
          "default": 0,
          "title": "Min Text Length",
          "type": "integer"
        },
        "ignore_bubble": {
          "default": 0,
          "title": "Ignore Bubble",
          "type": "integer"
        }
      },
      "title": "OcrConfig",
      "type": "object"
    },
    "RenderConfig": {
      "properties": {
        "renderer": {
          "$ref": "#/$defs/Renderer",
          "default": "default"
        },
        "alignment": {
          "$ref": "#/$defs/Alignment",
          "def
