diff --git a/cv/ocr/sar/pytorch/README.md b/cv/ocr/sar/pytorch/README.md
index 3b19d3894c53f44cf3f59dfbf5cd29af84bcfa08..06a7dee289507036c6ee26d47a64d6c2f077ffd5 100755
--- a/cv/ocr/sar/pytorch/README.md
+++ b/cv/ocr/sar/pytorch/README.md
@@ -4,7 +4,7 @@
 
 Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion. Most existing approaches rely heavily on sophisticated model designs and/or extra fine-grained annotations, which, to some extent, increase the difficulty in algorithm implementation and data collection. In this work, we propose an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations. It is composed of a 31-layer ResNet, an LSTM-based encoder-decoder framework and a 2-dimensional attention module. Despite its simplicity, the proposed method is robust and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks.
 
-## Step 1: Installing packages
+## Step 1: Installation
 
-```shell
+```bash
 cd csrc/
@@ -17,15 +17,14 @@ pip3 install -r requirements.txt
 
 ## Step 2: Preparing datasets
 
-```shell
+```bash
 mkdir data
-ln -s /path/to/mixture ./data/
+cd data
 ```
 
-Download datasets from this [page](https://mmocr.readthedocs.io/zh_CN/latest/datasets/recog.html),
-data folder would be like below:
+Refer to the [MMOCR Docs](https://mmocr.readthedocs.io/zh_CN/dev-1.x/user_guides/data_prepare/datasetzoo.html) to prepare the datasets. The datasets directory should look like below:
 
-```
+```text
 ├── mixture
 │   ├── coco_text
 │   │   ├── train_label.txt
@@ -99,12 +98,12 @@ data folder would be like below:
 
 ## Step 3: Training
 
 ### Training on single card
-```shell
+```bash
 python3 train.py configs/sar_r31_parallel_decoder_academic.py
 ```
 
-### Training on mutil-cards
-```shell
+### Training on multiple cards
+```bash
 bash dist_train.sh configs/sar_r31_parallel_decoder_academic.py 8
 ```
 
diff --git a/cv/ocr/satrn/pytorch/base/README.md b/cv/ocr/satrn/pytorch/base/README.md
index 6a334ffcc4cdbf3e06f5fa8e87512a90d848a1ad..fd09fb2d76972934312a4c6456ce65f3b0ace89c 100755
--- a/cv/ocr/satrn/pytorch/base/README.md
+++ b/cv/ocr/satrn/pytorch/base/README.md
@@ -5,27 +5,27 @@
 
 Scene text recognition (STR) is the task of recognizing character sequences in natural scenes. While there have been great advances in STR methods, current methods still fail to recognize texts in arbitrary shapes, such as heavily curved or rotated texts, which are abundant in daily life (e.g. restaurant signs, product labels, company logos, etc). This paper introduces a novel architecture to recognizing texts of arbitrary shapes, named Self-Attention Text Recognition Network (SATRN), which is inspired by the Transformer. SATRN utilizes the self-attention mechanism to describe two-dimensional (2D) spatial dependencies of characters in a scene text image. Exploiting the full-graph propagation of self-attention, SATRN can recognize texts with arbitrary arrangements and large inter-character spacing. As a result, SATRN outperforms existing STR models by a large margin of 5.7 pp on average in "irregular text" benchmarks. We provide empirical analyses that illustrate the inner mechanisms and the extent to which the model is applicable (e.g. rotated and multi-line text). We will open-source the code.
 
-## Step 1: Installing packages
-
-```
-$ cd /satrn/pytorch/base/csrc
-$ bash clean.sh
-$ bash build.sh
-$ bash install.sh
-$ cd ..
-$ pip3 install -r requirements.txt
+## Step 1: Installation
+
+```bash
+cd /satrn/pytorch/base/csrc
+bash clean.sh
+bash build.sh
+bash install.sh
+cd ..
+pip3 install -r requirements.txt
 ```
 
 ## Step 2: Preparing datasets
 
-```shell
-$ mkdir data
-$ cd data
+```bash
+mkdir data
+cd data
 ```
 
-Reffering to [MMOCR Docs](https://mmocr.readthedocs.io/zh_CN/dev-1.x/user_guides/data_prepare/datasetzoo.html) to prepare datasets. Datasets path would look like below:
+Refer to the [MMOCR Docs](https://mmocr.readthedocs.io/zh_CN/dev-1.x/user_guides/data_prepare/datasetzoo.html) to prepare the datasets. The datasets directory should look like below:
 
-```
+```text
 ├── mixture
 │   ├── coco_text
 │   │   ├── train_label.txt
@@ -99,13 +99,13 @@ Reffering to [MMOCR Docs](https://mmocr.readthedocs.io/zh_CN/dev-1.x/user_guides
 
 ## Step 3: Training
 
 ### Training on single card
-```shell
-$ python3 train.py configs/models/satrn_academic.py
+```bash
+python3 train.py configs/models/satrn_academic.py
 ```
 
-### Training on mutil-cards
-```shell
-$ bash dist_train.sh configs/models/satrn_academic.py 8
+### Training on multiple cards
+```bash
+bash dist_train.sh configs/models/satrn_academic.py 8
 ```
 
 ## Results on BI-V100
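Both READMEs assume the `data/mixture` tree shown above is already populated before training starts. As an optional aid, here is an illustrative pre-flight check; it is not part of either repository, and the checked file is just the one sample entry visible in the tree, so extend the list to match the datasets you actually prepared:

```bash
#!/usr/bin/env bash
# Illustrative pre-flight check, not part of the SAR/SATRN repos:
# confirm the expected mixture layout exists before launching training.
# Only coco_text/train_label.txt appears in the tree above; add the
# other datasets you prepared via the MMOCR docs.
for f in data/mixture/coco_text/train_label.txt; do
    if [ ! -e "$f" ]; then
        echo "Missing: $f (prepare it per the MMOCR dataset docs)" >&2
        exit 1
    fi
done
echo "Dataset layout looks OK."
```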
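Both READMEs launch multi-card training with `bash dist_train.sh <config> <num_gpus>` but do not show the launcher itself. As a rough, unverified sketch, MMOCR/MMCV-style repositories commonly implement such a script along these lines; the actual `dist_train.sh` in these repos may differ:

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a dist_train.sh-style launcher, following the
# common MMCV/MMOCR pattern; not copied from this repository.
CONFIG=$1          # e.g. configs/models/satrn_academic.py
GPUS=$2            # e.g. 8
PORT=${PORT:-29500}

python3 -m torch.distributed.launch \
    --nproc_per_node="$GPUS" \
    --master_port="$PORT" \
    train.py "$CONFIG" --launcher pytorch
```

Under this pattern, `bash dist_train.sh configs/models/satrn_academic.py 8` spawns one training process per GPU, and `--launcher pytorch` tells the framework to initialize its distributed backend.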