From d507d43cf8724ea10f9dd5aed93e23ba7331b1d8 Mon Sep 17 00:00:00 2001
From: majorli
Date: Mon, 18 Dec 2023 15:17:33 +0800
Subject: [PATCH] update SAR model datasets link

Signed-off-by: majorli
---
 cv/ocr/sar/pytorch/README.md        | 15 ++++++-------
 cv/ocr/satrn/pytorch/base/README.md | 34 ++++++++++++++---------------
 2 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/cv/ocr/sar/pytorch/README.md b/cv/ocr/sar/pytorch/README.md
index 3b19d3894..06a7dee28 100755
--- a/cv/ocr/sar/pytorch/README.md
+++ b/cv/ocr/sar/pytorch/README.md
@@ -4,7 +4,7 @@
 
 Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion. Most existing approaches rely heavily on sophisticated model designs and/or extra fine-grained annotations, which, to some extent, increase the difficulty in algorithm implementation and data collection. In this work, we propose an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations. It is composed of a 31-layer ResNet, an LSTM-based encoder-decoder framework and a 2-dimensional attention module. Despite its simplicity, the proposed method is robust and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks.
 
-## Step 1: Installing packages
+## Step 1: Installation
 
 ```shell
 cd csrc/
@@ -17,15 +17,14 @@ pip3 install -r requirements.txt
 
 ## Step 2: Preparing datasets
 
-```shell
+```bash
 mkdir data
-ln -s /path/to/mixture ./data/
+cd data
 ```
 
-Download datasets from this [page](https://mmocr.readthedocs.io/zh_CN/latest/datasets/recog.html),
-data folder would be like below:
+Refer to the [MMOCR Docs](https://mmocr.readthedocs.io/zh_CN/dev-1.x/user_guides/data_prepare/datasetzoo.html) to prepare the datasets. The datasets path should look like below:
 
-```
+```bash
 ├── mixture
 │   ├── coco_text
 │   │   ├── train_label.txt
@@ -99,12 +98,12 @@ data folder would be like below:
 ## Step 3: Training
 
 ### Training on single card
-```shell
+```bash
 python3 train.py configs/sar_r31_parallel_decoder_academic.py
 ```
 
 ### Training on mutil-cards
-```shell
+```bash
 bash dist_train.sh configs/sar_r31_parallel_decoder_academic.py 8
 ```
diff --git a/cv/ocr/satrn/pytorch/base/README.md b/cv/ocr/satrn/pytorch/base/README.md
index 6a334ffcc..fd09fb2d7 100755
--- a/cv/ocr/satrn/pytorch/base/README.md
+++ b/cv/ocr/satrn/pytorch/base/README.md
@@ -5,27 +5,27 @@
 
 Scene text recognition (STR) is the task of recognizing character sequences in natural scenes. While there have been great advances in STR methods, current methods still fail to recognize texts in arbitrary shapes, such as heavily curved or rotated texts, which are abundant in daily life (e.g. restaurant signs, product labels, company logos, etc). This paper introduces a novel architecture to recognizing texts of arbitrary shapes, named Self-Attention Text Recognition Network (SATRN), which is inspired by the Transformer. SATRN utilizes the self-attention mechanism to describe two-dimensional (2D) spatial dependencies of characters in a scene text image. Exploiting the full-graph propagation of self-attention, SATRN can recognize texts with arbitrary arrangements and large inter-character spacing. As a result, SATRN outperforms existing STR models by a large margin of 5.7 pp on average in "irregular text" benchmarks. We provide empirical analyses that illustrate the inner mechanisms and the extent to which the model is applicable (e.g. rotated and multi-line text). We will open-source the code.
 
-## Step 1: Installing packages
-
-```
-$ cd /satrn/pytorch/base/csrc
-$ bash clean.sh
-$ bash build.sh
-$ bash install.sh
-$ cd ..
-$ pip3 install -r requirements.txt
+## Step 1: Installation
+
+```bash
+cd /satrn/pytorch/base/csrc
+bash clean.sh
+bash build.sh
+bash install.sh
+cd ..
+pip3 install -r requirements.txt
 ```
 
 ## Step 2: Preparing datasets
 
-```shell
-$ mkdir data
-$ cd data
+```bash
+mkdir data
+cd data
 ```
 
 Reffering to [MMOCR Docs](https://mmocr.readthedocs.io/zh_CN/dev-1.x/user_guides/data_prepare/datasetzoo.html) to prepare datasets. Datasets path would look like below:
 
-```
+```bash
 ├── mixture
 │   ├── coco_text
 │   │   ├── train_label.txt
@@ -99,13 +99,13 @@ Reffering to [MMOCR Docs](https://mmocr.readthedocs.io/zh_CN/dev-1.x/user_guides
 ## Step 3: Training
 
 ### Training on single card
-```shell
-$ python3 train.py configs/models/satrn_academic.py
+```bash
+python3 train.py configs/models/satrn_academic.py
 ```
 
 ### Training on mutil-cards
-```shell
-$ bash dist_train.sh configs/models/satrn_academic.py 8
+```bash
+bash dist_train.sh configs/models/satrn_academic.py 8
 ```
 
 ## Results on BI-V100
-- 
Gitee