diff --git a/cv/classification/repvgg/paddlepaddle/README.md b/cv/classification/repvgg/paddlepaddle/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..67fe48605c11d1add8e07dd160a719ad11f46a48
--- /dev/null
+++ b/cv/classification/repvgg/paddlepaddle/README.md
@@ -0,0 +1,67 @@
+# RepVGG
+## Model description
+RepVGG is a simple but powerful convolutional neural network architecture. Its inference-time body is VGG-like, composed of nothing but a stack of 3x3 convolutions and ReLU, while the training-time model has a multi-branch topology. This decoupling of the training-time and inference-time architectures is realized by a structural re-parameterization technique, hence the name RepVGG.
+
+## Step 1: Installing
+
+```bash
+git clone --recursive https://github.com/PaddlePaddle/PaddleClas.git
+cd PaddleClas
+pip3 install -r requirements.txt
+```
+
+## Step 2: Download data
+
+Download the [ImageNet Dataset](https://www.image-net.org/download.php)
+
+```bash
+# IMAGENET PATH as follows:
+ls -al /home/datasets/imagenet_jpeg/
+total 52688
+drwxr-xr-x 1002 root root    24576 Mar  1 15:33 train
+-rw-r--r--    1 root root 43829433 May 16 07:55 train_list.txt
+drwxr-xr-x 1002 root root    24576 Mar  1 15:41 val
+-rw-r--r--    1 root root  2144499 May 16 07:56 val_list.txt
+-----------------------
+# train_list.txt has the following format
+train/n01440764/n01440764_10026.JPEG 0
+...
+
+# val_list.txt has the following format
+val/ILSVRC2012_val_00000001.JPEG 65
+-----------------------
+```
+
+## Step 3: Run RepVGG
+
+```bash
+# Make sure your dataset path is the same as above
+# OR
+# Modify the image_root of Train mode and Eval mode in the file: PaddleClas/ppcls/configs/ImageNet/RepVGG/RepVGG_A0.yaml
+
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+...
+...
+  Eval:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+
+cd PaddleClas
+# Link your dataset to default location
+ln -s /home/datasets/imagenet_jpeg/ ./dataset/ILSVRC2012
+export FLAGS_cudnn_exhaustive_search=True
+export FLAGS_cudnn_batchnorm_spatial_persistent=True
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+python3 -u -m paddle.distributed.launch --gpus=0,1,2,3,4,5,6,7 tools/train.py -c ppcls/configs/ImageNet/RepVGG/RepVGG_A0.yaml -o Arch.pretrained=False -o Global.device=gpu
+```
+
+| GPU         | FP32                                 |
+| ----------- | ------------------------------------ |
+| 8 cards     | Acc@1=0.6990                         |
diff --git a/cv/classification/resnest50/paddlepaddle/README.md b/cv/classification/resnest50/paddlepaddle/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..431924223c97978ab819d9d1099d8bf890cd16bb
--- /dev/null
+++ b/cv/classification/resnest50/paddlepaddle/README.md
@@ -0,0 +1,67 @@
+# ResNeSt50
+## Model description
+ResNeSt is a variant of ResNet that stacks Split-Attention blocks instead of standard residual blocks. The cardinal group representations are concatenated along the channel dimension. As in standard residual blocks, the final output of the Split-Attention block is produced using a shortcut connection.
+
+## Step 1: Installing
+
+```bash
+git clone --recursive https://github.com/PaddlePaddle/PaddleClas.git
+cd PaddleClas
+pip3 install -r requirements.txt
+```
+
+## Step 2: Download data
+
+Download the [ImageNet Dataset](https://www.image-net.org/download.php)
+
+```bash
+# IMAGENET PATH as follows:
+ls -al /home/datasets/imagenet_jpeg/
+total 52688
+drwxr-xr-x 1002 root root    24576 Mar  1 15:33 train
+-rw-r--r--    1 root root 43829433 May 16 07:55 train_list.txt
+drwxr-xr-x 1002 root root    24576 Mar  1 15:41 val
+-rw-r--r--    1 root root  2144499 May 16 07:56 val_list.txt
+-----------------------
+# train_list.txt has the following format
+train/n01440764/n01440764_10026.JPEG 0
+...
+
+# val_list.txt has the following format
+val/ILSVRC2012_val_00000001.JPEG 65
+-----------------------
+```
+
+## Step 3: Run ResNeSt50
+
+```bash
+# Make sure your dataset path is the same as above
+# OR
+# Modify the image_root of Train mode and Eval mode in the file: PaddleClas/ppcls/configs/ImageNet/ResNeSt/ResNeSt50.yaml
+
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+...
+...
+  Eval:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+
+cd PaddleClas
+# Link your dataset to default location
+ln -s /home/datasets/imagenet_jpeg/ ./dataset/ILSVRC2012
+export FLAGS_cudnn_exhaustive_search=True
+export FLAGS_cudnn_batchnorm_spatial_persistent=True
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -u -m paddle.distributed.launch --gpus=0,1,2,3 tools/train.py -c ppcls/configs/ImageNet/ResNeSt/ResNeSt50.yaml -o Arch.pretrained=False -o Global.device=gpu
+```
+
+| GPU         | FP32                                 |
+| ----------- | ------------------------------------ |
+| 8 cards     | Acc@1=0.7677                         |
diff --git a/cv/classification/swin_transformer/paddlepaddle/README.md b/cv/classification/swin_transformer/paddlepaddle/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..def17535ba792f87fe7ff3e0ce3ba0eb231a8d69
--- /dev/null
+++ b/cv/classification/swin_transformer/paddlepaddle/README.md
@@ -0,0 +1,67 @@
+# Swin-Transformer
+## Model description
+The Swin Transformer is a type of Vision Transformer. It builds hierarchical feature maps by merging image patches in deeper layers, and its computational complexity is linear in the input image size because self-attention is computed only within each local window. It can thus serve as a general-purpose backbone for both image classification and dense recognition tasks.
+
+## Step 1: Installing
+
+```bash
+git clone --recursive https://github.com/PaddlePaddle/PaddleClas.git
+cd PaddleClas
+pip3 install -r requirements.txt
+```
+
+## Step 2: Download data
+
+Download the [ImageNet Dataset](https://www.image-net.org/download.php)
+
+```bash
+# IMAGENET PATH as follows:
+ls -al /home/datasets/imagenet_jpeg/
+total 52688
+drwxr-xr-x 1002 root root    24576 Mar  1 15:33 train
+-rw-r--r--    1 root root 43829433 May 16 07:55 train_list.txt
+drwxr-xr-x 1002 root root    24576 Mar  1 15:41 val
+-rw-r--r--    1 root root  2144499 May 16 07:56 val_list.txt
+-----------------------
+# train_list.txt has the following format
+train/n01440764/n01440764_10026.JPEG 0
+...
+
+# val_list.txt has the following format
+val/ILSVRC2012_val_00000001.JPEG 65
+-----------------------
+```
+
+## Step 3: Run Swin-Transformer
+
+```bash
+# Make sure your dataset path is the same as above
+# OR
+# Modify the image_root of Train mode and Eval mode in the file: PaddleClas/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_tiny_patch4_window7_224.yaml
+
+DataLoader:
+  Train:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
+...
+...
+  Eval:
+    dataset:
+      name: ImageNetDataset
+      image_root: ./dataset/ILSVRC2012/
+      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
+
+cd PaddleClas
+# Link your dataset to default location
+ln -s /home/datasets/imagenet_jpeg/ ./dataset/ILSVRC2012
+export FLAGS_cudnn_exhaustive_search=True
+export FLAGS_cudnn_batchnorm_spatial_persistent=True
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+python3 -u -m paddle.distributed.launch --gpus=0,1,2,3,4,5,6,7 tools/train.py -c ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_tiny_patch4_window7_224.yaml -o Arch.pretrained=False -o Global.device=gpu
+```
+
+| GPU         | FP32                                 |
+| ----------- | ------------------------------------ |
+| 8 cards     | Acc@1=0.8024                         |
diff --git a/cv/semantic_segmentation/bisenetv2/paddlepaddle/README.md b/cv/semantic_segmentation/bisenetv2/paddlepaddle/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..60cf6a180f029549bbfa6f0463bc8570decbfde5
--- /dev/null
+++ b/cv/semantic_segmentation/bisenetv2/paddlepaddle/README.md
@@ -0,0 +1,67 @@
+# BiSeNetV2
+
+## Model description
+
+BiSeNet is a Bilateral Segmentation Network.
+It first designs a Spatial Path with a small stride to preserve spatial information and generate high-resolution features.
+Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain a sufficient receptive field.
+On top of the two paths, a Feature Fusion Module combines the features efficiently.
+
+## Step 1: Installing
+
+```bash
+git clone -b release/2.7 https://github.com/PaddlePaddle/PaddleSeg.git
+cd PaddleSeg
+pip3 install -r requirements.txt
+```
+
+## Step 2: Download data
+
+Download the [CityScapes Dataset](https://www.cityscapes-dataset.com/)
+
+```bash
+# Datasets preprocessing
+pip3 install cityscapesscripts
+
+python3 tools/convert_cityscapes.py --cityscapes_path /home/datasets/cityscapes/ --num_workers 8
+
+python3 tools/create_dataset_list.py /home/datasets/cityscapes --type cityscapes --separator ","
+# CityScapes PATH as follows:
+ls -al /home/datasets/cityscapes/
+total 11567948
+drwxr-xr-x 4 root root         227 Jul 18 03:32 .
+drwxr-xr-x 6 root root         179 Jul 18 06:48 ..
+-rw-r--r-- 1 root root         298 Feb 20  2016 README
+drwxr-xr-x 5 root root          58 Jul 18 03:30 gtFine
+-rw-r--r-- 1 root root   252567705 Jul 18 03:22 gtFine_trainvaltest.zip
+drwxr-xr-x 5 root root          58 Jul 18 03:30 leftImg8bit
+-rw-r--r-- 1 root root 11592327197 Jul 18 03:27 leftImg8bit_trainvaltest.zip
+-rw-r--r-- 1 root root        1646 Feb 17  2016 license.txt
+-rw-r--r-- 1 root root      193690 Jul 18 03:32 test.txt
+-rw-r--r-- 1 root root      398780 Jul 18 03:32 train.txt
+-rw-r--r-- 1 root root       65900 Jul 18 03:32 val.txt
+```
+
+## Step 3: Run BiSeNetV2
+
+```bash
+# Make sure your dataset path is the same as above
+data_dir=${data_dir:-/home/datasets/cityscapes/}
+sed -i "s#: data/cityscapes#: ${data_dir}#g" configs/_base_/cityscapes.yml
+export FLAGS_cudnn_exhaustive_search=True
+export FLAGS_cudnn_batchnorm_spatial_persistent=True
+# One GPU
+export CUDA_VISIBLE_DEVICES=0
+python3 train.py --config configs/bisenet/bisenet_cityscapes_1024x1024_160k.yml --do_eval --use_vdl --save_interval 500 --save_dir output
+
+# Four GPUs
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -u -m paddle.distributed.launch --gpus 0,1,2,3 train.py \
+    --config configs/bisenet/bisenet_cityscapes_1024x1024_160k.yml \
+    --do_eval \
+    --use_vdl
+```
+
+| GPU         | FP32                                 |
+| ----------- | ------------------------------------ |
+| 8 cards     | mIoU=73.45%                          |
\ No newline at end of file
diff --git a/cv/semantic_segmentation/deeplabv3plus/paddlepaddle/README.md b/cv/semantic_segmentation/deeplabv3plus/paddlepaddle/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a00aa6c524fae7512ed8a5c2147fd06a8769f285
--- /dev/null
+++ b/cv/semantic_segmentation/deeplabv3plus/paddlepaddle/README.md
@@ -0,0 +1,65 @@
+# DeepLabV3+
+
+## Model description
+
+DeepLabV3+ builds on DeepLabv3, a semantic segmentation architecture that improves upon DeepLabv2 with several modifications, and adds a decoder module to refine the segmentation results.
+To handle the problem of segmenting objects at multiple scales, it uses modules that employ atrous convolution in cascade or in parallel, capturing multi-scale context by adopting multiple atrous rates.
+
+## Step 1: Installing
+
+```bash
+git clone -b release/2.7 https://github.com/PaddlePaddle/PaddleSeg.git
+cd PaddleSeg
+pip3 install -r requirements.txt
+```
+
+## Step 2: Download data
+
+Download the [CityScapes Dataset](https://www.cityscapes-dataset.com/)
+
+```bash
+# Datasets preprocessing
+pip3 install cityscapesscripts
+
+python3 tools/convert_cityscapes.py --cityscapes_path /home/datasets/cityscapes/ --num_workers 8
+
+python3 tools/create_dataset_list.py /home/datasets/cityscapes --type cityscapes --separator ","
+# CityScapes PATH as follows:
+ls -al /home/datasets/cityscapes/
+total 11567948
+drwxr-xr-x 4 root root         227 Jul 18 03:32 .
+drwxr-xr-x 6 root root         179 Jul 18 06:48 ..
+-rw-r--r-- 1 root root         298 Feb 20  2016 README
+drwxr-xr-x 5 root root          58 Jul 18 03:30 gtFine
+-rw-r--r-- 1 root root   252567705 Jul 18 03:22 gtFine_trainvaltest.zip
+drwxr-xr-x 5 root root          58 Jul 18 03:30 leftImg8bit
+-rw-r--r-- 1 root root 11592327197 Jul 18 03:27 leftImg8bit_trainvaltest.zip
+-rw-r--r-- 1 root root        1646 Feb 17  2016 license.txt
+-rw-r--r-- 1 root root      193690 Jul 18 03:32 test.txt
+-rw-r--r-- 1 root root      398780 Jul 18 03:32 train.txt
+-rw-r--r-- 1 root root       65900 Jul 18 03:32 val.txt
+```
+
+## Step 3: Run DeepLabV3+
+
+```bash
+# Make sure your dataset path is the same as above
+data_dir=${data_dir:-/home/datasets/cityscapes/}
+sed -i "s#: data/cityscapes#: ${data_dir}#g" configs/_base_/cityscapes.yml
+export FLAGS_cudnn_exhaustive_search=True
+export FLAGS_cudnn_batchnorm_spatial_persistent=True
+# One GPU
+export CUDA_VISIBLE_DEVICES=0
+python3 train.py --config configs/deeplabv3p/deeplabv3p_resnet50_os8_cityscapes_1024x512_80k.yml --do_eval --use_vdl --save_interval 500 --save_dir output
+
+# Four GPUs
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -u -m paddle.distributed.launch --gpus 0,1,2,3 train.py \
+    --config configs/deeplabv3p/deeplabv3p_resnet50_os8_cityscapes_1024x512_80k.yml \
+    --do_eval \
+    --use_vdl
+```
+
+| GPU         | FP32                                 |
+| ----------- | ------------------------------------ |
+| 8 cards     | mIoU=80.42%                          |
\ No newline at end of file
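
Note on the ImageNet READMEs in this patch: they show the `train_list.txt` format (a path relative to the dataset root, a space, then an integer label) but not how the file is produced. The following is a minimal sketch of one way to generate those lines. The function name `build_train_list` is hypothetical, and the label assignment (alphabetical order of the synset directory names) is an assumption that the READMEs do not state, so check it against the label mapping your training config expects.

```python
from pathlib import Path


def build_train_list(image_root: str) -> list[str]:
    """Return "relative/path.JPEG label" lines for images under <image_root>/train.

    Assumption (not taken from the READMEs): class indices follow the
    alphabetical order of the synset directory names.
    """
    train_dir = Path(image_root) / "train"
    # One sub-directory per class, e.g. train/n01440764/
    classes = sorted(d.name for d in train_dir.iterdir() if d.is_dir())
    lines = []
    for label, synset in enumerate(classes):
        for image in sorted((train_dir / synset).glob("*.JPEG")):
            # Paths are written relative to image_root, with forward slashes.
            rel_path = image.relative_to(image_root).as_posix()
            lines.append(f"{rel_path} {label}")
    return lines
```

Writing the result is then one line, for example `Path(root, "train_list.txt").write_text("\n".join(build_train_list(root)) + "\n")`. `val_list.txt` is not covered here, because the label in the example entry shown in the READMEs cannot be derived from the file path.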