diff --git a/README.md b/README.md index 44ef3d793e93a64576eb888683e8909de0a578b8..c06100e3689b05fdc20429face34badce30cc15c 100644 --- a/README.md +++ b/README.md @@ -5,23 +5,23 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 ## 模型列表 - Computer Vision - - - [Classification](#classification) - - [Face Detection](#face-detection) - - [Face Recognition](#face-recognition) - - [Instance Segmentation](#instance-segmentation) - - [Knowledge Distillation](#knowledge-distillation) - - [Network Pruning](#network-pruning) - - [Object Detection](#object-detection) - - [3D Object Detection](#3d-object-detection) - - [OCR](#ocr) - - [Point Cloud](#point-cloud) - - [Pose Estimation](#pose-estimation) - - [Self-Supervised Learning](#self-supervised-learning) - - [Semantic Segmentation](#semantic-segmentation) - - [Super Resolution](#super-resolution) - - [Tracking](#tracking) - - [Traffic Forecast](#traffic-forecast) + + - [Classification](#classification) + - [Face Detection](#face-detection) + - [Face Recognition](#face-recognition) + - [Instance Segmentation](#instance-segmentation) + - [Knowledge Distillation](#knowledge-distillation) + - [Network Pruning](#network-pruning) + - [Object Detection](#object-detection) + - [3D Object Detection](#3d-object-detection) + - [OCR](#ocr) + - [Point Cloud](#point-cloud) + - [Pose Estimation](#pose-estimation) + - [Self-Supervised Learning](#self-supervised-learning) + - [Semantic Segmentation](#semantic-segmentation) + - [Super Resolution](#super-resolution) + - [Tracking](#tracking) + - [Traffic Forecast](#traffic-forecast) - Graph Neural Network (GNN) @@ -37,36 +37,35 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 - Natural Language Processing (NLP) - - [Cloze Test](#cloze-test) - - [Dialogue Generation](#dialogue-generation) - - [Language Modeling](#language-modeling) - - [Large Language Model (LLM)](#large-language-model-llm) - - [Text Correction](#text-correction) - - [Translation](#translation) + - [Cloze Test](#cloze-test) + - [Dialogue 
Generation](#dialogue-generation) + - [Language Modeling](#language-modeling) + - [Large Language Model (LLM)](#large-language-model-llm) + - [Text Correction](#text-correction) + - [Translation](#translation) - Recommendation - - [Collaborative Filtering](#collaborative-filtering) - - [Click Through Rate](#click-through-rate) + - [Collaborative Filtering](#collaborative-filtering) + - [Click Through Rate](#click-through-rate) - [Reinforcement Learning](#reinforcement-learning) - Speech - - [Speech Recognition](#speech-recognition) - - [Speech Synthesis](#speech-synthesis) + - [Speech Recognition](#speech-recognition) + - [Speech Synthesis](#speech-synthesis) - [3D Reconstruction](#3d-reconstruction) - -------- ### Computer Vision #### Classification -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [ACmix](cv/classification/acmix/pytorch/README.md) | PyTorch | ImageNet [ACNet](cv/classification/acnet/pytorch/README.md) | PyTorch | ImageNet [AlexNet](cv/classification/alexnet/pytorch/README.md) | PyTorch | ImageNet @@ -75,12 +74,12 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 [CBAM](cv/classification/cbam/pytorch/README.md)  | PyTorch | ImageNet [ConvNext](cv/classification/convnext/pytorch/README.md) | PyTorch | ImageNet [CspDarknet53](cv/classification/cspdarknet53/pytorch/README.md)  | PyTorch | ImageNet -[DenseNet121](cv/classification/densenet/paddlepaddle/README.md) | PaddlePaddle | ImageNet -[DenseNet201](cv/classification/densenet/pytorch/README.md) | PyTorch | ImageNet +[DenseNet](cv/classification/densenet/paddlepaddle/README.md) | PaddlePaddle | ImageNet +[DenseNet](cv/classification/densenet/pytorch/README.md) | PyTorch | ImageNet [DPN92](cv/classification/dpn92/pytorch/README.md) | PyTorch | ImageNet [DPN107](cv/classification/dpn107/pytorch/README.md) | PyTorch | ImageNet -[ECA_MobileNet_V2](cv/classification/eca_mobilenet_v2/pytorch/README.md) | PyTorch | ImageNet 
-[ECA_RESNET152](cv/classification/eca_resnet152/pytorch/README.md) | PyTorch | ImageNet +[ECA-MobileNetV2](cv/classification/eca_mobilenet_v2/pytorch/README.md) | PyTorch | ImageNet +[ECA-ResNet152](cv/classification/eca_resnet152/pytorch/README.md) | PyTorch | ImageNet [Efficientb4](cv/classification/efficientb4/pytorch/README.md) | PyTorch | ImageNet [EfficientNetB0](cv/classification/efficientnet_b0/paddlepaddle/README.md) | PaddlePaddle | ImageNet [FasterNet](cv/classification/fasternet/pytorch/README.md)  | PyTorch | ImageNet @@ -136,14 +135,14 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Face Detection -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [RetinaFace](cv/face/retinaface/pytorch/README.md) | PyTorch | WiderFace #### Face Recognition -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [ArcFace](cv/face/arcface/pytorch/README.md)  | PyTorch | CASIA-WebFaces&LFW [BlazeFace](cv/face/blazeface/paddlepaddle/README.md)  | PaddlePaddle | WIDER-FACE [CosFace](cv/face/cosface/pytorch/README.md)  | PyTorch | CASIA-WebFaces&LFW @@ -152,8 +151,8 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Instance Segmentation -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [SOLO](cv/instance_segmentation/SOLO/pytorch/README.md) | PyTorch | COCO [SOLOv2](cv/detection/solov2/paddlepaddle/README.md) | PaddlePaddle | COCO [SOLOv2](cv/instance_segmentation/solov2/pytorch/README.md) | PyTorch | COCO @@ -161,29 +160,29 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Image Generation -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [DCGAN](cv/image_generation/dcgan/mindspore/README.md) | MindSpore | ImageNet [Pix2Pix](cv/image_generation/Pix2pix/paddlepaddle/README.md) | PaddlePaddle | facades #### Knowledge Distillation -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- 
[CWD](cv/distiller/CWD/pytorch/README.md)  | PyTorch | Cityscapes [RKD](cv/distiller/RKD/pytorch/README.md)  | PyTorch | CUB-200-2011 [WSLD](cv/distiller/WSLD/pytorch/README.md) | PyTorch | ImageNet #### Network Pruning -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [Network Slimming](cv/Pruning/Network-Slimming/pytorch/README.md)  | PyTorch | CIFAR-10/100 #### Object Detection -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [ATSS](cv/detection/atss_mmdet/pytorch/README.md)  | PyTorch (MMDetection) | COCO [AutoAssign](cv/detection/autoassign/pytorch/README.md) | PyTorch | COCO [Cascade R-CNN](cv/detection/cascade_rcnn_mmdet/pytorch/README.md)  | PyTorch (MMDetection) | COCO @@ -226,8 +225,8 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### 3D Object Detection -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [BEVFormer](cv/3d_detection/BEVFormer/pytorch/README.md) | PyTorch | nuScenes&CAN bus [CenterPoint](cv/3d_detection/centerpoint/pytorch/README.md) | PyTorch | nuScenes [PAConv](cv/3d_detection/PAConv/pytorch/README.md) | PyTorch | S3DIS @@ -242,8 +241,8 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### OCR -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [CRNN](cv/ocr/crnn/mindspore/README.md) | MindSpore | OCR_Recog [CRNN](cv/ocr/crnn/paddlepaddle/README.md) | PaddlePaddle | LMDB [DBNet](cv/ocr/dbnet/pytorch/README.md) | PyTorch | ICDAR2015 @@ -258,14 +257,14 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Point Cloud -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [Point-BERT](cv/point_cloud/Point-BERT/pytorch/README.md) | PyTorch | ShapeNet55 & processed ModelNet #### Pose Estimation -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [AlphaPose](cv/pose/alphapose/pytorch/README.md)  | PyTorch | COCO [HRNet](cv/pose/hrnet/pytorch/README.md) | PyTorch | 
COCO [HRNet-W32](cv/pose/hrnet/paddlepaddle/README.md) | PaddlePaddle | COCO @@ -273,14 +272,14 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Self-Supervised Learning -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [MAE](cv/self_supervised_learning/MAE/pytorch/README.md)  | PyTorch | ImageNet #### Semantic Segmentation -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [3D-UNet](cv/semantic_segmentation/unet3d/pytorch/README.md) | PyTorch | kits19 [APCNet](cv/semantic_segmentation/apcnet/pytorch/README.md) | PyTorch | Cityscapes [Attention U-net](cv/semantic_segmentation/att_unet/pytorch/README.md)  | PyTorch | Cityscapes @@ -334,11 +333,11 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Super Resolution -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [basicVSR++](cv/super_resolution/basicVSR++/pytorch/README.md) | PyTorch | REDS [basicVSR](cv/super_resolution/basicVSR/pytorch/README.md) | PyTorch | REDS -[ESRGAN](cv/super_resolution/esrgan/pytorch/README.md) | PyTorch | DIV2K +[ESRGAN](cv/super_resolution/esrgan/pytorch/README.md) | PyTorch | DIV2K [LIIF](cv/super_resolution/liif/pytorch/README.md) | PyTorch | DIV2K [RealBasicVSR](cv/super_resolution/real_basicVSR/pytorch/README.md) | PyTorch | REDS [TTSR](cv/super_resolution/ttsr/pytorch/README.md) | PyTorch | CUFED @@ -346,51 +345,49 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Tracking -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [ByteTrack](cv/tracking/bytetrack/paddlepaddle/README.md) | PaddlePaddle | MOT17 [FairMOT](cv/tracking/fairmot/pytorch/README.md) | PyTorch | MOT17 #### Traffic Forecast -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [Graph WaveNet](cv/traffic_forecast/graph_wavenet/pytorch/README.md) | PyTorch | METR-LA & PEMS-BAY ### GNN #### Graph Attention -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 
+-------- | ------ | ---- [GAT](gnn/graph_attention/gat/paddlepaddle/README.md) | PaddlePaddle | CORA #### Node Classification -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [GraphSAGE](gnn/node_classification/graphsage/paddlepaddle/README.md) | PaddlePaddle | Reddit #### Text Classification -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [GCN](gnn/text_classification/GCN/mindspore/README.md) | MindSpore | CORA & Citeseer [GCN](gnn/text_classification/GCN/paddlepaddle/README.md) | PaddlePaddle | CORA & PubMed & Citeseer - ### HPC #### Molecular Dynamics -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [Water/se_e2_a](hpc/molecular_dynamics/water_se_e2_a/tensorflow/README.md) | TensorFlow (DeePMD-kit) | data_water - ### Multimodal -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [BLIP](multimodal/BLIP/pytorch/README.md) | PyTorch | COCO [CLIP](multimodal/Language-Image_Pre-Training/clip/pytorch/README.md) | PyTorch | CIFAR100 [ControlNet](multimodal/diffusion/ControlNet/README.md) | PyTorch | Fill50K @@ -402,20 +399,20 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Cloze Test -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [GLM](nlp/cloze_test/glm/pytorch/GLMForMultiTokenCloze/README.md) | PyTorch | GLMForMultiTokenCloze #### Dialogue Generation -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [CPM](nlp/dialogue_generation/cpm/pytorch/README.md) | PyTorch | STC #### Language Modeling -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [BART](nlp/language_model/bart_fairseq/pytorch/README.md)  | PyTorch (Fairseq) | RTE [BERT NER](nlp/ner/bert/pytorch/README.md) | PyTorch | CoNLL-2003 [BERT Pretraining](nlp/language_model/bert/pytorch/README.md) | PyTorch | MLCommon Wikipedia (2048_shards_uncompressed) @@ -431,8 
+428,9 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Large Language Model (LLM) -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- +[Bloom-7B1](nlp/llm/bloom-7b1/firefly/README.md) | PyTorch (Firefly) | school_math_0.25M & bloom-7b1 [ChatGLM-6B](nlp/llm/chatglm-6b/deepspeed/README.md) | PyTorch (DeepSpeed) | ADGEN & chatglm-6b [ChatGLM2-6B SFT](nlp/llm/ChatGLM2-6b-sft/README.md) | PyTorch | ADGEN & chatglm2-6b [LLaMA-7B](nlp/llm/llama-7b/colossalai/README.md) | PyTorch (Colossal-AI) | llama-7b-hf @@ -440,17 +438,18 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 [Llama-2-7B Reward Model Finetuning](nlp/llm/llama2-7b_reward_sft/deepspeed/README.md) | PyTorch (DeepSpeed) | Dahoas/rm-static [Llama-7B RLHF](nlp/llm/llama2-7b_rlhf/megatron-deepspeed/README.md) | PyTorch (Megatron-DeepSpeed) | llama2-7b&tiny-llama [Llama-2-7B SFT](nlp/llm/llama2-7b_sft/megatron-deepspeed/README.md) | PyTorch (Megatron-DeepSpeed) | gpt_small-117M +[QWen-7B](nlp/llm/qwen-7b/firefly/README.md) | PyTorch (Firefly) | qwen-7b #### Text Correction -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [Ernie](nlp/text_correction/ernie/paddlepaddle/README.md) | PaddlePaddle | corpus #### Translation -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [Convolutional](nlp/translation/convolutional_fairseq/pytorch/README.md)  | PyTorch (Fairseq) | WMT14 [T5](nlp/translation/t5/pytorch/README.md) | PyTorch | wmt14-en-de-pre-processed [Transformer](nlp/translation/transformer/paddlepaddle/README.md) | PaddlePaddle | wmt14-en-de-pre-processed @@ -460,14 +459,14 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Collaborative Filtering -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [NCF](recommendation/collaborative_filtering/ncf/pytorch/README.md) | PyTorch | movielens #### Click Through Rate -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- 
[DLRM](recommendation/ctr/dlrm/pytorch/README.md) | PyTorch | Criteo_Terabyte [DLRM](recommendation/ctr/dlrm/paddlepaddle/README.md) | PaddlePaddle | Criteo_Terabyte [FFM](recommendation/ctr/ffm/paddlepaddle/README.md) | PaddlePaddle | Criteo_Terabyte @@ -475,20 +474,18 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 [Wide&Deep](recommendation/ctr/wide_deep/paddlepaddle/README.md) | PaddlePaddle | Criteo_Terabyte [xDeepFM](recommendation/ctr/xdeepfm/paddlepaddle/README.md) | PaddlePaddle | Criteo_Terabyte +### Reinforcement Learning -### Reinforement Learning - -模型名称 | 框架 | 数据集 --------- | ------ | ---- -[DQN](Reinforement_Learning/DQN/paddlepaddle/README.md) | PaddlePaddle | CartPole-v0 - +模型名称 | 框架 | 数据集 +-------- | ------ | ---- +[DQN](reinforcement_learning/DQN/paddlepaddle/README.md) | PaddlePaddle | CartPole-v0 ### Speech #### Speech Recognition -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [Conformer](speech/speech_recognition/conformer_wenet/pytorch/README.md) | PyTorch (WeNet) | AISHELL [Efficient Conformer v2](speech/speech_recognition/efficient_conformer_v2_wenet/pytorch/README.md) | PyTorch (WeNet) | AISHELL [PP-ASR-Conformer](speech/speech_recognition/conformer/paddlepaddle/README.md) | PaddlePaddle | AISHELL @@ -499,8 +496,8 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 #### Speech Synthesis -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [PP-TTS-FastSpeech2](speech/speech_synthesis/fastspeech2/paddlepaddle/README.md) | PaddlePaddle | CSMSC [PP-TTS-HiFiGAN](speech/speech_synthesis/hifigan/paddlepaddle/README.md) | PaddlePaddle | CSMSC [Tacotron2](speech/speech_synthesis/tacotron2/pytorch/README.md) | PyTorch | LJSpeech @@ -509,17 +506,17 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 ### 3D Reconstruction -模型名称 | 框架 | 数据集 --------- | ------ | ---- +模型名称 | 框架 | 数据集 +-------- | ------ | ---- [HashNeRF](3d-reconstruction/ngp-nerf/pytorch/README.md) | PyTorch | fox -------- +-------- ## 容器镜像构建方式 
社区用户可参考[容器镜像构建说明](docker/Iluvatar/README.md)在本地构建出能够运行DeepSparkHub仓库中模型的容器镜像。 -------- +-------- ## 社区 @@ -538,4 +535,3 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 ## 许可证 本项目许可证遵循[Apache-2.0](LICENSE)。 - diff --git a/cv/classification/densenet/paddlepaddle/README.md b/cv/classification/densenet/paddlepaddle/README.md index b662155655f52890ddcb5287d18bcd83e1c29ea9..ea27049f87bc8523bfa2452bf2c3460a18788183 100644 --- a/cv/classification/densenet/paddlepaddle/README.md +++ b/cv/classification/densenet/paddlepaddle/README.md @@ -1,8 +1,11 @@ # DenseNet + ## Model description + A DenseNet is a type of convolutional neural network that utilises dense connections between layers, through Dense Blocks, where we connect all layers (with matching feature-map sizes) directly with each other. To preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all subsequent layers. ## Step 1: Installation + ```bash git clone --recursive https://github.com/PaddlePaddle/PaddleClas.git @@ -18,6 +21,7 @@ python3 setup.py install ``` ## Step 2: Preparing Datasets + Sign up and login in [ImageNet official website](https://www.image-net.org/index.php), then choose 'Download' to download the whole ImageNet dataset. Specify `/path/to/imagenet` to your ImageNet path in later training process. 
The ImageNet dataset path structure should look like: @@ -58,4 +62,5 @@ python3 -u -m paddle.distributed.launch --gpus=0,1,2,3 tools/train.py -c ppcls/c | BI-V100 x 4 | 0.757 | 0.925 | 171 | ## Reference + - [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) diff --git a/cv/classification/densenet/pytorch/README.md b/cv/classification/densenet/pytorch/README.md index c2c13fee809e2899a864cd528ff3d5e0199ab357..ed15508c370301032764a7f812b440c8c6a69157 100755 --- a/cv/classification/densenet/pytorch/README.md +++ b/cv/classification/densenet/pytorch/README.md @@ -1,8 +1,11 @@ # DenseNet + ## Model description + A DenseNet is a type of convolutional neural network that utilises dense connections between layers, through Dense Blocks, where we connect all layers (with matching feature-map sizes) directly with each other. To preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all subsequent layers. ## Step 1: Installing + ```bash pip install torch torchvision ``` @@ -26,14 +29,19 @@ imagenet ``` ## Step 2: Training + ### One single GPU + ```bash python3 train.py --data-path /path/to/imagenet --model densenet201 --batch-size 128 ``` + ### Multiple GPUs on one machine + ```bash python3 -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --data-path /path/to/imagenet --model densenet201 --batch-size 128 ``` ## Reference -https://github.com/pytorch/vision/blob/main/torchvision/models/densenet.py + +[densenet](https://github.com/pytorch/vision/blob/main/torchvision/models/densenet.py) diff --git a/cv/classification/eca_mobilenet_v2/pytorch/README.md b/cv/classification/eca_mobilenet_v2/pytorch/README.md index 40fa245303c3349de1feefd506a31c0121d81c32..05f2dcd120f04a57bd23d82fe70bf9794c063805 100644 --- a/cv/classification/eca_mobilenet_v2/pytorch/README.md +++ b/cv/classification/eca_mobilenet_v2/pytorch/README.md @@ -1,8 +1,11 @@ -# ECA_MobileNet_V2 +# ECA MobileNet V2 + ## 
Model description + An ECA-Net is a type of convolutional neural network that utilises an Efficient Channel Attention module. ## Step 1: Installing + ```bash pip3 install -r requirements.txt ``` @@ -25,15 +28,16 @@ imagenet └── val_list.txt ``` - ## Step 2: Training + ### Multiple GPUs on one machine (AMP) + Set data path by `export DATA_PATH=/path/to/imagenet`. The following command uses all cards to train: ```bash bash train_eca_mobilenet_v2_amp_dist.sh ``` - ## Reference + - [torchvision](https://github.com/pytorch/vision/tree/main/references/classification) diff --git a/cv/classification/eca_resnet152/pytorch/README.md b/cv/classification/eca_resnet152/pytorch/README.md index 20620487cefe86d2487ad4186412dfa9b840f289..53f461652c97be54144b49c18c253cfc342d2df7 100644 --- a/cv/classification/eca_resnet152/pytorch/README.md +++ b/cv/classification/eca_resnet152/pytorch/README.md @@ -1,8 +1,11 @@ -# ECA_RESNET152 +# ECA ResNet152 + ## Model description + An ECA-Net is a type of convolutional neural network that utilises an Efficient Channel Attention module. ## Step 1: Installing + ```bash pip3 install -r requirements.txt ``` @@ -25,16 +28,16 @@ imagenet └── val_list.txt ``` - ## Step 2: Training + ### Multiple GPUs on one machine (AMP) + Set data path by `export DATA_PATH=/path/to/imagenet`. 
The following command uses all cards to train: ```bash bash train_eca_resnet152_amp_dist.sh ``` - - ## Reference + - [torchvision](https://github.com/pytorch/vision/tree/main/references/classification) diff --git a/nlp/llm/bloom-7b1/firefly/README.md b/nlp/llm/bloom-7b1/firefly/README.md index 61d2bdc56d81459e06d13eb0a87c27ac6fd378f2..d662c6ef175d94566b0d6d2c669d5c558298344e 100755 --- a/nlp/llm/bloom-7b1/firefly/README.md +++ b/nlp/llm/bloom-7b1/firefly/README.md @@ -1,46 +1,44 @@ -# Bloom 7B1 +# Bloom-7B1 ## Model description BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn't been explicitly trained for, by casting them as text generation tasks. +## Step 1: Installation -## Step 1: Preparing datasets - +```bash +# install firefly +pushd /toolbox/firefly +pip3 install -r requirements.txt +python3 setup.py develop +popd ``` + +## Step 2: Preparing datasets and checkpoints + +```bash mkdir -p data && cd data # you can download dataset from huggingface, website here: https://huggingface.co/datasets/BelleGroup/school_math_0.25M ``` -## Step 2: Preparing checkpoint - -``` +```bash mkdir -p checkpoint && cd checkpoint # you can download weights from hugginface, website here: https://huggingface.co/bigscience/bloom-7b1 ``` -## Step 3: installation -``` -cd firefly -bash build_firefly.sh && bash install firefly -``` - -## Step 4: Training +## Step 3: Training - -``` +```bash +# how to train bash train.sh {num_gpus} {config_file} {train_type} -``` -for example train with full sft -``` + +# train with sft full bash train.sh 16 configs/bloom-sft-full.json full -``` -for example train with qlora -``` + +# train with qlora bash train.sh 1 
configs/bloom-sft-qlora.json qlora ``` - ## Results | No. | model | peft | num_gpus |train_samples_per_second | train_steps_per_second | @@ -48,8 +46,6 @@ bash train.sh 1 configs/bloom-sft-qlora.json qlora | 1 | bloom-7B1 | QLoRA | 1 | 2.041 | 0.128 | | 2 | bloom-7B1 | Full sft | 16 | 4.587 | 0.072 | - ## Reference - [Firefly](https://github.com/yangjianxin1/Firefly) - diff --git a/nlp/llm/qwen-7b/firefly/README.md b/nlp/llm/qwen-7b/firefly/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a84d62afd626f4ef69111d51a0b5f4fc10f3a15b --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/README.md @@ -0,0 +1,51 @@ +# Qwen-7B + +## Model description + +Qwen-7B is the 7B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. + +## Step 1: Installation + +```bash +# install firefly +pushd /toolbox/firefly +pip3 install -r requirements.txt +python3 setup.py develop +popd +``` + +## Step 2: Preparing datasets and checkpoints + +```bash +pip install modelscope +python3 ./get_Qwen-7B.py +mkdir -p /home/model_zoo/nlp +mv /root/.cache/modelscope/hub/qwen/Qwen-7B /home/model_zoo/nlp +``` + +## Step 3: Training + +```bash +# how to train + +# train with sft full +bash train.sh 16 configs/qwen-7b-sft-full.json full + +# train with Lora +bash train.sh 1 configs/qwen-7b-sft-lora.json lora + +# train with Ptuning-V2 +bash train.sh 1 configs/qwen-7b-sft-ptuning_v2.json ptuning_v2 +``` + +## Results + +| No. 
| model | peft | num_gpus | train_samples_per_second | +| ---- | --------- | ----------- | ------------------ | ---------------------- | +| 1 | qwen-7B | Full sft | 16 | 12.430 | +| 2 | qwen-7B | LoRA | 1 | 3.409 | +| 3 | qwen-7B | Ptuning_V2 | 1 | 4.827 | + +## Reference + +- [Firefly](https://github.com/yangjianxin1/Firefly) diff --git a/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config.json b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config.json new file mode 100644 index 0000000000000000000000000000000000000000..914e1368ed2bab7f94b3cbf4a20b225d504207ab --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config.json @@ -0,0 +1,41 @@ +{ + "gradient_accumulation_steps": "auto", + "gradient_clipping": "auto", + "steps_per_print": 200, + "train_batch_size": "auto", + "train_micro_batch_size_per_gpu": "auto", + "wall_clock_breakdown": false, + + "optimizer": { + "type": "Adam", + "params": { + "lr": "auto", + "betas": "auto", + "eps": "auto", + "weight_decay": "auto" + } + }, + "fp16": { + "enabled": "auto", + "loss_scale": 0, + "loss_scale_window": 1000, + "initial_scale_power": 16, + "hysteresis": 2, + "min_loss_scale": 1 + }, + "zero_optimization": { + "stage": 2, + "allgather_partitions": true, + "allgather_bucket_size": 5e8, + "reduce_scatter": true, + "reduce_bucket_size": 5e8 + }, + "scheduler": { + "type": "WarmupLR", + "params": { + "warmup_min_lr": "auto", + "warmup_max_lr": "auto", + "warmup_num_steps": "auto" + } + } +} \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config_bf16.json b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config_bf16.json new file mode 100644 index 0000000000000000000000000000000000000000..7bf2f8567e6433020a3cf924e2c450c89fa3a5f9 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config_bf16.json @@ -0,0 +1,36 @@ +{ + "gradient_accumulation_steps": "auto", + "gradient_clipping": "auto", + "steps_per_print": 200, + "train_batch_size": "auto", + 
"train_micro_batch_size_per_gpu": "auto", + "wall_clock_breakdown": false, + + "optimizer": { + "type": "Adam", + "params": { + "lr": "auto", + "betas": "auto", + "eps": "auto", + "weight_decay": "auto" + } + }, + "bf16": { + "enabled": true + }, + "zero_optimization": { + "stage": 2, + "allgather_partitions": true, + "allgather_bucket_size": 5e8, + "reduce_scatter": true, + "reduce_bucket_size": 5e8 + }, + "scheduler": { + "type": "WarmupLR", + "params": { + "warmup_min_lr": "auto", + "warmup_max_lr": "auto", + "warmup_num_steps": "auto" + } + } +} \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config_offload.json b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config_offload.json new file mode 100644 index 0000000000000000000000000000000000000000..e752e410e80fd4a9dd0ffe6809a951228d89c97d --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z2_config_offload.json @@ -0,0 +1,49 @@ +{ + "gradient_accumulation_steps": "auto", + "gradient_clipping": "auto", + "steps_per_print": 200, + "train_batch_size": "auto", + "train_micro_batch_size_per_gpu": "auto", + "wall_clock_breakdown": false, + + "optimizer": { + "type": "Adam", + "params": { + "lr": "auto", + "betas": "auto", + "eps": "auto", + "weight_decay": "auto" + } + }, + "fp16": { + "enabled": "auto", + "loss_scale": 0, + "loss_scale_window": 1000, + "initial_scale_power": 16, + "hysteresis": 2, + "min_loss_scale": 1 + }, + "zero_optimization": { + "stage": 2, + "allgather_partitions": true, + "allgather_bucket_size": 5e8, + "reduce_scatter": true, + "reduce_bucket_size": 5e8, + "offload_optimizer": { + "device": "cpu", + "pin_memory": true + }, + "offload_param": { + "device": "cpu", + "pin_memory": true + } + }, + "scheduler": { + "type": "WarmupLR", + "params": { + "warmup_min_lr": "auto", + "warmup_max_lr": "auto", + "warmup_num_steps": "auto" + } + } +} \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config.json 
b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config.json new file mode 100644 index 0000000000000000000000000000000000000000..18a0cee39ebc393d111e58c79579b7936301a4e6 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config.json @@ -0,0 +1,46 @@ +{ + "gradient_accumulation_steps": "auto", + "gradient_clipping": "auto", + "steps_per_print": 200, + "train_batch_size": "auto", + "train_micro_batch_size_per_gpu": "auto", + "wall_clock_breakdown": false, + + "optimizer": { + "type": "Adam", + "params": { + "lr": "auto", + "betas": "auto", + "eps": "auto", + "weight_decay": "auto" + } + }, + "fp16": { + "enabled": "auto", + "loss_scale": 0, + "loss_scale_window": 1000, + "initial_scale_power": 16, + "hysteresis": 2, + "min_loss_scale": 1 + }, + "zero_optimization": { + "stage": 3, + "overlap_comm": true, + "contiguous_gradients": true, + "sub_group_size": 1e9, + "reduce_bucket_size": "auto", + "stage3_prefetch_bucket_size": "auto", + "stage3_param_persistence_threshold": "auto", + "stage3_max_live_parameters": 1e9, + "stage3_max_reuse_distance": 1e9, + "stage3_gather_16bit_weights_on_model_save": true + }, + "scheduler": { + "type": "WarmupLR", + "params": { + "warmup_min_lr": "auto", + "warmup_max_lr": "auto", + "warmup_num_steps": "auto" + } + } +} \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config_bf16.json b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config_bf16.json new file mode 100644 index 0000000000000000000000000000000000000000..b113d621a8678cc6c3dd674b667dc5d5da3d33d0 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config_bf16.json @@ -0,0 +1,41 @@ +{ + "gradient_accumulation_steps": "auto", + "gradient_clipping": "auto", + "steps_per_print": 200, + "train_batch_size": "auto", + "train_micro_batch_size_per_gpu": "auto", + "wall_clock_breakdown": false, + + "optimizer": { + "type": "Adam", + "params": { + "lr": "auto", + "betas": "auto", + "eps": "auto", + 
"weight_decay": "auto" + } + }, + "bf16": { + "enabled": true + }, + "zero_optimization": { + "stage": 3, + "overlap_comm": true, + "contiguous_gradients": true, + "sub_group_size": 1e9, + "reduce_bucket_size": "auto", + "stage3_prefetch_bucket_size": "auto", + "stage3_param_persistence_threshold": "auto", + "stage3_max_live_parameters": 1e9, + "stage3_max_reuse_distance": 1e9, + "stage3_gather_16bit_weights_on_model_save": true + }, + "scheduler": { + "type": "WarmupLR", + "params": { + "warmup_min_lr": "auto", + "warmup_max_lr": "auto", + "warmup_num_steps": "auto" + } + } +} \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config_offload.json b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config_offload.json new file mode 100644 index 0000000000000000000000000000000000000000..15a8125c7014b3df651eda856392c4faaa233206 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/ds_config/ds_z3_config_offload.json @@ -0,0 +1,54 @@ +{ + "gradient_accumulation_steps": "auto", + "gradient_clipping": "auto", + "steps_per_print": 200, + "train_batch_size": "auto", + "train_micro_batch_size_per_gpu": "auto", + "wall_clock_breakdown": false, + + "optimizer": { + "type": "Adam", + "params": { + "lr": "auto", + "betas": "auto", + "eps": "auto", + "weight_decay": "auto" + } + }, + "fp16": { + "enabled": "auto", + "loss_scale": 0, + "loss_scale_window": 1000, + "initial_scale_power": 16, + "hysteresis": 2, + "min_loss_scale": 1 + }, + "zero_optimization": { + "stage": 3, + "overlap_comm": true, + "contiguous_gradients": true, + "sub_group_size": 1e9, + "reduce_bucket_size": "auto", + "stage3_prefetch_bucket_size": "auto", + "stage3_param_persistence_threshold": "auto", + "stage3_max_live_parameters": 1e9, + "stage3_max_reuse_distance": 1e9, + "stage3_gather_16bit_weights_on_model_save": true, + "offload_optimizer": { + "device": "cpu", + "pin_memory": true + }, + "offload_param": { + "device": "cpu", + "pin_memory": true + } + }, + 
"scheduler": { + "type": "WarmupLR", + "params": { + "warmup_min_lr": "auto", + "warmup_max_lr": "auto", + "warmup_num_steps": "auto" + } + } +} \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-full.json b/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-full.json new file mode 100644 index 0000000000000000000000000000000000000000..311d78269721502aae43dc539c0e1827a1c9b5b0 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-full.json @@ -0,0 +1,36 @@ +{ + "output_dir": "output/firefly-qwen-7b-full", + "model_name_or_path": "/home/model_zoo/nlp/Qwen-7B", + "deepspeed": "configs/ds_config/ds_z3_config_bf16.json", + "train_file": "./data/school_math_0.25M.jsonl", + "num_train_epochs": 1, + "max_steps": 50, + "per_device_train_batch_size": 4, + "gradient_accumulation_steps": 4, + "learning_rate": 1e-5, + "max_seq_length": 2048, + "logging_steps": 200, + "save_steps": 500, + "save_total_limit": 1, + "lr_scheduler_type": "cosine", + "warmup_steps": 1000, + + "template_name": "llama2", + "peft_type": "full", + "task_type": "sft", + + "gradient_checkpointing": true, + "disable_tqdm": false, + "optim": "adamw_hf", + "seed": 42, + "bf16": true, + "report_to": "tensorboard", + "dataloader_num_workers": 5, + "save_strategy": "steps", + "weight_decay": 0, + "max_grad_norm": 1.0, + "remove_unused_columns": false +} + + + diff --git a/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-lora.json b/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-lora.json new file mode 100644 index 0000000000000000000000000000000000000000..24c0ae2bebde3eacee87f365ce9bb96b917c1e09 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-lora.json @@ -0,0 +1,36 @@ +{ + "output_dir": "output/firefly-qwen-7b", + "model_name_or_path": "/home/model_zoo/nlp/Qwen-7B", + "train_file": "./data/school_math_0.25M.jsonl", + "num_train_epochs": 1, + "max_steps": 50, + "per_device_train_batch_size": 4, + "gradient_accumulation_steps": 2, + "learning_rate": 1e-4, + 
"max_seq_length": 1024, + "logging_steps": 300, + "save_steps": 500, + "save_total_limit": 1, + "lr_scheduler_type": "constant_with_warmup", + "warmup_steps": 3000, + + "template_name": "qwen", + "peft_type": "lora", + "task_type": "sft", + "lora_rank": 64, + "lora_alpha": 16, + "lora_dropout": 0.05, + + "gradient_checkpointing": true, + "disable_tqdm": false, + "optim": "adamw_hf", + "seed": 42, + "bf16": true, + "report_to": "tensorboard", + "dataloader_num_workers": 10, + "save_strategy": "steps", + "weight_decay": 0, + "max_grad_norm": 0.3, + "remove_unused_columns": false, + "gradient_checkpointing_kwargs": {"use_reentrant":false} +} diff --git a/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-ptuning_v2.json b/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-ptuning_v2.json new file mode 100644 index 0000000000000000000000000000000000000000..3406935b90da7cbba404008001273d8cad08970f --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/configs/qwen-7b-sft-ptuning_v2.json @@ -0,0 +1,37 @@ +{ + "output_dir": "output/firefly-qwen-7b", + "model_name_or_path": "/home/model_zoo/nlp/Qwen-7B", + "train_file": "./data/school_math_0.25M.jsonl", + "num_train_epochs": 1, + "max_steps": 50, + "per_device_train_batch_size": 4, + "gradient_accumulation_steps": 2, + "learning_rate": 1e-4, + "max_seq_length": 1024, + "logging_steps": 300, + "save_steps": 500, + "save_total_limit": 1, + "lr_scheduler_type": "constant_with_warmup", + "warmup_steps": 3000, + + "template_name": "qwen", + "peft_type": "ptuning_v2", + "task_type": "sft", + "num_virtual_tokens": 20, + "token_dim": 4096, + "num_attention_heads": 32, + "encoder_hidden_size": 768, + "prefix_projection": false, + + "gradient_checkpointing": true, + "disable_tqdm": false, + "optim": "adamw_hf", + "seed": 42, + "bf16": true, + "report_to": "tensorboard", + "dataloader_num_workers": 10, + "save_strategy": "steps", + "weight_decay": 0, + "max_grad_norm": 0.3, + "remove_unused_columns": false +} diff --git 
a/nlp/llm/qwen-7b/firefly/get_Qwen-7B.py b/nlp/llm/qwen-7b/firefly/get_Qwen-7B.py new file mode 100644 index 0000000000000000000000000000000000000000..696495566222799fc9203c2eaf02f7cba2584dc3 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/get_Qwen-7B.py @@ -0,0 +1,3 @@ +# Download the model +from modelscope import snapshot_download +model_dir = snapshot_download('qwen/Qwen-7B') \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/main.py b/nlp/llm/qwen-7b/firefly/main.py new file mode 100644 index 0000000000000000000000000000000000000000..c92544a85bedd50260d5d1bb9a7754dfe66067c3 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/main.py @@ -0,0 +1,26 @@ +import sys +from loguru import logger +from firefly.train import setup_everything, init_components + + +def train(): + # Run configuration and sanity checks + args, training_args = setup_everything() + # Load the components + trainer = init_components(args, training_args) + # Start training + logger.info("*** starting training ***") + # TODO: resume from checkpoint + # https://github.com/huggingface/transformers/issues/24252 + train_result = trainer.train() + # Save the final checkpoint + trainer.save_model(training_args.output_dir) # Saves the tokenizer too + # Save the training metrics + metrics = train_result.metrics + trainer.log_metrics("train", metrics) + trainer.save_metrics("train", metrics) + trainer.save_state() + + +if __name__ == "__main__": + train() \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/requirements.txt b/nlp/llm/qwen-7b/firefly/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..2ee401253caed10646fb2ac218640e64d5472367 --- /dev/null +++ b/nlp/llm/qwen-7b/firefly/requirements.txt @@ -0,0 +1,3 @@ +transformers-stream-generator +accelerate>=0.30.1 +transformers>=4.36.0 \ No newline at end of file diff --git a/nlp/llm/qwen-7b/firefly/train.sh b/nlp/llm/qwen-7b/firefly/train.sh new file mode 100644 index 0000000000000000000000000000000000000000..7920b53a65c8b83bb4a6e3612ad44c0f18b9d299 --- /dev/null +++
b/nlp/llm/qwen-7b/firefly/train.sh @@ -0,0 +1,34 @@ +#!/bin/bash + +MASTER_PORT=$(shuf -n 1 -i 10000-65535) +NUM_GPUS=$1 +CONFIG_FILE=$2 +PEFT=$3 +MODELS_DIR="$PWD/models" +CHECKPOINT_DIR="$PWD/checkpoint" + +# Copy the model definition files into the matching pretrained checkpoint directory so they are available at load time +for source_dir in "$MODELS_DIR"/*; do + if [ -d "$source_dir" ]; then + source_dir_name=$(basename "$source_dir") + + for dest_dir in "$CHECKPOINT_DIR"/*; do + if [ -d "$dest_dir" ]; then + dest_dir_name=$(basename "$dest_dir") + + if [[ "$dest_dir_name" == *"$source_dir_name"* ]]; then + echo "Copying files from $source_dir to $dest_dir" + cp -r "$source_dir"/* "$dest_dir" + fi + fi + done + fi +done + + +echo "==> Training with $NUM_GPUS gpus | config=$CONFIG_FILE | peft=$PEFT" + +torchrun --master_port $MASTER_PORT --nproc_per_node=$NUM_GPUS main.py --train_args_file $CONFIG_FILE --peft_type $PEFT + +# for example +# bash train.sh 1 configs/qwen-7b-sft-lora.json lora \ No newline at end of file diff --git a/Reinforement_Learning/DQN/paddlepaddle/README.md b/reinforcement_learning/DQN/paddlepaddle/README.md similarity index 100% rename from Reinforement_Learning/DQN/paddlepaddle/README.md rename to reinforcement_learning/DQN/paddlepaddle/README.md diff --git a/Reinforement_Learning/DQN/paddlepaddle/evaluate.py b/reinforcement_learning/DQN/paddlepaddle/evaluate.py similarity index 100% rename from Reinforement_Learning/DQN/paddlepaddle/evaluate.py rename to reinforcement_learning/DQN/paddlepaddle/evaluate.py diff --git a/Reinforement_Learning/DQN/paddlepaddle/gym_animation.gif b/reinforcement_learning/DQN/paddlepaddle/gym_animation.gif similarity index 100% rename from Reinforement_Learning/DQN/paddlepaddle/gym_animation.gif rename to reinforcement_learning/DQN/paddlepaddle/gym_animation.gif diff --git a/nlp/llm/bloom-7b1/firefly/firefly/build_firefly.sh b/toolbox/firefly/build_firefly.sh similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/build_firefly.sh rename to toolbox/firefly/build_firefly.sh diff --git
a/nlp/llm/bloom-7b1/firefly/firefly/clean_firefly.sh b/toolbox/firefly/clean_firefly.sh similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/clean_firefly.sh rename to toolbox/firefly/clean_firefly.sh diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/__init__.py b/toolbox/firefly/firefly/__init__.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/__init__.py rename to toolbox/firefly/firefly/__init__.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/__init__.py b/toolbox/firefly/firefly/component/__init__.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/__init__.py rename to toolbox/firefly/firefly/component/__init__.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/argument.py b/toolbox/firefly/firefly/component/argument.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/argument.py rename to toolbox/firefly/firefly/component/argument.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/collator.py b/toolbox/firefly/firefly/component/collator.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/collator.py rename to toolbox/firefly/firefly/component/collator.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/dataset.py b/toolbox/firefly/firefly/component/dataset.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/dataset.py rename to toolbox/firefly/firefly/component/dataset.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/loss.py b/toolbox/firefly/firefly/component/loss.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/loss.py rename to toolbox/firefly/firefly/component/loss.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/model.py b/toolbox/firefly/firefly/component/model.py similarity index 100% rename from 
nlp/llm/bloom-7b1/firefly/firefly/firefly/component/model.py rename to toolbox/firefly/firefly/component/model.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/template.py b/toolbox/firefly/firefly/component/template.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/template.py rename to toolbox/firefly/firefly/component/template.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/trainer.py b/toolbox/firefly/firefly/component/trainer.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/trainer.py rename to toolbox/firefly/firefly/component/trainer.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/component/utils.py b/toolbox/firefly/firefly/component/utils.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/component/utils.py rename to toolbox/firefly/firefly/component/utils.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/firefly/train.py b/toolbox/firefly/firefly/train.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/firefly/train.py rename to toolbox/firefly/firefly/train.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/install_firefly.sh b/toolbox/firefly/install_firefly.sh similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/install_firefly.sh rename to toolbox/firefly/install_firefly.sh diff --git a/nlp/llm/bloom-7b1/firefly/firefly/models/bloom/configuration_bloom.py b/toolbox/firefly/models/bloom/configuration_bloom.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/models/bloom/configuration_bloom.py rename to toolbox/firefly/models/bloom/configuration_bloom.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/models/bloom/modeling_bloom.py b/toolbox/firefly/models/bloom/modeling_bloom.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/models/bloom/modeling_bloom.py rename to toolbox/firefly/models/bloom/modeling_bloom.py diff --git 
a/nlp/llm/bloom-7b1/firefly/firefly/models/llama2/configuration_llama.py b/toolbox/firefly/models/llama2/configuration_llama.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/models/llama2/configuration_llama.py rename to toolbox/firefly/models/llama2/configuration_llama.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/models/llama2/modeling_llama.py b/toolbox/firefly/models/llama2/modeling_llama.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/models/llama2/modeling_llama.py rename to toolbox/firefly/models/llama2/modeling_llama.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/models/llama2/tokenization_llama.py b/toolbox/firefly/models/llama2/tokenization_llama.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/models/llama2/tokenization_llama.py rename to toolbox/firefly/models/llama2/tokenization_llama.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/models/llama2/tokenization_llama_fast.py b/toolbox/firefly/models/llama2/tokenization_llama_fast.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/models/llama2/tokenization_llama_fast.py rename to toolbox/firefly/models/llama2/tokenization_llama_fast.py diff --git a/nlp/llm/bloom-7b1/firefly/firefly/requirements.txt b/toolbox/firefly/requirements.txt similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/requirements.txt rename to toolbox/firefly/requirements.txt diff --git a/nlp/llm/bloom-7b1/firefly/firefly/setup.py b/toolbox/firefly/setup.py similarity index 100% rename from nlp/llm/bloom-7b1/firefly/firefly/setup.py rename to toolbox/firefly/setup.py