From 8d07e27a585457e0d7a13e565560be80da1917f4 Mon Sep 17 00:00:00 2001
From: "mingjiang.li" <mingjiang.li@iluvatar.com>
Date: Mon, 24 Mar 2025 14:33:25 +0800
Subject: [PATCH 1/2] unify readme format of rl models

---
 .../dqn/paddlepaddle/README.md                | 34 ++++++++++---------
 1 file changed, 18 insertions(+), 16 deletions(-)
diff --git a/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md b/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md
index d9423ebcd..f8b45af44 100644
--- a/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md
+++ b/reinforcement_learning/q-learning-networks/dqn/paddlepaddle/README.md
@@ -1,11 +1,16 @@
 # DQN
 
-## Model description
+## Model Description
 
-The classic DQN algorithm in reinforcement learning is a value-based rather than a policy-based method. DQN does not
-learn a policy, but a critic. Critic does not directly take action, but evaluates the quality of the action.
+DQN (Deep Q-Network) is a foundational reinforcement learning algorithm that combines Q-learning with deep neural
+networks. As a value-based method, it learns a critic that estimates the value of each action instead of learning a
+policy directly, which lets it handle high-dimensional state spaces. Experience replay and a separate target network
+keep training stable. The original paper reported human-level performance on many Atari 2600 games, and the approach
+underlies later decision-making work in game AI and robotics.
 
-## Step 1: Installation
+## Model Preparation
+
+### Install Dependencies
 
 ```bash
 git clone  https://github.com/PaddlePaddle/PARL.git
@@ -15,29 +20,26 @@ pip3 install matplotlib
 pip3 install urllib3==1.26.6
 ```
 
-## Step 2: Training
+## Model Training
 
 ```bash
-# 1 GPU
+# 1 GPU Training
 python3 train.py
-```
 
-## Step 3: Evaluating
-
-```bash
+# Evaluation
 mv ../../../evaluate.py ./
 python3 evaluate.py
 ```
 
-## Result
+## Model Results
 
-Performance of DQN playing CartPole-v0
+Performance of DQN playing CartPole-v0.
 
-|  GPUs   | Reward |
-|---------|--------|
-| BI-V100 | 200.0  |
+| Model | GPU     | Reward |
+|-------|---------|--------|
+| DQN   | BI-V100 | 200.0  |
 
 ## Reference
 
 - [PARL](https://github.com/PaddlePaddle/PARL)
-- [paper](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
+- [Paper](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
-- 
Gitee


From c05184125bc8d1292274b0129fbabd9d5ec1ce58 Mon Sep 17 00:00:00 2001
From: "mingjiang.li" <mingjiang.li@iluvatar.com>
Date: Mon, 24 Mar 2025 14:33:38 +0800
Subject: [PATCH 2/2] update 25.03 release notes

---
 RELEASE.md | 43 +++++++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/RELEASE.md b/RELEASE.md
index a05197ced..3e205d319 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -2,43 +2,43 @@
 
 ## DeepSparkHub 25.03 Release Notes
 
-### Features and Enhancements
+### Models and Algorithms
 
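-#### Models and Algorithms
-● Added 9 large-model training examples, covering the MoE-LLaVA, DeepSpeed, and LLaMA-Factory toolboxes
+* Added 9 large-model training examples, covering the DeepSpeed, MoE-LLaVA, and LLaMA-Factory toolboxes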
 
 <table>
     </tr>
         <tr align="left"><th colspan=5>Large Models</th></tr>
     <tr>
-        <td>MoE-LLaVA-Phi2-2.7B(MoE-LLaVA)</td>
-        <td>MoE-LLaVA-Qwen-1.8B(MoE-LLaVA)</td>
-        <td>MoE-LLaVA-StableLM-1.6B(MoE-LLaVA)</td>
+        <td>GLM-4</td>
+        <td>MiniCPM(DeepSpeed)</td>
+        <td>Phi-3</td>
     </tr>
     <tr>
-        <td>Yi_6B(DeepSpeed)</td>
-        <td>Yi-1.5_6B(DeepSpeed)</td>
-        <td>Yi-VL-6B(LLaMA-Factory)</td>
+        <td>MoE-LLaVA-Phi2-2.7B</td>
+        <td>MoE-LLaVA-Qwen-1.8B</td>
+        <td>MoE-LLaVA-StableLM-1.6B</td>
     </tr>
     <tr>
-        <td>GLM-4</td>
-        <td>MiniCPM(DeepSpeed)</td>
-        <td>Phi-3</td>
+        <td>Yi-6B (DeepSpeed)</td>
+        <td>Yi-1.5-6B (DeepSpeed)</td>
+        <td>Yi-VL-6B (LLaMA-Factory)</td>
     </tr>
 </table>
 
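-● Updated category names such as cv/multi_object_tracking, cv/gnn, and cv/face_recognition.
-● Adjusted the category paths of models such as kan, graph wavenet, and hashnerf.
-● Removed redundant code from models such as convnext, co-detr, and centernet to align with the community versions.
-● Updated the READMEs of the relevant models, adding the IXUCA SDK versions each model supports.
-● Updated the code of models such as ATSS, Cascade R-CNN, and CornerNet to work with MMDetection community version v3.3.0.
-● Added automated ci scripts for cv/classification and cv/detection.
-● Synced the code of the tacotron2 model.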
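+### Bug Fixes
+
+* Synced the latest code of the Tacotron2 PyTorch model.
+* Removed redundant code from models such as ConvNeXt, Co-DETR, and CenterNet, and aligned them with the community versions.
+* Updated the MMDetection toolbox to v3.3.0 and synced the code of models such as ATSS, Cascade R-CNN, and CornerNet.
+* Added automated CI scripts for cv/classification and cv/detection.
+* Updated the README format of all models and added the IXUCA SDK versions each model supports.
+
+### Version Mapping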
 
-#### Version Mapping
 DeepSparkHub 25.03 corresponds to version 4.2.0 of the Iluvatar software stack.
 
-#### Contributors
+### Contributors
 
 Thanks to the following community contributors
 
@@ -46,7 +46,6 @@ DeepSparkHub 25.03 corresponds to version 4.2.0 of the Iluvatar software stack.
 
 Contributions to the DeepSparkHub project in any form are welcome.
 
-
 ## DeepSparkHub 24.12 Release Notes
 
 ### Features and Enhancements
-- 
Gitee

