From 8d955ed717558e5a47c95e2b01aaeb474ab2e984 Mon Sep 17 00:00:00 2001
From: majorli
Date: Tue, 13 Aug 2024 10:56:52 +0800
Subject: [PATCH 1/2] add llama3 8b megatron deepspeed model

Signed-off-by: majorli
---
 .../llama3-8b/megatron-deepspeed/README.md | 45 +++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100644 nlp/llm/llama3-8b/megatron-deepspeed/README.md

diff --git a/nlp/llm/llama3-8b/megatron-deepspeed/README.md b/nlp/llm/llama3-8b/megatron-deepspeed/README.md
new file mode 100644
index 000000000..f530a243f
--- /dev/null
+++ b/nlp/llm/llama3-8b/megatron-deepspeed/README.md
@@ -0,0 +1,45 @@

# Llama3-8B (Megatron-DeepSpeed)

## Model description

Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
## Step 1: Installation

```bash
# Clone the Megatron-DeepSpeed fork
git clone https://gitee.com/deep-spark/Megatron-DeepSpeed.git
cd Megatron-DeepSpeed/

# Build and install
bash build_megatron-deepspeed.sh && bash install_megatron-deepspeed.sh
pip3 install urllib3==1.23
```

## Step 2: Preparing datasets

```bash
pushd dataset
# Download and unpack gpt_small_117M_llama3.tar
wget http://files.deepspark.org.cn:880/deepspark/gpt_small_117M_llama3.tar
tar -xf gpt_small_117M_llama3.tar
rm -f gpt_small_117M_llama3.tar
popd
```

## Step 3: Training

```bash
# Point NCCL at the host's network interface (adjust "eth0" to your NIC)
export NCCL_SOCKET_IFNAME="eth0"
cd examples/llama3
bash run_te_llama3_8b_node1.sh
```

## Results

| GPUs    | Model                          | Training speed |
| :-----: | :----------------------------: | :------------: |
| BI-V150 | Llama3-8B (Megatron-DeepSpeed) |                |

## Reference

- [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed)
-- 
Gitee

From 4c98baa870438bec98a9b540010da5e914a45e7b Mon Sep 17 00:00:00 2001
From: "mingjiang.li"
Date: Mon, 2 Sep 2024 15:33:41 +0800
Subject: [PATCH 2/2] add llama3-8b megatron-deepspeed to model list

Signed-off-by: mingjiang.li
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index d2118ca6a..987cf4f59 100644
--- a/README.md
+++ b/README.md
@@ -447,6 +447,7 @@ DeepSparkHub甄选上百个应用算法和模型，覆盖AI和通用计算各领
 [Llama2-7B SFT](nlp/llm/llama2-7b_sft/megatron-deepspeed/README.md) | PyTorch (Megatron-DeepSpeed) | gpt_small-117M
 [Llama2-13B](nlp/llm/llama2-13b/megatron-deepspeed/README.md) | PyTorch (Megatron-DeepSpeed) | Bookcorpus
 [Llama2-34B](nlp/llm/llama2-34b/megatron-deepspeed/README.md) | PyTorch (Megatron-DeepSpeed) | Bookcorpus
+[Llama3-8B](nlp/llm/llama3-8b/megatron-deepspeed/README.md) | PyTorch (Megatron-DeepSpeed) | Bookcorpus
 [QWen-7B](nlp/llm/qwen-7b/firefly/README.md) | PyTorch (Firefly) | qwen-7b
 [QWen1.5-7B](nlp/llm/qwen1.5-7b/firefly/README.md) | PyTorch (Firefly) | school_math
 [QWen1.5-14B](nlp/llm/qwen1.5-14b/firefly/README.md) | PyTorch (Firefly) | school_math
-- 
Gitee
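The README added by this patch series assumes `git`, `wget`, `tar`, and `pip3` are available before the clone/build/dataset/train steps run. A minimal preflight sketch (not part of the patch; the helper name `check_cmd` is hypothetical) can confirm those tools are present up front rather than failing midway through Step 2:

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical, not part of the patch): verify the
# commands the README's three steps depend on before starting.

check_cmd() {
  # Print "ok: <cmd>" if <cmd> is on PATH, "missing: <cmd>" otherwise.
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

# Tools used by Step 1 (git, bash, pip3) and Step 2 (wget, tar).
for c in git bash pip3 wget tar; do
  check_cmd "$c"
done
```

Any `missing:` line means the corresponding README step would fail on that host; `command -v` is used instead of `which` because it is specified by POSIX and needs no external binary.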