# SAM-DEEP **Repository Path**: zhy186/sam-deep ## Basic Information - **Project Name**: SAM-DEEP - **Description**: No description available - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2024-12-16 - **Last Updated**: 2024-12-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README ## VRP-SAM: SAM with Visual Reference Prompt **Update**: 1. The manuscript has been accepted in __CVPR 2024__. 2. **Core code has been updated** This is the official implementation based on pytorch of the paper [VRP-SAM: SAM with Visual Reference Prompt](https://arxiv.org/abs/2402.17726) Authors: Yanpeng Sun, Jiahui Chen, Shan Zhang, Xinyu Zhang, Qiang Chen, Gang Zhang, Errui Ding, Jingdong Wang, [Zechao Li](https://zechao-li.github.io/)

## Requirements - Python 3.10 - PyTorch 1.12 - cuda 11.6 Conda environment settings: ```bash conda create -n vrpsam python=3.10 conda activate vrpsam conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge ``` Segment-Anything-Model setting: ```bash cd ./segment-anything pip install -v -e . cd .. ``` ## Preparing Few-Shot Segmentation Datasets Download following datasets: > #### 1. PASCAL-5i > Download PASCAL VOC2012 devkit (train/val data): > ```bash > wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar > ``` > Download PASCAL VOC2012 SDS extended mask annotations from our [[Google Drive](https://drive.google.com/file/d/10zxG2VExoEZUeyQl_uXga2OWHjGeZaf2/view?usp=sharing)]. > #### 2. COCO-20i > Download COCO2014 train/val images and annotations: > ```bash > wget http://images.cocodataset.org/zips/train2014.zip > wget http://images.cocodataset.org/zips/val2014.zip > wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip > ``` > Download COCO2014 train/val annotations from our Google Drive: [[train2014.zip](https://drive.google.com/file/d/1cwup51kcr4m7v9jO14ArpxKMA4O3-Uge/view?usp=sharing)], [[val2014.zip](https://drive.google.com/file/d/1PNw4U3T2MhzAEBWGGgceXvYU3cZ7mJL1/view?usp=sharing)]. > (and locate both train2014/ and val2014/ under annotations/ directory). Create a directory '../dataset' for the above few-shot segmentation datasets and appropriately place each dataset to have following directory structure: ../ # parent directory ├── ./ # current (project) directory │ ├── common/ # (dir.) helper functions │ ├── data/ # (dir.) dataloaders and splits for each FSSS dataset │ ├── model/ # (dir.) implementation of VRP-SAM │ ├── segment-anything/ # code for SAM │ ├── README.md # intstruction for reproduction │ ├── train.py # code for training HSNet │ └── SAM2Pred.py # code for prediction module │ └── Datasets_HSN/ ├── VOC2012/ # PASCAL VOC2012 devkit │ ├── Annotations/ │ ├── ImageSets/ │ ├── ... │ └── SegmentationClassAug/ └── COCO2014/ ├── annotations/ │ ├── train2014/ # (dir.) training masks (from Google Drive) │ ├── val2014/ # (dir.) validation masks (from Google Drive) │ └── ..some json files.. ├── train2014/ └── val2014/ ## Training We provide a example training script "train.sh". Detailed training argumnets are as follows: > ```bash > python3 -m torch.distributed.launch --nproc_per_node=$GPUs$ train.py \ > --datapath $PATH_TO_YOUR_DATA$ \ > --logpath $PATH_TO_YOUR_LOG$ \ > --benchmark {coco, pascal} \ > --backbone {vgg16, resnet50, resnet101} \ > --fold {0, 1, 2, 3} \ > --condition {point, scribble, box, mask} \ > --num_queirs 50 \ > --epochs 50 \ > --lr 1e-4 \ > --bsz 2 > ``` #### Example qualitative results (1-shot):

## BibTeX If you use this code for your research, please consider citing: ````BibTeX @inproceedings{sun2024vrp, title={VRP-SAM: SAM with Visual Reference Prompt}, author={Sun, Yanpeng and Chen, Jiahui and Zhang, Shan and Zhang, Xinyu and Chen, Qiang and Zhang, Gang and Ding, Errui and Wang, Jingdong and Li, Zechao}, booktitle={Conference on Computer Vision and Pattern Recognition 2024}, year={2024} } ````