DataFunCon# 2023
Multimodal Pretrained Models in OPPO's On-Device and Cloud Scenarios: Deployment Practice
Chen Chen, Senior Algorithm Engineer, OPPO Research Institute. 2023-11-29

Contents
1. Research on on-device image-text retrieval
2. Application optimization of image-text generation & understanding models
3. On-device lightweighting of text-to-image generation models

On-device image-text retrieval——what problem does it solve?
Example queries: query1 "going to Disneyland with my girlfriend"; query2 "wedding photos on the mountaintop".
Why one-sentence search matters:
- User experience: it solves the real pain point of searching for whatever the user has in mind ("smart photo search: search what you imagine"). Built on large-model pretraining, it no longer depends on iterating and extending a fixed label set.
Why lightweight on-device large models matter:
- Cost savings: effects that used to require a cloud-side large model move to the device, saving substantial compute resources.
- Privacy: users' personal photos are processed directly on-device and never uploaded to the cloud.

On-device image-text retrieval——on-device demo
Two key factors for on-device smart search:
1. Understanding everyday phrasing. Smart search handles not only single keywords but also free combinations, enabling truly colloquial queries ("what you imagine is what you get"), e.g. "the tiger I photographed at the zoo last year". This requires the image-text understanding of a multimodal large model such as CLIP (OpenAI).
2. Fast search. Instead of scrolling through the album for ten-odd minutes or half an hour, one sentence finds the photo, whether from the home-screen pull-down smart search, the album app, or the voice assistant: a system-level boost to information-finding efficiency. Lightweight on-device deployment of the large model is therefore essential.
Difficulties in landing a lightweight large model on-device:
- Compressing the multimodal large model while preserving accuracy. This is not solved by simply pruning or quantizing the model down by a few times: given the limited device compute, the deployable model often has to be tens of times smaller than the large model.
- Alongside the algorithm upgrade, the device needs a vector retrieval engine with robust performance, so that engineering performance holds up once the large model moves on-device.

On-device image-text retrieval——algorithm optimization
Components: CLIP dual-tower model; ALBEF single-stream model; a multi-teacher distillation architecture mixing single-stream and dual-stream teachers; the distillation loss function.
The retrieval engine computes in two parts:
- Offline: scan all album photos and turn each into a vector with the image encoder; quantize to fp16 and store as an N x 200 fp16 matrix.
- Online: turn each incoming query into a vector with the text encoder; quantize to fp16 to reduce compute memory; score the query vector against all images with a single matrix multiplication, then sort to output a ranked list.
Reference: Lei, Youbo, et al. "MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval." arXiv preprint arXiv:2310.19654 (2023).

On-device image-text retrieval——results on academic benchmarks
(The original slide charts the accuracy of the different distillation methods.)
Performance comparison of large and small models:

Model                | image model     | text model           | fusion model        | image encoding time | retrieval time                                  | parameters | test set  | platform
CLIP                 | ViT-L/14        | 12-layer transformer | dot product         | 11.0 ms             | 32.5 ms                                         | 427.62M    | Flickr 5K | V100 GPU
ALBEF                | ViT-B/16        | 6-layer transformer  | 6-layer transformer | 7.6 ms              | 265 ms (k=16), 1945 ms (k=128), 3865 ms (k=256) | 419.12M    | Flickr 5K | V100 GPU
In-house small model | MobileViTv2-1.5 | 4-layer TinyBERT     | dot product         | 3.8 ms              | 14.1 ms                                         | 25.9M      | Flickr 5K | V100 GPU
In-house small model | MobileViTv2-1.5 | 4-layer TinyBERT     | dot product         | 17.3 ms             | 14.6 ms                                         | 25.9M      | Flickr 5K | MTK DX3

On-device image-text retrieval——real-world results
Data volume: 11 users' real albums, 20,000+ photos in total, 5,400+ hand-written queries.
Per-user results:

Test set | R@1    | R@5    | R@10   | MR     | mAP
01       | 0.4728 | 0.6710 | 0.7495 | 0.6311 | 0.6080
02       | 0.4956 | 0.7580 | 0.8251 | 0.6929 | 0.5306
03       | 0.4019 | 0.5665 | 0.6108 | 0.5264 | 0.4889
04       | 0.4532 | 0.6847 | 0.7389 | 0.6256 | 0.6048
05       | 0.5843 | 0.7530 | 0.7952 | 0.7108 | 0.6428
06       | 0.5323 | 0.6855 | 0.7500 | 0.6559 | 0.5890
07       | 0.3500 | 0.5294 | 0.6088 | 0.4961 | 0.4771
08       | 0.6417 | 0.8083 | 0.8417 | 0.7639 | 0.5943
09       | 0.5965 | 0.6842 | 0.7193 | 0.6667 | 0.5622
10       | 0.5121 | 0.7059 | 0.7647 | 0.6609 | 0.5441
11       | 0.5654 | 0.7418 | 0.7810 | 0.6961 | 0.6336
Average  | 0.4848 | 0.6768 | 0.7360 | 0.6325 | 0.5840

On-device image-text retrieval——fine-grained optimization
Recipe: fine-grained attribute word replacement, plus hard negative sampling, plus LwF (Learning without Forgetting) to resist catastrophic forgetting.
Reference: Doveh, Sivan, et al. "Teaching structured vision & language concepts to vision & language models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
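The two-part retrieval engine described in the algorithm-optimization slide (offline fp16 image-embedding matrix, online single matrix multiply plus sort) can be sketched in a few lines. This is a minimal illustration, not OPPO's implementation; the 200-dim embedding size follows the deck's "N x 200" matrix, and the function names are invented for the sketch.

```python
import numpy as np

def build_gallery(image_embeddings):
    """Offline step: L2-normalize all album embeddings and store them as an
    fp16 matrix (N x 200 in the deck's description) to halve storage."""
    g = np.asarray(image_embeddings, dtype=np.float32)
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    return g.astype(np.float16)

def search(gallery_fp16, query_embedding, topk=10):
    """Online step: normalize the query vector, score every image with one
    matrix multiplication (upcast to fp32 for numerical safety; the fp16
    matrix is what saves memory), and return indices sorted by similarity."""
    q = np.asarray(query_embedding, dtype=np.float32)
    q /= np.linalg.norm(q)
    scores = gallery_fp16.astype(np.float32) @ q  # cosine similarity, shape (N,)
    order = np.argsort(-scores)[:topk]
    return order, scores[order]
```

With cosine scoring, a query embedding identical to a gallery image's embedding ranks that image first even after fp16 quantization, since the quantization error (~1e-3) is far below the gap to unrelated images.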
Application optimization of image-text generation & understanding models

Continued pretraining of a Chinese text-to-image large model. Three questions:
- How to do high-quality, low-cost continued pretraining?
- How to align with Chinese language and culture?
- How to improve the detail quality of generated images?
Techniques: parameter-efficient adapters; Orthogonal Finetuning.
Reference: Qiu, Zeju, et al. "Controlling text-to-image diffusion by orthogonal finetuning." Thirty-seventh Conference on Neural Information Processing Systems. 2023.
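The orthogonal-finetuning idea cited above can be illustrated with a toy Cayley-transform parametrization. This sketch only shows the core constraint (rotating pretrained weights with a learned orthogonal matrix, which preserves the angles between neurons); the OFT paper additionally uses block-diagonal structure for efficiency, which is omitted here.

```python
import numpy as np

def cayley(s_params, n):
    """Build an orthogonal matrix R from unconstrained parameters via the
    Cayley transform R = (I - S)(I + S)^(-1), where S is skew-symmetric.
    s_params holds the n*(n-1)/2 free entries of S's upper triangle."""
    S = np.zeros((n, n))
    iu = np.triu_indices(n, k=1)
    S[iu] = s_params
    S -= S.T                      # make S skew-symmetric: S^T = -S
    I = np.eye(n)
    return (I - S) @ np.linalg.inv(I + S)

def oft_forward(W, s_params):
    """Orthogonal finetuning (toy version): instead of updating W directly,
    learn a much smaller skew-symmetric parameter set and rotate W by R.
    The rotation preserves column norms and pairwise angles, which the OFT
    paper argues protects the pretrained model's semantics."""
    n = W.shape[0]
    return cayley(s_params, n) @ W
```

Because R is exactly orthogonal by construction, the "identity init" (all s_params zero) reproduces the pretrained weights unchanged, a convenient property for continued pretraining.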
Chinese text-to-image continued pretraining——results
Chinese-context transfer examples and convergence-speed comparison (slide figures).
Sample prompts shown with each technique (pairings as they appear on the slides):
- Finetuned model: "mandarin ducks perch in pairs, butterflies fly two by two; a garden full of spring is intoxicating"
- LoRA / ControlNet / SSD 1.3B small model: "a super-cute rabbit in monk robes, portrait shot, Pixar animation"
- SDXL inpainting: "a blue-and-white-porcelain dinosaur on a bench"; "Jiangnan, a lakeside village in summer"
- Further samples: "a beautiful Asian girl, cinematic lighting"; "West Lake, pagoda and waterfall, sunrise"; "3D film, 4k, highly detailed, a man sitting on the toilet reading a newspaper"; "a cat in sunglasses holding a sword, inside a demon castle, Chinese Paladin style"
- Latent CM

Application optimization——general application: wallpaper generation
No.1 and top-3 popularity during the Spring Festival period. A text-to-image model plus super-resolution generates 2K HD wallpapers.

Application optimization——general application: lock-screen magazine generation
A text-to-image model, a finetuned LLaVA, and an LLM together generate illustrated magazine pages.
Reference: Liu, Haotian, et al. "Visual instruction tuning." arXiv preprint arXiv:2304.08485 (2023).
Application optimization——general applications
InternLM-XComposer training framework.
Reference: Zhang, Pan, et al. "InternLM-XComposer: A vision-language large model for advanced text-image comprehension and composition." arXiv preprint arXiv:2309.15112 (2023).
Application optimization——vertical domain: portraits
Several problems when AI models draw people:
1. Collapsed faces, hands, and other body parts.
2. Overly polished, standardized output; over-smooth rendering that loses realistic texture.
3. Fine-grained attributes misaligned with the text description.
Approach: build fine-grained portrait-attribute data.
The correspondence between U-Net modules and image features can guide which parameters to select for LoRA finetuning; the slides illustrate it with attribute pairs such as thick vs. thin lips, small vs. large nose, and thin vs. thick eyebrows.
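Assuming a diffusers-style U-Net module naming (illustrative only, not OPPO's internal model), the slide's module-to-attribute finding could translate into restricting LoRA to the blocks that control the attribute being edited:

```python
# Illustrative module names mimicking a diffusers-style U-Net layout
# (hypothetical; the real block/attribute mapping comes from the analysis
# shown on the slides, e.g. which blocks govern lips, nose, or eyebrows).
UNET_MODULES = [
    "down_blocks.0.attentions.0.to_q",
    "down_blocks.0.attentions.0.to_v",
    "mid_block.attentions.0.to_q",
    "up_blocks.1.attentions.0.to_q",
    "up_blocks.1.attentions.0.to_v",
    "up_blocks.2.attentions.0.to_q",
]

def select_lora_targets(module_names, patterns):
    """Pick the U-Net submodules to adapt: instead of attaching LoRA to
    every attention layer, keep only the blocks matching the patterns
    identified as controlling the target attribute."""
    return [n for n in module_names if any(p in n for p in patterns)]
```

Restricting the adapted parameter set this way is one plausible reading of "using the module-feature correspondence to guide LoRA parameter selection"; it shrinks the trainable footprint and limits side effects on unrelated attributes.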
Application optimization——vertical domain: portraits (continued)
Vertical-domain finetuning experience:
- Coarse-tune on a large amount of data to increase the model's generalization to new concepts.
- Fine-tune on a small amount of high-quality data to raise image quality.
Face-repair logic. Sample: "a fantasy warrior in magnificent armor battles a giant dragon, thunder and flame intertwined" (6 random samples, no cherry-picking).

Application optimization——vertical domain: ancient-style portrait results
"By the ancient road a dust-covered rider sits tall on horseback, draped in a white cloak, treading on silently falling leaves" (6 random samples, no cherry-picking). "Among the trees, a graceful young girl in a red top and green skirt, flowered umbrella in hand, picks her way along a muddy path as if stepping into a painting" (6 random samples, no cherry-picking).

Application optimization——vertical-domain application: advertising & marketing tool (beta).

Application optimization——text rendering: problem definition
How can a text-to-image model render correct characters?
Reference: Ma, Jian, et al. "GlyphDraw: Learning to Draw Chinese Characters in Image Synthesis Models Coherently." arXiv preprint arXiv:2303.17870 (2023).
Application optimization——text rendering: algorithm
GlyphDraw training framework, inference framework, and dataset construction.

Dataset         | image-text pairs | text volume
Chinese dataset | 792k             | 3.3M characters
English dataset | 1.9M             | 2.3M words
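As a rough illustration of glyph-position conditioning of the kind GlyphDraw describes (the exact conditioning interface below is an assumption for the sketch), an extra channel marking the target text region can be stacked onto the latent the U-Net sees:

```python
import numpy as np

def position_mask(h, w, box):
    """Binary mask marking the region where text should appear. GlyphDraw
    conditions the diffusion model on a rendered glyph image plus a location
    mask; this sketch only builds the mask channel."""
    m = np.zeros((1, h, w), dtype=np.float32)
    x0, y0, x1, y1 = box          # pixel box: left, top, right, bottom
    m[0, y0:y1, x0:x1] = 1.0
    return m

def concat_condition(latent, mask):
    """Stack the mask as an extra input channel (C+1, H, W), the usual way
    spatial conditions are fed to a diffusion U-Net."""
    return np.concatenate([latent, mask], axis=0)
```

The model can then learn to draw the requested characters only inside the masked region while leaving the rest of the layout to the text prompt.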
Application optimization——text rendering: objective results (metric charts in the original slides).
Application optimization——text rendering: subjective results (sample images in the original slides).

Application optimization——personalized generation: problem definition
How to quickly generate new images from a single reference image while balancing fidelity and generalization?
Reference: Ma, Jian, et al. "Subject-Diffusion: Open domain personalized text-to-image generation without test-time fine-tuning." arXiv preprint arXiv:2307.11410 (2023).
Application optimization——personalized generation: dataset
SDD dataset statistics and word cloud (slide figures).

Application optimization——personalized generation: algorithm and results
Framework figure; single-entity generation compared with other methods; dual-entity generation compared with other methods; further qualitative results.

Application optimization——personalized generation: applications
Advertising & marketing tool flows:
- Product appearance design (0-to-1 design): appearance description generation, multi-angle appearance generation, brand-tone intervention, generation guided by brand VI and the tone of past products.
- Product shot generation (studio-style renders): description generation, reference-object intervention, generation of renders from a reference image, rough-sketch-to-image.
- Marketing asset generation (posters / banners): marketing copy & image generation, layout description generation, asset layout generation, generation from reference assets.
Example: "A yellow hat"; "A girl wearing the hat and facing forest"; select and generate a poster.
Shown applications: product design, personalized image generation, poster design, Subject-Diffusion personalized generation, and Stable Diffusion outpainting.

On-device lightweighting of text-to-image generation models
On-device lightweighting——technical route: model structure optimization
U-Net structure diagram; analysis of the quality change and parameter-count change after deleting each module.

Model       | sampling time (DPM-Solver++, 25 steps) | runtime memory | UNet parameters
SD 1.5      | 1.34 s                                 | 4105M          | 859.52M
SD base-2m  | 0.90 s                                 | 3458M          | 579.38M
SD small-2m | 0.83 s                                 | 3287M          | 482.35M
SD tiny-2m  | 0.76 s                                 | 2979M          | 323.38M
SD small    | 0.88 s                                 | 3477M          | 579.38M
SD tiny     | 0.75 s                                 | 3043M          | 323.38M

Test results for the different pruned models on a V100.

On-device lightweighting——technical route: model structure optimization (continued)
Distillation: SDXL is used as the teacher to distill the SD small model.
On-device lightweighting——technical route: sampling acceleration
Reference: Salimans, Tim, and Jonathan Ho. "Progressive Distillation for Fast Sampling of Diffusion Models." arXiv preprint arXiv:2202.00512 (2022).
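The progressive-distillation objective can be sketched abstractly: the student learns to reproduce two consecutive teacher sampler steps in a single step, halving the step count each round. The toy teacher below is a stand-in for illustration, not a real diffusion sampler.

```python
import numpy as np

def distill_targets(z_t, teacher_step, t, dt):
    """Progressive distillation (Salimans & Ho): run the teacher sampler for
    two consecutive steps and use the result as the one-step regression
    target for a student operating with half the step count."""
    z_mid = teacher_step(z_t, t)
    return teacher_step(z_mid, t - dt)

def toy_teacher(z, t):
    """Toy stand-in for one teacher sampler update (a fixed contraction);
    a real teacher would be a diffusion U-Net plus a DDIM-style update."""
    return 0.9 * z

def student_loss(student_step, z_t, t, dt):
    """MSE between the student's single step and the two-step teacher target."""
    target = distill_targets(z_t, toy_teacher, t, dt)
    return float(np.mean((student_step(z_t, t) - target) ** 2))
```

In this toy setup the ideal one-step student is the composition of two teacher steps (here 0.9 * 0.9 = 0.81), which is exactly the collapse-two-steps-into-one behavior the method trains for; repeating the procedure yields the few-step samplers used for on-device deployment.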