
Memory-augmented Neural Machine Translation
Shiyue Zhang
NLP Group, CSLT, Tsinghua University
Joint work with Yang Feng, Dong Wang, Andi Zhang
EMNLP’17 (Submitted)

Outline
- Introduction
- Attention-based NMT
- Memory-augmented NMT
- Experiments
- Conclusions
- Future work
- Reference

Introduction
Statistical Machine Translation (SMT)
- Phrase-based machine translation (Moses, Koehn et al. 2007)
- Phrase table + language model
- An example: 什么是成人高考 ||| 成人高考簡介 ("what is the adult college entrance exam" ||| "introduction to the adult college entrance exam")
- Phrase table: 什么是 => 簡介, 成人高考 => 成人高考
- The language model guides the order (a toy sketch follows below).
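To make the phrase-table-plus-language-model idea concrete, here is a minimal Python sketch; the toy phrase table and bigram scores below are hypothetical illustrations of the example above, not Moses internals.

from itertools import permutations

# Toy phrase table built from the example above (hypothetical entries).
phrase_table = {
    "什么是": "簡介",         # "what is" => "introduction"
    "成人高考": "成人高考",    # kept unchanged
}

# Toy bigram "language model": it prefers 成人高考 before 簡介 (hypothetical scores).
bigram_scores = {("成人高考", "簡介"): 0.9, ("簡介", "成人高考"): 0.1}

def lm_score(tokens):
    score = 1.0
    for a, b in zip(tokens, tokens[1:]):
        score *= bigram_scores.get((a, b), 0.5)
    return score

def translate(source_phrases):
    # Translate each phrase with the table, then let the LM pick the best order.
    translated = [phrase_table[p] for p in source_phrases]
    return max(permutations(translated), key=lm_score)

print(translate(["什么是", "成人高考"]))   # ('成人高考', '簡介')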

Neural Machine Translation (NMT)
- Has achieved significant success; especially when the dataset is big enough, NMT performs considerably better than SMT.

Introduction
An interesting insight: suppose we have a zh-en translation task and the training set contains 150,000 distinct Chinese words. In SMT, the vocabulary size is 150,000, and OOV (out-of-vocabulary) words only appear in the test set. In NMT, since word embeddings are trained along with the model, the vocabulary size typically has to be limited to ~30,000; the remaining 120,000 words are all labeled with a single token, "UNK". So the OOV problem is dramatically aggravated in NMT. But, surprisingly, NMT is still better than SMT. Why? NMT is very good at reasoning!
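A minimal sketch of the vocabulary truncation described above; the corpus and the 30,000 cut-off are placeholders.

from collections import Counter

def build_vocab(corpus_tokens, max_size=30000):
    # Keep only the most frequent word types; everything else becomes "UNK".
    counts = Counter(corpus_tokens)
    return {w for w, _ in counts.most_common(max_size)}

def map_to_vocab(tokens, vocab):
    return [t if t in vocab else "UNK" for t in tokens]

# With ~150,000 distinct source words and a 30,000-word vocabulary,
# the remaining ~120,000 word types all collapse into the single token "UNK".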

Introduction
- NMT overfits to frequent observations while overlooking special cases.
- NMT gives a reasonable translation, but the meaning drifts away.
- An experiment: after decoding the training set, the 30,000-word English vocabulary shrinks to 26,911 distinct words in the output.

Introduction
- Our aim: to address the rare and unknown word problems.
- Our method: augment NMT with a memory component that memorizes source-target word pairs (a rough sketch follows). It is like equipping a translator with a dictionary.
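The model itself is shown later in figures; purely to illustrate the dictionary intuition, here is a rough sketch that assumes (hypothetically) the memory stores source-target word pairs and that its scores are interpolated with the NMT softmax via the attention weights. The function, the fixed gate, and the toy numbers are illustrative, not the exact M-NMT formulation.

import numpy as np

def memory_augmented_probs(nmt_probs, attn, src_words, memory, tgt_index, gate=0.5):
    # Hypothetical sketch: route the attention weight of each source word to the
    # target word stored for it in the memory, then mix with the NMT distribution.
    mem_probs = np.zeros_like(nmt_probs)
    for weight, src in zip(attn, src_words):
        tgt = memory.get(src)                    # e.g. {"感冒": "alzheimer"}
        if tgt is not None and tgt in tgt_index:
            mem_probs[tgt_index[tgt]] += weight
    if mem_probs.sum() > 0:
        mem_probs /= mem_probs.sum()             # renormalize the memory distribution
    return (1 - gate) * nmt_probs + gate * mem_probs

# Toy usage:
tgt_index = {"currently": 0, "alzheimer": 1, "UNK": 2}
nmt_probs = np.array([0.5, 0.1, 0.4])            # NMT softmax at one decoding step
attn = np.array([0.1, 0.8, 0.1])                 # attention over three source words
src_words = ["目前", "感冒", "措施"]
memory = {"感冒": "alzheimer"}
print(memory_augmented_probs(nmt_probs, attn, src_words, memory, tgt_index))
# -> approximately [0.25, 0.55, 0.20]: "alzheimer" now dominates instead of "UNK"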

Outline: Introduction / Attention-based NMT / Memory-augmented NMT / Experiments / Conclusions / Future work

Attention-based NMT
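The attention mechanism is only shown in figures at this point; as a reference, here is a minimal numpy sketch of Bahdanau-style additive attention (Bahdanau et al.); the weight names and sizes are illustrative.

import numpy as np

def additive_attention(prev_state, enc_states, W, U, v):
    # Score each source annotation against the previous decoder state,
    # softmax the scores, and build the context vector as a weighted sum.
    scores = np.array([v @ np.tanh(W @ prev_state + U @ h) for h in enc_states])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # attention distribution over source positions
    context = (weights[:, None] * enc_states).sum(axis=0)
    return weights, context

# Toy usage: 4 source positions, hidden size 8 (shapes are illustrative).
rng = np.random.default_rng(0)
enc_states = rng.normal(size=(4, 8))
prev_state = rng.normal(size=8)
W, U, v = rng.normal(size=(8, 8)), rng.normal(size=(8, 8)), rng.normal(size=8)
alpha, context = additive_attention(prev_state, enc_states, W, U, v)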

Outline: Introduction / Attention-based NMT / Memory-augmented NMT / Experiments / Conclusions / Future work

Memory-augmented NMT

Memory-augmented NMT: OOV treatment
- Main idea: represent an OOV word by a similar word that is in the vocabulary.
- An example:
  Src: 目前沒有治愈阿爾茲海默癥的措施 ("there is currently no cure for Alzheimer's disease")
  Word mapping: <阿爾茲海默癥 – alzheimer>; 阿爾茲海默癥 is UNK, so it is replaced by its similar in-vocabulary word 感冒, which is not UNK, giving <感冒 – alzheimer>.
  Res: Currently there is no cure for alzheimer's disease
- Note that similar words can either be defined by humans or selected based on word-vector similarity (sketched below).
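For the "selected based on word vector similarity" option, a minimal sketch; it assumes word vectors (e.g. trained on extra monolingual data) that cover the OOV word, and the names are illustrative.

import numpy as np

def most_similar_in_vocab(oov_word, vectors, vocab):
    # Return the in-vocabulary word whose embedding is closest (by cosine
    # similarity) to the OOV word's embedding, e.g. 阿爾茲海默癥 -> 感冒.
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    query = vectors[oov_word]
    candidates = (w for w in vocab if w in vectors)
    return max(candidates, key=lambda w: cos(query, vectors[w]))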

Outline: Introduction / Attention-based NMT / Memory-augmented NMT / Experiments / Conclusions / Future work

Experiments (zh-en)
Data:
- IWSLT: 44K sentence pairs in the training set, ~13,000 zh words, ~9,500 en words.
- NIST: 1M sentence pairs in the training set, ~190,000 zh words, ~100,000 en words.
Systems:
- SMT: Moses
- NMT
- NMT-L (Arthur, P. et al. 2016)
- NMT-PL (Minh-Thang Luong et al. 2015)
- M-NMT
Evaluation metrics:
- BLEU: the geometric mean of the 1- to 4-gram precisions multiplied by a brevity penalty (see the sketch after this list)
- Translation baseline
- OOV baseline
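A simplified sketch of the BLEU score described above (geometric mean of the 1- to 4-gram precisions times a brevity penalty); it handles a single sentence pair and a single reference, so it omits corpus-level aggregation and multi-reference clipping.

import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    # Clipped n-gram precisions, combined as a geometric mean and
    # multiplied by the brevity penalty.
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    bp = 1.0 if len(candidate) > len(reference) else math.exp(1 - len(reference) / max(len(candidate), 1))
    if min(precisions) == 0:
        return 0.0
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)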

Experiments (zh-en)
Two observations:
- M-NMT performs best.
- M-NMT brings more improvement on the IWSLT corpus.
Two conclusions:
- M-NMT is effective.
- M-NMT is robust.

Experiments (zh-en)
M-NMT recalls more OOV words.

Experiments (zh-uy)
Data: 180k sentence pairs, ~170,000 Uyghur words, ~130,000 Chinese words.
Performance:

  Systems            SMT      NMT      M-NMT
  1-gram BLEU        54.5     57.7     58.8
  2-gram BLEU        34.6     39.8     40.8
  3-gram BLEU        26.6     31.9     32.4
  4-gram BLEU        22.1     27.0     27.1
  Brevity penalty    1.000    0.939    0.968
  BLEU               32.44    35.24    36.88

  Systems    Recalled words in test
  SMT        3680/6666
  NMT        3509/6666
  M-NMT      3560/6666
  *6666 is the number of words in the reference.
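The "recalled words" numbers above can be approximated by counting reference words that also appear in the system output; the exact definition used in the paper may differ, so the sketch below (per-sentence word types) is only an assumption.

def recalled_words(hypotheses, references):
    # hypotheses / references: lists of tokenized sentences.
    recalled = total = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_set = set(hyp)
        for w in set(ref):
            total += 1
            recalled += w in hyp_set
    return recalled, total   # e.g. 3560/6666 for M-NMT on the zh-uy test set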

Outline: Introduction / Attention-based NMT / Memory-augmented NMT / Experiments / Conclusions / Future work

Conclusions
- M-NMT alleviates the rare-word and under-translation problems in NMT.
- M-NMT provides a way to address the OOV problem.
- So far, M-NMT brings at least a 1.6 BLEU improvement on different datasets.

Outline: Introduction / Attention-based NMT / Memory-augmented NMT / Experiments / Conclusions / Future work

Future work
- Better OOV treatment? Ideally, no similar-word replacement would be needed.
- Apply it to the whole dataset.
- Phrase-based memory?

Reference
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., et al. (2007). Moses: open source toolkit for statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL'07), 177-180.
Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
Arthur, P., Neubig, G., & Nakamura, S. (2016). Incorporating discrete translation lexicons into neural machine translation. In Proceedings of EMNLP.
