版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進行舉報或認領(lǐng)
文檔簡介
CFGsandPCFGs(Probabilistic)Context-FreeGrammarsAphrasestructuregrammarSNPVPVPVNPVP
VNPPPNPNPNPNP
NPPPNPNNPePPPNPpeoplefishtankspeoplefishwithrodsNpeopleNfishNtanksN
rodsVpeopleVfishVtanksPwithPhrasestructuregrammars
=context-freegrammars(CFGs)G=(T,N,S,R)TisasetofterminalsymbolsNisasetofnonterminalsymbolsSisthestartsymbol(S∈N)Risasetofrules/productionsoftheformXX∈Nand∈(N∪T)*AgrammarGgeneratesalanguageL.PhrasestructuregrammarsinNLPG=(T,C,N,S,L,R)TisasetofterminalsymbolsCisasetofpreterminalsymbolsNisasetofnonterminalsymbolsSisthestartsymbol(S∈N)Listhelexicon,asetofitemsoftheformXxX∈Pandx∈TRisthegrammar,asetofitemsoftheformXX∈Nand∈(N∪C)*Byusualconvention,Sisthestartsymbol,butinstatisticalNLP,weusuallyhaveanextranodeatthetop(ROOT,TOP)Weusuallywriteeforanemptysequence,ratherthannothingAphrasestructuregrammarSNPVPVPVNPVP
VNPPPNPNPNPNPNPPPNPNNPePPPNPpeoplefishtankspeoplefishwithrodsN
people
N
fish
N
tanks
N
rods
V
people
V
fish
V
tanks
P
with
Probabilistic–orstochastic–context-freegrammars(PCFGs)G=(T,N,S,R,P)TisasetofterminalsymbolsNisasetofnonterminalsymbolsSisthestartsymbol(S∈N)Risasetofrules/productionsoftheformXPisaprobabilityfunctionP:R[0,1]
AgrammarGgeneratesalanguagemodelL.APCFGSNPVP 1.0VPVNP 0.6VP
VNPPP 0.4NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people
0.5N
fish 0.2N
tanks 0.2N
rods 0.1V
people 0.1V
fish 0.6V
tanks 0.3P
with 1.0[WithemptyNPremovedsolessambiguous]TheprobabilityoftreesandstringsP(t)–Theprobabilityofatreet
istheproductoftheprobabilitiesoftherulesusedtogenerateit.P(s)–Theprobabilityofthestrings
isthesumoftheprobabilitiesofthetreeswhichhavethatstringastheiryieldP(s)=Σj
P(s,t)wheretisaparseofs
=Σj
P(t)TreeandStringProbabilities s
=
peoplefishtankswithrodsP(t1)=1.0×0.7×0.4×0.5×0.6×0.7×1.0×0.2×1.0×0.7×0.1
=0.0008232P(t2)=1.0×0.7×0.6×0.5×0.6×0.2
×0.7×1.0×0.2×1.0×0.7×0.1=0.00024696
P(s)=P(t1)+P(t2) =0.0008232+0.00024696=0.00107016
VerbattachNounattachCFGs
andPCFGs(Probabilistic)Context-FreeGrammarsGrammar
TransformsRestrictingthegrammarformforefficientparsingChomskyNormalFormAllrulesareoftheformXYZorXwX,Y,Z∈Nandw∈TAtransformationtothisformdoesn’tchangetheweakgenerativecapacityofaCFGThatis,itrecognizesthesamelanguageButmaybewithdifferenttreesEmptiesandunariesareremovedrecursivelyn-aryrulesaredividedbyintroducingnewnonterminals(n>2)AphrasestructuregrammarSNPVPVPVNPVP
VNPPPNPNPNPNPNPPPNPNNPePPPNPN
people
N
fish
N
tanks
N
rods
V
people
V
fish
V
tanks
P
with
ChomskyNormalFormstepsSNPVPS
VPVPVNPVP
VVP
VNPPPVP
VPPNPNPNPNP
NPNPNPPPNP
PPNPNPPPNPPP
PN
people
N
fish
N
tanks
N
rods
V
people
V
fish
V
tanks
P
with
ChomskyNormalFormstepsSNPVPVPVNPS
VNPVP
VSVVP
VNPPPSVNPPPVP
VPPSVPPNPNPNPNP
NPNPNPPPNP
PPNPNPPPNPPP
PN
people
N
fish
N
tanks
N
rods
V
people
V
fish
V
tanks
P
with
ChomskyNormalFormstepsSNPVPVPVNPS
VNPVP
VVP
VNPPPSVNPPPVP
VPPSVPPNPNPNPNP
NPNPNPPPNP
PPNPNPPPNPPP
PN
people
N
fish
N
tanks
N
rods
V
people
S
people
V
fish
S
fish
V
tanks
S
tanks
P
with
ChomskyNormalFormstepsSNPVPVPVNPS
VNPVP
VNPPPSVNPPPVP
VPPSVPPNPNPNPNP
NPNPNPPPNP
PPNPNPPPNPPP
PN
people
N
fish
N
tanks
N
rods
V
people
S
people
VP
people
V
fish
S
fish
VP
fish
V
tanks
S
tanks
VP
tanks
P
with
ChomskyNormalFormstepsSNPVPVPVNPS
VNPVP
VNPPPSVNPPPVP
VPPSVPPNPNPNPNPNPPPNPPNPPPPNPNP
people
NP
fish
NP
tanks
NP
rods
V
people
S
people
VP
people
V
fish
S
fish
VP
fish
V
tanks
S
tanks
VP
tanks
P
with
PP
with
ChomskyNormalFormstepsSNPVPVPVNPS
VNPVP
V@VP_V@VP_V
NPPPSV@S_V@S_V
NPPPVP
VPPSVPPNPNPNPNPNPPPNPPNPPPPNPNP
people
NP
fish
NP
tanks
NP
rods
V
people
S
people
VP
people
V
fish
S
fish
VP
fish
V
tanks
S
tanks
VP
tanks
P
with
PP
with
AphrasestructuregrammarSNPVPVPVNPVP
VNPPPNPNPNPNPNPPPNPNNPePPPNPN
people
N
fish
N
tanks
N
rods
V
people
V
fish
V
tanks
P
with
ChomskyNormalFormstepsSNPVPVPVNPS
VNPVP
V@VP_V@VP_V
NPPPSV@S_V@S_V
NPPPVP
VPPSVPPNPNPNPNPNPPPNPPNPPPPNPNP
people
NP
fish
NP
tanks
NP
rods
V
people
S
people
VP
people
V
fish
S
fish
VP
fish
V
tanks
S
tanks
VP
tanks
P
with
PP
with
ChomskyNormalFormYoushouldthinkofthisasatransformationforefficientparsingWithsomeextrabook-keepinginsymbolnames,youcanevenreconstructthesametreeswithadetransformInpracticefullChomskyNormalFormisapainReconstructingn-ariesiseasyReconstructingunaries/emptiesistrickierBinarizationiscrucialforcubictimeCFGparsingTherestisn’tnecessary;itjustmakesthealgorithmscleanerandabitquickerROOTSNPVPNpeopleVNPPPPNProdswithtanksfishNNAnexample:beforebinarization…PNProdsNwithNPNpeopletanksfishNVPVNPPP@VP_VROOTSAfterbinarization…Treebank:emptiesandunariesROOTS-HLNNP-SUBJVPVB-NONE-eAtonePTBTreeROOTSNPVPVB-NONE-eAtoneNoFuncTagsROOTSVPVBAtoneNoEmptiesROOTSAtoneNoUnariesROOTVBAtoneHighLowUnaryrules:
alchemyinthelandoftreebanksSame-SpanReachabilityADJPADVPFRAGINTJNPPPPRNQPSSBARUCPVPWHNPTOPLSTCONJPWHADJPWHADVPWHPPNXNoEmptiesNACSBARQSINVRRCSQXPRTGrammar
TransformsRestrictingthegrammarformforefficientparsingCKYParsingExactpolynomialtimeparsingof(P)CFGsConstituencyParsing fish
peoplefishtanksRuleProbθi
SNPVP θ0NPNPNP θ1…Nfish θ42Npeople θ43V
fish θ44…PCFGNNVNVPNPNPSCocke-Kasami-Younger(CKY)
ConstituencyParsingfishpeoplefish
tanksViterbi(Max)ScorespeoplefishNP
0.35V 0.1N 0.5VP
0.06NP 0.14V 0.6N 0.2NP→NNNNS 0.13iNP=(0.13)(0.0023)(0.0014)=1.87×10-7NP→NNPNNS 0.056iNP=(0.056)(0.001)(0.0014)=7.84×10-8S1.87×10-7VPViterbi(Max)ScorespeoplefishNP
0.35V 0.1N 0.5VP
0.06NP 0.14V 0.6N 0.2SNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0ExtendedCKYparsingUnariescanbeincorporatedintothealgorithmMessy,butdoesn’tincreasealgorithmiccomplexityEmptiescanbeincorporatedUsefencepostsDoesn’tincreasecomplexity;essentiallylikeunariesBinarizationisvitalWithoutbinarization,youdon’tgetparsingcubicinthelengthofthesentenceandinthenumberofnonterminalsinthegrammarBinarizationmaybeanexplicittransformationorimplicitinhowtheparserworks(Early-styledottedrules),butit’salwaysthere.functionCKY(words,grammar)returns[most_probable_parse,prob]score=newdouble[#(words)+1][#(words)+1][#(nonterms)]back=newPair[#(words)+1][#(words)+1][#nonterms]]fori=0;i<#(words);i++forAinnontermsifA->words[i]ingrammarscore[i][i+1][A]=P(A->words[i])//handleunariesbooleanadded=truewhileaddedadded=falseforA,Binnontermsifscore[i][i+1][B]>0&&A->Bingrammarprob=P(A->B)*score[i][i+1][B]
ifprob>score[i][i+1][A]score[i][i+1][A]=probback[i][i+1][A]=Badded=trueTheCKYalgorithm(1960/1965)
…extendedtounariesforspan=2to#(words)forbegin=0to#(words)-spanend=begin+spanforsplit=begin+1toend-1forA,B,Cinnontermsprob=score[begin][split][B]*score[split][end][C]*P(A->BC)
ifprob>score[begin][end][A]score[begin]end][A]=probback[begin][end][A]=newTriple(split,B,C)
//handleunaries
booleanadded=true
whileadded
added=false
forA,Binnonterms
prob=P(A->B)*score[begin][end][B];
ifprob>score[begin][end][A]
score[begin][end][A]=prob
back[begin][end][A]=B
added=truereturnbuildTree(score,back)TheCKYalgorithm(1960/1965)
…extendedtounariesQuizQuestion!runsdownNNS 0.0023VB 0.001PP 0.2IN 0.0014NNS 0.0001PP→IN 0.002NP→NNSNNS 0.01NP→NNSNP 0.005NP→NNSPP 0.01VP→VBPP 0.045VP→VBNP 0.015?? ???? ??Whatconstituents(withwhatprobabilitycanyoumake?CKYParsingExactpolynomialtimeparsingof(P)CFGsCKYParsingAworkedexampleThegrammar:
Binary,noepsilons,SNPVP 0.9SVP 0.1VPVNP 0.5VP
V 0.1VP
V@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish 0.2N
tanks 0.2N
rods 0.1V
people 0.1V
fish 0.6V
tanks 0.3P
with 1.0score[0][1]score[1][2]score[2][3]score[3][4]score[0][2]score[1][3]score[2][4]score[0][3]score[1][4]score[0][4]012341234fishpeoplefishtanks012341234fishpeoplefishtanksSNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish
0.2N
tanks
0.2N
rods
0.1V
people 0.1V
fish
0.6V
tanks 0.3P
with
1.0fori=0;i<#(words);i++forAinnontermsifA->words[i]ingrammarscore[i][i+1][A]=P(A->words[i]);Nfish0.2Vfish0.6Npeople0.5Vpeople0.1Nfish0.2Vfish0.6N
tanks0.2V
tanks0.1012341234fishpeoplefishtanksSNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish
0.2N
tanks
0.2N
rods
0.1V
people 0.1V
fish
0.6V
tanks 0.3P
with
1.0//handleunariesbooleanadded=truewhileaddedadded=falseforA,Binnontermsifscore[i][i+1][B]>0&&A->Bingrammarprob=P(A->B)*score[i][i+1][B]if(prob>score[i][i+1][A])score[i][i+1][A]=probback[i][i+1][A]=Badded=trueNfish0.2Vfish0.6NP
N0.14VP
V0.06S
VP0.006Npeople0.5Vpeople0.1NPN0.35VPV0.01SVP0.001Nfish0.2Vfish0.6NPN0.14VPV0.06SVP0.006N
tanks0.2V
tanks0.1NPN0.14VPV0.03SVP0.003012341234fishpeoplefishtanksSNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish
0.2N
tanks
0.2N
rods
0.1V
people 0.1V
fish
0.6V
tanks 0.3P
with
1.0prob=score[begin][split][B]*score[split][end][C]*P(A->BC)if(prob>score[begin][end][A])
score[begin]end][A]=prob
back[begin][end][A]=newTriple(split,B,C)Nfish0.2Vfish0.6NP
N0.14VP
V0.06S
VP0.006Npeople0.5Vpeople0.1NPN0.35VPV0.01SVP0.001Nfish0.2Vfish0.6NPN0.14VPV0.06SVP0.006N
tanks0.2V
tanks0.1NPN0.14VPV0.03SVP0.003NP
NPNP
0.0049VPVNP
0.105S
NPVP
0.00126NPNPNP
0.0049VPVNP
0.007SNPVP
0.0189NPNPNP
0.00196VPVNP
0.042SNPVP
0.00378012341234fishpeoplefishtanksSNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish
0.2N
tanks
0.2N
rods
0.1V
people 0.1V
fish
0.6V
tanks 0.3P
with
1.0//handleunariesbooleanadded=truewhileadded
added=false
forA,Binnonterms
prob=P(A->B)*score[begin][end][B];
ifprob>score[begin][end][A]
score[begin][end][A]=prob
back[begin][end][A]=B
added=trueNfish0.2Vfish0.6NP
N0.14VP
V0.06S
VP0.006Npeople0.5Vpeople0.1NPN0.35VPV0.01SVP0.001Nfish0.2Vfish0.6NPN0.14VPV0.06SVP0.006N
tanks0.2V
tanks0.1NPN0.14VPV0.03SVP0.003NP
NPNP
0.0049VPVNP
0.105S
VP
0.0105NPNPNP
0.0049VPVNP
0.007SNPVP
0.0189NPNPNP
0.00196VPVNP
0.042S
VP
0.0042012341234fishpeoplefishtanksSNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish
0.2N
tanks
0.2N
rods
0.1V
people 0.1V
fish
0.6V
tanks 0.3P
with
1.0forsplit=begin+1toend-1
forA,B,Cinnonterms
prob=score[begin][split][B]*score[split][end][C]*P(A->BC)
ifprob>score[begin][end][A]
score[begin]end][A]=prob
back[begin][end][A]=newTriple(split,B,C)Nfish0.2Vfish0.6NP
N0.14VP
V0.06S
VP0.006Npeople0.5Vpeople0.1NPN0.35VPV0.01SVP0.001Nfish0.2Vfish0.6NPN0.14VPV0.06SVP0.006N
tanks0.2V
tanks0.1NPN0.14VPV0.03SVP0.003NP
NPNP
0.0049VPVNP
0.105S
VP
0.0105NPNPNP
0.0049VPVNP
0.007SNPVP
0.0189NPNPNP
0.00196VPVNP
0.042S
VP
0.0042NPNPNP
0.0000686VPVNP
0.00147SNPVP
0.000882012341234fishpeoplefishtanksSNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish
0.2N
tanks
0.2N
rods
0.1V
people 0.1V
fish
0.6V
tanks 0.3P
with
1.0forsplit=begin+1toend-1
forA,B,Cinnonterms
prob=score[begin][split][B]*score[split][end][C]*P(A->BC)
ifprob>score[begin][end][A]
score[begin]end][A]=prob
back[begin][end][A]=newTriple(split,B,C)Nfish0.2Vfish0.6NP
N0.14VP
V0.06S
VP0.006Npeople0.5Vpeople0.1NPN0.35VPV0.01SVP0.001Nfish0.2Vfish0.6NPN0.14VPV0.06SVP0.006N
tanks0.2V
tanks0.1NPN0.14VPV0.03SVP0.003NP
NPNP
0.0049VPVNP
0.105S
VP
0.0105NPNPNP
0.0049VPVNP
0.007SNPVP
0.0189NPNPNP
0.00196VPVNP
0.042S
VP
0.0042NPNPNP
0.0000686VPVNP
0.00147SNPVP
0.000882NPNPNP0.0000686VPVNP
0.000098SNPVP
0.01323012341234fishpeoplefishtanksSNPVP 0.9SVP 0.1VPVNP 0.5VPV 0.1VPV@VP_V 0.3VPVPP 0.1@VP_VNPPP 1.0NPNPNP 0.1NPNPPP 0.2NPN 0.7PPPNP 1.0N
people 0.5N
fish
0.2N
tanks
0.2N
rods
0.1V
people 0.1V
fish
0.6V
tanks 0.3P
with
1.0forsplit=begin+1toend-1
forA,B,Cinnonterms
prob=score[begin][split][B]*score[split][end][C]*P(A->BC)
ifprob>score[begin][end][A]
score[begin]end][A]=prob
back[begin][end][A]=newTriple(split,B,C)Nfish0.2Vfish0.6NP
N0.14VP
V0.06S
VP0.006Npeople0.5Vpeople0.1NPN0.35VPV0.01SVP0.001Nfish0.2Vfish0.6NPN0.14VPV0.0
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責。
- 6. 下載文件中如有侵權(quán)或不適當內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 2025年度互聯(lián)網(wǎng)企業(yè)派遣員工網(wǎng)絡(luò)安全合同3篇
- 2025年全新公對公借款合同模板下載及服務(wù)支持10篇
- 二零二五年度體育館租賃合同附體育賽事推廣及贊助招商服務(wù)
- 2025版智能工廠生產(chǎn)線改造施工合同4篇
- 二零二五年度新能源產(chǎn)品銷售代理合作合同范本3篇
- Bobath技術(shù)閆秀麗講解
- 2025年度個人藝術(shù)品租賃借款合同范本及租賃期限約定
- 2025年室內(nèi)墻面批白工程售后服務(wù)合同
- 二零二五年度戶外廣告照明外接電源供應(yīng)合同
- 2025年度個人房屋抵押貸款擔保及養(yǎng)老保障服務(wù)合同
- 道路瀝青工程施工方案
- 2025年度正規(guī)離婚協(xié)議書電子版下載服務(wù)
- 《田口方法的導(dǎo)入》課件
- 內(nèi)陸?zhàn)B殖與水產(chǎn)品市場營銷策略考核試卷
- 電力電纜工程施工組織設(shè)計
- 醫(yī)生給病人免責協(xié)議書(2篇)
- 票據(jù)業(yè)務(wù)居間合同模板
- 高中物理選擇性必修2教材習(xí)題答案
- 應(yīng)急預(yù)案評分標準表
- “網(wǎng)絡(luò)安全課件:高校教師網(wǎng)絡(luò)安全與信息化素養(yǎng)培訓(xùn)”
- 鋰離子電池健康評估及剩余使用壽命預(yù)測方法研究
評論
0/150
提交評論