Finance and Economics Discussion Series
Federal Reserve Board, Washington, D.C.
ISSN 1936-2854 (Print)
ISSN 2767-3898 (Online)

Manufacturing Sentiment: Forecasting Industrial Production with Text Analysis

Tomaz Cajner, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, Betsy Vrankovich

2024-026

Please cite this paper as:
Cajner, Tomaz, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, and Betsy Vrankovich (2024). “Manufacturing Sentiment: Forecasting Industrial Production with Text Analysis,” Finance and Economics Discussion Series 2024-026. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2024.026.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Manufacturing Sentiment:
Forecasting Industrial Production with Text Analysis*

Tomaz Cajner, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, Betsy Vrankovich

April 2024

Abstract
This paper examines the link between industrial production and the sentiment expressed in natural language survey responses from U.S. manufacturing firms. We compare several natural language processing (NLP) techniques for classifying sentiment, ranging from dictionary-based methods to modern deep learning methods. Using a manually labeled sample as ground truth, we find that deep learning models—partially trained on a human-labeled sample of our data—outperform other methods for classifying the sentiment of survey responses. Further, we capitalize on the panel nature of the data to train models which predict firm-level production using lagged firm-level text. This allows us to leverage a large sample of “naturally occurring” labels with no manual input. We then assess the extent to which each sentiment measure, aggregated to monthly time series, can serve as a useful statistical indicator and forecast industrial production. Our results suggest that the text responses provide information beyond the available numerical data from the same survey and improve out-of-sample forecasting; deep learning methods and the use of naturally occurring labels seem especially useful for forecasting. We also explore what drives the predictions made by the deep learning models, and find that a relatively small number of words—associated with very positive/negative sentiment—account for much of the variation in the aggregate sentiment index.
JEL codes: C1, E17, O14
Keywords: Industrial Production, Natural Language Processing, Machine Learning, Forecasting
*All authors are at the Federal Reserve Board of Governors. We thank the Institute for Supply Management, including Kristina Cahill, Tom Derry, Debbie Fogel-Monnissen, Rose Marie Goupil, Paul Lee, Susan Marty, and Denis Wolowiecki, for access to and help with the manufacturing survey data that underlie the work described by this paper. We are thankful for comments and suggestions from Stephen Hansen, Andreas Joseph, Juri Marcucci, Arthur Turrell, and participants at the Society for Government Economists Annual Conference, the ESCoE Conference on Economic Measurement, the Government Advances in Statistical Programming Conference, the Society for Economic Measurement Conference, and the Nontraditional Data, Machine Learning, and Natural Language Processing in Macroeconomics Conference. The analysis and conclusions set forth here are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors.
1 Introduction
In recent years there has been an explosion of interest in natural language processing (NLP) within finance and macroeconomics. The use of text data to forecast and assist in model estimation is becoming increasingly commonplace. Still, there are many open questions around the use of NLP in empirical work. For example, which of the numerous available methods work best, and work best in specific contexts? Are off-the-shelf tools appropriate, or are there greater returns to specializing models to the data at hand? How useful is text for forecasting real output indicators, such as manufacturing output? What explains the predictions made by complicated NLP models? This paper addresses these questions, using a novel dataset and a variety of NLP methods ranging from traditional dictionaries to fine-tuned transformer neural networks.
Our primary data source is the monthly survey microdata underlying the Institute for Supply Management’s (ISM) Manufacturing Report on Business. The survey is taken by purchasing managers at a representative sample of U.S. manufacturing firms. Part of the survey consists of categorical-response questions about aspects of their current operations, including production, inventories, backlogs, employment, and new orders. The answers to these questions are of the form “worse/the same/better than last month”, and are aggregated into the widely-reported ISM diffusion indexes. But the survey also includes free-response text boxes, where purchasing managers can provide further comments either in general or about specific aspects of their businesses; these comments are a novel source of signal about the economy and our focus in this paper.[1]

[1] While ISM collects these responses through the survey, this text is confidential and not incorporated into the publicized indexes. A sample of responses are published in the monthly ISM Report on Business (see https://www.ismworld.org/supply-management-news-and-reports/reports/ism-report-on-business/).

Our first step is to quantify the text into an economically important and interpretable measure. We focus on sentiment, given that waves of optimism and pessimism have historically been linked to business cycle fluctuations (Keynes, 1937). We begin by evaluating various NLP methods in terms of their ability to correctly classify the sentiment expressed in individual comments. Our context is fairly specific: the data are manufacturing-sector purchasing managers opining about the business outlook for their firm, without much discussion of financial conditions. While there are numerous sentiment classification models available, many were developed with other data in mind, such as social media posts (Nielsen, 2011). Even within economics and finance, most work has focused on finance-related language (Araci, 2019; Correa et al., 2021; Huang et al., 2022). The lack of results for manufacturing-specific datasets motivates our assessment of a variety of NLP techniques.
One common approach is to count the frequency of words within a sentiment dictionary. Economists initially used positive and negative words from the psychology literature, but have since moved on to using domain-specific words (e.g., Correa et al., 2021) and using simple word counts to measure other types of tone, such as uncertainty (see Baker et al., 2016 and Gentzkow et al., 2019). While this method is transparent, it may fail to capture negation and synonyms, and it often requires context-specific dictionaries that may not be available. More recently developed techniques employ deep learning methods that account for the nuances of language. We focus on variants of BERT (see Devlin et al., 2018), a precursor of popular large language models like ChatGPT. These models are pre-trained: the parameters are set by exposing the model to a large corpus of text—such as the entirety of Wikipedia—and attempting to predict missing words or the relationship between sentences. The pre-trained models can be used to classify sentiment directly, or they can be further trained (“fine-tuned”) on a specific dataset. The latter approach attempts to get the best of both worlds: a solid ability to parse language from the exposure to a large quantity of training data, plus the context-specific nuance from the fine-tuning data. While deep learning gets enormous attention, it is ex-ante unclear whether it should outperform carefully curated dictionaries in our context.
Comparing the accuracy of these different methods on a sample of hand-coded comments from our dataset, we find that deep learning does have an advantage on our data, in part because the brevity of the comments means that many comments have no overlap with dictionary terms. In addition, we find that there is value in specializing the models to our data: the models fine-tuned on our data have the highest sentiment classification accuracy on a hold-out sample. These results point to the advantages of using pre-trained models, as well as carefully specializing them to the task at hand. Our hope is that these results help guide other economists when deciding between NLP approaches.
The sentiment measures based on free-form textual responses in the ISM data aggregate into indexes that closely mirror both the diffusion index based on the responses to the categorical survey and aggregate manufacturing output, as measured by the manufacturing component of industrial production. We further investigate the relationship between the average sentiment expressed by purchasing managers and manufacturing output econometrically. Our baseline forecasting model asks whether sentiment can help forecast manufacturing output and includes—among other controls—some of the ISM diffusion indexes, so the test is whether the sentiment indexes have additional information beyond the ISM categorical responses data. We find that most dictionary-based text variables do not help predict manufacturing output, with the exception of a curated financial stability-specific dictionary. On the other hand, sentiment variables from the deep learning models are predictive of future manufacturing output. Out-of-sample forecasting exercises show that the financial stability dictionary and deep learning techniques significantly reduce the mean squared forecast errors as well. Overall, our results suggest that purchasing managers’ survey responses contain useful forward-looking information, and that sentiment-based measures can improve the accuracy of forecasts of manufacturing output.
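To fix ideas, the snippet below sketches the kind of out-of-sample comparison just described: expanding-window one-step-ahead OLS forecasts of IP growth, with and without a sentiment index. It is a stylized illustration, not the paper’s code; the series names (ip_growth, ism_production_index, sentiment_index) are hypothetical stand-ins.

```python
# Sketch: expanding-window out-of-sample forecast comparison.
# All variable names are hypothetical stand-ins, not the paper's code.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def oos_mse(df: pd.DataFrame, features: list[str],
            target: str = "ip_growth", start: int = 120) -> float:
    """Mean squared error of one-step-ahead OLS forecasts."""
    errors = []
    for t in range(start, len(df)):
        train, test = df.iloc[:t], df.iloc[t : t + 1]
        fit = sm.OLS(train[target], sm.add_constant(train[features])).fit()
        pred = fit.predict(sm.add_constant(test[features], has_constant="add"))
        errors.append(test[target].iloc[0] - pred.iloc[0])
    return float(np.mean(np.square(errors)))

# df: monthly data with next-month IP growth aligned to lagged regressors.
# mse_base = oos_mse(df, ["ism_production_index"])
# mse_text = oos_mse(df, ["ism_production_index", "sentiment_index"])
# A lower mse_text would indicate the sentiment index adds forecast value.
```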
The exercises described above rely on a manually-labeled sample of the data, both to assess the accuracy of different methods and to help fine-tune some of the deep-learning based methods. However, the panel microdata allow for a different approach. Since firms are in the survey for multiple months, we can link the text (and other) data from a given month to next month’s firm-level production data. Fitting a model to these data lets us forecast firm-level production using firm-level lagged information. This methodology has two advantages. First, it gives us a much larger training sample size as compared to the manually labeled data. Second, it aligns the training data objective very precisely with the aggregate forecasting objective. On this second point, we do our best when manually labeling data to discern whether the comment is indicative of rising or falling industrial production. But there are plenty of ambiguous cases, so there are some clear advantages to letting the data speak, and seeing what text is actually associated with future (firm-level) changes in production. We find that fine-tuning in this way is competitive with using the manual labels, and in some cases preferable.
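As a minimal sketch of how such naturally occurring labels could be constructed, consider pairing each firm’s text with its next month’s production response; the DataFrame and column names here are hypothetical.

```python
# Sketch: pairing each firm's text with its *next* month's production
# response, so the categorical outcome serves as a naturally occurring
# label for the lagged text. DataFrame and columns are hypothetical.
import pandas as pd

panel = pd.DataFrame({
    "firm_id":  [1, 1, 1, 2, 2],
    "month":    pd.to_datetime(["2019-10", "2019-11", "2019-12",
                                "2019-11", "2019-12"]),
    "text":     ["orders brisk", "demand softening", "holiday surge",
                 "supply unstable", "backlog growing"],
    "production_response": ["Higher", "Lower", "Higher", "Lower", "Higher"],
})

panel = panel.sort_values(["firm_id", "month"])
# Next month's categorical response becomes the label for this month's text.
panel["label"] = panel.groupby("firm_id")["production_response"].shift(-1)
training = panel.dropna(subset=["label"])[["text", "label"]]
print(training)
```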
Finally, we make progress on the explainability of deep learning models. These models are notoriously opaque, a consequence of their very high parameter count and extremely nonlinear architecture. This can make it difficult to trust the outputs of such models, as it is not initially clear if the seemingly good predictions are based on solid foundations. We use a standard machine learning interpretability method—Shapley decompositions—to score the contribution of each individual word in each comment. Our results point to a sensible interpretation of our deep learning models. First, the score for each word is roughly constant over time: words do not dramatically change their average connotation (though the underlying deep learning model allows for this). Second, there are fat tails to the scores: most words have scores very close to zero (neutral), with a relatively small number of words having extreme sentiment. For example, the most positive words include “brisk”, “excellent”, “booming”, “improve”, and “efficient”; among the most negative words are “unstable”, “insufficient”, “fragile”, “inconsistent”, and “questionable”. The close-to-neutral words contribute very little to aggregate sentiment, even after accounting for the fact that they occur very frequently. Finally, we find that changes in our aggregated sentiment index are largely accounted for by changes in the frequency of the words with the most extreme (positive or negative) sentiment scores, with the vast majority of words playing little role. Thus, while it may be difficult to manually construct a domain-specific dictionary from scratch, it is possible to extract a fairly simple, interpretable dictionary from the deep learning model.
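For readers who want to see the shape of such an exercise, below is a minimal sketch of word-level Shapley attribution using the open-source shap package with a Hugging Face text-classification pipeline. The checkpoint path "finetuned-ism-bert" is hypothetical; this illustrates the general technique, not the paper’s exact implementation.

```python
# Sketch: word-level Shapley attributions for a sentiment classifier.
# The checkpoint "finetuned-ism-bert" is hypothetical; any fine-tuned
# text-classification model saved locally would slot in here.
import shap
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="finetuned-ism-bert",  # hypothetical fine-tuned checkpoint
    top_k=None,                  # return scores for all sentiment classes
)

# shap masks out tokens, re-scores the pipeline, and attributes the
# change in predicted sentiment to each token in the comment.
explainer = shap.Explainer(clf)
comments = [
    "Business is booming and orders are brisk.",
    "Demand remains fragile and inconsistent.",
]
shap_values = explainer(comments)

# Token-level attributions for the first comment; averaging a word's
# attribution across many comments yields a dictionary-like score.
print(shap_values[0])
```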
Our paper contributes to two strands of literature. First, our comparison of NLP techniques for measuring sentiment adds to the growing body of literature incorporating NLP into economic and financial research. Since the seminal work of Tetlock (2007), many studies have used dictionary-based methods (Baker et al., 2016; Hassan et al., 2019; Young et al., 2021; Cowhey et al., 2022), and refined lexicons for specific contexts have been shown to improve performance in measurement and forecasting (Correa et al., 2021; Gardner et al., 2022; Sharpe et al., 2023). Machine learning techniques have also been used to select word lists (Manela and Moreira, 2017; Soto, 2021). More recent papers incorporate more sophisticated machine learning methods to extract the tense and topic of texts (Angelico et al., 2022; Hanley and Hoberg, 2019; Hansen et al., 2018; Kalamara et al., 2022). Advances in NLP, particularly the use of deep learning techniques, have significantly improved sentiment classification (Heston and Sinha, 2017; Araci, 2019; Huang et al., 2022; Bybee, 2023; Jha et al., 2024).
Second, we contribute to the literature on forecasting industrial production (D’Agostino and Schnatz, 2012; Lahiri and Monokroussos, 2013; Ardia et al., 2019; Cimadomo et al., 2022; Andreou et al., 2017). Our analysis of the relationship between sentiment and industrial production provides new insights into the role of unstructured text data in economic forecasting (Marcucci, 2024). By comparing various NLP techniques, we are able to identify which methods are most effective for classifying sentiment and incorporating them into predictive models of industrial production.
The paper most similar to ours is Shapiro et al. (2022), who find that domain-specific dictionaries can improve predictions of human-rated sentiment. We find broadly similar results using a financial stability (rather than a general purpose) dictionary to measure sentiment, but move one step further by providing a robust comparison to large language models. Our paper differs from theirs in two important ways. First, we focus on creating a sentiment index from firm-level data, rather than beginning the analysis at an aggregate macroeconomic level. Instead of measuring consumer sentiment through newspaper articles, we measure manufacturing sentiment from a panel of survey responses. Our unique micro-level data allow us to understand the value of text beyond categorical responses and naturally occurring labels. Second, Shapiro et al. (2022) compare lexicon-based sentiment approaches only to baseline BERT, which at the time was the most developed transfer-learning based model. We also consider newer deep learning models based on BERT, particularly those fine-tuned on domain-specific and naturally occurring data. We apply interpretability techniques to these ‘black box’ models and show that aggregate sentiment indexes derived from deep learning hinge on the frequencies of relatively few words.
The remainder of the paper is structured as follows. Section 2 presents our data. Section 3 reviews how we measure sentiment from the textual survey data, and Section 4 overviews the resulting indexes. Section 5 presents the empirical strategy and findings, and Section 6 evaluates the mechanisms through which firm survey responses predict industrial production. Section 7 concludes.
2 Data
The primary data for this study comes from the Institute for Supply Management (ISM). Each month, ISM conducts a survey of purchasing managers from a sample of manufacturing firms in the United States.[2] Diffusion indexes based on the responses (described below) are published very rapidly, and are closely watched by markets. As highlighted in Bok et al. (2018), not only does such survey data provide important signal about the state of the economy, but the ISM data in particular provides the “earliest available information for the national economy on any given quarter”. In addition, the ISM data have a long time series, which is conducive to time-series modeling.[3] The timeliness and relevance of the data motivates our exploration of the free-response text.

[2] ISM also surveys non-manufacturing firms and hospitals separately.

[3] ISM series extend back to 1948, but most statistical analyses use data that starts in 1972.

The ISM survey includes a series of questions about the respondents’ operations, including their production levels, new orders, backlog, employment, supplier delivery times, input inventories, exports, and imports. These questions have a categorical response, where the purchasing managers specify whether these metrics have increased, decreased, or stayed the same between last month and the current month. The categorical responses are aggregated into publicly-released diffusion indexes, discussed more below. In addition to the categorical response, purchasing managers can provide further explanation in accompanying text boxes. There are free response questions accompanying nearly every categorical question, asking for the reason for the response. In addition there is a “General Remarks” field at the beginning, where the respondent can put any general remarks they wish. Ten to twelve of these text responses are featured in the ISM’s data release to provide context for the diffusion indexes, but otherwise are not released publicly.
The ISM manufacturing survey dates back to the 1930s. The dataset we analyze covers firm-month observations from November 2001 to January 2020. Most recently, the sample covers roughly 350 responses per month. The dark-shaded area of Figure 1 shows the percentage of firms in the sample with text responses over time. The figure illustrates that the majority of respondents provide text in addition to their quantitative survey answers. The black line in Figure 1 presents the average word count over the sample period. The word counts range from 10 to 33 words on average per month. The mean word count appears to fluctuate over the business cycle and jumps dramatically in 2018. The sudden increase in word count in 2018 is mostly due to heightened tensions surrounding trade policy at the time. Indeed, after removing responses that contain the word “tariff,” we observe a smoother increase in word counts (see Figure A1 in the appendix for further details).
Table 1 provides a summary of the text responses. Nearly 49 percent of the general remarks sections contain text, while the next most common sections containing text are those related to employment, production, and new orders. The last row shows statistics for all the text fields concatenated together: 69 percent of firm-month observations have any text at all, and the text is about 17 words long on average. The average word count is highest for the General Remarks section, with an average of 8 words used in these responses. When considering only those responses that contain text, the average word count for the General Remarks section increases to 16 words.
Turning from ISM’s survey microdata, we use several time series in our forecasting exercises. Our focus is on forecasting the manufacturing industrial production (IP) index. We use real-time data on the right-hand side, reflecting what policymakers knew at the time, and forecast the fully revised series. In addition to the IP series, we use the ISM diffusion indexes as regressors. The diffusion indexes are aggregations of the categorical response questions in the survey. For example, the production diffusion index is a weighted average of the responses to the production question (paraphrasing, “Is production higher/the same/lower than last month?”), with the “Higher” responses getting weight 100, “Same” responses getting weight 50, and “Lower” responses getting weight 0. The formula for the diffusion index in period t, with N_t total firms responding, is shown in equation (1):

D_t = (1/N_t) Σ_{i=1}^{N_t} [ 100 · 1{Response_i is “Higher”} + 50 · 1{Response_i is “Same”} ]    (1)

These diffusion indexes have values between 0 and 100, with 0 indicating that all respondents say things are worse and 100 indicating that all respondents say things are better.[4] ISM publishes indexes for each question, as well as a “PMI Composite”, which is an equally-weighted average of the diffusion indexes for new orders, production, employment, supplier deliveries, and inventories.

[4] The responses are “better”, “same”, or “worse” for the new orders question, production, and new export orders. For employment, inventories, prices, and imports the responses are “higher”, “same”, and “l(fā)ower”. For backlogs the choices are “greater”, “same”, and “l(fā)ess”.
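A minimal sketch of equation (1) in code, assuming a pandas DataFrame of categorical responses (the column names are hypothetical):

```python
# Sketch: computing a diffusion index from categorical survey responses,
# following equation (1). Column names are hypothetical.
import pandas as pd

def diffusion_index(responses: pd.Series) -> float:
    """Equation (1): mean of 100 for 'Higher', 50 for 'Same', 0 for 'Lower'."""
    weights = responses.map({"Higher": 100, "Same": 50, "Lower": 0})
    return float(weights.mean())

df = pd.DataFrame({
    "month": ["2019-12"] * 4,
    "production_response": ["Higher", "Same", "Same", "Lower"],
})
# One index value per month; here (100 + 50 + 50 + 0) / 4 = 50.0
print(df.groupby("month")["production_response"].apply(diffusion_index))
```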
3 Measuring Sentiment
Our goal is to extract useful information from the ISM survey text responses. We focus on sentiment analysis: measuring the extent to which the purchasing manager’s response is positive or negative. Even focusing on sentiment analysis, the wide range of NLP techniques available can make it challenging to choose an appropriate method. In this section we discuss the methods we use, leaving a complete description of the approaches to the Appendix.
3.1 Dictionaries
One of the simplest methods for measuring sentiment is dictionary-based analysis, which involves counting the frequency of a predetermined list of sentiment words in the text. We use common sentiment dictionaries such as the Harvard (Tetlock, 2007) and AFINN (Nielsen, 2011) word lists. However, we also recognize that certain words that may be considered negative in other contexts may not be considered negative in the context of finance, such as “taxing” or “l(fā)iability”. As such, we also apply finance-specific word lists, including the sentiment word list from Loughran and McDonald (2011) (henceforth, “LM”) and the financial stability word list from Correa et al. (2021). For all dictionaries, we score comments on a scale of -1 to +1, using the percent of total words in the comment that are positive less the percent of total words that are negative. When we require discrete classifications, as in Figure 2, we classify the comment as positive if the score is greater than zero, negative if it is less than zero, and neutral if it equals zero.
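The scoring rule is simple enough to state in a few lines. The sketch below uses tiny illustrative word lists (drawn from the extreme words reported in the introduction) in place of full dictionaries like LM or AFINN:

```python
# Sketch of the dictionary score described above: percent positive words
# minus percent negative words, scaled to [-1, +1]. The tiny word lists
# are illustrative stand-ins for full dictionaries like LM or AFINN.
POSITIVE = {"brisk", "excellent", "booming", "improve", "efficient"}
NEGATIVE = {"unstable", "insufficient", "fragile", "inconsistent", "questionable"}

def dictionary_score(comment: str) -> float:
    words = comment.lower().split()
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

def classify(score: float) -> str:
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# One positive and one negative hit cancel out: the comment is "neutral".
print(classify(dictionary_score("orders are brisk but supply is unstable")))
```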
3.2 Deep Learning Models
Another approach to sentiment analysis involves fitting a model to the data. We try several variations on this theme. Unlike the dictionary methods, all of these approaches require labeled data: a sample of observations that have already been classified, which is used to fit the model and classify the remaining observations.
We create a labeled dataset from a randomly selected subsample of 1,000 responses with text from the individual questions.[5] Each response was classified for sentiment by two economists using the following question as a guide: “Is this comment consistent with manufacturing IP rising month over month?” The classifications were either positive, neutral, or negative, where “neutral” includes cases where it is impossible to determine the sentiment. Both economists agreed on the sentiment classification for roughly 700 cases. This subsample is further split into a “training” dataset, used to fit the models, and a “test” dataset, used to assess the relative merits of the models.[6]

[5] Note that the categorical responses can be considered a kind of label for the corresponding text. In Section 4.1 we investigate how well models can predict the categorical response from the associated text.

[6] The test data consists of observations from 2018m1 to 2020m1 and is not used by any of the models during training.
Deep learning models have gained popularity in recent years, driven by their impressive performance on language-related tasks. Much of the progress has occurred within a particular class of deep learning models called transformers (see, e.g., Devlin et al., 2018, Radford et al., 2018, Chung et al., 2022, Ouyang et al., 2022, and Touvron et al., 2023). The defining feature of transformers—relative to other neural network architectures—is a mechanism called attention: a way for words within a sentence to interact, allowing the context of a particular word to influence its meaning. A full explanation of transformers and the attention mechanism is beyond the scope of this paper, but we do provide a brief summary in the Appendix. The important points are that (unlike dictionaries and bag-of-words approaches) transformers take into account interactions between words, word order, and context-dependent meanings (polysemy).
One notable transformer model is “BERT”, or Bidirectional Encoder Representations from Transformers, developed by Devlin et al. (2018). It is important to note that BERT is a pre-trained model: Devlin et al. (2018) specified the architecture and then trained the model on a corpus including the entirety of (English) Wikipedia and a number of books. The model is large by the standards of the economics literature, wit
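To illustrate how a pre-trained model of this kind can be applied to short survey comments, here is a minimal sketch using the Hugging Face transformers library; the checkpoint is a generic public sentiment model, not the fine-tuned BERT variants used in the paper.

```python
# Sketch: classifying comment sentiment with a pre-trained transformer.
# The checkpoint below is a generic public sentiment model, used here
# only as a stand-in for the paper's fine-tuned BERT variants.
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

comments = [
    "New orders are booming and production is brisk.",
    "Demand remains fragile; backlogs are insufficient to sustain output.",
]
for comment, pred in zip(comments, clf(comments)):
    # Each prediction is a dict like {"label": "POSITIVE", "score": 0.99}.
    print(f'{pred["label"]:>8} ({pred["score"]:.2f})  {comment}')
```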