Finance and Economics Discussion Series
Federal Reserve Board, Washington, D.C.
ISSN 1936-2854 (Print)
ISSN 2767-3898 (Online)

Manufacturing Sentiment: Forecasting Industrial Production with Text Analysis

Tomaz Cajner, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, Betsy Vrankovich

2024-026

Please cite this paper as:
Cajner, Tomaz, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, and Betsy Vrankovich (2024). “Manufacturing Sentiment: Forecasting Industrial Production with Text Analysis,” Finance and Economics Discussion Series 2024-026. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2024.026.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Manufacturing Sentiment:
Forecasting Industrial Production with Text Analysis*

Tomaz Cajner, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, Betsy Vrankovich

April 2024

Abstract
This paper examines the link between industrial production and the sentiment expressed in natural language survey responses from U.S. manufacturing firms. We compare several natural language processing (NLP) techniques for classifying sentiment, ranging from dictionary-based methods to modern deep learning methods. Using a manually labeled sample as ground truth, we find that deep learning models—partially trained on a human-labeled sample of our data—outperform other methods for classifying the sentiment of survey responses. Further, we capitalize on the panel nature of the data to train models which predict firm-level production using lagged firm-level text. This allows us to leverage a large sample of “naturally occurring” labels with no manual input. We then assess the extent to which each sentiment measure, aggregated to monthly time series, can serve as a useful statistical indicator and forecast industrial production. Our results suggest that the text responses provide information beyond the available numerical data from the same survey and improve out-of-sample forecasting; deep learning methods and the use of naturally occurring labels seem especially useful for forecasting. We also explore what drives the predictions made by the deep learning models, and find that a relatively small number of words—associated with very positive/negative sentiment—account for much of the variation in the aggregate sentiment index.
JEL codes: C1, E17, O14
Keywords: Industrial Production, Natural Language Processing, Machine Learning, Forecasting
*All authors are at the Federal Reserve Board of Governors. We thank the Institute for Supply Management, including Kristina Cahill, Tom Derry, Debbie Fogel-Monnissen, Rose Marie Goupil, Paul Lee, Susan Marty, and Denis Wolowiecki, for access to and help with the manufacturing survey data that underlie the work described by this paper. We are thankful for comments and suggestions from Stephen Hansen, Andreas Joseph, Juri Marcucci, Arthur Turrell, and participants at the Society for Government Economists Annual Conference, the ESCoE Conference on Economic Measurement, the Government Advances in Statistical Programming Conference, the Society for Economic Measurement Conference, and the Nontraditional Data, Machine Learning, and Natural Language Processing in Macroeconomics Conference. The analysis and conclusions set forth here are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors.
1 Introduction
In recent years there has been an explosion of interest in natural language processing (NLP) within finance and macroeconomics. The use of text data to forecast and assist in model estimation is becoming increasingly commonplace. Still, there are many open questions around the use of NLP in empirical work. For example, which of the numerous available methods work best, and work best in specific contexts? Are off-the-shelf tools appropriate, or are there greater returns to specializing models to the data at hand? How useful is text for forecasting real output indicators, such as manufacturing output? What explains the predictions made by complicated NLP models? This paper addresses these questions, using a novel dataset and a variety of NLP methods ranging from traditional dictionaries to fine-tuned transformer neural networks.
Our primary data source is the monthly survey microdata underlying the Institute for Supply Management’s (ISM) Manufacturing Report on Business. The survey is taken by purchasing managers at a representative sample of U.S. manufacturing firms. Part of the survey consists of categorical-response questions about aspects of their current operations, including production, inventories, backlogs, employment, and new orders. The answers to these questions are of the form “worse/the same/better than last month”, and are aggregated into the widely-reported ISM diffusion indexes. But the survey also includes free-response text boxes, where purchasing managers can provide further comments either in general or about specific aspects of their businesses; these comments are a novel source of signal about the economy and our focus in this paper.[1]

[1] While ISM collects these responses through the survey, this text is confidential and not incorporated into the publicized indexes. A sample of responses are published in the monthly ISM Report on Business (see https://www.ismworld.org/supply-management-news-and-reports/reports/ism-report-on-business/).

Our first step is to quantify the text into an economically important and interpretable measure. We focus on sentiment, given that waves of optimism and pessimism have historically been linked to business cycle fluctuations (Keynes, 1937). We begin by evaluating various NLP methods in terms of their ability to correctly classify the sentiment expressed in individual comments. Our context is fairly specific: the data are manufacturing-sector purchasing managers opining about the business outlook for their firm, without much discussion of financial conditions. While there are numerous sentiment classification models available, many were developed with other data in mind, such as social media posts (Nielsen, 2011). Even within economics and finance, most work has focused on finance-related language (Araci, 2019; Correa et al., 2021; Huang et al., 2022). The lack of results for manufacturing-specific datasets motivates our assessment of a variety of NLP techniques.
One common approach is to count the frequency of words within a sentiment dictionary. Economists initially used positive and negative words from the psychology literature, but have since moved on to using domain-specific words (e.g., Correa et al., 2021) and using simple word counts to measure other types of tone, such as uncertainty (see Baker et al., 2016 and Gentzkow et al., 2019). While this method is transparent, it may fail to capture negation and synonyms, and it often requires context-specific dictionaries that may not be available. More recently developed techniques employ deep learning methods that account for the nuances of language. We focus on variants of BERT (see Devlin et al., 2018), a precursor of popular large language models like ChatGPT. These models are pre-trained: the parameters are set by exposing the model to a large corpus of text—such as the entirety of Wikipedia—and attempting to predict missing words or the relationship between sentences. The pre-trained models can be used to classify sentiment directly, or they can be further trained (“fine-tuned”) on a specific dataset. The latter approach attempts to get the best of both worlds: a solid ability to parse language from the exposure to a large quantity of training data, plus the context-specific nuance from the fine-tuning data. While deep learning gets enormous attention, it is ex-ante unclear whether it should outperform carefully curated dictionaries in our context.
Comparing the accuracy of these different methods on a sample of hand-coded comments from our dataset, we find that deep learning does have an advantage on our data, in part because the brevity of the comments means that many comments have no overlap with dictionary terms. In addition, we find that there is value in specializing the models to our data: the models fine-tuned on our data have the highest sentiment classification accuracy on a hold-out sample. These results point to the advantages of using pre-trained models, as well as carefully specializing them to the task at hand. Our hope is that these results help guide other economists when deciding between NLP approaches.
The sentiment measures based on free-form textual responses in the ISM data aggregate into indexes that closely mirror both the diffusion index based on the responses to the categorical survey and aggregate manufacturing output, as measured by the manufacturing component of industrial production. We further investigate the relationship between the average sentiment expressed by purchasing managers and manufacturing output econometrically. Our baseline forecasting model asks whether sentiment can help forecast manufacturing output and includes—among other controls—some of the ISM diffusion indexes, so the test is whether the sentiment indexes have additional information beyond the ISM categorical responses data. We find that most dictionary-based text variables do not help predict manufacturing output, with the exception of a curated financial stability-specific dictionary. On the other hand, sentiment variables from the deep learning models are predictive of future manufacturing output. Out-of-sample forecasting exercises show that the financial stability dictionary and deep learning techniques significantly reduce the mean squared forecast errors as well. Overall, our results suggest that purchasing managers’ survey responses contain useful forward-looking information, and that sentiment-based measures can improve the accuracy of forecasts of manufacturing output.
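To fix ideas, the snippet below sketches the kind of out-of-sample comparison just described: expanding-window one-step-ahead OLS forecasts of IP growth, with and without a sentiment index. It is a stylized illustration, not the paper’s code; the series names (ip_growth, ism_production_index, sentiment_index) are hypothetical stand-ins.

```python
# Sketch: expanding-window out-of-sample forecast comparison.
# All variable names are hypothetical stand-ins, not the paper's code.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def oos_mse(df: pd.DataFrame, features: list[str],
            target: str = "ip_growth", start: int = 120) -> float:
    """Mean squared error of one-step-ahead OLS forecasts."""
    errors = []
    for t in range(start, len(df)):
        train, test = df.iloc[:t], df.iloc[t : t + 1]
        fit = sm.OLS(train[target], sm.add_constant(train[features])).fit()
        pred = fit.predict(sm.add_constant(test[features], has_constant="add"))
        errors.append(test[target].iloc[0] - pred.iloc[0])
    return float(np.mean(np.square(errors)))

# df: monthly data with next-month IP growth aligned to lagged regressors.
# mse_base = oos_mse(df, ["ism_production_index"])
# mse_text = oos_mse(df, ["ism_production_index", "sentiment_index"])
# A lower mse_text would indicate the sentiment index adds forecast value.
```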
The exercises described above rely on a manually-labeled sample of the data, both to assess the accuracy of different methods and to help fine-tune some of the deep-learning based methods. However, the panel microdata allow for a different approach. Since firms are in the survey for multiple months, we can link the text (and other) data from a given month to next month’s firm-level production data. Fitting a model to these data lets us forecast firm-level production using firm-level lagged information. This methodology has two advantages. First, it gives us a much larger training sample size as compared to the manually labeled data. Second, it aligns the training data objective very precisely with the aggregate forecasting objective. On this second point, we do our best when manually labeling data to discern whether the comment is indicative of rising or falling industrial production. But there are plenty of ambiguous cases, so there are some clear advantages to letting the data speak, and seeing what text is actually associated with future (firm-level) changes in production. We find that fine-tuning in this way is competitive with using the manual labels, and in some cases preferable.
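As a minimal sketch of how such naturally occurring labels could be constructed, consider pairing each firm’s text with its next month’s production response; the DataFrame and column names here are hypothetical.

```python
# Sketch: pairing each firm's text with its *next* month's production
# response, so the categorical outcome serves as a naturally occurring
# label for the lagged text. DataFrame and columns are hypothetical.
import pandas as pd

panel = pd.DataFrame({
    "firm_id":  [1, 1, 1, 2, 2],
    "month":    pd.to_datetime(["2019-10", "2019-11", "2019-12",
                                "2019-11", "2019-12"]),
    "text":     ["orders brisk", "demand softening", "holiday surge",
                 "supply unstable", "backlog growing"],
    "production_response": ["Higher", "Lower", "Higher", "Lower", "Higher"],
})

panel = panel.sort_values(["firm_id", "month"])
# Next month's categorical response becomes the label for this month's text.
panel["label"] = panel.groupby("firm_id")["production_response"].shift(-1)
training = panel.dropna(subset=["label"])[["text", "label"]]
print(training)
```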
Finally, we make progress on the explainability of deep learning models. These models are notoriously opaque, a consequence of their very high parameter count and extremely nonlinear architecture. This can make it difficult to trust the outputs of such models, as it is not initially clear if the seemingly good predictions are based on solid foundations. We use a standard machine learning interpretability method—Shapley decompositions—to score the contribution of each individual word in each comment. Our results point to a sensible interpretation of our deep learning models. First, the score for each word is roughly constant over time: words do not dramatically change their average connotation (though the underlying deep learning model allows for this). Second, there are fat tails to the scores: most words have scores very close to zero (neutral), with a relatively small number of words having extreme sentiment. For example, the most positive words include “brisk”, “excellent”, “booming”, “improve”, and “efficient”; among the most negative words are “unstable”, “insufficient”, “fragile”, “inconsistent”, and “questionable”. The close-to-neutral words contribute very little to aggregate sentiment, even after accounting for the fact that they occur very frequently. Finally, we find that changes in our aggregated sentiment index are largely accounted for by changes in the frequency of the words with the most extreme (positive or negative) sentiment scores, with the vast majority of words playing little role. Thus, while it may be difficult to manually construct a domain-specific dictionary from scratch, it is possible to extract a fairly simple, interpretable dictionary from the deep learning model.
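For readers who want to see the shape of such an exercise, below is a minimal sketch of word-level Shapley attribution using the open-source shap package with a Hugging Face text-classification pipeline. The checkpoint path "finetuned-ism-bert" is hypothetical; this illustrates the general technique, not the paper’s exact implementation.

```python
# Sketch: word-level Shapley attributions for a sentiment classifier.
# The checkpoint "finetuned-ism-bert" is hypothetical; any fine-tuned
# text-classification model saved locally would slot in here.
import shap
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="finetuned-ism-bert",  # hypothetical fine-tuned checkpoint
    top_k=None,                  # return scores for all sentiment classes
)

# shap masks out tokens, re-scores the pipeline, and attributes the
# change in predicted sentiment to each token in the comment.
explainer = shap.Explainer(clf)
comments = [
    "Business is booming and orders are brisk.",
    "Demand remains fragile and inconsistent.",
]
shap_values = explainer(comments)

# Token-level attributions for the first comment; averaging a word's
# attribution across many comments yields a dictionary-like score.
print(shap_values[0])
```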
Our paper contributes to two strands of literature. First, our comparison of NLP techniques for measuring sentiment adds to the growing body of literature incorporating NLP into economic and financial research. Since the seminal work of Tetlock (2007), many studies have used dictionary-based methods (Baker et al., 2016; Hassan et al., 2019; Young et al., 2021; Cowhey et al., 2022), and refined lexicons for specific contexts have been shown to improve performance in measurement and forecasting (Correa et al., 2021; Gardner et al., 2022; Sharpe et al., 2023). Machine learning techniques have also been used to select word lists (Manela and Moreira, 2017; Soto, 2021). More recent papers incorporate more sophisticated machine learning methods to extract the tense and topic of texts (Angelico et al., 2022; Hanley and Hoberg, 2019; Hansen et al., 2018; Kalamara et al., 2022). Advances in NLP, particularly the use of deep learning techniques, have significantly improved sentiment classification (Heston and Sinha, 2017; Araci, 2019; Huang et al., 2022; Bybee, 2023; Jha et al., 2024).
Second, we contribute to the literature on forecasting industrial production (D’Agostino and Schnatz, 2012; Lahiri and Monokroussos, 2013; Ardia et al., 2019; Cimadomo et al., 2022; Andreou et al., 2017). Our analysis of the relationship between sentiment and industrial production provides new insights into the role of unstructured text data in economic forecasting (Marcucci, 2024). By comparing various NLP techniques, we are able to identify which methods are most effective for classifying sentiment and incorporating them into predictive models of industrial production.
The paper most similar to ours is Shapiro et al. (2022), who find that domain-specific dictionaries can improve predictions of human-rated sentiment. We find broadly similar results using a financial stability (rather than a general purpose) dictionary to measure sentiment, but move one step further by providing a robust comparison to large language models. Our paper differs from theirs in two important ways. First, we focus on creating a sentiment index from firm-level data, rather than beginning the analysis at an aggregate macroeconomic level. Instead of measuring consumer sentiment through newspaper articles, we measure manufacturing sentiment from a panel of survey responses. Our unique micro-level data allow us to understand the value of text beyond categorical responses and naturally occurring labels. Second, Shapiro et al. (2022) compare lexicon-based sentiment approaches only to baseline BERT, which at the time was the most developed transfer-learning based model. We also consider newer deep learning models based on BERT, particularly those fine-tuned on domain-specific and naturally occurring data. We apply interpretability techniques to these ‘black box’ models and show that aggregate sentiment indexes derived from deep learning hinge on the frequencies of relatively few words.
The remainder of the paper is structured as follows. Section 2 presents our data. Section 3 reviews how we measure sentiment from the textual survey data, and Section 4 overviews the resulting indexes. Section 5 presents the empirical strategy and findings, and Section 6 evaluates the mechanisms through which firm survey responses predict industrial production. Section 7 concludes.
2 Data
The primary data for this study comes from the Institute for Supply Management (ISM). Each month, ISM conducts a survey of purchasing managers from a sample of manufacturing firms in the United States.[2] Diffusion indexes based on the responses (described below) are published very rapidly, and are closely watched by markets. As highlighted in Bok et al. (2018), not only does such survey data provide important signal about the state of the economy, but the ISM data in particular provides the “earliest available information for the national economy on any given quarter”. In addition, the ISM data have a long time series, which is conducive to time-series modeling.[3] The timeliness and relevance of the data motivates our exploration of the free-response text.

[2] ISM also surveys non-manufacturing firms and hospitals separately.

[3] ISM series extend back to 1948, but most statistical analyses use data that starts in 1972.

The ISM survey includes a series of questions about the respondents’ operations, including their production levels, new orders, backlog, employment, supplier delivery times, input inventories, exports, and imports. These questions have a categorical response, where the purchasing managers specify whether these metrics have increased, decreased, or stayed the same between last month and the current month. The categorical responses are aggregated into publicly-released diffusion indexes, discussed more below. In addition to the categorical response, purchasing managers can provide further explanation in accompanying text boxes. There are free response questions accompanying nearly every categorical question, asking for the reason for the response. In addition there is a “General Remarks” field at the beginning, where the respondent can put any general remarks they wish. Ten to twelve of these text responses are featured in the ISM’s data release to provide context for the diffusion indexes, but otherwise are not released publicly.
The ISM manufacturing survey dates back to the 1930s. The dataset we analyze covers firm-month observations from November 2001 to January 2020. Most recently, the sample covers roughly 350 responses per month. The dark-shaded area of Figure 1 shows the percentage of firms in the sample with text responses over time. The figure illustrates that the majority of respondents provide text in addition to their quantitative survey answers. The black line in Figure 1 presents the average word count over the sample period. The word counts range from 10 to 33 words on average per month. The mean word count appears to fluctuate over the business cycle and jumps dramatically in 2018. The sudden increase in word count in 2018 is mostly due to heightened tensions surrounding trade policy at the time. Indeed, after removing responses that contain the word “tariff,” we observe a smoother increase in word counts (see Figure A1 in the appendix for further details).
Table 1 provides a summary of the text responses. Nearly 49 percent of the general remarks sections contain text, while the next most common sections containing text are those related to employment, production, and new orders. The last row shows statistics for all the text fields concatenated together: 69 percent of firm-month observations have any text at all, and the text is about 17 words long on average. The average word count is highest for the General Remarks section, with an average of 8 words used in these responses. When considering only those responses that contain text, the average word count for the General Remarks section increases to 16 words.
Turning from ISM’s survey microdata, we use several time series in our forecasting exercises. Our focus is on forecasting the manufacturing industrial production (IP) index. We use real-time data on the right-hand side, reflecting what policymakers knew at the time, and forecast the fully revised series. In addition to the IP series, we use the ISM diffusion indexes as regressors. The diffusion indexes are aggregations of the categorical response questions in the survey. For example, the production diffusion index is a weighted average of the responses to the production question (paraphrasing, “Is production higher/the same/lower than last month?”), with the “Higher” responses getting weight 100, “Same” responses getting weight 50, and “Lower” responses getting weight 0. The formula for the diffusion index in period t, with N_t total firms responding, is shown in equation (1):

D_t = (1/N_t) Σ_{i=1}^{N_t} [ 100 · 1{Response_i is “Higher”} + 50 · 1{Response_i is “Same”} ]    (1)

These diffusion indexes have values between 0 and 100, with 0 indicating that all respondents say things are worse and 100 indicating that all respondents say things are better.[4] ISM publishes indexes for each question, as well as a “PMI Composite”, which is an equally-weighted average of the diffusion indexes for new orders, production, employment, supplier deliveries, and inventories.

[4] The responses are “better”, “same”, or “worse” for the new orders question, production, and new export orders. For employment, inventories, prices, and imports the responses are “higher”, “same”, and “l(fā)ower”. For backlogs the choices are “greater”, “same”, and “l(fā)ess”.
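A minimal sketch of equation (1) in code, assuming a pandas DataFrame of categorical responses (the column names are hypothetical):

```python
# Sketch: computing a diffusion index from categorical survey responses,
# following equation (1). Column names are hypothetical.
import pandas as pd

def diffusion_index(responses: pd.Series) -> float:
    """Equation (1): mean of 100 for 'Higher', 50 for 'Same', 0 for 'Lower'."""
    weights = responses.map({"Higher": 100, "Same": 50, "Lower": 0})
    return float(weights.mean())

df = pd.DataFrame({
    "month": ["2019-12"] * 4,
    "production_response": ["Higher", "Same", "Same", "Lower"],
})
# One index value per month; here (100 + 50 + 50 + 0) / 4 = 50.0
print(df.groupby("month")["production_response"].apply(diffusion_index))
```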
3 Measuring Sentiment
Our goal is to extract useful information from the ISM survey text responses. We focus on sentiment analysis: measuring the extent to which the purchasing manager’s response is positive or negative. Even focusing on sentiment analysis, the wide range of NLP techniques available can make it challenging to choose an appropriate method. In this section we discuss the methods we use, leaving a complete description of the approaches to the Appendix.
3.1 Dictionaries
One of the simplest methods for measuring sentiment is dictionary-based analysis, which involves counting the frequency of a predetermined list of sentiment words in the text. We use common sentiment dictionaries such as the Harvard (Tetlock, 2007) and AFINN (Nielsen, 2011) word lists. However, we also recognize that certain words that may be considered negative in other contexts may not be considered negative in the context of finance, such as “taxing” or “l(fā)iability”. As such, we also apply finance-specific word lists, including the sentiment word list from Loughran and McDonald (2011) (henceforth, “LM”) and the financial stability word list from Correa et al. (2021). For all dictionaries, we score comments on a scale of -1 to +1, using the percent of total words in the comment that are positive less the percent of total words that are negative. When we require discrete classifications, as in Figure 2, we classify the comment as positive if the score is greater than zero, negative if it is less than zero, and neutral if it equals zero.
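The scoring rule is simple enough to state in a few lines. The sketch below uses tiny illustrative word lists (drawn from the extreme words reported in the introduction) in place of full dictionaries like LM or AFINN:

```python
# Sketch of the dictionary score described above: percent positive words
# minus percent negative words, scaled to [-1, +1]. The tiny word lists
# are illustrative stand-ins for full dictionaries like LM or AFINN.
POSITIVE = {"brisk", "excellent", "booming", "improve", "efficient"}
NEGATIVE = {"unstable", "insufficient", "fragile", "inconsistent", "questionable"}

def dictionary_score(comment: str) -> float:
    words = comment.lower().split()
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

def classify(score: float) -> str:
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# One positive and one negative hit cancel out: the comment is "neutral".
print(classify(dictionary_score("orders are brisk but supply is unstable")))
```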
3.2 Deep Learning Models
Another approach to sentiment analysis involves fitting a model to the data. We try several variations on this theme. Unlike the dictionary methods, all of these approaches require labeled data: a sample of observations that have already been classified, which is used to fit the model and classify the remaining observations.
We create a labeled dataset from a randomly selected subsample of 1,000 responses with text from the individual questions.[5] Each response was classified for sentiment by two economists using the following question as a guide: “Is this comment consistent with manufacturing IP rising month over month?” The classifications were either positive, neutral, or negative, where “neutral” includes cases where it is impossible to determine the sentiment. Both economists agreed on the sentiment classification for roughly 700 cases. This subsample is further split into a “training” dataset, used to fit the models, and a “test” dataset, used to assess the relative merits of the models.[6]

[5] Note that the categorical responses can be considered a kind of label for the corresponding text. In Section 4.1 we investigate how well models can predict the categorical response from the associated text.

[6] The test data consists of observations from 2018m1 to 2020m1 and is not used by any of the models during training.
Deep learning models have gained popularity in recent years, driven by their impressive performance on language-related tasks. Much of the progress has occurred within a particular class of deep learning models called transformers (see, e.g., Devlin et al., 2018, Radford et al., 2018, Chung et al., 2022, Ouyang et al., 2022, and Touvron et al., 2023). The defining feature of transformers—relative to other neural network architectures—is a mechanism called attention: a way for words within a sentence to interact, allowing the context of a particular word to influence its meaning. A full explanation of transformers and the attention mechanism is beyond the scope of this paper, but we do provide a brief summary in the Appendix. The important points are that (unlike dictionaries and bag-of-words approaches) transformers take into account interactions between words, word order, and context-dependent meanings (polysemy).
One notable transformer model is “BERT”, or Bidirectional Encoder Representations from Transformers, developed by Devlin et al. (2018). It is important to note that BERT is a pre-trained model: Devlin et al. (2018) specified the architecture and then trained the model on a corpus including the entirety of (English) Wikipedia and a number of books. The model is large by the standards of the economics literature, wit
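To illustrate how a pre-trained model of this kind can be applied to short survey comments, here is a minimal sketch using the Hugging Face transformers library; the checkpoint is a generic public sentiment model, not the fine-tuned BERT variants used in the paper.

```python
# Sketch: classifying comment sentiment with a pre-trained transformer.
# The checkpoint below is a generic public sentiment model, used here
# only as a stand-in for the paper's fine-tuned BERT variants.
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

comments = [
    "New orders are booming and production is brisk.",
    "Demand remains fragile; backlogs are insufficient to sustain output.",
]
for comment, pred in zip(comments, clf(comments)):
    # Each prediction is a dict like {"label": "POSITIVE", "score": 0.99}.
    print(f'{pred["label"]:>8} ({pred["score"]:.2f})  {comment}')
```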