版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進行舉報或認領(lǐng)
文檔簡介
ExecutiveSummary
Recentdevelopmentshaveimprovedtheabilityoflargelanguagemodels(LLMs)andotherAIsystemstogeneratecomputercode.Whilethisispromisingforthefieldof
softwaredevelopment,thesemodelscanalsoposedirectandindirectcybersecurity
risks.Inthispaper,weidentifythreebroadcategoriesofriskassociatedwithAIcodegenerationmodels:1)modelsgeneratinginsecurecode,2)modelsthemselvesbeingvulnerabletoattackandmanipulation,and3)downstreamcybersecurityimpactssuchasfeedbackloopsintrainingfutureAIsystems.
Existingresearchhasshownthat,underexperimentalconditions,AIcodegenerationmodelsfrequentlyoutputinsecurecode.However,theprocessofevaluatingthe
securityofAI-generatedcodeishighlycomplexandcontainsmanyinterdependentvariables.TofurtherexploretheriskofinsecureAI-writtencode,weevaluated
generatedcodefromfiveLLMs.Eachmodelwasgiventhesamesetofprompts,whichweredesignedtotestlikelyscenarioswherebuggyorinsecurecodemightbe
produced.Ourevaluationresultsshowthatalmosthalfofthecodesnippetsproducedbythesefivedifferentmodelscontainbugsthatareoftenimpactfulandcould
potentiallyleadtomaliciousexploitation.Theseresultsarelimitedtothenarrowscopeofourevaluation,butwehopetheycancontributetothelargerbodyofresearch
surroundingtheimpactsofAIcodegenerationmodels.
Givenbothcodegenerationmodels’currentutilityandthelikelihoodthattheircapabilitieswillcontinuetoimprove,itisimportanttomanagetheirpolicyandcybersecurityimplications.Keyfindingsincludethebelow.
●IndustryadoptionofAIcodegenerationmodelsmayposeriskstosoftware
supplychainsecurity.However,theseriskswillnotbeevenlydistributedacrossorganizations.Larger,morewell-resourcedorganizationswillhaveanadvantageoverorganizationsthatfacecostandworkforceconstraints.
●Multiplestakeholdershaverolestoplayinhelpingtomitigatepotentialsecurity
risksrelatedtoAI-generatedcode.TheburdenofensuringthatAI-generated
codeoutputsaresecureshouldnotrestsolelyonindividualusers,butalsoonAIdevelopers,organizationsproducingcodeatscale,andthosewhocanimprove
securityatlarge,suchaspolicymakingbodiesorindustryleaders.Existing
guidancesuchassecuresoftwaredevelopmentpracticesandtheNIST
CybersecurityFrameworkremainsessentialtoensurethatallcode,regardlessofauthorship,isevaluatedforsecuritybeforeitentersproduction.Other
cybersecurityguidance,suchassecure-by-designprinciples,canbeexpandedto
CenterforSecurityandEmergingTechnology|1
includecodegenerationmodelsandotherAIsystemsthatimpactsoftwaresupplychainsecurity.
●Codegenerationmodelsalsoneedtobeevaluatedforsecurity,butitiscurrentlydifficulttodoso.Evaluationbenchmarksforcodegenerationmodelsoftenfocusonthemodels’abilitytoproducefunctionalcodebutdonotassesstheirabilitytogeneratesecurecode,whichmayincentivizeadeprioritizationofsecurityover
functionalityduringmodeltraining.Thereisinadequatetransparencyaround
models’trainingdata—orunderstandingoftheirinternalworkings—toexplorequestionssuchaswhetherbetterperformingmodelsproducemoreinsecure
code.
CenterforSecurityandEmergingTechnology|2
TableofContents
ExecutiveSummary 1
Introduction 4
Background 5
WhatAreCodeGenerationModels? 5
IncreasingIndustryAdoptionofAICodeGenerationTools 7
RisksAssociatedwithAICodeGeneration 9
CodeGenerationModelsProduceInsecureCode 9
Models’VulnerabilitytoAttack 11
DownstreamImpacts 13
ChallengesinAssessingtheSecurityofCodeGenerationModels 15
IsAIGeneratedCodeInsecure? 18
Methodology 18
EvaluationResults 22
UnsuccessfulVerificationRates 22
VariationAcrossModels 24
SeverityofGeneratedBugs 25
Limitations 26
PolicyImplicationsandFurtherResearch 28
Conclusion 32
Authors 33
Acknowledgments 33
AppendixA:Methodology 34
AppendixB:EvaluationResults 34
Endnotes 35
CenterforSecurityandEmergingTechnology|3
Introduction
AdvancementsinartificialintelligencehaveresultedinaleapintheabilityofAI
systemstogeneratefunctionalcomputercode.Whileimprovementsinlargelanguage
modelshavedrivenagreatdealofrecentinterestandinvestmentinAI,codegenerationhasbeenaviableusecaseforAIsystemsforthelastseveralyears.
SpecializedAIcodingmodels,suchascodeinfillingmodelswhichfunctionsimilarlyto“autocompleteforcode,”and“general-purpose”LLM-basedfoundationmodelsare
bothbeingusedtogeneratecodetoday.Anincreasingnumberofapplicationsand
softwaredevelopmenttoolshaveincorporatedthesemodelstobeofferedasproductseasilyaccessiblebyabroadaudience.
Thesemodelsandassociatedtoolsarebeingadoptedrapidlybythesoftware
developercommunityandindividualusers.AccordingtoGitHub’sJune2023survey,92%ofsurveyedU.S.-baseddevelopersreportusingAIcodingtoolsinandoutof
work.1AnotherindustrysurveyfromNovember2023similarlyreportedahighusagerate,with96%ofsurveyeddevelopersusingAIcodingtoolsandmorethanhalfof
respondentsusingthetoolsmostofthetime.2Ifthistrendcontinues,LLM-generatedcodewillbecomeanintegralpartofthesoftwaresupplychain.
ThepolicychallengeregardingAIcodegenerationisthatthistechnological
advancementpresentstangiblebenefitsbutalsopotentialsystemicrisksforthe
cybersecurityecosystem.Ontheonehand,thesemodelscouldsignificantlyincreaseworkforceproductivityandpositivelycontributetocybersecurityifappliedinareas
suchasvulnerabilitydiscoveryandpatching.Ontheotherhand,researchhasshownthatthesemodelsalsogenerateinsecurecode,posingdirectcybersecurityrisksif
incorporatedwithoutproperreview,aswellasindirectrisksasinsecurecodeendsupinopen-sourcerepositoriesthatfeedintosubsequentmodels.
Asdevelopersincreasinglyadoptthesetools,stakeholdersateverylevelofthe
softwaresupplychainshouldconsidertheimplicationsofwidespreadAI-generated
code.AIresearchersanddeveloperscanevaluatemodeloutputswithsecurityinmind,programmersandsoftwarecompaniescanconsiderhowthesetoolsfitintoexisting
security-orientedprocesses,andpolicymakershavetheopportunitytoaddressbroadercybersecurityrisksassociatedwithAI-generatedcodebysettingappropriate
guidelines,providingincentives,andempoweringfurtherresearch.ThisreportprovidesanoverviewofthepotentialcybersecurityrisksassociatedwithAI-generatedcodeanddiscussesremainingresearchchallengesforthecommunityandimplicationsforpolicy.
CenterforSecurityandEmergingTechnology|4
Background
WhatAreCodeGenerationModels?
CodegenerationmodelsareAImodelscapableofgeneratingcomputercodein
responsetocodeornatural-languageprompts.Forexample,ausermightprompta
modelwith“WritemeafunctioninJavathatsortsalistofnumbers”andthemodelwilloutputsomecombinationofcodeandnaturallanguageinresponse.Thiscategoryof
modelsincludesbothlanguagemodelsthathavebeenspecializedforcodegenerationaswellasgeneral-purposelanguagemodels—alsoknownas“foundationmodels”—
thatarecapableofgeneratingothertypesofoutputsandarenotexplicitlydesignedto
outputcode.ExamplesofspecializedmodelsincludeAmazonCodeWhisperer,
DeepSeekCoder,WizardCoder,andCodeLlama,whilegeneral-purposemodelsincludeOpenAI’sGPTseries,Mistral,Gemini,andClaude.
Earlieriterationsofcodegenerationmodels—manyofwhichpredatedthecurrent
generationofLLMsandarestillinwidespreaduse—functionedsimilarlyto
“autocompleteforcode,”inwhichamodelsuggestsacodesnippettocompletealine
asausertypes.These“autocomplete”models,whichperformwhatisknownascode
infilling,aretrainedspecificallyforthistaskandhavebeenwidelyadoptedinsoftwaredevelopmentpipelines.Morerecentimprovementsinlanguagemodelcapabilitieshaveallowedformoreinteractivity,suchasnatural-languagepromptingorauserinputtingacodesnippetandaskingthemodeltocheckitforerrors.Likegeneral-purposelanguagemodels,userscommonlyinteractwithcodegenerationmodelsviaadedicatedinterfacesuchasachatwindoworaplugininanotherpieceofsoftware.Recently,specialized
scaffoldingsoftwarehasfurtherincreasedwhatAImodelsarecapableofincertaincontexts.Forinstance,somemodelsthatcanoutputcodemayalsobecapableof
executingthatcodeanddisplayingtheoutputstotheuser.3
Aslanguagemodelshavegottenlargerandmoreadvancedoverthepastfewyears,
theircodegenerationcapabilitieshaveimprovedinstepwiththeirnaturallanguage-
generationcapabilities.4Codinglanguagesare,afterall,intentionallydesignedto
encodeandconveyinformation,andhavetheirownrulesandsyntacticalexpectationsmuchlikehumanlanguages.Researchersinthefieldofnaturallanguageprocessing
(NLP)havebeeninterestedintranslatingbetweennaturallanguageandcomputercode
formanyyears,butthesimultaneousintroductionoftransformer-basedlanguage
modelarchitecturesandlargedatasetscontainingcodeledtoarapidimprovementincodegenerationcapabilitiesbeginningaround2018–2019.Asnewmodelswere
released,researchersalsobeganexploringwaystomakethemmoreaccessible.Inmid-2021,forexample,OpenAIreleasedthefirstversionofCodex,aspecializedlanguage
CenterforSecurityandEmergingTechnology|5
modelforcodegeneration,alongwiththeHumanEvalbenchmarkforassessingthe
correctnessofAIcodeoutputs.5Github,inpartnershipwithOpenAI,thenlauncheda
previewofaCodex-poweredAIpairprogrammingtoolcalledGithubCopilot.6Althoughitinitiallyfunctionedmoresimilarlyto“autocompleteforcode”thanacurrent-
generationLLMchatbot,GithubCopilot’srelativeaccessibilityandearlysuccesshelped
spurinterestincodegenerationtoolsamongprogrammers,manyofwhomwereinterestedinadoptingAItoolsforbothworkandpersonaluse.
Tobecomeproficientatcodegeneration,modelsneedtobetrainedondatasets
containinglargeamountsofhuman-writtencode.Modernmodelsareprimarilytrainedonpublicly-available,open-sourcecode.7Muchofthiscodewasscrapedfromopen-
sourcewebrepositoriessuchasGithub,whereindividualsandcompaniescanstore
andcollaborateoncodingprojects.Forexample,thefirstversionofthe6-terabyte
datasetknownasTheStackconsistsofsourcecodefilesin358differentprogramminglanguages,andhasbeenusedtopretrainseveralopencodegenerationmodels.8Otherlanguagemodeltrainingdatasetsareknowntocontaincodeinadditiontonatural-
languagetext.The825-gigabytedatasetcalledThePilecontains95gigabytesofGithubdataand32gigabytesscrapedfromStackExchange,afamilyofquestion-answeringforumsthatincludescodesnippetsandothercontentrelatedto
programming.9However,thereisoftenlimitedvisibilityintothedatasetsthat
developersusefortrainingmodels.Wecanspeculatethatthemajorityofcodebeing
usedtotraincodegenerationmodelshasbeenscrapedfromopen-sourcerepositories,butotherdatasetsusedfortrainingmaycontainproprietarycodeorsimplybeexcludedfrommodelcardsorotherformsofdocumentation.
Additionally,somespecializedmodelsarefine-tunedversionsofgeneral-purpose
models.Usually,theyarecreatedbytraininggeneral-purposemodelswithadditional
dataspecifictotheusecase.Thisisparticularlylikelyininstanceswherethemodel
needstotranslatenatural-languageinputsintocode,asgeneral-purposemodelstendtobebetteratfollowingandinterpretinguserinstructions.OpenAI’sCodexisonesuchexample,asitwascreatedbyfine-tuningaversionofthegeneral-purposeGPT-3
modelon159gigabytesofPythoncodescrapedfromGithub.10CodeLlamaandCodeLlamaPython—basedonMeta’sLlama2model—areotherexamplesofsuchmodels.
ResearchinterestinAIcodegenerationhasconsistentlyincreasedinthepastdecade,especiallyexperiencingasurgeinthepastyearfollowingthereleaseofhigh-
performingfoundationmodelssuchasGPT-4andopen-sourcemodelssuchasLlama2.Figure1illustratesthetrendbycountingthenumberofresearchpapersoncode
generationbyyearfrom2012–2023.Thenumberofresearchpapersoncode
CenterforSecurityandEmergingTechnology|6
generationmorethandoubledfrom2022to2023,demonstratingagrowingresearchinterestinitsusage,evaluation,andimplications.
Figure1:NumberofPapersonCodeGenerationbyYear*
Source:CSET’sMergedAcademicCorpus.
IncreasingIndustryAdoptionofAICodeGenerationTools
Codegenerationpresentsoneofthemostcompellingandwidelyadoptedusecasesforlargelanguagemodels.InadditiontoclaimsfromorganizationssuchasMicrosoftthattheirAIcodingtoolGitHubCopilothad1.8millionpaidsubscribersasofspring2024,
upfrommorethanamillioninmid-2023,11softwarecompaniesarealsoadopting
*ThisgraphcountsthenumberofpapersinCSET’sMergedAcademicCorpusthatcontainthe
keywords“codegeneration,”“AI-assistedprogramming,”“AIcodeassistant,”“codegenerating
LLM,”or“codeLLM”andarealsoclassifiedasAI-orcybersecurity-relatedusingCSET’sAIclassifierandcybersecurityclassifier.NotethatatthetimeofwritinginFebruary2024,CSET’sMerged
AcademicCorpusdidnotyetincludeallpapersfrom2023duetoupstreamcollectionlags,which
mayhaveresultedinanundercountingofpapersin2023.ThecorpuscurrentlyincludesdatafromClarivate’sWebofScience,TheLens,arXiv,PaperswithCode,SemanticScholar,andOpenAlex.
MoreinformationregardingourmethodologyforcompilingtheMergedAcademicCorpusaswellasbackgroundonourclassifiersandadetailedcitationofdatasourcesareavailablehere:
https://eto.tech/dataset-docs/mac/
;
/publication/identifying-ai-research/.
CenterforSecurityandEmergingTechnology|7
internalversionsofthesemodelsthathavebeentrainedonproprietarycodeand
customizedforemployeeuse.GoogleandMetahavecreatednon-public,customcodegenerationmodelsintendedtohelptheiremployeesdevelopnewproductsmore
efficiently.12
ProductivityisoftencitedasoneofthekeyreasonsindividualsandorganizationshaveadoptedAIcodegenerationtools.Metricsformeasuringhowmuchdeveloper
productivityimprovesbyleveragingAIcodegenerationtoolsvarybystudy.Asmall
GitHubstudyusedbothself-perceivedproductivityandtaskcompletiontimeas
productivitymetrics,buttheauthorsacknowledgedthatthereislittleconsensusaboutwhatmetricstouseorhowproductivityrelatestodeveloperwell-being.13AMcKinseystudyusingsimilarmetricsclaimedthatsoftwaredevelopersusinggenerativeAItoolscouldcompletecodingtasksuptotwiceasfastasthosewithoutthem,butthatthesebenefitsvarieddependingontaskcomplexityanddeveloperexperience.14Companieshavealsoruninternalproductivitystudieswiththeiremployees.AMetastudyontheirinternalcodegenerationmodelCodeComposeusedmetricssuchascodeacceptancerateandqualitativedeveloperfeedbacktomeasureproductivity,findingthat20%of
usersstatedthatCodeComposehelpedthemwritecodemorequickly,whileaGooglestudyfounda6%reductionincodingiterationtimewhenusinganinternalcode
completionmodelascomparedtoacontrolgroup.15Morerecently,aSeptember2024studyanalyzingdatafromrandomizedcontroltrialsacrossthreedifferentorganizationsfounda26%increaseinthenumberofcompletedtasksamongdevelopersusing
GitHubCopilotasopposedtodeveloperswhowerenotgivenaccesstothetool.16Moststudiesareinagreementthatcodegenerationtoolsimprovedeveloperproductivityin
general,regardlessoftheexactmetricsused.
AIcodegenerationtoolsareundoubtedlyhelpfultosomeprogrammers,especially
thosewhoseworkinvolvesfairlyroutinecodingtasks.(Generally,themorecommonacodingtaskorcodinglanguage,thebetteracodegenerationmodelcanbeexpectedtoperformbecauseitismorelikelytohavebeentrainedonsimilarexamples.)Automatingrotecodingtasksmayfreeupemployees’timeformorecreativeorcognitively
demandingwork.TheamountofsoftwarecodegeneratedbyAIsystemsisexpectedtoincreaseinthenear-tomedium-termfuture,especiallyasthecodingcapabilitiesof
today’smostaccessiblemodelscontinuetoimprove.
Broadlyspeaking,evidencesuggeststhatcodegenerationtoolshavebenefitsatboththeindividualandorganizationallevels,andthesebenefitsarelikelytoincreaseover
timeasmodelcapabilitiesimprove.Therearealsoplentyofincentives,suchaseaseof
accessandpurportedproductivitygains,fororganizationstoadopt—oratleastexperimentwith—AIcodegenerationforsoftwaredevelopment.
CenterforSecurityandEmergingTechnology|8
RisksAssociatedwithAICodeGeneration
Thistechnologicalbreakthrough,however,mustalsobemetwithcaution.Increasing
usageofcodegenerationmodelsinroutinesoftwaredevelopmentprocessesmeans
thatthesemodelswillsoonbeanimportantpartofthesoftwaresupplychain.Ensuringthattheiroutputsaresecure—orthatanyinsecureoutputstheyproduceareidentifiedandcorrectedbeforecodeentersproduction—willalsobeincreasinglyimportantfor
cybersecurity.However,codegenerationmodelsareseldomtrainedwithsecurityasabenchmarkandareinsteadoftentrainedtomeetvariousfunctionalitybenchmarkssuchasHumanEval,asetof164human-writtenprogrammingproblemsintendedto
evaluatemodels’code-writingcapabilityinthePythonprogramminglanguage.17Asthe
functionalityofthesecodegenerationmodelsincreasesandmodelsareadoptedintothestandardroutineoforganizationsanddevelopers,overlookingthepotential
vulnerabilitiesofsuchcodemayposesystemiccybersecurityrisks.
Theremainderofthissectionwillexaminethreepotentialsourcesofriskingreater
detail:1)codegenerationmodels’likelihoodofproducinginsecurecode,2)themodels’vulnerabilitytoattacks,and3)potentialdownstreamcybersecurityimplicationsrelatedtothewidespreaduseofcodegenerationmodels.
CodeGenerationModelsProduceInsecureCode
Anemergingbodyofresearchonthesecurityofcodegenerationmodelsfocusesonhowtheymightproduceinsecurecode.Thesevulnerabilitiesmaybecontainedwithinthecodeitselforinvolvecodethatcallsapotentiallyvulnerableexternalresource.
Human-computerinteractionfurthercomplicatesthisproblem,as1)usersmay
perceiveAI-generatedcodeasmoresecureormoretrustworthythanhuman-
generatedcode,and2)researchersmaybeunabletopinpointexactlyhowtostopmodelsfromgeneratinginsecurecode.Thissectionexploresthesevarioustopicsinmoredetail.
Firstly,variouscodegenerationmodelsoftensuggestinsecurecodeasoutputs.Pearceetal.(2021)showthatapproximately40%ofthe1,689programsgeneratedbyGithubCopilot18werevulnerabletoMITRE’s“2021CommonWeaknessEnumerations(CWE)Top25MostDangerousSoftwareWeaknesses”list.19SiddiqandSantos(2022)foundthatoutof130codesamplesgeneratedusingInCoderandGithubCopilot,68%and
73%ofthecodesamplesrespectivelycontainedvulnerabilitieswhenchecked
manually.20Khouryetal.(2023)usedChatGPTtogenerate21programsinfive
differentprogramminglanguagesandtestedforCWEs,showingthatonlyfiveoutof21wereinitiallysecure.Onlyafterspecificpromptingtocorrectthecodedidan
CenterforSecurityandEmergingTechnology|9
additionalsevencasesgeneratesecurecode.21Fuetal.(2024)showthatoutof452real-worldcasesofcodesnippetsgeneratedbyGithubCopilotfrompubliclyavailableprojects,32.8%ofPythonand24.5%ofJavaScriptsnippetscontained38different
CWEs,eightofwhichbelongtothe2023CWETop25list.22
Incertaincodinglanguages,codegenerationmodelsarealsolikelytoproducecodethatcallsexternallibrariesandpackages.Theseexternalcodesourcescanpresenta
hostofproblems,somesecurity-relevant:Theymaybenonexistentandmerely
hallucinatedbythemodel,outdatedandunpatchedforvulnerabilities,ormaliciousin
nature(suchaswhenattackersattempttotakeadvantageofcommonmisspellingsinURLsorpackagenames).23Forexample,VulcanCybershowedthatChatGPTroutinelyrecommendednonexistentpackageswhenansweringcommoncodingquestions
sourcedfromStackOverflow—over40outof201questionsinNode.jsandover80outof227questionsinPythoncontainedatleastonenonexistentpackageintheanswer.24Furthermore,someofthesehallucinatedlibraryandpackagenamesarepersistent
acrossbothusecasesanddifferentmodels;asafollow-upstudydemonstrated,a
potentialattackercouldeasilycreateapackagewiththesamenameandgetuserstounknowinglydownloadmaliciouscode.25
Despitetheseempiricalresults,thereareearlyindicationsthatusersperceiveAI-
generatedcodetobemoresecurethanhuman-writtencode.This“automationbias”
towardsAI-generatedcodemeansthatusersmayoverlookcarefulcodereviewand
acceptinsecurecodeasitis.Forinstance,ina2023industrysurveyof537technologyandITworkersandmanagers,76%respondedthatAIcodeismoresecurethanhumancode.26Perryetal.(2023)furthershowedinauserstudythatstudentparticipantswithaccesstoanAIassistantwrotesignificantlylesssecurecodethanthosewithout
access,andweremorelikelytobelievethattheywrotesecurecode.27However,thereissomedisagreementonwhetherornotusersofAIcodegenerationtoolsaremorelikelytowriteinsecurecode;otherstudiessuggestthatuserswithaccesstoAIcode
assistantsmaynotbesignificantlymorelikelytoproduceinsecurecodethanusers
withoutAItools.28Thesecontradictoryfindingsraiseaseriesofrelatedquestions,suchas:Howdoesauser’sproficiencywithcodingaffecttheiruseofcodegeneration
models,andtheirlikelihoodofacceptingAI-generatedcodeassecure?Could
automationbiasleadhumanprogrammerstoaccept(potentiallyinsecure)AI-generatedcodeassecuremoreoftenthanhuman-authoredcode?Regardless,thefactthatAI
codingtoolsmayprovideinexperienceduserswithafalsesenseofsecurityhas
cybersecurityimplicationsifAI-generatedcodeismoretrustedandlessscrutinizedforsecurityflaws.
CenterforSecurityandEmergingTechnology|10
Furthermore,thereremainsuncertaintyaroundwhycodegenerationmodelsproduceinsecurecodeinthefirstplace,andwhatcausesvariationinthesecurityofcode
outputsacrossandwithinmodels.Partoftheanswerliesinthatmanyofthesemodelsaretrainedoncodefromopen-sourcerepositoriessuchasGithub.Theserepositories
containhuman-authoredcodewithknownvulnerabilities,largelydonotenforcesecure
codingpractices,andlackdatasanitizationprocessesforremovingcodewitha
significantnumberofknownvulnerabilities.Recentworkhasshownthatsecurity
vulnerabilitiesinthetrainingdatacanleakintooutputsoftransformer-basedmodels,
whichdemonstratesthatvulnerabilitiesintheunderlyingtrainingdatacontributetotheproblemofinsecurecodegeneration.29Addingtothechallenge,thereisoftenlittleto
notransparencyinexactlywhatcodewasincludedintrainingdatasetsandwhetherornotanyattemptsweremadetoimproveitssecurity.
Manyotheraspectsofthequestionofhow—andwhy—codegenerationmodelsproduceinsecurecodearestillunanswered.Forexample,a2023Metastudythat
comparedseveralversionsofLlama2,CodeLlama,andGPT-3.5and4foundthat
modelswithmoreadvancedcodingcapabilitiesweremorelikelytooutputinsecure
code.30Thissuggestsapossibleinverserelationshipbetweenfunctionalityandsecurityincodegenerationmodelsandshouldbeinvestigatedfurther.Inanotherexample,
researchersconductedacomparativestudyoffourmodels–GPT-3.5,GPT-4,Bard,
andGemini–andfoundthatpromptingmodelstoadopta“securitypersona”eliciteddivergentresults.31WhileGPT-3.5,GPT-4,andBardsawareductioninthenumberofvulnerabilitiesfromthenormalpersona,Gemini’scodeoutputcontainedmore
vulnerabilities.32Theseearlystudieshighlightsomeoftheknowledgegapsconcerning
howinsecurecodeoutputsaregeneratedandhowtheychangeinresponsetovariablessuchasmodelsizeandpromptengineering.
Models’VulnerabilitytoAttack
Inadditiontothecodethattheyoutput,codegenerationmodelsaresoftwaretoolsthatneedtobeproperlysecured.AImodelsarevulnerabletohacking,tampering,or
manipulationinwaysthathumansarenot.33Figure2illustratesthecodegenerationmodeldevelopmentworkflow,wheretheportionsinredindicatevariouswaysa
maliciouscyberactormayattackamodel.
CenterforSecurityandEmergingTechnology|11
Figure2:CodeGenerationModelDevelopmentWorkflowandItsCybersecurityImplications
Source:CSET.
GenerativeAIsystemshaveknownvulnerabilitiestoseveraltypesofadversarial
attacks.Theseincludedatapoisoningattacks,inwhichanattackercontaminatesamodel’strainingdatatoelicitadesiredbehavior,andbackdoorattacks,inwhichan
attackerattemptstoproduceaspecificoutputbypromptingthemodelwitha
predeterminedtriggerphrase.Inthecodegenerationcontext,adatapoisoningattack
maylooklikeanattackermanipulatingamodel’strainingdatatoincreaseitslikelihoodofproducingcodethatimportsamaliciouspackageorlibrary.Abackdoorattackonthemodelitself,meanwhile,coulddramaticallychangeamodel’sbehaviorwithasingle
triggerthatmaypersistevenifdeveloperstrytoremoveit.34Thischangedbehaviorcanresultinanoutputthatviolatesrestrictionsplacedonthemodelbyitsdevelopers(suchas“don’tsuggestcodepatternsassociatedwithmalware”)orthatmayreveal
unwantedorsensitiveinformation.Researchershavepointedoutthatbecausecodegenerationmodelsaretrainedonlargeamountsofdatafromafinitenumberof
unsanitizedcoderepositories,attackerscouldeasilyseedthese
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 2024年離婚財產(chǎn)分配與債務(wù)清算協(xié)議3篇
- 2024年高校科研實驗室入駐合作項目協(xié)議合同范本2篇
- 醫(yī)療行業(yè)營銷工作總結(jié)
- 2025版餐廚垃圾處理項目承包與環(huán)保技術(shù)引進合同3篇
- 2024年財產(chǎn)貸款抵押協(xié)議書
- 2024年股權(quán)投資合同:投資方與融資方之間的約定
- 電信業(yè)銷售工作總結(jié)
- 2024年版設(shè)備購買租賃合同技術(shù)參數(shù)要求
- 2024年蘋果產(chǎn)業(yè)鏈上下游企業(yè)聯(lián)盟合作協(xié)議3篇
- 2024投標標前城市軌道交通項目投標標前合同3篇
- 廣告?zhèn)髅叫袠I(yè)操作人員安全培訓(xùn)
- ICU呼吸系統(tǒng)護理的專業(yè)技巧與注意事項
- 藝術(shù)類院校加強藝術(shù)法教育的思考
- 銀行商會戰(zhàn)略合作協(xié)議書
- 2025年日歷表帶農(nóng)歷【陰歷】完美打印版
- 重點實驗室申報
- 2024年中國華電集團公司招聘筆試參考題庫含答案解析
- 期末備考復(fù)習(xí):語文園地重點知識梳理(課件)五年級上冊語文-部編版
- 濟南版生物八年級下冊全套單元測試題附答案(共3套)
- 機械設(shè)備安裝工施工詳細注意事項培訓(xùn)
- 國際結(jié)算期末復(fù)習(xí)題庫及答案
評論
0/150
提交評論