Search Engine Foreign Literature Translation (Chinese and English)


Foreign literature translation: English original

Investigating the Querying and Browsing Behavior of Advanced Search Engine Users

White, Ryen; Morris, Dan

ABSTRACT

One way to help all users of commercial Web search engines be more successful in their searches is to better understand what those users with greater search expertise are doing, and use this knowledge to benefit everyone. In this paper we study the interaction logs of advanced search engine users (and those not so advanced) to better understand how these user groups search. The results show that there are marked differences in the queries, result clicks, post-query browsing, and search success of users we classify as advanced (based on their use of query operators), relative to those classified as non-advanced. Our findings have implications for how advanced users should be supported during their searches, and how their interactions could be used to help searchers of all experience levels find more relevant information and learn improved searching strategies.

Keywords: Query syntax, advanced search features, expert searching.

INTRODUCTION

The formulation of query statements that both capture the salient aspects of information needs and are meaningful to Information Retrieval (IR) systems poses a challenge for many searchers [3]. Commercial Web search engines such as Google, Yahoo!, and Windows Live Search offer users the ability to improve the quality of their queries using query operators such as quotation marks, plus and minus signs, and modifiers that restrict the search to a particular site or type of file. These techniques can be useful in improving result precision, yet, other than via log analyses (e.g., [15][27]), they have generally been overlooked by the research community in attempts to improve the quality of search results.

IR research has generally focused on alternative ways for users to specify their needs rather than on increasing the uptake of advanced syntax. Research on practical techniques to supplement existing search technology and support users has been intensifying in recent years (e.g., [18][34]). However, it is challenging to implement such techniques at large scale with tolerable latencies.

Typical queries submitted to Web search engines take the form of a series of tokens separated by spaces. There is generally an implied Boolean AND operator between tokens that restricts search results to documents containing all query terms. De Lima and Pedersen [7] investigated the effect of parsing, phrase recognition, and expansion on Web search queries. They showed that the automatic recognition of phrases in queries can improve result precision in Web search. However, the value of advanced syntax for typical searchers has generally been limited, since most users do not know about advanced syntax or do not understand how to use it [15]. Since it appears operators can help retrieve relevant documents, further investigation of their use is warranted.

In this paper we explore the use of query operators in more detail and propose alternative applications that do not require all users to use advanced syntax explicitly. We hypothesize that searchers who use advanced query syntax demonstrate a degree of search expertise that the majority of the user population does not, an assertion supported by previous research [13]. Studying the behavior of these advanced search engine users may yield important insights about searching and result browsing from which others may benefit.

Through an experimental study and analysis, we offer potential answers for each of these questions. A relationship between the use of advanced syntax and any of these features could support the design of systems tailored to advanced search engine users, or use advanced users’ interactions to help non-advanced users be more successful in their searches.

RELATED WORK

Factors such as lack of domain knowledge, poor understanding of the document collection being searched, and a poorly developed information need can all influence the quality of the queries that users submit to IR systems ([24], [28]). There has been a variety of research into different ways of helping users specify their information needs more effectively. Belkin et al. [4] experimented with providing additional space for users to type a more verbose description of their information needs. A similar approach was attempted by Kelly et al. [18], who used clarification forms to elicit additional information about the search context from users. These approaches have been shown to be effective in best-match retrieval systems, where longer queries generally lead to more relevant search results [4]. However, in Web search, where many of the systems are based on an extended
Boolean retrieval model, longer queries may actually hurt retrieval performance, leading to a small number of potentially irrelevant results being retrieved. It is not sufficient simply to request more information from users; this information must be of better quality.

Relevance Feedback (RF) and interactive query expansion are popular techniques that have been used to improve the quality of information that users provide to IR systems regarding their information needs. In the case of RF, the user presents the system with examples of relevant information that are then used to formulate an improved query or retrieve a new set of documents. It has proven difficult to get users to use RF in the Web domain because of the difficulty of conveying the meaning and the benefit of RF to typical users. Query suggestions offered based on query logs have the potential to improve retrieval performance with limited user burden. This approach is limited to re-executing popular queries, and searchers often ignore the suggestions presented to them. In addition, neither of these techniques helps users learn to produce more effective queries.

Log-based analysis of users’ interactions with the Excite and AltaVista search engines has shown that only 10-20% of queries contained any advanced syntax. This analysis can be a useful way of capturing characteristics of users interacting with IR systems. Research in user modeling and personalization has shown that gathering more information about users can improve the effectiveness of searches, but requires more information about users than is typically available from interaction logs alone. Unless coupled with a qualitative technique, such as a post-session questionnaire [23], it can be difficult to associate interactions with user characteristics.

In our study we conjecture that, given the difficulty in locating advanced search features within the typical search interface and the potential problems in understanding the syntax, those users that do use advanced syntax regularly represent a distinct class of searchers who will exhibit other common search behaviors. In this paper we study other search characteristics of users of advanced syntax in an attempt to determine whether there is anything different about how these search engine users search, and whether their searches can be used to benefit those who do not make use of the advanced features of search engines. To do this we use interaction logs gathered from a large set of consenting users over a prolonged period.

In the next section we describe the data we use to study the behavior of the users who use advanced syntax, relative to those that do not use this syntax.

DATA

To perform this study we required a description of the querying and browsing behavior of many searchers, preferably over a period of time to allow patterns in user behavior to be analyzed. To obtain these data we mined the interaction logs of consenting Web users over a period of 13 weeks, from January to April 2006. When downloading a partner client-side application, the users were invited to consent to their interaction with Web pages being anonymously recorded (with a unique identifier assigned to each user) and used to improve the performance of future systems.

The information contained in these log entries included a unique identifier for the user, a timestamp for each page view, a unique browser window identifier (to resolve ambiguities in determining in which browser a page was viewed), and the URL of the Web page visited. This provided us with sufficient data on querying behavior (from interaction with search engines) and browsing behavior (from interaction with the pages that follow a search) to more broadly investigate search behavior.

In addition to the data gathered during the course of this study, we also had relevance judgments of documents that users examined for 10,680 unique query statements present in the interaction logs. These judgments were assigned on a six-point scale by trained human judges at the time the data were collected. We use these judgments in this analysis to assess the relevance of sites users visited on their browse trail away from search result pages.

The privacy of our volunteers was maintained throughout the entire course of the study: no personal information was elicited about them, participants were assigned a unique anonymous identifier that could not be traced back to them, and we made no attempt to identify a particular user or study individual behavior in any way. All findings were aggregated over multiple users, and no information other than consent for logging was elicited.

DISCUSSION AND IMPLICATIONS

Our findings indicate significant differences in the querying, result-click, post-query navigation, and search success of those that use advanced syntax versus those that do not. Many of these findings mirror those already found in previous studies with groups of self-identified novices and experts. There are several ways in which a commercial search engine system might benefit from a quantitative indication of searcher expertise. This might be yet another feature available to a ranking engine; i.e., it may be the case that expert searchers in some cases prefer different pages than novice searchers. The user interface to a search engine might be tailored to a user’s expertise level; perhaps even more advanced features such as term weighting and query expansion suggestions could be presented to more experienced searchers while preserving the simplicity of the basic interface for novices. Result presentation might also be customized based on search skill level; future work might re-evaluate the benefits of content snippets, thumbnails, etc. in a manner that allows different outcomes for different expertise levels. Additionally, if browsing histories are available, the destinations of advanced searchers could be used as suggested results for queries, bypassing and potentially improving upon the traditional search process.

The use of the interaction of advanced search engine users to guide others with less expertise is an attractive proposition for the designers of search systems. In part, these searchers may have more post-query browsing expertise that allows them to overcome the shortcomings of search systems. Their interactions can be used to point users to places that advanced search engine users visit, or simply to train less experienced searchers how to search more effectively. However, if expert users are going to be used in this way, issues of data sparsity will need to be overcome. Our advanced users only accounted for 20.1% of the users whose interactions we studied. Whilst these may be amongst the most active users, it is unlikely that they will view documents that cover a large number of subject areas. However, rather than focusing on where they go (which is perhaps more appropriate for those with domain knowledge), advanced search engine users may use moves, tactics and strategies [2] that inexperienced users can learn from. Encouraging users to use advanced syntax helps them learn how to formulate better search queries; leveraging the searching style of expert searchers could help them learn more successful post-query interactions.

One potential limitation of the results we report is that prior research has shown that query operators do not significantly improve the effectiveness of Web search results [8], and that searchers may be able to perform just as well without them [27]. It could therefore be argued that the users who do not use query operators are in fact more advanced, since they do not waste time using potentially redundant syntax in their query statements. However, this seems unlikely given that those who use advanced syntax exhibited search behaviors typical of users with expertise [13], and are more successful in their searching. Nevertheless, in future work we will expand our definition of “advanced user” beyond attributes of the query to also include other interaction behaviors, some of which we have defined in this study, and other avenues of research such as eye-tracking [12].
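
The paper classifies users as advanced "based on their use of query operators" and names quotation marks, plus and minus signs, and site or file-type modifiers as examples. A minimal sketch of that idea follows. The regular expressions, the function names `operators_in` and `classify_users`, the `(user_id, query)` log shape, and the `min_fraction` threshold are all assumptions made for illustration; the text above does not state the exact rule the authors applied.

```python
import re
from collections import defaultdict

# Operators named in the paper: quotation marks, plus and minus signs, and
# modifiers restricting the search to a site or file type.
# The concrete patterns below are illustrative assumptions.
OPERATOR_PATTERNS = {
    "phrase": re.compile(r'"[^"]+"'),
    "plus": re.compile(r'(?:^|\s)\+\w'),
    "minus": re.compile(r'(?:^|\s)-\w'),
    "modifier": re.compile(r'(?:^|\s)(?:site|filetype):\S+', re.IGNORECASE),
}


def operators_in(query):
    """Return the set of operator kinds that appear in a query string."""
    return {name for name, pattern in OPERATOR_PATTERNS.items()
            if pattern.search(query)}


def classify_users(log, min_fraction=0.0):
    """Label each user by the share of their queries that contain an operator.

    log: iterable of (user_id, query) pairs (hypothetical log shape).
    min_fraction: assumed threshold; a user is 'advanced' when the share of
    operator-bearing queries is strictly greater than this value.
    """
    total = defaultdict(int)
    with_operators = defaultdict(int)
    for user_id, query in log:
        total[user_id] += 1
        if operators_in(query):
            with_operators[user_id] += 1
    return {
        user: "advanced"
        if with_operators[user] / total[user] > min_fraction
        else "non-advanced"
        for user in total
    }


if __name__ == "__main__":
    sample_log = [
        ("u1", "cheap flights"),
        ("u1", '"jaguar car" -animal'),
        ("u2", "weather"),
    ]
    print(classify_users(sample_log))
```

Requiring whitespace (or the start of the query) before a plus or minus sign keeps hyphenated terms such as "post-query" from being counted as operator use. Raising `min_fraction` would restrict the advanced group to users who apply operators regularly rather than once.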
