Ma Guoqing - Design and Implementation of Web Page Prefetching and Caching Methods for Mobile Browsers

Document summary

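PocketWeb: Instant Web Browsing for Mobile Devices

Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, Akshay [...], Alexandros [...]

[...] High network latency and limited battery life make web browsing on mobile phones an unsatisfying experience. In our previous work [...] We start by analyzing the web browsing traces of 8,000 users, [...] a machine learning approach based on boosting techniques to efficiently train a model that reflects each user [...] reduces energy consumption by more than 50%.

Categories and Subject Descriptors: H.4.m [Information Systems]: Information Systems Applications - Miscellaneous

General Terms [...]

1. [...]

With the recent growth of touch screens and the spread of mobile data networks, smartphones are rapidly gaining popularity. They are the most convenient devices for accessing the web, and according to [22] [...] surpass desktop devices within 4 years [...] We validated this idea with a search service: a set of frequently issued queries and their results [...] at night while the phone is charging, [...] To provide a fast browsing experience under existing network conditions, this paper proposes an intelligent web page [...] At a high level, whether because the prefetch happened too long ago (the content is stale) or because the prefetched content is not what the user [...]

Previous work on web content prefetching mainly focused on simply correlating websites into sets, and then when [...] study the characteristics of browsing in depth rather than only the sequential features of the browsing history. We use the browsing [...] obtained from our analysis [...] which pages the user will visit and when the user will visit them. By analyzing each user's browsing history, the phone can download the corresponding content before the user is about to visit a page, thereby improving the user [...] evaluated with 48,000 models; the results show that we can [...] reduces energy consumption by more than 50%.

[...] Provides a detailed analysis of the web browsing of 8,000 users, showing that each user's browsing behavior [...] To verify the accuracy of this approach, part of each user's browsing history is used to train the user's [...]

2. Web Browsing Analysis

The core of our work is to understand users' browsing behavior and build an effective model. We first [...]

The dataset we use consists of the browsing records of 8,000 users over a 3-month period. The users were randomly selected from a large number of users who installed the Bing application or whose phones had Bing pre-installed. The users' phones [...] ([20,40), [40,140), [140,460), [460,∞)) were further divided into four classes (low, medium, high, extreme). The browsing records we analyzed include each user's identifier, the URL visited, and the time of the visit.

Repeatability of mobile web accesses

We first study the repeatability of mobile web accesses. We compute [...] that a user visits a brand-new URL in the next access [...] shown in Figure 1(a), plotted separately for different devices and usage volumes. The results show that roughly 40% to 60% of smartphone users, whether their usage is high or low, have only about a 20% chance [...] We also find that the more a user browses, the more likely the user is to revisit pages. In addition, the behavior on feature phones [...] browsing paths are more conservative; these users tend to visit only the pages they need.

Figure 1. Web page [...] statistics.

[...] URL accounts for more than 50% of a user's total visits. In other words, a single URL [...] for each user's URL [...] 10% of the visits, so it is important for a prefetching technique to consider the characteristics of each user.

[...] a small number of URLs that the user visits frequently; the second is a large number of URLs that the user visits infrequently. To [...] for example 3, then for 50% of the users who browse heavily the number of targeted URLs would [...] the number of targeted URLs for users who [...] browse would be 0.

Figure 2. [...] statistics.

Targeted accesses account for more than 70% [...] the average number of [...] URLs [...] the distribution of time intervals for targeted pages, untargeted pages, and all pages. [...] 12 [...] 6 minutes. Therefore, a visit to a targeted page can serve as a predictor of the next targeted page visit.

Figure 3. [...]

From one untargeted page visit to the next, 70% to 80% of the time intervals are shorter than 12 minutes, far higher than the 35% to 50% for targeted pages. In other words, when users browse the web, compared with targeted pages, they tend to visit many pages within a short time. Therefore [...]

Besides relative timing during browsing, we also study absolute timing in the browsing records ([...] As shown in Figure 4, we randomly selected the 3-month browsing records of 4 smartphone users and [...] within a day.

Figure 4. Time distribution [...]

The figure shows clear differences among the access patterns of the 4 users. The records of users 1 and 4 consist mainly of untargeted [...], so content prefetching for such users is very difficult. Interestingly, however, the accesses of users [...] and 4 show strong temporal periodicity (user 4's activity occurs only between 6 am and 9 am [...] if the periodicity of a user's targeted pages can be used to predict when a targeted page will be visited, then [...]

[...] the real browsing records of 8,000 users, we found [...] that are critical to prefetching:

  • A small number of targeted URLs dominates a user's browsing behavior; predicting these targeted URLs [...]
  • Targeted URLs are always visited in clusters. Users tend to visit them in batches within a short time window, so the targeted URLs currently being visited are strong indicators of future user behavior.
  • [...] URL records can help us predict the type of the next URL. [...]

Learning-Based Content Prefetching

[...] For each URL presented to the user, many features are computed, covering information about the user, the query, the URL [...] becomes a web access prediction model, which computes the access probability of each targeted URL at a given time. Likewise, the higher the probability, the more likely the URL is to be visited next. The features used to train the prediction model are the key to building the model and should be extracted from the user's browsing records.

[...] features of [...] URL; these features [...] according to temporal and spatial characteristics [...] annotated as feature vectors of accesses or non-accesses. Using these labeled feature vectors, the system trains a model with stochastic gradient boosting, [...] a feature vector is generated for every [...] URL, the prediction model takes these feature vectors as [...] threshold, and decides which URLs to prefetch, including the corresponding [...], CSS, JavaScript, and so on. [...] obtained from the structure of web browsing. With stochastic gradient boosting, we can see more clearly [...]

To build a prediction model, MART takes historical browsing records as input; the whole dataset is [...] 1/5 [...] 1/6. The training set is used to train the model; its records may be associated with a spatial parameter s (the previously visited URL), a temporal parameter t (the access time), a spatiotemporal parameter b (the time since the last access), and an activity parameter p (the popularity of the URL among all URLs), together with a label a that records the user's behavior. MART uses the training data to build a classification model M, which estimates the transition probability [...]. At test time we use gradient descent for optimization, [...]

Table 1 [...]

[...] proposes 4 features, as shown in Table 1, to reflect the properties above. In total, a user with k targeted URLs requires 2*k+11 features. Each feature is described in detail in Table [...] features record the periodicity of the user's targeted URLs and the specific time of page visits. Spatiotemporal features combine space and time [...] URL [...] varies from 1 to 10, but usually only 2 to 3 pages are decisive.

[...] test data. We first use the training data to identify the user's targeted URLs; URLs visited 5 times within a month are indexed 1, 2, ..., k. [...] for the user in [...] 6, k = 2. After the targeted URLs are identified, the training, validation, and test sets are obtained as follows.

Figure 6. Cell [...]

[...] divided into a series of cells. For example, Figure 6 shows 4 different cells for D = 5. The feature vector of a targeted URL that is visited is labeled as an access; the other feature vectors are labeled as non-accesses. If a targeted URL is visited several times within a cell, only the first visit in the cell is considered. This biases the model toward the first visit and thus keeps prefetching timely. Whenever a URL is prefetched at the start of a cell, its content is considered fresh for the whole cell. Here we assume that the page's refresh time is longer than D. The labeled feature vectors in the training file are used to train the prediction model offline. The labeled [...] in the test file [...] we can obtain each page's access probability from the model. Pages with a probability greater than 0.5 are prefetched (this probability threshold can be changed dynamically according to network and battery conditions). Whether a prediction succeeds depends on [...]

4. [...]

[...] 6 different models (2 different feature sets, three different freshness thresholds); for all users, we need to build 48,000 prediction models.

5. [...]

[...] trade spare storage for network latency and energy consumption. In our previous work we only used storage [...] By predicting the content the user is about to access, we improve the user's mobile browsing experience while reducing energy consumption.

References

[1] E. Adar, J. Teevan, and S. T. Dumais. Large scale analysis of web revisitation patterns. In Proc. of CHI, 2008.
[2] E. Adar, J. Teevan, and S. T. Dumais. Resonance on the web: web dynamics and revisitation patterns. In Proc. of CHI, pages 1381–1390, 2009.
[3] A. Adya, P. Bahl, and L. Qiu. Analyzing the browse patterns of mobile clients. In Proc. [...]M Workshop on Internet [...]
[4] T. Armstrong, O. Trescases, C. Amza, and E. de Lara. Efficient and transparent dynamic content updates for mobile clients. In Proc. of MobiSys, 2006.
[5] A. Balasubramanian, B. Levine, and A. Venkataramani. Enhancing interactive web applications in hybrid networks. In Proc. of [...]
[6] N. Balasubramanian, A. Balasubramanian, and A. Venkataramani. Energy consumption in mobile phones: a measurement study and implications for network applications. In Proc. of IMC, pages 280–293, 2009.
[7] P. Barford, A. Bestavros, A. D. Bradley, and M. Crovella. Changes in web client [...] patterns: Characteristics and caching implications. World Wide Web, 2(1–2):15–28, [...]
[8] L. D. Catledge and J. E. Pitkow. Characterizing browsing strategies in the world-wide web. In Proc. of the 3rd World-Wide Web conference on Technology, tools and applications, pages 1065–1073, 1995.
[9] A. Cockburn, S. Greenberg, S. Jones, B. Mckenzie, and M. Moyle. Improving web page revisitation: analysis, design and evaluation. IT and Society J., 1:159–183, [...]
[10] A. Cockburn and B. McKensie. What do web users do? An empirical analysis of web use. Int. J. [...]put. Stud., 54:903–922, June 2001.
[11] C. Cunha, A. Bestavros, and M. Crovella. Characteristics of www client-based [...] Technical Report TR-95-010, [...]
[12] [...] and Systems, 1999.
[13] H. Falaki, D. Lymberopoulos, R. Mahajan, S. Kandula, and D. Estrin. A first look at [...] on smartphones. In Proc. of IMC, pages [...]
[14] J. H. Friedman. Stochastic gradient boosting. Comput. Stat. Data Anal., 38(4):367–378, [...]
[15] ITRS Working Group. International technology [...] for semiconductors 2009 report. Technical report, 2009.
[16] Z. Jiang and L. Kleinrock. Web prefetching in a mobile environment. IEEE Personal Communications, 5(5), 1998.
[17] F. Khalil, J. Li, and H. Wang. Integrating recommendation models for improved web page prediction accuracy. In Australasian Conference [...]puter Science, 2008.
[18] A. Komninos and M. Dunlop. A calendar based internet content pre-caching agent for small computing devices. J. of Personal and Ubiquitous Computing, 12(7), 2008.
[19] E. Koukoumidis, D. Lymberopoulos, K. Strauss, J. Liu, and D. Burger. Pocket cloudlets. In Proc. of ASPLOS, 2011.
[20] E. P. Markatos and C. E. Chronaki. A top-10 approach to prefetching on the web. In Proc. of INET, 1998.
[21] B. McKenzie and A. Cockburn. An empirical analysis of web page revisitation. In Proc. of HICSS, volume 5, 2001.
[22] Mongoose Metrics. Mobile Devices Surpass Desktop Web Browsing in Five to Ten Years, 2010. [...]/press [...]
[23] A. Nanopoulos, D. Katsaros, and Y. Manolopoulos. A data mining algorithm for generalized web prefetching. IEEE Trans. on Knowledge and Data Engineering, 2003.
[24] H. Obendorf, H. Weinreich, E. Herder, and M. Mayer. Web page revisitation revisited: Implications of a long-term click-stream study of browser usage. In Proc. of CHI, 2007.
[25] V. N. Padmanabhan and J. C. Mogul. Using predictive prefetching to improve world wide web latency. [...]M Comput. Commun. Rev., 26(3), 1996.
[26] V. N. Padmanabhan and L. Qiu. The content and access dynamics of a busy web site: findings and implications. [...]M Comput. Commun. Rev., 30:111–123, 2000.
[27] J. Pitkow and P. Pirolli. Mining longest repeating subsequences to predict world wide [...] surfing. In Proc. of USENIX, pages 139–150, [...]
[28] F. Qian, Z. Wang, A. Gerber, Z. M. Mao, S. Sen, and O. Spatscheck. Characterizing radio resource allocation for 3g networks. In Proc. of IMC, pages 137–150, 2010.
[29] L. Tauscher and S. Greenberg. How people revisit web pages: empirical findings [...] implications for the design of history systems. Int. [...]p. St., 47:97–137, [...]
[30] A. Thawani, S. Gopalan, V. Sridhar, and K. Ramamritham. Context aware timely information delivery in mobile environments. The Computer Journal, 50(4), 2007.
[31] [...] of web content to a mobile device. In Proc. of Mobility, 2007.
[32] Q. Wu, C. J. C. Burges, K. M. Svore, and J. Gao. Ranking, boosting, and model adaptation. Technical Report MMSR-TR-2008-109, [...] Research, 2008.
[33] L. Yin and G. Cao. Adaptive power-aware prefetch in wireless networks. IEEE Trans. on Wireless Communications, 3(5), 2004.

PocketWeb: Instant Web Browsing for Mobile Devices

Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, Akshay Mittal, Alexandros [...]
[...] Indian Institute of Technology, Kanpur, [...]

The high network latencies and limited battery life of mobile phones can make mobile web browsing a frustrating experience. In prior work, we proposed trading memory capacity for lower web access latency and a more convenient data transfer schedule from an energy [...] by prefetching slowly-changing data (search queries and results) nightly, when the phone is charging. However, most web content is intrinsically much more dynamic and may be updated multiple times a day, thus eliminating the effectiveness of periodic updates. This paper addresses the challenge of prefetching dynamic web content in a timely fashion, giving the user an instant web browsing experience but without aggravating the battery lifetime issue. We start by analyzing the web access traces of 8,000 users, and observe that mobile web browsing exhibits a strong spatiotemporal signature, which is different for every user. We propose to use a machine learning approach [...] techniques to efficiently model this signature on a per user basis. The machine learning model is capable of accurately predicting future web accesses and prefetching the content in a timely manner. Our [...] datasets shows that we can accurately prefetch 60% of the URLs for about 80-90% of the users within 2 minutes before the request. The system prototype we built not only provides more than 80% lower web access time for more than 80% of the users, but it also achieves the same or lower radio energy dissipation by more than 50% for the majority of mobile users.

Categories and Subject Descriptors H.4.m [Information Systems]: Information Systems Applications - Miscellaneous

General Terms Algorithms, Human Factors, [...]

With recent advances in large touch screens and widespread data networks, smartphones are rapidly gaining popularity. They are the most convenient device to access the web, and according to a recent study [22], mobile devices are expected to surpass desktop web browsing in the next 4 years. Mobile phone users' experience has come a long way in the past decade, but such devices still face high network latencies and limited battery life, which can make the mobile experience frustrating.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ASPLOS'12, March 3–7, 2012, London, England, UK. Copyright © 2012 ACM 978-1-4503-0759-8/12/03...$10.00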

Luckily, memory capacity is still experiencing healthy improvements [15], and can be used to mitigate the two previous limitations. Surplus memory can be used to store data that is brought to the mobile device when network conditions are favorable. The [...]

Based on this observation we have proposed the concept of Pocket Cloudlets [19], i.e., bringing part of cloud services into mobile phones to reduce latency and energy consumption, with the added benefit of significantly reducing the load on the server side as well. We demonstrated the concept using a search service: a set of popular search queries and results is loaded onto the phone at night, when the phone is charging, to speed up searches during the next day. However, in that work we limited ourselves to search and did not address actual web content. While search results change slowly (they can be considered static on a daily basis), web content can change quickly during a single day.

To enable a faster mobile web browsing experience on the [...] intelligent web content prefetcher that downloads web content on the mobile device at appropriate times, anticipating a user's future web accesses.

Perhaps the biggest challenge with prefetching web content compared to search queries and results is that web content is dynamic. While the map of search queries to search results remains relatively stable over days or even weeks, web content changes frequently. News and social network websites change continuously, such that the nightly update approach does not work as well as it does for search. [...] prefetch as much content as we can as often as possible. However, this approach is not practical due to constraints in battery capacity and [...] web pages of a user every 2 minutes might ensure a lightning-fast web browsing experience, but the battery would not last very long.

At a high level, we can consider a prefetch of a web page as unsuccessful or "wasted" either because it happened too long ago (and thus the content on the device is stale), or because the user did not end up explicitly requesting the web page at all. In either of these cases, we end up using some of the phone's resources without realizing any gains. Therefore, our goal is to achieve timely prefetches without increasing (and possibly decreasing!) energy [...]

We first analyze the web access traces from roughly 8,000 mobile users over a period of 3 months and show three well-defined patterns. First, users often visit a small set of web pages from their phones, which they tend to repeatedly visit over time. Second, user accesses are often periodic and happen at given time windows. For example, a user may check the updates every 30 [...] an afternoon break. Third, users often access content in bursts. For example, when the user checks the news, she may also check the weather forecast and current stock prices.
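The notion of a "wasted" prefetch described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Prefetch` record, the `classify_prefetch` function, and the freshness window parameter are hypothetical names introduced here, and the 120-second window in the usage lines is an arbitrary example value rather than a setting taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prefetch:
    """One prefetch and the first user request that followed it (hypothetical record)."""
    url: str
    fetched_at: float               # time of the prefetch, in seconds
    requested_at: Optional[float]   # time of the first later request, or None


def classify_prefetch(p: Prefetch, freshness_window: float) -> str:
    """Label a prefetch as useful or wasted, following the two failure cases in the text."""
    if p.requested_at is None:
        # The user never explicitly requested the page.
        return "wasted: never requested"
    if p.requested_at - p.fetched_at > freshness_window:
        # The prefetch happened too long before the request, so the copy is stale.
        return "wasted: stale"
    return "useful"


# Example usage with an assumed 120-second freshness window.
print(classify_prefetch(Prefetch("news", 0.0, 60.0), 120.0))
print(classify_prefetch(Prefetch("news", 0.0, 600.0), 120.0))
print(classify_prefetch(Prefetch("news", 0.0, None), 120.0))
```

Both wasted cases cost radio energy without any latency gain, which is why the goal stated above is timeliness rather than prefetching as often as possible.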

[Figure 1 plots, one panel per volume class (low, medium, high, extreme). Panel (a): fraction of [...] versus probability of visiting a new [...]. Panel (b): fraction of URL [...] versus # of the top most frequently visited [...]. Series: feature phone and smartphone average and standard deviation.]

Figure 1. (a) Repeatability of mobile URL visits. (b) Average and standard deviation of the cumulative URL volume that the top most frequently visited URLs are accountable for.

Prior work in web content prefetching has focused on simply correlating sets of websites accessed together and using this information to prefetch the set when the first page of a set is accessed. In this work, we study properties of mobile web accesses that go beyond these sequential features. We take advantage of the spatiotemporal access patterns obtained from our analysis and use machine learning techniques to create a model that can be used to predict both what web pages a user is likely to request as well as when these requests are likely to occur. By learning how each individual user accesses the web over time, the phone can proactively download web content before the user explicitly attempts to access it, thus enabling an instant mobile browsing experience.

[...] we can accurately prefetch 60% of the URLs for about 80-90% of the users within 2 minutes before the request. Furthermore, the proposed approach not only provides more than 80% lower web access time for more than 80% of the users, but it also achieves the same or lower radio energy dissipation by more than 50% for the majority of mobile users.

In summary, this paper makes the following [...]

  • [...] from 8,000 users, showing widely disparate behaviors from user to user but a strong spatiotemporal structure for individual [...]
  • [...] access prediction problem in machine learning, where the features are derived from the observable structure of mobile web browsing. We use stochastic gradient boosting techniques for this purpose, which allow us to provide insight into which features are the most relevant to access prediction.
  • Experimentally evaluates the accuracy of the proposed approach for each user by creating individual user models with a portion of the accesses in each trace, and testing their performance with the remaining portion.
  • Quantifies, using a prototype implementation, the impact of the proposed approach on the web access time and radio power consumption with respect to the state-of-the-art.
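The per-user evaluation described in the list above (train on one portion of a trace, test on the rest) can be sketched as a chronological split. This is a minimal illustration, not the authors' procedure: the function name `split_trace`, the `(timestamp, url)` tuple layout, and the 0.8 training fraction are assumptions made for the example, since the text does not state the proportions used.

```python
from typing import List, Tuple

Access = Tuple[float, str]  # (timestamp, url); this layout is an assumption


def split_trace(accesses: List[Access], train_fraction: float = 0.8):
    """Split one user's trace chronologically into a training and a testing portion.

    Sorting by time first keeps every test access later than every training
    access, so the model is never evaluated on accesses that precede the ones
    it was trained on.
    """
    ordered = sorted(accesses, key=lambda a: a[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Example usage on a tiny hypothetical trace.
trace = [(3.0, "c"), (1.0, "a"), (2.0, "b"), (4.0, "d"), (5.0, "e")]
train, test = split_trace(trace)
print(len(train), len(test))
```

A chronological split is used here instead of a random one because access prediction is a forecasting task, and shuffling would leak future behavior into training.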

Mobile Web Browsing [...]

Central to our work is the ability to understand and model effectively the user browsing behavior. We start our study by first providing a description of the dataset that we used, and continue with the results of our analysis both on aggregate across users and individually per user.

Data [...]

We used the mobile web access logs of 8,000 users across the United States over a 3-month period. The users were randomly selected among a larger number of users that opted to download and install the Bing application or to enable the pre-installed Bing toolbar on their mobile phones. Users' phones varied from high-end smartphones (e.g., [...], Android, Blackberry) to low-end feature phones (e.g., LG and Samsung devices with custom operating systems). From the total of 8,000 users in our dataset, half are [...] understanding on user behavior, each of the two sets of users was further split into 4 different classes (low, medium, high, and extreme volume classes) based on the monthly volume of web accesses ([20,40), [40,140), [140,460), [460,∞) respectively). In the logs we analyzed, the information on each web access included unique user identifier, full path of accessed URL, and access timestamp [...]

Repeatability of Mobile Web [...]

We first study the repeatability of mobile web accesses. We compute the number of times that any user will be visiting a new unique URL (i.e., a full path URL that has not been visited before) in the next access. We show the results in Figure 1(a) across volume classes and device types. Approximately 40% to 60% of the smartphone users, for the low and extreme volume classes, are likely to visit a new URL 20% of the time. In other words, 80% of the URL visits are repeated visits for roughly half of the smartphone users. We also observe that users in higher volume classes are more likely to repeat visits than users in lower volume classes. Finally, although the trends are similar for feature phones, the overall repeated visits are higher when compared to smartphone users. Intuitively, feature phone users that have to interact with devices with constrained user interfaces and hardware capabilities are more likely to access the web in a more conservative way compared to smartphone users. They tend to explore the web less and focus more on accessing web content they really need to access.

To examine the repeatability of URL visits in more detail, we also compute, for each user, the cumulative URL volume for the top URLs that the user accesses. The result is shown in Figure 1(b). The numbers on the horizontal axis represent the top most frequently visited URLs (these URLs might be different across users). The vertical axis shows the cumulative fraction of the total URL visits that the number of the most frequently visited URLs is responsible for across users. Notably, across user classes, the most frequently visited URL accounts for about 50% of the overall user's URL visits. In other words, a single URL is responsible for approximately half of a typical user's URL requests. However, we [...] specifically, there are users for whom more than 90% of their total URL volume can be attributed to a single URL, and users for which the most frequent URL corresponds to less than 10% of their total volume. It is therefore important for any prefetching technique to take into account the individual characteristics of every user.

Targeted vs. Untargeted Web [...]

From the analysis so far we infer that the URLs that a user visits fall into two classes: there is a small number of frequently visited URLs, and a long tail of infrequently visited URLs. In order to [...] them as targeted and untargeted URLs. We define a targeted URL to be one which was visited by the user at least 5 times in a month. We chose this threshold by closely analyzing the user web access logs. We found that smaller thresholds, such as 3, could be too permissive and cause 50% of the extreme volume users to have more than 50 targeted URLs. On the other hand, higher thresholds such as 10 could be too aggressive and cause 30% of the low volume users to have 0 targeted URLs.

[Figure 2 plot: % of total URL [...] by device type and user volume.]

Figure 2. Breakdown of total URL accesses into targeted (URLs that have been accessed at least 5 times in a month) and untargeted. The white numbers in each bar plot represent the average number of unique targeted/untargeted URLs.

We analyze the volume of web accesses generated by targeted and untargeted URLs. As Figure 2 shows, although the targeted accesses are only slightly more than the untargeted accesses for the low volume users, the URL accesses are dominated by targeted accesses for the remaining classes of users. For example, for high volume users, targeted accesses account for 70% of the total smartphone URL accesses.

Figure 2 also provides more insight on the average number of unique targeted URLs across the different volume classes and device types (number indicated inside the bars in Figure 2). For low and medium volume users and for both feature phones and smartphones [...] 2 and 3 respectively. For high and extreme volume classes, it increases to 9 and 12 for feature phones and smartphones respectively. In other words, 2 to 12 unique URLs are, on average, responsible [...] mobile device to properly model when and which of the small number of targeted URLs will be accessed by the user is of paramount importance for an effective prefetching policy.

Timing of Mobile Web [...]

Web pages are constantly updated. For a prefetching technique to be effective, it needs to predict when the user's web accesses will take place. Hence, we study the temporal access patterns of our users.

[Figure 3 plots: fraction of URL [...] versus inter-access time (hours), one panel per volume class.]

Figure 3. Time elapsed between consecutive smartphone web accesses when all, targeted, or untargeted URLs are considered. The trends are identical for feature phones (not shown).

Figure 3 shows the elapsed time between consecutive smartphone web accesses for targeted, untargeted and combined (targeted and untargeted) URL visits. Approximately 35% to 50% of targeted URL visits across the 4 volume classes occur within 12 minutes (0.2 hours in Figure 3) of the last targeted URL visit. Additionally, 25% to 40% (depending on the volume class) of targeted URL visits take place within 6 minutes (0.1 hours in Figure 3) of the last targeted URL visit. Hence, a targeted URL access can serve as a good predictor of the time at which a next targeted URL access will occur.

[...] concentrated in time when compared to targeted URLs. Approximately 70% to 80% of untargeted URL visits (as opposed to 35% to 50% of targeted URL visits) take place within 12 minutes of the last untargeted URL visit. In other words, when mobile users ex[...] amount of time as compared to when visiting targeted content. We can leverage this information to improve the accuracy of prefetching and save battery resources by not prefetching targeted content when the user is about to visit untargeted URLs.

A Peek into Individual [...]

In addition to relative timing, we also study the role of absolute timing in mobile web browsing (e.g., time of day when URL accesses occur). In general, knowing when to expect URL accesses can drive content prefetching. Figure 4 shows the timestamps within a day of all URL accesses that 4 random smartphone users performed over 3 months.

[Figure 4 plots: URL [...] versus time of [...] (0 to 22), one panel per random user.]

Figure 4. Web accesses of 4 representative smartphone users in the high volume class. All accesses over the 3 months are projected [...] blue circles represent targeted URL visits.

The variance in mobile web access patterns across the 4 users [...] targeted URLs. Most likely, a web content prefetching technique will have difficulty in modeling these users' web browsing patterns accurately, as it has no way of predicting the untargeted accesses. Interestingly, however, users 1 and 4 access web pages from their phones at given time intervals within the day (e.g., user 4's accesses are only between 6am and 9am, 9pm and 11pm, and midnight and 2am).

On the other hand, for users 2 and 3, web accesses are dominated by a small set of targeted URLs (2 targeted URLs for user 2 and 7 for user 3). More importantly, the single most frequently visited targeted URL is responsible for the majority of that user's [...] access this single targeted URL periodically throughout the day. If a prefetching policy can predict when the targeted URL will be accessed by the users based on their periodic accesses, it can be very effective in providing an instant mobile browsing experience.

Summary and Key [...]

The analysis of real web access logs from 8,000 users has highlighted different aspects of mobile web browsing behavior that are critical to content prefetching:

  • A small number of targeted URLs is responsible for the majority of a user's URL visits. Predicting these targeted URL accesses can have a huge impact on the user's browsing experience.
  • Targeted URL accesses are clustered in time. Mobile users tend [...] recent targeted URL accesses can be strong indicators of future URL visits.
  • Untargeted URL accesses are significantly more clustered in time than targeted URL accesses. Recent untargeted URL accesses can help us decide about the type of future URL accesses (targeted vs. untargeted).
  • [...] prefetch content in a timely manner. Prefetching based only on past sequences (or, more generally, sets) of web accesses can be inefficient as it might not provide enough information to when to [...]
  • Mobile web browsing behavior across users can vary greatly in the type and number of accessed URLs as well as the timing of URL accesses. [...] advantage of the underlying spatiotemporal patterns of each individual user's web browsing behavior is required to enable timely and accurate content prefetching.

Content Prefetching As A Learning [...]

Mobile web browsing behavior exhibits several spatial and temporal properties. To enable timely prefetching of web content, the prefetching scheme needs to carefully model and learn all these different properties for each individual user. However, optimally [...]

Our approach is inspired by the web search community, [...] multiple hundreds or thousands of features are combined to rank [...] submits a query and the search engine ranks a set of URLs to show the most relevant ones higher up in the result page. The ranking problem is often formulated as a click prediction problem, where for every related URL, the engine has to estimate the probability of a user click on that URL. The higher the probability, the higher the rank of the URL. To create the click prediction model, search [...] every URL displayed to the user, various features are computed, encoding information about the user, the query, the URL or all of [...] clicked) or a non-click (if it was not clicked). The click prediction model is then trained using millions of these labeled feature vectors.

In web content prefetching, the web search click logs are replaced by the user's web access logs. The URLs are no longer web search results, but the targeted URLs identified in the user's web access logs. The click prediction model is now turned into a web access prediction model whose role is to assign, at any given time, an access probability to each targeted URL. The higher the probability, the more likely the user is to request access to this URL. The features used to train the prediction model are the most critical [...] the user's web access logs.

Conversely to web search, in web content prefetching the user does not explicitly submit a query. Thus, to determine when the [...] event-driven approach where the mobile device makes web access predictions as a result of certain user actions. For instance, the mobile device makes a prediction every time the user unlocks the phone, activates the browser or visits a URL. Depending on the resulting probabilities, the phone decides whether to prefetch any [...] approach. Offline, the mobile device records user web accesses, including automatic page refreshes, and periodically (e.g., weekly, monthly) uses this information to build a web access prediction model for the user. This model can be built on the mobile device [...] cloud. First, a set of features is extracted for every targeted URL in a user's web logs. The role of these features is to encode the underlying structure of mobile web browsing behavior in terms of [...] the web access logs are mapped to a set of feature vectors that are annotated as accesses or non-accesses.
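Two steps from the text lend themselves to a short sketch: identifying targeted URLs (visited at least 5 times in a month) and deciding, at a prediction event, which of them to prefetch from their access probabilities. The code below is an illustration under assumptions, not the paper's implementation: the function names, the `(timestamp, url)` log layout, and the probability dictionary are hypothetical, and the probabilities are supplied by hand rather than by a trained model. The 0.5 default threshold is the value quoted in this document's summary of the method.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

TARGETED_MIN_VISITS = 5  # from the text: at least 5 visits in a month


def targeted_urls(month_log: Iterable[Tuple[float, str]],
                  min_visits: int = TARGETED_MIN_VISITS) -> Set[str]:
    """Return the URLs one user visited at least `min_visits` times in one month."""
    counts = Counter(url for _, url in month_log)
    return {url for url, n in counts.items() if n >= min_visits}


def urls_to_prefetch(access_probabilities: Dict[str, float],
                     threshold: float = 0.5) -> List[str]:
    """Pick the targeted URLs whose predicted access probability exceeds the threshold.

    `access_probabilities` stands in for the output of a trained access
    prediction model (hypothetical here). Results are ordered from most to
    least likely so that the most probable page is fetched first.
    """
    chosen = [u for u, p in access_probabilities.items() if p > threshold]
    return sorted(chosen, key=lambda u: -access_probabilities[u])


# Example usage: one hypothetical month of accesses, then one prediction event
# (for instance, the user unlocking the phone).
log = [(float(i), "news") for i in range(6)] + [(10.0, "blog"), (11.0, "blog")]
print(targeted_urls(log))
print(urls_to_prefetch({"news": 0.9, "mail": 0.6, "blog": 0.2}))
```

Raising the threshold trades prefetch coverage for fewer wasted downloads, which is the lever the summary describes adjusting according to network and battery conditions.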
