Chapter 4: Strings and Regular Expressions

Uploaded by: 9*** (IP location: Hubei). Upload date: 2023-02-07. Format: PPT. Pages: 60.


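The earliest character encoding was ASCII (American Standard Code for Information Interchange), which covers only the 10 digits, the 26 uppercase and 26 lowercase English letters, and some other symbols. ASCII is a 7-bit code that defines 128 characters; each one is normally stored in a single 8-bit byte, and single-byte extensions of ASCII can encode at most 256 characters.

As information technology developed, every written language needed an encoding. Common encodings include UTF-8, GB2312, GBK, and CP936. Using a different encoding means that the same character may be written to a file as different bytes.

UTF-8 is the internationally used encoding. It represents English text in 8 bits (1 byte), which keeps it compatible with ASCII, and most Chinese characters in 24 bits (3 bytes); across all of Unicode a character takes 1 to 4 bytes. UTF-8 can encode the characters needed by essentially every country. GB2312 is a Chinese encoding defined in China that uses 1 byte for English and 2 bytes for Chinese. GBK is an extension of GB2312, and CP936 is Microsoft's encoding built on GBK. GB2312, GBK, and CP936 all use 2 bytes per Chinese character, whereas UTF-8 uses 3. Unicode is the basis for converting between encodings.

On Windows, strings typed at the keyboard through input() are GBK-encoded by default, while the encoding of string literals in a Python source file is declared with a #coding line, for example:

    #coding=utf-8
    #coding:GBK
    #-*-coding:utf-8-*-

Python 2.7.8:

    >>> s1 = '中國'
    >>> s1
    '\xd6\xd0\xb9\xfa'
    >>> len(s1)
    4
    >>> s2 = s1.decode('GBK')
    >>> s2
    u'\u4e2d\u56fd'
    >>> len(s2)
    2
    >>> s3 = s2.encode('UTF-8')
    >>> s3
    '\xe4\xb8\xad\xe5\x9b\xbd'
    >>> len(s3)
    6
    >>> print s1, s2, s3
    中國 中國 中國

Python 3.4.2:

    >>> s = '中國山東煙臺'
    >>> len(s)
    6
    >>> s = 'SDIBT'
    >>> len(s)
    5
    >>> s = '中國山東煙臺SDIBT'
    >>> len(s)
    11

4.1 Strings

In Python, strings are a sequence type. Besides the operations common to all sequences (including slicing), they support string-specific methods. Strings are immutable sequences.

Python string interning: when a short string is assigned to several variables, only one copy exists in memory and all the variables share it. Long strings are not interned.

To test whether a variable s is a string in Python 2, use isinstance(s, basestring). Before Python 3 there were two string types, str and unicode, both derived from basestring; Python 3 merged them into a single str type, so the check becomes isinstance(s, str). In Python 3, source files default to UTF-8 and fully support Chinese. A Python 3 str has an encode() method but no decode() method; decode() belongs to bytes.

4.1.1 String formatting

[Table: common format characters]

    >>> x = 1235
    >>> so = "%o" % x
    >>> so
    '2323'
    >>> sh = "%x" % x
    >>> sh
    '4d3'
    >>> se = "%e" % x
    >>> se
    '1.235000e+03'
    >>> chr(ord("3") + 1)
    '4'
    >>> "%s" % 65
    '65'
    >>> "%s" % 65333
    '65333'
    >>> "%d" % "555"
    Traceback (most recent call last):
      File "<pyshell#19>", line 1, in <module>
        "%d" % "555"
    TypeError: %d format: a number is required, not str

Formatting with the format() method (the examples use the Python 2 print statement; in Python 3 write print(...)):

    print "The number {0:,} in hex is: {0:#x}, the number {1} in oct is {1:#o}".format(5555, 55)
    print "The number {1:,} in hex is: {1:#x}, the number {0} in oct is {0:#o}".format(5555, 55)
    # The keyword names must match the placeholders, so qq= is passed for {qq}.
    print "my name is {name}, my age is {age}, and my QQ is {qq}".format(name="Dong Fuguo", age=37, qq="306467355")
    position = (5, 8, 13)
    print "X:{0[0]};Y:{0[1]};Z:{0[2]}".format(position)
    weather = [("Monday", "rain"), ("Tuesday", "sunny"), ("Wednesday", "sunny"),
               ("Thursday", "rain"), ("Friday", "Cloudy")]
    formatter = "Weather of '{0[0]}' is '{0[1]}'".format
    for item in map(formatter, weather):
        print item

4.1.2 Common string methods

find(), rfind(), index(), rindex(), count()

find() and rfind() return the position of the first and last occurrence of one string within a given range of another (the whole string by default), or -1 if it does not occur. index() and rindex() return the same positions but raise an exception if the substring is not found. count() returns how many times one string occurs in another.

    >>> s = "apple,peach,banana,peach,pear"
    >>> s.find("peach")
    6
    >>> s.find("peach", 7)
    19
    >>> s.find("peach", 7, 20)
    -1
    >>> s.rfind('p')
    25
    >>> s.index('p')
    1
    >>> s.index('pe')
    6
    >>> s.index('pear')
    25
    >>> s.index('ppp')
    Traceback (most recent call last):
      File "<pyshell#11>", line 1, in <module>
        s.index('ppp')
    ValueError: substring not found
    >>> s.count('p')
    5
    >>> s.count('pp')
    1
    >>> s.count('ppp')
    0

split(), rsplit(), partition(), rpartition()

split() and rsplit() split a string on a given separator, working from the left and from the right respectively, and return the pieces as a list. partition() and rpartition() split a string on a given separator into 3 parts: the text before the separator, the separator itself, and the text after it. If the separator does not occur, partition() returns the original string followed by two empty strings, and rpartition() returns two empty strings followed by the original string.

    >>> s = "apple,peach,banana,pear"
    >>> li = s.split(",")
    >>> li
    ['apple', 'peach', 'banana', 'pear']
    >>> s.partition(',')
    ('apple', ',', 'peach,banana,pear')
    >>> s.rpartition(',')
    ('apple,peach,banana', ',', 'pear')
    >>> s.rpartition('banana')
    ('apple,peach,', 'banana', ',pear')
    >>> s = "2014-10-31"
    >>> t = s.split("-")
    >>> print t
    ['2014', '10', '31']
    >>> print map(int, t)
    [2014, 10, 31]

If no separator is given, split() and rsplit() treat any run of whitespace (spaces, newlines, tabs, and so on) as a separator and return the resulting list.

    >>> s = 'hello world \n\n My name is Dong '
    >>> s.split()
    ['hello', 'world', 'My', 'name', 'is', 'Dong']
    >>> s = '\n\nhello world \n\n\n My name is Dong '
    >>> s.split()
    ['hello', 'world', 'My', 'name', 'is', 'Dong']
    >>> s = '\n\nhello\t\t world \n\n\n My name\t is Dong '
    >>> s.split()
    ['hello', 'world', 'My', 'name', 'is', 'Dong']

split() and rsplit() also accept a maximum number of splits, for example:

    >>> s = '\n\nhello\t\tworld\n\n\nMy name is Dong'
    >>> s.split(None, 1)
    ['hello', 'world\n\n\nMy name is Dong']
    >>> s.rsplit(None, 1)
    ['\n\nhello\t\tworld\n\n\nMy name is', 'Dong']
    >>> s.split(None, 2)
    ['hello', 'world', 'My name is Dong']
    >>> s.rsplit(None, 2)
    ['\n\nhello\t\tworld\n\n\nMy name', 'is', 'Dong']
    >>> s.split(None, 5)
    ['hello', 'world', 'My', 'name', 'is', 'Dong']
    >>> s.split(None, 6)
    ['hello', 'world', 'My', 'name', 'is', 'Dong']

String concatenation with join()

    >>> li = ["apple", "peach", "banana", "pear"]
    >>> sep = ","
    >>> s = sep.join(li)
    >>> s
    'apple,peach,banana,pear'

Concatenating many strings with + is not recommended; prefer join() (see CompareJoinAndPlusForStringConnection.py).

lower(), upper(), capitalize(), title(), swapcase()

These methods return the string converted to lowercase, converted to uppercase, with its first letter capitalized, with the first letter of every word capitalized, and with upper and lower case swapped.

    >>> s = "What is Your Name?"
    >>> s2 = s.lower()
    >>> s2
    'what is your name?'
    >>> s.upper()
    'WHAT IS YOUR NAME?'
    >>> s2.capitalize()
    'What is your name?'
    >>> s.title()
    'What Is Your Name?'
    >>> s.swapcase()
    'wHAT IS yOUR nAME?'

Find and replace with replace()

    >>> s = "中國，中國"
    >>> print s
    中國，中國
    >>> s2 = s.replace("中國", "中華人民共和國")
    >>> print s2
    中華人民共和國，中華人民共和國

maketrans() builds a translation table and translate() converts a string according to that table. The session below is Python 2; in Python 3 the table is built with str.maketrans().

    >>> import string
    >>> table = string.maketrans("abcdef123", "uvwxyz@#$")
    >>> s = "Python is a greate programming language. I like it!"
    >>> s.translate(table)
    'Python is u gryuty progrumming lunguugy. I liky it!'
    >>> s.translate(table, "gtm")    # the second argument lists characters to delete
    'Pyhon is u ryuy proruin lunuuy. I liky i!'

strip(), rstrip(), lstrip()

These methods remove whitespace, or any run of the specified characters, from both ends, the right end, or the left end of a string.

    >>> s = " abc  "
    >>> s2 = s.strip()
    >>> s2
    'abc'
    >>> "aaaassddf".strip("a")
    'ssddf'
    >>> "aaaassddf".strip("af")
    'ssdd'
    >>> "aaaassddfaaa".rstrip("a")
    'aaaassddf'
    >>> "aaaassddfaaa".lstrip("a")
    'ssddfaaa'

The built-in function eval()

    >>> eval("3+4")
    7
    >>> a = 3
    >>> b = 5
    >>> eval('a+b')
    8
    >>> import math
    >>> eval('help(math.sqrt)')
    Help on built-in function sqrt in module math:

    sqrt(...)
        sqrt(x)

        Return the square root of x.

    >>> eval('math.sqrt(3)')
    1.7320508075688772
    >>> eval('aa')
    Traceback (most recent call last):
      File "<pyshell#3>", line 1, in <module>
        eval('aa')
      File "<string>", line 1, in <module>
    NameError: name 'aa' is not defined

eval() executes whatever expression it is given, so passing it user input lets the user run arbitrary code:

    >>> a = input("Please input a value:")
    Please input a value:"__import__('os').startfile(r'C:\Windows\\notepad.exe')"
    >>> eval(a)
    >>> eval("__import__('os').system('md testtest')")

Membership tests

    >>> "a" in "abcde"
    True
    >>> "j" in "abcde"
    False

s.startswith(t) and s.endswith(t) test whether a string starts or ends with the given string; both also accept a tuple of candidates.

    >>> import os
    >>> [filename for filename in os.listdir(r'c:\\') if filename.endswith(('.bmp', '.jpg', '.gif'))]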
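The eval() examples in this section show that evaluating user input can run arbitrary code. One common mitigation, which the slides do not cover, is the standard library function ast.literal_eval(), which accepts only Python literals (numbers, strings, lists, tuples, dicts, and similar) and rejects function calls. The sketch below is written for Python 3; the helper name parse_literal and the sample inputs are assumptions made for illustration.

```python
import ast


def parse_literal(text):
    """Return (True, value) if text is a plain Python literal, else (False, None).

    parse_literal is a hypothetical helper, not part of the original slides.
    """
    try:
        return True, ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError):
        # Raised for anything that is not a literal, e.g. a function call.
        return False, None


# A list literal is parsed into a real list.
print(parse_literal("[2014, 10, 31]"))

# A call expression is rejected instead of being executed.
print(parse_literal("__import__('os').getcwd()"))
```

Returning a (success, value) pair keeps a rejected input distinguishable from the valid literal None.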
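center(), ljust(), rjust()

These methods return a new string of the given width in which the original string is centered, left-aligned, or right-aligned. If the width is greater than the length of the string, the extra positions are filled with the given character (a space by default).

    >>> 'Hello world!'.center(20)
    '    Hello world!    '
    >>> 'Hello world!'.center(20, '=')
    '====Hello world!===='
    >>> 'Hello world!'.ljust(20, '=')
    'Hello world!========'
    >>> 'Hello world!'.rjust(20, '=')
    '========Hello world!'

4.1.3 String constants

The session below is Python 2. In Python 3, string.letters, string.lowercase, and string.uppercase are named string.ascii_letters, string.ascii_lowercase, and string.ascii_uppercase.

    >>> import string
    >>> string.digits
    '0123456789'
    >>> string.punctuation
    '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    >>> string.letters
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
    >>> string.printable
    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
    >>> string.lowercase
    'abcdefghijklmnopqrstuvwxyz'
    >>> string.uppercase
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

How random password generation works

    >>> import string
    >>> x = string.digits + string.ascii_letters + string.punctuation
    >>> x
    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    >>> import random
    >>> ''.join([random.choice(x) for i in range(8)])
    'H\\{.#=)g'
    >>> ''.join([random.choice(x) for i in range(8)])
    '(CrZ[44M'
    >>> ''.join([random.choice(x) for i in range(8)])
    'o_?[M>iF'
    >>> ''.join([random.choice(x) for i in range(8)])
    'n<[I)5V@'

4.1.4 Mutable strings

In Python, strings are immutable objects and cannot be modified in place; changing the value means creating a new string object. If an in-place modifiable unicode data object is really needed, use an io.StringIO object or the array module.

    >>> import io
    >>> s = "Hello, world"
    >>> sio = io.StringIO(s)
    >>> sio.getvalue()
    'Hello, world'
    >>> sio.seek(7)
    7
    >>> sio.write("there!")
    6
    >>> sio.getvalue()
    'Hello, there!'

    >>> import array
    >>> a = array.array('u', s)
    >>> print(a)
    array('u', 'Hello, world')
    >>> a[0] = 'y'
    >>> print(a)
    array('u', 'yello, world')
    >>> a.tounicode()
    'yello, world'

4.2 Regular expressions

Regular expressions are a powerful tool for string processing. A regular expression uses a predefined pattern to match a class of strings that share a common feature, and it can carry out complex search and replace tasks quickly and accurately. In Python, the re module provides the functionality needed to work with regular expressions.

4.2.1 Regular expression metacharacters

[Tables: regular expression metacharacters]

The simplest regular expression is an ordinary string, which matches itself.

    '[pjc]ython' matches 'python', 'jython', and 'cython'.
    '[a-zA-Z0-9]' matches any single upper- or lowercase letter or digit.
    '[^abc]' matches any single character other than 'a', 'b', and 'c'.
    'python|perl' and 'p(ython|erl)' both match 'python' or 'perl'.
    A question mark after a subpattern makes it optional: r'(http://)?(www\.)?python\.org' matches only 'http://www.python.org', 'http://python.org', 'www.python.org', and 'python.org'.
    '^http' matches only strings that start with 'http'.
    (pattern)*: the pattern may repeat 0 or more times.
    (pattern)+: the pattern may repeat 1 or more times.
    (pattern){m,n}: the pattern may repeat m to n times.

More examples:

    '(a|b)*c': any number (including 0) of a or b, followed by the letter c.
    'ab{1,}': equivalent to 'ab+'; the letter a followed by one or more letters b.
    '^[a-zA-Z]{1}([a-zA-Z0-9._]){4,19}$': a string of length 5 to 20 that starts with a letter and may contain letters, digits, "_", and ".".
    '^(\w){6,20}$': a string of length 6 to 20 made of letters, digits, and underscores.
    '^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$': checks whether a string has the form of an IP address. It checks the format only, not that each number is in the range 0 to 255.
    '^(13[4-9]\d{8})|(15[01289]\d{8})$': checks whether a string is a China Mobile phone number. As written, ^ binds only to the first alternative and $ only to the second; '^(13[4-9]\d{8}|15[01289]\d{8})$' anchors both.
    '^[a-zA-Z]+$': checks whether a string contains only upper- and lowercase English letters.
    '^\w+@(\w+\.)+\w+$': checks whether a string is a valid e-mail address.
    '^(\-)?\d+(\.\d{1,2})?$': checks whether a string is a positive or negative number with at most 2 decimal places.
    '[\u4e00-\u9fa5]': matches the Chinese characters in a string.
    '^\d{18}|\d{15}$': checks whether a string has the format of an ID card number. The same anchoring problem applies; '^(\d{18}|\d{15})$' anchors both alternatives.
    '\d{4}-\d{1,2}-\d{1,2}': matches a date in the given format, for example 2016-1-31.
    '^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[,._]).{8,}$': checks whether a string is a strong password. It must contain an uppercase English letter, a lowercase English letter, a digit, and a special symbol (a comma, a period, or an underscore), and must be at least 8 characters long.
    "(?!.*[\'\"\/;=%?]).+": the match fails if the string contains ', ", /, ;, =, %, or ?. For the subpattern syntax see Table 4-4.
    '(.)\\1+': matches any character followed by one or more repetitions of itself.

An error that can occur when pattern text contains metacharacters:

    >>> import re
    >>> symbols = [',', '+', '-', '*', '/', '//', '**', '>>', '<<', '+=', '-=', '*=', '/=']
    >>> for i in symbols:
            pattern = re.compile(r'\s*' + i + r'\s*')

    Traceback (most recent call last):
      File "<pyshell#11>", line 2, in <module>
        pattern = re.compile(r'\s*' + i + r'\s*')
      File "C:\python27\lib\re.py", line 190, in compile
        return _compile(pattern, flags)
      File "C:\python27\lib\re.py", line 244, in _compile
        raise error, v # invalid expression
    error: multiple repeat
    >>> for i in symbols:
            pattern = re.compile(r'\s*' + re.escape(i) + r'\s*')

With re.escape() the loop runs without error.

4.2.2 Main functions of the re module

    compile(pattern[, flags]): create a pattern object
    search(pattern, string[, flags]): look for the pattern anywhere in the string
    match(pattern, string[, flags]): match the pattern at the start of the string
    findall(pattern, string[, flags]): list all matches of the pattern in the string
    split(pattern, string[, maxsplit=0]): split the string at the matches of the pattern
    sub(pat, repl, string[, count=0]): replace the matches of pat in the string with repl
    escape(string): escape all special regular expression characters in the string

flags can be any combination (joined with |) of re.I (ignore case), re.L (locale-dependent matching), re.M (multi-line mode), re.S (make the metacharacter . match newlines too), re.U (match Unicode characters), and re.X (ignore whitespace in the pattern and allow # comments).

4.2.3 Using the re module functions directly

    >>> import re
    >>> text = 'alpha. beta....gamma delta'
    >>> re.split('[\. ]+', text)
    ['alpha', 'beta', 'gamma', 'delta']
    >>> re.split('[\. ]+', text, maxsplit=2)    # split twice
    ['alpha', 'beta', 'gamma delta']
    >>> re.split('[\. ]+', text, maxsplit=1)    # split once
    ['alpha', 'beta....gamma delta']
    >>> pat = '[a-zA-Z]+'
    >>> re.findall(pat, text)                   # find all words
    ['alpha', 'beta', 'gamma', 'delta']

    >>> pat = '{name}'
    >>> text = 'Dear {name}...'
    >>> re.sub(pat, 'Mr.Dong', text)            # string replacement
    'Dear Mr.Dong...'
    >>> s = 'a s d'
    >>> re.sub('a|s|d', 'good', s)              # string replacement
    'good good good'
    >>> re.escape('http://www.python.org')      # escape special characters (Python 2 output)
    'http\\:\\/\\/www\\.python\\.org'

    >>> print re.match('done|quit', 'done')     # match succeeds
    <_sre.SRE_Match object at 0x00B121A8>
    >>> print re.match('done|quit', 'done!')    # match succeeds
    <_sre.SRE_Match object at 0x00B121A8>
    >>> print re.match('done|quit', 'doe!')     # match fails
    None
    >>> print re.match('done|quit', 'd!one!')   # match fails
    None

    >>> m = re.match(r'www\.(.*)\..{3}', 'www.python.org')
    >>> m.group(0)
    'www.python.org'
    >>> m.group(1)
    'python'
    >>> m.start(1)
    4
    >>> m.end(1)
    10
    >>> m.span(1)
    (4, 10)

Removing repeated spaces from a string:

    >>> import re
    >>> s = 'aaa     bb      c d e   fff    '
    >>> re.split('[\s]+', s)
    ['aaa', 'bb', 'c', 'd', 'e', 'fff', '']
    >>> re.split('[\s]+', s.strip())
    ['aaa', 'bb', 'c', 'd', 'e', 'fff']
    >>> ' '.join(re.split('[\s]+', s.strip()))
    'aaa bb c d e fff'
    >>> ' '.join(re.split('\s+', s.strip()))
    'aaa bb c d e fff'
    >>> re.sub('\s+', ' ', s.strip())
    'aaa bb c d e fff'
    >>> s
    'aaa     bb      c d e   fff    '
    >>> s.split()                               # a regular expression is not required
    ['aaa', 'bb', 'c', 'd', 'e', 'fff']
    >>> ' '.join(s.split())
    'aaa bb c d e fff'

Using metacharacters that begin with '\':

    >>> import re
    >>> example = 'ShanDong Institute of Business and Technology'
    >>> re.findall('\\ba.+?\\b', example)       # whole words starting with a
    ['and']
    >>> re.findall('\\Bo.+?\\b', example)       # in words containing o, the rest of the word from the first non-initial o
    ['ong', 'ology']
    >>> re.findall('\\b\w.+?\\b', example)      # all words
    ['ShanDong', 'Institute', 'of', 'Business', 'and', 'Technology']
    >>> re.findall(r'\b\w.+?\b', example)       # a raw string needs fewer backslashes
    ['ShanDong', 'Institute', 'of', 'Business', 'and', 'Technology']
    >>> re.findall('\d\.\d\.\d', 'Python 2.7.8')    # numbers of the form x.x.x
    ['2.7.8']
    >>> re.split('\s', example)                 # split on any whitespace character
    ['ShanDong', 'Institute', 'of', 'Business', 'and', 'Technology']

4.2.4 Using regular expression objects

First compile the regular expression into a pattern object with re.compile(), then process strings with the methods of that object. Reusing a compiled pattern object can speed up string processing when the same pattern is applied many times.

The match(string[, pos[, endpos]]) method of a pattern object searches at the start of the string or at the given position; the pattern must occur exactly there. The search(string[, pos[, endpos]]) method searches the whole string. The findall(string[, pos[, endpos]]) method returns a list of all substrings that match the regular expression.

    >>> import re
    >>> example = 'ShanDong Institute of Business and Technology'
    >>> pattern = re.compile(r'\bB\w+\b')           # words starting with B
    >>> pattern.findall(example)
    ['Business']
    >>> pattern = re.compile(r'\w+g\b')             # words ending with g
    >>> pattern.findall(example)
    ['ShanDong']
    >>> pattern = re.compile(r'\b[a-zA-Z]{3}\b')    # words exactly 3 letters long
    >>> pattern.findall(example)
    ['and']
    >>> pattern.match(example)                      # matches only at the start, so it fails and returns None
    >>> pattern.search(example)                     # searches the whole string, so it succeeds
    <_sre.SRE_Match object at 0x01228EC8>
    >>> pattern = re.compile(r'\b\w*a\w*\b')        # all words containing the letter a
    >>> pattern.findall(example)
    ['ShanDong', 'and']

Methods for replacing string content:

    sub(repl, string[, count=0]) --> newstring
    Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.

    subn(repl, string[, count=0]) --> (newstring, number of subs)
    Return the tuple (new_string, number_of_subs_made) found by replacing the leftmost non-overlapping occurrences of pattern with the replacement repl.

    >>> example = '''Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.'''
    >>> pattern = re.compile(r'\bb\w*\b', re.I)
    >>> print pattern.sub('*', example)
    * is * than ugly.
    Explicit is * than implicit.
    Simple is * than complex.
    Complex is * than complicated.
    Flat is * than nested.
    Sparse is * than dense.
    Readability counts.
    >>> print pattern.sub('*', example, 1)
    * is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.
    >>> pattern = re.compile(r'\bb\w*\b')
    >>> print pattern.sub('*', example, 1)
    Beautiful is * than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.

Splitting strings: split(string[, maxsplit=0]) --> list

    >>> example = r'one,two,three.four/five\six?seven[eight]nine|ten'
    >>> pattern = re.compile(r'[,./\\?[\]\|]')
    >>> pattern.split(example)
    ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']
    >>> example = r'one1two2three3four4five5six6seven7eight8nine9ten'
    >>> pattern = re.compile(r'\d+')
    >>> pattern.split(example)
    ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']
    >>> example = r'one two  three   four,five.six.seven,eight,nine9ten'
    >>> pattern = re.compile(r'[\s,.\d]+')
    >>> pattern.split(example)
    ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']

4.2.5 Subpatterns and match objects

Parentheses () define a subpattern: the content inside them is treated as a single unit. For example, '(red)+' matches 'redred', 'redredred', and other repetitions of 'red'.

    >>> telNumber = '''Suppose my Phone No. is 0535-1234567, yours is 010-12345678, his is 025-87654321.'''
    >>> pattern = re.compile(r'(\d{3,4})-(\d{7,8})')
    >>> pattern.findall(telNumber)
    [('0535', '1234567'), ('010', '12345678'), ('025', '87654321')]

The match() and search() methods of a pattern object return a match object on success. The main methods of a match object are group(), groups(), groupdict(), start(), end(), and span().

    import re

    telNumber = '''Suppose my Phone No. is 0535-1234567, yours is 010-12345678, his is 025-87654321.'''
    pattern = re.compile(r'(\d{3,4})-(\d{7,8})')
    index = 0
    while True:
        # Search again from where the previous match ended.
        matchResult = pattern.search(telNumber, index)
        if not matchResult:
            break
        print '-' * 30
        print 'Success:'
        # Group 0 is the whole match; groups 1 and 2 are the subpatterns.
        for i in range(3):
            print 'Searched content:', matchResult.group(i), \
                  'Start from:', matchResult.start(i), 'End at:', matchResult.end(i), \
                  'Its span is:', matchResult.span(i)
        index = matchResult.end(2)

Extended subpattern syntax:

    (?P<groupname>...): gives the subpattern a name
    (?iLmsux): sets matching flags; any combination of the letters, each with the same meaning as the corresponding compile flag
    (?:...): matches the subexpression without capturing it
    (?P=groupname): refers back to the earlier subpattern named groupname
    (?#...): a comment
    (?=...): lookahead, placed after an expression; matches only if the content after = follows in the string, without including that content in the result
    (?!...): negative lookahead, placed after an expression; matches only if the content after ! does not follow, without including it in the result
    (?<=...): lookbehind, placed before an expression; the same idea as (?=...) but checks the text that precedes the match
    (?<!...): negative lookbehind, placed before an expression; the same idea as (?!...) but checks the text that precedes the match

    >>> import re
    >>> exampleString = '''There should be one-- and preferably only one --obvious way to do it.
    Although tha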
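
The named-group and lookaround forms from the extended subpattern syntax can be sketched in Python 3 as follows. The group names area and number are assumptions chosen for illustration, and the sample text is reconstructed from the phone numbers in the findall() output of section 4.2.5; re.finditer() is used here in place of the manual search loop.

```python
import re

# Sample text reconstructed from the findall() output in section 4.2.5.
text = ("Suppose my Phone No. is 0535-1234567, "
        "yours is 010-12345678, his is 025-87654321.")

# (?P<name>...) names a capturing group.
# (?<!\d) and (?!\d) are negative lookbehind/lookahead: they require that
# no digit touches the match on either side, without consuming any text.
pattern = re.compile(r"(?<!\d)(?P<area>\d{3,4})-(?P<number>\d{7,8})(?!\d)")

# finditer() yields one match object per non-overlapping match.
records = [(m.group("area"), m.group("number"), m.span())
           for m in pattern.finditer(text)]

for area, number, span in records:
    print(area, number, span)

# groupdict() returns the named groups of a single match as a dictionary.
first = pattern.search(text).groupdict()
print(first)
```

Named groups make the code that consumes a match independent of the order of the parentheses in the pattern.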
