Building a Decision Tree with the C4.5 Algorithm: Java Exercise
[Decision Trees] Building a decision tree with the C4.5 algorithm, Java exercise

I wrote the following program for practice. It is not guaranteed to be correct, and I made no storage optimizations. If you find a problem, please leave a comment. Please link back when reposting.

The current attributes are:

    age  income  student  credit_rating

The current data set (the last column is TARGET_VALUE):

    youth        high    no   fair       no
    youth        high    no   excellent  no
    middle_aged  high    no   fair       yes
    senior       low     yes  fair       yes
    senior       low     yes  excellent  no
    middle_aged  low     yes  excellent  yes
    youth        medium  no   fair       no
    youth        low     yes  fair       yes
    senior       medium  yes  fair       yes
    youth        medium  yes  excellent  yes
    middle_aged  high    yes  fair       yes
    senior       medium  no   excellent  no

C4.5 tree-building class

    package C45Test;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    public class DecisionTree {

        public TreeNode createDT(List<ArrayList<String>> data, List<String> attributeList) {
            // Print the rows and attributes handled by this call.
            System.out.println("Current DATA:");
            for (int i = 0; i < data.size(); i++) {
                ArrayList<String> temp = data.get(i);
                for (int j = 0; j < temp.size(); j++) {
                    System.out.print(temp.get(j) + " ");
                }
                System.out.println();
            }
            System.out.println("Current ATTR:");
            for (int i = 0; i < attributeList.size(); i++) {
                System.out.print(attributeList.get(i) + " ");
            }
            System.out.println();

            TreeNode node = new TreeNode();
            // If every row has the same target value, this node is a leaf.
            String result = InfoGain.IsPure(InfoGain.getTarget(data));
            if (result != null) {
                node.setNodeName("leafNode");
                node.setTargetFunValue(result);
                return node;
            }
            if (attributeList.size() == 0) {
                // result is null here because the targets are not pure.
                node.setTargetFunValue(result);
                return node;
            } else {
                InfoGain gain = new InfoGain(data, attributeList);
                double maxGain = 0.0;
                // attrIndex stays -1 if no attribute has a positive gain ratio.
                int attrIndex = -1;
                for (int i = 0; i < attributeList.size(); i++) {
                    double tempGain = gain.getGainRatio(i);
                    if (maxGain < tempGain) {
                        maxGain = tempGain;
                        attrIndex = i;
                    }
                }
                System.out.println("Attribute with the maximum gain ratio: "
                        + attributeList.get(attrIndex));
                node.setAttributeValue(attributeList.get(attrIndex));
                List<ArrayList<String>> resultData = null;
                Map<String, Long> attrvalueMap = gain.getAttributeValue(attrIndex);
                for (Map.Entry<String, Long> entry : attrvalueMap.entrySet()) {
                    resultData = gain.getData4Value(entry.getKey(), attrIndex);
                    TreeNode leafNode = null;
                    System.out.println("Current branch: " + attributeList.get(attrIndex)
                            + " = " + entry.getKey());
                    if (resultData.size() == 0) {
                        leafNode = new TreeNode();
                        leafNode.setNodeName(attributeList.get(attrIndex));
                        leafNode.setTargetFunValue(result);
                        leafNode.setAttributeValue(entry.getKey());
                    } else {
                        // Drop the chosen attribute's column, then recurse on the subset.
                        for (int j = 0; j < resultData.size(); j++) {
                            resultData.get(j).remove(attrIndex);
                        }
                        ArrayList<String> resultAttr = new ArrayList<String>(attributeList);
                        resultAttr.remove(attrIndex);
                        leafNode = createDT(resultData, resultAttr);
                    }
                    node.getChildTreeNode().add(leafNode);
                    node.getPathName().add(entry.getKey());
                }
            }
            return node;
        }

        class TreeNode {
            private String attributeValue;
            private List<TreeNode> childTreeNode;
            private List<String> pathName;
            private String targetFunValue;
            private String nodeName;

            public TreeNode(String nodeName) {
                this.nodeName = nodeName;
                this.childTreeNode = new ArrayList<TreeNode>();
                this.pathName = new ArrayList<String>();
            }

            public TreeNode() {
                this.childTreeNode = new ArrayList<TreeNode>();
                this.pathName = new ArrayList<String>();
            }

            public String getAttributeValue() {
                return attributeValue;
            }

            public void setAttributeValue(String attributeValue) {
                this.attributeValue = attributeValue;
            }

            public List<TreeNode> getChildTreeNode() {
                return childTreeNode;
            }

            public void setChildTreeNode(List<TreeNode> childTreeNode) {
                this.childTreeNode = childTreeNode;
            }

            public String getTargetFunValue() {
                return targetFunValue;
            }

            public void setTargetFunValue(String targetFunValue) {
                this.targetFunValue = targetFunValue;
            }

            public String getNodeName() {
                return nodeName;
            }

            public void setNodeName(String nodeName) {
                this.nodeName = nodeName;
            }

            public List<String> getPathName() {
                return pathName;
            }

            public void setPathName(List<String> pathName) {
                this.pathName = pathName;
            }
        }
    }

Gain-ratio calculation class (the logarithms use base e, not base 2)

    package C45Test;

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // C4.5 implementation
    public class InfoGain {
        private List<ArrayList<String>> data;
        private List<String> attribute;

        public InfoGain(List<ArrayList<String>> data, List<String> attribute) {
            // Copy the rows and attribute names so later edits by the caller do not affect this object.
            this.data = new ArrayList<ArrayList<String>>();
            for (int i = 0; i < data.size(); i++) {
                List<String> temp = data.get(i);
                ArrayList<String> t = new ArrayList<String>();
                for (int j = 0; j < temp.size(); j++) {
                    t.add(temp.get(j));
                }
                this.data.add(t);
            }
            this.attribute = new ArrayList<String>();
            for (int k = 0; k < attribute.size(); k++) {
                this.attribute.add(attribute.get(k));
            }
            /* this.data = data; this.attribute = attribute; */
        }

        // Entropy of the target column.
        public double getEntropy() {
            Map<String, Long> targetValueMap = getTargetValue();
            Set<String> targetkey = targetValueMap.keySet();
            double entropy = 0.0;
            for (String key : targetkey) {
                double p = MathUtils.div((double) targetValueMap.get(key), (double) data.size());
                entropy += (-1) * p * Math.log(p);
            }
            return entropy;
        }

        // InfoA: the expected entropy after splitting on the attribute.
        public double getInfoAttribute(int attributeIndex) {
            Map<String, Long> attributeValueMap = getAttributeValue(attributeIndex);
            double infoA = 0.0;
            for (Map.Entry<String, Long> entry : attributeValueMap.entrySet()) {
                int size = data.size();
                double attributeP = MathUtils.div((double) entry.getValue(), (double) size);
                Map<String, Long> targetValueMap =
                        getAttributeValueTargetValue(entry.getKey(), attributeIndex);
                long totalCount = 0L;
                for (Map.Entry<String, Long> entryValue : targetValueMap.entrySet()) {
                    totalCount += entryValue.getValue();
                }
                double valueSum = 0.0;
                for (Map.Entry<String, Long> entryTargetValue : targetValueMap.entrySet()) {
                    double p = MathUtils.div((double) entryTargetValue.getValue(),
                            (double) totalCount);
                    valueSum += Math.log(p) * p;
                }
                infoA += (-1) * attributeP * valueSum;
            }
            return infoA;
        }

        // Target-value counts among the rows that have the given attribute value.
        public Map<String, Long> getAttributeValueTargetValue(String attributeName,
                int attributeIndex) {
            Map<String, Long> targetValueMap = new HashMap<String, Long>();
            Iterator<ArrayList<String>> iterator = data.iterator();
            while (iterator.hasNext()) {
                List<String> tempList = iterator.next();
                if (attributeName.equalsIgnoreCase(tempList.get(attributeIndex))) {
                    int size = tempList.size();
                    String key = tempList.get(size - 1);
                    Long value = targetValueMap.get(key);
                    targetValueMap.put(key, value != null ? ++value : 1L);
                }
            }
            return targetValueMap;
        }

        // Number of rows for each value of the attribute.
        public Map<String, Long> getAttributeValue(int attributeIndex) {
            Map<String, Long> attributeValueMap = new HashMap<String, Long>();
            for (ArrayList<String> note : data) {
                String key = note.get(attributeIndex);
                Long value = attributeValueMap.get(key);
                attributeValueMap.put(key, value != null ? ++value : 1L);
            }
            return attributeValueMap;
        }

        // Copies of the rows whose attribute at attrIndex equals attrValue.
        public List<ArrayList<String>> getData4Value(String attrValue, int attrIndex) {
            List<ArrayList<String>> resultData = new ArrayList<ArrayList<String>>();
            Iterator<ArrayList<String>> iterator = data.iterator();
            for (; iterator.hasNext();) {
                ArrayList<String> templist = iterator.next();
                if (templist.get(attrIndex).equalsIgnoreCase(attrValue)) {
                    ArrayList<String> temp = (ArrayList<String>) templist.clone();
                    resultData.add(temp);
                }
            }
            return resultData;
        }

        // Gain ratio.
        public double getGainRatio(int attributeIndex) {
            return MathUtils.div(getGain(attributeIndex), getSplitInfo(attributeIndex));
        }

        // Information gain.
        public double getGain(int attributeIndex) {
            return getEntropy() - getInfoAttribute(attributeIndex);
        }

        // Penalty factor (split information).
        public double getSplitInfo(int attributeIndex) {
            Map<String, Long> attributeValueMap = getAttributeValue(attributeIndex);
            double splitA = 0.0;
            for (Map.Entry<String, Long> entry : attributeValueMap.entrySet()) {
                int size = data.size();
                double attributeP = MathUtils.div((double) entry.getValue(), (double) size);
                splitA += attributeP * Math.log(attributeP) * (-1);
            }
            return splitA;
        }

        // Counts of each distinct target value in the current data set.
        public Map<String, Long> getTargetValue() {
            Map<String, Long> targetValueMap = new HashMap<String, Long>();
            Iterator<ArrayList<String>> iterator = data.iterator();
            while (iterator.hasNext()) {
                List<String> tempList = iterator.next();
                String key = tempList.get(tempList.size() - 1);
                Long value = targetValueMap.get(key);
                targetValueMap.put(key, value != null ? ++value : 1L);
            }
            return targetValueMap;
        }

        // The TARGET value (last column) of every row.
        public static List<String> getTarget(List<ArrayList<String>> data) {
            List<String> list = new ArrayList<String>();
            for (ArrayList<String> temp : data) {
                int index = temp.size() - 1;
                String value = temp.get(index);
                list.add(value);
            }
            return list;
        }

        // Returns the single target value if the set is 100% pure, otherwise null.
        public static String IsPure(List<String> list) {
            Set<String> set = new HashSet<String>();
            for (String name : list) {
                set.add(name);
            }
            if (set.size() > 1) {
                return null;
            }
            Iterator<String> iterator = set.iterator();
            return iterator.next();
        }
    }

Test class. The attributes and the data set listed above are read into two Lists.

    package C45Test;

    import java.util.ArrayList;
    import java.util.List;

    import C45Test.DecisionTree.TreeNode;

    public class MainC45 {
        private static final List<ArrayList<String>> dataList =
                new ArrayList<ArrayList<String>>();
        private static final List<String> attributeList = new ArrayList<String>();

        public static void main(String[] args) {
            DecisionTree dt = new DecisionTree();
            // configData() and configAttribute() are not included in the source listing.
            TreeNode node = dt.createDT(configData(), configAttribute());
            System.out.println();
        }
    }

Big-number arithmetic utility class

    package C45Test;

    import java.math.BigDecimal;

    public abstract class MathUtils {
        // Default scale (number of decimal places) for division.
        private static final int DIV_SCALE = 10;

        // Limited by the precision of double.
        public static double add(double value1, double value2) {
            BigDecimal big1 = new BigDecimal(String.valueOf(value1));
            BigDecimal big2 = new BigDecimal(String.valueOf(value2));
            return big1.add(big2).doubleValue();
        }

        // ... (the listing is cut off here; the remaining methods, including div, are not shown)
    }
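As a cross-check on the InfoGain class, here is a minimal, self-contained sketch of the same quantities on the 12-row data set: Gain(A) = Info(D) - InfoA(D) and GainRatio(A) = Gain(A) / SplitInfo(A), with natural logarithms as in the listing. The class and method names (GainRatioSketch, entropy, partition, gainRatio, bestAttribute) are hypothetical and not part of the original program. It uses plain double division instead of MathUtils, and it returns 0 when SplitInfo is 0, which is an assumption of this sketch rather than the original behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class: a compact re-computation of the gain ratio.
class GainRatioSketch {
    static final String[] ATTRIBUTES = {"age", "income", "student", "credit_rating"};

    // The data set from the article; the last column is the target value.
    static final String[][] ROWS = {
        {"youth", "high", "no", "fair", "no"},
        {"youth", "high", "no", "excellent", "no"},
        {"middle_aged", "high", "no", "fair", "yes"},
        {"senior", "low", "yes", "fair", "yes"},
        {"senior", "low", "yes", "excellent", "no"},
        {"middle_aged", "low", "yes", "excellent", "yes"},
        {"youth", "medium", "no", "fair", "no"},
        {"youth", "low", "yes", "fair", "yes"},
        {"senior", "medium", "yes", "fair", "yes"},
        {"youth", "medium", "yes", "excellent", "yes"},
        {"middle_aged", "high", "yes", "fair", "yes"},
        {"senior", "medium", "no", "excellent", "no"},
    };

    // Entropy (base e) of the target column of the given rows.
    static double entropy(List<String[]> rows) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] row : rows) {
            counts.merge(row[row.length - 1], 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / rows.size();
            h -= p * Math.log(p);
        }
        return h;
    }

    // Groups the rows by their value in column attr.
    static Map<String, List<String[]>> partition(List<String[]> rows, int attr) {
        Map<String, List<String[]>> parts = new LinkedHashMap<>();
        for (String[] row : rows) {
            parts.computeIfAbsent(row[attr], k -> new ArrayList<>()).add(row);
        }
        return parts;
    }

    // Gain ratio of column attr: (Info(D) - InfoA(D)) / SplitInfo(A).
    static double gainRatio(List<String[]> rows, int attr) {
        double infoA = 0.0;
        double splitInfo = 0.0;
        for (List<String[]> part : partition(rows, attr).values()) {
            double weight = (double) part.size() / rows.size();
            infoA += weight * entropy(part);
            splitInfo -= weight * Math.log(weight);
        }
        if (splitInfo == 0.0) {
            return 0.0; // assumption: a single-valued attribute is never chosen
        }
        return (entropy(rows) - infoA) / splitInfo;
    }

    // Index of the attribute with the largest positive gain ratio, or -1 if none.
    static int bestAttribute(List<String[]> rows) {
        int best = -1;
        double bestRatio = 0.0;
        for (int i = 0; i < ATTRIBUTES.length; i++) {
            double ratio = gainRatio(rows, i);
            if (ratio > bestRatio) {
                bestRatio = ratio;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(ROWS);
        for (int i = 0; i < ATTRIBUTES.length; i++) {
            System.out.printf("%-14s gain ratio = %.4f%n", ATTRIBUTES[i], gainRatio(rows, i));
        }
        System.out.println("best attribute: " + ATTRIBUTES[bestAttribute(rows)]);
    }
}
```

Running main prints one gain ratio per attribute for the full data set, which can be compared with the attribute that createDT reports at the root. The explicit SplitInfo check matters because an attribute with a single value in a subset has SplitInfo 0, and dividing by it would not give a usable ratio.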
