
Machine Learning: The K-Nearest Neighbors Algorithm

Date: 2023-03-25 22:28:40 | Python

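1. What is the k-nearest neighbors algorithm?

Given a training set and a new input instance, find the k training instances nearest to the input. If the majority of those k instances belong to a certain class, assign the input instance to that class. As the figure shows: a new point ❓ is given, and we must decide whether it belongs to class A or class B.

That raises the central question of the k-nearest neighbors algorithm: how do we decide which neighbors are nearest? By computing distances! So how do we compute a distance?

2. Distance measures

(1) Euclidean distance

$$d(x,y)=\sqrt{\displaystyle\sum_{i=1}^{n}(y_i-x_i)^2}$$

(2) Manhattan distance

$$d(x,y)=\displaystyle\sum_{i=1}^{n}{|y_i-x_i|}$$

(3) Minkowski distance

$$d(x,y)=\left(\displaystyle\sum_{i=1}^{n}{|y_i-x_i|^p}\right)^\frac{1}{p}$$

With p=1 the Minkowski distance reduces to the Manhattan distance, and with p=2 to the Euclidean distance.

Now we know how to measure distance, but another question arises: how should we choose the value of K?

3. Choosing the value of k

Choosing a small k means predicting from the training samples in a small neighborhood. The approximation error of "learning" decreases, because only training samples close (similar) to the input instance affect the prediction. The drawback is that the estimation error of "learning" increases, and the prediction becomes very sensitive to the neighboring instance points. If a neighboring instance point happens to be noise, the prediction will be wrong. In other words, decreasing k makes the overall model more complex and prone to overfitting.

Choosing a large k means predicting from the training instances in a large neighborhood. The advantage is a smaller estimation error; the drawback is a larger approximation error. Training instances far from (dissimilar to) the input instance now also influence the prediction and can make it wrong. Increasing k makes the overall model simpler.

If k=N, then whatever the input instance is, it is simply predicted to belong to the class with the most training instances. The model is then too simple and completely ignores the large amount of useful information in the training samples, which is not advisable.

In practice, k is usually set to a fairly small value, and cross-validation is typically used to select the optimal k.

4. Algorithm procedure (python)

Step 1: Determine the parameters the KNN algorithm needs: 1. the input instance to be classified; 2. the value of K; 3. the distance metric; 4. the training set data; 5. the training set labels.

Step 2: Compute the distances. The core of KNN is computing the distance from the input to every training sample and then selecting the K closest ones. This is why we sometimes need to expand the dimensions of the input so that its shape matches that of the training set.

Step 3: Evaluate the accuracy of KNN.

5. Evaluating KNN

Advantages

Easy to implement: given the algorithm's simplicity and accuracy, it is one of the first classifiers a new data scientist learns.
Adapts easily: as new training samples are added, the algorithm adjusts to the new data, because all training data is stored in memory.
Few hyperparameters: KNN only requires a value of k and a distance metric, which is few compared with other machine learning algorithms.

Disadvantages

Does not scale well: because KNN is a lazy algorithm, it uses more memory and data storage than other classifiers. This can be costly in both time and money. More memory and storage drive up business expenses, and more data can take longer to compute. Although data structures such as Ball-Tree have been created to address the computational inefficiency, a different classifier may be preferable depending on the business problem.
Curse of dimensionality: KNN tends to fall victim to the curse of dimensionality, meaning it performs poorly on high-dimensional inputs. This is sometimes called the peaking phenomenon: once the algorithm reaches the optimal number of features, additional features increase the number of classification errors, especially when the sample size is small.
Prone to overfitting: because of the curse of dimensionality, KNN is also more prone to overfitting. Feature selection and dimensionality reduction techniques are used to guard against this.

6. Python implementation (handwritten digit recognition)

Packages needed (you can try PyTorch to speed up the computation; the idea is simple, since PyTorch interoperates almost seamlessly with NumPy):

```python
import operator
from os import listdir

import numpy as np
# import torch  # optional: only needed if you port the computation to PyTorch
```

Core code

```python
def KNN(inx, dataset, labels, k, distances_way):
    """
    inx: the input sample to be classified
    dataset: the training samples, one per row
    labels: label vector; its length equals the number of rows of dataset
    k: the number of nearest neighbors to use
    distances_way: the distance metric ('o', 'man' or 'min')
    """
    # Calculate the distance
    datasize_h = dataset.shape[0]
    # Expand inx into a matrix with the same shape as the dataset
    inx = np.tile(inx, (datasize_h, 1))
    diffmat = inx - dataset
    if distances_way == 'o':        # Euclidean distance
        distance = (diffmat ** 2).sum(axis=1) ** 0.5
    elif distances_way == 'man':    # Manhattan distance
        distance = abs(diffmat).sum(axis=1)
    elif distances_way == 'min':    # Minkowski distance
        p = int(input('input p value:'))
        # Raise |difference| to the power p (not 2), then take the p-th root
        distance = (abs(diffmat) ** p).sum(axis=1) ** (1 / p)
    else:
        raise ValueError("distances_way must be 'o', 'man' or 'min'")
    # Indices of the training samples, sorted by increasing distance
    distance_sort = distance.argsort()
    # Count the labels of the k nearest samples in a hash table
    dic = {}
    for i in range(k):
        diff_label = labels[distance_sort[i]]
        dic[diff_label] = dic.get(diff_label, 0) + 1
    dic_sort = sorted(dic.items(), key=operator.itemgetter(1), reverse=True)
    return dic_sort[0][0]
```

Handwritten digit recognition (the example comes from "Machine Learning in Action" (《机器学习实战》), with some changes of my own)

```python
TRAIN_DIR = 'D:/Github-code/python-gogogo/机器学习/分类问题/K近邻算法/手写数据文件/training_handwriting'
TEST_DIR = 'D:/Github-code/python-gogogo/机器学习/分类问题/K近邻算法/手写数据文件/test_handwriting'


# Convert the 32*32 text file into a 1*1024 matrix.
# Calling .reshape(32, 32) on the result gives the 32*32 matrix form instead.
def data_read(path):
    data_use = np.zeros((1, 1024))
    with open(path) as data:
        for i in range(32):
            data_line = data.readline()  # read one line at a time
            for j in range(32):
                data_use[0, 32 * i + j] = int(data_line[j])
    return data_use


def handwritingClassTest():
    """
    hwLabels: the true values of the handwritten digits
    m: number of files in the training set
    mTest: number of files in the test set
    trainingMat: stores every training sample as a 1-D row
    """
    hwLabels = []
    trainingFileList = listdir(TRAIN_DIR)
    m = len(trainingFileList)  # m = 1934
    trainingMat = np.zeros((m, 1024))
    for i in range(m):
        # The part of the file name before '_' is the digit (the label)
        fileNameStr = trainingFileList[i]
        fileStr = fileNameStr.split('.')[0]
        classNumStr = int(fileStr.split('_')[0])
        hwLabels.append(classNumStr)
        trainingMat[i, :] = data_read('%s/%s' % (TRAIN_DIR, fileNameStr))

    testFileList = listdir(TEST_DIR)
    errorCount = 0.0
    mTest = len(testFileList)
    for i in range(mTest):
        fileNameStr = testFileList[i]
        fileStr = fileNameStr.split('.')[0]
        classNumStr = int(fileStr.split('_')[0])
        vectorUnderTest = data_read('%s/%s' % (TEST_DIR, fileNameStr))
        classifierResult = KNN(vectorUnderTest, trainingMat, hwLabels, 3, 'o')
        print("KNN classification result: %d, actual result: %d" % (classifierResult, classNumStr))
        if classifierResult != classNumStr:
            errorCount += 1.0
    print("\nNumber of errors: %d" % errorCount)
    print("\nError rate: %f" % (errorCount / float(mTest)))
```

7. Personal thoughts

Since PyTorch came up along the way and what we built is "handwritten digit recognition", why not take a photo of your own handwriting and try to recognize it? (Binarizing the image is the only approach I can think of for now, since I don't know deep learning very well.) With KNN the results will probably not be as good as with a CNN, and this would likely also involve the OpenCV library.

If you have any ideas, feel free to ping me in the comments, haha. 😀😀 If the article falls short anywhere, please leave a comment.

References

1. https://www.ibm.com/cn-zh/topics/knn#:~:text=k%2D%E6%9C%80%E8...,%E6%9C%80%E5%B8%B8%E8%A1%A8%E7%A4%BA%E7%9A%84%E6%A0%87%E7%AD%BE%E3%80%82
2. Li Hang (李航), "Statistical Learning Methods" (《统计学习方法》), 2nd edition.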
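
Supplementary sketch for the section on choosing k: the article says cross-validation is usually used to pick the best k but gives no code for it. The following is one possible way to do that with plain NumPy. The names `knn_predict` and `choose_k`, the fold count, and the synthetic two-class data are all assumptions made for illustration; none of them come from the article, and the Euclidean metric is hard-coded to keep the example short.

```python
import numpy as np


def knn_predict(x, train_X, train_y, k):
    """Majority vote among the k training points closest to x (Euclidean)."""
    dist = np.sqrt(((train_X - x) ** 2).sum(axis=1))
    nearest = np.argsort(dist)[:k]
    values, counts = np.unique(train_y[nearest], return_counts=True)
    return values[np.argmax(counts)]


def choose_k(X, y, candidates, n_folds=5, seed=0):
    """Return (best_k, error_by_k) estimated with n_folds-fold cross-validation."""
    rng = np.random.default_rng(seed)
    # Shuffle the sample indices once and split them into n_folds groups
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    error_by_k = {}
    for k in candidates:
        wrong = 0
        for fold in folds:
            mask = np.ones(len(X), dtype=bool)
            mask[fold] = False  # hold this fold out for validation
            for i in fold:
                if knn_predict(X[i], X[mask], y[mask], k) != y[i]:
                    wrong += 1
        error_by_k[k] = wrong / len(X)
    # The candidate with the lowest validation error wins
    best_k = min(error_by_k, key=error_by_k.get)
    return best_k, error_by_k


if __name__ == "__main__":
    # Synthetic two-class data, made up purely for illustration
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(0.0, 1.0, (60, 2)), rng.normal(3.0, 1.0, (60, 2))])
    y = np.array([0] * 60 + [1] * 60)
    best_k, errors = choose_k(X, y, candidates=[1, 3, 5, 7, 9])
    print("best k:", best_k)
    print("validation error by k:", errors)
```

For the handwriting data, the same idea would apply with `trainingMat` as `X` and `np.array(hwLabels)` as `y`, keeping the test folder untouched until k has been chosen.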