Application Research of Computers (《計算機應用研究》)

A big data feature selection method based on genetic algorithm optimization

Using genetic algorithm for feature selection optimization on big data processing

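Authors Zhang Wenjie (張文杰), Jiang Liehui (蔣烈輝)
Affiliations 1. Faculty of Cyberspace Security, PLA Information Engineering University, Zhengzhou 450001; 2. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001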
Article ID 1001-3695(2020)01-010-0050-03
DOI 10.19734/j.issn.1001-3695.2018.05.0495
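Abstract This paper proposes a genetic-algorithm-based feature selection algorithm for big data. The algorithm first evaluates the feature in each dimension and adjusts each feature's weight according to how much that feature differs between a sample's nearest neighbor of the same class and its nearest neighbor of a different class. These feature weights guide the genetic algorithm's search, improving its search ability and the accuracy of the selected features. The feature weights are then combined to compute a fitness value, which serves as the evaluation criterion when the genetic algorithm is run to obtain the optimal feature subset, yielding efficient and accurate feature selection on big data. Experiments show that the algorithm effectively reduces the number of features used for classification and improves classification accuracy.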
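The weighting step described in the abstract resembles the classic Relief scheme: a feature gains weight when it differs on the nearest neighbor of a different class and loses weight when it differs on the nearest neighbor of the same class. The sketch below is a minimal illustration under that assumption, not the authors' implementation. The function name `relief_weights`, the range-normalized difference, the L1 distance, and the choice to visit every sample once are all illustrative assumptions.

```python
def relief_weights(X, y):
    """Relief-style feature weights (an assumed stand-in for the paper's weighting step).

    X is a list of numeric feature rows, y the matching class labels.
    """
    n, d = len(X), len(X[0])

    # Normalize each feature's difference by its value range so features are comparable.
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        # L1 distance over normalized features (an assumption; the paper may differ).
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        same = [k for k in range(n) if k != i and y[k] == y[i]]
        other = [k for k in range(n) if y[k] != y[i]]
        if not same or not other:
            continue  # no same-class or different-class neighbor exists for this sample
        hit = min(same, key=lambda k: dist(X[i], X[k]))    # nearest same-class neighbor
        miss = min(other, key=lambda k: dist(X[i], X[k]))  # nearest different-class neighbor
        for j in range(d):
            # Reward separation from the other class, penalize spread within the class.
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights
```

On a toy data set where the first feature separates the two classes and the second is noise, the first feature should receive the larger weight.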
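Keywords big data; feature selection; genetic algorithm; feature subset
Funding Henan Province Basic and Frontier Research Project
Henan Province Science and Technology Research Program
Article URL http://www.048285.live/article/01-2020-01-010.html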
English title Using genetic algorithm for feature selection optimization on big data processing
Author names in English Zhang Wenjie, Jiang Liehui
Affiliations in English 1. Faculty of Cyberspace Security, PLA Information Engineering University, Zhengzhou 450001, China; 2. State Key Laboratory of Mathematical Engineering & Advanced Computing, Zhengzhou 450001, China
English abstract This paper proposes a novel feature selection method based on a genetic algorithm for big data processing. First, the method evaluates the feature in each dimension, adjusts each feature's weight according to its difference on the nearest neighbor of the same class and the nearest neighbor of a different class, and uses the feature weights to guide the genetic algorithm's search, thereby improving the search ability of the algorithm and the accuracy of the selected features. It then combines the feature weights to calculate feature fitness, takes fitness as the evaluation index, and runs the genetic algorithm to obtain the optimal feature subset, achieving efficient and accurate feature selection on big data. Experimental results show that the method effectively reduces the number of classification features and improves classification accuracy.
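The search stage can be sketched as a genetic algorithm over binary feature masks in which precomputed feature weights bias both the initial population and mutation. This is a hedged illustration only: the abstract does not give the fitness formula or the operators, so the fitness used here (total weight of the selected features minus a per-feature penalty), the tournament selection, the single-point crossover, and every name and parameter (`ga_select`, `penalty`, `mut_rate`) are assumptions.

```python
import random

def ga_select(weights, pop_size=20, generations=30, penalty=0.1, mut_rate=0.05, seed=0):
    """Weight-guided GA over feature masks (illustrative assumptions throughout)."""
    rng = random.Random(seed)
    d = len(weights)
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0
    # Map weights to inclusion probabilities in [0.1, 0.9] so no gene is fixed outright.
    probs = [0.1 + 0.8 * (w - lo) / span for w in weights]

    def fitness(mask):
        # Assumed fitness: weight of the chosen features minus a subset-size penalty.
        return sum(w for w, bit in zip(weights, mask) if bit) - penalty * sum(mask)

    def random_mask():
        return [1 if rng.random() < p else 0 for p in probs]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [random_mask() for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randrange(1, d) if d > 1 else 0
            child = p1[:cut] + p2[cut:]  # single-point crossover
            for j in range(d):
                if rng.random() < mut_rate:
                    # Weight-guided mutation: resample the gene from its weight-based probability.
                    child[j] = 1 if rng.random() < probs[j] else 0
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)
```

Calling `ga_select([0.8, 0.7, -0.2, 0.05, -0.1, 0.6])` returns a 0/1 mask and its fitness. A practical fitness would more likely also include a classifier's accuracy on the selected subset, which is omitted here to keep the sketch self-contained.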
English keywords big data; feature selection; genetic algorithm; feature subset
Received 2018/5/6
Revised 2018/6/28
Pages 50-52,56
CLC number TP391
Document code A