Application of Electronic Technique
HTTP/HTTPS Protocol Asset Discovery Based on Clustering
Ma Yan1,2, Su Majing1,2, Yao Wangjun1,2, Quan Xiaowen3, Liu Hong1,2
1. China Information Security Research Institute Co., Ltd.; 2. National Computer System Engineering Research Institute of China; 3. WebRAY Tech (Beijing) Co., Ltd.
Abstract: Network probing and scanning is an important method for discovering network assets. HTTP/HTTPS accounts for a large share of probe results and is therefore a key source for identifying Internet assets. As the network environment grows more complex, the variety and number of assets that use HTTP/HTTPS are increasing sharply. Traditional network asset identification based on fingerprint rules consequently suffers from low recognition efficiency and poor adaptability, and cannot meet the needs of HTTP/HTTPS asset identification. This paper therefore proposes a new method for discovering HTTP/HTTPS protocol assets. The method processes HTTP/HTTPS response data with an automated rule generator, pre-filters the raw data using term frequency statistics and similarity information, applies a text encoding model to encode the HTTP/HTTPS response body and fuse the features, and then uses an unsupervised clustering algorithm to discover HTTP/HTTPS protocol assets. Experimental results show that the proposed method significantly improves the efficiency of HTTP/HTTPS protocol asset discovery, speeds up asset labeling, and can discover unknown assets without prior knowledge.
CLC number: TP393.08  Document code: A  DOI: 10.16157/j.issn.0258-7998.256341
Chinese citation format: 馬琰,蘇馬婧,姚旺君,等. 基于聚類的HTTP/HTTPS協議資產發現[J]. 電子技術應用,2025,51(11):98-106.
English citation format: Ma Yan, Su Majing, Yao Wangjun, et al. HTTP/HTTPS protocol asset discovery based on clustering[J]. Application of Electronic Technique, 2025, 51(11): 98-106.
Key words: network asset discovery; HTTP/HTTPS protocols; automated rule generation; unsupervised clustering; Word2Vec; DBSCAN

Introduction

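Driven by digital transformation, the variety and number of network assets are growing exponentially, and network security faces increasingly complex challenges. Network assets include not only traditional network devices (such as network cameras and firewalls) but also a wide range of content management systems and network services. At present, network asset identification relies mainly on matching against static fingerprint rules. Although this approach performs well on known asset types, its limitations are equally evident. First, building and maintaining fingerprint rules depends on expert experience and substantial manual effort. Second, methods based on a static fingerprint library respond slowly to new types of devices, so the identification rate for unknown asset types drops markedly. These shortcomings limit the effectiveness and adaptability of current asset identification techniques based on fingerprint rule matching.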

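To address these problems, this paper proposes a novel discovery method for network assets that use the HTTP/HTTPS protocol. An automated rule generator derives fingerprint rules from the HTTP/HTTPS data collected by active probing and filters that data. An unsupervised clustering method then partitions the network asset data by shared features, so that discovery for these protocols is automated. The method can discover unknown assets and improves labeling efficiency. The proposed automated rule generator uses a hierarchical grouping strategy: it refines the dataset step by step, extracts highly discriminative feature fields, and builds fingerprint rules capable of coarse classification, filtering out data that shares no common asset features. To handle the diversity of HTTP/HTTPS response header fields, a large-scale dataset of probe results was analyzed statistically and, combined with expert experience, 21 response header fields were selected for generating automated filtering rules; the automated rule generator was designed on this basis. For the pre-filtered data, a multi-feature fusion asset clustering algorithm oriented to HTTP/HTTPS response body information was then designed. The algorithm uses Word2Vec[1] for feature encoding, converting the processed data into feature vectors, and combines feature fusion with DBSCAN[2] clustering to cluster efficiently in a multidimensional feature space and thereby discover potential assets. Finally, experiments verify the effectiveness of the proposed method. The method not only improves the efficiency of HTTP/HTTPS protocol asset discovery but also effectively discovers unknown assets, which in turn improves the efficiency of fingerprint labeling and rule extraction.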
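The two stages described above (header-based pre-filtering, then density-based clustering of response bodies) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The sample records, the two header fields, the `min_group`, `eps`, and `min_pts` values, and all function names are assumptions made for illustration. Bag-of-words counts with cosine distance stand in for Word2Vec embeddings, and feature fusion is omitted.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical probe results: two header fields plus the response body.
RECORDS = [
    {"Server": "nginx", "Content-Type": "text/html", "body": "router admin login page router"},
    {"Server": "nginx", "Content-Type": "text/html", "body": "router admin login portal"},
    {"Server": "nginx", "Content-Type": "text/html", "body": "camera live stream viewer"},
    {"Server": "nginx", "Content-Type": "text/html", "body": "camera live stream player"},
    {"Server": "Apache", "Content-Type": "application/json", "body": "status ok"},
    {"Server": "nginx", "Content-Type": "text/html", "body": "welcome to my blog"},
]


def prefilter(records, fields, min_group=2):
    """Refine groups one header field at a time; drop groups that get too small."""
    groups = {(): list(records)}
    for field in fields:
        refined = defaultdict(list)
        for key, members in groups.items():
            for rec in members:
                refined[key + (rec.get(field, ""),)].append(rec)
        groups = {k: v for k, v in refined.items() if len(v) >= min_group}
    return [rec for members in groups.values() for rec in members]


def encode(body):
    """Bag-of-words counts, used here as a stand-in for Word2Vec embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", body.lower()))


def cosine_distance(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def dbscan(points, eps, min_pts, dist):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # The point itself counts as one of its own neighbours.
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


kept = prefilter(RECORDS, ["Server", "Content-Type"])
vectors = [encode(rec["body"]) for rec in kept]
labels = dbscan(vectors, eps=0.3, min_pts=2, dist=cosine_distance)
for rec, label in zip(kept, labels):
    print(label, rec["body"])
```

On this toy data the lone Apache record is removed by the pre-filter, the router-like and camera-like bodies form two separate clusters, and the blog page is labeled as noise. In practice each cluster would be a candidate asset type for an analyst to label, and the noise points would be set aside.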


The full text of this article can be downloaded at:

http://ccf-cncc2011.cn/resource/share/2000006847


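Author information:

Ma Yan1,2, Su Majing1,2, Yao Wangjun1,2, Quan Xiaowen3, Liu Hong1,2

(1. China Information Security Research Institute Co., Ltd., Beijing 102200;

2. National Computer System Engineering Research Institute of China, Beijing 100083;

3. WebRAY Tech (Beijing) Co., Ltd., Beijing 100084)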



This content is original to the AET website. Reproduction without authorization is prohibited.