Hybrid measures to beat phish

A hybrid algorithm that combines machine learning with statistical induction ratios can spot malicious web pages known as phishing sites, and so alert unwary users to the possibility that their data, privacy, or security may be compromised before they access such sites. Details are published in the International Journal of Data Mining, Modelling and Management.

Hiba Zuhair of Al-Nahrain University, in Baghdad, Iraq, and Ali Selamat of the Universiti Teknologi Malaysia (UTM), Johor, Malaysia, explain that powerful machine learning systems can already detect phishing sites. However, the criminal creators of such websites are wily, and novel page structures and coding may be missed by these protection systems on the day a new malicious site is launched, when the earliest unwitting users get hooked. To keep users from falling for such zero-hour phishing sites, there is an urgent need for an adaptive approach that can spot the problem even on novel sites.

As such, “Phishing induction must be boosted up with the extraction of new features, the selection of robust subsets of decisive features, the active learning of classifiers on a big webpage stream,” the team writes. Their two-pronged algorithmic defence provides a more holistic way to detect phishing sites. They have demonstrated its efficacy in comparison with existing machine learning-based anti-phishing techniques. The team hopes that their analysis of earlier approaches, together with the method they suggest, could provide a new “taxonomy” for the development of still more effective protection against this ubiquitous security problem in the digital realm.

Zuhair, H. and Selamat, A. (2020) ‘Phish webpage classification using hybrid algorithm of machine learning and statistical induction ratios’, Int. J. Data Mining, Modelling and Management, Vol. 12, No. 3, pp.255–276.