In cybersecurity, the big challenge is to be able to protect against the millions of new malware variants that are launched daily.

Although nearly all of the zero-day threats are based on extremely small mutations of known malware (by some estimates, the vast majority of new malware are mutated by less than 2% in comparison with known malware), many security solutions are incapable of detecting them because they rely on manually-tuned heuristics for creating handcrafted signatures. This process is time-consuming and reactive, leaving organizations vulnerable until the new signature is released.

Newer solutions such as those based on analysis of the behavioral characteristics of the malware at runtime, or sandboxing solutions that execute the malware in a virtual (sandbox) environment to determine whether it is malicious or not, like legacy solutions present critical limitations in their ability to provide real-time detection. As a result, their detection often comes too late, once the malware has already caused damage.

Cybersecurity solutions that apply machine learning artificial intelligence utilize manually selected features, which are then fed into classical machine learning modules to classify the file as malicious or benign. But despite improvements in the rate and pace of detection, they are still lacking.

Deep learning is the next step in artificial intelligence. It is also known as neural networks because it is “inspired” by the brain’s ability to learn to identify objects. Similar to the way our brain is fed with raw data from our sensory inputs and learns the high-level features on its own, in deep learning, raw data is fed through the deep neural network, which then learns on its own to identify the object on which it is trained.

Recent advancements in deep learning have become possible as a result of major algorithmic improvements, and their implementation on graphical processing units (GPUs), which provide tremendously improved computational capabilities. The advancement in deep learning has enabled technologies that leverage deep learning to exhibit amazing results across applications, such as object, facial, and speech recognition.

When applied to cybersecurity, it takes milliseconds to feed a raw data file and pass it through the deep neural network to obtain detection with the highest accuracy rate. This predictive capability of being able to detect a never- before seen malware variant enables not only extremely accurate detection, but also leads the way to real-time prevention because at the very second a malicious file is detected, it is already blocked.

Therefore, while traditional machine learning yields better results than signatures and manual heuristics, deep learning has shown groundbreaking results in detecting first-seen malware, even compared with classical machine learning. This observation is consistent with improvements achieved by deep learning in other fields, such as computer vision, speech recognition, text understanding, etc.

Furthermore, with deep learning, as opposed to classic machine learning, instead of conducting manual feature engineering, datasets of many millions of malicious and legitimate files are fed into the infrastructure, without any human intervention of feature selection. This enables the technology to learn on its own what are the useful high-level, non-linear features necessary for accurate classification.

Furthermore, due to the input-agnostic nature of deep learning, any malicious file type (e.g., EXE, DLL, PDF, DOC, Android APK, etc.) is detected. The result is comprehensive protection on any device, platform, and operating system.

(About the author: Guy Caspi is CEO of Deep Instinct)

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access