Cybersecurity used to be about preventing access to a hermetically sealed environment: the corporate network. Today, security must contend with a decentralized IT environment as employees access data in different applications, from different locations, and on different devices. The cybersecurity challenge has evolved from keeping data inside the network perimeter to preventing data leakage as it travels across access points.

Cloud is the grand enabler of this new way of work and of the next generation of threats. Passwords are all too frequently the last, and only, line of defense. Employees often reuse passwords between consumer and enterprise cloud applications, and the best phishing scams succeed nearly half the time.

For a sense of the scale of this issue, 76.3% of companies have at least one compromised account incident per month in the cloud. Now that cloud services have become systems of record for corporate data, the stakes are high to prevent access from attackers or rogue employees.

A Needle in a Constantly Growing Haystack

In almost every publicized breach, security analysts overlooked the crucial alerts amid the flood of false alarms triggered every day. While most enterprises spend a majority of their security budget on preventative measures such as firewalls, strong user authentication, intrusion prevention, and antivirus systems, beating these defenses has become routine for hackers. Attackers often explore systems undetected for extended periods of time.

Data science techniques can help security teams identify the needles in increasingly large haystacks. Gartner predicts that by 2017, at least 60% of major cloud access security broker (CASB) vendors and 25% of major SIEM and DLP vendors will incorporate advanced analytics and user and entity behavior analytics (UEBA) functionality into their products, either through acquisitions, partnerships, or natively. Essentially, UEBA brings statistical profiling and anomaly detection based on machine learning to security.

UEBA Arrives in the Cloud

Cloud services log a tremendous amount of information about the activities employees perform beyond a simple login. UEBA can analyze a wide range of raw data, including service action, service action category, service action objects, number of bytes downloaded or uploaded, number of times a service is accessed, and rate or time of access. Each of these can be measured across a single service action, a cloud service provider (CSP), or a homogeneous group of service actions or cloud service providers.

From this wealth of information, UEBA helps companies build behavioral models for cloud services and continuously monitor for behavior that deviates, even in non-obvious ways, from the norm. For example, machine learning can be applied to profile and baseline the activity of users, peer groups, and other entities; form peer groups based on common user activities, directory groupings, and human resources information; correlate user and other entity activities and behaviors; and detect anomalies using statistical models or rules that compare activity to profiles.
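
To make the idea of comparing activity to a profile concrete, here is a minimal sketch in Python. It assumes the simplest possible statistical model, a z-score against each user's own history. The function name, metric names, threshold, and sample numbers are all hypothetical and chosen for illustration; a production UEBA system would use richer models than this.

```python
from statistics import mean, stdev

def zscore_anomalies(history, today, threshold=3.0):
    """Return metrics in `today` that deviate strongly from the baseline.

    `history` maps a metric name to a list of past daily values (the profile);
    `today` maps a metric name to the value observed today.
    The threshold of 3 standard deviations is an illustrative assumption.
    """
    flagged = {}
    for metric, value in today.items():
        past = history.get(metric, [])
        if len(past) < 2:
            continue  # not enough history to form a baseline
        mu = mean(past)
        sigma = stdev(past)
        if sigma == 0:
            continue  # constant history; this sketch skips the metric
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged[metric] = z
    return flagged

# Hypothetical profile for one user (e.g. megabytes downloaded, logins per day).
history = {
    "bytes_downloaded": [100, 120, 90, 110, 105, 95, 115],
    "logins": [3, 4, 3, 5, 4, 3, 4],
}
today = {"bytes_downloaded": 5000, "logins": 4}
anomalies = zscore_anomalies(history, today)
```

In this example only the download volume is flagged; the login count sits well within the user's normal range.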

How does UEBA cut down on noisy false positives? In other words, what makes this technique any different from existing monitoring systems that produce an overwhelming number of alerts? In applying these powerful tools to the cloud, we stayed true to the building blocks of modern UEBA. The following principles differentiate a UEBA threat protection approach:

Behavior Models

Behavior models use complex higher-order polynomials, which in turn generate an information-dense representation of the data. Cloud UEBA solutions also need to deal with sparse usage data, especially for cloud services in the long tail, i.e., unsanctioned and less-used apps.

User Groups

Automated data-driven group detection identifies users with similar behavior across cloud services. Simultaneously analyzing a user with both an individual model and a group model yields tighter control on expected behavior, minimizing false positives.
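
One way to picture the individual-plus-group check is the following Python sketch. It is an assumption-laden simplification: both models are plain z-scores, an event is flagged only when it is unusual under both, and the function names, threshold, and sample figures are hypothetical.

```python
from statistics import mean, stdev

def zscore(value, samples):
    """Standard score of `value` against `samples` (0.0 if samples are constant)."""
    sigma = stdev(samples)
    return 0.0 if sigma == 0 else (value - mean(samples)) / sigma

def is_anomalous(value, own_history, peer_history, threshold=3.0):
    """Flag only when the value is unusual for the user AND for the peer group."""
    unusual_for_user = abs(zscore(value, own_history)) > threshold
    unusual_for_group = abs(zscore(value, peer_history)) > threshold
    return unusual_for_user and unusual_for_group

# Hypothetical daily download volumes (MB).
own_history = [10, 12, 11, 9, 10]          # a newly promoted admin's past usage
peer_history = [400, 520, 480, 510, 450]   # typical usage of the admin peer group

normal_for_peers = is_anomalous(500, own_history, peer_history)
extreme_for_both = is_anomalous(5000, own_history, peer_history)
```

Here a 500 MB download is far outside the user's own history but ordinary for the peer group, so it is not flagged; a 5,000 MB download is unusual under both models and is flagged. Requiring agreement between the two models is one simple design choice for suppressing false positives, not the only possible one.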

Time Evolving

Cloud UEBA solutions should allow evolution over time (to absorb policy changes, user preference changes, etc.). By segregating user behavior into patterns that span across time, models are guaranteed to be stable, robust, and stationary.
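
As a rough illustration of a baseline that evolves over time, the sketch below uses an exponentially weighted moving average (EWMA) in Python. This is a swapped-in, textbook technique chosen for brevity, not necessarily how any particular UEBA product models time; the class name, the `alpha` value, and the sample data are hypothetical.

```python
class EwmaBaseline:
    """Running mean and variance that gradually forget old behavior."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha   # weight given to the newest observation (assumed)
        self.mean = None     # no baseline until the first observation
        self.var = 0.0

    def update(self, x):
        """Fold one new observation into the baseline."""
        if self.mean is None:
            self.mean = float(x)
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        # Standard incremental update for exponentially weighted variance.
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)

    def deviation(self, x):
        """Distance of x from the baseline in standard deviations."""
        if self.mean is None or self.var == 0:
            return 0.0
        return abs(x - self.mean) / self.var ** 0.5

baseline = EwmaBaseline(alpha=0.2)
# Hypothetical usage: ~100 units/day, then a policy change moves it to ~200.
for value in [100, 102, 98, 101, 99] * 10:
    baseline.update(value)
mean_before_change = baseline.mean
for value in [200, 202, 198, 201, 199] * 10:
    baseline.update(value)
mean_after_change = baseline.mean
```

After the shift, the baseline mean moves from roughly 100 to roughly 200, so the new level of usage stops looking anomalous once it has persisted. A smaller `alpha` would adapt more slowly and tolerate short bursts less.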


A UEBA solution can also dynamically index user risk and cloud service risk and compare it with the evolving nature of cloud service usage to generate anomalies across all users within an enterprise. Data obtained from the multifaceted behavioral analysis can then be combined with dynamic indexing to render an active self-learning module.

Putting a Face to the Numbers

What do these capabilities look like in terms of real-world functionality? Essentially, UEBA adapts to understand how employees use cloud services. Standard anomalies include an employee who downloads an extremely large amount of data in one day or uploads data to a high-risk cloud service.

With the aforementioned strategies, however, UEBA can put the two together and alert security when the same employee downloads a large amount of data and accesses a high-risk cloud service on the same day, behavior which reaches a higher risk threshold.
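
The correlation described above can be sketched in a few lines of Python. The signal names, per-signal scores, alert threshold, and users are all hypothetical; the point is only that two moderate signals for the same user on the same day add up to an alert, while either one alone does not.

```python
from collections import defaultdict

# Assumed scores: each signal alone stays below the alert threshold.
SIGNAL_SCORES = {"large_download": 40, "high_risk_service": 40}
ALERT_THRESHOLD = 70

def daily_alerts(events):
    """Group signals by (user, day) and alert when the combined score is high.

    `events` is an iterable of (user, day, signal) tuples.
    """
    signals = defaultdict(set)
    for user, day, signal in events:
        signals[(user, day)].add(signal)

    alerts = []
    for (user, day), seen in sorted(signals.items()):
        score = sum(SIGNAL_SCORES.get(s, 0) for s in seen)
        if score >= ALERT_THRESHOLD:
            alerts.append((user, day, score))
    return alerts

events = [
    ("alice", "day-1", "large_download"),
    ("bob", "day-1", "high_risk_service"),
    ("carol", "day-1", "large_download"),
    ("carol", "day-1", "high_risk_service"),
    ("alice", "day-2", "high_risk_service"),
]
alerts = daily_alerts(events)
```

Only carol, who triggers both signals on the same day, produces an alert. Alice triggers both signals too, but on different days, so neither day crosses the threshold.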

In another example, UEBA can identify an administrator’s baseline behavior differently from the average marketing employee so that a false alarm is not triggered every time the admin pulls a weekly report. Similarly, algorithms can identify trusted locations and flag as an anomaly an incident in which the same user accesses a service from Los Angeles and then thirty minutes later from Singapore.
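
The Los Angeles and Singapore case is often called an "impossible travel" check, and it reduces to simple arithmetic: compute the great-circle distance between two logins and compare the implied speed with what a traveler could plausibly achieve. The Python sketch below uses the haversine formula; the speed limit, approximate city coordinates, and function names are illustrative assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 1000.0  # assumption: roughly the speed of a commercial jet

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def is_impossible_travel(first, second, max_speed_kmh=MAX_SPEED_KMH):
    """Each login is (latitude, longitude, time_in_hours)."""
    distance = haversine_km(first[0], first[1], second[0], second[1])
    elapsed_hours = abs(second[2] - first[2])
    if elapsed_hours == 0:
        return distance > 0  # two places at the same instant
    return distance / elapsed_hours > max_speed_kmh

# Approximate coordinates; times are hours since an arbitrary start.
los_angeles = (34.05, -118.24, 0.0)
singapore = (1.35, 103.82, 0.5)          # thirty minutes later
same_city_later = (34.05, -118.24, 0.5)

distance_km = haversine_km(34.05, -118.24, 1.35, 103.82)
flagged = is_impossible_travel(los_angeles, singapore)
not_flagged = is_impossible_travel(los_angeles, same_city_later)
```

The two cities are roughly 14,000 km apart, so covering that distance in thirty minutes implies a speed far beyond the assumed limit and the pair of logins is flagged. In practice such a check would also need to account for VPNs and imprecise IP geolocation, which this sketch ignores.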

Hackers can steal passwords, social security numbers, and even fingerprints. What they cannot steal is how employees actually behave. Machine learning and UEBA can help cut through the noise that has made security alerts useless. The more streams of information feeding into a behavioral model, the more accurately security algorithms can pick out real threats.

As cloud offerings become more mature, companies will benefit from improved APIs with more event types and more contextual information. Whereas enterprises were initially afraid of giving up visibility when moving to the cloud, they can now benefit from cutting-edge data science to monitor for risks.

(About the author: Sekhar Sarukkai is chief scientist at Skyhigh Networks)