Cloud, containers and AI: What do they mean for your data?
8 tips for managing data and applying analytics to get the most out of artificial intelligence and cloud computing investments.
Leveraging data for business success
As organizations look to make the most of these fast-growing technologies, they need to understand the data requirements of each and the common elements that can improve success and efficiency across all of them. In the following slides, Jack Norris, senior vice president of data and applications at MapR, shares insights on how to make the technologies work together to help drive organizational success.
Pay attention to the data in the cloud.
“Embracing the cloud is an important element for most organizations, but how you embrace it and what you do to streamline data logistics is critical to a successful strategy,” Norris says. “Companies are quickly embracing a number of key data initiatives, and it's important to look at these holistically and not as separate initiatives. A well-thought-out data strategy also extends to an organization’s ability to effectively leverage containers and AI.”
AI is a journey. Data can make it a footpath or a speedway.
“AI is not a collection of separate, isolated projects – it is a continuous series of steps as new technology and tools emerge,” Norris stresses. “How companies store, process and apply data is at the core of the biggest technology-driven advances in decades. The companies that are using these new techniques are gaining share. In fact, according to Forrester Research, by 2020 businesses adopting machine learning, AI, deep learning, the Internet of Things (IoT) and Big Data will take away more than $1.2 trillion from their less-informed peers.”
Don’t try to standardize on data science tools.
“The average organization uses five or more machine learning tools,” Norris says. “New tools and algorithms are introduced frequently; what is common, however, is the data. Data logistics, meaning how data is acquired, delivered and continues to flow, is an essential ingredient to machine learning success and can make the use of multiple tools much easier and more successful.”
90 percent of machine learning success is driven by data logistics.
“Most data scientists spend 90 percent of their time on ‘data wrangling,’” Norris says. “Data logistics can speed and smooth their success. Many companies have a shortage of data scientists on staff. The simplest way to expand your bench is to make the existing ones 5 to 10 times more productive.”
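The link between the 90 percent figure and a 5-to-10-times productivity gain can be checked with simple arithmetic. The sketch below is an illustration only and does not come from Norris. It assumes that wrangling really takes 90 percent of a data scientist's time, that the remaining 10 percent stays fixed, and that productivity scales inversely with total time per project. The function name and its parameters are hypothetical.

```python
def productivity_gain(wrangling_share: float, wrangling_reduction: float) -> float:
    """Estimate the productivity multiple from cutting data-wrangling time.

    wrangling_share: fraction of total time spent on wrangling today (0..1).
    wrangling_reduction: fraction of that wrangling time removed (0..1).
    Assumption: all other work takes the same time as before.
    """
    if not 0.0 <= wrangling_share <= 1.0:
        raise ValueError("wrangling_share must be between 0 and 1")
    if not 0.0 <= wrangling_reduction <= 1.0:
        raise ValueError("wrangling_reduction must be between 0 and 1")

    other_work = 1.0 - wrangling_share
    remaining_wrangling = wrangling_share * (1.0 - wrangling_reduction)
    total_time = other_work + remaining_wrangling
    if total_time == 0.0:
        raise ValueError("no work remains; the gain is unbounded")

    # Same output in less time means a proportionally higher output rate.
    return 1.0 / total_time


if __name__ == "__main__":
    for reduction in (0.5, 8 / 9, 1.0):
        gain = productivity_gain(0.9, reduction)
        print(f"wrangling cut by {reduction:.0%}: about {gain:.1f}x")
```

Under these assumptions, halving wrangling time yields less than a 2x gain, a 5x gain requires cutting wrangling by roughly 89 percent, and 10x is the ceiling, reached only if wrangling disappears entirely.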
Containerize everything…But
“Containers are a method to improve the efficiency and elasticity of workloads, speeding development, deployment and migration,” Norris explains. “Containers are also being used with data science. Containers that use notebooks can make data science collaboration and model deployment easier in the cloud and at the edge. Containerizing stateless applications is straightforward and relatively easy. Containerizing stateful applications, including applications that share data (yes, think data science), is much more difficult.”
Containerizing everything requires a complementary data layer.
“Industry analysts caution that containerizing workloads that are stateful and share data adds an extra dimension of complexity and an obstacle to success,” Norris says. “Customers are adopting solutions that serve as dataware to support containerized data access using the Kubernetes volume driver.”
The Edge will eat the cloud.
“As workloads are getting more distributed, edge deployments are becoming strategic,” Norris says. “Leading organizations in automotive, oil and gas, healthcare and manufacturing are already tapping into the edge to drive new business models and revenue streams. Taking full advantage of edge devices requires coordinating data and analytics from the edge to the core.”
When in doubt, focus on the data.
“Cloud, containers and AI all require coordinating data flows and supporting analytics to drive intelligent action,” Norris concludes. “Start anywhere, but pay attention to the data and understand how you will handle data flows and extend from the edge to the core. Doing so is critical to ensuring success with your first initiative and will set the foundation as you expand support.”