17 top data science and machine learning platforms
RapidMiner, TIBCO Software, SAS and KNIME are among the leading providers of data science and machine learning products, according to the latest Gartner Magic Quadrant report.
About this Magic Quadrant report
Gartner Inc. has released its "Magic Quadrant for Data Science and Machine Learning Platforms," which looks at software products that enable expert data scientists, citizen data scientists and application developers to create, deploy and manage their own advanced analytic models. According to Gartner analysts and report authors Carlie Idoine, Peter Krensky, Erick Brethenoux and Alexander Linden, "We define a data science platform as: A cohesive software application that offers a mixture of basic building blocks essential for creating all kinds of data science solutions, and for incorporating those solutions into business processes, surrounding infrastructure and products." Here are the top performers, categorized as Leaders, Challengers, Visionaries or Niche Players.
According to the Gartner analysts, “Leaders have a strong presence and significant mind share in the data science and ML market. They demonstrate strength in depth and breadth across the full data exploration, model development and operationalization process. While providing outstanding service and support, Leaders are also nimble in responding to rapidly changing market conditions. The number of expert and citizen data scientists using Leaders’ platforms is significant and growing. Leaders are in the strongest position to influence the market’s growth and direction. They address the majority of industries, geographies, data domains and use cases, and therefore have a solid understanding of, and strategy for, this market.”
“KNIME is based in Zurich, Switzerland. It provides the KNIME Analytics Platform on a fully open source basis for free, while a commercial extension, KNIME Server, offers more advanced functions, such as team, automation and deployment capabilities,” the report states. Among its strengths: “Well-balanced execution and vision. With a wealth of well-rounded functionality, KNIME maintains its reputation for being the market’s ‘Swiss Army knife.’ Its for-free and open-source KNIME Analytics Platform covers 85 percent of critical capabilities, and KNIME’s vision and roadmap are as good as, or better than, those of most of its competitors.”
RapidMiner is based in Boston, MA. Its platform includes RapidMiner Studio, RapidMiner Server, RapidMiner Cloud, RapidMiner Real-Time Scoring and RapidMiner Radoop. “RapidMiner remains a Leader by striking a good balance between ease of use and data science sophistication,” the Gartner analysts say. “Its platform’s approachability is praised by citizen data scientists, while the richness of its core data science functionality, including its openness to open-source code and functionality, make it appealing to experienced data scientists, too.”
SAS is based in Cary, NC. “It provides many software products for analytics and data science. For this Magic Quadrant, we evaluated SAS Enterprise Miner (EM) and SAS Visual Data Mining and Machine Learning (VDMML),” the Gartner analysts explain. “SAS’s Completeness of Vision is in the same class as many highly innovative competitors, but the company is falling behind in key areas such as deep learning and contributions to the open-source community.” Still, “SAS’s long market presence and considerable staying power have earned it much respect from customers. Many reference customers praised its products’ quality, stability and reliability. That solidity might have come at the expense of a few advances (such as quick adoption of open-source capabilities), but it has not prevented SAS from innovating and staying on a par with many of its newer competitors.”
TIBCO Software is based in Palo Alto, CA. “Through the acquisition of enterprise reporting and modern BI platform vendors (Jaspersoft and Spotfire), descriptive and predictive analytics platform vendors (Statistica and Alpine Data), and a streaming analytics vendor (StreamBase Systems), TIBCO has built a well-rounded and powerful analytics platform,” the Gartner analysts explain. “On a single platform, TIBCO brings together powerful visualization capabilities, strong descriptive analytics and visionary predictive analytics features (from Statistica and Alpine Data, now rebranded as Spotfire Data Science). At the same time, TIBCO has maintained its platform’s necessary extensibility to open-source environments.”
According to the Gartner analysts, “Challengers have an established presence, credibility, viability and robust product capabilities. They may not, however, demonstrate thought leadership and innovation to the same degree as Leaders. There are two main types of Challenger: long-established data science and ML vendors that succeed because of their stability, predictability and long-term customer relationships; and vendors established in adjacent markets, such as the analytics and BI, data and analytics service provider, and developer tool markets, which are entering the data science and ML market with solutions that extend their current platforms. Challengers are well-placed to succeed in this market as it is currently defined and are operating effectively within current market conditions. Their vision and roadmap, however, may be impaired by a lack of market understanding, excessive focus on short-term gains, strategy- and product-related inertia, and a lack of innovation.”
Alteryx is based in Irvine, CA. It provides four software products, which comprise its data science platform. The Alteryx Analytics platform includes Alteryx Connect, Alteryx Designer, Alteryx Server and Alteryx Promote. "Alteryx’s emphasis on making data science accessible to citizen data scientists and others across the end-to-end analytic pipeline is resonating in the market," the Gartner analysts note. "Its approach provides a natural extension for a client base focused on data preparation but ready to take the next step into data science. Alteryx has focused on offering a complete, end-to-end data science platform. It has added two new products to its platform. Alteryx Connect focuses on data connections, data discovery and social connections. Alteryx Promote incorporates Alteryx’s Yhat acquisition and focuses on operationalizing analytic content."
Dataiku is headquartered in New York City, and has a main office in Paris, France. It offers Data Science Studio (DSS) with a focus on cross-discipline collaboration and ease of use. "Dataiku’s appearance in the Challengers quadrant is principally due to its strong execution and strengthening capabilities in relation to scalability," the Gartner analysts say. "A focus on real-time analytics capabilities and expansion of the breadth of its use cases could move Dataiku into the Leaders quadrant. Ease of use and collaboration across data science roles and between data science teams remain two of its platform’s major assets. The qualities of Dataiku most often highlighted by clients are that its platform is relatively easy to learn and that it provides a rapid path to productivity."
According to the Gartner analysts, “Visionaries are typically relatively small vendors or newer entrants representative of trends that are shaping, or have the potential to shape, the market. There may, however, be concerns about these vendors’ ability to keep executing effectively and to scale as they grow. They are typically not well known in this market, and therefore often have lower momentum, relative to Challengers and Leaders. Visionaries not only have a strong vision, but also a solid supporting roadmap. They are innovative in their approach to addressing the market’s needs. Although their offerings are typically innovative and solid in terms of the capabilities they do provide, there are often gaps in these offerings’ completeness and breadth.”
Databricks is based in San Francisco, CA. “Its Apache Spark-based Unified Analytics Platform combines data engineering and data science capabilities that use a variety of open-source languages,” the Gartner analysts say. “In addition to Spark, the platform provides proprietary features for security, reliability, operationalization, performance and real-time enablement on Amazon Web Services (AWS). Azure Databricks, which became generally available in March 2018, is an integrated service within Microsoft Azure that provides a high-performance Apache Spark-based platform optimized for Azure. Databricks remains a Visionary by providing support for the end-to-end analytic life cycle, hybrid cloud environments and accessibility for a wide variety of users.”
DataRobot is based in Boston, MA. It provides an augmented data science and ML platform. The platform automates key tasks, enabling data scientists to work efficiently and citizen data scientists to build models easily. According to the Gartner analysts, “DataRobot sets the standard for augmented data science and ML. Significant funding has enabled expansion via acquisitions to address time series modeling (Nutonian in May 2017) and an augmented approach for developers to incorporate models into applications (Nexosis in July 2018). These acquisitions give DataRobot the opportunity to extend its capabilities to new types of user, while focusing on its core competency of augmentation.”
Google, a subsidiary of Alphabet, is based in Mountain View, CA. Its core ML platform offerings include Cloud ML Engine, Cloud AutoML, the open-source TensorFlow, and the recently announced BigQuery ML. According to the Gartner analysts, “Its ML components require other Google components for end-to-end capabilities, such as Google Cloud Dataprep, Google Datalab, Google BigQuery, Google Cloud Dataflow, Google Cloud Dataproc, Google Data Studio, Kubeflow and Google Kubernetes Engine. Most of these components require the presence of the Google Cloud Platform (GCP). Google offers a rich ecosystem of AI products and solutions, ranging from hardware (Tensor Processing Unit [TPU]) and crowdsourcing (Kaggle) to world-class ML components for processing unstructured data like images, video and text. Google is also one of the pioneers of automated ML (with Cloud AutoML). It excels even more with its industry-leading open-source TensorFlow offering for deep neural nets.”
H2O.ai is based in Mountain View, CA, and offers the free open-source H2O OpenSource Machine Learning (H2O, Sparkling Water and H2O4GPU) and a commercial product called H2O Driverless AI. “H2O’s core strength is its high-performing ML components, which are tightly integrated within several competing platforms evaluated in this Magic Quadrant,” the Gartner analysts explain. “H2O.ai’s open-source ML components are effectively an industry standard, with many other platforms integrating them (for example, those of Alteryx, Dataiku, Domino, IBM, KNIME, RapidMiner and TIBCO Software). H2O.ai’s components are highly optimized and parallelized for CPU multicore and multinode configurations. H2O4GPU offers a software layer for significant GPU acceleration.”
IBM is based in Armonk, NY. “For this Magic Quadrant we evaluated two platforms: SPSS (including SPSS Modeler and SPSS Statistics) and Watson Studio, an offering that incorporates and builds on IBM’s previous Data Science Experience (DSX) product,” the Gartner analysts explain. “Its strategy, focused on the complete analytic pipeline, enables both expert and citizen data scientists to be productive. Watson Studio and its roadmap promise to deliver extensive openness, hybrid cloud support and strong analytic capabilities for both expert and citizen data scientists across the full analytic pipeline. Watson Studio provides a new, more modern approach, while continuing not only to support, but also to extend, the capabilities of SPSS.”
MathWorks is headquartered in Natick, MA. Its two major products are MATLAB and Simulink, but only MATLAB met the inclusion criteria for this Magic Quadrant. “To serve this growing market, MathWorks has strengthened the coherence of the MATLAB platform for its engineering-focused audience by seamlessly integrating advanced functionality for the treatment of unconventional data sources (images, video and IoT data),” the Gartner analysts explain. “Although MathWorks focuses on asset-centric industries, it also has customers in the financial services sector. Built from an engineering perspective, MATLAB offers a seamless experience, with operationalization as a fully integrated step. Mainly focused on industrial applications, MathWorks takes account of field personnel and subject matter experts’ experiences through a ‘citizen engineer’ approach.”
Microsoft is based in Redmond, WA. “It provides a number of software products for data science and ML,” the Gartner analysts explain. “In the cloud, it offers Azure Machine Learning (including Azure Machine Learning Studio), Azure Data Factory, Azure HDInsight, Azure Databricks and Power BI. For on-premises workloads, Microsoft offers Machine Learning Server. Only Azure Machine Learning met the inclusion criteria for this Magic Quadrant, although Microsoft’s broader offerings did influence our assessments of Azure Machine Learning’s extended capabilities and Microsoft’s Completeness of Vision. Microsoft’s first-class cloud approach with Azure Machine Learning provides a fully managed, high-performing environment. The cloud platform also offers advantages in terms of performance tuning, scalability and agile support for open-source technology.”
“Niche Players demonstrate strength in a particular industry or approach, or pair well with a specific technology stack,” the Gartner analysts explain. “They should be considered by buyers in their particular niche. Some Niche Players demonstrate a degree of vision, which suggests they could become Visionaries. Often, however, they are struggling to make their vision compelling, relative to others in the market. They are considered more followers than leaders in terms of driving and defining the market. They may also be struggling to develop a track record of innovation and thought leadership that could give them the momentum to become Visionaries. Other Niche Players could become Challengers if they continue to execute in a way that increases their momentum and traction in the market.”
Anaconda is based in Austin, TX. It offers Anaconda Enterprise 5.2, a data science development environment based on the interactive notebook concept (this analysis excludes the Conda Distribution Packages) that sees users exploiting open-source Python and R-based packages. “Anaconda continues to provide a loosely coupled distribution environment, which offers access to a wide range of open-source development environments and open-source libraries, primarily Python-based,” the Gartner analysts explain. “The dominance of Python among data scientists gives Anaconda great visibility to developers. Anaconda is the only data science vendor not just supporting but also indemnifying and securing the Python open-source community. In the past year, the company has revamped its user interface by providing enhanced collaboration and model reproducibility features, giving data scientists better productivity and model management capabilities.”
Datawatch is based in Bedford, MA. In January 2018, it acquired Angoss and its main data science product components. These include KnowledgeSEEKER, the most basic offering, aimed at citizen data scientists in a desktop context; KnowledgeSTUDIO, which includes many more models and capabilities than KnowledgeSEEKER; and KnowledgeENTERPRISE, a flagship product that includes the full range of capabilities. "Often praised for its ease of use and intuitive interface, Angoss should benefit from Datawatch’s extensive experience in data management and preparation," the Gartner analysts say. "Customers continue to commend Angoss’ intuitive interface and well-rounded product functionality. The platform is well-suited to citizen data scientists looking for technological depth, reliability and a quick path to productivity. But the corollary of that strength is an above-average perceived TCO."
Domino (Domino Data Lab) is headquartered in San Francisco, CA. “The Domino Data Science Platform represents a comprehensive end-to-end solution designed for expert data scientists,” the Gartner analysts explain. “The platform incorporates both open-source and proprietary tool ecosystems, while providing capabilities for collaboration, reproducibility, and centralization of model development and deployment. Product bundling and configuration is straightforward. The roadmap focuses on collaboration and accessibility, building a tool- and platform-agnostic ecosystem, and driving the end-to-end analytic process through to operationalization, with the goal of making data science a scalable enterprise capability.”
SAP is based in Walldorf, Germany. It offers SAP Predictive Analytics (PA). “This platform has a number of components, including Data Manager for dataset preparation and feature engineering, Automated Modeler for citizen data scientists, Expert Analytics for more advanced ML, and Predictive Factory for operationalization,” the Gartner analysts explain. “SAP PA is tightly integrated with SAP HANA. SAP’s data science offering is closely tied to the company’s expanding Intelligent Enterprise vision and SAP Leonardo. Many SAP customers identify alignment with existing data and analytics investments as a key reason for choosing its platform. SAP PA is well-suited to handling very large datasets via SAP HANA and deploying models to SAP’s wide range of applications. SAP PA received excellent scores from reference customers for delivery and platform/project management.”