Cloud Data Platforms for Enterprises: Popularity, Technologies, and Challenges
Cloudera conducted a survey which found that most enterprises today choose a multi-cloud strategy, and that data visualization and governance are their top priorities. Responses were collected from over 150 big data and analytics leaders around the globe. Of the respondents, 49% were data and analytics professionals, while 51% belonged to Business Intelligence (BI) or IT teams or were identified as data consumers within business units. The report makes it evident that enterprises have stepped into the era of Big Data and Analytics.

Here are the key findings of the report:

  1. Most companies are pursuing a hybrid/multi-cloud strategy; only 24% use a single cloud vendor.
  2. Hadoop investment is staying the same for 53% of respondents, while 30% plan to invest more in it.
  3. 55% of the respondents are planning to invest in data visualization in the near future.
  4. According to 80% of the respondents, data governance holds high value.
  5. Azure was the most popular cloud data warehouse, followed by AWS, Google, and Snowflake.

Common Cloud Data Platforms

The most common on-premise platforms were found to be open-source Hadoop, Oracle, and SQL Server. For Hadoop, 53% of the respondents plan to keep their investment the same, while 30% plan to invest more. For existing on-premise data platforms other than Hadoop, 42% of the respondents plan to keep their investment the same, while 33% plan to invest more. 61% of respondents were operating data platforms in the public cloud, which is close to Forrester’s research showing that about 65% of North American enterprises currently rely on public cloud platforms.

Respondents were using a variety of platforms in the public cloud, such as AWS, Google BigQuery, Snowflake, Azure SQL Data Warehouse, Amazon Redshift, and Teradata cloud. 38% of respondents indicated that they are not on the cloud at all. Among those not currently using the cloud, 48% plan to deploy a public cloud data warehouse in the future, 16% want to stay off the cloud, and 36% are still evaluating their options.

Working with Cloud Data Platforms

23% of the respondents who have deployed in the cloud say that it is performing well. 29% are running a data platform in the cloud and are still struggling to work things out. For 3%, cloud data platforms were not working well. The remainder were either not allowed to move to the cloud or had not yet started planning for it. 35% of the respondents had already deployed a cloud data platform, while 6% are in the process of deploying one. 21% will deploy a cloud data platform in less than a year, while the rest will take longer. Cloudera found that only 24% of the respondents use a single cloud vendor, while the majority work with multiple cloud vendors and have a hybrid cloud strategy.

79% of the respondents said that they must have consistent, integrated security and governance for their data, whether it resides in the public cloud, private cloud, or hybrid cloud.

Most respondents store data in a data lake in the public cloud or in a cloud data warehouse; the rest use a cloud relational database or an on-premise data warehouse.

Advantages of Cloud Data Platforms that Companies are Leveraging

With the public cloud, companies aim to gain the flexibility to scale up and down and to get better data and analytics at lower cost. They also want to adopt new technologies and deploy new applications more quickly. Another feature that many respondents find attractive about the public cloud is the consumption-based pricing model, i.e., paying only for what is used. Most of the respondents are looking to the public cloud to deploy use cases related to data science, data warehousing, and business intelligence, as well as use cases that can draw on real-time streaming data.

Technologies Used

The Cloudera survey indicates that 49 respondents are using data visualization technologies and 57 respondents plan to use data visualization in the future. Data visualization technology is quite new; however, the rapid increase in data volume and variety, coupled with an increased focus on analytical use cases, has created certain challenges for it. Business users now need ad hoc access to both live and historical data, and they use artificial intelligence and machine learning platforms for analytics.

Currently, companies use Microsoft applications such as Excel and Power BI, along with Tableau. According to Cloudera's analysis, Spark is the most used technology for AI/ML, and it remains the technology of choice for Databricks and Cloudera Data Science Workbench. The most common challenges with BI/AI tools were incomplete data (18%) and poor query performance (15%).

Challenges with Public Cloud

About 32 respondents cited security as a big challenge, while 29 said that the costs of the public cloud are higher than expected. 27 respondents also said that they lack the skill sets for operating in the public cloud. Other challenges include data residing in multiple locations, difficulty in porting applications, retraining users, and poor performance. When the respondents were asked about the challenges they are experiencing with analytics, infrastructure governance was the number one challenge, followed by skills, performance, cost, and security.


Cloudera is an enterprise data cloud company that provides cloud-native services to manage and secure the entire data lifecycle. It helps organizations tackle transformational use cases and extract real-time insights from data to drive value and competitive differentiation. The information in this article is based on the report, researched by Cloudera, and is authored by TechCircle.

