Introduction
With this first article of a two-part series on data product strategies, I present some of the emerging themes in data product development and show how they inform the prerequisites and foundational capabilities of an enterprise data platform that can serve as the backbone of a successful data product strategy. With those capabilities identified, the second article explores how the Cloudera Data Platform delivers them and how it has enabled organizations such as IQVIA to innovate in Healthcare with the Human Data Science Cloud.
From my discussions with Cloudera clients, data product development is at the top of the growth agenda in many industries, such as Financial Services, Healthcare and Telecommunications. Among the many industry-specific and technology themes contributing to that growth agenda, some common business and technology forces are influencing data product development:
The confluence of all the above business and technology factors has placed special emphasis on the organization’s data landscape and on how it fits within a robust data product platform strategy that, based on Amrit Tiwana’s work on Platform Ecosystems, meets four key success criteria: Simplicity, Resiliency, Maintainability and Evolvability. These criteria call for a holistic rethink of the capabilities that a next-generation data platform needs in order to deliver successful data product strategies.
Among the key priorities of Cloudera clients that have successfully deployed and commercialized data product strategies, I have identified the following key requirements for efficient, differentiated, and scalable data platform ecosystems.
Security has always been a paramount concern for data ecosystems, and it will continue to play a pivotal role in successful data products. In fact, data product development introduces a requirement that was far less relevant in the past: scalability in permissioning and authorization, given the number and variety of roles of the data constituents, both internal and external, that access a data product. From a security capability standpoint, organizations need to comprehensively address four requirements at scale:
In their seminal work on Data Product Development, MIT academics Meyer and Zack argued that a well-designed and executed platform approach “enables a company to create new versions of its products rapidly and efficiently to respond to or anticipate changing market needs”. Extending that principle to the data product domain, only an Enterprise Data Platform approach that delivers frictionless access to any type of data, without introducing data or infrastructure barriers (e.g., data silos within a data product or heterogeneous implementations of a product family across different regions), can truly meet the vision of a “Data Product Ecosystem”. In that ecosystem, the Enterprise Data Platform is the technology foundation used to deliver a Consistent, Infrastructure-agnostic and Flexible Platform:
Expanding on the previous point about platform architecture that empowers successful product families, organizations that have taken the “long view” in formulating a data and analytics monetization approach have realized that building a data platform to deliver a single product, and then using extraneous components for the next derivative, is simply not scalable. Stitching together different components and analytical capabilities adds cost and complexity through data movement, orchestration and duplicative storage. It also requires separate observability and management tools to keep the resulting stack under control and performing well. All of this ultimately delays time to market and undermines profit margins. As a result, organizations need to evaluate their long-term product portfolio strategy and define the data platform so that it can realize that product vision, enabling modularity and extensibility.
A common pitfall in the development of data platforms is that they are built around the boundaries of point solutions and are constrained by technological limitations (e.g., a technology choice such as Spark Streaming that is overly focused on throughput at the expense of latency) or by data formats (e.g., a solution that is focused on structured data and only partially addresses unstructured data). In working with client executives to establish the business case for different service offerings that address multivariate market needs, I’ve concluded that there is great variation in the expected service characteristics. For example, one target persona has a short-term need for real-time visibility into a particular analytical environment, whereas another is looking for a persistent, dedicated data lake to store and manage data.
As a result, data platforms need to deliver multiple product attributes and features rather than focusing on a particular analytical output or intermediate analytical stage (e.g., data warehousing). Those data product attributes include both functional and non-functional characteristics that translate to targeted, derivative value propositions that meet the needs of niche market segments.
Organizations that have successfully implemented innovative data products that radically transform industries have evolved the analytics professional from a generic technology / data science expert into an industry-aware data scientist. With both domain and technical experience, that role can find solutions in settings where data is complex and lacks uniformity, and can bring understanding to contexts without universally accepted terms or common data models. One example of such organizational evolution is IQVIA, which has built an industry-leading Human Data Science Cloud leveraging the Cloudera Data Platform (CDP). As part of that organizational transformation, the data scientist role has morphed into that of the human data scientist. Unlike the generalist data scientist, who might apply a toolkit of regression analysis, p-test, or other statistical analysis to the data at hand, the human data scientist leverages intuition and creativity rather than using old tools to answer new questions.
To accomplish such a transformation, organizations need to empower product development teams with the right self-serve capabilities, such as Edge-2-AI data visualization and discovery capabilities for all data sources pertinent to the knowledge worker’s duties. Those capabilities will not only remove pre-existing constraints on accessing and understanding data, but will also broaden the “art of the possible” with regard to what the industry-aware data scientist can do with the available data, thus pushing the boundaries of data product innovation.
This part of the Building Successful Data Strategies series explored the requirements for an Enterprise Data Cloud that delivers Simple, Resilient, Maintainable and Evolvable product strategies:
In the next part of the series, we will look into the specific capabilities of the Cloudera Data Platform that have enabled successful data product strategies. I would be more than happy to engage in a discussion with organizations that are interested in learning more about emerging trends in data product development and how Cloudera helps with commercializing innovative data products.
The post Five Strategies to Accelerate Data Product Development appeared first on Cloudera Blog.