The Scientist, The Engineer, and The Warehouse
Building the right team for data analytics in the age of cloud.
“I need to hire a data scientist!”
We hear this call, urgent and animated, from Chief Information Officers (CIOs) and Chief Technical Officers (CTOs) almost every week. We understand. It’s easy to get excited about new technology, especially when it promises a shortcut to those most tantalizing of strategic objectives: innovation, competitive advantage, and efficiency. Few fields have recently generated such enthusiasm as data science, which broadly covers machine learning, artificial intelligence, and big data. Much of this eagerness is justified: the headline achievements have often been startling.
Less exciting, at least to the headline writers, are the managerial and architectural changes needed to support radical new practices as they emerge. Too few organizations have given thought to supporting data science as both a technical and a business practice. This is partly driven by a simple lack of knowledge about how machine learning, and particularly artificial intelligence, work. These practices have a mystique of their own, which can seem quite distant from IT’s day-to-day work. Companies often lack foresight, too. Results from predictive models need to be available and applicable in the real world of your operations, at a scale and with a reliability that match company demands.
Data warehousing is a well-established, widely available core technology that enables data science to power business at enterprise scale. The data warehouse, a concept first defined by Barry Devlin at IBM in 1985, is still a powerful technology and is far from being left behind by data lakes, pipelines, scripts, and algorithms. In fact, the data warehouse’s central role, to serve integrated data and a canonical model of operations, has never been more authoritative. Thanks to new cloud architectures and in-memory engines, the technical platform is still highly relevant.
In companies that stand out in the field of data science, three new organizational roles are emerging: the data scientist, the data engineer responsible for ensuring predictive models are production-ready, and a new generation of data-literate analysts in marketing, finance, and sales operations.