
Sponsored

Make machine learning work for you

Achieve "ML democratization" and generate proof points by intaking problem statements, not technical requirements.

September 20, 2023

Provided by Capital One

The enthusiasm for AI and its applications is reaching a fever pitch. According to an August 2023 Gartner Hype Cycle press release, generative AI sits near the top of the "Peak of Inflated Expectations," ready to plunge into the "Trough of Disillusionment." A quick look at social media bears this out, with some pages filled with targeted advertisements for topics as prosaic as "GPT for your pile of receipts." This is good evidence that the AI craze is becoming a hammer looking for a nail.

Yet for all this fervor, McKinsey reports that while AI adoption has more than doubled since 2017, it has leveled off at around 50% to 60% over the past few years.

IBM finds that nearly half of the challenges related to AI adoption center on data complexity (24%) and difficulty integrating and scaling projects (24%). While it may be expedient for marketers to "slap a GPT suffix on it and call it AI," businesses striving to truly implement and incorporate AI and ML face a two-headed challenge: first, it is difficult and expensive, and second, because it is difficult and expensive, the "sandboxes" needed to enable experimentation and prove "green shoots" of value that would warrant further investment are hard to come by. In short, AI and ML are inaccessible.

Data, data, everywhere

History shows that most business shifts at first seem difficult and expensive. However, spending time and resources on these efforts has paid off for the innovators. Businesses identify new assets, and use new processes to achieve new goals—sometimes lofty, unexpected ones. The asset at the focus of the AI craze is data.

The world is exploding with data. A 2020 report by Seagate and IDC projected that enterprise data would grow at a 42.2% annual rate over the following two years. And yet, only 32% of that data is currently being put to work.

Effective data management (storing, labeling, cataloging, securing, connecting, and making queryable) has no shortage of challenges. Once those challenges are overcome, businesses still need users who are not only technically proficient enough to access and leverage that data, but also able to do so comprehensively.

Businesses today find themselves tasking garden-variety analysts with targeted, hypothesis-driven work. The shorthand is encapsulated in a common refrain: “I usually have analysts pull down a subset of the data and run pivot tables on it.”

To avoid tunnel vision and use data more comprehensively, this hypothesis-driven analysis is supplemented with business intelligence (BI), where data at scale is finessed into reports, dashboards, and visualizations. But even then, the dizzying scale of charts and graphs requires the person reviewing them to have a strong sense of what matters and what to look for—again, to be hypothesis-driven—in order to make sense of the world. Human beings simply cannot otherwise handle the cognitive overload.

The moment is opportune for AI and ML. Ideally, that would mean plentiful teams of data scientists, data engineers, and ML engineers who can deliver such solutions at a price that folds neatly into IT budgets. Also ideally, businesses would be ready with the right technology: the GPUs, compute, and orchestration infrastructure to build and deploy AI and ML solutions at scale. But much like in the business revolutions of days past, this isn't the case.

Inaccessible solutions

The marketplace offers a proliferation of solutions based on two approaches: adding even more intelligence and insights to existing BI tools, and making it increasingly easy to develop and deploy ML solutions in the growing field of ML operations, or MLOps.

BI is making significant inroads in augmenting its capabilities with ML, but it still has the intrinsic cognitive-overload challenge to overcome. Its ML capabilities are so embedded in BI interfaces that they aren't easily extracted and applied in more bespoke ways.

MLOps comes at the problem from the other direction, by easing the development and promotion of ML models. The challenge is that while MLOps makes data scientists and ML engineers more productive (more time building and training models, less time wrangling data, deploying, and productionizing), it doesn't address the fact that those very data scientists and ML engineers are scarce and expensive in the first place.

The onus is therefore on businesses to find solutions that enable traditional analysts, not just Ph.D.s, to become effective ML practitioners. This is ML democratization.

An ML democratization journey

Capital One began laying the foundations for the journey to ML democratization more than a decade ago, when it went all-in on the cloud, creating a modern computing environment that enables instant provisioning of infrastructure and increased processing power. This modern computing environment makes complex and large-scale data set analysis possible at increasing levels of efficiency.

Capital One adopted a philosophy of centralized and standardized platforms and governance. For AI and ML, it built an ML platform that provides engineers and scientists with governed access to algorithms, components, and infrastructure for reuse.

The computing environment and platform philosophy provided necessary, but not sufficient, ingredients to democratize ML. Infusing a "no hammers looking for nails" mantra, Capital One's team of ML engineers and data scientists took a business-problem-first approach: instead of gathering technical requirements, the team gathered problem statements.

For instance, Capital One’s credit card transaction fraud team looked for a way to comprehensively detect pockets of fraud and automatically create real-time defenses. So the company developed ML algorithms, components, and infrastructure to build a solution. In the process, those components were published to a central ML platform to be reused and improved upon for future business problems requiring similar approaches.

As organizations expand their range of business use cases and develop solutions, they often find recurring patterns that can be harnessed for wider benefit. Recognizing these patterns can lead to a powerful realization: by making commonly used ML libraries, workflows, and components accessible through user-friendly interfaces, businesses can unleash the potential of ML across their enterprise, without requiring deep data science or engineering expertise.
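To make that idea concrete, here is a minimal sketch of what one such packaged component might look like, assuming a Python stack built on pandas and scikit-learn. The function name, its parameters, and the choice of an isolation-forest model are illustrative assumptions, not a description of Capital One's actual platform.

```python
# Illustrative sketch only: one way a platform team might package an
# anomaly-detection workflow behind a single call, so an analyst supplies
# a table and a list of columns rather than model code. The function name
# (find_anomalies) and the use of scikit-learn's IsolationForest are
# assumptions for this example.
import pandas as pd
from sklearn.ensemble import IsolationForest


def find_anomalies(df: pd.DataFrame,
                   feature_columns: list[str],
                   contamination: float = 0.01) -> pd.DataFrame:
    """Return the rows of `df` flagged as anomalous on the given columns."""
    model = IsolationForest(contamination=contamination, random_state=0)
    labels = model.fit_predict(df[feature_columns])  # -1 marks an outlier
    return df[labels == -1]
```

A production component would also handle feature encoding, validation, and monitoring; the point of the sketch is only that the caller never touches the model.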

This democratization of ML serves as a solution to several challenges, including cognitive overload, resource constraints, and accessibility issues. It paves the way for a culture of experimentation, essential for turning ML into a valuable tool rather than just a passing trend.

Now, if a business analyst wants to identify anomalies or track trends in their portfolio's granular segments, or if a marketing associate wants to perform in-depth campaign analysis beyond what traditional analytics tools offer, ML can meet these needs with minimal demands on engineering resources.
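Continuing the sketch above, a hypothetical analyst-facing call might look like the following. The segment-level portfolio table and its column names are invented purely for illustration.

```python
# Hypothetical usage of the find_anomalies() sketch above; the portfolio
# segments and their columns are invented for illustration.
import pandas as pd

segments = pd.DataFrame({
    "segment_id": ["S001", "S002", "S003", "S004"],
    "avg_balance": [1200.0, 1350.0, 980.0, 15400.0],
    "delinquency_rate": [0.021, 0.018, 0.025, 0.190],
    "spend_growth": [0.03, 0.05, 0.02, 0.65],
})

flagged = find_anomalies(
    segments,
    feature_columns=["avg_balance", "delinquency_rate", "spend_growth"],
    contamination=0.25,  # toy table of four rows; flag roughly one of them
)
print(flagged[["segment_id", "delinquency_rate", "spend_growth"]])
```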

ML democratization transforms ML from a shiny object into a centerpiece of practical value. In a single working day, an analyst with no prior ML knowledge or coding skills can uncover insightful information from any dataset of their choice. This shift significantly reduces the cost of exploring ML's potential and its application across various business areas.

No-code ML solutions could play a pivotal role in achieving ML democratization. We’re already seeing it happen, and ML will continue to become more accessible through technology advancements including no-code solutions. This ML democratization will allow business analysts to confidently make decisions they wouldn’t have previously considered, resulting in profound and lasting impacts.

This content was produced by Capital One. It was not written by MIT Technology Review’s editorial staff.
