Hidden Technical Debt in ML systems — A summary (2024)

The field of ML and AI is moving quickly and this paper was published a few years ago, but the topics discussed are still relevant and important. So, it is time to get this old draft of my article out.

Technical debt in software engineering is the long-term cost incurred by moving quickly on implementation and deployment. This debt significantly slows down maintenance and improvement activities.

ML systems are part software engineering and inherit many of the same problems, including technical debt. While ML solutions are relatively easy to develop and deploy, monitoring and maintaining them is not.

ML systems have the added complexity of ML-specific issues and constraints, in addition to those from software engineering. While software-related issues are easier to spot and rectify, ML-specific issues arise at the system level and are difficult to detect, giving rise to ‘hidden technical debt’ in ML systems.

Difficulty in enforcing abstraction boundaries

In software engineering, making code maintainable is easier than it is in ML. Software systems lend themselves well to abstraction boundaries.

ML systems, in contrast, do not: they use signals or features that are inherently entangled. Changing the distribution of one feature may change the weights or the importance of the other features. This is referred to as the CACE principle: Changing Anything Changes Everything.
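
The entanglement can be illustrated with a small sketch. It is not taken from the paper: the synthetic data, the function names and the noise levels are all assumptions chosen for illustration. Two features are noisy views of the same underlying signal, and a two-feature least-squares model is fitted twice. Only the noise level of the second feature changes between the two fits.

```python
import random


def fit_two_feature_ols(x1, x2, y):
    """Least-squares weights (no intercept) for y ~ w1*x1 + w2*x2."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * t for a, t in zip(x1, y))
    s2y = sum(b * t for b, t in zip(x2, y))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    w1 = (s22 * s1y - s12 * s2y) / det
    w2 = (s11 * s2y - s12 * s1y) / det
    return w1, w2


def fit_with_x2_noise(noise_x2, n=5000, seed=0):
    """Fit on synthetic data where both features are noisy views of one signal."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x1 = [s + rng.gauss(0.0, 1.0) for s in signal]       # x1's noise level is fixed
    x2 = [s + rng.gauss(0.0, noise_x2) for s in signal]  # only x2's noise level varies
    return fit_two_feature_ols(x1, x2, signal)


w1_before, w2_before = fit_with_x2_noise(noise_x2=1.0)
w1_after, w2_after = fit_with_x2_noise(noise_x2=0.1)
print(f"x2 noisy:   w1={w1_before:.2f}, w2={w2_before:.2f}")
print(f"x2 cleaner: w1={w1_after:.2f}, w2={w2_after:.2f}")
```

In this toy setup the weight on x1 falls from roughly a third to nearly zero once x2 becomes cleaner, even though the distribution of x1 itself is unchanged. Anything downstream that relied on x1's learned importance would be affected by a change made elsewhere.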

Models are sometimes cascaded, where a model for a new problem is learned on top of an existing one. Although this approach may be quicker than building a new model altogether, it creates a system-level dependency. Analyzing improvements becomes expensive, and improving any one model may decrease performance at the system level.

Some consumers may silently use the outputs of a model without being declared, a situation referred to as visibility debt. Changes made to the model are likely to impact these consumers in ways that are unintended or poorly understood.

Data dependencies

Data features are often signals produced by other systems. Some of these signals may change behaviour over time, either because of improvements made to them or because they are themselves the output of an ML system that updates itself over time.

ML models may also include features that contribute little to prediction quality, and in some cases the features may be unnecessary altogether. Keeping these features makes the system vulnerable to change. If there are legacy features, bundled features, incremental features or correlated features in the model, a re-examination is likely due.
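
One way to run such a re-examination, which the paper mentions, is a leave-one-feature-out evaluation. The sketch below is a minimal version under stated assumptions: the `evaluate` callback stands for "retrain and score on held-out data", and `toy_evaluate`, the feature names and the gain values are hypothetical placeholders rather than real measurements.

```python
from typing import Callable, Dict, List, Sequence


def leave_one_feature_out(
    features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Return, per feature, how much the score drops when that feature is removed."""
    baseline = evaluate(list(features))
    drops = {}
    for name in features:
        reduced = [f for f in features if f != name]
        drops[name] = baseline - evaluate(reduced)
    return drops


def removal_candidates(drops: Dict[str, float], min_drop: float = 0.001) -> List[str]:
    """Features whose removal costs less than min_drop are candidates for review."""
    return sorted(name for name, drop in drops.items() if drop < min_drop)


# Hypothetical per-feature gains, used only to make the example runnable.
HYPOTHETICAL_GAIN = {"clicks": 0.25, "age": 0.10, "legacy_flag": 0.0}


def toy_evaluate(feature_subset: List[str]) -> float:
    """Stand-in for retraining a model and scoring it on held-out data."""
    return 0.5 + sum(HYPOTHETICAL_GAIN[f] for f in feature_subset)


drops = leave_one_feature_out(list(HYPOTHETICAL_GAIN), toy_evaluate)
print(removal_candidates(drops))
```

The toy scorer is additive, which real models are not. With correlated features, removing either one alone may cost little while removing both costs a lot, so the output is a list of candidates to investigate rather than a list of features to delete.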

Direct and Hidden feedback loops

A model may feed back into the selection of its own training data. Any degradation in performance then becomes part of the feedback loop, resulting in a vicious circle.

While the direct feedback loop described above is easier to investigate, hidden feedback loops between systems that indirectly influence each other are trickier. Changes in one system may lead to undesirable effects in the other, often going unnoticed because the dependency is not known.

High-debt design patterns

The prediction component of an ML system is only a fraction of the entire system; the rest is tooling or software code. Code that connects these two worlds is referred to as glue code. It is only supporting code and adds no functionality of its own, yet it can be hidden and massive, making testing and developing alternatives difficult.
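
The paper's suggested mitigation is to wrap black-box packages behind a common API, so that the glue lives in one place. The sketch below shows one possible shape for this. `VendorScorer`, its `run_inference` method and the payload format are hypothetical stand-ins for a third-party package, not a real library.

```python
from typing import List, Protocol, Sequence


class Predictor(Protocol):
    """The one interface the rest of the system is allowed to depend on."""

    def predict(self, rows: Sequence[Sequence[float]]) -> List[float]:
        ...


class VendorScorer:
    """Hypothetical third-party package with its own calling convention."""

    def run_inference(self, payload: dict) -> dict:
        # Placeholder logic: score each row by the mean of its values.
        return {"scores": [sum(r) / len(r) for r in payload["instances"]]}


class VendorScorerAdapter:
    """Keeps all vendor-specific glue in a single, testable class."""

    def __init__(self, scorer: VendorScorer) -> None:
        self._scorer = scorer

    def predict(self, rows: Sequence[Sequence[float]]) -> List[float]:
        payload = {"instances": [list(r) for r in rows]}
        return list(self._scorer.run_inference(payload)["scores"])


def score_batch(model: Predictor, rows: Sequence[Sequence[float]]) -> List[float]:
    """Caller code sees only the Predictor interface, never the vendor API."""
    return model.predict(rows)


print(score_batch(VendorScorerAdapter(VendorScorer()), [[1.0, 3.0], [2.0, 2.0]]))
```

Swapping the package for an alternative then means writing one new adapter, instead of hunting for vendor calls scattered through the codebase.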

As new data sources are added to a data ecosystem, the number of ingestion pipelines also increases. Without an architecture that looks at data collection holistically, adding new sources can quickly become messy. Add a model to the mix, and you have a complex and interdependent system of scrapes, joins and sampling steps. Managing these pipelines, detecting errors and recovering from failures are difficult and costly, and they make further innovation costlier.

Experimental code paths that are no longer needed contribute to the growth of debt. These code paths make backward compatibility difficult to maintain. Testing the interactions between them is hard, and untested interactions can cause undesired effects in production.

Configuration of ML systems

ML systems have numerous parameters for data, features and algorithms, which can be configured until we get the desired performance. Configurations are sensitive, and the messiness of data can make modifying them difficult and error-prone. Incorrect configurations can prove costly in terms of lost time, wasted computing resources and, worse, production issues.
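
Two habits that reduce this risk are asserting basic facts about a configuration when it is loaded and making the difference between two configurations easy to see. The sketch below shows one way to do both. The field names and the allowed ranges are illustrative assumptions, not recommendations from the paper.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical training configuration, validated on construction."""

    learning_rate: float
    batch_size: int
    features: Tuple[str, ...]
    decision_threshold: float = 0.5

    def __post_init__(self) -> None:
        # Fail fast, before any compute is spent on a bad configuration.
        if not 0.0 < self.learning_rate < 1.0:
            raise ValueError(f"learning_rate out of range: {self.learning_rate}")
        if self.batch_size <= 0:
            raise ValueError(f"batch_size must be positive: {self.batch_size}")
        if not self.features:
            raise ValueError("at least one feature is required")
        if len(set(self.features)) != len(self.features):
            raise ValueError("duplicate feature names")
        if not 0.0 <= self.decision_threshold <= 1.0:
            raise ValueError(f"decision_threshold out of range: {self.decision_threshold}")


def diff_configs(old: TrainingConfig, new: TrainingConfig) -> Dict[str, Tuple[Any, Any]]:
    """Map each changed field to its (old, new) values."""
    old_d, new_d = asdict(old), asdict(new)
    return {key: (old_d[key], new_d[key]) for key in old_d if old_d[key] != new_d[key]}


baseline = TrainingConfig(learning_rate=0.01, batch_size=64, features=("clicks", "age"))
candidate = TrainingConfig(learning_rate=0.01, batch_size=128, features=("clicks", "age"))
print(diff_configs(baseline, candidate))
```

Freezing the dataclass means a configuration cannot be silently mutated after validation; a change has to go through construction, and therefore through the checks, again.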

Changing external world

An ML system often interacts with the external world, which is unstable. The data, or the mapping between inputs and outputs that an ML system relies on, could change. This implies a need for constant monitoring and testing of the system, which creates ongoing maintenance costs.
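
One monitoring signal the paper discusses is prediction bias: in a system that is working as intended, the distribution of predicted labels should match the distribution of observed labels. The sketch below is a minimal check for a binary classifier. The function names, the sample values and the 0.05 tolerance are assumptions for illustration only.

```python
from statistics import fmean
from typing import Sequence, Tuple


def prediction_bias(predicted_probs: Sequence[float], observed_labels: Sequence[int]) -> float:
    """Mean predicted probability minus the observed positive rate."""
    if not predicted_probs or len(predicted_probs) != len(observed_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    return fmean(predicted_probs) - fmean(observed_labels)


def check_prediction_bias(
    predicted_probs: Sequence[float],
    observed_labels: Sequence[int],
    tolerance: float = 0.05,  # illustrative threshold, tune per system
) -> Tuple[bool, float]:
    """Return (is_healthy, bias); is_healthy is False when |bias| exceeds tolerance."""
    bias = prediction_bias(predicted_probs, observed_labels)
    return abs(bias) <= tolerance, bias


# Hypothetical batches: one where predictions track outcomes, one where they do not.
healthy, bias = check_prediction_bias([0.2, 0.8, 0.6, 0.4], [0, 1, 1, 0])
print(healthy, round(bias, 3))

drifted_ok, drifted_bias = check_prediction_bias([0.9, 0.8, 0.9, 0.7], [0, 1, 0, 0])
print(drifted_ok, round(drifted_bias, 3))
```

A check like this is cheap to run on every batch, but it only catches shifts in the average. A model can be badly wrong on individual slices while the overall bias stays near zero, so slicing the same check by key dimensions is a natural next step.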

Although technical debt is useful as a concept, there isn’t an appropriate metric to measure it over time. However, there are some questions we could ask that help in assessing the extent and nature of the debt.

Technical debt is an issue in both engineering and research. Solutions that offer tiny improvements at the cost of a significant increase in system complexity, or the addition of one or two data sources without due diligence, can lead to accumulation of debt.

ML tech debt is becoming increasingly important to address and the authors hope the paper encourages development in areas of maintainable ML. However, these improvements alone will not be sufficient. The authors note the need for a culture that supports recognizing, prioritizing and rewarding efforts that contribute to long term health of the ML systems.

Not all of the sources identified here necessarily contribute to tech debt. For example, glue code is a great way to add abstraction and link layers together. Insufficient documentation and patchy design are what allow glue code to contribute to tech debt. Tech debt accumulates as a result of poor practices around these components or tools.

Costs in ML arising from a changing external world (retraining, changing thresholds, testing, etc.) cannot be categorized as tech debt. These costs would present themselves irrespective of good practices; they are ‘out of control’ factors.

The paper is, however, a good overview of the factors that affect the overall ML cycle time and the quality of the solutions.

Further reading

  1. https://matthewmcateer.me/blog/machine-learning-technical-debt/
  2. https://blog.metaobject.com/2021/06/glue-dark-matter-of-software.html
  3. https://convincedcoder.com/2019/04/27/Software-architecture-boundaries/