
To be agile, or not to be (ML4Devs Newsletter, Issue 3)

Machine Learning for Developers
To be agile, or not to be, that is the question.
I guess the answer depends on whom you ask.
I have seen many Data Scientists bitterly oppose Agile and Scrum.
The crux of the argument is that Data Science is science and not engineering. Therefore:
  • Estimating the time requirement is very difficult.
  • Its nature is not iterative: unlike software, you can’t build a piece that partly works, and then fill in more pieces to make it more complete.
  • Its nature is waterfall: when an idea doesn’t work well, you might have to go all the way back to tweaking the problem formulation and collecting a different kind of data.
  • Agile means more meetings (stand up, sprint planning, retrospective, etc.) and less work.
  • Agile means constant change of priorities (as a consequence of constantly evolving understanding of requirements and business needs).
  • Agile Methodology makes you mechanical and hinders creativity.
In some sense, and to some extent, all of it is true.
Déjà vu for “old” enough Software Engineers.
Interestingly, software engineers who are old enough will feel déjà vu. Programmers had the same arguments in the late 90s:
  • Programming is part art and part science. It is a highly creative process.
  • Estimating software development efforts is a notoriously hard problem.
  • When you discover a problem in the software design, often you have to go back to the very beginning (i.e. it’s waterfall-ish).
  • Do you want me to sit in so many meetings for requirement review, design, estimate, integration plan, test plan, or do you want me to code and finish the stuff?
And here we are! Now most developers follow some kind of iterative process, and data scientists often think that engineers and managers don’t get “science and research”.
Just as software was (and is) only a means to an end, data science and machine learning are means to business goals.
So, what can we do?
I believe that with time, we will figure out how to manage the unpredictabilities of data science better, just as we figured that out for software development.
First, let’s step back and revisit the Agile Manifesto:
  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan
Instead of following the rituals of Agile, we need to go back to its essence and adapt it to machine learning.
In my experience, I have found that the following improves the probability of successfully deploying a machine learning project and making a business impact:
  • Consolidate Ownership: Cross-functional team of product, developers, and data scientists responsible for the end-to-end project.
  • Integrate Early: Implement a simple (maybe even a rule-based) model and develop product features around it.
  • Iterate Often: Build better models and replace the simple model, monitor, and repeat.
Consolidating into a single team makes data scientists and developers aware of each other’s requirements early on.
Counterintuitively, integrating early actually decouples model development from software development (that great software engineering principle: cohesion over coupling). The two can follow different cadences and still stay in the same rhythm.
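Here is a minimal Python sketch of what integrating early and iterating often could look like. It is only an illustration under my own assumptions: the spam-routing feature and the names SpamModel, RuleBasedModel, ScoredModel, and route_message are all hypothetical, and ScoredModel merely stands in for whatever trained model replaces the rules later.

```python
from typing import Dict, Protocol


class SpamModel(Protocol):
    """Hypothetical contract that the product code depends on."""

    def predict(self, text: str) -> bool: ...


class RuleBasedModel:
    """First iteration: a few hand-written rules, no training data needed."""

    KEYWORDS = ("free money", "click here", "winner")

    def predict(self, text: str) -> bool:
        lowered = text.lower()
        return any(keyword in lowered for keyword in self.KEYWORDS)


class ScoredModel:
    """Later iteration: a stand-in for a trained model behind the same contract."""

    def __init__(self, weights: Dict[str, float], threshold: float = 1.0) -> None:
        self.weights = weights
        self.threshold = threshold

    def predict(self, text: str) -> bool:
        # Sum per-token weights; unknown tokens contribute nothing.
        score = sum(self.weights.get(token, 0.0) for token in text.lower().split())
        return score >= self.threshold


def route_message(text: str, model: SpamModel) -> str:
    """Product feature written against the contract, not a specific model."""
    return "spam-folder" if model.predict(text) else "inbox"


if __name__ == "__main__":
    baseline = RuleBasedModel()
    print(route_message("You are a WINNER, click here", baseline))

    # Swapping in a better model needs no change to route_message.
    better = ScoredModel({"prize": 0.6, "urgent": 0.5})
    print(route_message("urgent prize inside", better))
```

Developers can ship and test route_message against the rule-based model from day one, while data scientists improve the model on their own schedule and swap it in when it is ready.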
It has started.
Some of it is already happening:
So, what do you think? What parts of Agile philosophy and process are suitable to adopt in data science and for taking machine learning to production? Do reply and let me know.
Photo by Matt Bowden on Unsplash. https://unsplash.com/photos/GZc4fnQsaWQ
If you enjoyed this issue, please share it with your team members. Please connect on Twitter or LinkedIn, and send your feedback, experiences, and suggestions.
Satish Chandra Gupta @scgupta

ML4Devs is a biweekly newsletter for software developers.

The aim is to curate resources for practitioners to design, develop, deploy, and maintain ML applications at scale to drive measurable positive business impact.

Each issue discusses a topic from a developer’s viewpoint.
