I believe that with time, we will figure out how to manage the unpredictabilities of data science better, just as we figured that out for software development.
So what can we do? First, let’s step back and revisit the Agile manifesto:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
Instead of the rituals of Agile, we need to go back to its essence and adapt it to machine learning.
In my experience, the following practices improve the probability of successfully deploying a machine learning project and making a business impact:
- Consolidate Ownership: Cross-functional team of product, developers, and data scientists responsible for the end-to-end project.
- Integrate Early: Implement a simple (maybe even a rule-based) model and develop product features around it.
- Iterate Often: Build better models and replace the simple model, monitor, and repeat.
Consolidating into a single team exposes data scientists and developers to each other's requirements early on.
Counterintuitively, integrating early actually decouples model development from software development (that great software engineering principle: cohesion over coupling), so each can follow its own cadence while staying in the same rhythm.
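To make the decoupling concrete, here is a minimal sketch of what integrating early could look like in code. Everything in it is an illustrative assumption rather than something from a real project: the churn use case, the `ChurnModel` interface, the `days_since_last_login` feature, and the weights are all made up. The point is only that the product feature is written once against a small interface, so the rule-based baseline can later be swapped for a learned model without touching the product code.

```python
import math
from typing import Mapping, Protocol


class ChurnModel(Protocol):
    """Hypothetical contract the product code depends on."""

    def predict(self, features: Mapping[str, float]) -> float:
        """Return a churn score between 0 and 1."""
        ...


class RuleBasedModel:
    """Day-one baseline: a hand-written rule, no training required."""

    def predict(self, features: Mapping[str, float]) -> float:
        # Assumed rule: users inactive for more than 30 days are at risk.
        return 1.0 if features.get("days_since_last_login", 0.0) > 30 else 0.0


class LogisticModel:
    """A later iteration: logistic scoring with weights trained elsewhere."""

    def __init__(self, weights: Mapping[str, float], bias: float) -> None:
        self.weights = dict(weights)
        self.bias = bias

    def predict(self, features: Mapping[str, float]) -> float:
        z = self.bias + sum(
            w * features.get(name, 0.0) for name, w in self.weights.items()
        )
        return 1.0 / (1.0 + math.exp(-z))


def should_offer_discount(
    model: ChurnModel, features: Mapping[str, float], threshold: float = 0.5
) -> bool:
    """Product feature, written once against the interface."""
    return model.predict(features) >= threshold


if __name__ == "__main__":
    user = {"days_since_last_login": 45.0}
    baseline = RuleBasedModel()
    # Illustrative weights only; a real model would be fit on data.
    improved = LogisticModel({"days_since_last_login": 0.1}, bias=-3.0)
    for model in (baseline, improved):
        print(type(model).__name__, should_offer_discount(model, user))
```

Developers can ship and test `should_offer_discount` against the baseline from the first sprint, while data scientists iterate on better models behind the same interface on their own schedule.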