Behind the Build: MLflow
New Series Alert: Introducing Behind the Build 🩷
We’re excited to kick off a new series here at @zally: Behind the Build - your backstage pass to the tools and practices powering our machine learning.
First up: MLflow 🚀
At zally, we’re constantly evolving how we build and scale machine learning systems. One of the most impactful additions to our stack has been MLflow, which has helped us cut the lead time from experimentation to deployment from hours to minutes while also improving quality and reproducibility.
Before MLflow, tracking experiments was a manual process.
In the early days, our focus was on rapidly building and validating an initial behavioural model, so speed mattered more than structure. At that stage, using Excel sheets and informal documentation made sense. But as our models became more complex and zally matured, we needed a production-grade system that could support reproducibility, consistency, and scale.
MLflow changed that.
It’s now a core part of our ML infrastructure, bringing structure and reproducibility to our workflows.
With MLflow, every training run is automatically logged via the Python API, capturing hyperparameters, evaluation metrics, model artifacts (e.g. models, plots), and environment data (pip dependencies, Python version). This gives us full visibility into the reproducibility and evolution of each model across its lifecycle.
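To make that concrete, here is a minimal sketch of what logging a run through MLflow's Python API can look like. It is an illustration rather than our actual training code: the experiment name, the function names, and the choice to store the Python version as a tag are all assumptions made for the example.

```python
import platform


def describe_environment() -> dict:
    """Collect basic environment details to store alongside a run."""
    return {"python_version": platform.python_version()}


def log_training_run(params: dict, metrics: dict, artifact_paths: list) -> str:
    """Log one training run to MLflow and return its run ID."""
    # Imported lazily so describe_environment() works without MLflow installed.
    import mlflow

    # Hypothetical experiment name, chosen for this example only.
    mlflow.set_experiment("behavioural-model")

    with mlflow.start_run() as run:
        mlflow.log_params(params)                # hyperparameters
        mlflow.log_metrics(metrics)              # evaluation metrics
        mlflow.set_tags(describe_environment())  # environment details as tags
        for path in artifact_paths:
            mlflow.log_artifact(path)            # plots, model files, reports
        return run.info.run_id
```

A call such as log_training_run({"max_depth": 6}, {"auc": 0.91}, ["roc_curve.png"]) would record one run, with the values here being placeholders. When a model is logged through one of MLflow's model flavors (for example mlflow.sklearn.log_model), MLflow also stores the pip requirements and Python version next to the model files.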
We also use MLflow to package and deploy models as standalone servers. Using MLflow’s Python API, we build a Docker image that includes the trained model and all its dependencies. This image is deployed to our Kubernetes cluster, exposing a production-ready, scalable endpoint for real-time inference. This setup ensures consistency between training and production environments.
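As a rough sketch of that packaging step, the snippet below builds a Docker image with mlflow.models.build_docker and prepares a request body in the format MLflow 2.x scoring servers accept. The model URI, image name, and column names are hypothetical, and the Kubernetes deployment itself is out of scope here.

```python
import json


def build_model_image(model_uri: str, image_name: str) -> None:
    """Build a Docker image that serves the given MLflow model."""
    # Imported lazily so the payload helper below works without MLflow installed.
    import mlflow.models

    # Requires a running Docker daemon; the resulting image bundles the
    # model together with its recorded dependencies.
    mlflow.models.build_docker(model_uri=model_uri, name=image_name)


def make_scoring_payload(columns: list, rows: list) -> str:
    """Serialise rows into the 'dataframe_split' JSON format used for scoring."""
    return json.dumps({"dataframe_split": {"columns": columns, "data": rows}})


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only:
    # build_model_image("models:/behavioural-model/1", "behavioural-model-server")
    print(make_scoring_payload(["feature_a", "feature_b"], [[0.4, 1.2]]))
```

Once the container is running, a payload like this would be sent as a POST request to its /invocations endpoint with a Content-Type of application/json. Because the image is built from the same logged model and dependency list used in training, the serving environment stays aligned with the training one.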
As you can see, it’s more than just experiment tracking. MLflow has enabled us to:
Reproduce experiments in minutes, with full traceability.
Implement structured, auditable workflows for model deployment.
Stay compliant and in control by keeping a full record of how each model was built, trained, and deployed.
The result?
💨Faster iteration
👏Smoother deployments
🚀More reliable models in production
As our ML footprint continues to grow, MLflow is helping us stay ahead by ensuring every model is backed by a clear history, a reproducible process, and a path to production that’s as rigorous as it is efficient.
This is just the start.
More tools, more stories and more behind-the-scenes to come. Stay tuned 🚀