Walmart Live Better Hackathon

The thing about working with data scientists is that the hardest part of the engagement has nothing to do with the data. A model is only as useful as the question it's built to answer. And writing a question that a model can actually answer — specific enough to be solvable, broad enough to matter — is where most technical collaborations quietly fall apart.

At Walmart, I was brought in as Technical Program Manager on two parallel data science workstreams as part of the Live Better Hackathon — both emerging from the same underlying pressure: Walmart had made ambitious, public sustainability commitments, and needed to understand whether the operational data it was already sitting on could make them real.

My role across both workstreams was the same: partner with business stakeholders to identify and sharpen the problem statement, develop product requirements the data science team could execute against, prep and validate data sets, and manage the technical sprint from framing to model to output. The design thinking wasn't in the interface. It was in the question.

1

Workstream One

Productive Produce Prediction

Walmart's Quality Control managers had accumulated years of anecdotal evidence suggesting a pattern: weather events at growing locations (heavy rainfall, temperature extremes, high winds) seemed to correlate with produce rejections downstream at distribution centers. A rejection here means product that travels the full length of the supply chain only to be deemed unacceptable on arrival, often ending up as waste.

The hypothesis wasn't controversial. The problem was that it had never been systematically tested. Anecdote and data are different things, and only one of them can drive a supply chain decision.

We built the problem statement around a specific question: could we predict inbound produce rejections, before product ever reached a distribution center, based on weather conditions at the source fields? If yes, smarter sourcing decisions upstream could reduce rejections, improve in-stock rates, and divert product from landfill before it became waste.

The data architecture combined two sources. The first was Walmart's internal QC inspection records: inspection date, DC number, average core temperature at inspection, supplier name, origin country, and rejection result. The second was NOAA's publicly available weather data, from which we pulled temperature, precipitation, and wind speed for specific growing regions. We selected four to six commodity and growing-location combinations based on weather sensitivity and rejection history, deliberately mixing domestic and international origins to give the model breadth.
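To make the join concrete, here is a minimal sketch of how inspection records might be paired with weather at the source. Everything in it is an assumption for illustration: the field names, the sample values, the `region` key (which presumes each supplier and origin has already been mapped to a growing region), and the three-day lag and window. It is not the schema or the pipeline the team actually used.

```python
from datetime import date, timedelta

# Hypothetical QC inspection records; field names and values are illustrative.
inspections = [
    {"inspection_date": date(2021, 6, 10), "dc_number": 6001,
     "region": "region_a", "commodity": "lettuce", "rejected": True},
    {"inspection_date": date(2021, 6, 20), "dc_number": 6001,
     "region": "region_a", "commodity": "lettuce", "rejected": False},
]

# Hypothetical daily weather observations keyed by (region, day).
_daily = [
    (5, 31.0, 0.0, 12.0), (6, 36.5, 0.0, 30.0), (7, 35.0, 22.0, 41.0),
    (15, 24.0, 0.0, 10.0), (16, 25.0, 1.0, 9.0), (17, 23.0, 0.0, 11.0),
]
weather = {
    ("region_a", date(2021, 6, day)): {
        "tmax_c": tmax, "precip_mm": precip, "wind_kph": wind}
    for day, tmax, precip, wind in _daily
}


def weather_features(region, inspection_day, lag_days=3, window_days=3):
    """Summarize weather over a window that ends `lag_days` before inspection.

    The lag stands in for transit time from field to DC (an assumption).
    """
    end = inspection_day - timedelta(days=lag_days)
    days = [end - timedelta(days=i) for i in range(window_days)]
    obs = [weather[(region, d)] for d in days if (region, d) in weather]
    if not obs:
        return None  # no weather coverage for this inspection
    return {
        "max_temp_c": max(o["tmax_c"] for o in obs),
        "total_precip_mm": sum(o["precip_mm"] for o in obs),
        "max_wind_kph": max(o["wind_kph"] for o in obs),
    }


def build_training_rows(records):
    """Attach weather features to each inspection; skip rows with no coverage."""
    rows = []
    for rec in records:
        feats = weather_features(rec["region"], rec["inspection_date"])
        if feats is None:
            continue
        rows.append({**feats, "commodity": rec["commodity"],
                     "rejected": rec["rejected"]})
    return rows


training_rows = build_training_rows(inspections)
```

Each resulting row carries the weather summary next to the rejection result, which is the shape a model needs to learn from.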

The sprint produced a predictive model capable of flagging spoilage risk pre-shelf — not just a binary up-or-down signal, but a graduated risk estimate that could inform sourcing volume and routing decisions in advance of a product ever leaving the field. The model gave Walmart something its QC managers had never had: foresight with evidence behind it.
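The difference between a binary flag and a graduated estimate can be sketched in a few lines. The example below assumes a logistic form with made-up weights and tier cutoffs; the source does not state which model type was used, so read this as one plausible shape for the output, not the delivered model.

```python
import math

# Illustrative weights only (assumed); a real model would learn these from data.
WEIGHTS = {"max_temp_c": 0.12, "total_precip_mm": 0.05, "max_wind_kph": 0.03}
INTERCEPT = -6.0


def rejection_risk(features):
    """Return an estimated rejection probability between 0 and 1."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def risk_tier(probability, medium=0.3, high=0.6):
    """Bucket a probability into tiers; the cutoffs are hypothetical."""
    if probability >= high:
        return "high"
    if probability >= medium:
        return "medium"
    return "low"


calm = {"max_temp_c": 25.0, "total_precip_mm": 1.0, "max_wind_kph": 11.0}
stormy = {"max_temp_c": 36.5, "total_precip_mm": 22.0, "max_wind_kph": 41.0}

calm_tier = risk_tier(rejection_risk(calm))
stormy_tier = risk_tier(rejection_risk(stormy))
```

A tiered probability lets a sourcing team scale volume up or down by degree instead of reacting to a single yes-or-no alert.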

2

Workstream Two

Electric Boogaloo

Walmart had committed to transitioning its entire private fleet — more than 10,000 tractors — to 100% renewable energy by 2040 under Project 2040. The ambition was real. The infrastructure wasn't. Broad national charging capability for Class 8 heavy-duty electric vehicles didn't exist, and no one could build all of it at once. The question was where to start.

The data we had was remarkably rich: trip segment routing records for the full private fleet — origin type, destination type, city, state, zip, distance in miles, and dwell time at each stop — covering both outbound DC-to-store and inbound vendor-to-DC hauls across the U.S. We layered that against battery range and charging speed estimates for current HDEV prototypes, and integrated OpenStreetMap road network data to contextualize routes geographically.

The problem statement we built around was precise: given existing routes and current EV technology constraints, identify the locations where charging infrastructure would convert the highest volume of existing trips to electric operation — and sequence them so that early stations generate maximum return before lower-priority locations are built.

The geospatial model integrated vendor proximity, distribution density, and existing government EV infrastructure to surface an optimal first pilot market. The analysis selected Shafter, California, a distribution hub in the Central Valley with favorable route density, proximity to existing infrastructure, and geographic conditions suited to early electric fleet operation.

The model didn't just answer "where." It sequenced the "in what order" — building a prioritization logic that Walmart's transportation team could use as a construction roadmap, maximizing the electric conversion return from each station before moving to the next.
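One way to picture the sequencing logic is as a greedy coverage problem: at each step, build the station that converts the most trips not already covered. The sketch below assumes that framing, along with a simplified rule that a trip is convertible when it fits within battery range and a charger sits at either endpoint. The trip data, site names, and 150-mile range are all hypothetical, and the source does not say the actual model worked this way.

```python
def build_coverage(trips, range_miles):
    """Map each candidate site to the trips it could electrify.

    Assumed rule: a trip qualifies if its distance fits within range and
    the site is the trip's origin or destination.
    """
    coverage = {}
    for trip_id, origin, destination, miles in trips:
        if miles > range_miles:
            continue  # beyond assumed battery range
        for site in (origin, destination):
            coverage.setdefault(site, set()).add(trip_id)
    return coverage


def sequence_stations(coverage, max_sites=None):
    """Greedily order sites by how many new trips each one converts."""
    remaining = dict(coverage)
    covered = set()
    plan = []
    while remaining and (max_sites is None or len(plan) < max_sites):
        # Sorting first makes ties resolve alphabetically and reproducibly.
        site = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        gain = remaining.pop(site) - covered
        if not gain:
            break  # no remaining site adds any new trips
        covered |= gain
        plan.append((site, len(gain)))
    return plan


# Hypothetical trip segments: (trip_id, origin, destination, distance_miles).
trips = [
    (1, "dc_1", "store_a", 80),
    (2, "dc_1", "store_b", 120),
    (3, "vendor_x", "dc_1", 60),
    (4, "dc_2", "store_c", 90),
    (5, "dc_2", "store_d", 300),
    (6, "vendor_y", "dc_2", 40),
]

plan = sequence_stations(build_coverage(trips, range_miles=150))
```

Here `plan` lists sites in build order with the number of newly converted trips at each step, which is the kind of output a construction roadmap can be read from.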

Both problems had the same shape: large, messy operational data waiting for the right question, a sustainability commitment that needed to move from announcement to evidence, and a sprint that could only succeed if the problem was framed precisely enough before a single line of code was written.

My contribution

Problem Framing

Partnered with business stakeholders across both workstreams to translate broad sustainability goals into precise, modelable questions — defining the scope, variables, and success criteria before the data science work began.

Product Requirements

Developed detailed product requirements for each workstream — including data set specifications, model type selection, visualization requirements, and optional non-technical user interfaces — giving the data science team a clear execution target.

Data Set Design

Identified and structured the data sets for both models: QC inspection records + NOAA weather data for produce prediction; trip segment routing + OpenStreetMap + HDEV specs for EV optimization. Validated data fields and coverage before sprint kickoff.

Technical Sprint Management

Managed both workstreams from problem statement through model delivery — keeping cross-functional teams aligned, clearing blockers, and ensuring outputs met both the technical standard and the business objective.

The question is the work.
