Walmart: ChatGPT checkout converted 3x worse than website
354 points - last Thursday at 7:42 PM
> Why this is happening. Two forces are slowing agentic commerce, according to Leigh McKenzie, director of online visibility at Semrush: infrastructure and trust. Real-time catalog normalization across tens of millions of SKUs is a decade-scale problem Google already solved with Merchant Center, and consumers still default to checkout flows they trust — Apple Pay, Google Wallet, and Amazon one-click.
It turns out when you step outside of “hard tech” problems like building GPT6 there are all of these details others have solved already. E-commerce has been optimized to the last decimal point for the last 30 years.
OpenAI is new to it, and if I had to guess, not that interested in getting good at it.
Why would anyone want an extra layer of friction where things could go wrong, handing over payment details to yet another link in the chain?
Just let me buy my stuff in peace. Shopping is not the 'killer app' for GenAI.
A chat interface is just fundamentally incompatible with this. The agent makes it too easy to ask questions and comparison shop.
Walmart does not, over 10 years after they were released, even accept the contactless payment systems in common use. Instead, they push their in-house version in part so they can capture the relevant customer data.
And we're meant to believe that Walmart planned to outsource the entire series of touchpoints represented by the discovery & checkout process? Yeah, okay.
This was never going to be more than an experiment for Walmart.
You won't get it to push your products when users ask what's the best XYZ - either because it'll be too honest to lie or because it'll be too expensive for you.
There’s a lot of this going on in AI at the moment. New folks come in thinking they have a magic solution and then produce a total train wreck as it turns out domain expertise is still a thing.
I’m currently using Gemini to research components for a remote controlled plane. I have the frame of the plane and now need to buy correctly specced servo motors, an engine, battery, etc etc. It has saved me so much time and educated me tremendously on how the different components interact and the options available.
If I could just press “buy” from within Gemini and pay via Google Pay (or better still, Apple Pay) I’d do it in a heartbeat.
If ChatGPT can do this today, I need to try it.
"I need mayo, ketchup, mustard and ground beef"
"Here is a list of products with prices ... proceed to pay $25 (yes/no)" Yes
"Your card has been charged. Delivery will knock on your door in 7 minutes"
I'll code that app in one month, what's there to lose?
The better comparison might be conversion rate for those who searched on Walmart.com vs those who searched within ChatGPT. Or maybe that is what they're comparing and I misunderstood?
I get all my groceries delivered to my doorstep via Walmart delivery pass. The thing I'm really missing is having AI curate meal planning to my family's preferences. I already feed ChatGPT my family's preferences (e.g. Kid A doesn't eat X Y Z and liked meal A B C, kid B likes ...) and ChatGPT is helping me build meal plans. With my preferences we can quickly nail down a meal plan for the week.
The slowest part of my meal planning is going through Walmart's slow site where each page load is 2-3 seconds and it takes several page loads per item. Once it can translate my meal plan into a grocery checkout from Walmart I'm all set.
The variable isn't whether AI is present. It's whether AI makes decisions well. A checkout flow where the AI makes worse purchase decisions than a static website is the consumer-facing version of the same problem enterprises face with AI agents: capability without governance = worse outcomes, not better.
Most people using AI chat are exploring ideas and solutions. They’re doodling, not shopping. Or in old timey parlance, they’re looky-loos or tire kickers at best.
Anyone who’s had to justify ad spend in e-commerce can tell you that some sources produce huge traffic with absolutely terrible conversion. Reddit and Pinterest pretty much blow for this reason, with limited exceptions. It’s also why TikTok and other influencer platforms really work.
Conversion requires a mental shift from discovery to demand.
Also, really hate summaries like this without the actual source so here are the main points from the actual source (WIRED https://archive.is/7DuEV):
1. Instant Checkout inside ChatGPT performed poorly, with conversion about one-third of Walmart’s normal site.
2. The experience failed largely because it forced single-item purchases instead of letting users build a cart.
3. Walmart is shifting to embedding its own assistant, Sparky, inside ChatGPT and keeping checkout on its own system.
4. ChatGPT is still valuable because it’s driving significantly more new customer traffic than search.
5. Purchases that did work were mostly practical, problem-solving items like supplements and tools.
6. Fully automated “agentic shopping” is still unlikely in the near term because people want control over purchases.
7. OpenAI is moving away from in-chat checkout and focusing on helping users research while merchants handle transactions.
In short, AI is useful for discovery, but traditional e-commerce flows still outperform it at closing sales.
Speed is your greatest feature. LLMs are slow. Loading 450 MB of JavaScript to the client just to buy a bag of Doritos is slow.
Server side rendering owns here.
I sort of trust them to make product recommendations, but at best I will only open a link they suggest and buy the product there.
The next generation will shop in a different way, if it's better, and the change will be gradual as well.
Adoption takes time.
If you want to buy a Walmart product, the easiest way is to go to Walmart. Why add an imprecise middle man in between?
Your product has to be a 10x improvement over the incumbent to be competitive.
In AI speak it would be the “extra-bitter” lesson I guess?
You need to add 10x resources to beat a product that’s already solved with mature tech.
The enshittification is upon us.
The latest AI is trained on the average citizen's social media output. IQ 90.
That’s why AI seemed smart. The bar will not be raised again. We’re cooked.
Perhaps clickthrough is worse because there are fewer dark patterns involved and people are mostly just browsing and occasionally buying only what they need.
They didn't really seem to specify the "why" of it with any research. And weird that OAI wasn't supporting them to see what the issue was.