Voice Commerce’s Quiet Maturation
Voice commerce has long been the technology that retail analysts predicted would transform shopping but that never quite seemed to arrive. For years, it hovered at the periphery of e-commerce strategy, associated with early-adopter smart speaker owners asking Alexa to reorder paper towels. In 2026, that characterisation has become outdated. Voice commerce has quietly matured into a substantial and fast-growing segment of digital retail, driven by advances in conversational AI, the proliferation of voice-enabled devices across cars and stores and living rooms, and a generational shift in consumer behaviour that is reshaping how people discover, evaluate, and purchase products.
The Market Reality
The voice commerce market has reached a scale that demands serious attention. According to The Business Research Company, the global market grew from $150.34 billion in 2025 to $194.03 billion in 2026, a year-over-year increase of 29.1%. The market is projected to reach $484.09 billion by 2030, sustaining a compound annual growth rate (CAGR) of 25.7%. A separate analysis by Technavio forecasts that the market will grow by $103.36 billion between 2025 and 2030, at a CAGR of 23.9%, and identifies biometric voice payment authentication systems as a key growth driver. The voice-activated shopping assistants market, a related category, grew from $6.84 billion in 2025 to $8.01 billion in 2026 and is expected to continue growing at a CAGR of 18.49% to reach $22.45 billion by 2032.
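For readers who want to check the growth arithmetic, a minimal Python sketch follows. It uses only the Business Research Company figures quoted above; the function names `cagr` and `project` are illustrative and not taken from any report.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two values, as a fraction."""
    return (end / start) ** (1 / years) - 1


def project(value: float, rate: float, years: float) -> float:
    """Value after compounding `rate` (a fraction) for `years` years."""
    return value * (1 + rate) ** years


# Figures quoted in the text, in billions of US dollars.
growth_2025_2026 = cagr(150.34, 194.03, 1)
implied_2026_2030 = cagr(194.03, 484.09, 4)
projected_2030 = project(194.03, 0.257, 4)

print(f"2025 to 2026 growth: {growth_2025_2026:.1%}")
print(f"Implied 2026 to 2030 CAGR: {implied_2026_2030:.1%}")
print(f"2030 value at 25.7% a year: ${projected_2030:.2f}B")
```

Over a single year the compound rate is simply the year-over-year growth, which is why the first result matches the 29.1% quoted. Compounding the 2026 figure at 25.7% for four years lands within half a billion dollars of the reported 2030 value; the small gap is consistent with the rate having been rounded.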
Within the United States, the voice commerce market alone is estimated at approximately $22.4 billion in 2026, according to eMarketer. Grand View Research projects the global market will reach $186.28 billion by 2030 at a 24.6% CAGR, a far lower figure than The Business Research Company’s forecast. The estimates vary widely from firm to firm, and the 2030 figures are forecasts, but the 2025 and 2026 figures are presented as measurable transaction volume across platforms and devices rather than speculation. As the Voice Commerce Market Report 2026 notes, the growth is being propelled by expanding connected home ecosystems, integration with wearable devices, the development of multilingual voice assistants, partnerships between retailers and voice assistant providers, and enhanced AI-driven personalisation. Major companies operating in the voice commerce market include Amazon, Apple, Alphabet, Samsung and Alibaba.
What makes these numbers particularly significant is that they represent a market that has matured beyond novelty. The early phase of voice commerce, driven primarily by smart speaker adoption, has given way to a more diversified ecosystem in which voice assistants are embedded in smartphones, vehicles, televisions and wearable devices. The growth is no longer dependent on a single device category or use case.
How Consumers Are Actually Using Voice Commerce
The consumer behaviour data reveals a more nuanced picture than the headline market size figures suggest. Approximately 49.6% of US consumers now use voice search for shopping-related activities, which includes browsing, comparing prices and adding items to lists. Half of all consumers have made a purchase using a voice assistant at some point. These numbers confirm that voice has entered the mainstream of shopping behaviour.
However, the conversion data tells a more complex story. Only 2.8% of voice commerce sessions end in a voice-only purchase. The real conversion happens when shoppers start with voice and finish on a screen: 14.2% of voice-initiated sessions convert when continued on another device, according to eMarketer data for 2026. This gap is instructive. It means voice is not replacing the traditional checkout flow. It is becoming another entry point, a discovery and initiation layer that feeds into screen-based purchasing rather than substituting for it.
The generational divide is stark and strategically significant. Around 30% of Gen Z consumers shop by voice every week. Millennials follow at 27.6%. Gen X drops to 14.9%, and only 6.8% of Boomers use voice for weekly purchases. For retailers whose customer base skews younger, voice is already embedded in buying habits. For those serving older demographics, the adoption curve provides more time, but the direction of travel is unmistakable.
Voice commerce also exhibits distinctive purchase characteristics. The average voice commerce order value sits at approximately $34, compared to $86 for traditional e-commerce, according to OC&C Strategy Consultants. Voice works best for simple, repeat purchases: household supplies, groceries and personal care items. About 44% of smart speaker users order household items weekly through their devices. Despite the smaller baskets, voice-initiated carts are abandoned at a rate of 42%, significantly lower than the standard e-commerce cart abandonment rate of roughly 70%. The friction reduction inherent in voice ordering appears to help consumers commit once they begin the purchase process. Reorder conversion rates via voice reach 28% for repeat purchases of known items, a compelling figure for any retailer selling consumable or replenishable products.
The willingness to engage with voice-based shopping assistance extends beyond simple reordering. KPMG’s Consumer Snapshot 01/2026, produced in collaboration with the EHI Retail Institute and published in March 2026, found that 55.7% of all consumers would seek advice from chatbots when shopping in the future. Among 18- to 24-year-olds, that figure rises to 77.8%. Only 9.4% of consumers reject digital consultations outright, and in the 18-to-24 age group, that figure is zero percent. The research also found that digital consultations can specifically help to reduce decision uncertainty and lower purchase abandonment rates in complex product ranges.
The In-Car Commerce Channel
One of the most overlooked dimensions of voice commerce is its migration into vehicles. About 15% of voice purchases are now initiated in vehicles, up from 6% in 2023. As automotive voice assistants reach 240 million active users globally, the vehicle is emerging as a genuine shopping channel for impulse and convenience purchases. The in-car environment, where hands-free interaction is a safety requirement rather than a convenience, represents a natural fit for voice commerce that is only beginning to be exploited.
A major survey of US drivers published in January 2026 found that 73% were interested in using in-car voice commerce, with 76% interested in using it to order food, 73% for vehicle maintenance, 71% for parking, 59% for entertainment planning, and 58% for impulse retail purchases. The automotive voice recognition market is expected to grow to $5.08 billion in 2026.
The industry is moving decisively. At the Beijing Auto Show in April 2026, Alibaba announced the integration of its Qwen AI voice model into vehicles from nine Chinese carmakers (BYD, Geely, Li Auto, Changan, Dongfeng, BAIC, Great Wall Motor, SAIC Volkswagen, and SAIC IM Motors), enabling drivers to order food, book hotels, buy tickets, and track packages via voice commands. Volkswagen separately announced plans to deploy voice AI technology from Alibaba, Tencent and Baidu in China-market vehicles from the second half of 2026. BMW, meanwhile, plans to introduce a new generation of in-car voice services in the second half of 2026 built on Amazon’s Alexa+ AI architecture. The race among technology giants to capture automotive software revenue streams is well underway, and voice commerce sits at its centre.
How Major Retailers Are Embedding Voice Commerce
The past six months have seen a cascade of announcements from the largest retailers in the United States, each of which is betting that voice and conversational AI will become primary interfaces for commerce.
Amazon made the most sweeping move. On May 13, 2026, the company began rolling out Alexa for Shopping to all US customers, free, with no Prime membership, no Echo device and no Alexa subscription required. The new AI-powered shopping assistant replaces the Rufus chatbot and embeds directly into the Amazon search bar across the app, website, and Echo Show devices. At launch, Alexa for Shopping supports setting price alerts, comparing items, and automatically reordering products. Customers can set parameters such as “add this sunscreen to my cart if the price drops to $10 and I have not purchased it in the last two months”. Rajiv Mehta, Amazon’s VP of Conversational Shopping, described it as “like having an expert personal shopper who already knows you and remembers your preferences, your past purchases, and your conversations, and carries that knowledge and understanding of you across your phone, laptop, and Echo devices”. The assistant also introduces “scheduled actions” that automatically look for products and deals, and can even shop on other websites via an agentic “Buy for Me” feature.
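The sunscreen instruction quoted above is, in effect, a two-condition rule: a price threshold and a recency check. The Python sketch below shows one way such a rule could be expressed. It is purely illustrative and is not Amazon’s implementation; the class `PriceAlertRule`, its fields, and the reading of “two months” as 60 days are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class PriceAlertRule:
    """Hypothetical rule: add an item once it is cheap enough and not recently bought."""
    item: str
    max_price: float              # add to cart at or below this price
    min_days_since_purchase: int  # assumed: "two months" treated as 60 days

    def should_add_to_cart(self, current_price: float,
                           last_purchased: Optional[date], today: date) -> bool:
        if current_price > self.max_price:
            return False
        if last_purchased is None:
            return True  # never bought before, so the recency condition holds
        return today - last_purchased >= timedelta(days=self.min_days_since_purchase)


rule = PriceAlertRule(item="sunscreen", max_price=10.00, min_days_since_purchase=60)
today = date(2026, 6, 1)
print(rule.should_add_to_cart(9.99, date(2026, 3, 1), today))   # price and recency both satisfied
print(rule.should_add_to_cart(9.99, date(2026, 5, 15), today))  # bought 17 days earlier
print(rule.should_add_to_cart(12.50, None, today))              # price still too high
```

Only the first call returns `True`. Keeping the rule as plain data, separate from whatever interprets the spoken request, is one design that would let a customer review or cancel it later.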
Walmart has taken a multi-partner approach. At the National Retail Federation’s Big Show in New York in January 2026, Walmart and Google announced a partnership that lets shoppers find and buy products directly through Google’s Gemini AI chatbot. Incoming Walmart CEO John Furner and Google CEO Sundar Pichai unveiled the deal together. “The transition from traditional web or app search to agent-led commerce represents the next great evolution in retail,” Furner said. This followed a similar deal that Walmart struck with OpenAI’s ChatGPT in October 2025 for “Instant Checkout,” which lets customers buy without leaving the chatbot. Walmart has also partnered with Apple to make its voice-ordering capability available through Siri, allowing users to add items to their online grocery cart by voice and complete purchases through a Siri Shortcut.
Kroger, America’s largest supermarket chain, went live with full-scale agentic commerce in January 2026 through an expanded partnership with Google Cloud. The integration includes both a Meal Assistant and a Shopping Assistant that handle complex, multi-step tasks from a single instruction. Yael Cosset, Kroger’s executive vice-president and chief digital officer, described the vision: “A customer planning a week of dinners, seeking recipe inspiration, or jumping into a new food regimen, will be able to ask our integrated assistant to create a shopping list based on their immediate needs, their budget, and family’s unique preferences”. Critically, Kroger’s recommendations are grounded in actual assortment, pricing, and availability — not generic suggestions. The system can convert requests like “I want to prepare vegan tomato soup” into guided recipes with detailed ingredient lists that populate shopping carts with a single click.
Lowe’s became one of the first retailers to launch Google’s Business Agent, which went live during NRF 2026. The agent allows customers searching for Lowe’s on Google to engage in conversation and complete transactions without leaving the browser. Neelima Sharma, SVP of technology for e-commerce and omnichannel product, described it as delivering “the voice of Lowe’s, the knowledge of Lowe’s”.
A broader philosophical shift was on vivid display at NRF 2026. As Rob Frieman, CIO of URBN, told the audience: “We’ve spent the last decade saying, no bots on our site. Now we’re saying the opposite. We’re saying, bring on all the bots, buy all the things.” That line captured the whiplash retailers experienced at the conference, where agentic commerce dominated stage conversations, booth demos, and hallway debates.
Voice AI Moves to the Retail Sales Floor
Perhaps the most significant innovation in voice commerce during 2026 involves not the consumer-facing interface but the retail associate. At Mobile World Congress in February 2026, a major voice AI company launched a sales-assist agent, a voice-powered AI tool designed for in-store retail teams. The technology provides real-time, data-driven prompts directly to floor staff during live customer conversations, analysing intent and context and delivering instant recommendations, from upgrades and bundles to cross-sell opportunities and compliance disclosures, directly to a tablet or any device with a microphone and a screen.
Instead of having customers wait while associates log into multiple systems, search through pricing plans, or manually calculate upgrade options, the sales-assist agent surfaces the right offer instantly within the natural flow of conversation. Powered by proprietary automatic speech recognition technology, the AI is purpose-built for fast-paced, noisy retail environments with minimal latency. It works by orchestrating multiple specialised AI agents that securely access CRM, billing, promotions, product databases, and coverage tools. “With AI maturing, voice is evolving into a central customer interface that doesn’t just respond but resolves,” said one executive involved in the launch. The company’s stock rose over 6% on the day of the announcement.
The Drive-Thru Reinvention
No category illustrates the maturation of voice commerce more vividly than quick-service restaurants, where voice AI is being deployed at scale in one of the most demanding acoustic environments imaginable: the drive-thru lane.
McDonald’s is embarking on a renewed push in 2026 after its earlier pilot with IBM ended in 2024 due to accuracy and reliability challenges. The company’s new effort is built on its partnership with Google Cloud and includes voice-activated AI chatbots that act as virtual assistants, taking orders at the drive-thru to alleviate pressure on staff, speed up order taking, and reduce wait times. McDonald’s is also deploying AI-driven accuracy scales that automatically weigh each bag of food before it is handed to the customer, prompting staff to check the contents if the weight does not match the expected order. The “Ready on Arrival” program, which uses mobile app geofencing to detect when customers are near and alert the kitchen to begin preparation, is being scaled up in key markets including the US, Japan, and the UK.
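The logic behind an accuracy scale of this kind can be sketched in a few lines of Python: compare the measured bag weight with the weight the order should have, and flag anything outside a tolerance. The item weights, bag weight, and 5% tolerance below are invented for illustration and do not describe McDonald’s actual system.

```python
# Hypothetical per-item packed weights in grams; real values would come from the menu system.
ITEM_WEIGHTS_G = {"burger": 215.0, "fries": 110.0, "drink": 450.0}


def expected_weight(order: dict[str, int], packaging_g: float = 20.0) -> float:
    """Sum of item weights times quantities, plus an assumed bag weight."""
    return packaging_g + sum(ITEM_WEIGHTS_G[item] * qty for item, qty in order.items())


def needs_check(order: dict[str, int], measured_g: float, tolerance: float = 0.05) -> bool:
    """True when the measured weight differs from the target by more than `tolerance` (a fraction)."""
    target = expected_weight(order)
    return abs(measured_g - target) > tolerance * target


order = {"burger": 2, "fries": 1}
print(expected_weight(order))     # 20 g bag + two burgers + one fries
print(needs_check(order, 555.0))  # within 5 percent of the target
print(needs_check(order, 340.0))  # roughly one burger missing
```

With these assumed weights the target is 560 g, so a 555 g bag passes and a 340 g bag is flagged for a manual check. A relative tolerance is used here so that larger orders get proportionally more slack, though a real deployment might prefer per-item tolerances.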
Wendy’s has taken a different path. The company built its FreshAI system with Google Cloud, training models on its specific menu and speech patterns, and treating the AI as a co-pilot rather than a full replacement. Escalation paths to humans remain built in. Metrics track order accuracy, handle time, escalation rates, and satisfaction scores. By mid-2025, Wendy’s had confirmed deployment across hundreds of locations, with further growth planned into 2026. The chain’s approach emphasises augmentation over automation.
The broader landscape has expanded rapidly. As of 2026, Taco Bell, Wendy’s, McDonald’s, and dozens of smaller chains have active voice AI deployments across hundreds of locations. Yet consumer sentiment remains divided. A YouGov survey from January 2025 found that 55% of Americans preferred a human taking their drive-thru order, while only 4% preferred AI. The operational reality, however, is that labour shortages, wage pressures, and peak-time bottlenecks are pushing operators to invest regardless of sentiment surveys. Voice AI at the drive-thru is no longer experimental; it is operational, and the question facing the industry is not whether to deploy it but how to deploy it well.
The Trust Barrier
For all the technological progress, a significant obstacle stands between voice commerce and its full potential: consumer trust. The question is not whether the technology works (in controlled environments, voice recognition accuracy has climbed into the high 90s) but whether consumers are willing to hand purchasing authority to an AI agent they cannot see or verify.
The numbers on trust are sobering. Research shows that 90% of consumers are unable to correctly identify AI-generated voice clips, creating what analysts have described as an “AI identity crisis” as voice bots sound more human and make more autonomous decisions. A survey found that 46% of shoppers do not trust voice assistants to process orders correctly. As voice commerce scales, the industry faces a structural gap: voice is becoming the execution layer for commerce without a commensurate trust layer beneath it.
The problem has practical dimensions that go beyond abstract consumer sentiment. As one analysis framed it, agentic commerce is collapsing the traditional customer journey (search, browse, authenticate, pay, confirm) into a single spoken intent. When a consumer says “book my usual flight to New York next Tuesday and use my points,” voice is no longer a channel; it is the execution layer. Yet generative AI has broken a foundational assumption: that a voice is evidence of a person. Synthetic speech is now indistinguishable from human speech at scale. Unless this contradiction is addressed, voice-led commerce will scale risk just as fast as it scales convenience.
The payments industry is grappling with the implications. As one expert framed the challenge, consumers must trust an AI agent “to act on your behalf with that vendor, potentially for the first time, or even a network of vendors. You have to trust through interaction, but also within access and being able to facilitate enabling the right credentialing and set of controls within it, so you don’t have your agentic AI go out and buy you 10,000 rolls of toilet paper because it was more efficient to do it that way”. Given the potential volume and velocity of agent-driven transactions, trust must rest on a firm foundation. Achieving that will require broad industry alignment around authentication standards, voice verification protocols, and clear liability frameworks.
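The “set of controls” the expert describes can be pictured as a simple guardrail that sits between the agent and the checkout. The Python sketch below is a hypothetical illustration only: the `SpendLimits` class, its default values, and the `review_order` function are assumptions, not any payment network’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpendLimits:
    """Hypothetical limits a consumer might set for an AI purchasing agent."""
    max_quantity_per_item: int = 12
    max_order_total: float = 150.00


def review_order(quantity: int, unit_price: float, limits: SpendLimits) -> str:
    """Return 'approve' or 'ask_human' for a single-item order proposed by an agent."""
    if quantity <= 0 or unit_price < 0:
        return "ask_human"  # malformed proposals are never auto-approved
    if quantity > limits.max_quantity_per_item:
        return "ask_human"
    if quantity * unit_price > limits.max_order_total:
        return "ask_human"
    return "approve"


limits = SpendLimits()
print(review_order(6, 1.25, limits))       # a routine restock
print(review_order(10_000, 0.40, limits))  # the bulk-buy failure described in the text
```

The routine restock is approved, while the 10,000-roll order is routed to a human. Failing towards human confirmation rather than outright rejection keeps the agent useful while capping the damage from a misjudged optimisation.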
For retailers and platforms, the implications are increasingly clear. Optimised voice experiences are understood to support conversion and maximise return on ad spend. But building the trust infrastructure that makes autonomous voice transactions viable at scale (biometric voice authentication, transparent AI disclosure, seamless human escalation paths and clear consent frameworks) is becoming as strategically important as building the voice interface itself.
The Strategic Implications
What distinguishes voice commerce in 2026 from its earlier iterations is not the underlying technology but its integration into broader retail and media strategies. Voice commerce is no longer experimental or niche. It is being embedded into the core operations of the largest retailers in the world, from Walmart’s multi-platform agent strategy to Amazon’s fusion of voice and search, from Kroger’s meal-planning assistant to McDonald’s drive-thru reboot.
The consumer data tells a clear story. Voice is not replacing screens, since only 2.8% of voice sessions end in a voice-only purchase, but it is claiming a substantial and growing share of the discovery and reorder journey. The 30% of Gen Z consumers who shop by voice every week, compared to just 6.8% of Boomers, means that voice commerce’s share will increase with demographic turnover regardless of whether individual retailers invest in the channel. The migration of voice into vehicles, where 15% of voice purchases now originate, signals that the commerce interface is becoming ambient: always available, often hands-free, and increasingly intelligent.
The trust deficit is real and must be addressed, but the operational deployments are accelerating regardless. McDonald’s and Wendy’s are not waiting for consumer sentiment to shift; they are building the infrastructure and learning from millions of real customer interactions. Kroger is not conducting a pilot; it is live nationally with agentic commerce. Amazon has made Alexa for Shopping free to every US customer, embedding voice-powered purchasing into the default search experience of the world’s largest online retailer.
The question for retailers in 2026 is no longer whether voice commerce matters. It is whether their technology stack, their product data, and their customer experience strategy are ready for a world in which nearly half of US consumers already use voice search for shopping, 15% of voice purchases originate in a vehicle, and the interface between consumer and commerce is becoming increasingly conversational.