The robotics industry is experiencing a modern renaissance thanks to advances in powerful AI.

Last year, AI robotics startups raised $6.1bn in VC funding, up 19% year-on-year, as machine learning engineers pioneer new technologies to make machines smarter, with the promise of bringing us into a sci-fi future where robots can operate autonomously.

“This will be basically the next industrial revolution where we have basically an infinite amount of physical labor at a really low price. This will fundamentally change the world,” says Ralf Gulde, CEO and cofounder of Sereact, a Stuttgart-based startup developing AI models that it says will “give a brain to robots.”

The company is developing a vision language action model (VLAM), an AI architecture that lets robots interpret their visual surroundings, receive operator instructions in natural language and translate those into actions.

Tech giants like Google, NVIDIA and Tesla, along with a host of specialised startups, are trying to crack how to build robust and reliable VLAMs, in a bid to lead the next era of automation with machines that need less and less human intervention to operate. 

But building this technology isn’t just theoretically challenging: it also requires processing huge amounts of data and having access to considerable compute to keep improving these new robotic brains.

Solving real problems

Unlike language models that are trained on fixed text datasets, VLAMs learn through continuous interaction with their environment, getting feedback from how their actions change what they see. This creates new challenges, such as delays in sensor responses, noisy visual inputs, and the need for the model to constantly adapt its behavior based on the results of its own actions.
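
To make that feedback loop concrete, here is a minimal Python sketch. It is a toy illustration, not Sereact's or any other company's system: the function name `run_closed_loop`, its parameters and the simple proportional rule standing in for the model are all assumptions made for this example. A one-dimensional "position" is nudged toward a target using observations that are both a few steps out of date and noisy, so each action changes what the controller sees next.

```python
import random
from collections import deque


def run_closed_loop(target=1.0, steps=200, delay=2, noise=0.01, gain=0.2, seed=0):
    """Toy 1-D loop: move a position toward a target using delayed,
    noisy observations. All names and values here are illustrative."""
    rng = random.Random(seed)
    position = 0.0
    # A buffer of recent positions models sensor latency: the oldest
    # entry is the state from `delay` steps ago.
    history = deque([position] * (delay + 1), maxlen=delay + 1)
    for _ in range(steps):
        observed = history[0] + rng.gauss(0.0, noise)  # stale and noisy reading
        action = gain * (target - observed)            # stand-in for the model's output
        position += action                             # the action changes the world
        history.append(position)                       # which changes what is seen next
    return position


if __name__ == "__main__":
    print(run_closed_loop())
```

With the small gain used here the loop settles close to the target despite the lag. Raising `gain` or `delay` far enough makes the same loop overshoot and oscillate, which is one simple way to see why sensor delays matter.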

Sereact is building its VLAM to be hardware agnostic, so that it can operate any kind of robot assigned to any task, but its first commercial focus is on “pick and pack” machines that sort and package items in warehouses and factories. 

“We’re researching our own VLAM with our frontier lab, but we also have this clear, early commercialisation strategy, with the narrowed down initial use case,” Gulde says, explaining how the company’s tech is already deployed across more than 100 machines with clients including automakers BMW and Daimler Truck and ecommerce logistics provider Zenfulfillment.

Another startup using VLAM technology is Berlin-based Sensmore. It’s building an operating system for heavy machinery such as bulldozers, haul trucks and excavators, which it retrofits with hardware so they can run autonomously. It has deployed the system with cement and concrete company Cemex.

“Our customers’ workforce tends to be closer to retirement age, and the number of fresh starters is very limited, so labour shortages are a big reason for wanting automation. Secondly, it’s also about efficiency gains, to either produce the same output in less time or to produce more outputs in the same time,” says cofounder and CEO Max Rolf. 

“The third point is that these heavy machines are exposed to quite dangerous conditions, so if you can automate the machine and take out the human operator, that has safety benefits too.”

Heavy compute

While companies like these see huge opportunity in automating industry, they also have big data and compute loads to manage. That’s partly because these models are trained on information from a range of sources, including video cameras, sensors that monitor robotic movements, text and photographs.

Sereact says its 100 deployed systems generate “hundreds of gigabytes” of data every day, and these kinds of heavy workflows have spawned a new service category in AI, with startups like Rerun, Roboflow and Voxel51 offering platforms for data management.

Gulde’s cofounder and CTO Marc Tuscher says that the company runs 100 NVIDIA H100 GPUs “full time” to constantly improve its AI model, and that this figure could increase three-fold as the company scales. He hopes that cloud providers can help companies like his by making compute “predictable and frictionless.” 

“We don’t need fancy dashboards, we need guaranteed access to high-end GPUs and seamless bridging between on-prem and cloud,” he says, adding that VLAMs require low-latency compute, as the robots they control are always having to adapt to data they’re gathering from their environment.

“Robotics AI depends on continuous, distributed learning from the physical world, and that only works if compute and data movement stay smooth and scalable.”

Sensmore says that it currently manages most of its compute needs with in-house GPUs, but that as it expands its automation use cases, it will likely require more capacity via cloud providers.

A new robotics industry

If cloud providers can support AI companies like these, the US and Europe can compete in a robotics market that has historically been dominated by Asia, according to Rick Hao, general partner at London-based deeptech VC Ruya Ventures.

“Japan, South Korea and China are very advanced in robotics manufacturing, and it's much more cost effective for them to produce the hardware,” he says, adding that he’s made two stealth investments into European companies applying AI to machines made in Asia.

Hao also believes that, while many companies rely on Silicon Valley-based hyperscalers for their cloud computing, European businesses working in sensitive industries like defence will require providers that are closer to home.

“That will definitely require some tech sovereignty,” he says.

Hao says that the robotics industry’s lofty promises around fully automated, general purpose machines are unlikely to come true in the short term. But if companies working on VLAMs succeed, theirs could be the technology that justifies the huge hype surrounding AI, by transforming productivity and our approach to labour in our economies.
