AMD says Malaysia has a role in Southeast Asia’s yotta-scale AI infrastructure push
AMD sees Malaysia as part of its Southeast Asia AI infrastructure focus.
AMD said yotta-scale AI will require enterprises to rethink AI infrastructure planning.
Yotta-scale AI is changing how enterprises think about infrastructure as workloads move beyond on-demand use and toward continuous inference, reasoning, and autonomous agents, Alexey Navolokin, General Manager for Asia Pacific at AMD, said in an interview with Tech Wire Asia.
“In practical terms, yotta-scale AI represents an unprecedented level of global compute scale,” Navolokin said. “One exaflop represents a billion-billion calculations per second, while one yottaflop equals one million exaflops.”
Reaching that level would require the equivalent of a million of today’s exaflop-class supercomputers operating together. Navolokin said the move toward yotta-scale AI is being driven by AI’s transition from on-demand workloads to “always-on intelligence”, including continuous inference, reasoning, and autonomous agents serving billions of real-time interactions.
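The scale comparison in the quote follows from the SI prefixes: exa is 10^18 and yotta is 10^24. A minimal sketch of that arithmetic is below; the constant and function names are illustrative and do not come from AMD or the interview.

```python
# SI prefixes expressed as operations per second (powers of ten).
EXAFLOP = 10**18    # "a billion-billion" calculations per second
YOTTAFLOP = 10**24  # one yottaflop


def exaflop_systems_needed(target_flops: int, system_flops: int = EXAFLOP) -> int:
    """Return how many systems of `system_flops` are needed to reach `target_flops`."""
    # Ceiling division, so a partially used system still counts as one.
    return -(-target_flops // system_flops)


print(exaflop_systems_needed(YOTTAFLOP))  # 1000000
```

This is a pure unit conversion. It says nothing about interconnect, memory, or power overheads, which the article treats as the harder part of the problem.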
Navolokin said enterprise infrastructure planning is becoming more complex. Rather than focusing only on raw compute performance or individual components, organisations need to consider the wider system. He cited silicon, software, networking, memory, orchestration, and power efficiency as part of that planning.

Alexey Navolokin, General Manager, APAC, at AMD
“For enterprises, the real shift is that AI infrastructure planning is becoming much more complex,” Navolokin said. “Organisations can no longer focus only on raw compute performance or individual components.”
He said yotta-scale AI would require an open, distributed compute fabric where CPUs, GPUs, networking, and software are designed to work together. That fabric would need to operate across cloud platforms, centralised data centres, edge systems, and endpoint devices.
At the hardware level, Navolokin pointed to rack-scale and system-level architectures designed for large-scale inference and agentic AI workloads. These systems require high-bandwidth memory and energy-efficient compute. They also need tighter integration between CPUs, GPUs, and networking.
Navolokin said networking is becoming a core design requirement. As AI systems scale across thousands or millions of nodes, he said the challenge extends beyond compute performance to moving large volumes of data with low latency.
Navolokin said open standards such as UALink and Ultra Ethernet are relevant because they support scalable and interoperable AI infrastructure.
On the software side, Navolokin said open ecosystems are needed for portability and workload optimisation across different environments.
“Developers and enterprises need portability, flexibility, and the ability to optimise workloads across diverse environments without being locked into proprietary stacks,” he said.
Platforms such as AMD ROCm, along with industry collaboration around open standards and frameworks, are part of that approach. Navolokin said this gives developers and enterprises more flexibility when building distributed AI systems.
Malaysia’s role in AMD’s regional AI focus
In Asia Pacific, Navolokin said Malaysia is building momentum through investments in digital infrastructure and AI-ready data centres. He also pointed to cloud adoption, workforce development, and the country’s position in the semiconductor and electronics ecosystem.
From AMD’s perspective, Malaysia’s role is tied not only to demand for compute, but also to the broader need for systems-level AI infrastructure. Navolokin said the company sees an opportunity to help enterprises build long-term AI environments. Those environments need to bring together silicon, software, networking, and energy efficiency.
“Malaysia is an important part of Southeast Asia’s growing AI ecosystem,” he said, adding that AMD remains focused on supporting the region through AI infrastructure and ecosystem partnerships.
Navolokin linked AMD’s regional focus to enterprise deployments across cloud, data centre, edge, and endpoint environments.
He said open platforms and ecosystem collaboration are becoming more important as those deployments expand.
From pilots to production
As organisations move from AI pilots to production, Navolokin said three issues appear most often. The first is infrastructure modernisation, as many enterprises still operate legacy environments that were not designed for continuous AI workloads.
He said organisations need to improve compute and power efficiency while optimising data centre space. They also need to refresh ageing systems to support real-time AI operations. These requirements become more important as inference workloads move into production environments.
Some organisations are still using AI mainly for workflow automation and efficiency gains, while others are exploring new business models built around AI, Navolokin said. The pace of that transition depends heavily on data readiness and whether enterprise workflows are structured in ways AI systems can use.
The second issue is data readiness. Navolokin said companies need to understand where their data resides and whether it is accessible across the organisation. They also need workflows that AI systems can use.
The third issue is architectural flexibility.
As AI environments evolve, enterprises are looking for infrastructure that can integrate multiple technologies and scale across different deployment models.
Navolokin said the goal is to do this without adding unnecessary complexity.
“AI readiness depends on how effectively organisations can modernise their enterprise stacks to connect data flows, applications, and operational workflows in ways that make AI practical at production scale,” Navolokin said.
Cloud, edge, and endpoint deployment
Hyperscale infrastructure will remain important for large-scale model training and inference. However, Navolokin said many emerging workloads require low-latency inferencing closer to where data is generated. These include use cases in manufacturing, logistics, retail, healthcare, and physical AI.
Navolokin said enterprises are placing more emphasis on distributed AI deployment across edge, on-premises, cloud, and client devices.
He said organisations are also seeking consistency across these environments, including interoperability, operational efficiency, and predictable performance.
That distributed model also extends to endpoint devices, including AI PCs. Navolokin said some real-time inference workloads are better suited to systems closer to the data source. Latency, energy use, cost, and privacy requirements can differ from centralised infrastructure.
Navolokin said AI infrastructure is becoming more workload-aware. Different workloads require different types of compute in different locations, from centralised data centres to edge systems and endpoint devices.
Efficiency and flexibility
Power consumption and cost are also becoming central considerations. Navolokin said enterprises are increasingly focused on infrastructure productivity, or how efficiently they can deliver performance within power, cooling, and budget constraints.
“Different workloads have very different requirements, so improving efficiency at scale increasingly means using the right compute engine for the right job,” he said.
Depending on workload requirements, enterprises may use CPUs, GPUs, adaptive computing, edge systems, or AI PCs. Navolokin said openness and interoperability remain part of that efficiency discussion. These considerations become more important as organisations deploy AI across cloud, on-premises, edge, and endpoint environments.
Navolokin said AMD’s regional role is centred on workload flexibility, pointing to CPUs, GPUs, adaptive computing, edge systems, and AI PCs as options for different AI requirements.
To avoid repeated re-architecture, Navolokin said enterprises should design AI infrastructure around openness and flexibility.
Open ecosystems allow organisations to choose tools for specific workloads, customise deployments, and scale without being locked into proprietary architectures.
He said large-scale model training will continue to rely on centralised infrastructure. At the same time, real-time inference workloads can be better suited to edge systems or AI PCs located closer to the data source. These environments can help address latency, energy use, and data privacy requirements.

Tech Wire Asia is powered by TechForge Media.
