Meta unveils second-gen AI training and inference chip



Meta’s MTIA v2 doubles the amount of on-chip memory to triple performance on AI tasks.


Meta has unveiled its second-generation “training and inference accelerator” chip, or “MTIA”, nearly a year after the first version, and the company says the new part brings substantial performance improvements.

Meta, like other tech giants such as Microsoft, Google, and Tesla, is investing in custom artificial intelligence (AI) hardware to hedge against the monopoly power of the dominant GPU supplier, Nvidia. The investment is also a way to secure the supply of computing, given that Nvidia has not been able to produce enough chips to meet demand during the sudden surge in generative AI interest.

Like the first part, the MTIA version 2 chip consists of a mesh of blocks of circuits that operate in parallel, an “8×8 grid of processing elements (PEs)”. The chip performs 3.5 times as fast as MTIA v1, said Meta. It is seven times faster on AI tasks that involve “sparse” computation, those where variables have a value of zero.
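To see why sparsity matters, consider a dot product over a vector that is mostly zeros: a sparse kernel can skip the zero entries entirely rather than multiplying by them. The following minimal Python sketch is purely illustrative of the general idea and makes no claim about how MTIA implements sparse math.

```python
# Illustrative sketch (not Meta's implementation): why "sparse" computation
# can be much faster when most values are zero.

def dense_dot(x, w):
    """Multiply-accumulate over every element, zeros included."""
    return sum(xi * wi for xi, wi in zip(x, w))

def sparse_dot(x_nonzero, w):
    """Multiply-accumulate only over the stored nonzero entries of x,
    supplied as (index, value) pairs."""
    return sum(v * w[i] for i, v in x_nonzero)

# A vector that is mostly zeros: only 2 of its 8 entries are nonzero.
x = [0.0, 3.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0]
w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x_sparse = [(i, v) for i, v in enumerate(x) if v != 0.0]

# Same result, but the sparse version does 2 multiplies instead of 8.
assert dense_dot(x, w) == sparse_dot(x_sparse, w) == 18.0
```

Hardware that exploits this pattern can achieve large speedups on neural-network layers whose activations or weights are dominated by zeros, which is one reading of the 7x figure Meta cites.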


Meta said the gains come from changes to the chip’s architecture and from improved memory and storage. “We have tripled the size of the local PE storage, doubled the on-chip SRAM and increased its bandwidth by 3.5X, and doubled the capacity of LPDDR5,” said the tech giant.

MTIA v2 architectural diagram.


The chip is built in a 5-nanometer process technology developed by contract chip manufacturing giant Taiwan Semiconductor Manufacturing.

The larger chip, measuring 421 square millimeters versus 373 for the v1, holds 2.4 billion gates, said Meta, and performs 103 trillion floating-point math operations per second. That performance compares to 1.1 billion gates and 65 trillion operations for the earlier model.


As with MTIA v1, the new chip runs software that optimizes programs using Meta’s PyTorch open-source developer framework. Two software compilers collaborate: one on the front end compiles the compute graph of a program, and one on the back end, written in the open-source Triton compiler language, generates optimal machine code for the chip.
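The two-stage flow described above can be sketched in miniature: a front end that captures a program as a compute graph, and a back end that walks the graph and emits target-specific code. This toy Python example is purely illustrative; the node format and the pseudo-instruction set are assumptions for demonstration, not Meta’s actual compiler stack.

```python
# Toy illustration (not Meta's compiler): a two-stage flow in which a
# front end lowers a program to a compute graph, and a back end turns
# that graph into code for a hypothetical accelerator.

def frontend_capture():
    """Front end: represent out = relu(x @ w + b) as a topologically
    ordered list of (op, inputs, output) nodes."""
    return [
        ("matmul", ["x", "w"], "t0"),
        ("add",    ["t0", "b"], "t1"),
        ("relu",   ["t1"],      "out"),
    ]

def backend_codegen(graph):
    """Back end: emit one pseudo-instruction per graph node."""
    lines = []
    for op, inputs, output in graph:
        lines.append(f"{op.upper()} {', '.join(inputs)} -> {output}")
    return "\n".join(lines)

program = backend_codegen(frontend_capture())
print(program)
# MATMUL x, w -> t0
# ADD t0, b -> t1
# RELU t1 -> out
```

Splitting the compiler this way lets the framework-facing front end stay stable while the back end is retargeted to new silicon, which is one reason a shared toolchain can shorten the bring-up time for a successor chip.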

Meta said the software development work for MTIA v1 allowed the company to bring the new chip to fruition rapidly, “going from first silicon to production models running in 16 regions in less than nine months.” The company said the chip is being deployed to serve ranking and recommendation advertising models.

Meta said it has designed a rack-mounted computer system running 72 MTIA v2s in parallel. “Our design ensures we provide denser capabilities with higher compute, memory bandwidth, and memory capacity,” said Meta. “This density allows us to more easily accommodate a broad range of model complexities and sizes.”

The company plans to continue to invest in custom hardware design. “We currently have several programs underway aimed at expanding the scope of MTIA, including support for GenAI workloads,” said Meta. “We’re designing our custom silicon to work in cooperation with our existing infrastructure as well as with new, more advanced hardware (including next-generation GPUs) that we may leverage in the future.”