What is a Hopper GPU?
Hopper GPUs represent NVIDIA’s newest generation of high-performance graphics processors, purpose-built for AI and high-performance computing (HPC). Featuring a cutting-edge architecture with advanced Tensor Cores, they integrate multiple innovative technologies to deliver maximum efficiency. Ideal for a wide range of workloads, Hopper GPUs support AI inference, deep learning training, generative AI, and more.
What is the architectural design of NVIDIA’s Hopper GPUs?
Hopper GPUs take their name from the Hopper architecture, NVIDIA’s GPU microarchitecture named after computing pioneer Grace Hopper, which forms the foundation of these high-performance processors and is optimised for AI workloads and HPC applications. Hopper GPUs are manufactured by TSMC on a custom 4-nanometre process and contain over 80 billion transistors, making them some of the most advanced GPUs on the market.
With the Hopper architecture, NVIDIA combines the latest generation of Tensor Cores with five key innovations: the Transformer Engine, the NVLink Switch System (built on NVLink and NVSwitch), confidential computing, second-generation Multi-Instance GPU (MIG) and DPX instructions. Together, these technologies enable Hopper GPUs to achieve up to 30x AI inference acceleration over the previous generation (measured by NVIDIA on the Megatron 530B chatbot, one of the world’s largest generative language models).
What are the innovative features of Hopper GPUs?
Hopper GPUs have several new features that help improve performance, efficiency and scalability. We present the most important innovations below:
- Transformer Engine: By dynamically mixing FP8 and FP16 precision, the Transformer Engine lets Hopper GPUs train AI models up to nine times faster. For inference on large language models, the GPUs achieve up to 30 times the throughput of the previous generation.
- NVLink Switch System: The fourth generation of NVLink delivers 900 GB/s of bidirectional bandwidth per GPU, while NVSwitch scales this connectivity across entire GPU clusters. This allows AI models with trillions of parameters to be processed efficiently.
- Confidential computing: The Hopper architecture protects your data, AI models and algorithms not only at rest and in transit, but also while they are being processed.
- Multi-Instance GPU (MIG) 2.0: The second generation of MIG technology allows a single Hopper GPU to be partitioned into up to seven fully isolated instances, each with its own memory, cache and compute cores. Several users can then run different workloads simultaneously without interfering with each other.
- DPX instructions: DPX instructions accelerate dynamic-programming algorithms, such as sequence alignment and route optimisation, by up to seven times compared with Ampere-based GPUs.
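To make the DPX bullet point concrete: dynamic programming means filling in a table where each cell depends on a small min/add combination of neighbouring cells. The sketch below shows a classic example of such a kernel, Levenshtein edit distance, in plain Python (purely illustrative CPU code, not Hopper-specific):

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming kernel: the repeated min/add pattern
    in the inner loop is the kind of operation DPX instructions speed up."""
    m, n = len(a), len(b)
    # dp[j] holds the cost of transforming a[:i] into b[:j]
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev_diag, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            prev_diag, dp[j] = dp[j], min(dp[j] + 1,         # deletion
                                          dp[j - 1] + 1,     # insertion
                                          prev_diag + cost)  # substitution
    return dp[n]

print(edit_distance("kitten", "sitting"))  # 3
```

Genomics workloads such as Smith–Waterman sequence alignment run millions of these table updates, which is why hardware support for the min/add step yields such a large speed-up.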
In our article comparing server GPUs, we present the best graphics processors for your server. You can also find out everything there is to know about GPU servers in another of our helpful articles.
What use cases are Hopper GPUs suitable for?
NVIDIA GPUs based on the Hopper architecture are designed for a wide range of high-performance workloads. The main areas of application for Hopper GPUs are:
- Inference tasks: The GPUs are among the industry-leading solutions for the productive use of AI inference. Whether for recommendation systems in e-commerce, medical diagnostics or real-time predictions for autonomous driving, Hopper GPUs can process huge amounts of data quickly and efficiently.
- Generative AI: The high-end GPUs provide the necessary computing power to train and execute tools with generative AI. Parallel processing allows more efficient calculations for creative tasks such as text, image and video generation.
- Deep learning training: With their high computing power, Hopper GPUs are ideal for training large neural networks. The Hopper architecture significantly shortens the training times of AI models.
- Conversational AI: Optimised for natural language processing (NLP), Hopper GPUs are ideal for AI-powered language systems, such as virtual assistants and AI chatbots. They accelerate the processing of large AI models and ensure responsive interaction that can be seamlessly integrated into business processes, such as support.
- Data analysis and big data: Hopper GPUs handle huge amounts of data at high speed and accelerate complex calculations through massive parallel processing. This enables companies to evaluate big data faster in order to make forecasts and initiate the right measures.
- Science and research: As the GPUs are designed for HPC applications, they are ideal for highly complex simulations and calculations. Hopper GPUs are used, for example, in astrophysics, climate modelling and computational chemistry.
Current models from NVIDIA
With the release of the NVIDIA H100 and the NVIDIA H200, the U.S.-based company has introduced two Hopper GPUs to the market. In contrast, the NVIDIA A30 is still built on the previous Ampere architecture. Technically speaking, the H200 isn’t a completely new model but rather an enhanced version of the H100. The following overview highlights the key differences between these two GPUs:
- Memory and bandwidth: While the NVIDIA H100 is equipped with an 80 GB HBM3 memory, the H200 GPU has an HBM3e memory with a capacity of 141 GB. The H200 is also clearly ahead in terms of memory bandwidth with 4.8 TB/s compared to 2 TB/s for the H100.
- Performance for AI inference: The NVIDIA H200 delivers up to twice the inference performance of the H100 for models such as Llama 2 70B. This allows not only faster processing, but also more efficient scaling.
- HPC applications and scientific computing: The H100 already offers first-class performance for complex calculations, and the H200 surpasses it: inference speed is up to twice as high, and HPC performance is around 20 percent higher.
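Memory bandwidth alone explains much of the inference gap: a memory-bound decode step cannot finish faster than the time it takes to stream the model’s weights out of memory. The back-of-the-envelope sketch below uses the bandwidth figures above; the 70 GB weight size (roughly a 70-billion-parameter model in FP8) is an illustrative assumption, and the result ignores caching, batching and compute limits:

```python
def min_weight_read_time_ms(model_gb: float, bandwidth_tb_s: float) -> float:
    """Idealised lower bound on one memory-bound decode step: the time
    needed just to stream `model_gb` of weights at `bandwidth_tb_s`."""
    return model_gb / (bandwidth_tb_s * 1000) * 1000  # GB / (GB/s) -> s -> ms

weights_gb = 70  # assumption: ~70B parameters at 1 byte each (FP8)
h100 = min_weight_read_time_ms(weights_gb, 2.0)   # H100: 2 TB/s
h200 = min_weight_read_time_ms(weights_gb, 4.8)   # H200: 4.8 TB/s
print(f"H100 >= {h100:.1f} ms/token, H200 >= {h200:.1f} ms/token")
```

With these assumptions the H100 needs at least 35 ms per token and the H200 about 14.6 ms, a factor of 2.4 that comes directly from the bandwidth ratio and matches the roughly doubled inference throughput NVIDIA quotes.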