Today, NVIDIA unveiled new RTX AI laptops from every major manufacturer, GeForce RTX™ SUPER desktop GPUs for supercharged generative AI performance, and new NVIDIA RTX™-accelerated AI tools and software for consumers and developers alike.

NVIDIA is now offering these tools to enhance the PC experience with generative AI, building on decades of PC leadership and an installed base of more than 100 million RTX GPUs driving the AI PC era. They include games that use DLSS 3 technology with Frame Generation, NVIDIA ACE microservices, and NVIDIA TensorRT™ acceleration of the popular Stable Diffusion XL model for text-to-image workflows.

Later this month, AI Workbench, a centralized, user-friendly toolbox for AI developers, will go into beta. In addition, NVIDIA TensorRT-LLM (TRT-LLM), an open-source library that accelerates inference for the latest large language models (LLMs), now supports more pre-optimized models for PCs. AI enthusiasts can also interact with their notes, documents, and other content through Chat with RTX, an NVIDIA tech demo accelerated by TRT-LLM that is also releasing this month.

“Generative AI is the single most significant platform transition in computing history and will transform every industry, including gaming,” said Jensen Huang, founder and CEO of NVIDIA. “With over 100 million RTX AI PCs and workstations, NVIDIA is a massive installed base for developers and gamers to enjoy the magic of generative AI.”

Running generative AI locally on a PC is critical for latency-, privacy-, and cost-sensitive applications. It requires a large installed base of AI-ready systems, along with the right developer tools to tune and optimize AI models for the PC platform.

In response to these demands, NVIDIA is introducing improvements throughout its entire technology stack, fostering novel experiences and expanding upon the more than 500 AI-enabled PC games and apps that NVIDIA RTX technology has already accelerated.

Workstations and PCs with RTX AI

NVIDIA RTX GPUs unlock the full potential of generative AI on PCs, running a wide variety of applications at peak performance. Their Tensor Cores dramatically accelerate AI performance across even the most demanding work and play applications.

The new GeForce RTX 40 SUPER Series, also unveiled today at CES, includes the GeForce RTX 4080 SUPER, 4070 Ti SUPER, and 4070 SUPER graphics cards for top AI performance. The GeForce RTX 4080 SUPER generates AI video 1.5x faster, and images 1.7x faster, than the GeForce RTX 3080 Ti GPU. The Tensor Cores in SUPER GPUs deliver up to 836 trillion operations per second, bringing transformative AI capabilities to gaming, creative, and everyday productivity applications.

Leading manufacturers, including Acer, ASUS, Dell, HP, Lenovo, MSI, Razer, and Samsung, are launching a new wave of RTX AI laptops that give consumers a full range of generative AI capabilities right out of the box. The new systems go on sale this month and deliver a 20–60x performance increase compared with using neural processing units.

Mobile workstations with RTX GPUs can run NVIDIA AI Enterprise tools, including TensorRT and NVIDIA RAPIDS™, for simplified, secure generative AI and data science development. Every NVIDIA A800 40GB Active GPU includes a three-year license for NVIDIA AI Enterprise, providing an ideal workstation development platform for AI and data science.

New Tools for PC Developers Creating AI Models

NVIDIA recently unveiled NVIDIA AI Workbench, a tool that helps developers easily create, test, and customize pretrained generative AI models and LLMs with PC-class performance and memory footprint.

Later this month, AI Workbench will go into beta. It gives developers streamlined interfaces that make it simple to replicate, collaborate on, and migrate projects, along with easy access to popular repositories such as Hugging Face, GitHub, and NVIDIA NGC™.

Projects can be scaled out to virtually anywhere, including a data center, a public cloud, or NVIDIA DGX™ Cloud, and then brought back to local RTX systems on a PC or workstation for inference and light customization.

NVIDIA and HP are also simplifying AI model development by integrating NVIDIA AI Foundation Models and Endpoints, which include RTX-accelerated AI models and software development kits, into HP AI Studio, a centralized data science platform. This will let users easily search for, import, and run optimized models across PC and cloud.

Once AI models are built for PC use cases, developers can optimize them with NVIDIA TensorRT to take full advantage of the Tensor Cores in RTX GPUs.
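As a rough illustration of that optimization step, the sketch below compiles an exported ONNX model into a TensorRT engine with the TensorRT Python API. It is a minimal example under stated assumptions, not a specific NVIDIA workflow: the file names and the FP16 setting are placeholders.

```python
# Hypothetical sketch: compiling an exported ONNX model into a TensorRT
# engine so inference can run on RTX Tensor Cores. File names are placeholders.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parsing failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # enable FP16 Tensor Core kernels

serialized_engine = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(serialized_engine)  # deploy this engine with the TensorRT runtime
```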

NVIDIA extended TensorRT to text-based applications with TensorRT-LLM for Windows, an open-source library for accelerating LLMs. With the latest TensorRT-LLM release, Phi-2 joins the growing list of pre-optimized models for PCs, which run up to 5x faster than with other inference backends.
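For context on the kind of workload these pre-optimized engines target, here is a minimal sketch that runs the Phi-2 checkpoint in eager mode with Hugging Face Transformers; a TensorRT-LLM engine serves the same model through a compiled, RTX-accelerated runtime instead. The prompt and generation settings are illustrative assumptions.

```python
# Baseline Phi-2 text generation with Hugging Face Transformers.
# A TensorRT-LLM pre-optimized engine targets the same model but replaces
# this eager-mode path with a compiled runtime for higher throughput.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

prompt = "Summarize what DLSS 3 Frame Generation does in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```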

Accelerated Generative AI with RTX Enables Innovative PC Experiences

New generative AI-powered PC apps and services are being released at CES by NVIDIA and its developer partners, including:

NVIDIA RTX Remix, a platform for building stunning RTX remasters of classic games. Launching in beta later this month, it delivers generative AI tools that can transform basic textures from classic games into modern, 4K-resolution, physically based rendering materials.

NVIDIA ACE microservices, including generative AI-powered speech and animation models, which let developers add dynamic, intelligent digital avatars to games.

TensorRT acceleration for Stable Diffusion XL (SDXL) Turbo and latent consistency models (LCMs), two of the most popular methods for accelerating Stable Diffusion. TensorRT improves performance for both by up to 60% compared with the previous fastest implementation. An updated version of the Stable Diffusion WebUI TensorRT extension is also now available, adding acceleration for SDXL, SDXL Turbo, and LCM Low-Rank Adaptation (LoRA), along with improved LoRA support. (A minimal text-to-image sketch appears after this list.)

NVIDIA DLSS 3 with Frame Generation, which uses AI to boost frame rates by up to 4x compared with native rendering, in twelve of the fourteen newly announced RTX games, including Horizon Forbidden West, Pax Dei, and Dragon's Dogma 2.

Chat with RTX, an NVIDIA tech demo available later this month that lets AI enthusiasts quickly interact with their notes, documents, and other content. Accelerated by TensorRT-LLM, it is built on retrieval-augmented generation (RAG), a widely used technique for connecting a PC LLM to a user's own data (see the sketch after this list). It will also be released as an open-source reference project so developers can add the same capabilities to their own applications.
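The text-to-image sketch referenced above is shown here. It uses the Hugging Face diffusers pipeline for SDXL Turbo as a stand-in for the accelerated workflow; the TensorRT path described in the list replaces this default PyTorch execution, and the prompt and settings are illustrative assumptions.

```python
# Minimal SDXL Turbo text-to-image example with Hugging Face diffusers.
# SDXL Turbo is distilled for very few denoising steps, so guidance is disabled.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

image = pipe(
    prompt="a ray-traced render of a futuristic gaming PC, studio lighting",
    num_inference_steps=1,   # single-step generation is SDXL Turbo's key feature
    guidance_scale=0.0,
).images[0]
image.save("sdxl_turbo_sample.png")
```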
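And here is the RAG sketch referenced in the Chat with RTX item: a minimal retrieval loop that embeds local notes, picks the most relevant one for a question, and prepends it to the prompt sent to a local LLM. It is a generic illustration of the technique, not Chat with RTX code; the embedding model, notes, and question are assumptions.

```python
# Minimal retrieval-augmented generation (RAG) loop: embed local notes,
# retrieve the best match for a question, and build an augmented prompt.
from sentence_transformers import SentenceTransformer, util

notes = [
    "The GeForce RTX 4080 SUPER launches this month.",
    "DLSS 3 Frame Generation uses AI to boost frame rates.",
    "TensorRT-LLM accelerates LLM inference on RTX GPUs.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
note_vectors = embedder.encode(notes, convert_to_tensor=True)

question = "How does NVIDIA speed up LLM inference on PCs?"
query_vector = embedder.encode(question, convert_to_tensor=True)

# Rank notes by cosine similarity and keep the top match as context.
scores = util.cos_sim(query_vector, note_vectors)[0]
context = notes[int(scores.argmax())]

prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this augmented prompt is what the local LLM would receive
```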

Explore the latest advancements in generative AI with NVIDIA at CES.
