NVIDIA Inference Breakthrough Makes Conversational AI Smarter, More Interactive From Cloud to Edge

July 20, 2021 / April 12, 2022 | Admin | AI, AI Applications, software, technology, theinfotech

NVIDIA today launched TensorRT™ 8, the eighth generation of the company’s AI software, which cuts inference time in half for language queries, enabling developers to build the world’s best-performing search engines, ad recommendations and chatbots and offer them from the cloud to the edge.

TensorRT 8’s optimizations deliver record-setting speed for language applications, running BERT-Large, one of the world’s most widely used transformer-based models, in 1.2 milliseconds. In the past, companies had to reduce their model size, which made results significantly less accurate. Now, with TensorRT 8, companies can double or triple their model size to achieve dramatic improvements in accuracy.
