WEBVTT
Kind: captions
Language: en
00:13:29.708 --> 00:13:31.844
I am a creator
00:13:39.151 --> 00:13:41.353
Blending art and technology
00:13:48.160 --> 00:13:50.729
To immerse our senses
00:13:57.670 --> 00:13:59.138
I am a healer
00:14:01.540 --> 00:14:03.676
Helping us take the next step
00:14:08.614 --> 00:14:10.382
And see what's possible
00:14:19.792 --> 00:14:21.660
I am a pioneer
00:14:25.364 --> 00:14:27.233
Finding life-saving answers
00:14:33.272 --> 00:14:35.841
And pushing the edge to the outer limits.
00:14:41.513 --> 00:14:43.082
I am a guardian
00:14:44.917 --> 00:14:46.585
Defending our oceans
00:14:52.157 --> 00:14:55.394
And the magnificent creatures that call them home
00:15:01.433 --> 00:15:03.235
I am a protector
00:15:06.272 --> 00:15:09.775
Helping the earth breathe easier
00:15:16.849 --> 00:15:19.852
And watching over it for generations to come
00:15:23.589 --> 00:15:25.224
I am a storyteller
00:15:27.693 --> 00:15:29.795
Giving emotion to words
00:15:31.830 --> 00:15:33.365
And bringing them to life.
00:15:35.601 --> 00:15:39.138
I am even the composer of the music.
00:15:49.381 --> 00:15:51.116
I am AI
00:15:51.917 --> 00:15:59.959
Brought to life by NVIDIA, deep learning, and brilliant minds
everywhere.
00:16:06.865 --> 00:16:09.735
There are powerful forces shaping the world's industries.
00:16:10.302 --> 00:16:18.143
Accelerated computing that we pioneered has supercharged scientific
discovery, while providing the computer industry a path forward.
00:16:18.911 --> 00:16:22.982
Artificial intelligence, in particular, has seen incredible advances.
00:16:23.482 --> 00:16:29.355
With NVIDIA GPUs, computers learn, and software writes software no
human can.
00:16:30.022 --> 00:16:35.894
The AI software is delivered as a service from the cloud, performing
automation at the speed of light.
00:16:36.729 --> 00:16:44.803
Software is now composed of microservices that scale across the
entire data center - treating the data center as a single unit of
computing.
00:16:45.604 --> 00:16:50.999
AI and 5G are the ingredients to kick-start the 4th industrial
revolution,
00:16:50.999 --> 00:16:54.771
where automation and robotics can be deployed to the far edges of the
world.
00:16:55.614 --> 00:17:02.421
There is one more miracle we need, the metaverse, a virtual world
that is a digital twin of ours.
00:17:03.422 --> 00:17:07.960
Welcome to GTC 2021 - we are going to talk about these dynamics and
more.
00:17:09.194 --> 00:17:11.130
Let me give you the architecture of my talk.
00:17:11.764 --> 00:17:17.936
It's organized in four stacks - this is how we work - as a full-stack
computing platform company.
00:17:18.604 --> 00:17:26.311
The flow also reflects the waves of AI and how we're expanding the
reach of our platform to solve new problems, and to enter new markets.
00:17:27.212 --> 00:17:35.721
First is Omniverse - built from the ground-up on NVIDIA's body of
work. It is a platform to create and simulate virtual worlds.
00:17:36.422 --> 00:17:43.495
We'll feature many applications of Omniverse, like design
collaboration, simulation, and future robotic factories.
00:17:44.329 --> 00:17:47.800
The second stack is DGX and high-performance data centers.
00:17:48.467 --> 00:17:56.975
I'll feature BlueField, new DGXs, new chips, and the new work we're
doing in AI, drug discovery, and quantum computing.
00:17:57.810 --> 00:18:00.512
Here, I'll also talk about Arm and new Arm partnerships.
00:18:01.313 --> 00:18:08.320
The third stack is one of our most important new platforms - NVIDIA
EGX with Aerial 5G.
00:18:09.254 --> 00:18:14.159
Now, enterprises and industries can do AI and deploy AI-on-5G.
00:18:15.060 --> 00:18:20.332
We'll talk about NVIDIA AI and Pre-Trained Models, like Jarvis
Conversational AI.
00:18:21.133 --> 00:18:27.639
And finally, our work with the auto industry to revolutionize the
future of transportation - NVIDIA Drive.
00:18:28.273 --> 00:18:32.878
We'll talk about new chips, new platforms and software, and lots of
new customers.
00:18:33.212 --> 00:18:34.146
Let's get started.
00:18:35.380 --> 00:18:39.952
Scientists, researchers, developers, and creators are using NVIDIA to
do amazing things.
00:18:40.085 --> 00:18:50.162
Your work gets global reach with the installed base of over a billion
CUDA GPUs shipped and 250 ExaFLOPS of GPU computing power in the
cloud.
00:18:50.462 --> 00:18:57.202
Two and a half million developers and 7,500 startups are creating
thousands of applications for accelerated computing.
00:18:57.503 --> 00:19:03.041
We are thrilled by the growth of the ecosystem we are building
together and will continue to put our heart and soul into advancing
it.
00:19:03.642 --> 00:19:10.349
Building tools for the Da Vincis of our time is our purpose. And in
doing so, we also help create the future.
00:19:11.183 --> 00:19:16.755
Democratizing high-performance computers is one of NVIDIA's greatest
contributions to science.
00:19:17.523 --> 00:19:22.082
With just a GeForce, every student can have a supercomputer.
00:19:22.082 --> 00:19:28.593
This is how Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton
trained AlexNet, which caught the world's attention and turned it to
deep learning.
00:19:29.368 --> 00:19:34.106
And with GPUs in supercomputers, we gave scientists a time machine.
00:19:35.073 --> 00:19:40.979
A scientist once told me that because of NVIDIA's work, he can do his
life's work in his lifetime.
00:19:42.214 --> 00:19:43.715
I can't think of a greater purpose.
00:19:44.516 --> 00:19:46.785
Let me highlight a few achievements from last year.
00:19:47.719 --> 00:19:50.622
NVIDIA is continually optimizing the full stack.
00:19:51.156 --> 00:19:56.094
With the chips you have, your software runs faster every year and
even faster if you upgrade.
00:19:56.828 --> 00:20:05.589
On our gold suite of important science codes, we increased
performance 13-fold in the last 5 years, and for some, performance
doubled every year.
00:20:06.305 --> 00:20:12.344
NAMD molecular dynamics simulator, for example, was re-architected
and can now run across multiple GPUs.
00:20:13.111 --> 00:20:22.921
Researchers led by Dr. Rommie Amaro at UC San Diego used this
multi-GPU NAMD, running on the Oak Ridge Summit supercomputer's 20,000
NVIDIA GPUs,
00:20:22.921 --> 00:20:27.960
to do the largest atomic simulation ever - 305 million atoms.
00:20:28.160 --> 00:20:33.665
This work was critical to a better understanding of the COVID-19
virus and accelerated the making of the vaccine.
00:20:34.700 --> 00:20:39.238
Dr. Amaro and her collaborators won the Gordon Bell Award for this
important work.
00:20:39.972 --> 00:20:47.012
I'm very proud to welcome Dr. Amaro and more than 100,000 of you to
this year's GTC - our largest ever, twice the size of any before.
00:20:47.846 --> 00:20:52.290
We have some of the greatest computer scientists and researchers of
our time speaking here.
00:20:52.290 --> 00:21:00.532
3 Turing award winners, 12 Gordon Bell award winners, 9 Kaggle Grand
Masters - and even 10 Oscar winners.
00:21:00.993 --> 00:21:04.796
We're also delighted to have the brightest minds from industry
sharing their discoveries.
00:21:05.330 --> 00:21:13.171
Leaders from every field - healthcare, auto, finance, retail, energy,
internet services, every major enterprise IT company.
00:21:13.939 --> 00:21:20.557
They're bringing you their latest work in COVID research, data
science, cybersecurity, new approaches to computer graphics
00:21:20.557 --> 00:21:23.317
and the most recent advances in AI and robotics.
00:21:23.849 --> 00:21:31.923
In total, 1,600 talks about the most important technologies of our
time, from the leaders in the field that are shaping our world.
00:21:32.491 --> 00:21:33.492
Welcome to GTC.
00:21:34.459 --> 00:21:37.763
Let's start where NVIDIA started...computer graphics.
00:21:38.397 --> 00:21:46.102
Computer graphics is the driving force of our technology. Hundreds of
millions of gamers and creators each year seek out the best NVIDIA
has to offer.
00:21:46.538 --> 00:21:53.241
At its core, computer graphics is about simulations - using
mathematics and computer science to simulate the interactions of
light
00:21:53.241 --> 00:22:00.052
and material, the physics of objects, particles, and waves; and now
simulating intelligence and animation.
00:22:00.485 --> 00:22:07.859
The science, engineering, and artistry we dedicate to the pursuit of
matching Mother Nature's physics has led to incredible advances.
00:22:08.460 --> 00:22:14.266
And allowed our technology to contribute to advancing the basic
sciences, the arts, and the industries.
00:22:15.067 --> 00:22:22.908
This last year, we introduced the 2nd generation of RTX - a new
rendering approach that fuses rasterization and programmable shading
with
00:22:22.908 --> 00:22:29.548
hardware-accelerated ray tracing and artificial intelligence. This is
the culmination of ten years of research.
00:22:30.082 --> 00:22:37.255
RTX has reset computer graphics, giving developers a powerful new
tool just as rasterization plateaus.
00:22:37.656 --> 00:22:44.129
Let me show you some amazing footage from games in development. The
technology and artistry are amazing.
00:22:44.863 --> 00:22:48.333
We’re giving the world’s billion gamers an incredible reason to
upgrade.
00:24:38.877 --> 00:24:41.513
RTX is a reset of computer graphics.
00:24:41.913 --> 00:24:49.187
It has enabled us to build Omniverse - a platform for connecting 3D
worlds into a shared virtual world.
00:24:49.788 --> 00:24:57.295
One not unlike the science-fiction metaverse first described by Neal
Stephenson in his early-1990s novel Snow Crash, where the metaverse
00:24:57.295 --> 00:25:05.103
would be collectives of shared 3D spaces and virtually-enhanced
physical spaces that are extensions of the internet.
00:25:05.537 --> 00:25:14.934
Pieces of the early metaverse vision are already here - massive
online social games like Fortnite or user-created virtual worlds like
Minecraft.
00:25:15.714 --> 00:25:22.187
Let me tell you about Omniverse from the perspective of two
applications - design collaboration and digital twins.
00:25:22.954 --> 00:25:25.023
There are several major parts of the platform.
00:25:25.390 --> 00:25:30.272
First, the Omniverse Nucleus, a database engine that connects users
00:25:30.272 --> 00:25:34.332
and enables the interchange of 3D assets and scene descriptions.
00:25:34.699 --> 00:25:43.690
Once connected, designers doing modeling, layout, shading, animation,
lighting, special effects or rendering can collaborate to create a
scene.
00:25:44.075 --> 00:25:52.397
The world in Omniverse Nucleus is described with the open standard
USD, Universal Scene Description, a fabulous interchange framework
invented by Pixar.
00:25:53.251 --> 00:25:59.524
Multiple users can connect to Nucleus, transmitting and receiving
changes to their world as USD snippets.
00:25:59.991 --> 00:26:06.798
The 2nd part of Omniverse is the composition, rendering, and
animation engine - the simulation of the virtual world.
00:26:07.299 --> 00:26:15.774
Omniverse is a platform built from the ground up to be
physically-based. It is fully path-traced. Physics is simulated with
NVIDIA PhysX,
00:26:15.774 --> 00:26:21.713
materials are simulated with NVIDIA MDL and Omniverse is fully
integrated with NVIDIA AI.
00:26:22.380 --> 00:26:30.255
Omniverse is cloud-native, multi-GPU scalable and runs on any RTX
platform and streams remotely to any device.
00:26:31.022 --> 00:26:34.826
The third part is NVIDIA CloudXR, a stargate if you will.
00:26:35.060 --> 00:26:41.800
You can teleport into Omniverse with VR and AIs can teleport out of
Omniverse with AR.
00:26:42.200 --> 00:26:44.469
Omniverse was released to open beta in December.
00:26:44.769 --> 00:26:46.638
Let me show you what talented creators are doing.
00:27:16.267 --> 00:27:18.903
Creators are doing amazing things with Omniverse.
00:27:19.237 --> 00:27:27.839
At Foster and Partners, designers in 17 locations around the world
are designing buildings together in their Omniverse shared virtual
space.
00:27:28.480 --> 00:27:33.864
ILM is testing Omniverse to bring together internal and external tool
pipelines from multiple studios.
00:27:33.864 --> 00:27:40.416
Omniverse lets them collaborate, render final shots in real time and
create massive virtual sets like holodecks.
00:27:41.259 --> 00:27:48.400
Ericsson is using Omniverse to do real-time 5G wave propagation
simulation, with many multi-path interferences.
00:27:48.800 --> 00:27:55.573
Twin Earth is creating a digital twin of Earth that will run on
20,000 NVIDIA GPUs.
00:27:55.573 --> 00:28:02.380
And Activision is using Omniverse to organize their more than 100,000
3D assets into a shared and searchable world.
00:28:12.691 --> 00:28:16.294
Bentley is the world's leading infrastructure engineering software
company.
00:28:17.162 --> 00:28:23.566
Everything that's constructed - roads and bridges, rail and transit
systems, airports and seaports -
00:28:23.566 --> 00:28:29.188
amounts to about 3% of the world's GDP, or three-and-a-half trillion
dollars a year.
00:28:29.207 --> 00:28:35.451
Bentley's software is used to design, model, and simulate the largest
infrastructure projects in the world.
00:28:35.451 --> 00:28:39.129
90% of the world's top 250 engineering firms use Bentley.
00:28:40.018 --> 00:28:49.571
They have a new platform called iTwin - an exciting strategy to use
the 3D model, after construction, to monitor and optimize the
structure's performance throughout its life.
00:28:50.295 --> 00:28:55.848
We are super excited to partner with Bentley to create infrastructure
digital twins in Omniverse.
00:28:55.848 --> 00:29:01.614
Bentley is the first third-party company to develop a suite of
applications on the Omniverse platform.
00:29:02.140 --> 00:29:08.546
This is just an awesome use of Omniverse, a great example of digital
twins and Bentley is the perfect partner.
00:29:08.980 --> 00:29:15.120
And here's Perry Nightingale from WPP, the largest ad agency in the
world, to tell you what they're doing.
00:29:18.490 --> 00:29:25.432
WPP is the largest marketing services organization on the planet, and
because of that, we're also one of the largest production companies
in the world.
00:29:25.432 --> 00:29:28.594
That is a major carbon hotspot for us.
00:29:28.594 --> 00:29:35.138
We've partnered with NVIDIA to capture locations virtually and bring
them to life with studios in Omniverse.
00:29:35.138 --> 00:29:39.185
Over 10 billion points have been turned into a giant mesh in Omniverse.
00:29:39.185 --> 00:29:45.361
For the first time, we can shoot locations virtually that are as real
as the actual places themselves.
00:29:45.361 --> 00:29:48.838
Omniverse also changes the way we make work.
00:29:48.838 --> 00:29:56.691
It's a collaborative platform, which means multiple artists, at
multiple points in the pipeline, in multiple parts of the world, can
collaborate on a single scene.
00:29:56.691 --> 00:30:04.968
Real time CGI in sustainable studios. Collaboration with Omniverse is
the future of film at WPP.
00:30:07.505 --> 00:30:11.276
One of the most important features of Omniverse is that it obeys the
laws of physics.
00:30:11.810 --> 00:30:16.347
Omniverse can simulate particles, fluids, materials, springs and
cables.
00:30:16.881 --> 00:30:19.217
This is a fundamental capability for robotics.
00:30:19.617 --> 00:30:23.721
Once trained, the AI and software can be downloaded from Omniverse.
00:30:24.022 --> 00:30:30.735
In this video, you'll see Omniverse's physics simulation with rigid
and soft bodies, fluids, and finite element modeling.
00:30:30.735 --> 00:30:32.652
And a lot more - enjoy!
00:32:16.601 --> 00:32:20.972
Omniverse is a physically-based virtual world where robots can learn
to be robots.
00:32:21.372 --> 00:32:27.278
They'll come in all sizes and shapes - box movers, pick and place
arms, forklifts, cars, trucks.
00:32:27.545 --> 00:32:34.886
In the future, a factory will be a robot, orchestrating many robots
inside, building cars that are robots themselves.
00:32:35.320 --> 00:32:41.392
We can use Omniverse to create a virtual factory, train and simulate
the factory and its robotic workers inside.
00:32:41.726 --> 00:32:48.766
The AI and software that run the virtual factory are exactly the same
as what will run the actual one.
00:32:49.434 --> 00:32:53.538
The virtual and physical factories and their robots will operate in a
loop.
00:32:54.339 --> 00:32:55.740
They are digital twins.
00:32:55.773 --> 00:32:59.944
Connecting to ERP systems, simulating the throughput of the factory,
00:33:00.111 --> 00:33:08.019
simulating new plant layouts, and becoming the dashboard of the
operator - even uplinking into a robot to teleoperate it.
00:33:08.886 --> 00:33:13.291
BMW may very well be the world's largest custom-manufacturing company.
00:33:14.158 --> 00:33:20.665
BMW produces over 2 million cars a year. In their most advanced
factory, a car a minute. Every car is different.
00:33:21.332 --> 00:33:28.806
We are working with BMW to create a future factory. Designed
completely in digital, simulated from beginning to end in Omniverse,
creating a
00:33:28.806 --> 00:33:33.277
digital twin, and operating a factory where robots and humans work
together.
00:33:34.045 --> 00:33:36.714
Let's take a look at the BMW factory.
00:33:41.185 --> 00:33:50.962
Welcome to BMW Production, Jensen. I am pleased to show you why BMW
sets the standards for innovation and flexibility. Our collaboration
00:33:50.962 --> 00:33:58.469
with NVIDIA Omniverse and NVIDIA AI leads us into a new era of
digitalization of automobile production.
00:33:58.936 --> 00:34:03.641
Fantastic to be with you, Milan. I am excited to do this virtual
factory visit with you.
00:34:04.409 --> 00:34:14.218
We are inside the digital twin of BMW's assembly system, powered by
Omniverse. For the first time, we are able to have our entire factory
in
00:34:14.218 --> 00:34:24.629
simulation. Global teams can collaborate using different software
packages like Revit, CATIA, or point clouds to design and plan the
factory
00:34:24.662 --> 00:34:33.971
in real time 3D. The capability to operate in a perfect simulation
revolutionizes BMW's planning processes.
00:34:38.509 --> 00:34:48.352
BMW regularly reconfigures its factories to accommodate new vehicle
launches. Here we see 2 planning experts located in different parts of
00:34:48.352 --> 00:34:58.396
the world, testing a new line design in Omniverse. One of them
wormholes into an assembly simulation with a motion capture suit,
records
00:34:58.463 --> 00:35:04.202
task movements, while the other expert adjusts the line design - in
real time.
00:35:04.202 --> 00:35:09.305
They work together to optimize the line as well as worker ergonomics
and safety.
00:35:10.274 --> 00:35:12.310
“Can you tell how far I have to bend down there?”
00:35:12.443 --> 00:35:14.278
“First, I’ll get you a taller one”
00:35:14.645 --> 00:35:15.446
“Yeah, it’s perfect”
00:35:15.446 --> 00:35:19.450
We would like to be able to do this at scale in simulation.
00:35:20.218 --> 00:35:23.427
That's exactly why NVIDIA has Digital Human for simulation.
00:35:23.427 --> 00:35:26.861
Digital Humans are trained with data from real associates.
00:35:26.861 --> 00:35:32.623
You can then use Digital Humans in simulation to test new workflows
for worker ergonomics and efficiency.
00:35:34.699 --> 00:35:41.353
Now, your factories employ 57,000 people who share workspaces with
many robots designed to make their jobs easier.
00:35:41.353 --> 00:35:42.039
Let's talk about them.
00:35:42.807 --> 00:35:43.791
You are right, Jensen.
00:35:43.791 --> 00:35:52.450
Robots are crucial for a modern production system. With the NVIDIA
Isaac robotics platform, BMW is deploying a fleet of
00:35:52.483 --> 00:35:57.506
intelligent robots for logistics to improve the material flow in our
production.
00:35:57.506 --> 00:36:05.663
This agility is necessary since we produce 2.5 million vehicles per
year, and 99% of them are custom.
00:36:06.998 --> 00:36:12.654
Synthetic data generation and domain randomization available in Isaac
are key to bootstrapping machine learning.
00:36:12.654 --> 00:36:18.993
Isaac Sim generates millions of synthetic images and varies the
environment to teach the robots.
00:36:20.611 --> 00:36:29.731
Domain randomization can generate an infinite permutation of
photorealistic objects, textures, orientations and lighting
conditions.
00:36:29.731 --> 00:36:36.841
It is ideal for generating ground truth, whether for detection,
segmentation, or depth perception.
00:36:38.763 --> 00:36:42.443
Let me show you an example of how we can combine it all to operate
your factory.
00:36:42.443 --> 00:36:49.207
With NVIDIA's Fleet Command, your associates can securely orchestrate
robots and other devices in the factory from Mission Control.
00:36:50.875 --> 00:37:02.258
They can monitor complex manufacturing cells in real time, update
software over the air, launch robot missions, and teleoperate.
00:37:02.258 --> 00:37:08.232
When a robot needs a helping hand, an alert can be sent to Mission
Control and one of your associates can take control to help the robot.
00:37:11.362 --> 00:37:17.959
We're in the digital twin of one of your factories, but you have 30
others spread across 15 countries.
00:37:17.959 --> 00:37:20.142
The scale of BMW production is impressive, Milan.
00:37:20.972 --> 00:37:29.044
Indeed, Jensen, the scale and complexity of our production network
requires BMW to constantly innovate.
00:37:29.044 --> 00:37:34.159
I am happy about the tight collaboration between our two companies.
00:37:34.159 --> 00:37:41.826
NVIDIA Omniverse and NVIDIA AI give us the chance to simulate all 31
factories in our production network.
00:37:41.826 --> 00:37:52.668
These new innovations will reduce planning times, improve flexibility
and precision, and in the end make planning processes 30% more
efficient.
00:37:53.804 --> 00:38:00.804
Milan, I could not be more proud of the innovations that our
collaboration is bringing to the Factories of the Future.
00:38:00.804 --> 00:38:07.250
I appreciate you hosting me for a virtual visit of the digital twin
of your BMW Production.
00:38:07.250 --> 00:38:08.719
It is a work of art!
00:38:13.224 --> 00:38:15.593
The ecosystem is really excited about Omniverse.
00:38:16.360 --> 00:38:23.434
This open platform, with USD universal 3D interchange, connects them
into a large network of users.
00:38:24.135 --> 00:38:28.439
We have 12 connectors to major design tools already with another 40
in flight.
00:38:29.273 --> 00:38:31.942
Omniverse Connector SDK is available for download now.
00:38:32.810 --> 00:38:35.446
You can see that the most important design tools are already signed up.
00:38:35.980 --> 00:38:43.220
Our lighthouse partners are from some of the world's largest
industries - Media and Entertainment, Gaming, Architecture,
Engineering, and
00:38:43.220 --> 00:38:48.459
Construction; Manufacturing, Telecommunications, Infrastructure, and
Automotive.
00:38:48.959 --> 00:38:55.366
Computer makers worldwide are building NVIDIA-Certified workstations,
notebooks, and servers optimized for Omniverse.
00:38:56.267 --> 00:39:00.237
And starting this summer, Omniverse will be available for enterprise
license.
00:39:00.671 --> 00:39:06.544
Omniverse - NVIDIA's platform for creating and simulating shared
virtual worlds.
00:39:08.446 --> 00:39:10.648
The data center is the new unit of computing.
00:39:11.482 --> 00:39:16.520
Cloud computing and AI are driving fundamental changes in the
architecture of data centers.
00:39:17.221 --> 00:39:20.925
Traditionally, enterprise data centers ran monolithic software
packages.
00:39:21.892 --> 00:39:29.845
Virtualization started the trend toward software-defined data centers
- allowing applications to move about and letting IT manage from a
"single pane of glass".
00:39:30.368 --> 00:39:37.608
With virtualization, the compute, networking, storage, and security
functions are emulated in software running on the CPU.
00:39:38.642 --> 00:39:46.717
Though easier to manage, the added CPU load reduced the data center's
capacity to run applications, which is its primary purpose.
00:39:47.251 --> 00:39:51.522
This illustration shows the added CPU load in the gold-colored part
of the stack.
00:39:51.922 --> 00:39:57.628
Cloud computing re-architected data centers again, now to provision
services for billions of consumers.
00:39:58.229 --> 00:40:04.568
Monolithic applications were disaggregated into smaller microservices
that can take advantage of any idle resource.
00:40:05.269 --> 00:40:10.207
Equally important, multiple engineering teams can work concurrently
using CI/CD methods.
00:40:11.409 --> 00:40:17.314
Data center networks became swamped by east-west traffic generated by
disaggregated microservices.
00:40:18.382 --> 00:40:22.119
CSPs tackled this with Mellanox's high-speed low-latency networking.
00:40:23.387 --> 00:40:31.295
Then, deep learning emerged. Magical internet services were rolled
out, attracting more customers, and better engagement than ever.
00:40:32.663 --> 00:40:36.700
Deep learning is compute-intensive which drove adoption of GPUs.
00:40:37.368 --> 00:40:44.074
Nearly overnight, consumer AI services became the biggest users of
GPU supercomputing technologies.
00:40:44.942 --> 00:40:52.416
Now, adding Zero-Trust security initiatives makes infrastructure
software processing one of the largest workloads in the data center.
00:40:53.083 --> 00:41:00.090
The answer is a new type of chip for Data Center Infrastructure
Processing, like NVIDIA's BlueField DPU.
00:41:00.090 --> 00:41:04.028
Let me illustrate this with our own cloud gaming service, GeForce
Now, as an example.
00:41:04.428 --> 00:41:07.598
GeForce Now is NVIDIA's GeForce-in-the-cloud service.
00:41:08.232 --> 00:41:15.172
GeForce Now serves 10 million members in 70 countries. Incredible
growth.
00:41:15.706 --> 00:41:19.417
GeForce Now is a seriously hard consumer service to deliver.
00:41:19.417 --> 00:41:32.475
Everything matters - speed of light, visual quality, frame rate,
response, smoothness, start-up time, server cost, and most important
of all, security.
00:41:33.390 --> 00:41:35.259
We're transitioning GeForce Now to BlueField.
00:41:35.893 --> 00:41:43.300
With BlueField, we can isolate the infrastructure from the game
instances, and offload and accelerate the networking, storage, and
security.
00:41:44.068 --> 00:41:52.176
The GeForce Now infrastructure is costly. With BlueField, we will
improve our quality of service and increase concurrent users at the same time
- the
00:41:52.343 --> 00:41:53.944
ROI of BlueField is excellent.
00:41:54.712 --> 00:42:00.184
I'm thrilled to announce our first data center infrastructure SDK -
DOCA 1.0 is available today!
00:42:00.818 --> 00:42:02.920
DOCA is our SDK to program BlueField.
00:42:03.587 --> 00:42:05.489
There's all kinds of great technology inside.
00:42:05.890 --> 00:42:15.375
Deep packet inspection, secure boot, TLS crypto offload, RegEx
acceleration, and a very exciting capability - a hardware-based,
00:42:15.375 --> 00:42:20.813
real-time clock that can be used for synchronous data centers, 5G,
and video broadcast.
00:42:21.705 --> 00:42:25.175
We have great partners working with us to optimize leading platforms
on BlueField:
00:42:25.709 --> 00:42:34.752
Infrastructure software providers, edge and CDN providers,
cybersecurity solutions, and storage providers - basically the
world's leading
00:42:34.752 --> 00:42:36.754
companies in data center infrastructure.
00:42:37.421 --> 00:42:42.793
Though we're just getting started with BlueField 2, today we're
announcing BlueField 3.
00:42:42.793 --> 00:42:44.461
22 billion transistors.
00:42:44.929 --> 00:42:47.698
The first 400 Gbps networking chip.
00:42:48.165 --> 00:42:54.638
16 Arm CPUs to run the entire virtualization software stack - for
instance, running VMware ESX.
00:42:55.005 --> 00:43:03.614
BlueField 3 takes security to a whole new level, fully offloading and
accelerating IPSEC and TLS cryptography, secret key management, and
00:43:03.614 --> 00:43:05.049
regular expression processing.
00:43:05.783 --> 00:43:09.253
We are on pace to introduce a new BlueField generation every 18
months.
00:43:10.120 --> 00:43:16.093
BlueField 3 will do 400 Gbps and be 10x the processing capability of
BlueField 2.
00:43:17.161 --> 00:43:25.436
And BlueField 4 will do 800 Gbps and add NVIDIA's AI computing
technologies to get another 10x boost.
00:43:26.236 --> 00:43:30.240
100x in 3 years -- and all of it will be needed.
00:43:31.175 --> 00:43:39.750
A simple way to think about this is that 1/3rd of the roughly 30
million data center servers shipped each year are consumed running the
00:43:39.750 --> 00:43:42.286
software-defined data center stack.
00:43:43.020 --> 00:43:46.090
This workload is increasing much faster than Moore's law.
00:43:46.690 --> 00:43:52.563
So, unless we offload and accelerate this workload, data centers will
have fewer and fewer CPUs to run applications.
00:43:52.930 --> 00:43:54.264
The time for BlueField has come.
00:43:55.065 --> 00:44:04.241
At the beginning of the big bang of modern AI, we recognized the need
to create a new kind of computer for a new way of developing software.
00:44:04.642 --> 00:44:08.545
Software will be written by software running on AI computers.
00:44:09.613 --> 00:44:19.723
This new type of computer will need new chips, new system
architecture, new ways to network, new software, and new
methodologies and tools.
00:44:20.524 --> 00:44:24.995
We've invested billions into this intuition, and it has proven
helpful to the industry.
00:44:25.763 --> 00:44:29.767
It all comes together as DGX - a computer for AI.
00:44:30.234 --> 00:44:37.374
We offer DGX as a fully integrated system, as well as offer the
components to the industry to create differentiated options.
00:44:37.841 --> 00:44:41.286
I am pleased to see so much AI research advancing because of DGX -
00:44:41.286 --> 00:44:48.997
top universities, research hospitals, telcos, banks, consumer
products companies, car makers, and aerospace companies.
00:44:49.787 --> 00:44:56.319
DGX helps their AI researchers - whose expertise is rare and scarce,
and whose work is strategic.
00:44:56.319 --> 00:44:58.862
It is imperative to make sure they have the right instrument.
00:44:59.663 --> 00:45:06.570
Simply, if software is to be written by computers, then companies
with the best software engineers will also need the best computers.
00:45:07.204 --> 00:45:10.441
We offer several configurations - all software compatible.
00:45:10.908 --> 00:45:18.849
The DGX A100 is a building block that contains 5 petaFLOPS of
computing and superfast storage and networking to feed it.
00:45:19.316 --> 00:45:25.889
DGX Station is an AI data center in-a-box designed for workgroups -
plugs into a normal outlet.
00:45:26.423 --> 00:45:35.634
And DGX SuperPOD is a fully integrated, fully network-optimized,
AI-data-center-as-a-product. SuperPOD is for intensive AI research
and development.
00:45:36.734 --> 00:45:41.083
NVIDIA's own new supercomputer, called Selene, is 4 SuperPODs.
00:45:41.083 --> 00:45:46.943
It is the 5th fastest supercomputer on the world's Top 500 list, and
the fastest industrial supercomputer.
00:45:47.745 --> 00:45:50.914
We have a new DGX Station 320G.
00:45:51.482 --> 00:46:04.591
DGX Station can train large models - 320 gigabytes of super-fast
HBM2e connected to 4 A100 GPUs over 8 terabytes per second of memory
bandwidth.
00:46:04.762 --> 00:46:07.998
8 terabytes transferred in one second.
00:46:08.332 --> 00:46:12.236
It would take 40 CPU servers to achieve this memory bandwidth.
00:46:13.303 --> 00:46:23.614
DGX Station plugs into a normal wall outlet, like a big gaming rig,
consumes just 1,500 watts, and is liquid-cooled to a silent 37 db.
00:46:24.348 --> 00:46:27.284
Take a look at the cinematic that our engineers and creative team did.
00:47:30.814 --> 00:47:34.284
A CPU cluster of this performance would cost about a million dollars
today.
00:47:34.818 --> 00:47:41.959
DGX Station is $149,000 - the ideal AI programming companion for
every AI researcher.
00:47:42.659 --> 00:47:45.729
Today we are also announcing a new DGX SuperPOD.
00:47:46.063 --> 00:47:47.464
Three major upgrades:
00:47:48.098 --> 00:47:59.710
The new 80 gigabyte A100 which brings the SuperPOD to 90 terabytes of
HBM2 memory, with aggregate bandwidth of 2.2 exabytes per second.
00:48:00.744 --> 00:48:10.921
It would take 11,000 CPU servers to achieve this bandwidth - about a
250-rack data center - 15 times bigger than the SuperPOD.
00:48:12.389 --> 00:48:15.993
Second, SuperPOD has been upgraded with NVIDIA BlueField-2.
00:48:16.727 --> 00:48:25.903
SuperPOD is now the world's first cloud-native supercomputer,
multi-tenant shareable, with full isolation and bare-metal
performance.
00:48:26.937 --> 00:48:33.610
And third, we're offering Base Command, the DGX management and
orchestration tool used within NVIDIA.
00:48:34.845 --> 00:48:42.519
We use Base Command to support thousands of engineers, over two
hundred teams, consuming a million-plus GPU-hours a week.
00:48:43.420 --> 00:48:48.258
DGX SuperPOD starts at seven million dollars and scales to sixty
million dollars for a full system.
00:48:49.693 --> 00:48:51.862
Let me highlight 3 great uses of DGX.
00:48:52.562 --> 00:48:56.033
Transformers have led to dramatic breakthroughs in Natural Language
Processing.
00:48:56.867 --> 00:49:02.539
Like RNNs and LSTMs, Transformers are designed to operate on
sequential data.
00:49:03.173 --> 00:49:12.816
However, there is more to Transformers than meets the eye: they are
not trained sequentially, but use a mechanism called attention such
that Transformers
00:49:12.816 --> 00:49:14.117
can be trained in parallel.
00:49:14.751 --> 00:49:22.781
This breakthrough reduced training time and, more importantly,
enabled the training of huge models with a correspondingly enormous
amount of data.
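To make the attention idea concrete, here is a minimal NumPy sketch of scaled dot-product self-attention - an illustrative toy, not NVIDIA's or any production Transformer code. Note how every token's output is computed in one matrix operation over the whole sequence, which is what makes parallel training possible:

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: every position attends to every
    # other position at once, so the whole sequence is processed in
    # parallel rather than step by step as in an RNN.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# Toy example: a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = attention(x, x, x)   # self-attention
print(out.shape)           # (4, 8) - one output per token
```

Because the scores matrix covers all token pairs at once, there is no sequential dependency between positions during training.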
00:49:23.560 --> 00:49:27.864
Unsupervised learning can now achieve excellent results, but the
models are huge.
00:49:28.665 --> 00:49:32.002
Google's Transformer was 65 million parameters.
00:49:32.869 --> 00:49:38.275
OpenAI's GPT-3 is 175 billion parameters.
00:49:38.742 --> 00:49:41.845
That's roughly 3,000 times larger in just 3 years.
00:49:42.579 --> 00:49:45.182
The applications for GPT-3 are really incredible.
00:49:45.649 --> 00:49:47.150
Generate document summaries.
00:49:47.718 --> 00:49:49.319
Email phrase completion.
00:49:50.120 --> 00:49:57.461
GPT-3 can even generate JavaScript and HTML from plain English -
essentially telling an AI to write code based on what you want it to
do.
00:49:58.528 --> 00:50:03.734
Model sizes are growing exponentially - at a pace of doubling every
two and a half months.
00:50:04.534 --> 00:50:11.341
We expect to see multi-trillion-parameter models by next year, and
100 trillion+ parameter models by 2023.
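As a quick sanity check on the growth figures quoted here - the 65M and 175B parameter counts come from the talk, the rest is simple arithmetic:

```python
params_transformer = 65e6     # Google's original Transformer
params_gpt3 = 175e9           # OpenAI's GPT-3

# Growth over roughly 3 years:
print(params_gpt3 / params_transformer)   # ~2,692x

# If model size doubles every 2.5 months, growth per year is:
doubles_per_year = 12 / 2.5               # 4.8 doublings
print(2 ** doubles_per_year)              # ~27.9x per year
```

At that yearly rate, going from hundreds of billions of parameters to tens of trillions within a couple of years is plausible, which is the trajectory described above.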
00:50:13.110 --> 00:50:19.082
As a very loose comparison, the human brain has roughly 125 trillion
synapses.
00:50:20.250 --> 00:50:22.252
So these transformer models are getting quite large.
00:50:22.986 --> 00:50:26.456
Training models of this scale is incredible computer science.
00:50:27.324 --> 00:50:32.996
Today, we are announcing NVIDIA Megatron - for training Transformers.
00:50:34.164 --> 00:50:41.505
Megatron trains giant Transformer models - it partitions and
distributes the model for optimal multi-GPU and multi-node
parallelism.
00:50:42.406 --> 00:50:51.481
Megatron does fast data loading, micro-batching, scheduling and
syncing, and kernel fusion. It pushes the limits of every NVIDIA
invention -
00:50:51.648 --> 00:50:56.186
NCCL, NVLink, InfiniBand, Tensor Cores.
00:50:57.154 --> 00:51:01.691
Even with Megatron, a trillion-parameter model will take about 3-4
months to train on Selene.
00:51:02.659 --> 00:51:05.862
So, lots of DGX SuperPODs will be needed around the world.
00:51:06.563 --> 00:51:10.834
Inferencing giant Transformer models is also a great computer science
challenge.
00:51:11.234 --> 00:51:22.476
GPT-3 is so big, with so many floating-point operations, that it
would take a dual-CPU server over a minute to respond to a single
128-word query.
00:51:23.713 --> 00:51:29.319
And GPT-3 is so large that it doesn't fit in GPU memory - so it will
have to be distributed.
00:51:30.187 --> 00:51:33.423
Multi-GPU, multi-node inference has never been done.
00:51:34.257 --> 00:51:39.563
Today, we're announcing the Megatron Triton Inference Server.
00:51:40.764 --> 00:51:50.273
A DGX with Megatron Triton will respond within a second! Not a minute
- a second! And for 16 queries at the same time.
00:51:51.174 --> 00:51:59.649
DGX is 1000 times faster and opens up many new use-cases, like
call-center support, where a one-minute response is effectively
unusable.
00:52:01.118 --> 00:52:03.019
Naver is Korea's #1 search engine.
00:52:03.920 --> 00:52:10.360
They installed a DGX SuperPOD and are running their AI platform CLOVA
to train language models for Korean.
00:52:11.261 --> 00:52:15.275
I expect many leading service providers around the world to do the
same:
00:52:15.275 --> 00:52:20.455
Use DGX to develop and operate region-specific and industry-specific
language services.
00:52:22.339 --> 00:52:31.114
NVIDIA Clara Discovery is our suite of acceleration libraries created
for computational drug discovery - from imaging, to quantum chemistry,
00:52:31.748 --> 00:52:39.923
to gene variant-calling, to using NLP to understand genetics, and
using AI to generate new drug compounds.
00:52:41.124 --> 00:52:44.528
Today we're announcing four new models available in Clara Discovery:
00:52:45.495 --> 00:52:49.831
MegaMolBART is a model for generating biomolecular compounds.
00:52:49.831 --> 00:52:56.617
This method has seen recent success with Insilico Medicine using AI
to find a new drug in less than two years.
00:52:57.073 --> 00:53:05.549
NVIDIA's ATAC-seq denoising algorithm for rare- and single-cell
epigenomics is helping to understand gene expression in individual
cells.
00:53:06.516 --> 00:53:12.689
AlphaFold1 is a model that can predict the 3D structure of a protein
from the amino acid sequence.
00:53:12.689 --> 00:53:18.895
GatorTron is the world's largest clinical language model that can
read and understand doctors' notes.
00:53:18.895 --> 00:53:26.468
GatorTron was developed at the University of Florida, using Megatron,
and trained on the DGX SuperPOD gifted
00:53:26.468 --> 00:53:29.877
to his alma mater by Chris Malachowsky, who founded NVIDIA with
Curtis and me.
00:53:31.041 --> 00:53:39.082
Oxford Nanopore makes third-generation genomic sequencing technology
capable of ultra-high throughput in digitizing biology -
00:53:39.082 --> 00:53:46.356
one fifth of the SARS-CoV-2 virus genomes in the global database were
generated on Oxford Nanopore.
00:53:46.356 --> 00:53:52.596
Last year, Oxford Nanopore developed a diagnostic test for COVID-19
called LamPORE, which is used by the NHS.
00:53:53.463 --> 00:53:55.865
Oxford Nanopore is GPU-accelerated throughout.
00:53:56.766 --> 00:54:06.539
DNA samples pass through nanopores and the current signal is fed into
an AI model, like speech recognition, but trained to recognize
genetic code.
00:54:07.577 --> 00:54:11.648
Another model called Medaka reads the code and detects genetic
variants.
00:54:12.015 --> 00:54:14.784
Both models were trained on DGX SuperPOD.
00:54:15.352 --> 00:54:24.327
These new deep learning algorithms achieve 99.9% detection accuracy
of single nucleotide variants - this is the gold standard of human
sequencing.
00:54:24.861 --> 00:54:32.602
Pharma is a 1.3 trillion-dollar industry, where a new drug can take
10+ years to develop and fails 90% of the time.
00:54:33.770 --> 00:54:40.443
Schrodinger is the leading physics-based and machine-learning
computational platform for drug discovery and materials science.
00:54:40.844 --> 00:54:50.808
Schrodinger is already a heavy user of NVIDIA GPUs, and recently
entered into an agreement to use hundreds of millions of NVIDIA GPU
hours on the Google cloud.
00:54:51.554 --> 00:54:59.996
Some customers can't use the cloud, so today we are announcing a
partnership to accelerate Schrodinger's drug discovery workflow with
NVIDIA
00:55:00.063 --> 00:55:04.301
Clara Discovery libraries and NVIDIA DGX.
00:55:04.301 --> 00:55:10.740
The world's top 20 pharmas use Schrodinger today. Their researchers
are going to see a giant boost in productivity.
00:55:11.574 --> 00:55:19.182
Recursion is a biotech company using leading-edge computer science to
decode biology to industrialize drug discovery.
00:55:19.949 --> 00:55:30.506
The Recursion Operating System is built on NVIDIA DGX SuperPOD for
generating, analyzing and gaining insight from massive biological and
chemical datasets.
00:55:31.561 --> 00:55:37.767
They call their SuperPOD the BioHive-1 - it's the most powerful
computer at any pharma today.
00:55:38.702 --> 00:55:46.409
Using deep learning on DGX, Recursion is classifying cell responses
after exposure to small molecule drugs.
00:55:47.277 --> 00:55:56.226
Quantum computing is a field of physics that studies the use of
natural quantum behavior - superposition, entanglement, and
interference - to build a computer.
00:55:56.853 --> 00:56:01.491
The computation is performed using quantum circuits that operate on
quantum bits - called qubits.
00:56:02.492 --> 00:56:09.866
Qubits can be 0 or 1, like a classical computing bit, but also in
superposition - meaning they exist simultaneously in both states.
00:56:10.533 --> 00:56:15.672
The qubits can be entangled where the behavior of one can affect or
control the behavior of others.
00:56:16.206 --> 00:56:22.078
Adding and entangling more qubits lets quantum computers process
exponentially more information.
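The exponential scaling described above is easy to see in a state-vector simulation - the approach cuQuantum accelerates, discussed later in the talk. Here is a tiny illustrative simulator (an assumption-laden toy, not cuQuantum's API): an n-qubit state holds 2**n complex amplitudes, so every added qubit doubles the memory and work:

```python
import numpy as np

def apply_gate(state, gate, target, n):
    # Reshape so the target qubit is its own axis, apply the 2x2 gate,
    # then restore the flat 2**n amplitude vector.
    state = state.reshape([2] * n)
    state = np.moveaxis(state, target, 0)
    state = np.tensordot(gate, state, axes=1)
    state = np.moveaxis(state, 0, target)
    return state.reshape(-1)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard: creates superposition

n = 3
state = np.zeros(2**n, dtype=complex)
state[0] = 1.0                      # start in |000>
state = apply_gate(state, H, 0, n)  # qubit 0 now in equal superposition
print(np.abs(state)**2)             # probability 0.5 on |000>, 0.5 on |100>
```

With 50 qubits the vector already has 2**50 amplitudes - which is why simulating "tens of qubits" productively is a serious computing problem.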
00:56:22.812 --> 00:56:26.883
There is a large community around the world doing research in quantum
computers and algorithms.
00:56:27.550 --> 00:56:32.689
Well over 50 teams in industry, academia, and national labs are
researching the field.
00:56:33.390 --> 00:56:34.624
We're working with many of them.
00:56:35.325 --> 00:56:44.067
Quantum computing can solve exponential-order complexity problems,
like factoring large numbers for cryptography, simulating atoms and
00:56:44.067 --> 00:56:49.806
molecules for drug discovery, and finding shortest-path
optimizations, like the traveling salesman problem.
00:56:50.507 --> 00:56:57.647
The limiter in quantum computing is decoherence - falling out of
quantum states - caused by the tiniest background noise.
00:56:58.081 --> 00:56:59.649
So error correction is essential.
00:57:00.483 --> 00:57:07.323
It is estimated that to solve meaningful problems, several million
physical qubits will be required to sufficiently error correct.
00:57:07.957 --> 00:57:17.357
The research community is making fast progress, doubling physical
qubits each year, and will likely achieve that milestone by 2035 to
2040.
00:57:17.357 --> 00:57:19.759
Well within my career horizon.
00:57:20.036 --> 00:57:26.843
In the meantime, our mission is to help the community research the
computer of tomorrow with the fastest computer of today.
00:57:27.777 --> 00:57:34.117
Today, we're announcing cuQuantum - an acceleration library designed
for simulating quantum circuits
00:57:34.484 --> 00:57:37.520
for both Tensor Network Solvers and State Vector Solvers.
00:57:38.788 --> 00:57:44.260
It is optimized to scale to large GPU memories, multiple GPUs, and
multiple DGX nodes.
00:57:45.428 --> 00:57:54.571
The speed-up of cuQuantum on DGX is excellent. Running the cuQuantum
Benchmark, state vector simulation takes 10 days on a dual-CPU server
00:57:55.271 --> 00:58:02.979
but only 2 hours on a DGX A100. cuQuantum on DGX can productively
simulate tens of qubits.
00:58:04.481 --> 00:58:15.792
And Caltech, using Cotengra/Quimb, simulated the Sycamore quantum
circuit at depth 20 in record time using cuQuantum on NVIDIA's Selene
supercomputer.
00:58:16.392 --> 00:58:23.299
What would have taken years on CPUs can now run in a few days on
cuQuantum and DGX.
00:58:24.167 --> 00:58:31.808
cuQuantum will accelerate quantum circuit simulators so researchers
can design better quantum computers and verify their results,
architect
00:58:31.808 --> 00:58:38.348
hybrid quantum-classical systems - and discover more quantum-optimal
algorithms like Shor's and Grover's.
00:58:39.115 --> 00:58:43.853
cuQuantum on DGX is going to give the quantum community a huge boost.
00:58:44.554 --> 00:58:49.659
I'm hoping cuQuantum will do for quantum computing what cuDNN did for
deep learning.
00:58:50.493 --> 00:58:54.964
Modern data centers host diverse applications that require varying
system architectures.
00:58:55.532 --> 00:59:01.337
Enterprise servers are optimized for a balance of strong
single-threaded performance and a nominal number of cores.
00:59:01.905 --> 00:59:10.046
Hyperscale servers, optimized for microservice containers, are
designed for a high number of cores, low cost, and great
energy-efficiency.
00:59:10.680 --> 00:59:14.918
Storage servers are optimized for a large number of cores and high IO
throughput.
00:59:15.285 --> 00:59:22.592
Deep learning training servers are built like supercomputers - with
the largest number of fast CPU cores, the fastest memory,
00:59:22.592 --> 00:59:26.796
the fastest IO, and high-speed links to connect the GPUs.
00:59:26.996 --> 00:59:33.670
Deep learning inference servers are optimized for energy efficiency
and the ability to process a large number of models concurrently.
00:59:34.437 --> 00:59:45.448
The genius of the x86 server architecture is the ability to do a good
job using varying configurations of the CPU, memory, PCI express, and
00:59:45.448 --> 00:59:48.384
peripherals to serve all of these applications.
00:59:49.218 --> 00:59:56.793
Yet processing large amounts of data remains a challenge for computer
systems today - this is particularly true for AI models like
00:59:56.793 --> 00:59:58.595
transformers and recommender systems.
00:59:59.228 --> 01:00:03.099
Let me illustrate the bottleneck with half of a DGX.
01:00:03.733 --> 01:00:10.440
Each Ampere GPU is connected to 80GB of super fast memory running at
2 TB/sec.
01:00:11.374 --> 01:00:18.715
Together, the 4 Amperes process 320 GB at 8 Terabytes per second.
01:00:19.382 --> 01:00:27.308
Contrast that with CPU memory, which is 1 terabyte in size but runs
at only 0.2 terabytes per second.
01:00:27.308 --> 01:00:32.178
The CPU memory is 3 times larger but 40 times slower than the GPU.
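The arithmetic behind those comparisons, using the figures just quoted (the rounded "3 times larger" in the talk corresponds to 3.2x exactly):

```python
gpu_mem_gb = 4 * 80          # four A100 GPUs, 80 GB of HBM2e each -> 320 GB
gpu_bw_tbs = 4 * 2.0         # 2 TB/s per GPU -> 8 TB/s aggregate
cpu_mem_gb = 1024            # 1 TB of CPU memory
cpu_bw_tbs = 0.2             # 0.2 TB/s of CPU memory bandwidth

print(cpu_mem_gb / gpu_mem_gb)   # 3.2  - CPU memory is larger...
print(gpu_bw_tbs / cpu_bw_tbs)   # 40.0 - ...but far slower than the GPUs
```

That mismatch - large-but-slow CPU memory next to small-but-fast GPU memory - is exactly the bottleneck the following discussion of Grace and NVLink addresses.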
01:00:33.029 --> 01:00:39.102
We would love to utilize the full 1,320 GB of memory of this node to
train AI models.
01:00:39.869 --> 01:00:41.404
So, why not something like this?
01:00:42.171 --> 01:00:49.379
Make faster CPU memories, connect 4 channels to the CPU, a dedicated
channel to feed each GPU.
01:00:49.846 --> 01:00:53.750
Even if a package can be made, PCIe is now the bottleneck.
01:00:54.450 --> 01:01:02.258
We can surely use NVLINK. NVLINK is fast enough. But no x86 CPU has
NVLINK, not to mention 4 NVLINKS.
01:01:03.593 --> 01:01:13.469
Today, we're announcing our first data center CPU, Project Grace,
named after Grace Hopper, a computer scientist and U.S. Navy Rear
Admiral,
01:01:13.903 --> 01:01:16.105
who in the '50s pioneered computer programming.
01:01:17.006 --> 01:01:23.646
Grace is Arm-based and purpose-built for accelerated computing
applications that process large amounts of data - such as AI.
01:01:24.380 --> 01:01:26.358
Grace highlights the beauty of Arm.
01:01:26.358 --> 01:01:33.527
Their IP model allowed us to create the optimal CPU for this
application, which achieves x-factor speed-ups.
01:01:34.323 --> 01:01:45.568
The Arm core in Grace is a next-generation off-the-shelf IP for
servers. Each CPU will deliver over 300 SPECint, with a total of over
2,400
01:01:45.568 --> 01:01:51.774
SPECint_rate CPU performance for an 8-GPU DGX.
01:01:52.275 --> 01:02:00.149
For comparison, today's DGX, the highest-performance computer in the
world today, is 450 SPECint_rate.
01:02:00.883 --> 01:02:07.256
2400 SPECint_rate with Grace versus 450 SPECint_rate today.
01:02:07.890 --> 01:02:17.133
So look at this again - Before, After, Before, After.
01:02:18.367 --> 01:02:21.604
Amazing increase in system and memory bandwidth.
01:02:22.538 --> 01:02:25.174
Today, we're introducing a new kind of computer.
01:02:26.075 --> 01:02:28.578
The basic building block of the modern data center.
01:02:29.479 --> 01:02:30.146
Here it is.
01:02:42.391 --> 01:02:50.688
What I'm about to show you brings together the latest GPU accelerated
computing, Mellanox high performance networking, and something brand
new.
01:02:51.267 --> 01:02:52.935
The final piece of the puzzle.
01:02:59.142 --> 01:03:07.416
The world's first CPU designed for terabyte-scale accelerated
computing... her secret codename - GRACE.
01:03:09.385 --> 01:03:19.310
This powerful, Arm-based CPU gives us the third foundational
technology for computing, and the ability to rearchitect every aspect
of the data center for AI.
01:03:20.663 --> 01:03:27.804
We're thrilled to announce the Swiss National Supercomputing Center
will build a supercomputer powered by Grace and our next generation
GPU.
01:03:28.504 --> 01:03:39.582
This new supercomputer, called Alps, will be 20 exaflops for AI, 10
times faster than the world's fastest supercomputer today.
01:03:40.583 --> 01:03:48.100
Alps will be used to do whole-earth-scale weather and climate
simulation, quantum chemistry and quantum physics for the Large
Hadron Collider.
01:03:48.925 --> 01:03:54.730
Alps will be built by HPE and will come online in 2023.
01:03:54.730 --> 01:04:02.605
We're thrilled by the enthusiasm of the supercomputing community,
welcoming us to make Arm a top-notch scientific computing platform.
01:04:03.372 --> 01:04:11.214
Our data center roadmap is now a rhythm consisting of 3-chips: CPU,
GPU, and DPU.
01:04:12.114 --> 01:04:16.986
Each chip architecture has a two-year rhythm with likely a kicker in
between.
01:04:17.954 --> 01:04:20.590
One year will focus on x86 platforms.
01:04:21.023 --> 01:04:23.125
One year will focus on Arm platforms.
01:04:23.826 --> 01:04:26.662
Every year will see new exciting products from us.
01:04:27.396 --> 01:04:34.403
The NVIDIA architecture and platforms will support x86 and Arm -
whatever customers and markets prefer.
01:04:35.338 --> 01:04:39.075
Three chips. Yearly Leaps. One Architecture.
01:04:39.976 --> 01:04:42.311
Arm is the most popular CPU in the world.
01:04:43.246 --> 01:04:51.921
For good reason - it's super energy-efficient. Its open licensing
model inspires a world of innovators to create products around it.
01:04:52.955 --> 01:04:55.391
Arm is used broadly in mobile and embedded today.
01:04:56.392 --> 01:05:03.018
In other markets - like the cloud, enterprise and edge data centers,
supercomputing, and PCs -
01:05:03.018 --> 01:05:07.096
Arm is just starting and has great growth opportunities.
01:05:07.637 --> 01:05:14.110
Each market has different applications and has unique systems,
software, peripherals, and ecosystems.
01:05:15.011 --> 01:05:18.347
For the markets we serve, we can accelerate Arm's adoption.
01:05:19.248 --> 01:05:20.950
Let's start with the big one - Cloud.
01:05:21.951 --> 01:05:29.558
One of the earliest designers of Arm CPUs for data centers is AWS -
its Graviton CPUs are extremely impressive.
01:05:30.259 --> 01:05:37.767
Today, we're announcing NVIDIA and AWS are partnering to bring
Graviton2 and NVIDIA GPUs together.
01:05:38.668 --> 01:05:44.774
This partnership brings Arm into the most demanding cloud workloads -
AI and cloud gaming.
01:05:45.408 --> 01:05:50.079
Mobile gaming is growing fast and is the primary form of gaming in
some markets.
01:05:50.813 --> 01:05:58.621
With AWS-designed Graviton2, users can stream Arm-based applications
and Android games straight from AWS.
01:05:59.422 --> 01:06:00.656
It's expected later this year.
01:06:01.524 --> 01:06:08.197
We are announcing a partnership with Ampere Computing to create a
scientific and cloud computing SDK and reference system.
01:06:09.265 --> 01:06:19.141
Ampere Computing's Altra CPU is excellent - 80 cores, 285 SPECint17,
right up there with the highest performance x86.
01:06:20.176 --> 01:06:25.581
We are seeing excellent reception at supercomputing centers around
the world and at Android cloud gaming services.
01:06:26.282 --> 01:06:33.689
We are also announcing a partnership with Marvell to create an edge
and enterprise computing SDK and reference system.
01:06:34.623 --> 01:06:42.765
Marvell Octeon excels at IO, storage and 5G processing. This system
is ideal for hyperconverged edge servers.
01:06:44.066 --> 01:06:50.439
We're announcing a partnership with Mediatek to create a reference
system and SDK for Chrome OS and Linux PC's.
01:06:51.140 --> 01:06:54.777
Mediatek is the world's largest SoC maker.
01:06:55.611 --> 01:07:01.250
Combining NVIDIA GPUs and Mediatek SOCs will make excellent PCs and
notebooks.
01:07:02.485 --> 01:07:09.492
AI, computers automating intelligence, is the most powerful
technology force of our time.
01:07:10.192 --> 01:07:11.927
We see AI in four waves.
01:07:12.261 --> 01:07:20.202
The first wave was to reinvent computing for this new way of doing
software - we're all in and have been driving this for nearly 10
years.
01:07:20.903 --> 01:07:27.810
The first adopters of AI were the internet companies - they have
excellent computer scientists, large computing infrastructures, and
the
01:07:27.810 --> 01:07:29.712
ability to collect a lot of training data.
01:07:30.880 --> 01:07:32.915
We are now at the beginning of the next wave.
01:07:33.716 --> 01:07:42.691
The next wave is enterprise and the industrial edge, where AI can
revolutionize the world's largest industries - from manufacturing,
01:07:42.792 --> 01:07:47.797
logistics, agriculture, healthcare, financial services, and
transportation.
01:07:48.531 --> 01:07:53.869
There are many challenges to overcome, one of which is connectivity,
which 5G will solve.
01:07:54.503 --> 01:08:01.510
And then autonomous systems. Self-driving cars are an excellent
example. But everything that moves will eventually be autonomous.
01:08:02.244 --> 01:08:09.518
The industrial edge and autonomous systems are the most challenging,
but also the largest opportunities for AI to make an impact.
01:08:10.152 --> 01:08:17.560
Trillion dollar industries can soon apply AI to improve productivity,
and invent new products, services and business models.
01:08:18.761 --> 01:08:24.800
We have to make AI easier to use - turn AI from computer science to
computer products.
01:08:25.201 --> 01:08:32.842
We're building the new computing platform for this fundamentally new
software approach - the computer for the age of AI.
01:08:33.676 --> 01:08:47.103
AI is not just about an algorithm - building and operating AI is a
fundamental change in every aspect of software - Andrej Karpathy
rightly called it Software 2.0.
01:08:47.123 --> 01:08:54.663
Machine learning, at the highest level, is a continuous learning
system that starts with data scientists developing data strategies
01:08:54.663 --> 01:09:00.636
and engineering predictive features - this data is the digital life
experience of a company.
01:09:01.470 --> 01:09:06.675
Training involves inventing or adapting an AI model that learns to
make the desired predictions.
01:09:07.543 --> 01:09:14.049
Simulation and validation test the AI application for accuracy,
generalization, and potential bias.
01:09:14.550 --> 01:09:22.165
And finally, orchestrating a fleet of computers, whether in your data
center or at the edge - in warehouses, farms, or wireless base
stations.
01:09:22.191 --> 01:09:31.233
NVIDIA created the chips, systems, and libraries needed for
end-to-end machine learning - for example, technologies like Tensor
Core GPUs,
01:09:31.333 --> 01:09:36.972
NVLINK, DGX, cuDNN, RAPIDS, NCCL, GPU Direct, DOCA, and so much more.
01:09:37.673 --> 01:09:39.642
We call the platform NVIDIA AI.
01:09:40.709 --> 01:09:46.649
NVIDIA AI libraries accelerate every step, from data processing to
fleet orchestration.
01:09:47.416 --> 01:09:51.787
NVIDIA AI is integrated into all of the industry's popular tools and
workflows.
01:09:52.788 --> 01:10:00.196
NVIDIA AI is in every cloud, used by the world's largest companies,
and by over 7,500 AI startups around the world.
01:10:00.596 --> 01:10:11.006
And NVIDIA AI runs on any system that includes NVIDIA GPUs, from PCs
and laptops, to workstations, to supercomputers, in any cloud, to our
01:10:11.006 --> 01:10:13.209
$99 Jetson robot computer.
01:10:13.976 --> 01:10:18.247
One segment of computing we've not served is enterprise computing.
01:10:19.181 --> 01:10:23.118
70% of the world's enterprises run VMware, as we do at NVIDIA.
01:10:23.619 --> 01:10:27.890
VMware was created to run many applications on one virtualized
machine.
01:10:28.857 --> 01:10:36.098
AI, on the other hand, runs a single job, bare-metal, on multiple
GPUs and often multiple nodes.
01:10:36.932 --> 01:10:44.473
All of the NVIDIA optimizations for compute and data transfer are now
plumbed through the VMware stack so AI workloads can be distributed to
01:10:44.473 --> 01:10:47.610
multiple systems and achieve bare-metal performance.
01:10:48.444 --> 01:10:51.914
The VMware stack is also offloaded and accelerated on NVIDIA
BlueField.
01:10:52.815 --> 01:11:02.224
NVIDIA AI now runs in its full glory on VMware, which means
everything that has been accelerated by NVIDIA AI now runs great on
VMware.
01:11:03.158 --> 01:11:07.630
AI applications can be deployed and orchestrated with Kubernetes
running on VMware Tanzu.
01:11:08.364 --> 01:11:11.867
We call this platform NVIDIA EGX for Enterprise.
01:11:12.534 --> 01:11:22.444
The enterprise IT ecosystem is thrilled - finally the 300,000 VMware
enterprise customers can easily build an AI computing infrastructure
01:11:22.778 --> 01:11:25.814
that seamlessly integrates into their existing environment.
01:11:26.548 --> 01:11:32.721
In total, over 50 servers from the world's top server makers will be
certified for NVIDIA EGX Enterprise.
01:11:33.022 --> 01:11:38.227
BlueField-2 offloads and accelerates the VMware stack and does the
networking for distributed computing.
01:11:38.794 --> 01:11:46.001
Enterprise can choose big or small GPUs for heavy-compute or
heavy-graphics workloads like Omniverse, or mix and match.
01:11:46.902 --> 01:11:48.370
All run NVIDIA AI.
01:11:49.171 --> 01:11:58.013
Enterprise companies make up the world's largest industries and they
operate at the edge - in hospitals, factories, plants, warehouses,
01:11:58.013 --> 01:12:02.117
stores, farms, cities and roads - far from data centers.
01:12:03.185 --> 01:12:04.787
The missing link is 5G.
01:12:05.621 --> 01:12:10.059
Consumer 5G is great, but Private 5G is revolutionary.
01:12:10.893 --> 01:12:20.102
Today, we're announcing the Aerial A100 - bringing together 5G and AI
into a new type of computing platform designed for the edge.
01:12:20.636 --> 01:12:29.578
Aerial A100 integrates the Ampere GPU and BlueField DPU into one card
- this is the most advanced PCI express card ever created.
01:12:29.978 --> 01:12:35.918
So, it's not a surprise that Aerial A100 in an EGX system will be a
complete 5G base station.
01:12:36.952 --> 01:12:52.201
Aerial A100 delivers a full 20 Gbps and can process up to 9 100 MHz
massive MIMO carriers for 64T64R - or 64-transmit and 64-receive
antenna arrays
01:12:52.201 --> 01:12:54.603
- state of the art capabilities.
01:12:55.270 --> 01:13:06.843
Aerial A100 is software-defined, with accelerated features like PHY,
Virtual Network Functions, network acceleration, packet pacing, and
line-rate cryptography.
01:13:08.117 --> 01:13:16.825
Our partners Ericsson, Fujitsu, Mavenir, Altran, and Radisys will
build their total 5G solutions on top of the Aerial library.
01:13:17.826 --> 01:13:28.303
NVIDIA EGX server with Aerial A100 is the first 5G base-station that
is also a cloud-native, secure, AI edge data center.
01:13:29.138 --> 01:13:32.708
We have brought the power of the cloud to the 5G edge.
01:13:33.609 --> 01:13:36.779
Aerial also extends the power of 5G into the cloud.
01:13:37.312 --> 01:13:43.152
Today, we are excited to announce that Google will support NVIDIA
Aerial in the GCP cloud.
01:13:43.986 --> 01:13:45.621
I have an important new platform to tell you about.
01:13:46.522 --> 01:13:54.229
The rise of microservice-based applications and hybrid-cloud has
exposed billions of connections in a data center to potential attack.
01:13:54.863 --> 01:14:04.889
Modern Zero-Trust security models assume the intruder is already
inside and all container-to-container communications should be
inspected, even within a node.
01:14:05.274 --> 01:14:06.708
This is not possible today.
01:14:07.476 --> 01:14:12.448
The CPU load of monitoring every piece of traffic is simply too great.
01:14:13.215 --> 01:14:20.489
Today, we are announcing NVIDIA Morpheus - a data center security
platform for real-time all-packet inspection.
01:14:21.156 --> 01:14:28.530
Morpheus is built on NVIDIA AI, NVIDIA BlueField, NetQ network
telemetry software, and EGX.
01:14:29.431 --> 01:14:39.374
We're working to create solutions with industry leaders in data
center security - Fortinet, Red Hat, Cloudflare, Splunk, F5, and Aria
01:14:39.374 --> 01:14:46.782
Cybersecurity. And early customers - Booz Allen Hamilton, Best Buy,
and of course, our own team at NVIDIA.
01:14:47.716 --> 01:14:49.952
Let me show you how we're using Morpheus at NVIDIA.
01:14:52.654 --> 01:14:54.089
It starts with a network.
01:14:54.523 --> 01:15:01.563
Here we see a representation of a network, where dots are servers and
lines (the edges) are packets flowing between those servers.
01:15:01.897 --> 01:15:09.938
Except in this network, Morpheus is deployed. This enables AI
inferencing across your entire network, including east/west traffic.
The
01:15:09.938 --> 01:15:17.613
particular model being used here has been trained to identify
sensitive information - AWS credentials, GitHub credentials, private
keys,
01:15:17.713 --> 01:15:22.751
passwords. If observed in the packet, these would appear as red
lines, and we don't see any of that.
01:15:23.519 --> 01:15:24.820
Uh oh, what happened?
01:15:25.454 --> 01:15:29.391
An updated configuration was deployed to a critical business app on
this server.
01:15:30.025 --> 01:15:34.830
This update accidentally removed encryption, and now everything that
communicates with that app
01:15:34.830 --> 01:15:37.933
sends and receives sensitive credentials in the clear.
01:15:38.800 --> 01:15:47.075
This can quickly impact additional servers. This translates to
continuing exposure on the network. The AI model in Morpheus is
searching
01:15:47.075 --> 01:15:53.982
through every packet for any of these credentials, continually
flagging when it encounters such data. And rather than using pattern
…[File truncated due to length; see original file]…