WEBVTT
Kind: captions
Language: en
00:13:29.708 --> 00:13:31.844
I am a creator
00:13:39.151 --> 00:13:41.353
Blending art and technology
00:13:48.160 --> 00:13:50.729
To immerse our senses
00:13:57.670 --> 00:13:59.138
I am a healer
00:14:01.540 --> 00:14:03.676
Helping us take the next step
00:14:08.614 --> 00:14:10.382
And see what's possible
00:14:19.792 --> 00:14:21.660
I am a pioneer
00:14:25.364 --> 00:14:27.233
Finding life-saving answers
00:14:33.272 --> 00:14:35.841
And pushing the edge to the outer limits.
00:14:41.513 --> 00:14:43.082
I am a guardian
00:14:44.917 --> 00:14:46.585
Defending our oceans
00:14:52.157 --> 00:14:55.394
And the magnificent creatures that call them home
00:15:01.433 --> 00:15:03.235
I am a protector
00:15:06.272 --> 00:15:09.775
Helping the earth breathe easier
00:15:16.849 --> 00:15:19.852
And watching over it for generations to come
00:15:23.589 --> 00:15:25.224
I am a storyteller
00:15:27.693 --> 00:15:29.795
Giving emotion to words
00:15:31.830 --> 00:15:33.365
And bringing them to life.
00:15:35.601 --> 00:15:39.138
I am even the composer of the music.
00:15:49.381 --> 00:15:51.116
I am AI
00:15:51.917 --> 00:15:59.959
Brought to life by NVIDIA, deep learning, and brilliant minds
everywhere.
00:16:06.865 --> 00:16:09.735
There are powerful forces shaping the world's industries.
00:16:10.302 --> 00:16:18.143
Accelerated computing that we pioneered has supercharged scientific
discovery, while providing the computer industry a path forward.
00:16:18.911 --> 00:16:22.982
Artificial intelligence, in particular, has seen incredible advances.
00:16:23.482 --> 00:16:29.355
With NVIDIA GPUs, computers learn, and software writes software no
human can.
00:16:30.022 --> 00:16:35.894
The AI software is delivered as a service from the cloud, performing
automation at the speed of light.
00:16:36.729 --> 00:16:44.803
Software is now composed of microservices that scale across the
entire data center - treating the data center as a single unit of
computing.
00:16:45.604 --> 00:16:50.999
AI and 5G are the ingredients to kick-start the 4th industrial
revolution,
00:16:50.999 --> 00:16:54.771
where automation and robotics can be deployed to the far edges of the
world.
00:16:55.614 --> 00:17:02.421
There is one more miracle we need, the metaverse, a virtual world
that is a digital twin of ours.
00:17:03.422 --> 00:17:07.960
Welcome to GTC 2021 - we are going to talk about these dynamics and
more.
00:17:09.194 --> 00:17:11.130
Let me give you the architecture of my talk.
00:17:11.764 --> 00:17:17.936
It's organized in four stacks - this is how we work - as a full-stack
computing platform company.
00:17:18.604 --> 00:17:26.311
The flow also reflects the waves of AI and how we're expanding the
reach of our platform to solve new problems, and to enter new markets.
00:17:27.212 --> 00:17:35.721
First is Omniverse - built from the ground-up on NVIDIA's body of
work. It is a platform to create and simulate virtual worlds.
00:17:36.422 --> 00:17:43.495
We'll feature many applications of Omniverse, like design
collaboration, simulation, and future robotic factories.
00:17:44.329 --> 00:17:47.800
The second stack is DGX and high-performance data centers.
00:17:48.467 --> 00:17:56.975
I'll feature BlueField, new DGXs, new chips, and the new work we're
doing in AI, drug discovery, and quantum computing.
00:17:57.810 --> 00:18:00.512
Here, I'll also talk about Arm and new Arm partnerships.
00:18:01.313 --> 00:18:08.320
The third stack is one of our most important new platforms - NVIDIA
EGX with Aerial 5G.
00:18:09.254 --> 00:18:14.159
Now, enterprises and industries can do AI and deploy AI-on-5G.
00:18:15.060 --> 00:18:20.332
We'll talk about NVIDIA AI and Pre-Trained Models, like Jarvis
Conversational AI.
00:18:21.133 --> 00:18:27.639
And finally, our work with the auto industry to revolutionize the
future of transportation - NVIDIA Drive.
00:18:28.273 --> 00:18:32.878
We'll talk about new chips, new platforms and software, and lots of
new customers.
00:18:33.212 --> 00:18:34.146
Let's get started.
00:18:35.380 --> 00:18:39.952
Scientists, researchers, developers, and creators are using NVIDIA to
do amazing things.
00:18:40.085 --> 00:18:50.162
Your work gets global reach with the installed base of over a billion
CUDA GPUs shipped and 250 ExaFLOPS of GPU computing power in the
cloud.
00:18:50.462 --> 00:18:57.202
Two and a half million developers and 7,500 startups are creating
thousands of applications for accelerated computing.
00:18:57.503 --> 00:19:03.041
We are thrilled by the growth of the ecosystem we are building
together and will continue to put our heart and soul into advancing
it.
00:19:03.642 --> 00:19:10.349
Building tools for the Da Vincis of our time is our purpose. And in
doing so, we also help create the future.
00:19:11.183 --> 00:19:16.755
Democratizing high-performance computers is one of NVIDIA's greatest
contributions to science.
00:19:17.523 --> 00:19:22.082
With just a GeForce, every student can have a supercomputer.
00:19:22.082 --> 00:19:28.593
This is how Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton
trained AlexNet, which caught the world's attention and turned it to
deep learning.
00:19:29.368 --> 00:19:34.106
And with GPUs in supercomputers, we gave scientists a time machine.
00:19:35.073 --> 00:19:40.979
A scientist once told me that because of NVIDIA's work, he can do his
life's work in his lifetime.
00:19:42.214 --> 00:19:43.715
I can't think of a greater purpose.
00:19:44.516 --> 00:19:46.785
Let me highlight a few achievements from last year.
00:19:47.719 --> 00:19:50.622
NVIDIA is continually optimizing the full stack.
00:19:51.156 --> 00:19:56.094
With the chips you have, your software runs faster every year and
even faster if you upgrade.
00:19:56.828 --> 00:20:05.589
On our gold suite of important science codes, we increased
performance 13-fold in the last 5 years, and for some, performance
doubled every year.
00:20:06.305 --> 00:20:12.344
NAMD molecular dynamics simulator, for example, was re-architected
and can now run across multiple GPUs.
00:20:13.111 --> 00:20:22.921
Researchers led by Dr. Rommie Amaro at UC San Diego used this
multi-GPU NAMD, running on the Oak Ridge Summit supercomputer's 20,000
NVIDIA GPUs,
00:20:22.921 --> 00:20:27.960
to do the largest atomic simulation ever - 305 million atoms.
00:20:28.160 --> 00:20:33.665
This work was critical to a better understanding of the COVID-19
virus and accelerated the making of the vaccine.
00:20:34.700 --> 00:20:39.238
Dr. Amaro and her collaborators won the Gordon Bell Award for this
important work.
00:20:39.972 --> 00:20:47.012
I'm very proud to welcome Dr. Amaro and more than 100,000 of you to
this year's GTC - our largest ever, twice the size of any before.
00:20:47.846 --> 00:20:52.290
We have some of the greatest computer scientists and researchers of
our time speaking here.
00:20:52.290 --> 00:21:00.532
3 Turing award winners, 12 Gordon Bell award winners, 9 Kaggle Grand
Masters - and even 10 Oscar winners.
00:21:00.993 --> 00:21:04.796
We're also delighted to have the brightest minds from industry
sharing their discoveries.
00:21:05.330 --> 00:21:13.171
Leaders from every field - healthcare, auto, finance, retail, energy,
internet services, every major enterprise IT company.
00:21:13.939 --> 00:21:20.557
They're bringing you their latest work in COVID research, data
science, cybersecurity, new approaches to computer graphics
00:21:20.557 --> 00:21:23.317
and the most recent advances in AI and robotics.
00:21:23.849 --> 00:21:31.923
In total, 1,600 talks about the most important technologies of our
time, from the leaders in the field that are shaping our world.
00:21:32.491 --> 00:21:33.492
Welcome to GTC.
00:21:34.459 --> 00:21:37.763
Let's start where NVIDIA started...computer graphics.
00:21:38.397 --> 00:21:46.102
Computer graphics is the driving force of our technology. Hundreds of
millions of gamers and creators each year seek out the best NVIDIA
has to offer.
00:21:46.538 --> 00:21:53.241
At its core, computer graphics is about simulations - using
mathematics and computer science to simulate the interactions of
light
00:21:53.241 --> 00:22:00.052
and material, the physics of objects, particles, and waves; and now
simulating intelligence and animation.
00:22:00.485 --> 00:22:07.859
The science, engineering, and artistry we dedicate to the pursuit of
matching Mother Nature's physics has led to incredible advances.
00:22:08.460 --> 00:22:14.266
And allowed our technology to contribute to advancing the basic
sciences, the arts, and the industries.
00:22:15.067 --> 00:22:22.908
This last year, we introduced the 2nd generation of RTX - a new
rendering approach that fuses rasterization and programmable shading
with
00:22:22.908 --> 00:22:29.548
hardware-accelerated ray tracing and artificial intelligence. This is
the culmination of ten years of research.
00:22:30.082 --> 00:22:37.255
RTX has reset computer graphics, giving developers a powerful new
tool just as rasterization plateaus.
00:22:37.656 --> 00:22:44.129
Let me show you some amazing footage from games in development. The
technology and artistry are amazing.
00:22:44.863 --> 00:22:48.333
We’re giving the world’s billion gamers an incredible reason to
upgrade.
00:24:38.877 --> 00:24:41.513
RTX is a reset of computer graphics.
00:24:41.913 --> 00:24:49.187
It has enabled us to build Omniverse - a platform for connecting 3D
worlds into a shared virtual world.
00:24:49.788 --> 00:24:57.295
One not unlike the science-fiction metaverse first described by Neal
Stephenson in his early-1990s novel Snow Crash, where the metaverse
00:24:57.295 --> 00:25:05.103
would be collectives of shared 3D spaces and virtually-enhanced
physical spaces that are extensions of the internet.
00:25:05.537 --> 00:25:14.934
Pieces of the early metaverse vision are already here - massive
online social games like Fortnite or user-created virtual worlds like
Minecraft.
00:25:15.714 --> 00:25:22.187
Let me tell you about Omniverse from the perspective of two
applications - design collaboration and digital twins.
00:25:22.954 --> 00:25:25.023
There are several major parts of the platform.
00:25:25.390 --> 00:25:30.272
First, the Omniverse Nucleus, a database engine that connects users
00:25:30.272 --> 00:25:34.332
and enables the interchange of 3D assets and scene descriptions.
00:25:34.699 --> 00:25:43.690
Once connected, designers doing modeling, layout, shading, animation,
lighting, special effects or rendering can collaborate to create a
scene.
00:25:44.075 --> 00:25:52.397
The world in Omniverse Nucleus is described with the open standard
USD, Universal Scene Description, a fabulous interchange framework
invented by Pixar.
00:25:53.251 --> 00:25:59.524
Multiple users can connect to Nucleus, transmitting and receiving
changes to their world as USD snippets.
00:25:59.991 --> 00:26:06.798
The 2nd part of Omniverse is the composition, rendering, and
animation engine - the simulation of the virtual world.
00:26:07.299 --> 00:26:15.774
Omniverse is a platform built from the ground up to be
physically-based. It is fully path-traced. Physics is simulated with
NVIDIA PhysX,
00:26:15.774 --> 00:26:21.713
materials are simulated with NVIDIA MDL and Omniverse is fully
integrated with NVIDIA AI.
00:26:22.380 --> 00:26:30.255
Omniverse is cloud-native, multi-GPU scalable and runs on any RTX
platform and streams remotely to any device.
00:26:31.022 --> 00:26:34.826
The third part is NVIDIA CloudXR, a stargate if you will.
00:26:35.060 --> 00:26:41.800
You can teleport into Omniverse with VR and AIs can teleport out of
Omniverse with AR.
00:26:42.200 --> 00:26:44.469
Omniverse was released to open beta in December.
00:26:44.769 --> 00:26:46.638
Let me show you what talented creators are doing.
00:27:16.267 --> 00:27:18.903
Creators are doing amazing things with Omniverse.
00:27:19.237 --> 00:27:27.839
At Foster and Partners, designers in 17 locations around the world
are designing buildings together in their Omniverse shared virtual
space.
00:27:28.480 --> 00:27:33.864
ILM is testing Omniverse to bring together internal and external tool
pipelines from multiple studios.
00:27:33.864 --> 00:27:40.416
Omniverse lets them collaborate, render final shots in real time and
create massive virtual sets like holodecks.
00:27:41.259 --> 00:27:48.400
Ericsson is using Omniverse to do real-time 5G wave propagation
simulation, with many multi-path interferences.
00:27:48.800 --> 00:27:55.573
Twin Earth is creating a digital twin of Earth that will run on
20,000 NVIDIA GPUs.
00:27:55.573 --> 00:28:02.380
And Activision is using Omniverse to organize their more than 100,000
3D assets into a shared and searchable world.
00:28:12.691 --> 00:28:16.294
Bentley is the world's leading infrastructure engineering software
company.
00:28:17.162 --> 00:28:23.566
Everything that's constructed - roads and bridges, rail and transit
systems, airports and seaports -
00:28:23.566 --> 00:28:29.188
amounts to about 3% of the world's GDP, or three-and-a-half trillion
dollars a year.
00:28:29.207 --> 00:28:35.451
Bentley's software is used to design, model, and simulate the largest
infrastructure projects in the world.
00:28:35.451 --> 00:28:39.129
90% of the world's top 250 engineering firms use Bentley.
00:28:40.018 --> 00:28:49.571
They have a new platform called iTwin - an exciting strategy to use
the 3D model, after construction, to monitor and optimize the
structure's performance throughout its life.
00:28:50.295 --> 00:28:55.848
We are super excited to partner with Bentley to create infrastructure
digital twins in Omniverse.
00:28:55.848 --> 00:29:01.614
Bentley is the first third-party company to develop a suite of
applications on the Omniverse platform.
00:29:02.140 --> 00:29:08.546
This is just an awesome use of Omniverse, a great example of digital
twins and Bentley is the perfect partner.
00:29:08.980 --> 00:29:15.120
And here's Perry Nightingale from WPP, the largest ad agency in the
world, to tell you what they're doing.
00:29:18.490 --> 00:29:25.432
WPP is the largest marketing services organization on the planet, and
because of that, we're also one of the largest production companies
in the world.
00:29:25.432 --> 00:29:28.594
That is a major carbon hotspot for us.
00:29:28.594 --> 00:29:35.138
We've partnered with NVIDIA to capture locations virtually and bring
them to life with studios in Omniverse.
00:29:35.138 --> 00:29:39.185
Over 10 billion points have been turned into a giant mesh in Omniverse.
00:29:39.185 --> 00:29:45.361
For the first time, we can shoot locations virtually that are as real
as the actual places themselves.
00:29:45.361 --> 00:29:48.838
Omniverse also changes the way we make work.
00:29:48.838 --> 00:29:56.691
It's a collaborative platform, which means multiple artists, at
multiple points in the pipeline, in multiple parts of the world, can
collaborate on a single scene.
00:29:56.691 --> 00:30:04.968
Real time CGI in sustainable studios. Collaboration with Omniverse is
the future of film at WPP.
00:30:07.505 --> 00:30:11.276
One of the most important features of Omniverse is that it obeys the
laws of physics.
00:30:11.810 --> 00:30:16.347
Omniverse can simulate particles, fluids, materials, springs and
cables.
00:30:16.881 --> 00:30:19.217
This is a fundamental capability for robotics.
00:30:19.617 --> 00:30:23.721
Once trained, the AI and software can be downloaded from Omniverse.
00:30:24.022 --> 00:30:30.735
In this video, you'll see Omniverse's physics simulation with rigid
and soft bodies, fluids, and finite element modeling.
00:30:30.735 --> 00:30:32.652
And a lot more - enjoy!
00:32:16.601 --> 00:32:20.972
Omniverse is a physically-based virtual world where robots can learn
to be robots.
00:32:21.372 --> 00:32:27.278
They'll come in all sizes and shapes - box movers, pick and place
arms, forklifts, cars, trucks.
00:32:27.545 --> 00:32:34.886
In the future, a factory will be a robot, orchestrating many robots
inside, building cars that are robots themselves.
00:32:35.320 --> 00:32:41.392
We can use Omniverse to create a virtual factory, train and simulate
the factory and its robotic workers inside.
00:32:41.726 --> 00:32:48.766
The AI and software that run the virtual factory are exactly the same
as what will run the actual one.
00:32:49.434 --> 00:32:53.538
The virtual and physical factories and their robots will operate in a
loop.
00:32:54.339 --> 00:32:55.740
They are digital twins.
00:32:55.773 --> 00:32:59.944
Connecting to ERP systems, simulating the throughput of the factory,
00:33:00.111 --> 00:33:08.019
simulating new plant layouts, and becoming the dashboard of the
operator - even uplinking into a robot to teleoperate it.
00:33:08.886 --> 00:33:13.291
BMW may very well be the world's largest custom-manufacturing company.
00:33:14.158 --> 00:33:20.665
BMW produces over 2 million cars a year. In their most advanced
factory, a car a minute. Every car is different.
00:33:21.332 --> 00:33:28.806
We are working with BMW to create a future factory. Designed
completely in digital, simulated from beginning to end in Omniverse,
creating a
00:33:28.806 --> 00:33:33.277
digital twin, and operating a factory where robots and humans work
together.
00:33:34.045 --> 00:33:36.714
Let's take a look at the BMW factory.
00:33:41.185 --> 00:33:50.962
Welcome to BMW Production, Jensen. I am pleased to show you why BMW
sets the standards for innovation and flexibility. Our collaboration
00:33:50.962 --> 00:33:58.469
with NVIDIA Omniverse and NVIDIA AI leads us into a new era of
digitalization of automobile production.
00:33:58.936 --> 00:34:03.641
Fantastic to be with you, Milan. I am excited to do this virtual
factory visit with you.
00:34:04.409 --> 00:34:14.218
We are inside the digital twin of BMW's assembly system, powered by
Omniverse. For the first time, we are able to have our entire factory
in
00:34:14.218 --> 00:34:24.629
simulation. Global teams can collaborate using different software
packages like Revit, CATIA, or point clouds to design and plan the
factory
00:34:24.662 --> 00:34:33.971
in real time 3D. The capability to operate in a perfect simulation
revolutionizes BMW's planning processes.
00:34:38.509 --> 00:34:48.352
BMW regularly reconfigures its factories to accommodate new vehicle
launches. Here we see 2 planning experts located in different parts of
00:34:48.352 --> 00:34:58.396
the world, testing a new line design in Omniverse. One of them
wormholes into an assembly simulation with a motion capture suit,
records
00:34:58.463 --> 00:35:04.202
task movements, while the other expert adjusts the line design - in
real time.
00:35:04.202 --> 00:35:09.305
They work together to optimize the line as well as worker ergonomics
and safety.
00:35:10.274 --> 00:35:12.310
“Can you tell how far I have to bend down there?”
00:35:12.443 --> 00:35:14.278
“First, I’ll get you a taller one”
00:35:14.645 --> 00:35:15.446
“Yeah, it’s perfect”
00:35:15.446 --> 00:35:19.450
We would like to be able to do this at scale in simulation.
00:35:20.218 --> 00:35:23.427
That's exactly why NVIDIA has Digital Human for simulation.
00:35:23.427 --> 00:35:26.861
Digital Humans are trained with data from real associates.
00:35:26.861 --> 00:35:32.623
You can then use Digital Humans in simulation to test new workflows
for worker ergonomics and efficiency.
00:35:34.699 --> 00:35:41.353
Now, your factories employ 57,000 people who share workspaces with
many robots designed to make their jobs easier.
00:35:41.353 --> 00:35:42.039
Let's talk about them.
00:35:42.807 --> 00:35:43.791
You are right, Jensen.
00:35:43.791 --> 00:35:52.450
Robots are crucial for a modern production system. With the NVIDIA
Isaac robotics platform, BMW is deploying a fleet of
00:35:52.483 --> 00:35:57.506
intelligent robots for logistics to improve the material flow in our
production.
00:35:57.506 --> 00:36:05.663
This agility is necessary since we produce 2.5 million vehicles per
year, and 99% of them are custom.
00:36:06.998 --> 00:36:12.654
Synthetic data generation and domain randomization available in Isaac
are key to bootstrapping machine learning.
00:36:12.654 --> 00:36:18.993
Isaac Sim generates millions of synthetic images and varies the
environment to teach the robots.
00:36:20.611 --> 00:36:29.731
Domain randomization can generate an infinite permutation of
photorealistic objects, textures, orientations and lighting
conditions.
00:36:29.731 --> 00:36:36.841
It is ideal for generating ground truth, whether for detection,
segmentation, or depth perception.
00:36:38.763 --> 00:36:42.443
Let me show you an example of how we can combine it all to operate
your factory.
00:36:42.443 --> 00:36:49.207
With NVIDIA's Fleet Command, your associates can securely orchestrate
robots and other devices in the factory from Mission Control.
00:36:50.875 --> 00:37:02.258
They can monitor complex manufacturing cells in real time, update
software over the air, launch robot missions, and teleoperate.
00:37:02.258 --> 00:37:08.232
When a robot needs a helping hand, an alert can be sent to Mission
Control and one of your associates can take control to help the robot.
00:37:11.362 --> 00:37:17.959
We're in the digital twin of one of your factories, but you have 30
others spread across 15 countries.
00:37:17.959 --> 00:37:20.142
The scale of BMW production is impressive, Milan.
00:37:20.972 --> 00:37:29.044
Indeed, Jensen, the scale and complexity of our production network
requires BMW to constantly innovate.
00:37:29.044 --> 00:37:34.159
I am happy about the tight collaboration between our two companies.
00:37:34.159 --> 00:37:41.826
NVIDIA Omniverse and NVIDIA AI give us the chance to simulate all 31
factories in our production network.
00:37:41.826 --> 00:37:52.668
These new innovations will reduce planning times, improve flexibility
and precision, and in the end make planning processes 30% more
efficient.
00:37:53.804 --> 00:38:00.804
Milan, I could not be more proud of the innovations that our
collaboration is bringing to the Factories of the Future.
00:38:00.804 --> 00:38:07.250
I appreciate you hosting me for a virtual visit of the digital twin
of your BMW Production.
00:38:07.250 --> 00:38:08.719
It is a work of art!
00:38:13.224 --> 00:38:15.593
The ecosystem is really excited about Omniverse.
00:38:16.360 --> 00:38:23.434
This open platform, with USD universal 3D interchange, connects them
into a large network of users.
00:38:24.135 --> 00:38:28.439
We have 12 connectors to major design tools already with another 40
in flight.
00:38:29.273 --> 00:38:31.942
Omniverse Connector SDK is available for download now.
00:38:32.810 --> 00:38:35.446
You can see that the most important design tools are already signed up.
00:38:35.980 --> 00:38:43.220
Our lighthouse partners are from some of the world's largest
industries - Media and Entertainment, Gaming, Architecture,
Engineering, and
00:38:43.220 --> 00:38:48.459
Construction; Manufacturing, Telecommunications, Infrastructure, and
Automotive.
00:38:48.959 --> 00:38:55.366
Computer makers worldwide are building NVIDIA-Certified workstations,
notebooks, and servers optimized for Omniverse.
00:38:56.267 --> 00:39:00.237
And starting this summer, Omniverse will be available for enterprise
license.
00:39:00.671 --> 00:39:06.544
Omniverse - NVIDIA's platform for creating and simulating shared
virtual worlds.
00:39:08.446 --> 00:39:10.648
The data center is the new unit of computing.
00:39:11.482 --> 00:39:16.520
Cloud computing and AI are driving fundamental changes in the
architecture of data centers.
00:39:17.221 --> 00:39:20.925
Traditionally, enterprise data centers ran monolithic software
packages.
00:39:21.892 --> 00:39:29.845
Virtualization started the trend toward software-defined data centers
- allowing applications to move about and letting IT manage from a
"single pane of glass".
00:39:30.368 --> 00:39:37.608
With virtualization, the compute, networking, storage, and security
functions are emulated in software running on the CPU.
00:39:38.642 --> 00:39:46.717
Though easier to manage, the added CPU load reduced the data center's
capacity to run applications, which is its primary purpose.
00:39:47.251 --> 00:39:51.522
This illustration shows the added CPU load in the gold-colored part
of the stack.
00:39:51.922 --> 00:39:57.628
Cloud computing re-architected data centers again, now to provision
services for billions of consumers.
00:39:58.229 --> 00:40:04.568
Monolithic applications were disaggregated into smaller microservices
that can take advantage of any idle resource.
00:40:05.269 --> 00:40:10.207
Equally important, multiple engineering teams can work concurrently
using CI/CD methods.
00:40:11.409 --> 00:40:17.314
Data center networks became swamped by east-west traffic generated by
disaggregated microservices.
00:40:18.382 --> 00:40:22.119
CSPs tackled this with Mellanox's high-speed low-latency networking.
00:40:23.387 --> 00:40:31.295
Then, deep learning emerged. Magical internet services were rolled
out, attracting more customers, and better engagement than ever.
00:40:32.663 --> 00:40:36.700
Deep learning is compute-intensive which drove adoption of GPUs.
00:40:37.368 --> 00:40:44.074
Nearly overnight, consumer AI services became the biggest users of
GPU supercomputing technologies.
00:40:44.942 --> 00:40:52.416
Now, adding Zero-Trust security initiatives makes infrastructure
software processing one of the largest workloads in the data center.
00:40:53.083 --> 00:41:00.090
The answer is a new type of chip for Data Center Infrastructure
Processing, like NVIDIA's BlueField DPU.
00:41:00.090 --> 00:41:04.028
Let me illustrate this with our own cloud gaming service, GeForce
Now, as an example.
00:41:04.428 --> 00:41:07.598
GeForce Now is NVIDIA's GeForce-in-the-cloud service.
00:41:08.232 --> 00:41:15.172
GeForce Now serves 10 million members in 70 countries. Incredible
growth.
00:41:15.706 --> 00:41:19.417
GeForce Now is a seriously hard consumer service to deliver.
00:41:19.417 --> 00:41:32.475
Everything matters - speed of light, visual quality, frame rate,
response, smoothness, start-up time, server cost, and most important
of all, security.
00:41:33.390 --> 00:41:35.259
We're transitioning GeForce Now to BlueField.
00:41:35.893 --> 00:41:43.300
With BlueField, we can isolate the infrastructure from the game
instances, and offload and accelerate the networking, storage, and
security.
00:41:44.068 --> 00:41:52.176
The GeForce Now infrastructure is costly. With BlueField, we will
improve our quality of service and increase concurrent users at the same time
- the
00:41:52.343 --> 00:41:53.944
ROI of BlueField is excellent.
00:41:54.712 --> 00:42:00.184
I'm thrilled to announce our first data center infrastructure SDK -
DOCA 1.0 is available today!
00:42:00.818 --> 00:42:02.920
DOCA is our SDK to program BlueField.
00:42:03.587 --> 00:42:05.489
There's all kinds of great technology inside.
00:42:05.890 --> 00:42:15.375
Deep packet inspection, secure boot, TLS crypto offload, RegEx
acceleration, and a very exciting capability - a hardware-based,
00:42:15.375 --> 00:42:20.813
real-time clock that can be used for synchronous data centers, 5G,
and video broadcast.
00:42:21.705 --> 00:42:25.175
We have great partners working with us to optimize leading platforms
on BlueField:
00:42:25.709 --> 00:42:34.752
Infrastructure software providers, edge and CDN providers,
cybersecurity solutions, and storage providers - basically the
world's leading
00:42:34.752 --> 00:42:36.754
companies in data center infrastructure.
00:42:37.421 --> 00:42:42.793
Though we're just getting started with BlueField 2, today we're
announcing BlueField 3.
00:42:42.793 --> 00:42:44.461
22 billion transistors.
00:42:44.929 --> 00:42:47.698
The first 400 Gbps networking chip.
00:42:48.165 --> 00:42:54.638
16 Arm CPUs to run the entire virtualization software stack - for
instance, running VMware ESX.
00:42:55.005 --> 00:43:03.614
BlueField 3 takes security to a whole new level, fully offloading and
accelerating IPSEC and TLS cryptography, secret key management, and
00:43:03.614 --> 00:43:05.049
regular expression processing.
00:43:05.783 --> 00:43:09.253
We are on pace to introduce a new BlueField generation every 18
months.
00:43:10.120 --> 00:43:16.093
BlueField 3 will do 400 Gbps and be 10x the processing capability of
BlueField 2.
00:43:17.161 --> 00:43:25.436
And BlueField 4 will do 800 Gbps and add NVIDIA's AI computing
technologies to get another 10x boost.
00:43:26.236 --> 00:43:30.240
100x in 3 years -- and all of it will be needed.
00:43:31.175 --> 00:43:39.750
A simple way to think about this is that 1/3rd of the roughly 30
million data center servers shipped each year are consumed running the
00:43:39.750 --> 00:43:42.286
software-defined data center stack.
00:43:43.020 --> 00:43:46.090
This workload is increasing much faster than Moore's law.
00:43:46.690 --> 00:43:52.563
So, unless we offload and accelerate this workload, data centers will
have fewer and fewer CPUs to run applications.
00:43:52.930 --> 00:43:54.264
The time for BlueField has come.
00:43:55.065 --> 00:44:04.241
At the beginning of the big bang of modern AI, we recognized the need
to create a new kind of computer for a new way of developing software.
00:44:04.642 --> 00:44:08.545
Software will be written by software running on AI computers.
00:44:09.613 --> 00:44:19.723
This new type of computer will need new chips, new system
architecture, new ways to network, new software, and new
methodologies and tools.
00:44:20.524 --> 00:44:24.995
We've invested billions into this intuition, and it has proven
helpful to the industry.
00:44:25.763 --> 00:44:29.767
It all comes together as DGX - a computer for AI.
00:44:30.234 --> 00:44:37.374
We offer DGX as a fully integrated system, as well as offer the
components to the industry to create differentiated options.
00:44:37.841 --> 00:44:41.286
I am pleased to see so much AI research advancing because of DGX -
00:44:41.286 --> 00:44:48.997
top universities, research hospitals, telcos, banks, consumer
products companies, car makers, and aerospace companies.
00:44:49.787 --> 00:44:56.319
DGX helps their AI researchers - whose expertise is rare and scarce,
and whose work is strategic.
00:44:56.319 --> 00:44:58.862
It is imperative to make sure they have the right instrument.
00:44:59.663 --> 00:45:06.570
Simply, if software is to be written by computers, then companies
with the best software engineers will also need the best computers.
00:45:07.204 --> 00:45:10.441
We offer several configurations - all software compatible.
00:45:10.908 --> 00:45:18.849
The DGX A100 is a building block that contains 5 petaFLOPS of
computing and superfast storage and networking to feed it.
00:45:19.316 --> 00:45:25.889
DGX Station is an AI data center in-a-box designed for workgroups -
plugs into a normal outlet.
00:45:26.423 --> 00:45:35.634
And DGX SuperPOD is a fully integrated, fully network-optimized,
AI-data-center-as-a-product. SuperPOD is for intensive AI research
and development.
00:45:36.734 --> 00:45:41.083
NVIDIA's own new supercomputer, called Selene, is 4 SuperPODs.
00:45:41.083 --> 00:45:46.943
It is the 5th fastest supercomputer on the world's Top 500 list, and
the fastest industrial supercomputer.
00:45:47.745 --> 00:45:50.914
We have a new DGX Station 320G.
00:45:51.482 --> 00:46:04.591
DGX Station can train large models - 320 gigabytes of super-fast
HBM2e connected to 4 A100 GPUs over 8 terabytes per second of memory
bandwidth.
00:46:04.762 --> 00:46:07.998
8 terabytes transferred in one second.
00:46:08.332 --> 00:46:12.236
It would take 40 CPU servers to achieve this memory bandwidth.
00:46:13.303 --> 00:46:23.614
DGX Station plugs into a normal wall outlet, like a big gaming rig,
consumes just 1,500 watts, and is liquid-cooled to a silent 37 db.
00:46:24.348 --> 00:46:27.284
Take a look at the cinematic that our engineers and creative team did.
00:47:30.814 --> 00:47:34.284
A CPU cluster of this performance would cost about a million dollars
today.
00:47:34.818 --> 00:47:41.959
DGX Station is $149,000 - the ideal AI programming companion for
every AI researcher.
00:47:42.659 --> 00:47:45.729
Today we are also announcing a new DGX SuperPOD.
00:47:46.063 --> 00:47:47.464
Three major upgrades:
00:47:48.098 --> 00:47:59.710
The new 80 gigabyte A100 which brings the SuperPOD to 90 terabytes of
HBM2 memory, with aggregate bandwidth of 2.2 exabytes per second.
00:48:00.744 --> 00:48:10.921
It would take 11,000 CPU servers to achieve this bandwidth - about a
250-rack data center - 15 times bigger than the SuperPOD.
00:48:12.389 --> 00:48:15.993
Second, SuperPOD has been upgraded with NVIDIA BlueField-2.
00:48:16.727 --> 00:48:25.903
SuperPOD is now the world's first cloud-native supercomputer,
multi-tenant shareable, with full isolation and bare-metal
performance.
00:48:26.937 --> 00:48:33.610
And third, we're offering Base Command, the DGX management and
orchestration tool used within NVIDIA.
00:48:34.845 --> 00:48:42.519
We use Base Command to support thousands of engineers, over two
hundred teams, consuming a million-plus GPU-hours a week.
00:48:43.420 --> 00:48:48.258
DGX SuperPOD starts at seven million dollars and scales to sixty
million dollars for a full system.
00:48:49.693 --> 00:48:51.862
Let me highlight 3 great uses of DGX.
00:48:52.562 --> 00:48:56.033
Transformers have led to dramatic breakthroughs in Natural Language
Processing.
00:48:56.867 --> 00:49:02.539
Like RNNs and LSTMs, Transformers are designed to operate on
sequential data.
00:49:03.173 --> 00:49:12.816
However, there is more to Transformers than meets the eye: they are
not trained sequentially, but use a mechanism called attention such
that Transformers
00:49:12.816 --> 00:49:14.117
can be trained in parallel.
00:49:14.751 --> 00:49:22.781
This breakthrough reduced training time and, more importantly,
enabled the training of huge models with a correspondingly enormous
amount of data.
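To make the attention idea concrete, here is a minimal NumPy sketch of scaled dot-product self-attention - an illustrative toy, not NVIDIA's or any production Transformer code. Note how every token's output is computed in one matrix operation over the whole sequence, which is what makes parallel training possible:

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: every position attends to every
    # other position at once, so the whole sequence is processed in
    # parallel rather than step by step as in an RNN.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# Toy example: a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = attention(x, x, x)   # self-attention
print(out.shape)           # (4, 8) - one output per token
```

Because the scores matrix covers all token pairs at once, there is no sequential dependency between positions during training.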
00:49:23.560 --> 00:49:27.864
Unsupervised learning can now achieve excellent results, but the
models are huge.
00:49:28.665 --> 00:49:32.002
Google's Transformer was 65 million parameters.
00:49:32.869 --> 00:49:38.275
OpenAI's GPT-3 is 175 billion parameters.
00:49:38.742 --> 00:49:41.845
That's roughly 3,000 times larger in just 3 years.
00:49:42.579 --> 00:49:45.182
The applications for GPT-3 are really incredible.
00:49:45.649 --> 00:49:47.150
Generate document summaries.
00:49:47.718 --> 00:49:49.319
Email phrase completion.
00:49:50.120 --> 00:49:57.461
GPT-3 can even generate JavaScript and HTML from plain English -
essentially telling an AI to write code based on what you want it to
do.
00:49:58.528 --> 00:50:03.734
Model sizes are growing exponentially - at a pace of doubling every
two and a half months.
00:50:04.534 --> 00:50:11.341
We expect to see multi-trillion-parameter models by next year, and
100 trillion+ parameter models by 2023.
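As a quick sanity check on the growth figures quoted here - the 65M and 175B parameter counts come from the talk, the rest is simple arithmetic:

```python
params_transformer = 65e6     # Google's original Transformer
params_gpt3 = 175e9           # OpenAI's GPT-3

# Growth over roughly 3 years:
print(params_gpt3 / params_transformer)   # ~2,692x

# If model size doubles every 2.5 months, growth per year is:
doubles_per_year = 12 / 2.5               # 4.8 doublings
print(2 ** doubles_per_year)              # ~27.9x per year
```

At that yearly rate, going from hundreds of billions of parameters to tens of trillions within a couple of years is plausible, which is the trajectory described above.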
00:50:13.110 --> 00:50:19.082
As a very loose comparison, the human brain has roughly 125 trillion
synapses.
00:50:20.250 --> 00:50:22.252
So these transformer models are getting quite large.
00:50:22.986 --> 00:50:26.456
Training models of this scale is incredible computer science.
00:50:27.324 --> 00:50:32.996
Today, we are announcing NVIDIA Megatron - for training Transformers.
00:50:34.164 --> 00:50:41.505
Megatron trains giant Transformer models - it partitions and
distributes the model for optimal multi-GPU and multi-node
parallelism.
00:50:42.406 --> 00:50:51.481
Megatron does fast data loading, micro-batching, scheduling and
syncing, and kernel fusion. It pushes the limits of every NVIDIA
invention -
00:50:51.648 --> 00:50:56.186
NCCL, NVLink, InfiniBand, Tensor Cores.
00:50:57.154 --> 00:51:01.691
Even with Megatron, a trillion-parameter model will take about 3-4
months to train on Selene.
00:51:02.659 --> 00:51:05.862
So, lots of DGX SuperPODs will be needed around the world.
00:51:06.563 --> 00:51:10.834
Inferencing giant Transformer models is also a great computer science
challenge.
00:51:11.234 --> 00:51:22.476
GPT-3 is so big, with so many floating-point operations, that it
would take a dual-CPU server over a minute to respond to a single
128-word query.
00:51:23.713 --> 00:51:29.319
And GPT-3 is so large that it doesn't fit in GPU memory - so it will
have to be distributed.
00:51:30.187 --> 00:51:33.423
Multi-GPU, multi-node inference has never been done.
00:51:34.257 --> 00:51:39.563
Today, we're announcing the Megatron Triton Inference Server.
00:51:40.764 --> 00:51:50.273
A DGX with Megatron Triton will respond within a second! Not a minute
- a second! And for 16 queries at the same time.
00:51:51.174 --> 00:51:59.649
DGX is 1000 times faster and opens up many new use-cases, like
call-center support, where a one-minute response is effectively
unusable.
00:52:01.118 --> 00:52:03.019
Naver is Korea's #1 search engine.
00:52:03.920 --> 00:52:10.360
They installed a DGX SuperPOD and are running their AI platform CLOVA
to train language models for Korean.
00:52:11.261 --> 00:52:15.275
I expect many leading service providers around the world to do the
same:
00:52:15.275 --> 00:52:20.455
Use DGX to develop and operate region-specific and industry-specific
language services.
00:52:22.339 --> 00:52:31.114
NVIDIA Clara Discovery is our suite of acceleration libraries created
for computational drug discovery - from imaging, to quantum chemistry,
00:52:31.748 --> 00:52:39.923
to gene variant-calling, to using NLP to understand genetics, and
using AI to generate new drug compounds.
00:52:41.124 --> 00:52:44.528
Today we're announcing four new models available in Clara Discovery:
00:52:45.495 --> 00:52:49.831
MegaMolBART is a model for generating biomolecular compounds.
00:52:49.831 --> 00:52:56.617
This method has seen recent success with Insilico Medicine using AI
to find a new drug in less than two years.
00:52:57.073 --> 00:53:05.549
NVIDIA's ATAC-seq denoising algorithm for rare- and single-cell
epigenomics is helping to understand gene expression in individual
cells.
00:53:06.516 --> 00:53:12.689
AlphaFold1 is a model that can predict the 3D structure of a protein
from the amino acid sequence.
00:53:12.689 --> 00:53:18.895
GatorTron is the world's largest clinical language model that can
read and understand doctors' notes.
00:53:18.895 --> 00:53:26.468
GatorTron was developed at the University of Florida, using Megatron,
and trained on the DGX SuperPOD gifted
00:53:26.468 --> 00:53:29.877
to his alma mater by Chris Malachowsky, who founded NVIDIA with
Curtis and me.
00:53:31.041 --> 00:53:39.082
Oxford Nanopore makes third-generation genomic sequencing technology
capable of ultra-high throughput in digitizing biology -
00:53:39.082 --> 00:53:46.356
one fifth of the SARS-CoV-2 virus genomes in the global database were
generated on Oxford Nanopore.
00:53:46.356 --> 00:53:52.596
Last year, Oxford Nanopore developed a diagnostic test for COVID-19
called LamPORE, which is used by the NHS.
00:53:53.463 --> 00:53:55.865
Oxford Nanopore is GPU-accelerated throughout.
00:53:56.766 --> 00:54:06.539
DNA samples pass through nanopores and the current signal is fed into
an AI model, like speech recognition, but trained to recognize
genetic code.
00:54:07.577 --> 00:54:11.648
Another model called Medaka reads the code and detects genetic
variants.
00:54:12.015 --> 00:54:14.784
Both models were trained on DGX SuperPOD.
00:54:15.352 --> 00:54:24.327
These new deep learning algorithms achieve 99.9% detection accuracy
of single nucleotide variants - this is the gold standard of human
sequencing.
00:54:24.861 --> 00:54:32.602
Pharma is a 1.3 trillion-dollar industry, where a new drug can take
10+ years to develop and fails 90% of the time.
00:54:33.770 --> 00:54:40.443
Schrodinger is the leading physics-based and machine-learning
computational platform for drug discovery and materials science.
00:54:40.844 --> 00:54:50.808
Schrodinger is already a heavy user of NVIDIA GPUs, and recently
entered into an agreement to use hundreds of millions of NVIDIA GPU
hours on the Google cloud.
00:54:51.554 --> 00:54:59.996
Some customers can't use the cloud, so today we are announcing a
partnership to accelerate Schrodinger's drug discovery workflow with
NVIDIA
00:55:00.063 --> 00:55:04.301
Clara Discovery libraries and NVIDIA DGX.
00:55:04.301 --> 00:55:10.740
The world's top 20 pharmas use Schrodinger today. Their researchers
are going to see a giant boost in productivity.
00:55:11.574 --> 00:55:19.182
Recursion is a biotech company using leading-edge computer science to
decode biology to industrialize drug discovery.
00:55:19.949 --> 00:55:30.506
The Recursion Operating System is built on NVIDIA DGX SuperPOD for
generating, analyzing and gaining insight from massive biological and
chemical datasets.
00:55:31.561 --> 00:55:37.767
They call their SuperPOD the BioHive-1 - it's the most powerful
computer at any pharma today.
00:55:38.702 --> 00:55:46.409
Using deep learning on DGX, Recursion is classifying cell responses
after exposure to small molecule drugs.
00:55:47.277 --> 00:55:56.226
Quantum computing is a field of physics that studies the use of
natural quantum behavior - superposition, entanglement, and
interference - to build a computer.
00:55:56.853 --> 00:56:01.491
The computation is performed using quantum circuits that operate on
quantum bits - called qubits.
00:56:02.492 --> 00:56:09.866
Qubits can be 0 or 1, like a classical computing bit, but also in
superposition - meaning they exist simultaneously in both states.
00:56:10.533 --> 00:56:15.672
The qubits can be entangled where the behavior of one can affect or
control the behavior of others.
00:56:16.206 --> 00:56:22.078
Adding and entangling more qubits lets quantum computers process
exponentially more information.
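The exponential scaling described above is easy to see in a state-vector simulation - the approach cuQuantum accelerates, discussed later in the talk. Here is a tiny illustrative simulator (an assumption-laden toy, not cuQuantum's API): an n-qubit state holds 2**n complex amplitudes, so every added qubit doubles the memory and work:

```python
import numpy as np

def apply_gate(state, gate, target, n):
    # Reshape so the target qubit is its own axis, apply the 2x2 gate,
    # then restore the flat 2**n amplitude vector.
    state = state.reshape([2] * n)
    state = np.moveaxis(state, target, 0)
    state = np.tensordot(gate, state, axes=1)
    state = np.moveaxis(state, 0, target)
    return state.reshape(-1)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard: creates superposition

n = 3
state = np.zeros(2**n, dtype=complex)
state[0] = 1.0                      # start in |000>
state = apply_gate(state, H, 0, n)  # qubit 0 now in equal superposition
print(np.abs(state)**2)             # probability 0.5 on |000>, 0.5 on |100>
```

With 50 qubits the vector already has 2**50 amplitudes - which is why simulating "tens of qubits" productively is a serious computing problem.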
00:56:22.812 --> 00:56:26.883
There is a large community around the world doing research in quantum
computers and algorithms.
00:56:27.550 --> 00:56:32.689
Well over 50 teams in industry, academia, and national labs are
researching the field.
00:56:33.390 --> 00:56:34.624
We're working with many of them.
00:56:35.325 --> 00:56:44.067
Quantum computing can solve exponential-order complexity problems,
like factoring large numbers for cryptography, simulating atoms and
00:56:44.067 --> 00:56:49.806
molecules for drug discovery, and finding shortest-path
optimizations, like the traveling salesman problem.
00:56:50.507 --> 00:56:57.647
The limiter in quantum computing is decoherence - falling out of
quantum states - caused by the tiniest background noise.
00:56:58.081 --> 00:56:59.649
So error correction is essential.
00:57:00.483 --> 00:57:07.323
It is estimated that to solve meaningful problems, several million
physical qubits will be required to sufficiently error correct.
00:57:07.957 --> 00:57:17.357
The research community is making fast progress, doubling physical
qubits each year, and will likely achieve that milestone by 2035 to
2040.
00:57:17.357 --> 00:57:19.759
Well within my career horizon.
00:57:20.036 --> 00:57:26.843
In the meantime, our mission is to help the community research the
computer of tomorrow with the fastest computer of today.
00:57:27.777 --> 00:57:34.117
Today, we're announcing cuQuantum - an acceleration library designed
for simulating quantum circuits
00:57:34.484 --> 00:57:37.520
for both Tensor Network Solvers and State Vector Solvers.
00:57:38.788 --> 00:57:44.260
It is optimized to scale to large GPU memories, multiple GPUs, and
multiple DGX nodes.
00:57:45.428 --> 00:57:54.571
The speed-up of cuQuantum on DGX is excellent. Running the cuQuantum
Benchmark, state vector simulation takes 10 days on a dual-CPU server
00:57:55.271 --> 00:58:02.979
but only 2 hours on a DGX A100. cuQuantum on DGX can productively
simulate tens of qubits.
00:58:04.481 --> 00:58:15.792
And Caltech, using Cotengra/Quimb, simulated the Sycamore quantum
circuit at depth 20 in record time using cuQuantum on NVIDIA's Selene
supercomputer.
00:58:16.392 --> 00:58:23.299
What would have taken years on CPUs can now run in a few days on
cuQuantum and DGX.
00:58:24.167 --> 00:58:31.808
cuQuantum will accelerate quantum circuit simulators so researchers
can design better quantum computers and verify their results,
architect
00:58:31.808 --> 00:58:38.348
hybrid quantum-classical systems - and discover more quantum-optimal
algorithms like Shor's and Grover's.
00:58:39.115 --> 00:58:43.853
cuQuantum on DGX is going to give the quantum community a huge boost.
00:58:44.554 --> 00:58:49.659
I'm hoping cuQuantum will do for quantum computing what cuDNN did for
deep learning.
00:58:50.493 --> 00:58:54.964
Modern data centers host diverse applications that require varying
system architectures.
00:58:55.532 --> 00:59:01.337
Enterprise servers are optimized for a balance of strong
single-threaded performance and a nominal number of cores.
00:59:01.905 --> 00:59:10.046
Hyperscale servers, optimized for microservice containers, are
designed for a high number of cores, low cost, and great
energy-efficiency.
00:59:10.680 --> 00:59:14.918
Storage servers are optimized for a large number of cores and high IO
throughput.
00:59:15.285 --> 00:59:22.592
Deep learning training servers are built like supercomputers - with
the largest number of fast CPU cores, the fastest memory,
00:59:22.592 --> 00:59:26.796
the fastest IO, and high-speed links to connect the GPUs.
00:59:26.996 --> 00:59:33.670
Deep learning inference servers are optimized for energy efficiency
and the ability to process a large number of models concurrently.
00:59:34.437 --> 00:59:45.448
The genius of the x86 server architecture is the ability to do a good
job using varying configurations of the CPU, memory, PCI express, and
00:59:45.448 --> 00:59:48.384
peripherals to serve all of these applications.
00:59:49.218 --> 00:59:56.793
Yet processing large amounts of data remains a challenge for computer
systems today - this is particularly true for AI models like
00:59:56.793 --> 00:59:58.595
transformers and recommender systems.
00:59:59.228 --> 01:00:03.099
Let me illustrate the bottleneck with half of a DGX.
01:00:03.733 --> 01:00:10.440
Each Ampere GPU is connected to 80GB of super fast memory running at
2 TB/sec.
01:00:11.374 --> 01:00:18.715
Together, the 4 Amperes process 320 GB at 8 Terabytes per second.
01:00:19.382 --> 01:00:27.308
Contrast that with CPU memory, which is 1 terabyte in size but runs
at only 0.2 terabytes per second.
01:00:27.308 --> 01:00:32.178
The CPU memory is 3 times larger but 40 times slower than the GPU.
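The arithmetic behind those comparisons, using the figures just quoted (the rounded "3 times larger" in the talk corresponds to 3.2x exactly):

```python
gpu_mem_gb = 4 * 80          # four A100 GPUs, 80 GB of HBM2e each -> 320 GB
gpu_bw_tbs = 4 * 2.0         # 2 TB/s per GPU -> 8 TB/s aggregate
cpu_mem_gb = 1024            # 1 TB of CPU memory
cpu_bw_tbs = 0.2             # 0.2 TB/s of CPU memory bandwidth

print(cpu_mem_gb / gpu_mem_gb)   # 3.2  - CPU memory is larger...
print(gpu_bw_tbs / cpu_bw_tbs)   # 40.0 - ...but far slower than the GPUs
```

That mismatch - large-but-slow CPU memory next to small-but-fast GPU memory - is exactly the bottleneck the following discussion of Grace and NVLink addresses.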
01:00:33.029 --> 01:00:39.102
We would love to utilize the full 1,320 GB of memory of this node to
train AI models.
01:00:39.869 --> 01:00:41.404
So, why not something like this?
01:00:42.171 --> 01:00:49.379
Make faster CPU memories, connect 4 channels to the CPU, a dedicated
channel to feed each GPU.
01:00:49.846 --> 01:00:53.750
Even if a package can be made, PCIe is now the bottleneck.
01:00:54.450 --> 01:01:02.258
We can surely use NVLINK. NVLINK is fast enough. But no x86 CPU has
NVLINK, not to mention 4 NVLINKS.
01:01:03.593 --> 01:01:13.469
Today, we're announcing our first data center CPU, Project Grace,
named after Grace Hopper, a computer scientist and U.S. Navy Rear
Admiral,
01:01:13.903 --> 01:01:16.105
who in the '50s pioneered computer programming.
01:01:17.006 --> 01:01:23.646
Grace is Arm-based and purpose-built for accelerated computing
applications that process large amounts of data - such as AI.
01:01:24.380 --> 01:01:26.358
Grace highlights the beauty of Arm.
01:01:26.358 --> 01:01:33.527
Their IP model allowed us to create the optimal CPU for this
application, which achieves x-factor speed-ups.
01:01:34.323 --> 01:01:45.568
The Arm core in Grace is a next-generation off-the-shelf IP for
servers. Each CPU will deliver over 300 SPECint, with a total of over
2,400
01:01:45.568 --> 01:01:51.774
SPECint_rate CPU performance for an 8-GPU DGX.
01:01:52.275 --> 01:02:00.149
For comparison, today's DGX, the highest-performance computer in the
world today, is 450 SPECint_rate.
01:02:00.883 --> 01:02:07.256
2400 SPECint_rate with Grace versus 450 SPECint_rate today.
01:02:07.890 --> 01:02:17.133
So look at this again - Before, After, Before, After.
01:02:18.367 --> 01:02:21.604
Amazing increase in system and memory bandwidth.
01:02:22.538 --> 01:02:25.174
Today, we're introducing a new kind of computer.
01:02:26.075 --> 01:02:28.578
The basic building block of the modern data center.
01:02:29.479 --> 01:02:30.146
Here it is.
01:02:42.391 --> 01:02:50.688
What I'm about to show you brings together the latest GPU accelerated
computing, Mellanox high performance networking, and something brand
new.
01:02:51.267 --> 01:02:52.935
The final piece of the puzzle.
01:02:59.142 --> 01:03:07.416
The world's first CPU designed for terabyte-scale accelerated
computing... her secret codename - GRACE.
01:03:09.385 --> 01:03:19.310
This powerful, Arm-based CPU gives us the third foundational
technology for computing, and the ability to rearchitect every aspect
of the data center for AI.
01:03:20.663 --> 01:03:27.804
We're thrilled to announce the Swiss National Supercomputing Center
will build a supercomputer powered by Grace and our next generation
GPU.
01:03:28.504 --> 01:03:39.582
This new supercomputer, called Alps, will be 20 exaflops for AI, 10
times faster than the world's fastest supercomputer today.
01:03:40.583 --> 01:03:48.100
Alps will be used to do whole-earth-scale weather and climate
simulation, quantum chemistry and quantum physics for the Large
Hadron Collider.
01:03:48.925 --> 01:03:54.730
Alps will be built by HPE and will come online in 2023.
01:03:54.730 --> 01:04:02.605
We're thrilled by the enthusiasm of the supercomputing community,
welcoming us to make Arm a top-notch scientific computing platform.
01:04:03.372 --> 01:04:11.214
Our data center roadmap is now a rhythm consisting of 3-chips: CPU,
GPU, and DPU.
01:04:12.114 --> 01:04:16.986
Each chip architecture has a two-year rhythm with likely a kicker in
between.
01:04:17.954 --> 01:04:20.590
One year will focus on x86 platforms.
01:04:21.023 --> 01:04:23.125
One year will focus on Arm platforms.
01:04:23.826 --> 01:04:26.662
Every year will see new exciting products from us.
01:04:27.396 --> 01:04:34.403
The NVIDIA architecture and platforms will support x86 and Arm -
whatever customers and markets prefer.
01:04:35.338 --> 01:04:39.075
Three chips. Yearly Leaps. One Architecture.
01:04:39.976 --> 01:04:42.311
Arm is the most popular CPU in the world.
01:04:43.246 --> 01:04:51.921
For good reason - it's super energy-efficient. Its open licensing
model inspires a world of innovators to create products around it.
01:04:52.955 --> 01:04:55.391
Arm is used broadly in mobile and embedded today.
01:04:56.392 --> 01:05:03.018
In other markets - like the cloud, enterprise and edge data centers,
supercomputing, and PCs -
01:05:03.018 --> 01:05:07.096
Arm is just starting and has great growth opportunities.
01:05:07.637 --> 01:05:14.110
Each market has different applications and has unique systems,
software, peripherals, and ecosystems.
01:05:15.011 --> 01:05:18.347
For the markets we serve, we can accelerate Arm's adoption.
01:05:19.248 --> 01:05:20.950
Let's start with the big one - Cloud.
01:05:21.951 --> 01:05:29.558
One of the earliest designers of Arm CPUs for data centers is AWS -
its Graviton CPUs are extremely impressive.
01:05:30.259 --> 01:05:37.767
Today, we're announcing NVIDIA and AWS are partnering to bring
Graviton2 and NVIDIA GPUs together.
01:05:38.668 --> 01:05:44.774
This partnership brings Arm into the most demanding cloud workloads -
AI and cloud gaming.
01:05:45.408 --> 01:05:50.079
Mobile gaming is growing fast and is the primary form of gaming in
some markets.
01:05:50.813 --> 01:05:58.621
With AWS-designed Graviton2, users can stream Arm-based applications
and Android games straight from AWS.
01:05:59.422 --> 01:06:00.656
It's expected later this year.
01:06:01.524 --> 01:06:08.197
We are announcing a partnership with Ampere Computing to create a
scientific and cloud computing SDK and reference system.
01:06:09.265 --> 01:06:19.141
Ampere Computing's Altra CPU is excellent - 80 cores, 285 SPECint17,
right up there with the highest performance x86.
01:06:20.176 --> 01:06:25.581
We are seeing excellent reception at supercomputing centers around
the world and at Android cloud gaming services.
01:06:26.282 --> 01:06:33.689
We are also announcing a partnership with Marvell to create an edge
and enterprise computing SDK and reference system.
01:06:34.623 --> 01:06:42.765
Marvell Octeon excels at IO, storage and 5G processing. This system
is ideal for hyperconverged edge servers.
01:06:44.066 --> 01:06:50.439
We're announcing a partnership with Mediatek to create a reference
system and SDK for Chrome OS and Linux PC's.
01:06:51.140 --> 01:06:54.777
Mediatek is the world's largest SoC maker.
01:06:55.611 --> 01:07:01.250
Combining NVIDIA GPUs and Mediatek SOCs will make excellent PCs and
notebooks.
01:07:02.485 --> 01:07:09.492
AI, computers automating intelligence, is the most powerful
technology force of our time.
01:07:10.192 --> 01:07:11.927
We see AI in four waves.
01:07:12.261 --> 01:07:20.202
The first wave was to reinvent computing for this new way of doing
software - we're all in and have been driving this for nearly 10
years.
01:07:20.903 --> 01:07:27.810
The first adopters of AI were the internet companies - they have
excellent computer scientists, large computing infrastructures, and
the
01:07:27.810 --> 01:07:29.712
ability to collect a lot of training data.
01:07:30.880 --> 01:07:32.915
We are now at the beginning of the next wave.
01:07:33.716 --> 01:07:42.691
The next wave is enterprise and the industrial edge, where AI can
revolutionize the world's largest industries - from manufacturing,
01:07:42.792 --> 01:07:47.797
logistics, agriculture, healthcare, financial services, and
transportation.
01:07:48.531 --> 01:07:53.869
There are many challenges to overcome, one of which is connectivity,
which 5G will solve.
01:07:54.503 --> 01:08:01.510
And then autonomous systems. Self-driving cars are an excellent
example. But everything that moves will eventually be autonomous.
01:08:02.244 --> 01:08:09.518
The industrial edge and autonomous systems are the most challenging,
but also the largest opportunities for AI to make an impact.
01:08:10.152 --> 01:08:17.560
Trillion dollar industries can soon apply AI to improve productivity,
and invent new products, services and business models.
01:08:18.761 --> 01:08:24.800
We have to make AI easier to use - turn AI from computer science to
computer products.
01:08:25.201 --> 01:08:32.842
We're building the new computing platform for this fundamentally new
software approach - the computer for the age of AI.
01:08:33.676 --> 01:08:47.103
AI is not just about an algorithm - building and operating AI is a
fundamental change in every aspect of software - Andrej Karpathy
rightly called it Software 2.0.
01:08:47.123 --> 01:08:54.663
Machine learning, at the highest level, is a continuous learning
system that starts with data scientists developing data strategies
01:08:54.663 --> 01:09:00.636
and engineering predictive features - this data is the digital life
experience of a company.
01:09:01.470 --> 01:09:06.675
Training involves inventing or adapting an AI model that learns to
make the desired predictions.
01:09:07.543 --> 01:09:14.049
Simulation and validation test the AI application for accuracy,
generalization, and potential bias.
01:09:14.550 --> 01:09:22.165
And finally, orchestrating a fleet of computers, whether in your data
center or at the edge - in warehouses, farms, or wireless base
stations.
01:09:22.191 --> 01:09:31.233
NVIDIA created the chips, systems, and libraries needed for
end-to-end machine learning - for example, technologies like Tensor
Core GPUs,
01:09:31.333 --> 01:09:36.972
NVLINK, DGX, cuDNN, RAPIDS, NCCL, GPU Direct, DOCA, and so much more.
01:09:37.673 --> 01:09:39.642
We call the platform NVIDIA AI.
01:09:40.709 --> 01:09:46.649
NVIDIA AI libraries accelerate every step, from data processing to
fleet orchestration.
01:09:47.416 --> 01:09:51.787
NVIDIA AI is integrated into all of the industry's popular tools and
workflows.
01:09:52.788 --> 01:10:00.196
NVIDIA AI is in every cloud, used by the world's largest companies,
and by over 7,500 AI startups around the world.
01:10:00.596 --> 01:10:11.006
And NVIDIA AI runs on any system that includes NVIDIA GPUs, from PCs
and laptops, to workstations, to supercomputers, in any cloud, to our
01:10:11.006 --> 01:10:13.209
$99 Jetson robot computer.
01:10:13.976 --> 01:10:18.247
One segment of computing we've not served is enterprise computing.
01:10:19.181 --> 01:10:23.118
70% of the world's enterprises run VMware, as we do at NVIDIA.
01:10:23.619 --> 01:10:27.890
VMware was created to run many applications on one virtualized
machine.
01:10:28.857 --> 01:10:36.098
AI, on the other hand, runs a single job, bare-metal, on multiple
GPUs and often multiple nodes.
01:10:36.932 --> 01:10:44.473
All of the NVIDIA optimizations for compute and data transfer are now
plumbed through the VMware stack so AI workloads can be distributed to
01:10:44.473 --> 01:10:47.610
multiple systems and achieve bare-metal performance.
01:10:48.444 --> 01:10:51.914
The VMware stack is also offloaded and accelerated on NVIDIA
BlueField.
01:10:52.815 --> 01:11:02.224
NVIDIA AI now runs in its full glory on VMware, which means
everything that has been accelerated by NVIDIA AI now runs great on
VMware.
01:11:03.158 --> 01:11:07.630
AI applications can be deployed and orchestrated with Kubernetes
running on VMware Tanzu.
01:11:08.364 --> 01:11:11.867
We call this platform NVIDIA EGX for Enterprise.
01:11:12.534 --> 01:11:22.444
The enterprise IT ecosystem is thrilled - finally the 300,000 VMware
enterprise customers can easily build an AI computing infrastructure
01:11:22.778 --> 01:11:25.814
that seamlessly integrates into their existing environment.
01:11:26.548 --> 01:11:32.721
In total, over 50 servers from the world's top server makers will be
certified for NVIDIA EGX Enterprise.
01:11:33.022 --> 01:11:38.227
BlueField-2 offloads and accelerates the VMware stack and does the
networking for distributed computing.
01:11:38.794 --> 01:11:46.001
Enterprise can choose big or small GPUs for heavy-compute or
heavy-graphics workloads like Omniverse, or mix and match.
01:11:46.902 --> 01:11:48.370
All run NVIDIA AI.
01:11:49.171 --> 01:11:58.013
Enterprise companies make up the world's largest industries and they
operate at the edge - in hospitals, factories, plants, warehouses,
01:11:58.013 --> 01:12:02.117
stores, farms, cities and roads - far from data centers.
01:12:03.185 --> 01:12:04.787
The missing link is 5G.
01:12:05.621 --> 01:12:10.059
Consumer 5G is great, but Private 5G is revolutionary.
01:12:10.893 --> 01:12:20.102
Today, we're announcing the Aerial A100 - bringing together 5G and AI
into a new type of computing platform designed for the edge.
01:12:20.636 --> 01:12:29.578
Aerial A100 integrates the Ampere GPU and BlueField DPU into one card
- this is the most advanced PCI express card ever created.
01:12:29.978 --> 01:12:35.918
So, it's not a surprise that Aerial A100 in an EGX system will be a
complete 5G base station.
01:12:36.952 --> 01:12:52.201
Aerial A100 delivers a full 20 Gbps and can process up to 9 100 MHz
massive MIMO carriers for 64T64R - or 64-transmit and 64-receive
antenna arrays
01:12:52.201 --> 01:12:54.603
- state of the art capabilities.
01:12:55.270 --> 01:13:06.843
Aerial A100 is software-defined, with accelerated features like PHY,
Virtual Network Functions, network acceleration, packet pacing, and
line-rate cryptography.
01:13:08.117 --> 01:13:16.825
Our partners Ericsson, Fujitsu, Mavenir, Altran, and Radisys will
build their total 5G solutions on top of the Aerial library.
01:13:17.826 --> 01:13:28.303
NVIDIA EGX server with Aerial A100 is the first 5G base-station that
is also a cloud-native, secure, AI edge data center.
01:13:29.138 --> 01:13:32.708
We have brought the power of the cloud to the 5G edge.
01:13:33.609 --> 01:13:36.779
Aerial also extends the power of 5G into the cloud.
01:13:37.312 --> 01:13:43.152
Today, we are excited to announce that Google will support NVIDIA
Aerial in the GCP cloud.
01:13:43.986 --> 01:13:45.621
I have an important new platform to tell you about.
01:13:46.522 --> 01:13:54.229
The rise of microservice-based applications and hybrid-cloud has
exposed billions of connections in a data center to potential attack.
01:13:54.863 --> 01:14:04.889
Modern Zero-Trust security models assume the intruder is already
inside and all container-to-container communications should be
inspected, even within a node.
01:14:05.274 --> 01:14:06.708
This is not possible today.
01:14:07.476 --> 01:14:12.448
The CPU load of monitoring every piece of traffic is simply too great.
01:14:13.215 --> 01:14:20.489
Today, we are announcing NVIDIA Morpheus - a data center security
platform for real-time all-packet inspection.
01:14:21.156 --> 01:14:28.530
Morpheus is built on NVIDIA AI, NVIDIA BlueField, NetQ network
telemetry software, and EGX.
01:14:29.431 --> 01:14:39.374
We're working to create solutions with industry leaders in data
center security - Fortinet, Red Hat, Cloudflare, Splunk, F5, and Aria
01:14:39.374 --> 01:14:46.782
Cybersecurity. And early customers - Booz Allen Hamilton, Best Buy,
and of course, our own team at NVIDIA.
01:14:47.716 --> 01:14:49.952
Let me show you how we're using Morpheus at NVIDIA.
01:14:52.654 --> 01:14:54.089
It starts with a network.
01:14:54.523 --> 01:15:01.563
Here we see a representation of a network, where dots are servers and
lines (the edges) are packets flowing between those servers.
01:15:01.897 --> 01:15:09.938
Except in this network, Morpheus is deployed. This enables AI
inferencing across your entire network, including east/west traffic.
The
01:15:09.938 --> 01:15:17.613
particular model being used here has been trained to identify
sensitive information - AWS credentials, GitHub credentials, private
keys,
01:15:17.713 --> 01:15:22.751
passwords. If observed in the packet, these would appear as red
lines, and we don't see any of that.
01:15:23.519 --> 01:15:24.820
Uh oh, what happened?
01:15:25.454 --> 01:15:29.391
An updated configuration was deployed to a critical business app on
this server.
01:15:30.025 --> 01:15:34.830
This update accidentally removed encryption, and now everything that
communicates with that app
01:15:34.830 --> 01:15:37.933
sends and receives sensitive credentials in the clear.
01:15:38.800 --> 01:15:47.075
This can quickly impact additional servers. This translates to
continuing exposure on the network. The AI model in Morpheus is
searching
01:15:47.075 --> 01:15:53.982
through every packet for any of these credentials, continually
flagging when it encounters such data. And rather than using pattern
…[File truncated due to length; see original file]…