
Large sequence models for software development activities

Software isn’t created in one dramatic step. It improves bit by bit, one little step at a time — editing, running unit tests, fixing build errors, addressing code reviews, editing some more, appeasing linters, and fixing more errors — until finally it becomes good enough to merge into a code repository. Software engineering isn’t an isolated process, but a dialogue among human developers, code reviewers, bug reporters, software architects and tools, such as compilers, unit tests, linters and static analyzers.

Today we describe DIDACT (Dynamic Integrated Developer ACTivity), which is a methodology for training large machine learning (ML) models for software development. The novelty of DIDACT is that it uses the process of software development as the source of training data for the model, rather than just the polished end state of that process, the finished code. By exposing the model to the contexts that developers see as they work, paired with the actions they take in response, the model learns about the dynamics of software development and is more aligned with how developers spend their time. We leverage instrumentation of Google’s software development to scale up the quantity and diversity of developer-activity data beyond previous works. Results are extremely promising along two dimensions: usefulness to professional software developers, and as a potential basis for imbuing ML models with general software development skills.

DIDACT is a multi-task model trained on development activities that include editing, debugging, repair, and code review.

We built and deployed three DIDACT tools internally: Comment Resolution (which we recently announced), Build Repair, and Tip Prediction, each integrated at a different stage of the development workflow. All three tools have received enthusiastic feedback from thousands of internal developers. We see this as the ultimate test of usefulness: do professional developers, who are often experts on the code base and who have carefully honed workflows, leverage the tools to improve their productivity?

Perhaps most excitingly, we demonstrate how DIDACT is a first step towards a general-purpose developer-assistance agent. We show that the trained model can be used in a variety of surprising ways, via prompting with prefixes of developer activities, and by chaining together multiple predictions to roll out longer activity trajectories. We believe DIDACT paves a promising path towards developing agents that can generally assist across the software development process.

A treasure trove of data about the software engineering process

Google’s software engineering toolchains store every operation related to code as a log of interactions among tools and developers, and have done so for decades. In principle, one could use this record to replay in detail the key episodes in the “software engineering video” of how Google’s codebase came to be, step-by-step — one code edit, compilation, comment, variable rename, etc., at a time.

Google code lives in a monorepo, a single repository of code for all tools and systems. A software developer typically experiments with code changes in a local copy-on-write workspace managed by a system called Clients in the Cloud (CitC). When the developer is ready to package a set of code changes together for a specific purpose (e.g., fixing a bug), they create a changelist (CL) in Critique, Google’s code-review system. As with other types of code-review systems, the developer engages in a dialog with a peer reviewer about functionality and style. The developer edits their CL to address reviewer comments as the dialog progresses. Eventually, the reviewer declares “LGTM!” (“looks good to me”), and the CL is merged into the code repository.

Of course, in addition to a dialog with the code reviewer, the developer also maintains a “dialog” of sorts with a plethora of other software engineering tools, such as the compiler, the testing framework, linters, static analyzers, fuzzers, etc.

An illustration of the intricate web of activities involved in developing software: small actions by the developer, interactions with a code reviewer, and invocations of tools such as compilers.

A multi-task model for software engineering

DIDACT utilizes interactions among engineers and tools to power ML models that assist Google developers, by suggesting or enhancing actions developers take — in context — while pursuing their software-engineering tasks. To do that, we have defined a number of tasks about individual developer activities: repairing a broken build, predicting a code-review comment, addressing a code-review comment, renaming a variable, editing a file, etc. We use a common formalism for each activity: it takes some State (a code file), some Intent (annotations specific to the activity, such as code-review comments or compiler errors), and produces an Action (the operation taken to address the task). This Action is like a mini programming language, and can be extended for newly added activities. It covers things like editing, adding comments, renaming variables, marking up code with errors, etc. We call this language DevScript.
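To make the formalism concrete, here is a minimal Python sketch of how a (State, Intent) -> Action training example might be represented. The class names, fields, and serialization format are illustrative assumptions for this sketch, not DIDACT's actual schema, which is not public.

```python
from dataclasses import dataclass

# Sketch of the state-intent-action formalism described above. All names
# and fields are illustrative assumptions; DevScript and DIDACT's internal
# schemas are not public.

@dataclass
class State:
    """The code the developer is working on."""
    file_path: str
    contents: str

@dataclass
class Intent:
    """Activity-specific annotations, e.g., a review comment or build error."""
    kind: str     # e.g., "code_review_comment", "build_error"
    payload: str  # the comment text or the error message

@dataclass
class Action:
    """A DevScript-like operation predicted by the model."""
    op: str       # e.g., "edit", "add_comment", "rename"
    args: dict

def format_example(state: State, intent: Intent, action: Action) -> str:
    """Serialize one (state, intent) -> action training example as text."""
    return (
        f"FILE {state.file_path}\n{state.contents}\n"
        f"INTENT {intent.kind}: {intent.payload}\n"
        f"ACTION {action.op} {action.args}"
    )
```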

The DIDACT model is prompted with a task, code snippets, and annotations related to that task, and produces development actions, e.g., edits or comments.

This state-intent-action formalism enables us to capture many different tasks in a general way. What’s more, DevScript is a concise way to express complex actions, without the need to output the whole state (the original code) as it would be after the action takes place; this makes the model more efficient and more interpretable. For example, a rename might touch a file in dozens of places, but a model can predict a single rename action.
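As a hedged illustration of that conciseness, the toy snippet below applies a one-line rename action that touches multiple sites in a file. The apply_rename helper and the action schema are inventions for this sketch, not actual DevScript.

```python
import re

# Toy illustration of DevScript-style conciseness (the helper and action
# schema are invented): one compact rename action stands in for every
# occurrence, so the model never emits the whole edited file.
def apply_rename(contents: str, old: str, new: str) -> str:
    """Apply a single rename action at every whole-word occurrence."""
    return re.sub(rf"\b{re.escape(old)}\b", new, contents)

code = "def area(w, h):\n    return w * h\n\nprint(area(3, 4))\n"
action = {"op": "rename", "args": {"old": "area", "new": "rect_area"}}
print(apply_rename(code, **action["args"]))  # both occurrences updated
```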

An ML peer programmer

DIDACT does a good job on individual assistive tasks. For example, below we show DIDACT doing code clean-up after functionality is mostly done. It looks at the code along with some final comments by the code reviewer (marked with “human” in the animation), and predicts edits to address those comments (rendered as a diff).

Given an initial snippet of code and the comments that a code reviewer attached to that snippet, the Pre-Submit Cleanup task of DIDACT produces edits (insertions and deletions of text) that address those comments.

The multimodal nature of DIDACT also gives rise to some surprising capabilities, reminiscent of behaviors emerging with scale. One such capability is history augmentation, which can be enabled via prompting. Knowing what the developer did recently enables the model to make a better guess about what the developer should do next.

One powerful task exemplifying this capability is history-augmented code completion. In the example illustrated below, the developer adds a new function parameter (1) and moves the cursor into the documentation (2). Conditioned on the history of developer edits and the cursor position, the model completes the line (3) by correctly predicting the docstring entry for the new parameter.

An illustration of history-augmented code completion in action.
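To suggest how such prompting could work, the sketch below serializes recent edits ahead of the current file with a cursor marker. The HISTORY/STATE/ACTION tags and the <CURSOR> marker are hypothetical, since the blog does not publish DIDACT's prompt format.

```python
# Hypothetical prompt assembly for history-augmented completion. The tags
# and layout are invented for illustration, not DIDACT's real syntax.
def build_history_prompt(edit_history: list[str], current_file: str,
                         cursor_offset: int) -> str:
    history = [f"HISTORY {i}: {edit}" for i, edit in enumerate(edit_history)]
    marked = (current_file[:cursor_offset] + "<CURSOR>" +
              current_file[cursor_offset:])
    return "\n".join(history) + "\nSTATE:\n" + marked + "\nACTION:"

doc = 'def fetch(url, timeout_s):\n    """Fetch url.\n\n    Args:\n        url: ...\n    """\n'
print(build_history_prompt(
    ["inserted parameter 'timeout_s' in the signature of fetch()"],
    doc,
    doc.index("url: ..."),  # cursor sits in the docstring's Args section
))
```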

In an even more powerful history-augmented task, edit prediction, the model can choose where to edit next in a fashion that is historically consistent. If the developer deletes a function parameter (1), the model can use history to correctly predict an update to the docstring (2) that removes the deleted parameter (without the human developer manually placing the cursor there) and to update a statement in the function (3) in a syntactically (and, arguably, semantically) correct way. With history, the model can unambiguously decide how to continue the “editing video” correctly. Without history, the model wouldn’t know whether the missing function parameter is intentional (because the developer is in the process of a longer edit to remove it) or accidental (in which case the model should re-add it to fix the problem).

An illustration of edit prediction, over multiple chained iterations.

The model can go even further. For example, we started with a blank file and asked the model to successively predict what edits would come next until it had written a full code file. The astonishing part is that the model developed code in a step-by-step way that would seem natural to a developer: It started by first creating a fully working skeleton with imports, flags, and a basic main function. It then incrementally added new functionality, like reading from a file and writing results, and added functionality to filter out some lines based on a user-provided regular expression, which required changes across the file, like adding new flags.
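A minimal sketch of such a rollout loop might look like the following, assuming stand-ins model.predict and apply_action for the (non-public) model and DevScript interpreter:

```python
# Minimal sketch of rolling out an editing trajectory by chaining
# predictions, starting from a blank file. `model.predict` and
# `apply_action` are stand-ins for the DIDACT model and a DevScript
# interpreter; neither interface is public.
def rollout(model, apply_action, max_steps: int = 50) -> str:
    state = ""      # start from an empty file
    history = []    # growing record of past actions
    for _ in range(max_steps):
        action = model.predict(state=state, history=history)
        if action is None:  # assume the model can signal "done"
            break
        state = apply_action(state, action)
        history.append(action)
    return state
```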

Conclusion

DIDACT turns Google’s software development process into training demonstrations for ML developer assistants, and uses those demonstrations to train models that construct code in a step-by-step fashion, interactively with tools and code reviewers. These innovations are already powering tools enjoyed by Google developers every day. The DIDACT approach complements the great strides taken by large language models at Google and elsewhere, towards technologies that ease toil, improve productivity, and enhance the quality of work of software engineers.

Acknowledgements

This work is the result of a multi-year collaboration among Google Research, Google Core Systems and Experiences, and DeepMind. We would like to acknowledge our colleagues Jacob Austin, Pascal Lamblin, Pierre-Antoine Manzagol, and Daniel Zheng, who join us as the key drivers of this project. This work could not have happened without the significant and sustained contributions of our partners at Alphabet (Peter Choy, Henryk Michalewski, Subhodeep Moitra, Malgorzata Salawa, Vaibhav Tulsyan, and Manushree Vijayvergiya), as well as the many people who collected data, identified tasks, built products, strategized, evangelized, and helped us execute on the many facets of this agenda (Ankur Agarwal, Paige Bailey, Marc Brockschmidt, Rodrigo Damazio Bovendorp, Satish Chandra, Savinee Dancs, Matt Frazier, Alexander Frömmgen, Nimesh Ghelani, Chris Gorgolewski, Chenjie Gu, Vincent Hellendoorn, Franjo Ivančić, Marko Ivanković, Emily Johnston, Luka Kalinovcic, Lera Kharatyan, Jessica Ko, Markus Kusano, Kathy Nix, Sara Qu, Marc Rasi, Marcus Revaj, Ballie Sandhu, Michael Sloan, Tom Small, Gabriela Surita, Maxim Tabachnyk, David Tattersall, Sara Toth, Kevin Villela, Sara Wiltberger, and Donald Duo Zhao) and our extremely supportive leadership (Martín Abadi, Joelle Barral, Jeff Dean, Madhura Dudhgaonkar, Douglas Eck, Zoubin Ghahramani, Hugo Larochelle, Chandu Thekkath, and Niranjan Tulpule). Thank you!


Decentralizing AI with a Liquid-Cooled Development Platform by Supermicro and NVIDIA

Photo of hardware system from Supermicro.

AI is the topic of conversation around the world in 2023. It is rapidly being adopted by all industries including media, entertainment, and broadcasting. To be successful in 2023 and beyond, companies and agencies must embrace and deploy AI more rapidly than ever before. The capabilities of new AI programs like video analytics, ChatGPT, recommenders, speech recognition, and customer service are far surpassing anything thought possible just a few years ago.

However, according to research, fewer than half of companies and agencies successfully deploy AI applications, largely due to cost. The rest are scrambling to determine how exactly they can harness this new and somewhat mysterious software that promises to provide a competitive advantage in every industry in the world.

In April 2023, Supermicro launched a new system to help expedite the deployment of AI workloads for developers, new adopters, and established users alike. The liquid-cooled AI development platform is called the Supermicro SYS-751GE-TNRT-NV1, and there is nothing else like it available today.

The hardware and software system comes with the full suite of NVIDIA AI Enterprise frameworks, models, and tools, plus the Ubuntu 22.04 operating system. The beauty of this new system is that it decentralizes AI development at an entry-level price point far below that of a large supercomputer. Compare that with the typical shared-supercomputer workflow:

  • Normally, researchers must book time slots to use the supercomputer and wait in the queue.
  • They run an application (machine learning training, for example) and receive results.
  • When they make changes in the software, they must run the training again by booking another time slot and waiting in the queue.

This is all too time-consuming: it delays the desired results and increases the overall total cost of ownership (TCO).

With the new AI development platform, all these issues are resolved and the TCO goes down substantially. You can run ML tests, get the results quickly, and run the tests again without waiting. With the proximity of the new system to the actual AI developer, latency is lowered, which can be critically important for many AI workloads.

Optimized hardware for AI enterprise software

The technology that makes this Supermicro product unique is the ability to liquid-cool this solution. The internal closed-loop radiator and cooling system is super-quiet, extremely energy-efficient, and less expensive than most AI hardware, and it puts out virtually no heat.

In addition to this new revolutionary hardware technology, the AI development platform is perfectly optimized for the included downloadable NVIDIA AI Enterprise software programs. This includes over 50 workflows, frameworks, pretrained models, and infrastructure optimization that can run on VMware vSphere.

Most importantly, this AI development platform is literally plug-and-play. Take it out of the box, turn it on, connect to the network, download the included AI software of your choice, and start running those AI applications! 

The technical advancement here is the perfect pairing and optimization of hardware systems to specific NVIDIA AI Enterprise software applications, maximizing the software capabilities to capitalize on the intrinsic advantages of AI.

Optimizing the Supermicro hardware to the unique requirements of NVIDIA AI Enterprise software applications removes all questions about how much memory you need, how many GPUs are needed, or what kind of processors must be installed. Frankly, this system just works, right out of the box. 

Here are some of the resulting customer benefits:

  • Cost-effectiveness: With the price point closer to a workstation, you can deploy AI more cost-effectively than ever before, without trying to figure out what technical hardware components are required to run your applications.
  • Whisper-quiet system: Quieter than many household appliances, it’s perfect for using in a data closet, remote location, under your desk, or even in your home.
  • Super-powerful system: The platform includes four NVIDIA A100 GPUs and two Intel CPUs that can run any AI application available today.
  • Lower TCO with a significant energy savings: The self-contained liquid cooling system almost completely cools itself without needing external AC or connections to any building chilled-water system.
  • Increased security: The platform can be operated in a local data center, with or without relying on the cloud, and it’s secure in either location.
  • Significant time savings: You can have a dedicated, decentralized system that enables you to run ML tests, get results, and re-run without waiting.

Energy-efficient and quiet cooling

The new AI development platform from Supermicro features a novel liquid-cooling solution offering unmatched performance and customer experience.

The liquid-cooling solution is self-contained and invisible to the user. The system can be used like any other air-cooled system and offers a problem-free liquid-cooling experience for any type of user.

The optimized Supermicro cold plates deliver efficient cooling to two 4th Gen Intel Xeon Scalable CPUs (270 W TDP) and up to four NVIDIA A100 GPUs (300 W TDP).

An N+1 redundant pumping module moves the liquid through the cold plates to cool the CPUs and GPUs. This redundancy enables continuous operation in the event of a pump failure, keeping system uptime high.

The heat is transferred to the surrounding air with a high-performance radiator coupled with low-power fans.

The innovative liquid-cooling system designed by Supermicro effectively cools the system while consuming less than 3% of total system power, compared with roughly 15% for standard air-cooled products.
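As a back-of-the-envelope check (our arithmetic, treating the CPU and GPU TDPs quoted above as a rough heat load), those percentages translate into a sizable per-system saving:

```python
# Back-of-the-envelope arithmetic (ours) using the TDP figures quoted
# above as a rough heat load; the <3% and ~15% cooling fractions come
# from the text.
cpu_w, gpu_w = 270, 300
heat_load_w = 2 * cpu_w + 4 * gpu_w          # 2 CPUs + 4 GPUs = 1,740 W

liquid_cooling_w = 0.03 * heat_load_w        # < 3%  -> ~52 W
air_cooling_w = 0.15 * heat_load_w           # ~15%  -> ~261 W
print(f"~{air_cooling_w - liquid_cooling_w:.0f} W saved per system")  # ~209 W
```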

Finally, the system operates at an extremely low noise level (~30 dB) at idle, making it perfect for a quiet office environment.

For more information, see Liquid Cooled AI Development Platform.


NVIDIA Launches Accelerated Ethernet Platform for Hyperscale Generative AI

NVIDIA today announced NVIDIA Spectrum-X™, an accelerated networking platform designed to improve the performance and efficiency of Ethernet-based AI clouds.


NVIDIA Brings Advanced Autonomy to Mobile Robots With Isaac AMR

As mobile robot shipments surge to meet the growing demands of industries seeking operational efficiencies, NVIDIA is launching a new platform to enable the next generation of autonomous mobile robot (AMR) fleets. Isaac AMR brings advanced mapping, autonomy and simulation to mobile robots and will soon be available to early customers.


Electronics Giants Tap Into Industrial Automation With NVIDIA Metropolis for Factories

The $46 trillion global electronics manufacturing industry spans more than 10 million factories worldwide, where much is at stake in producing defect-free products. To drive product excellence, leading electronics manufacturers are adopting NVIDIA Metropolis for Factories. More than 50 manufacturing giants and industrial automation providers are among them, including Foxconn Industrial Internet, Pegatron, Quanta, Siemens and Wistron.


NVIDIA Collaborates With SoftBank Corp. to Power SoftBank’s Next-Gen Data Centers Using Grace Hopper Superchip for Generative AI and 5G/6G

NVIDIA and SoftBank Corp. today announced they are collaborating on a pioneering platform for generative AI and 5G/6G applications that is based on the NVIDIA GH200 Grace Hopper™ Superchip and which SoftBank plans to roll out at new, distributed AI data centers across Japan.


WPP Partners With NVIDIA to Build Generative AI-Enabled Content Engine for Digital Advertising

NVIDIA and WPP today announced they are developing a content engine that harnesses NVIDIA Omniverse™ and AI to enable creative teams to produce high-quality commercial content faster, more efficiently and at scale while staying fully aligned with a client’s brand.


Techman Robot Selects NVIDIA Isaac Sim to Optimize Automated Optical Inspection

How do you help robots build better robots? By simulating even more robots. NVIDIA founder and CEO Jensen Huang today showcased how leading electronics manufacturer Quanta is using AI-enabled robots to inspect the quality of its products. In his keynote speech at this week’s COMPUTEX trade show in Taipei, Huang presented a video demonstration.


World’s Leading Electronics Manufacturers Adopt NVIDIA Generative AI and Omniverse to Digitalize State-of-the-Art Factories

NVIDIA today announced that electronics manufacturers worldwide are advancing their industrial digitalization efforts using a new, comprehensive reference workflow that combines NVIDIA technologies for generative AI, 3D collaboration, simulation and autonomous machines.


Live From Taipei: NVIDIA CEO Unveils Gen AI Platforms for Every Industry

In his first live keynote since the pandemic, NVIDIA founder and CEO Jensen Huang today kicked off the COMPUTEX conference in Taipei, announcing platforms that companies can use to ride a historic wave of generative AI that’s transforming industries from advertising to manufacturing to telecom. He spoke to a packed house of some 3,500.