DeepMind launches CodeMender, an AI agent to automate security fixes for open-source code

Google DeepMind has introduced CodeMender, an AI agent that detects, fixes, and helps prevent software security vulnerabilities. It scans source code, identifies flaws, and can apply security patches automatically, and it has already contributed over 70 verified fixes to open-source projects. The system aims to go beyond traditional methods like fuzzing and static analysis, which struggle to keep up with modern threats.

Built on earlier research such as Big Sleep and OSS-Fuzz, CodeMender uses the Gemini Deep Think model along with static and dynamic code analysis, differential testing, fuzzing, and SMT solvers. Each patch is automatically validated for functional accuracy and then reviewed by human researchers before integration.
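
To make the validation step more concrete, here is a minimal sketch of differential testing, one of the techniques named above. It is not CodeMender's implementation. The two checksum routines, their names, and the input format are hypothetical, and an `IndexError` stands in for the out-of-bounds read a C program would perform. The idea is that a patched function should return the same result as the original on every input the original handled, and should cleanly reject every input that used to crash.

```python
import random


def checksum_original(data: bytes) -> int:
    """Hypothetical unpatched routine: trusts the length byte in data[0]."""
    declared = data[0]
    total = 0
    for i in range(declared):
        total += data[1 + i]  # IndexError stands in for an out-of-bounds read
    return total


def checksum_patched(data: bytes) -> int:
    """Hypothetical patched routine: validates the declared length first."""
    if not data or data[0] > len(data) - 1:
        raise ValueError("declared length exceeds payload")
    return sum(data[1:1 + data[0]])


def differential_test(trials: int = 2000, seed: int = 0) -> dict:
    """Run both versions on the same random inputs and compare outcomes."""
    rng = random.Random(seed)
    stats = {"agree": 0, "rejected": 0, "mismatch": 0}
    for _ in range(trials):
        # One length byte followed by a short random payload.
        payload = bytes(rng.randrange(256) for _ in range(rng.randrange(16)))
        data = bytes([rng.randrange(24)]) + payload

        try:
            expected = checksum_original(data)
        except IndexError:
            expected = None  # the original "crashed" on this input

        try:
            actual = checksum_patched(data)
        except ValueError:
            actual = None  # the patch rejected this input

        if expected is None:
            # Inputs that crashed the original must now be rejected cleanly.
            stats["rejected" if actual is None else "mismatch"] += 1
        else:
            # Inputs the original handled must keep the same result.
            stats["agree" if actual == expected else "mismatch"] += 1
    return stats


if __name__ == "__main__":
    print(differential_test())
```

A non-zero `mismatch` count would mean the patch changed behavior on valid input or accepted input that previously crashed, which is the kind of regression an automated check would flag before human review.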

CodeMender has resolved complex issues such as heap buffer overflows in XML handling and memory errors in C-based code. DeepMind is also testing compiler-level protections such as -fbounds-safety annotations in the libwebp library to prevent exploits like CVE-2023-4863. The company is working with the open-source community and plans to eventually make CodeMender available as a developer tool.

by Mauricio B. Holguin



Comments

UserPower

"Currently, all patches generated by CodeMender are reviewed by human researchers before they’re submitted upstream." (from the Google article). And that's pretty much the big problem: LLMs know nothing about security. They are pretty much a smart trick for using every available tool to gather data when things break. So far, it inserts pseudo-random data until it gets a memory error and creates a summary of the crash log (which allows bisecting the faulting code), and these errors are often more difficult to find than to fix (most memory CVEs don't require more than a few dozen lines of code). It's a kind of fuzz testing, but using a more probabilistic model. This explains the ridiculous 72 security fixes (out of literally tens of billions of lines of open-source code, when LLMs can write a big load of code for cheap). Not that it couldn't be a great tool: Google is doing tremendous work with Mandiant and OSS-Fuzz, and getting more tools to test software (automatically, to some extent) allows developers to focus on non-security fixes and features (since they are rarely security experts, and experts rarely have months of free time to spend analyzing ever-changing code). But fixing Linux or cURL is infinitely more complex than fixing libwebp. And Google's promise of "automatically creating and applying high-quality security patches" is just marketing delirium.
