Chair of Programming Languages and AI

Code Semantics: A Machine Learning Perspective

Introduction

Code semantics form the backbone of numerous program analysis tasks, such as software testing, vulnerability detection, and reverse engineering. In recent years, there has been an initial surge in applying machine learning (ML) to the understanding of code semantics and to downstream analysis tasks. Unlike traditional methods, ML-based program analysis requires less expert knowledge while offering promising accuracy and efficiency. However, challenges remain. Taking binary code as an example, how to ensure generalization and scalability across diverse platforms, compilers, optimization options, and obfuscation techniques is still an open question. In addition, the reliability and usage boundaries of ML in program analysis remain under discussion.

In this seminar, we will delve into the development of ML techniques for code semantic representation, tracing their evolution from traditional ML algorithms to large language models, as well as the shift from manual feature engineering to self-supervised tasks and neural networks specifically designed for code semantics.

Prerequisites include an interest in program understanding and basic knowledge of deep learning. This seminar aims to deepen your understanding of code semantics and neural networks through discussion and coding practice.


Appointments

  • Time & Location: Thursdays 16:00-18:00, Room U 133, Oettingenstr. 67
  • First Session: 18-04-2024
  • Notice: Attendance at the first session is mandatory!

Pre-requisites

  • For any questions, please email yunru.wang@ifi.lmu.de
  • The seminar is held in English only.
  • Interest in binary analysis and a basic knowledge of machine learning.