Radare2 GSoC 2020

Project Ideas

Radare2

Type Analysis Improvements

radare2 already supports types, including a basic (low-level) ability to edit types with pf and higher-level, C-like types with the t command. You can parse C type definitions from C headers, for example, or load them from a "precompiled" SDB file. The goal of this task is to integrate more type handling into the radare2 analysis loop, including automatic inference and suggestions. Some basic type inference is already implemented as part of the aft and afc commands and the anal.types.* configuration options.

Task

  1. Import type and variable information from DWARF and PDB files
  2. Integrate C++/ObjC/Swift/etc. vtable analysis with type information
  3. Improve support for constrained types
  4. Improve the ability to automatically substitute structure offsets where possible (e.g. when you have specified a function parameter type and the function uses that parameter internally)
  5. Improve type inference based on function argument types, function return types, and the callgraph
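
As an illustration of the structure offset substitution item above, here is a minimal sketch in Python. It is not radare2 code: the `Field`, `StructType`, `resolve_offset`, and `annotate` names and the `point` layout are hypothetical, chosen only to show how a displacement in a memory operand could be mapped back to a field name once the base register's type is known.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    name: str
    offset: int  # byte offset from the start of the struct
    size: int    # size in bytes


@dataclass
class StructType:
    name: str
    fields: List[Field]


def resolve_offset(struct: StructType, disp: int) -> Optional[str]:
    """Return 'struct.field' (plus any remainder) for a byte displacement."""
    for f in struct.fields:
        if f.offset <= disp < f.offset + f.size:
            rest = disp - f.offset
            suffix = f"+{rest}" if rest else ""
            return f"{struct.name}.{f.name}{suffix}"
    return None  # displacement falls outside the struct or into padding


def annotate(base_reg: str, disp: int, struct: StructType) -> str:
    """Rewrite '[reg + disp]' as '[reg + struct.field]' when the offset resolves."""
    field = resolve_offset(struct, disp)
    return f"[{base_reg} + {field}]" if field else f"[{base_reg} + {disp:#x}]"


# Hypothetical layout: struct point { int x; int y; char tag[8]; }
point = StructType("point", [Field("x", 0, 4), Field("y", 4, 4), Field("tag", 8, 8)])

print(annotate("rdi", 4, point))
print(annotate("rdi", 0x40, point))
```

The fallback to the raw displacement matters in practice: an inferred type may be wrong, and the original operand should stay readable in that case.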

Skills

The student should know C and be familiar with the basics of program analysis.

Difficulty

Medium

Benefits for the student

The student will understand modern program analysis problems related to type inference, learn the low-level details of how compilers produce native code from OOP concepts, and tackle one of the most common reverse engineering tasks in its advanced form.

Benefits for the project

This feature will make radare2 more usable for day-to-day reverse engineering of complex programs, and will make integration with the radeco decompiler even easier.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources

CPU/Platform profiles

While the instruction set defines an architecture, it is common for particular CPU or SoC models to implement only a subset of it or to extend it with custom instructions and registers. Moreover, SoC variants can define interaction with peripheral devices through ports (rare), registers, or MMIO spaces. All this helps the reverse engineering process, because much of the code makes sense at a glance once you see that it accesses certain registers (if named) or peripheral devices (when the MMIO area is defined). A common example is SVD loading for the ARM architecture.

Task

  1. Implement support for CPU profiles - see #8467
  2. Implement support for platform profiles
  3. Add support for register and MMIO specific setups
  4. Integrate these into the analysis loop, handling register and memory accesses
  5. Implement tests and documentation in the radare2 book
  6. Provide an API for setting these values from r2pipe and lang-* plugins
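
To make the MMIO part of these tasks more concrete, the following Python sketch shows one way a platform profile could be used to name memory accesses. The region table, its addresses, and the `build_index` and `name_access` helpers are all hypothetical; a real profile would be loaded from a vendor description such as an SVD file.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical profile entries: (base address, size in bytes, peripheral name).
PROFILE: List[Tuple[int, int, str]] = [
    (0x40000000, 0x400, "TIMER0"),
    (0x40004400, 0x400, "UART1"),
    (0x40020000, 0x400, "GPIOA"),
]


def build_index(profile):
    """Sort regions by base address so lookups can use binary search."""
    regions = sorted(profile)
    bases = [base for base, _, _ in regions]
    return bases, regions


def name_access(index, addr: int) -> Optional[str]:
    """Return 'PERIPHERAL+offset' if addr falls inside a known MMIO region."""
    bases, regions = index
    i = bisect_right(bases, addr) - 1  # last region starting at or before addr
    if i < 0:
        return None
    base, size, name = regions[i]
    if addr < base + size:
        return f"{name}+{addr - base:#x}"
    return None


index = build_index(PROFILE)
print(name_access(index, 0x40004404))
```

An analysis pass could call such a lookup for every resolved load or store address and attach the result as a comment or flag.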

Skills

The student should know C and understand the basics of hardware platforms, architectures, and chips.

Difficulty

Medium

Benefits for the student

The student will become more familiar with reverse engineering for various architectures and platforms while improving the efficiency of radare2.

Benefits for the project

Huge benefits for end users in UX and better support for extension.

Assess requirements for evaluations

Mentors

Radiff2 improvements

Radare2 has had the ability to perform binary diffing for over a decade. Nevertheless, the support is quite basic and there is room for improvement. One of the most important tasks is to deepen the integration with the analysis loop, which will allow radare2 to find and highlight differences in argument count, local variable count, their types, and other analysis metadata. The next big task is to modernize radiff2 (and the corresponding parts of RCore) in terms of performance and user interface. And, of course, the radiff2 and radare2 diffing features should be covered with regression tests and unit tests.
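
The kind of metadata comparison described above can be sketched as follows. This is an illustrative Python example, not radiff2 code: the `diff_functions` helper and the `nargs` and `nlocals` keys are hypothetical stand-ins for whatever per-function information the analysis provides.

```python
from typing import Dict, List, Tuple

FunctionInfo = Dict[str, object]
Change = Tuple[str, str, object, object]


def diff_functions(old: Dict[str, FunctionInfo],
                   new: Dict[str, FunctionInfo]) -> List[Change]:
    """List (function, attribute, old value, new value) for every changed attribute."""
    changes: List[Change] = []
    for name in sorted(old.keys() & new.keys()):  # functions present in both binaries
        for key in sorted(old[name].keys() | new[name].keys()):
            a, b = old[name].get(key), new[name].get(key)
            if a != b:
                changes.append((name, key, a, b))
    return changes


# Hypothetical analysis results for two versions of the same binary.
old = {"main": {"nargs": 2, "nlocals": 3}, "helper": {"nargs": 1, "nlocals": 0}}
new = {"main": {"nargs": 3, "nlocals": 3}, "helper": {"nargs": 1, "nlocals": 0}}

for change in diff_functions(old, new):
    print(change)
```

This sketch matches functions by name only; matching renamed or stripped functions is a separate problem that the diffing engine itself has to solve.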

Tasks

Skills

The student should know C and be familiar with the basics of program analysis. Experience with other binary diffing software is a plus.

Difficulty

Medium

Benefits for the student

The student will understand modern program analysis problems as applied to binary diffing, and how to improve the performance of patch analysis.

Benefits for the project

This feature will make radare2 usable for day-to-day patch analysis of modern software, as well as improve the automation and performance of this task.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources

Handle EXE/DLL as FAT binaries

Windows programs are like Apple's FAT binaries: they contain multiple programs inside, and r2 should be able to list and select them when loading. It may also be possible to extract them with rabin2 -x foo.exe. The sub-bins inside an EXE are:

  1. DOS program
  2. W16 program
  3. W32 program
  4. MSIL program (.NET)
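
As a rough illustration of how these sub-bins can be told apart, the Python sketch below inspects the headers of an MZ image. It is a simplified assumption-laden example rather than RBin code: the `list_sub_binaries` name and the returned labels are hypothetical, and only the well-known header fields are checked (the MZ and NE/PE signatures, the e_lfanew field at offset 0x3C, and data directory entry 14, which holds the CLR runtime header).

```python
import struct
from typing import List


def list_sub_binaries(data: bytes) -> List[str]:
    """Return labels for the program kinds detected in an MZ executable image."""
    found: List[str] = []
    if len(data) < 0x40 or data[:2] != b"MZ":
        return found
    found.append("DOS")  # every MZ file starts with a DOS header and stub
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of the new header
    sig = data[e_lfanew:e_lfanew + 4]
    if sig[:2] == b"NE":
        found.append("W16")
    elif sig == b"PE\x00\x00":
        found.append("W32")  # PE32+ (64-bit) images also land here in this sketch
        opt = e_lfanew + 4 + 20  # skip the PE signature and the COFF file header
        if len(data) < opt + 2:
            return found
        (magic,) = struct.unpack_from("<H", data, opt)
        # Data directories start 96 bytes into a PE32 optional header, 112 for PE32+.
        dirs = opt + (96 if magic == 0x10B else 112)
        clr = dirs + 14 * 8  # each directory entry is an (RVA, size) pair of 8 bytes
        if len(data) >= clr + 8:
            (count,) = struct.unpack_from("<I", data, dirs - 4)  # NumberOfRvaAndSizes
            rva, size = struct.unpack_from("<II", data, clr)
            if count > 14 and rva and size:
                found.append("MSIL")
    return found
```

A caller would read the file into memory and pass the bytes, for example `list_sub_binaries(open("foo.exe", "rb").read())`.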

Task

This task also includes adding support for .NET in RBin, to be able to list the symbols, get the entrypoint, code metadata, etc. This will require rethinking some of the commands to allow switching between the parts of this FAT binary on the fly.

  1. Fix current fatmach0
  2. Improve dyldcache loading, including the filtering of shared dyldcache objects
  3. PE (dos, win, .net) separation
  4. Add support for iOS OTA images

Skills

The student should be comfortable with the C language and be familiar with Windows binaries.

Difficulty

Advanced

Benefits for the student

The student will gain a deep understanding of Microsoft's executable formats.

Benefits for the project

Currently, there are no up-to-date tools for dealing with .NET programs at a low level when decompilers fail. With this task, we'd like to fill this gap.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources

Exploitation capabilities improvements

Since modern architectures now enforce W^X, exploit developers use ROP. (Un)fortunately, building a ROP chain by hand can be tedious, which is why tools exist to ease the construction: ImmunityDBG has mona.py, and there are also ROPgadget and dropper. There are even tools that can generate ROP chains automatically, for example exrop. It's a shame that, despite having ESIL, radare2 doesn't have something similar yet. One possible solution would be to build an external plugin or tool that reuses the power of libr and ragg2. It also makes sense to consider SROP, COOP, and BROP support.

While the ragg2 tool can create custom shellcode, its shellcode database is outdated, so updating it is crucial for the tool to stay relevant.

Task

  1. Update the shellcode database, improve ragg2 features and documentation
  2. Implement a ropchain syntax parser that uses ragg2 or a custom DSL, something like: register reg1 = 0; register reg2 = whatever; register reg3 = reg1 + reg2; system(reg3);
  3. Write a compiler which uses an SMT solver (Z3, for example) to produce the ropchain
  4. Support main architectures - x86, ARM, MIPS, PowerPC
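
The DSL fragment in the task list could be parsed along the following lines. This Python sketch is only one possible reading of that example: the `parse_chain` function and the tuple-based intermediate form are hypothetical, and only the two statement shapes shown above (register assignment with `+`, and a call) are handled. Choosing gadgets and solving constraints would be the job of the compiler step.

```python
import re
from typing import List, Tuple

DECL = re.compile(r"register\s+(\w+)\s*=\s*(.+)")
CALL = re.compile(r"(\w+)\s*\((.*)\)")


def parse_chain(source: str) -> List[Tuple[str, ...]]:
    """Turn 'register r = a + b; f(r);' statements into ('set', ...) / ('call', ...) tuples."""
    ops: List[Tuple[str, ...]] = []
    for stmt in filter(None, (s.strip() for s in source.split(";"))):
        m = DECL.fullmatch(stmt)
        if m:
            name, expr = m.groups()
            terms = tuple(t.strip() for t in expr.split("+"))  # operands to be summed
            ops.append(("set", name) + terms)
            continue
        m = CALL.fullmatch(stmt)
        if m:
            func, args = m.groups()
            argv = tuple(a.strip() for a in args.split(",") if a.strip())
            ops.append(("call", func) + argv)
            continue
        raise SyntaxError(f"unrecognized statement: {stmt!r}")
    return ops


program = "register reg1 = 0; register reg2 = whatever; register reg3 = reg1 + reg2; system(reg3);"
for op in parse_chain(program):
    print(op)
```

Keeping the parser output this small makes it easy to feed into a later stage that maps each `set` or `call` to gadgets for a given architecture.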

Skills

The student should be comfortable with the C language, know some assembly and a high-level language. Also, knowing a little bit of automatic binary analysis wouldn’t hurt.

Difficulty

Advanced

Benefits for the student

The student will improve their skills in software exploitation and solvers.

Benefits for the project

This feature would greatly help during exploits development, and people would be able to ditch mona.py for radare2 ;)

Assess requirements for evaluation

Mentors

Links/Resources

Cutter

Plugins and Python High Level API

We currently have almost no API for plugin authors to use. We need to improve many things about our plugin support and take it a few steps ahead.

Task

Skills

The student should be comfortable with the C++ and Python languages and be familiar with the Qt framework.

Difficulty

Advanced

Benefits for the student

The student will gain experience in creating a suitable API for scripting graphical interface programs.

Benefits for the project

It will greatly improve the scripting experience, make the API more consistent, and make it easier for the community to create Cutter plugins. Moreover, it will simplify testing of Cutter features.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources

Decompiler Widget

In recent years, the decompiler has become an almost essential feature for reverse engineers. It takes the disassembly and turns it into a readable C program. Cutter has a Decompiler widget in which several decompiler plugins can show the decompiled output. The currently supported decompilers are Ghidra, r2dec, and retdec.

The current Decompiler widget provides only basic features and interaction, and it is far from being as advanced as IDA's or Ghidra's views.

The following task aims to take the decompiler experience in Cutter to the next level, making it as good as in Hex-Rays and Ghidra. This is a feature in high demand among our users.

Tasks

This is a big task and among others, it contains the following sub-tasks:

Skills

The student should be comfortable with C++ and be familiar with the Qt framework.

Difficulty

Advanced

Benefits for the student

The student will gain experience in creating efficient graphical interfaces.

Benefits for the project

It will put Cutter on par with the rest of the advanced reverse engineering tools.

Assess requirements for midterm/final evaluation

Mentors

Multi-Tasking and Event-driven architecture

Cutter is a reverse engineering framework powered by radare2. The information it gets about functions, strings, and imports, as well as the analysis, is produced by radare2 and displayed in Cutter. Currently, Cutter pulls information from radare2 only on demand. This is problematic because the user sometimes makes changes (via plugins, the console widget, and more) that affect the information in radare2, but Cutter doesn't know about these changes and cannot apply them to the UI. For example, if a user defines a new function in a Python script or via the console widget using the radare2 command af @ <addr>, Cutter will not show this new function in the Functions widget until the user refreshes the interface manually (edit -> Refresh Contents).

In addition, this task will cover background analysis, allowing the analysis performed by radare2 to run while the interface remains active.

Tasks

The overall implementation of this task should start from radare2 by adding events to many of the functions. This can be done using r_events. For example, add an event for function creation, for section creation, for flag deletion, for name changes, and more.
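
The publish/subscribe idea behind this task can be sketched in a few lines of Python. This is a conceptual illustration only and does not reflect the actual r_events API or Cutter's classes: `EventBus`, `FunctionsWidget`, and the `function_created` event name are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], None]


class EventBus:
    """Minimal publish/subscribe hub standing in for the core's event system."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event: str, payload: dict) -> None:
        for handler in self._handlers.get(event, []):
            handler(payload)


class FunctionsWidget:
    """Stand-in for a UI widget that keeps its rows in sync via events."""

    def __init__(self, bus: EventBus) -> None:
        self.rows: List[Tuple[int, str]] = []
        bus.subscribe("function_created", self.on_function_created)

    def on_function_created(self, payload: dict) -> None:
        self.rows.append((payload["addr"], payload["name"]))


bus = EventBus()
widget = FunctionsWidget(bus)

# The core would emit this whenever a function is defined, for example by a script.
bus.emit("function_created", {"addr": 0x401000, "name": "fcn.00401000"})
print(widget.rows)
```

With this arrangement the widget never has to poll: it is updated at the moment the change happens, regardless of whether the change came from the UI, a plugin, or the console.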

Skills

The student should be comfortable with C++ for Cutter and C for radare2. The student should be familiar with the Qt framework.

Difficulty

Advanced

Benefits for the student

The student will gain experience in creating complex event-driven software in both C and C++.

Benefits for the project

It will make it possible to work on big files effortlessly in Cutter and will improve analysis quality as well.

Assess requirements for midterm/final evaluation

Mentors

Heap viewer

We already have a nice heap (and memory map) parser and visualizer in radare2 (dm and dmh commands). After debugging becomes a first-class citizen in cutterland it would be awesome to have memory map and heap visualizations.

Task

Skills

The student should be comfortable with C++ and be familiar with the Qt framework.

Difficulty

Medium

Benefits for the student

The student will gain an understanding of how modern runtimes provide the heap for various programs, which will benefit their binary exploitation skills.

Benefits for the project

It will greatly improve the debugging and reverse engineering experience for complex programs, and will also provide a way to design exploitation techniques with the help of radare2/Cutter.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources

Diffing mode

Binary diffing is one of the most common tasks for the reverse engineer. There are many tools available, but most of them are either detached from the main RE toolbox or poorly integrated. Radare2 provides basic diffing features out of the box with the radiff2 tool, but Cutter has no interface that exposes similar functionality.

Task

Skills

The student should be comfortable with the C++ language and be familiar with the Qt framework.

Difficulty

Advanced

Benefits for the student

The student will gain experience in creating efficient graphical interfaces.

Benefits for the project

It will greatly benefit the project since Cutter will be the only FOSS RE tool to provide this feature out of the box.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources

Debugging and reverse debugging improvements (both radare2 and Cutter)

Radare2 already supports a basic "Record and Replay" feature, similar to gdb's process record. The reverse debugger is designed to work by logging the execution of each machine instruction in the debuggee, together with each corresponding change in machine state (the values of memory and registers). While the feature exists, it is still basic and somewhat unstable. radare2 also supports reverse debugging of gdbserver-based targets that themselves support reverse debugging. A good recent example from Tetrane shows the workflow of working with reversible debugging.

Another part of the task will be improving the existing GDB/LLDB remote debugging implementation, along with WinDbg improvements. WinDbg recently added support for record and replay; supporting it would be beneficial for radare2 and Cutter users. Currently, radare2 only supports WinDbg debugging over the unencrypted serial protocol using windows/qemu pipes. To improve our Windows debugging capabilities, we would like to add proper ethernet WinDbg protocol support to reach remote targets and improve the user experience. This task will require reverse engineering WinDbg's protocol using programs like windbgshark, which will also require decryption and encryption of protocol packets.
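
The record-and-replay mechanism described here, logging each state change so that it can be undone, can be illustrated with a small Python sketch. It is a conceptual model, not radare2's implementation: the `Recorder` class and the flat dictionary used for registers and memory are hypothetical simplifications.

```python
from typing import Dict, List, Optional, Tuple


class Recorder:
    """Records, for each executed step, the old values of everything it overwrote."""

    def __init__(self, state: Dict[str, int]) -> None:
        self.state = dict(state)  # registers and memory cells, keyed by name
        self.log: List[List[Tuple[str, Optional[int]]]] = []

    def step(self, writes: Dict[str, int]) -> None:
        """Apply one instruction's writes and remember how to undo them."""
        undo = [(key, self.state.get(key)) for key in writes]
        self.state.update(writes)
        self.log.append(undo)

    def step_back(self) -> bool:
        """Undo the most recent step; return False when nothing is recorded."""
        if not self.log:
            return False
        for key, old in reversed(self.log.pop()):
            if old is None:
                self.state.pop(key, None)  # the cell did not exist before the step
            else:
                self.state[key] = old
        return True


rec = Recorder({"rax": 1, "rip": 0x1000})
rec.step({"rax": 5, "rip": 0x1003})               # e.g. mov rax, 5
rec.step({"mem:0x2000": 0xFF, "rip": 0x1006})     # e.g. a one-byte store
rec.step_back()
print(rec.state)
```

Storing only the overwritten values keeps the log proportional to the amount of state each instruction touches, rather than to the size of the whole process image.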

Tasks

Skills

The student should be comfortable with the C and C++ languages; basic knowledge of the Qt framework will be beneficial.

Difficulty

Advanced

Benefits for the student

The student will gain experience with modern debugging techniques and tools, understand how debugging works "under the hood", and practice creating useful interfaces for debugging tools.

Benefits for the project

It will greatly benefit the project since Cutter will be the only FOSS RE tool to provide this feature out of the box.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources

R2Ghidra

SLEIGH Disassembler Backend

The release of the Ghidra reverse engineering suite has had a great impact on the reverse engineering landscape in the sense that it instantly became highly popular. For disassembling raw binary data, it uses an interesting special purpose language called SLEIGH to define all of its supported instruction sets. Because of the mentioned popularity, many SLEIGH modules for various architectures have been written by users of the tool.

Tasks

The goal is to integrate SLEIGH as a disassembly backend into radare2. This will make it possible to directly support all architectures that are supported by Ghidra, but also take advantage of the interface, analysis and flexibility of radare2.

A similar project that has been successful is the existing integration of Ghidra's decompiler into radare2, r2ghidra-dec. The C++ code of this decompiler includes a full implementation of the SLEIGH-based disassembly engine. A proof-of-concept of disassembling using this engine is already available as the pdgsd command. This task should thus be implemented in r2ghidra's codebase.

Radare2's disassembly is based on plugins, which expose C functions that, given raw binary data, return the corresponding disassembled instruction along with additional information about its semantics. One such plugin should be implemented that will use SLEIGH.

As an optional task, a translator from P-code, Ghidra's intermediate language for analysis, to ESIL, which is radare2's intermediate language, can be implemented. This will enable additional features, such as emulation and emulation-based analysis.
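
To give a feel for the optional P-code to ESIL translator, here is a deliberately tiny Python sketch. It rests on several assumptions: the `PcodeOp` tuple is a hypothetical simplification of a P-code operation (real P-code works on sized varnodes in address spaces), only COPY and a few commutative integer operations are covered, and the output uses the comma-separated postfix style of ESIL in which an assignment is written as `src,dst,=`.

```python
from typing import NamedTuple, Tuple


class PcodeOp(NamedTuple):
    opcode: str               # P-code operation name, e.g. "COPY" or "INT_ADD"
    output: str               # destination, simplified here to a register name
    inputs: Tuple[str, ...]   # source operands as register names or constants


# Commutative binary operations only, so operand order cannot change the result.
_BINARY = {"INT_ADD": "+", "INT_AND": "&", "INT_OR": "|", "INT_XOR": "^"}


def to_esil(op: PcodeOp) -> str:
    """Translate one simplified P-code op into an ESIL-style postfix string."""
    if op.opcode == "COPY":
        (src,) = op.inputs
        return f"{src},{op.output},="
    if op.opcode in _BINARY:
        a, b = op.inputs
        return f"{b},{a},{_BINARY[op.opcode]},{op.output},="
    raise NotImplementedError(op.opcode)


print(to_esil(PcodeOp("COPY", "eax", ("ebx",))))
print(to_esil(PcodeOp("INT_ADD", "eax", ("eax", "0x4"))))
```

A real translator would also have to handle operand sizes, memory loads and stores, flags, and control flow, which is where most of the work in this optional task lies.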

Skills

The student should have good C and C++ skills. Knowledge of SLEIGH, P-code and ESIL is not a necessity, but a plus.

Difficulty

Medium

Benefits for the student

The student will gain deep insight into the SLEIGH disassembly engine, as well as intermediate languages used for program analysis.

Benefits for the project

Radare2 will be able to reuse any architecture module that has been created for the Ghidra framework.

Assess requirements for midterm/final evaluation

Mentors

Links/Resources



--radareorg @ 2020