Students who want to apply to the radare2 GSoC/RSoC projects are strongly encouraged to complete a small task first. This gives us an idea of each student's capabilities and familiarizes them with the radare2 codebase, structure and development process. Note that some tasks mentioned here are "meta" issues, which are large lists of smaller tasks. Finishing such an issue in a short period of time is impossible, so a student can take a few list items from it as their microtask. Here is the list of such "qualification" tasks:
The current code analysis has many small caveats and issues that should be addressed. Fixing them and writing more tests is very important to stabilize and enhance it.
See these issues
Currently radare2 has support for heap exploration and analysis, but the feature is still basic and can be improved. Moreover, other allocators can be added, but this should be done after a proper refactoring, because heap analysis shouldn't depend on the debugger backend, and we may be able to use different heap tools.
So the most important part of supporting heap analysis is to create a new subset of commands and put all of this under data analysis or debugger-wide features, not in the target debugger backend. Moreover, many things are currently done in C that could be solved with format strings.
Modern object-oriented languages such as C++, Objective-C, Swift, D, etc. usually implement virtual tables for their methods, classes and other entity relationships. To better understand such programs, it is vital to see these relationships loaded as types and indicated in the disassembly view. See #6851 for other tools and scripts available for this task, articles about vtable structure, and requirements for radare2.
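One common heuristic for spotting vtable candidates can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not how radare2 does it: a data section is treated as an array of pointer-sized little-endian words, and runs of consecutive words that point into the code range are reported. The function name `find_vtable_candidates`, the addresses and the `min_entries` threshold are all hypothetical.

```python
import struct


def find_vtable_candidates(data, data_base, text_start, text_end,
                           ptr_size=8, min_entries=2):
    """Return (address, [targets]) for runs of pointers into the text range.

    Hypothetical heuristic: a vtable is assumed to be a run of at least
    `min_entries` consecutive little-endian pointers that all land in
    [text_start, text_end).
    """
    fmt = "<Q" if ptr_size == 8 else "<I"
    candidates = []
    run_start, run = None, []
    # Walk the section one pointer-sized word at a time.
    for off in range(0, len(data) - ptr_size + 1, ptr_size):
        (value,) = struct.unpack_from(fmt, data, off)
        if text_start <= value < text_end:
            if run_start is None:
                run_start = data_base + off
            run.append(value)
            continue
        # The run ended: keep it only if it is long enough.
        if len(run) >= min_entries:
            candidates.append((run_start, run))
        run_start, run = None, []
    # Flush a run that reaches the end of the section.
    if len(run) >= min_entries:
        candidates.append((run_start, run))
    return candidates


# Synthetic section: two code pointers, a zero word, then a lone pointer.
section = struct.pack("<4Q", 0x401000, 0x401020, 0, 0x401040)
tables = find_vtable_candidates(section, 0x600000, 0x401000, 0x402000)
```

A real implementation would also need language-specific checks (RTTI pointers, offset-to-top fields, cross-references from constructors), which this sketch deliberately leaves out.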
Currently, Unicode characters are not always displayed in the canvas (which is used for drawing graph nodes).
See META Unicode support
Bindiffing has been a known feature of radiff2, but it has been unmaintained for years.
Radiff2 related issues
Avoid overlapping edges. Currently the ASCII-art graphs do not overlap nodes, but some edge lines still pass through.
Be able to select multiple nodes in the graph and group them, so they can be colorized and given a name. #2952
This task is necessary when node grouping or layout has changed. This information can be stored in projects by just reusing the age commands to recreate a graph and feeding the body of the nodes in base64.
The same goes for the visual panels mode: we need a way to save and restore the panels.
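A minimal Python sketch of the graph-saving idea: serialize a graph as a list of command strings, with node bodies base64-encoded so that newlines and quotes survive. The exact command forms used here (`agn <title> base64:<body>` for nodes and `age <from> <to>` for edges) are an assumption for illustration and should be checked against r2's own help output; `graph_to_commands` is a hypothetical helper.

```python
import base64


def graph_to_commands(nodes, edges):
    """Serialize a graph into command strings.

    nodes: dict mapping node title -> node body text
    edges: list of (from_title, to_title) pairs
    ASSUMPTION: the 'agn <title> base64:<body>' and 'age <from> <to>'
    forms are illustrative; verify them against r2's own help.
    """
    commands = []
    for title, body in nodes.items():
        # Base64 keeps multi-line bodies on a single command line.
        encoded = base64.b64encode(body.encode("utf-8")).decode("ascii")
        commands.append(f"agn {title} base64:{encoded}")
    for src, dst in edges:
        commands.append(f"age {src} {dst}")
    return commands


cmds = graph_to_commands(
    {"entry": "push rbp\nmov rbp, rsp", "exit": "ret"},
    [("entry", "exit")],
)
```

Stored this way, a project file only needs to replay the commands in order to rebuild the graph.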
Radare2 already supports disassembling WebAssembly, but analysis is barely implemented. See wasm and wasmdec as good examples. Implementing pdc (pseudocode) on top of this analysis is also a good idea.
Java support landed in radare2 a long time ago. However, it is largely unused, full of bugs and poorly written. Some code (e.g. anal_extra) doesn't really fit where it is and can be moved or refactored on top of the modern radare2 architecture design.
Radare2 has support for Lua bytecode disassembly and analysis, but it lacks proper testing and thus can be easily broken. We need to add proper tests for the architecture to the radare2-regressions suite and fix any bugs that appear.
Disassembling Python bytecode is already supported. But, as with Lua, the architecture is largely untested and can be easily broken. We need to add proper tests, see if there are bugs, and fix them. See universal python disassembler for an example and Issue #4228 for its current state.
Ragg2 is a simplistic compiler that turns a C-like syntax into tiny binaries for x86-32/64 and ARM. Programs generated by ragg2 are relocatable and can be injected into a running process or an on-disk binary file. Fixing ragg2 issues will help a lot in creating small payloads for exploitation tasks.
Radare2 is slowly being refactored to store all the information about the session, user metadata and debugger state in SDB, a simple key-value database. This work is still ongoing, so helping us with a few sdbization bugs will introduce you to the radare2 codebase structure.
We have decided not to sdbize everything and to use RBTree and RDict when necessary. Also, some places in r2 (like the version bin parser) are using Sdb in a very poor way.
Radare2 has its own intermediate language, ESIL, but does not yet support it for all architectures. The task is to add ESIL support to any architecture that doesn't have it yet. See issues for the related bugs.
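To give a feel for what an ESIL expression looks like to a consumer: ESIL is a comma-separated, stack-based (postfix) language, so `ebx,eax,+=` means "add ebx to eax". The Python sketch below evaluates a deliberately tiny subset. The function `eval_esil` is hypothetical and ignores flags, memory access, operand sizes and every other operator real ESIL provides.

```python
def eval_esil(expr, regs):
    """Evaluate a tiny ESIL-like subset over a register dict.

    Supported (simplified) tokens: integer literals, register names,
    '+', '-', '=', '+=' and '-='. Real ESIL has many more operators,
    flags and memory access; none of that is modelled here.
    """
    stack = []

    def value(tok):
        # A token is either a known register name or an integer literal.
        return regs[tok] if tok in regs else int(tok, 0)

    for tok in expr.split(","):
        if tok in ("+", "-"):
            # The top of the stack is the left-hand operand.
            a, b = value(stack.pop()), value(stack.pop())
            stack.append(str(a + b if tok == "+" else a - b))
        elif tok == "=":
            dst, src = stack.pop(), stack.pop()
            regs[dst] = value(src)
        elif tok in ("+=", "-="):
            dst, src = stack.pop(), stack.pop()
            delta = value(src)
            regs[dst] += delta if tok == "+=" else -delta
        else:
            stack.append(tok)
    return regs


regs = {"eax": 0, "ebx": 5}
eval_esil("1,eax,=", regs)      # mov eax, 1
eval_esil("ebx,eax,+=", regs)   # add eax, ebx
```

Adding ESIL support to an architecture amounts to emitting strings of this kind for each decoded instruction, so that the existing ESIL consumers can work on it.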
Proper support for multibyte characters needs to be implemented in RConsCanvas in order to render UTF-8 characters in the graphs, giving better ASCII-art boxes and lines.
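The core difficulty with multibyte characters on a canvas is that the number of bytes in the buffer, the number of code points and the number of terminal cells all differ. A minimal Python sketch of measuring the displayed width, under stated assumptions (`display_width` and `pad_to_columns` are hypothetical helpers and have nothing to do with the actual RConsCanvas code):

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns `text` occupies.

    Assumptions: combining marks take 0 columns, East Asian wide and
    fullwidth characters take 2, everything else takes 1. Real terminals
    may disagree for some code points.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def pad_to_columns(text, columns):
    """Right-pad `text` with spaces so it fills `columns` terminal cells."""
    return text + " " * max(0, columns - display_width(text))


label = "\u250c\u2500\u00e9\u2500\u2510"   # box-drawing corners around an accented e
byte_len = len(label.encode("utf-8"))      # bytes stored in the buffer
cols = display_width(label)                # cells drawn on screen
```

Any layout code that advances the cursor by the byte count instead of the column count will misalign box edges as soon as a node body contains such characters.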
There are a lot of missing features in the current PE file parser, as you can see in this META Issue.
This requires a refactor of rbin that hasn't happened yet. We also want to have the .NET parser (already committed but not used) for PE, and to bring back the MSIL disassembler.
There is basic MDMP file format support in radare2, but there is still no support for pagedumps (kernel dumps). They should be properly parsed, with the ability to automatically load PDB symbols and with improved autoanalysis and entry-point searching. The support for MDMP files can also be improved.
Currently radare2 supports opening the PCAP file format, but the original idea was to be able to load it as a debugging session. For example, if we record the session between GDB and gdbserver into a PCAP file, we would then be able to open this file as a record & replay session. See issue for more details.
The current version of r2 is able to load ART and AOT binaries, but we are not yet able to extract all the information they contain.
Multidex is an important feature to support, as is loading a jar (and resolving all symbols of all bins, etc.).
Currently radare2 supports many different debugging modes and protocols, but still has many issues to fix. See the "debugger" and "debug-info" labels for more information.
Radare2 already supports a "Record and Replay" feature, but the support is still very basic and quite unstable. See issue #8198 for more information. See also issue #8996 for adding reverse continue/step support via the gdb:// (GDB remote) protocol. See also the Debugger Data Model article about the same feature in WinDbg.
Take ideas from Androguard, and be able to follow execution flow paths to understand which permissions are used in a specific region of code, how to reach a specific activity, etc.
Support debugserver -x springboard and such to spawn apps from the backboard, otherwise they get killed.
Currently the iOS native debugger cannot step, continue or set a breakpoint. See #3461
There are valabind-generated bindings and we want them fixed. We also want to merge the r2pipe asynchronous and synchronous bindings. See radare2-bindings issues
radeco is a binary analysis framework based on radare. This year, we plan to push it further to implement a working decompiler within radare2.
Below are some tasks which help new contributors get started with radeco-lib:
Issue#114. Standardize register usage and structs in radeco using arch-rs repo: Moving forward, we want to standardize and share common structs across radare-rust repositories. arch-rs is an effort towards this and defines architecture related structs. We want to replace the current ones in radeco with structs from arch-rs.
Issue#117. Parsing text-based radeco IR to Graph based IR: This would allow us to write IR to files and load them up at a later point in the analysis. It would be nice if we could do this with a parser generator in rust, such as lalrpop
Issue#118. Implement a simple type system: Currently, we have the ability to mark nodes as either a scalar (not an address) or a reference (pointer/reference). We'd like to take this a step further and be able to assign primitive (C like) types for nodes to make the IR more expressive.
Issue#119. Restore / Update / Improve CLI tool (aka. minidec/radeco): Minidec currently uses the old deprecated containers. This should be ported to use the latest container systems.
Issue#120. Make accessing bindings in RadecoFunction more ergonomic: Accessing bindings in RadecoFunction is currently not elegant. We should improve support for this. This should also be extended to improve access to type and other node information related to these bindings.
Issue#46. Port domtree analysis to use petgraph domtree construction: Dominator tree construction was one of the first analyses implemented in radeco. This needs some love. As it stands, it works but is inconsistent with the other analysis APIs in radeco. A refactor is needed. petgraph (the graph library used inside radeco-lib for all graphs) has added dominator tree construction to its set of graph algorithms. It might be worthwhile to look into this and rely on their analysis instead of reimplementing/refactoring this inside radeco.
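For readers unfamiliar with the dominator tree mentioned in the last item, here is a short Python sketch of the textbook iterative data-flow formulation: a node's dominators are the node itself plus the intersection of the dominators of its predecessors. This is neither radeco's code nor petgraph's algorithm; `dominators` and `immediate_dominators` are hypothetical names, and the example CFG is made up.

```python
def dominators(succ, entry):
    """Compute the dominator sets of a CFG by iterative data-flow.

    succ: dict mapping node -> list of successor nodes
    Returns a dict mapping each reachable node -> set of its dominators.
    """
    # Collect nodes reachable from the entry and build predecessor lists.
    preds = {entry: []}
    work = [entry]
    while work:
        node = work.pop()
        for nxt in succ.get(node, []):
            if nxt not in preds:
                preds[nxt] = []
                work.append(nxt)
            preds[nxt].append(node)

    nodes = set(preds)
    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """Derive the dominator tree (node -> idom) from the dominator sets."""
    idom = {}
    for n, ds in dom.items():
        if n == entry:
            continue
        # Strict dominators form a chain; the closest one to n has the
        # largest dominator set of its own.
        idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom


# Diamond-shaped CFG: A branches to B and C, which join at D, then E.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
dom = dominators(cfg, "A")
tree = immediate_dominators(dom, "A")
```

The set-based version is quadratic in the worst case; library implementations typically use faster algorithms, which is one more reason to reuse an existing one.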
As always, feel free to ask for help or discuss issues on the #radare channel (Telegram or IRC, ping: @xvilka or @sushant94).
rune is the radare2 community's own symbolic execution engine, written in Rust. rune currently uses radare2's ESIL as the IR for performing symbolic execution.
Below are some microtasks up for grabs:
Implementing breakpointing and hooks for the engine (#7) : Symbolic execution engines often require user control at different stages of the run. This task would involve setting up the foundations for a breakpointing/hook feature based on ESIL patterns or tokens as required by the consumer. The user would then be given control over the Context to modify the state as necessary.
Improvement to the CLI : rune currently supports a very basic CLI through the Interactive Explorer module. We would like to have a more complete set of features implemented for a better user experience.
Implementing a multithreaded model for Explorer : Currently, rune supports 3 Explorer modules (Interactive, BFS and DFS). For the BFS and DFS Explorer the current implementation is naive, and state exploration is done sequentially with states being pushed into a pipeline of sorts. We would like to make full use of Rust's concurrency model and make multithreaded exploration possible. This is a slightly advanced task, with familiarity with Rust as a prerequisite.
Incremental solving features for rune (#5) : rune could leverage the use of this z3 feature. This task would involve research and discussion into implementing a PoC and benchmarking the results against a certain set of binaries to observe improvement in performance.
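The multithreaded Explorer idea from the list above can be illustrated with a level-synchronous BFS, sketched here in Python rather than Rust. Everything in it is hypothetical: `explore_bfs` is not a rune API, and the `step` callback merely stands in for one symbolic execution step. Python threads will not give real CPU parallelism for this toy, so the sketch only shows the structure: expand a whole frontier concurrently, then merge the results on a single thread.

```python
from concurrent.futures import ThreadPoolExecutor


def explore_bfs(initial, step, max_depth, workers=4):
    """Level-synchronous BFS over states, expanding each level in parallel.

    step(state) -> iterable of successor states; it stands in for one
    symbolic execution step and is a hypothetical callback.
    States must be hashable so duplicates can be skipped.
    """
    seen = {initial}
    frontier = [initial]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_depth):
            if not frontier:
                break
            # Expand every state of the current level concurrently.
            results = pool.map(step, frontier)
            next_frontier = []
            # Merging happens on one thread, so `seen` needs no lock.
            for successors in results:
                for state in successors:
                    if state not in seen:
                        seen.add(state)
                        next_frontier.append(state)
            frontier = next_frontier
    return seen


def toy_step(state):
    """Toy transition: branch to 2n and 2n+1 while the state is below 8."""
    return [2 * state, 2 * state + 1] if state < 8 else []


reached = explore_bfs(1, toy_step, max_depth=10)
```

Keeping the merge step single-threaded avoids shared mutable state between workers, which is the same property a Rust implementation would want to preserve through ownership or channels.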
Cutter is the radare2 Graphical User Interface. It is written in Qt and C++ and uses the radare2 API and commands.
You can help the project by completing microtasks:
Add a panel for VTables (#275): VTables are often useful when reversing object oriented applications (C++ mainly). Having a panel to view the VTables would be useful.
Add a panel for getting headers information (#253): Cutter doesn't have any panel to view the header information for the current file. This information is useful to check most of the time, and it would be nice to have this implemented in Cutter.
Show manpage/description for given import (#247): Manually checking every manpage on the internet can be time consuming. Having this feature in Cutter can save users a lot of time.
Add basic search capability (#56): Searching information through a binary is very useful. This feature is a must have.
Improve color themes: Currently Cutter has one "default" theme and one dark theme. It would be nice to separate the disassembly colors from the interface theme and to add a light theme (so people whose "default" theme is dark can actually use a light theme). Color improvements for the graph view would be welcome too.
For any question related to GSoC, don't hesitate to join our IRC and/or Telegram channel and ping @xarkes or @Maijin.