Implementing support for any new architecture counts as a microtask. See the New-Architecture label for pending issues. Nevertheless, we've chosen a few as the most important ones:
LLVM bitcode is a common bytecode format used by many different compilers and tools.
See llvm-bcanalyzer and BitCodeReader.cpp on how to implement its parsing and decoding. See also #3896 for integration with Mach-O format parser.
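As a starting point for the parser, the first step is recognizing the container. The sketch below is not radare2 code; it only checks the two documented LLVM signatures: the raw stream magic ('B', 'C', 0xC0, 0xDE) and the wrapper header magic 0x0B17C0DE, whose five little-endian 32-bit fields are magic, version, offset, size and CPU type. The function name `locate_bitcode` is a hypothetical helper chosen for illustration.

```python
import struct

RAW_MAGIC = b"BC\xc0\xde"      # 'B' 'C' 0xC0 0xDE at the start of a raw stream
WRAPPER_MAGIC = 0x0B17C0DE     # first little-endian word of the wrapper header


def locate_bitcode(data: bytes):
    """Return (offset, size) of the raw bitcode stream, or None if not found.

    Hypothetical helper: it only locates the stream, it does not decode it.
    """
    if data[:4] == RAW_MAGIC:
        return 0, len(data)
    if len(data) >= 20:
        # Wrapper header: magic, version, offset, size, cputype (5 x uint32 LE)
        magic, _version, offset, size, _cputype = struct.unpack_from("<5I", data, 0)
        if (magic == WRAPPER_MAGIC
                and offset + size <= len(data)
                and data[offset:offset + 4] == RAW_MAGIC):
            return offset, size
    return None
```

Everything after the magic is a bitstream of abbreviated records, which is where llvm-bcanalyzer and BitCodeReader.cpp become the real reference.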
Modern GPUs are now basically powerful embedded computer networks. Every video card carries plenty of different chips and firmware images, either fused or loaded at runtime. Radare2 can provide a convenient interface to reverse engineer and audit this firmware and the loaded programs, as well as help open source video driver efforts.
SPIR-V is a standardized intermediate language encoded as bytecode, which is used to specify shader programs in the Vulkan graphics API. Its form is similar to LLVM IR.
https://www.khronos.org/spir/ https://github.com/KhronosGroup/SPIRV-Tools
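To give an idea of how small the first parsing step is, here is a minimal sketch that splits a SPIR-V module into its header and raw instructions. It relies only on the layout given in the specification: a five-word header (magic 0x07230203, version, generator, bound, schema) followed by instructions whose first word holds the word count in the high 16 bits and the opcode in the low 16 bits. The function name `parse_spirv` is hypothetical and nothing here reflects an existing radare2 plugin.

```python
import struct

SPIRV_MAGIC = 0x07230203


def parse_spirv(data: bytes):
    """Split a SPIR-V module into (header dict, [(opcode, operands), ...])."""
    if len(data) < 20 or len(data) % 4:
        raise ValueError("SPIR-V modules are a whole number of 32-bit words")
    # The magic number doubles as an endianness marker.
    for endian in ("<", ">"):
        if struct.unpack_from(endian + "I", data, 0)[0] == SPIRV_MAGIC:
            break
    else:
        raise ValueError("bad SPIR-V magic")
    words = struct.unpack(f"{endian}{len(data) // 4}I", data)
    header = dict(zip(("magic", "version", "generator", "bound", "schema"),
                      words[:5]))
    instructions = []
    i = 5
    while i < len(words):
        word_count = words[i] >> 16      # includes the first word itself
        opcode = words[i] & 0xFFFF
        if word_count == 0 or i + word_count > len(words):
            raise ValueError(f"malformed instruction at word {i}")
        instructions.append((opcode, words[i + 1:i + word_count]))
        i += word_count
    return header, instructions
```

A real disassembler would then map each opcode to its name and operand kinds, for example from the machine-readable grammar that SPIRV-Tools consumes.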
Radare2 already supports disassembling Python bytecode, but, as with Lua, the architecture is largely untested and can easily break. Moreover, the analysis plugin is not implemented, so a lot of information is still missing from the output. We need to add proper tests and see if there are bugs (and fix them).
See the universal Python disassembler and the cross-version Python decompiler for examples, and the basic implementation in radare2-extras for the current state.
Java support landed in radare2 a long time ago. At the same time it is largely unused, full of bugs and poorly written. Some code (e.g. anal_extra) doesn't really fit and can be moved/refactored on top of the modern radare2 architecture design. Many minor fixes and improvements should be done - see the "java" label on GitHub.
.NET is widely adopted and there are many tools available for decompilation. On the other hand, radare2 provides many useful features across all architectures, as well as scripting capabilities, which can help improve the state of .NET reverse engineering tooling. Currently only the most basic MSIL support lives in radare2-extras. It can be revived, improved and enhanced to support newer formats of the .NET bytecode. See other tools that work with .NET bytecode:
The current code analysis has many caveats and issues which need addressing. Fixing them and writing more tests is important to stabilize and enhance radare2's analysis engine.
See these issues or the "Analysis" project on our GitHub dashboard.
Currently radare2 has support for heap exploration and analysis, but the feature is still basic and can be improved. Additionally, other allocators can be added (jemalloc, etc.), but this should be done after a proper refactoring, because heap analysis shouldn't depend on the debugger backend, and we may be able to use different heap tools.
The most important part of supporting heap analysis is to create a new subset of commands and to put all heap analysis under data analysis or debugger-wide features, not in the target debugger backend. Many things are also done in C when they could be solved with format strings.
There are plenty of external scripts and plugins for finding the most probable base for raw firmware images. Opening raw firmwares with radare2 is a common use case, so it makes sense to implement it as a part of radare2 core.
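One heuristic that such external scripts commonly use, sketched below under our own assumptions, is string-pointer voting: every 32-bit word in the image is treated as a possible pointer, every printable NUL-terminated string start as a possible target, and each aligned difference between the two is a vote for a base address. The names `string_offsets` and `guess_base`, the little-endian 32-bit pointer size, the 0x1000 alignment and the minimum string length are all illustrative choices, not an existing radare2 API.

```python
import re
import struct
from collections import Counter


def string_offsets(image: bytes, min_len: int = 6):
    """Offsets where a printable, NUL-terminated ASCII string starts."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}\x00" % min_len)
    return {m.start() for m in pattern.finditer(image)}


def guess_base(image: bytes, align: int = 0x1000):
    """Return (base, votes) for the best candidate base, or None.

    Assumes 32-bit little-endian pointers stored at 4-byte aligned offsets.
    """
    strings = string_offsets(image)
    usable = len(image) & ~3                     # drop trailing partial word
    words = {w for (w,) in struct.iter_unpack("<I", image[:usable])}
    votes = Counter()
    for w in words:
        for s in strings:
            base = w - s
            if base >= 0 and base % align == 0:
                votes[base] += 1                 # word w may point at string s
    return votes.most_common(1)[0] if votes else None
```

The pairwise loop is quadratic, so a core implementation would need something smarter for large images, and function prologue or vector table heuristics could serve as additional vote sources.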
Anal classes, accessible under the ac command, are a new feature of r2 which has only recently been merged into master.
They provide a way to both manually and automatically manage and use information about classes in the binary.
Right now, vtables in aCv only have an address, but no size.
This should be added to the sdb record and also be represented in the size of the flag for the vtable.
Consider the following call: call dword [eax + 0x6c]
Let's assume eax is the base pointer of a vtable we have saved in anal classes and we want to find out the actual address of the called method.
So there should be a command that takes the offset (in this case 0x6c) and looks up the actual destination. It should be possible to call this command with a specific class, so it only looks into its vtable, or without a class, so it gives a list of possible destinations for all vtables that are not too small for the offset (partially requires #12601)
When that is implemented, one could also add a command that does the same thing, but automatically takes the offset from the opcode at the current seek.
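The lookup described above can be sketched as follows. This is only a model of the intended behavior, not the anal classes API: `VTable`, `resolve_vcall` and the in-memory dictionary are hypothetical, and the vtable size is represented by the number of known entries, which is exactly the information the size field would provide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class VTable:
    addr: int            # address of the vtable in the binary
    entries: List[int]   # method addresses, one per slot


def resolve_vcall(vtables: Dict[str, VTable], offset: int,
                  class_name: Optional[str] = None,
                  ptr_size: int = 4) -> List[Tuple[str, int]]:
    """Map a vtable offset (e.g. 0x6c) to candidate method addresses.

    With class_name, only that class is consulted; without it, every
    vtable large enough to contain the offset contributes a candidate.
    """
    if offset < 0 or offset % ptr_size:
        return []                        # not a slot boundary
    slot = offset // ptr_size
    names = [class_name] if class_name is not None else sorted(vtables)
    results = []
    for name in names:
        vt = vtables.get(name)
        if vt is not None and slot < len(vt.entries):
            results.append((name, vt.entries[slot]))
    return results
```

For call dword [eax + 0x6c] on a 32-bit target this selects slot 27, so any vtable with fewer than 28 entries is skipped. The variant that reads the offset from the opcode at the current seek would only need to extract the displacement before calling the same lookup.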
Vb already supports browsing bin classes. The same thing should be implemented for anal classes.
Radare2 has good support for loading and creating signatures, but it is not yet complete. This task covers improving the signature contents (their variables, arguments, types, local flags and comments), their test coverage and the user interface (commands, reviving the rasign2 tool).
Apart from that, better integration with the analysis loop is required for the best autoanalysis results.
Of course, all these features are worthless without actual signatures to use, hence the task of creating a default signature pack.
Bindiffing has long been a known feature of radiff2, but it has lacked developer attention for years. Improving the output, enhancing visual diffing modes (issue #8115), using analysis results and optimizing speed are the most important things here.
Being able to select multiple nodes in the graph and group them, in order to colorize them and give them a name (#2952).
This task is necessary when node grouping or layout have changed. This information can be stored in projects by just reusing the agn and age commands to recreate the graph, feeding in the body of each node as base64.
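A project file could then contain one agn line per node and one age line per edge. The sketch below generates such lines from an in-memory graph. It assumes that agn accepts a body of the form base64:<data>, as we recall from the agn help text (please verify with agn? before relying on it), and that node titles contain no whitespace. The function name `graph_to_r2_commands` is hypothetical.

```python
import base64


def graph_to_r2_commands(nodes, edges):
    """Serialize a graph as r2 commands that recreate it.

    nodes: dict mapping a node title (no whitespace) to its body text
    edges: iterable of (source_title, destination_title) pairs
    """
    cmds = []
    for title, body in nodes.items():
        # Assumption: agn takes the node body as "base64:<encoded text>".
        encoded = base64.b64encode(body.encode()).decode()
        cmds.append(f"agn {title} base64:{encoded}")
    for src, dst in edges:
        cmds.append(f"age {src} {dst}")
    return cmds
```

Encoding the bodies sidesteps quoting problems with newlines and special characters in disassembly text. Group names and colors would still need a place in the saved commands.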
Ragg2 is a simplistic compiler that turns a C-like syntax into tiny binaries for x86-32/64 and ARM. Programs generated by ragg2 are relocatable and can be injected into a running process or an on-disk binary file. Fixing ragg2 issues will help a lot with creating small payloads for exploitation tasks.
Rafind2 - binwalk parity
Various issues to improve rafind2, such as being able to extract known file types automatically and recursively if the file is an archive (a la binwalk).
binwalk -Me support
Radare2 is being slowly refactored to store all the information about the session, user metadata and debugger state in SDB, a simple key-value database. This work is still ongoing, so helping us with a few sdbization bugs will introduce you to the radare2 codebase structure.
We have decided to not sdbize everything, and use RBTree and RDict when necessary, so be sure to ask developers before starting. Also, some places in r2 (like the version bin parser) are using SDB in a poor way. Fixing those cases counts too.
Radare2 has its own intermediate language, ESIL, but does not yet support it for all architectures. So the task is to add ESIL support to any architecture that doesn't have it yet. See issues for the related bugs.
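For newcomers, it may help to see what an architecture plugin has to emit. ESIL is a comma-separated, stack-based postfix notation: as we understand it, mov eax, 3 becomes 3,eax,= and add eax, ebx becomes ebx,eax,+=. The toy evaluator below handles only =, +=, -= and + on plain registers, with no flags, memory access or register-width masking. It is an illustration of the notation, not the radare2 ESIL VM, and `run_esil` is a hypothetical name.

```python
def run_esil(expr: str, regs: dict) -> dict:
    """Evaluate a tiny ESIL-like subset against a register dictionary."""
    stack = []

    def value(tok):
        # A token is either a known register name or an integer literal.
        return regs[tok] if tok in regs else int(tok, 0)

    for tok in expr.split(","):
        if tok == "=":
            dst, src = stack.pop(), stack.pop()   # destination is on top
            regs[dst] = value(src)
        elif tok in ("+=", "-="):
            dst, src = stack.pop(), stack.pop()
            delta = value(src)
            regs[dst] = regs[dst] + delta if tok == "+=" else regs[dst] - delta
        elif tok == "+":
            a, b = stack.pop(), stack.pop()
            stack.append(str(value(a) + value(b)))  # push result as a literal
        else:
            stack.append(tok)                       # operand: push unchanged
    return regs
```

Adding ESIL to an architecture mostly means producing one such string per decoded instruction, so existing plugins that already emit ESIL are the best reference for the full operator set.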
Implementing support for any new file format counts as a microtask. See the New File-Format label for pending issues. Nevertheless, we have chosen a few as the most important ones:
There are a lot of missing features in the current PE file parser, as you can see in this META Issue.
This requires a refactor of rbin that hasn't happened yet. We also want to have a .NET parser for PE (already committed but not used) and to bring back the MSIL disassembler.
Currently radare2 has hardcoded debugging profiles and Cutter blindly uses them for both native and remote debugging. Being able to change the register/stack profile for either a native or a remote debugger is vital for improving debugging on various platforms and for working with custom or less popular targets. See corresponding issues for radare2 (Issue #) and Cutter (Issue #).
Take ideas from Androguard, and be able to follow execution flow paths to understand which permissions are used in a specific region of code, how to reach a specific activity, etc.
See debugserver -x springboard and such to spawn apps from the backboard; otherwise they get killed.
There are valabind-generated bindings and we want them fixed. We also want to merge the r2pipe asynchronous and synchronous bindings. See the radare2-bindings repository for more information. It also has a different approach: parsing radare2 headers using Clang bindings and generating the bindings without any intermediate files. There is support for Python, Go, Rust and Haskell. It should be improved and better tested, and writing autotests will help a lot.
Currently radare2 can use the ida2r2 script to import information from IDA Pro IDB files. It uses the python-idb library for parsing IDB files without IDA Pro installed. Improving both will allow importing more information (types, variable and argument names, structures, enums, etc.), which is the main goal of this task.
Currently radare2 uses a custom solution for running regression tests. The current test suite is written in NodeJS, but we are working on migrating it to a lighter and more portable language, V; see, for example, the CI jobs for BSD and Linux that run the V suite. Moreover, numerous issues need to be solved, along with improving parallel execution and performance. The next interesting idea is to set up and reuse the Godbolt compilation engine for generating tests for different compilers and compilation options. There is even a command line tool for interacting with Godbolt: cce.
Cutter is a Qt and C++ GUI for radare2. Cutter's goal is to be an advanced, customizable and FOSS reverse-engineering platform while keeping the user experience in mind. Cutter is created by reverse engineers for reverse engineers.
While it is useful for all the community behind Cutter, working on the interface can be interesting for people that want to gain experience in UI/UX design, or simply in Qt/C++.
Below are some improvements that can be done to the interface:
Also any issue in our issue tracker marked as "Good First Issue" is a good candidate for a microtask.
Another possible choice is to finish the Lighthouse port to Cutter: