Top Related Projects
Framework for lifting x86, amd64, aarch64, sparc32, and sparc64 program binaries to LLVM bitcode
RetDec is a retargetable machine-code decompiler based on LLVM.
A powerful and user-friendly binary analysis platform!
UNIX-like reverse engineering framework and command-line toolset
Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
Quick Overview
Remill is an open-source library for lifting machine code to LLVM bitcode. It supports x86, amd64, AArch64, SPARC32, and SPARC64, and serves as a building block for binary analysis, reverse engineering, and program transformation tools.
Pros
- Supports multiple architectures (x86, amd64, AArch64, SPARC32, SPARC64)
- Integrates well with the LLVM ecosystem
- Actively maintained and regularly updated
- Provides a flexible API for custom use cases
Cons
- Steep learning curve for beginners
- Limited documentation for advanced features
- May require significant computational resources for large binaries
- Depends on specific LLVM versions, which can cause compatibility issues
Code Examples
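- Lifting x86 machine code to LLVM bitcode (illustrative sketch; check the function names against the Remill headers for your version):
    #include <memory>
    #include <string>
    #include <llvm/IR/LLVMContext.h>
    #include <remill/Arch/Arch.h>
    #include <remill/BC/Lifter.h>

    int main() {
      llvm::LLVMContext context;
      auto arch = remill::Arch::GetArchitecture(remill::kArchX86);
      auto lifter = std::make_unique<remill::InstructionLifter>(arch.get());
      // Remill lifts encoded machine code, not assembly text.
      // The bytes B8 2A 00 00 00 encode `mov eax, 42`.
      std::string bytes("\xB8\x2A\x00\x00\x00", 5);
      auto module = lifter->LiftInstructionToModule(bytes, context);
    }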
- Analyzing lifted bitcode:
    #include <remill/BC/Util.h>

    void analyze_bitcode(llvm::Module *module) {
      for (auto &function : module->functions()) {
        if (remill::IsLiftedFunction(function)) {
          // Analyze lifted function
          for (auto &block : function) {
            // Analyze basic block
          }
        }
      }
    }
- Transforming lifted code:
    #include <remill/BC/IntrinsicTable.h>

    void transform_lifted_code(llvm::Module *module) {
      remill::IntrinsicTable intrinsics(module);
      for (auto &function : module->functions()) {
        if (remill::IsLiftedFunction(function)) {
          // Apply custom transformations
          // e.g., replace memory intrinsics, optimize branches, etc.
        }
      }
    }
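The lifting example above is easier to follow with the input format in mind: Remill consumes encoded machine-code bytes and produces LLVM bitcode describing their effect. The following is a toy sketch of that idea for a single x86 encoding (`mov r32, imm32`, opcode `0xB8` plus the register index). The names `DecodedMov`, `DecodeMovImm32`, and `LiftToPseudoIR` are hypothetical and are not part of Remill's API, and the output is an IR-like string rather than real bitcode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy model only: these types and functions are hypothetical and are not
// part of Remill's API.
struct DecodedMov {
  std::string reg;     // destination register name
  std::uint32_t imm;   // 32-bit immediate operand
  std::size_t length;  // number of bytes consumed
};

// Decodes the x86 `mov r32, imm32` form: opcode 0xB8 + register index,
// followed by a little-endian 32-bit immediate.
std::optional<DecodedMov> DecodeMovImm32(const std::vector<std::uint8_t> &bytes) {
  static const char *const kRegs[8] = {"EAX", "ECX", "EDX", "EBX",
                                       "ESP", "EBP", "ESI", "EDI"};
  if (bytes.size() < 5 || bytes[0] < 0xB8 || bytes[0] > 0xBF) {
    return std::nullopt;  // not an encoding this toy decoder understands
  }
  std::uint32_t imm = 0;
  for (int i = 0; i < 4; ++i) {
    imm |= static_cast<std::uint32_t>(bytes[1 + i]) << (8 * i);
  }
  return DecodedMov{kRegs[bytes[0] - 0xB8], imm, 5};
}

// Renders the instruction's effect as one LLVM-IR-style store into a
// register slot. A real lifter emits bitcode, not text.
std::string LiftToPseudoIR(const DecodedMov &inst) {
  assert(inst.length == 5);  // only the 5-byte form is modelled here
  return "store i32 " + std::to_string(inst.imm) + ", ptr %" + inst.reg;
}
```

Calling `DecodeMovImm32({0xB8, 0x2A, 0x00, 0x00, 0x00})` and passing the result to `LiftToPseudoIR` yields a store of 42 into the `EAX` slot. The split between a decode step and a lift step is a simplification chosen for illustration.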
Getting Started
To get started with Remill:
- Clone the repository:
    git clone https://github.com/lifting-bits/remill.git
- Install dependencies (on Ubuntu):
    sudo apt-get install build-essential cmake python3-pip
    pip3 install --user --upgrade pip
    pip3 install --user --upgrade setuptools wheel
- Build Remill:
    cd remill
    mkdir build && cd build
    cmake ..
    make -j$(nproc)
- Include Remill in your project's CMakeLists.txt:
    find_package(remill REQUIRED)
    target_link_libraries(your_target remill)
Competitor Comparisons
Framework for lifting x86, amd64, aarch64, sparc32, and sparc64 program binaries to LLVM bitcode
Pros of McSema
- Lifts whole program binaries (x86, amd64, aarch64, sparc32, sparc64) rather than individual instructions
- Provides more comprehensive binary analysis capabilities
- Offers better integration with other binary analysis tools
Cons of McSema
- More complex setup and usage compared to Remill
- Slower lifting process due to its comprehensive nature
- Requires more system resources for operation
Code Comparison
McSema:
#include <remill/Arch/Arch.h>
#include <remill/BC/Util.h>
#include <mcsema/Arch/Arch.h>
#include <mcsema/BC/Util.h>

void LiftFunction(const mcsema::Arch *arch, llvm::Function *func) {
  // McSema-specific lifting code
}
Remill:
#include <remill/Arch/Arch.h>
#include <remill/BC/Util.h>

void LiftInstruction(const remill::Arch *arch, llvm::BasicBlock *block) {
  // Remill-specific lifting code
}
Both McSema and Remill are part of the lifting-bits project and share common components. McSema builds on Remill: Remill lifts individual instructions, and McSema uses it to lift whole program binaries, at the cost of a more complex setup. Remill's narrower scope makes it easier to embed as a library for specific tasks, and it leaves whole-program lifting to tools built on top of it.
RetDec is a retargetable machine-code decompiler based on LLVM.
Pros of RetDec
- More comprehensive decompilation capabilities, supporting multiple architectures and file formats
- Usable as a standalone command-line decompiler, with plugins for IDA and radare2, so no library integration is required
- Actively maintained with regular updates and community support
Cons of RetDec
- Slower decompilation process compared to Remill's lifting approach
- Larger codebase and more complex setup, potentially making it harder to integrate into other projects
- May produce less accurate results for certain specific use cases
Code Comparison
RetDec (C decompilation output):
    int32_t function_401000(int32_t a1) {
        int32_t v1 = a1 * 2;
        return v1 + 5;
    }
Remill (simplified LLVM IR; real lifted functions take a register-state pointer, a program counter, and a memory pointer instead of native arguments):
    define i32 @sub_401000(i32 %a1) {
      %v1 = mul i32 %a1, 2
      %result = add i32 %v1, 5
      ret i32 %result
    }
Both projects aim to analyze binary code, but RetDec focuses on full decompilation to high-level languages, while Remill specializes in lifting machine code to LLVM IR. RetDec offers a more user-friendly approach for general reverse engineering tasks, whereas Remill provides a powerful foundation for advanced binary analysis and transformation tools.
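To make the difference in output shape concrete, here is a hand-written C++ model of how lifted code treats registers as fields of an explicit state structure instead of as function arguments. It assumes the convention described in Remill's documentation, in which a lifted function receives a state reference, a program counter, and an opaque memory pointer, and returns the memory pointer. The `State` and `Memory` types below are simplified stand-ins, and the choice of `edi` as the input register is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for illustration. Remill's real State structure is
// architecture-specific and much larger; Memory is opaque to lifted code.
struct State {
  std::uint32_t eax = 0;
  std::uint32_t edi = 0;
  std::uint32_t pc = 0;
};
struct Memory {};

// Models the same computation as the decompiled function (a1 * 2 + 5),
// written in state-passing style. The input is assumed to be in EDI and
// the result is left in EAX; both choices are hypothetical.
Memory *sub_401000(State &state, std::uint32_t pc, Memory *memory) {
  assert(memory != nullptr);
  state.pc = pc;                  // record where execution entered
  state.eax = state.edi * 2u;     // multiply step
  state.eax = state.eax + 5u;     // add step
  return memory;                  // memory is threaded through unchanged
}
```

A caller sets `state.edi`, invokes `sub_401000(state, 0x401000, &memory)`, and reads the result from `state.eax`. Keeping every register effect in `State` is what lets later analysis passes reason about machine behavior without knowing the calling convention.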
A powerful and user-friendly binary analysis platform!
Pros of angr
- More comprehensive analysis framework with symbolic execution capabilities
- Larger community and ecosystem of plugins/extensions
- Supports a wider range of architectures and binary formats
Cons of angr
- Steeper learning curve due to complexity
- Can be slower for certain types of analysis
- Requires more system resources, especially for large binaries
Code Comparison
angr example:
import angr
proj = angr.Project('binary')
state = proj.factory.entry_state()
simgr = proj.factory.simulation_manager(state)
simgr.explore(find=0x400000)
Remill example:
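#include <llvm/IR/LLVMContext.h>
#include <remill/BC/Util.h>
llvm::LLVMContext context;
// Load a bitcode module, then lift the code at `addr` into it.
auto module = remill::LoadModuleFromFile(&context, bc_file);
// LiftCodeIntoModule is an illustrative placeholder, not a confirmed Remill function.
auto func = LiftCodeIntoModule(module.get(), addr);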
The angr code demonstrates setting up a project and running symbolic execution, while the Remill code shows lifting binary code to LLVM IR. angr provides higher-level abstractions for program analysis, whereas Remill focuses on instruction lifting and translation to LLVM IR.
UNIX-like reverse engineering framework and command-line toolset
Pros of radare2
- Comprehensive reverse engineering framework with a wide range of features
- Large and active community, extensive documentation, and plugins ecosystem
- Supports a vast array of architectures and file formats
Cons of radare2
- Steeper learning curve due to its extensive feature set
- Can be resource-intensive for large binaries or complex analysis tasks
- Command-line interface may be less intuitive for some users
Code Comparison
radare2:
r_core_cmd(core, "aaa", 0);
r_core_cmd(core, "pdf @ main", 0);
Remill:
auto module = remill::LoadModuleFromFile(&context, argv[1]);
// GenerateProgram is an illustrative placeholder for the caller's own lifting driver.
auto program = GenerateProgram(*module);
Key differences
Radare2 is a full-featured reverse engineering framework, while Remill focuses on lifting binary code to LLVM IR. Radare2 offers a broader set of tools for various reverse engineering tasks, whereas Remill specializes in binary-to-IR translation for further analysis or recompilation.
Radare2 is more suitable for interactive analysis and scripting, while Remill is designed to be integrated into larger binary analysis systems. Radare2 has a larger user base and more extensive documentation, but Remill's specialized focus may make it more efficient for certain binary lifting tasks.
Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
Pros of Capstone
- Wider architecture support (x86, ARM, MIPS, PowerPC, etc.)
- More mature and established project with extensive documentation
- Lightweight and easy to integrate into existing projects
Cons of Capstone
- Primarily focused on disassembly, not lifting to intermediate representation
- Less suitable for advanced program analysis tasks
- May require additional tools for more complex reverse engineering workflows
Code Comparison
Capstone (disassembly example):
cs_insn *insn;
size_t count = cs_disasm(handle, code, code_size, address, 0, &insn);
for (size_t j = 0; j < count; j++) {
    printf("0x%"PRIx64":\t%s\t\t%s\n", insn[j].address, insn[j].mnemonic, insn[j].op_str);
}
Remill (lifting example):
auto lifted_block = remill::LiftCodeBlock(arch, memory, block_address);
for (const auto &inst : lifted_block->instructions) {
    std::cout << inst.Serialize() << std::endl;
}
Remill focuses on lifting machine code to an intermediate representation, which is more suitable for advanced program analysis and transformation tasks. Capstone, on the other hand, excels at disassembly and provides a simpler API for basic instruction decoding across multiple architectures.
README
Remill 
Remill is a static binary translator that translates machine code instructions into LLVM bitcode. It translates AArch64 (64-bit ARMv8), SPARC32 (SPARCv8), SPARC64 (SPARCv9), x86 and amd64 machine code (including AVX and AVX512) into LLVM bitcode. AArch32 (32-bit ARMv8 / ARMv7) support is underway.
Remill focuses on accurately lifting instructions. It is meant to be used as a library for other tools, e.g. McSema.
Documentation
To understand how Remill works you can take a look at the following resources:
- Step-by-step guide on how Remill lifts an instruction
- How to implement the semantics of an instruction
- The design and architecture of Remill
If you would like to contribute you can check out: How to contribute
API Documentation
Generate detailed API documentation using Doxygen:
# Install Doxygen (macOS)
brew install doxygen graphviz
# Install Doxygen (Ubuntu/Debian)
sudo apt-get install doxygen graphviz
# Generate documentation
doxygen
# Open docs/doxygen/html/index.html in your browser
See docs/DOCUMENTATION.md for more details on documentation style and contributing.
Getting Help
If you run into undocumented problems with Remill, ask for help in the #binary-lifting channel of the Empire Hacking Slack.
Supported Platforms
Remill is supported on Linux platforms and has been tested on Ubuntu 22.04. Remill also works on macOS, and has experimental support for Windows.
Remill's Linux version can also be built via Docker for quicker testing.
Dependencies
Remill uses the following dependencies:
| Name | Version |
|---|---|
| Git | Latest |
| CMake | 3.21+ |
| Ninja | 1+ |
| Google Flags | 52e94563 |
| Google Log | v0.7.1 |
| Google Test | v1.17.0 |
| LLVM | 15+ |
| Clang | 15+ |
| Intel XED | v2025.06.08 |
| Python | 3+ |
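Several entries in the table are minimum versions ("3.21+", "15+"). As a rough sketch of how a setup script could compare an installed version against such a minimum, the helper below parses dotted version strings and compares them component by component. `ParseVersion` and `MeetsMinimum` are hypothetical names introduced here; nothing in Remill's build system is assumed to work this way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a dotted version such as "3.28.1" into {3, 28, 1}.
// Assumes each component starts with digits; std::stoi throws otherwise.
std::vector<int> ParseVersion(const std::string &text) {
  std::vector<int> parts;
  std::stringstream stream(text);
  std::string field;
  while (std::getline(stream, field, '.')) {
    parts.push_back(field.empty() ? 0 : std::stoi(field));
  }
  return parts;
}

// Returns true when `found` is at least `minimum`. Missing components
// count as zero, so "15" is treated as "15.0.0".
bool MeetsMinimum(const std::string &found, const std::string &minimum) {
  assert(!minimum.empty());
  std::vector<int> have = ParseVersion(found);
  std::vector<int> need = ParseVersion(minimum);
  const std::size_t width = std::max(have.size(), need.size());
  have.resize(width, 0);
  need.resize(width, 0);
  return have >= need;  // std::vector compares lexicographically
}
```

For example, `MeetsMinimum("3.28.1", "3.21")` holds for the CMake row, while `MeetsMinimum("14.0.6", "15")` does not satisfy the LLVM row. Pinned entries such as a commit hash or an exact tag need an equality check instead and are outside what this helper covers.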
Getting and Building the Code
We will build the project using the superbuild in dependencies/. For more details on the dependency management system, see Remill Dependency Management.
Clone the repository
git clone https://github.com/lifting-bits/remill
cd remill
Linux/macOS
# Step 1: Build dependencies (including LLVM)
cmake -G Ninja -S dependencies -B dependencies/build
cmake --build dependencies/build
# Step 2: Build remill
cmake -G Ninja -B build -DCMAKE_PREFIX_PATH:PATH=$(pwd)/dependencies/install -DCMAKE_BUILD_TYPE=Release
cmake --build build
Windows (requires clang-cl)
Note: This requires running from a Visual Studio developer prompt.
# Step 1: Build dependencies
cmake -G Ninja -S dependencies -B dependencies/build -DCMAKE_C_COMPILER=clang-cl -DCMAKE_CXX_COMPILER=clang-cl
cmake --build dependencies/build
# Step 2: Build remill
cmake -G Ninja -B build -DCMAKE_PREFIX_PATH:PATH=%CD%/dependencies/install -DCMAKE_C_COMPILER=clang-cl -DCMAKE_CXX_COMPILER=clang-cl -DCMAKE_BUILD_TYPE=Release
cmake --build build
macOS with Homebrew LLVM:
# Install LLVM via Homebrew
brew install llvm@17
LLVM_PREFIX=$(brew --prefix llvm@17)
# Build dependencies with external LLVM
cmake -G Ninja -S dependencies -B dependencies/build -DUSE_EXTERNAL_LLVM=ON "-DCMAKE_PREFIX_PATH:PATH=$LLVM_PREFIX"
cmake --build dependencies/build
# Build remill
cmake -G Ninja -B build "-DCMAKE_PREFIX_PATH:PATH=$(pwd)/dependencies/install" -DCMAKE_BUILD_TYPE=Release
cmake --build build
Linux with system LLVM:
# Build dependencies with external LLVM
cmake -G Ninja -S dependencies -B dependencies/build -DUSE_EXTERNAL_LLVM=ON
cmake --build dependencies/build
# Build remill
cmake -G Ninja -B build "-DCMAKE_PREFIX_PATH:PATH=$(pwd)/dependencies/install" -DCMAKE_BUILD_TYPE=Release
cmake --build build