Vlad Ioan Topan

My playground

Patching (resolving) imports in a PE file (Python/pefile)



Although pefile is by far the most widely used PE parsing / patching library for Python, the documentation is scarce to say the least, and even code samples performing non-basic tasks are hard to come by, particularly code that exploits the internal structures of pefile to patch the contents of the binary.

Working on a PE malware emulator, one of the challenges is plausibly faking the imports of the target PE, which involves at least resolving the imports of the emulated PE image. Since this somewhat trivial task when parsing PEs manually in C or assembly turned out to require a couple of hours of figuring out the internals of pefile to do it in Python, I might as well document the process to save myself (and hopefully others) the trouble when I try to do it again in a few months.

Technical background

The IAT (import address table) is the structure in which PEs (Windows executable files) store information about functions imported from external libraries (DLLs), but since you’re reading this you most likely already know that – this topic has been done to death by now either way, so going through it again would serve little purpose (check out the classic Iczelion tutorial to freshen up, or Ero Carrera’s PE header format).

Resolving the imports is the process through which, after a file has been loaded in memory by the Windows loader and all the required DLL libraries have been loaded as well, the IAT is filled in with the actual, runtime addresses of the imported functions based on the DLLs’ EATs (export address tables). Specifically, the import lookup table (ILT, its RVA is in OriginalFirstThunk) is iterated in parallel with the IAT (RVA in FirstThunk), and based on the DLL name + each function’s name (or ordinal), the IMAGE_THUNK_DATA.AddressOfData field in each IAT entry is set to the VA of the corresponding function’s implementation in memory. As a sidenote, the respective fields are already filled in for the bound imports (similar structures, but pointed to by the IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT data directory).

Annotated code

Getting parsed imports from pefile is straightforward, but getting the file offsets / RVAs to patch is a bit more convoluted; the fact that, for the sake of resilience, pefile appears to only keep either the IAT or the ILT (whichever it successfully manages to parse) complicates matters a bit as well. These are the steps:

  • parse the file, use fast_load=True for performance and explicitly parse imports:
    • pe = pefile.PE(data=data, fast_load=True), pe.parse_data_directories(directories=[1])
  • iterate pe.DIRECTORY_ENTRY_IMPORT, which is a list of ImportDescData (iid); for each:
    • get the DLL name (iid.dll)
    • get the ILT RVA (iid.struct.OriginalFirstThunk) and IAT RVA (iid.struct.FirstThunk)
    • use pe.get_import_table() for each RVA to get the actual table => ilt / iat
    • for each entry (with index idx) in ilt (if ilt is None, use iat):
      • the hint RVA is in ilt[idx].AddressOfData
        • if the hint RVA’s MSB is set, import by ordinal
        • otherwise, import by name (the name is at hint RVA + 2)
      • using the name or the ordinal, find the function’s VA (e.g. by parsing the respective DLL’s export table)
      • the file offset of iat[idx].AddressOfData (which is where the VA goes) can be obtained via iat[idx].get_field_absolute_offset('AddressOfData')
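The ordinal / name decision in the steps above can be sketched in isolation. This is a minimal illustration rather than pefile's own logic: decode_thunk and name_rva are hypothetical helper names, and the value passed in is assumed to be what pefile exposes as AddressOfData.

```python
def decode_thunk(value, bits=32):
    """Classify one ILT/IAT entry value.

    Returns None for the terminating null entry, ('ordinal', n) when the most
    significant bit is set, and ('name', rva) otherwise, where rva points at
    the IMAGE_IMPORT_BY_NAME structure (a 2-byte hint followed by the name).
    """
    if bits not in (32, 64):
        raise ValueError('bits must be 32 or 64')
    if value == 0:
        return None
    ordinal_flag = 1 << (bits - 1)      # MSB of a pointer-sized entry
    if value & ordinal_flag:
        return ('ordinal', value & 0xFFFF)
    return ('name', value)


def name_rva(hint_rva):
    """RVA of the function name: it follows the 2-byte hint."""
    return hint_rva + 2
```

Note that the same 32-bit value can mean different things depending on the PE type, which is why the flag is derived from the entry size instead of being hardcoded.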

Alternatively, if you’re working in memory directly, use pe.get_memory_mapped_image(ImageBase=...) to get the PE as the loader would set it up in memory and use RVAs directly to edit the buffer.
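For the in-memory variant, a sketch of the patching step might look like the following. patch_iat_entry is a hypothetical helper; it assumes the image is a bytearray copy of what pe.get_memory_mapped_image() returns, so an RVA is simply an index into the buffer.

```python
import struct


def patch_iat_entry(image, entry_rva, func_va, bits=32):
    """Write func_va into the IAT slot at entry_rva of a memory-mapped image.

    image is a bytearray laid out as the loader would map the PE, so no
    RVA-to-file-offset conversion is needed.
    """
    fmt = '<Q' if bits == 64 else '<L'      # IAT entries are pointer-sized
    struct.pack_into(fmt, image, entry_rva, func_va)
    return image


# Usage sketch (needs pefile and a real PE; the variable names are assumptions):
#   image = bytearray(pe.get_memory_mapped_image(ImageBase=0x400000))
#   entry_rva = iid.struct.FirstThunk + idx * (bits // 8)
#   patch_iat_entry(image, entry_rva, imp_va, bits)
```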

Full annotated source:

#!/usr/bin/env python3
"""
Sample code which resolves the imports of a PE using pefile.

Author: Vlad Topan (vtopan/gmail)
"""
import pefile

import struct
import sys

data = bytearray(open('test.exe', 'rb').read())    # bytearray to allow modification
pe = pefile.PE(data=data, fast_load=True)
pe.parse_data_directories(directories=[1])  # 1 = pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_IMPORT']
bits = 64 if pe.PE_TYPE == pefile.OPTIONAL_HEADER_MAGIC_PE_PLUS else 32
# IAT entries are pointer-sized: 4 bytes for PE32, 8 bytes for PE32+
ptr_size, ptr_fmt = (8, '<Q') if bits == 64 else (4, '<L')

ordinal_flag = 2 ** (bits - 1)
for iid in pe.DIRECTORY_ENTRY_IMPORT:   # array of ImportDescData
    dll_name = iid.dll.decode('ascii', errors='replace')
    ilt_rva = iid.struct.OriginalFirstThunk
    ilt = pe.get_import_table(ilt_rva)
    iat_rva = iid.struct.FirstThunk
    iat = pe.get_import_table(iat_rva)
    if iat is None:
        sys.stderr.write(f'[!] Failed parsing IAT @ RVA 0x{iid.struct.FirstThunk:x}!\n')
        continue    # nothing to patch for this DLL
    if ilt is None:
        # broken ILT, use IAT as source as well as dest.
        ilt, ilt_rva = iat, iat_rva
    # sample code in pefile.parse_imports(), around line 4046 (as of writing)
    for idx in range(len(ilt)):
        # iterate through both tables, get function name from ILT and place address in IAT
        hint_rva = ilt[idx].AddressOfData
        if hint_rva:            # RVA of IMAGE_IMPORT_BY_NAME, required
            if hint_rva & ordinal_flag:
                # import by ordinal
                ordinal = hint_rva & 0xFFFF
                imp_va = ...    # get the VA from dll_name:ordinal
            else:
                # import by name, hint_rva is the RVA of the IMAGE_IMPORT_BY_NAME struct
                hint = pe.get_word_from_data(pe.get_data(hint_rva, 2), 0)
                fun_name = pe.get_string_at_rva(ilt[idx].AddressOfData + 2, pefile.MAX_IMPORT_NAME_LENGTH)
                if not pefile.is_valid_function_name(fun_name):
                    sys.stderr.write(f'[!] Invalid imported function name {fun_name}!\n')
                fun_name = fun_name.decode('ascii', errors='replace')
                imp_va = ...    # get the VA from dll_name:fun_name
            # patch the IAT (the entry's .AddressOfData @ the corresponding idx)
            file_offs = iat[idx].get_field_absolute_offset('AddressOfData')
            data[file_offs:file_offs + ptr_size] = struct.pack(ptr_fmt, imp_va)
        else:
            sys.stderr.write(f'[!] AddressOfData @ index {idx} in table @ RVA {ilt_rva} is empty!\n')
# write changed data
open('test-patched.exe', 'wb').write(data)
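
The listing leaves the actual lookup (the imp_va = ... lines) open. One possible way to fill it, sketched under assumptions, is to build name and ordinal maps from the DLL's export table. build_export_map and load_exports are hypothetical helpers; they rely on the symbols list pefile exposes as DIRECTORY_ENTRY_EXPORT.symbols, and an emulator may well prefer fake addresses instead.

```python
def build_export_map(symbols, image_base):
    """Map exported names and ordinals to VAs.

    symbols is an iterable of objects with .name (bytes or None), .ordinal,
    .address (an RVA) and .forwarder attributes. Forwarded exports are skipped
    because their RVA points at a "DLL.Function" string, not at code.
    """
    by_name, by_ordinal = {}, {}
    for sym in symbols:
        if getattr(sym, 'forwarder', None):
            continue
        va = image_base + sym.address
        by_ordinal[sym.ordinal] = va
        if sym.name:
            by_name[sym.name.decode('ascii', errors='replace')] = va
    return by_name, by_ordinal


def load_exports(dll_path, image_base=None):
    """Parse a DLL from disk with pefile and build its export maps."""
    import pefile   # imported lazily so build_export_map() works without it
    dll = pefile.PE(dll_path, fast_load=True)
    dll.parse_data_directories(
        directories=[pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_EXPORT']])
    if image_base is None:
        image_base = dll.OPTIONAL_HEADER.ImageBase
    exports = getattr(dll, 'DIRECTORY_ENTRY_EXPORT', None)
    return build_export_map(exports.symbols if exports else [], image_base)
```

With these maps, the by-name branch would become something like imp_va = by_name[fun_name] and the by-ordinal branch imp_va = by_ordinal[ordinal], with one pair of maps cached per dll_name.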

Written by vtopan

April 12, 2019 at 5:17 PM

Posted in Uncategorized


Writing a hex viewer / editor widget with Python 3 and Qt5 (PySide2)


What & why

Unlike most other kinds of widgets (which can be found ready-made on the web), hex viewers seem to be in short supply. Needing one for a binary analysis tool I’m working on, I went through the hoops and decided to document the trickier parts of the process. I’ll bird’s-eye-review Python GUIs in general first and get into the specifics of the hex viewer / editor later on; feel free to skip to Designing & implementing the hex viewer / editor if you’re familiar with PySide2.


Like most of the code I write nowadays, all that follows is written in Python 3. Most of the concepts work similarly in Python 2, but, publishing this in late 2018, I hope that’s a moot point.

Python GUIs

Building a GUI in Python nowadays gives you a lot of options (quite against Python’s “one way to do it” mantra, but GUIs have never been Python’s strong point). Some “classic” UI toolkits:

  • TCL/TK, aka tkinter is builtin on Windows (and readily available on other OSes via something analogous to sudo apt install python3-tk); it’s easy to use, but less popular nowadays, so not much of an ecosystem exists
  • wxWidgets / wxPython, a strong contender; it uses native UI components for those OSes which provide them (making it blend in better on Windows and macOS); somewhat less popular than its rival Qt
  • Qt 5 (PySide2 in Python, see below for details) is the most popular option, mostly due to its excellent documentation and immense ecosystem

Some domain-specific and / or higher level options exist as well, among which:

  • pyforms – an interesting attempt to expose the same interface as a GUI (PyQt-based), a terminal console and a web application
  • toga – a “Python native, OS native GUI toolkit” which will likely be a great choice once it matures and forms an ecosystem
  • pywebview – a web-based, lightweight, cross-platform UI

My personal preference is Qt mainly due to the afore-mentioned ecosystem and documentation, which mean most things will be readily available and thoroughly tested / used in production, but I’ve also learned to love its API (which may seem weird initially if it’s your first proper GUI API, but starts to make a lot of sense and becomes very predictable once you get used to it).


The Python bindings for Qt come in two flavors: PyQt and PySide. A while ago they could (in most cases) be used as drop-in replacements for each other, but they grew apart (as most people do) – see this page for a (one-sided) view of the details. The PySide variant is slightly more pythonic (though there’s less of a difference nowadays) and it’s licensed LGPL, so it’s easier (read: cheaper, free in fact) to use in commercial products. The original PySide (for which far more documentation is available online) was a binding to Qt4; the contemporary one is PySide2, which binds to Qt5 – keep this in mind when looking at code examples online, as most will be for Qt4 / PySide.

PySide2 application skeleton

First, install PySide2 (pip3 install pyside2 should work on most platforms). To kick things off, we’ll start with a generic PySide2 application skeleton.

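#!/usr/bin/env python3

from PySide2 import QtGui, QtWidgets
from PySide2.QtCore import Qt, SIGNAL

import sys

class MyMainWindow(QtWidgets.QMainWindow):
    def __init__(self):
        super().__init__()
        self.main = QtWidgets.QWidget()  # central widget
        self.setCentralWidget(self.main)

def run_ui():
    app = QtWidgets.QApplication()
    app.setApplicationName('myapp')  # also used as the default window title
    window = MyMainWindow()
    window.show()
    sys.exit(app.exec_())            # hand control to the Qt event loop

if __name__ == '__main__':
    run_ui()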

Running the above program should produce a GUI titled “myapp”; if it doesn’t, some error message should explain why (probably either PySide2 or Python itself are not installed or the latter is the wrong version).

Hint: Save the source with a .pyw extension on Windows to get rid of the console app that pops up in the background.

The above code can be used as a skeleton for the final app (with proper spacing and docstrings of course, which I’ll ignore for brevity). The imports are all you’ll need until the end:

  • QtWidgets has all the UI components (“widgets”)
  • QtGui has the other UI stuff like fonts (QFont) and, for some reason, shortcuts (QKeySequence)
  • Qt has general-purpose constants, like CaseInsensitive
  • SIGNAL is used to bind components to user interaction events (e.g. receiving focus); note: more “native” alternatives do exist to do that, but signals are an important enough concept in Qt (and GUI programming in general) to be worth a mention

The main window is defined by the MyMainWindow class, inherited from QtWidgets.QMainWindow. Any other widget would have worked, e.g. having a simple button-app: hello = QtWidgets.QPushButton("Hello!"); hello.show() instead of defining a new class (and inheriting from the generic QWidget is sometimes used in examples), but further down the line inheriting the main application window from QMainWindow will become useful. QMainWindow needs a “central widget”, for which we supply a generic QWidget – this would contain all the useful UI components (think of the “main form” in VC or Delphi). Once we define the hex viewer widget, we can alternatively use that as the main window as well (just instantiate it and call .show() on it instead of declaring and .show()ing MyMainWindow).

After defining the main window we:

  • instantiate an application and set its title
  • show the main window
  • pass control to the application’s event handling loop by running app.exec_()

Hooking a keypress event works like this: self.connect(QtWidgets.QShortcut(QtGui.QKeySequence("Esc"), self), SIGNAL('activated()'), sys.exit)

Designing & implementing the hex viewer / editor

The full source code is available for reference on github; to keep things readable I’ll only provide brief snippets throughout the text, but they should match the full source code.

Getting the data to be displayed

The easier thing to get out of the way is the data source: we want the component to receive either a block of binary data (a bytes or bytearray variable) or a filename (and read from / write to that). We create a DataView class which can be initialized with a data or filename variable and with a readonly flag which controls what sort of access we want to the data. At initialization we simply store the values and set a ._data field to None; we’ll lazily map that to a mmaped view of the file on the first access.

We expose the actual data as a .data property (a .data() method decorated with @property). The function checks if the private ._data field is set, and if so, returns that. Otherwise, for files, it opens the file and mmaps it (respecting the readonly flag) and points the ._data field to the mmap object; for raw buffers, ._data is pointed at the bytes buffer (transformed into bytearray if readonly is false to allow modification). A .close() method closes the open file and mmap (if any) and points ._data back to None, which means it can be used to release the file handle when not in use (it will be automatically reopened on the next access).

To allow the object to be indexed and sliced directly (as opposed to using the .data property) we override __setitem__, __getitem__ and __len__.
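
A condensed sketch of such a class, written from the description above rather than copied from the actual source, could look like this. The ._src and ._fh field names are assumptions made for the example; note that mmap refuses to map an empty file, which the sketch does not handle.

```python
import mmap


class DataView:
    """Lazy view over a bytes buffer or a file."""

    def __init__(self, data=None, filename=None, readonly=True):
        if (data is None) == (filename is None):
            raise ValueError('pass exactly one of data / filename')
        # keep a mutable copy of raw buffers when modification is allowed
        self._src = data if (readonly or data is None) else bytearray(data)
        self.filename = filename
        self.readonly = readonly
        self._data = None
        self._fh = None

    @property
    def data(self):
        if self._data is None:
            if self.filename is not None:
                # (re)open and map the file on first access
                self._fh = open(self.filename, 'rb' if self.readonly else 'r+b')
                access = mmap.ACCESS_READ if self.readonly else mmap.ACCESS_WRITE
                self._data = mmap.mmap(self._fh.fileno(), 0, access=access)
            else:
                self._data = self._src
        return self._data

    def close(self):
        """Release the mmap and file handle; the next access reopens them."""
        if self._fh is not None:
            self._data.close()
            self._fh.close()
            self._fh = None
        self._data = None

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        if self.readonly:
            raise TypeError('DataView is read-only')
        self.data[key] = value

    def __len__(self):
        return len(self.data)
```

Indexing with an integer yields an int and slicing yields bytes-like data for both backends, so the viewer code does not need to know whether it is looking at a file or a buffer.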

The hex viewer / editor widget


The most common layout of a hex viewer involves three synchronized vertical sections: addresses (we’ll call this “addr”), hex values of each byte (“hex”) and the ASCII interpretation of each byte, where possible (“text”). The number of bytes represented per line should be configurable, but it’s usually a power of two (8 and 16 are the most common values). A status bar (QStatusBar) at the bottom should indicate (at least) the currently selected offset in the file.

For a given GUI window size we’ll have a fixed number of displayable rows based on the font’s max height and a preconfigured number of columns (“cols”), so we can display (at most) a buffer of rows x cols bytes (“perpage”) starting at a given offset in the file (we’ll call that “offs”). We only want to read perpage bytes from the file at a time (imagine viewing a 4 GB file by first loading all of it in memory to see why), and that will be the hex viewer’s current “data”. This means that whichever component we use to display the data, the vertical scrollbar will need to be controlled separately, based on the actual file size.
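
The arithmetic behind the data window and the separately controlled scrollbar can be sketched as two small functions. page_geometry and page_window are hypothetical names, and the sketch assumes the scrollbar value counts rows (not bytes), which is one reasonable choice and not necessarily what the actual source does.

```python
def page_geometry(size, rows, cols):
    """Return (perpage, total_rows, max_scroll) for a file of `size` bytes."""
    perpage = rows * cols
    total_rows = (size + cols - 1) // cols      # ceil(size / cols)
    max_scroll = max(0, total_rows - rows)      # scrollbar range is 0..max_scroll
    return perpage, total_rows, max_scroll


def page_window(size, row, rows, cols):
    """Return the (start, end) byte range shown when the scrollbar is at `row`."""
    start = min(max(row, 0) * cols, max(size - 1, 0))
    start -= start % cols                       # keep the window row-aligned
    return start, min(start + rows * cols, size)
```

The max_scroll value is what would be fed to the scrollbar's maximum after each resize, and page_window gives the slice to read from the data source for a given scrollbar position.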

After attempting to use three separate components to display the three columns (“addr”, “hex” and “text”) and failing consistently to correctly synchronize the rows (slight padding, border and behavior differences between the component types’ rows make that impossible) I resorted to using a single table component of type QTableView (first I tried QTableWidget which inherits from QTableView and is higher level, but that made simulating the data window very complicated without providing enough benefits). The addresses are rendered as the table’s vertical labels, the first cols columns contain the hex-encoded values of each byte and then cols more columns contain each byte’s textual representation. This simplifies editing individual bytes and synchronizing selections between hex and text.

Qt organizes widgets by grouping them in “layouts” rather than by fixed coordinates inside the window (it can also do that to some level, but prefers not to). We therefore group the “hexview” table and its separate vertical scrollbar (QtWidgets.QScrollBar(Qt.Vertical)) into a horizontal layout (QtWidgets.QHBoxLayout) called “l2”, and then group that and the bottom status bar into a vertical QVBoxLayout which we set as the main layout of our widget.

To proxy the data (the contents) of the hexview table we create a QStandardItemModel object (called “dm”) through which we control the labels displayed in the table (vertical and horizontal labels), the contents of the cells and the colors.

After the UI components are set up in the class initialization, the workhorse is the .jump(offset) method, which loads perpage bytes from the data source and fills in the hexview table, updating the address labels accordingly. It is called initially to populate the hexview and bound to scroll up / down and window resize events to update the data contents when needed.
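
Leaving Qt aside, the part of .jump() that turns the current data window into address labels and table cells can be approximated as below. render_rows is a hypothetical name, and the label width and the '.' placeholder for non-printable bytes are assumptions of this sketch.

```python
def render_rows(chunk, offs, cols=16):
    """Turn the current data window into (address_labels, table_rows).

    Each table row has 2 * cols cells: cols hex cells followed by cols text
    cells; a short last row is padded with empty strings.
    """
    labels, rows = [], []
    for pos in range(0, len(chunk), cols):
        line = chunk[pos:pos + cols]
        hex_cells = [f'{b:02X}' for b in line]
        text_cells = [chr(b) if 32 <= b < 127 else '.' for b in line]
        pad = [''] * (cols - len(line))
        labels.append(f'{offs + pos:08X}')
        rows.append(hex_cells + pad + text_cells + pad)
    return labels, rows
```

Each string in a row would then become one cell of the model, and the labels would be set as the table's vertical header labels.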

Bits and pieces

The actual implementation is, as mentioned, available online; I’ll only go through some interesting details and gotchas.

  • figuring the row height and count – we compute this initially and after each resize (we get row_height based on the font height plus padding):

font = QtGui.QFont('Terminal', 8)
metrics = QtGui.QFontMetrics(font)
row_height = metrics.height() + 4
rows = (hexview.height() - hexview.horizontalHeader().height() - 6) // row_height
  • the table by default resizes itself to the cell contents; we don’t want that: hexview.horizontalHeader().setSectionResizeMode(QtWidgets.QHeaderView.Fixed)
  • on selection change (see the .sel_changed() method), to avoid flickering, we compute the difference between the previous selection and the new one and highlight accordingly
  • styling the table is partly done through CSS using pseudo-elements like ::section, ::section:checked and ::section:hover for the header and ::item, ::item:focus and ::item:hover, and partly via directly coloring the background of selected cells using e.g. dm.itemFromIndex(hexview.selectionModel().currentIndex()).setBackground(QtGui.QColor('#FFFF00')) and setting the text alignment and brush color when filling in the table cells (in .jump()) – this certainly can be implemented a bit more cleanly, but neither CSS alone nor method calls on objects alone allow proper access to all colors
  • the formatting is applied each time the contents change because the actual cell objects are recreated; this can probably be avoided
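
The selection-difference idea from the list above boils down to two set subtractions. The sketch below is independent of Qt and uses hypothetical names; cells are assumed to be (row, column) tuples, and mirror_cell relies on the table having cols hex columns followed by cols text columns.

```python
def selection_delta(previous, current):
    """Return (to_clear, to_highlight) given two collections of (row, col) cells."""
    previous, current = set(previous), set(current)
    return previous - current, current - previous


def mirror_cell(cell, cols):
    """Map a hex cell to its text twin and vice versa."""
    row, col = cell
    return (row, col + cols) if col < cols else (row, col - cols)
```

Only the cells in the two returned sets (and their mirrored twins) need to be restyled, which is what avoids repainting the whole table on every selection change.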


The implementation is (almost) an MVP at this point – it works, but it does the absolute bare minimum and the UI looks less than great. Popup menus to allow better control over copy & pasting, more information in the status bar etc. would be welcome and will be added someday. It is however the only hex view / edit widget for Qt (or tkinter / wxPython for that matter) easily available online (that I could find), so it has that going for it. Now that this text exists, it can also serve as a (counter?)example on how to do it and hopefully help usher in the appearance of a proper hex widget someday.

Written by vtopan

November 22, 2018 at 5:14 PM

Posted in Uncategorized


Using a dark GTK 2 theme for Sylpheed in Windows


What & why

The attempt to create similar development environments under Windows and Linux is a steep uphill climb – starting with the essentials (such as the development environment), but getting more and more frustrating as you get into aesthetics. Making something as “basic” as vim (or even better neovim) work (almost) the same cross-OS is a very interesting learning experience, and getting plugins to play nice with both kinds of paths, and library dependencies, and compilers, and linting for 5-10 languages, and code completion, and so on will either kill you or make you a better man (or woman, or helicopter, as the case may be). Making apps look the same on more than one OS is nigh impossible, and today’s topic is turning off the lights in Sylpheed (a marginally popular but very fast email client) under Windows.


Unlike many Windows apps which have some support for skins and/or themes, Sylpheed doesn’t. Under Linux this makes perfect sense, as that task is relegated to the window manager – more precisely to the UI toolkit, in this case GTK+ v2. The upshot of skinning the UI toolkit directly is that (in theory) applications have a unified look & feel. The downside is that there are several UI toolkits in use, and GTK+ alone has two incompatible versions widely employed by apps (2 and 3). This means that most cross-platform UI apps will either choose a toolkit or implement (at least) two versions, usually one for GTK+ and one for Qt. This leads us to the following question:

Which UI toolkit is my cross-platform app using?

Some software will have two binaries, named something along the lines of something-qt.exe and something-gtk.exe. Sylpheed has a single sylpheed.exe binary, so the name itself is of no help. Looking through its imported DLLs however will reveal libgtk-win32-2.0-0.dll, which means it’s GTK+ v2 (the 2.0 in the name means v2). If looking at imported DLLs sounds like jargon, just look through the program’s folder and subfolders for files ending in .dll. Qt*.dll means Qt (usually QtCore4.dll for v4 or Qt5Core.dll for v5), libgtk-*-[23].*.dll means GTK. If no such patterns show up, the binary is most likely built against the native Windows GUI (i.e. no cross platform toolkit is used), and if that’s the case, the UI will look “Windows-like” and can be styled using standard Windows themes.
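
The folder scan described above is easy to script. The following is a rough sketch using only the two filename patterns from the text; guess_toolkits and guess_toolkits_in_folder are hypothetical names, and a match is only a hint, not proof of what the binary links against.

```python
import fnmatch
import os

# Patterns taken from the text above (lowercased); labels are just for display.
TOOLKIT_PATTERNS = [
    ('Qt', 'qt*.dll'),
    ('GTK+', 'libgtk-*-[23].*.dll'),
]


def guess_toolkits(filenames):
    """Return the set of toolkit labels whose DLL pattern matches any file name."""
    found = set()
    for name in filenames:
        lower = os.path.basename(name).lower()
        for toolkit, pattern in TOOLKIT_PATTERNS:
            if fnmatch.fnmatchcase(lower, pattern):
                found.add(toolkit)
    return found


def guess_toolkits_in_folder(folder):
    """Walk a program folder (and subfolders) and classify the DLLs found."""
    names = []
    for _root, _dirs, files in os.walk(folder):
        names.extend(files)
    return guess_toolkits(names)
```

An empty result suggests (but does not guarantee) a native Windows GUI.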

Styling GTK+ 2

The look of GTK+ applications can be styled in great detail, but in incompatible ways between versions 2 and 3. Adding to the challenge, the folder paths where configuration and theme files must be placed are far less obvious on Windows (Linux has several standards on the topic but, although most distros have nuanced interpretations of said standards or just strong historical preferences, it’s generally obvious where configuration files must go).

GTK+ 2 expects a gtkrc file containing the styling configuration to be present in one of several predefined locations (e.g. ~/.gtkrc-2.0). To use a theme, the file must contain an entry like gtk-theme-name = "Sable-NC-3.14" (“Sable-NC-3.14” being the theme’s name), and the ~/.themes folder (or the global /usr/share/themes folder) must contain a Sable-NC-3.14 subfolder which in turn has a gtk-2.0 subfolder containing (at least) a gtkrc file describing the colors, fonts, icons etc. to be used.

On Windows, some apps look for the XDG_CONFIG_HOME environment variable, others don’t. Some use the %USERPROFILE% folder directly (usually C:\Users\<username>), others use the %LOCALAPPDATA% folder (usually C:\Users\<username>\AppData\Local). GTK+ 2 looks for folders named .gtkrc-2.0 and .gtk directly in %USERPROFILE%, so either can be used. The themes go in the C:\Users\<username>\.themes folder.
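
The file layout described in the last two paragraphs can be checked and set up with a few lines of Python. This is a sketch under assumptions: select_theme and theme_gtkrc_path are hypothetical helpers, the configuration file is assumed to be named .gtkrc-2.0 as stated above, and home would be the %USERPROFILE% folder on Windows.

```python
import os


def theme_gtkrc_path(home, theme_name):
    """Path of the gtkrc file a GTK+ 2 theme is expected to provide."""
    return os.path.join(home, '.themes', theme_name, 'gtk-2.0', 'gtkrc')


def select_theme(home, theme_name):
    """Write <home>/.gtkrc-2.0 selecting theme_name and return the file's path.

    Raises FileNotFoundError if the theme is not unpacked under <home>/.themes.
    """
    theme_rc = theme_gtkrc_path(home, theme_name)
    if not os.path.isfile(theme_rc):
        raise FileNotFoundError(theme_rc)
    rc_path = os.path.join(home, '.gtkrc-2.0')
    with open(rc_path, 'w', encoding='utf-8') as fh:
        fh.write(f'gtk-theme-name = "{theme_name}"\n')
    return rc_path
```

A call such as select_theme(os.environ['USERPROFILE'], 'Sable-NC-3.14') would then be followed by restarting the application. Note that the sketch overwrites any existing .gtkrc-2.0.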

Guessing expected configuration file paths

But how do you figure out the paths without going through countless obsolete and / or irrelevant documentation pages? The cheating option is to watch Sylpheed’s file access attempts to guess what the expected paths are, and to do that an excellent option is Microsoft’s ProcMon. Download and run it, then ensure only file accesses are logged by leaving only the file cabinet icon highlighted on the toolbar:

Next, start Sylpheed and look for events with the process name sylpheed.exe. To make things easier, right-click on one of the sylpheed.exe cells in the “Process Name” column and select “Include ‘sylpheed.exe’” – this will filter out events from other processes. To only see paths containing “gtk”, right-click on any path in the “Path” column and choose “Edit Filter ‘…’”, select “contains” instead of “is” from the second drop-down, type gtk in the text box and press “Add”. The file access attempts of interest (the ones inside C:\Users) should look like in the following picture:

The “Result” column should be filled with “NAME NOT FOUND” or “NO SUCH FILE”, but we do get the file / folder names. The configuration file should be either at c:\Users\<username>\.gtkrc-2.0 or in a .gtk subfolder in that folder.

Finding a GTK+ 2 dark theme

The Internet has mostly moved on from GTK+ 2 to 3, but since some common apps haven’t, themes can still be found for GTK+ 2. Either try a web search for “gtk 2 dark theme” or try the traditional repository at gnome-look.org. I got the least eye-straining look from the Sable NC theme (the plain one, not the colored ones).

Putting it all together

The end result of the theme search should be an archive (to open *.tar.gz archives try the 7z archiver or WinRAR, or a proper file management tool like Total Commander). The archive should at some level have a gtk-2.0 folder, and that’s what we’re after. Create a c:\Users\<username>\.themes\<theme-name> folder for each theme you want to try out and place the gtk-2.0 subfolder from the theme archive in it. Then create a c:\Users\<username>\.gtkrc-2.0 file containing gtk-theme-name = "<theme-name>" and restart Sylpheed. The end result should look something like this:

Sylpheed – Sable NC dark theme

Written by vtopan

November 21, 2018 at 8:03 PM

Posted in Uncategorized


Setting Up a Pentesting Lab



October, 2015 – version 1.0


There’s a recent surge of information about pentesting on the Internet (as opposed to, say, a few years ago, when finding relevant information took you on an exciting ride through the dark side of the Internet, occasionally grabbing a few parasites along the way). Pentesting is a relatively new business, "extorting" amazing amounts of money from organizations on the premise of bringing safety to their data. I say extorting because on one hand, most evil "hacking" nowadays involves breaking your website rather than your Internet-exposed, non-web servers via vulnerabilities in software out of your control, so the better solution is to permanently audit your code, not to see if it can be broken at a specific moment in time. On the other hand, given the volume of software you now depend on for running an Internet-facing server, the probability of exploitable vulnerabilities being present is about one, and there’s very little you can do about that. Properly configured software and piling on defensive mechanisms would help, but in any large enough network that quickly becomes unfeasible. Still, people are willing to pay astounding amounts of money for the illusion of safety, so here we are.


Pentesting is a code word for hacking, but when a "good guy" does it (in order to assess the security status of a network). That guy used to be called a "white hat hacker", but the less emphatic "pentester" is now used. The full name "penetration testing" is also used, but the shorthand "pentesting" is preferred, probably because it does not contain the word penetration.

A pentesting lab allows you, the "security researcher", to practice exploiting software vulnerabilities and to develop exploits for newly found vulnerabilities. It essentially involves:

  • some sort of virtualization solution (usually VirtualBox or VMWare Player, but if you’re ambitious enough, a dedicated host running something like ESXi is better);
  • inside that environment, a variety of vulnerable operating systems ("targets") are deployed; some as old as Windows XP / Ubuntu 6.06 for studying basic buffer overflows or ancient bugs, and some as recent as Windows 8.1 (or even 10) / Ubuntu 15.10 for testing the latest privilege escalation exploits;
  • on the VMs you want vulnerable software (which sometimes is the actual OS, but more often third party software) and debugging tools (such as the Immunity Debugger);
  • an additional VM is often used as a pentesting OS, because having the tools pre-packaged and playing well together out of the box is nice to have; Kali is the most common choice, but other options do exist.

Virtualization software

Installing a free virtualization solution such as VMWare Player or VirtualBox should be easy enough; for the rest of this text I’ll be describing VirtualBox, but the same concepts apply to any other similar software.

Setting up networking

The target VMs and the pentesting VM should be able to connect to each other, but only the pentesting VM may have occasional access to the Internet (for updates), preferably when not connected to the network containing the target VMs. To achieve this, an internal network is used by configuring the virtual network adapter for each target VM this way: VM right click->Settings…->Network->Attached to: Internal Network; the default "intnet" name may be used, or a custom one can be specified. All VMs connected to an internal network with the same name will "see" each other by magic performed by VirtualBox. They will also get a dynamic, unique IP address via DHCP, if it is enabled for the given internal network. To enable it, run something like this in a console:

"c:\Program Files\Oracle\VirtualBox\VBoxManage" dhcpserver add --netname "intnet" --ip --netmask --lowerip --upperip --enable

Change the "intnet" name to the same one you use for the VMs, and the IP ranges if needed. To make sure it’s created and enabled, run:

"c:\Program Files\Oracle\VirtualBox\VBoxManage" list dhcpservers

Doing this while a target VM is running may be glitchy (the DHCP server won’t work); restarting both the VM and VirtualBox might help with that.
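
When the lab is rebuilt often, the dhcpserver add invocation can be generated from a script. The sketch below only builds and validates the argument list; dhcp_add_args is a hypothetical helper, the flags are the ones shown in the command above, and any addresses passed to it (such as the 10.10.10.x values in the usage note) are made-up examples, not values taken from this text.

```python
import ipaddress

VBOXMANAGE = r'c:\Program Files\Oracle\VirtualBox\VBoxManage'


def dhcp_add_args(netname, ip, netmask, lowerip, upperip):
    """Build the argument list for 'VBoxManage dhcpserver add'.

    Checks that the lease range lies inside the server's network before
    returning, so typos are caught before VirtualBox is touched.
    """
    network = ipaddress.ip_network(f'{ip}/{netmask}', strict=False)
    low, high = ipaddress.ip_address(lowerip), ipaddress.ip_address(upperip)
    for addr in (low, high):
        if addr not in network:
            raise ValueError(f'{addr} is outside {network}')
    if low > high:
        raise ValueError('lowerip must not be above upperip')
    return [VBOXMANAGE, 'dhcpserver', 'add', '--netname', netname,
            '--ip', ip, '--netmask', netmask,
            '--lowerip', lowerip, '--upperip', upperip, '--enable']
```

For example, subprocess.run(dhcp_add_args('intnet', '10.10.10.1', '255.255.255.0', '10.10.10.100', '10.10.10.200'), check=True) would create and enable the server for the default internal network, assuming VirtualBox is installed at the path above.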

If you want the host to be connected to the same network as the targets (use this option with great care, especially if you run untrusted binaries, such as precompiled exploits, on the target VMs), use host only adapters, which connect the VMs to each other and to the host.

On the VM side of things, enable DHCP networking and make sure the default firewall is disabled (all Windows versions since XP have it on by default), or at least poke holes through it for the apps you want to exploit remotely.

Setting up the target OSes for virtualization

On VMs where they are available, the VirtualBox Guest Additions greatly improve performance. The Guest Additions CD is part of the VirtualBox Extension Pack and must be downloaded separately (due to different licensing); it is available on the VirtualBox download page and is cross platform. After installing the Extension Pack, run the VM and click Devices->Install Guest Additions CD Image…. The process is automatic on Windows; on Linux, you may need to mount the CDROM and run the installation manually.

Shared folders are helpful to install software on the VMs (never give read-write access to entire volumes; create a specific shared folder and use that).

The shared clipboard is useful to copy values and commands to/from the host (Devices->Shared clipboard->Bidirectional).

Virtual machines


Download Kali and install it. Configure Metasploit and you’re good to go.

If your host is also a Linux machine, don’t forget to put your SSH public key in Kali root’s authorized_keys to simplify the process.

De-ICE 1.123

As a first target VM, a pre-built VM specially designed for pentesting such as De-ICE 1.123 is a good choice, as it has a set of vulnerable applications already installed and configured. De-ICE does not need to be installed (the OS runs directly from the ISO image and thus changes to the disk are non-persistent), so the VM does not require an attached hard disk.

Windows XP

As a second VM, Windows XP should be an easy target (both for the installation process, and for any actual pentesting). You need an ISO image of the OS installation disk (as with all the other OSes for the lab). Microsoft provides images for MSDN subscribers (e.g. Windows 8.1), but only for their currently supported OSes (8.1 and 10 at this time), and obviously only if you’ve purchased them. Downloadable ISOs for Windows XP SP3 still exist (e.g. xpsp3_5512.080413-2113_usa_x86fre_spcd.iso), but since Microsoft no longer supports Windows XP, I expect them not to be available for long. Using an old installation CD you have is also an option; use a CD ripping tool to obtain the ISO image.

This version of Windows (particularly pre-SP3) employs very few defenses against software exploitation, so it’s particularly useful for learning simple overflow exploits without having to worry about DEP/ROP.

Ubuntu 15.10

A recent Linux, such as the latest version of Ubuntu, is a good choice as a target VM particularly for web vulnerabilities; Apache and nginx are easy to install/configure. Most public vulnerabilities are in older versions than what you can get from the repos, but recent privilege escalation exploits (LPE) do exist.

More Windows / Linux versions

As OSes progress, developers attempt to make software vulnerabilities harder to exploit; a wider selection of OS versions, particularly for Windows, will allow you to develop exploits from very simple (on the still-used Windows XP) to very complex (involving ROP chains and dealing with ASLR, on post-Windows 7 versions). Evolving versions of the kernel are also available this way, which is particularly relevant to privilege escalation exploits.

Pre-built targets

The go-to pentesting target OS is Metasploitable, which is specially designed for testing the Metasploit framework, but can also be attacked "manually".

A growing number of pentesting target OSes (like the previously mentioned De-ICE 1.123) are collected at vulnhub.com. Some of them (particularly the ones prepared for CTF competitions) are difficult to break, but some are easier. All of them are Linux based, so for exploiting actual vulnerable applications on Windows see the next section.

Software for the VMs

Debugging software

On Windows machines, having debugging software specifically designed for developing exploits is very useful. Immunity Debugger is a very popular choice (32 bits only), often accompanied by mona.py (to install mona, download and copy it to the C:\Program Files\Immunity Inc\Immunity Debugger\PyCommands folder).

Vulnerable applications

Although attacking vulnerabilities you compiled yourself in a test binary does have some pedagogic merit, there’s far more to gain from attacking actual vulnerabilities in public applications. The easiest way to find such apps is to approach the problem from the other end and look for exploits on sites which collect them, and then try to find a version of the application which is vulnerable to that exploit.

  • exploit-db is the best known exploit repository (started from the now defunct Milw0rm archive, and continuously updated since).
    • Some of the exploits have locally-archived copies of the vulnerable apps; you can find them in the page header.
  • PacketStorm is another well known repository, and it allows better exploit filtering. To find remotely-exploitable overflows, for example, build a URL manually by appending the appropriate tags to the find-by-tag URL:

Old versions of applications

After finding out which version of a software you need, getting an actual copy of the installation kit may prove challenging, particularly for very old, not very well known or very large applications. The following software repositories might help, as they retain older versions:

If all else fails, search the web for the specific version (it helps a lot if you can find the actual filename the kit used to have).

Written by vtopan

October 13, 2015 at 9:27 PM

Compiling a tiny executable (or DLL) in FASM (or MSVC) + templates – part II: MSVC

with one comment

Since creating tiny executables and DLLs in FASM was pretty straight-forward, we’ll challenge ourselves a bit: we’ll do it in MSVC. To be fully honest, creating a tiny executable using MSVC is actually more than a bit of a challenge, but it is doable.

An empty “int main(…) {return 0;}” program compiled with Microsoft’s Visual C++ compiler is already around 38 KB (using VS 2008, but other versions should yield similar results) due to the CRT (also called crt0, which implements (parts of) the C language runtime) code included in the executable. The CRT is responsible for things such as implementing the C API for working with files (fopen & co.), strings (strcpy & co.) etc., the malloc-based memory allocator, the well-known int main(…) entrypoint, etc.

[Sidenote] All Windows (PE) executables actually have a WinMain entrypoint, where a CRT stub is placed which, among many other things, parses the single-string command line and splits it into “parameters” passed as argv[] to main().

If we remove the CRT, we have to use Windows API functions to replace it (on Windows, obviously).

[Sidenote] We could alternatively link against a dynamic version of the runtime, but the code would still be much bigger than it needs to be, with the added pain of having a dependency on the Microsoft Visual C++ Redistributables, which have caused plenty of annoyance due to the somewhat cryptic error messages reported about some MSVCR##.DLL missing when the packages aren’t installed on the user’s machine.

The file management functions have good equivalents in the Windows API (start from CreateFile). The strings APIs are somewhat harder to find, but present. Memory allocation APIs are plentiful (LocalAlloc or HeapAlloc are good starts, or even VirtualAlloc for more complicated things). The fact that the WinMain() standard entrypoint does not provide “digested” arguments (like C’s argv/argc main() arguments) can also be handled using Windows API calls, but for some reason only the WideChar variant of CommandLineToArgv (i.e. CommandLineToArgvW) is implemented, so we’ll work with WideChar functions in the example.

Let’s create a 64-bit basic DLL loader and a sample DLL. First, the loader’s source code:

#include <windows.h>

int CALLBACK WinMain(
    _In_ HINSTANCE hInstance,
    _In_ HINSTANCE hPrevInstance,
    _In_ LPSTR lpCmdLine,
    _In_ int nCmdShow
    )
{
    WCHAR *cmdline;
    int argc;
    WCHAR **argv;
    /// we aren't interested in any of the standard WinMain parameters
    /// get the WideChar command line, split it
    cmdline = GetCommandLineW();
    argv = CommandLineToArgvW(cmdline, &argc);
    /// assume the first parameter is a DLL name, load it
    LoadLibraryW(argv[1]);
    /// free the "digested" parameters
    LocalFree(argv);
    return 0;
}

In the interest of keeping things simple, this code is fugly and evil (it doesn’t check error codes), but it does work.

We have a standard WinMain entrypoint. We get the WideChar version of the command line using GetCommandLineW(), then
split it using CommandLineToArgvW() into the more familiar (to C devs at least) argc/argv pair. We call LoadLibraryW on
the first argument, which we assume is the DLL name, then free the argv obtained from CommandLineToArgvW() and exit.

The DLL is as basic as possible, just a header and a source (the header is given to serve as a template, it’s not actually necessary in this case).

The win64dll.h file:

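#pragma once

/* the name of the guard macro (WIN64DLL_EXPORTS) is an assumption */
#ifdef WIN64DLL_EXPORTS
#define AWESOME_API __declspec(dllexport)
#else
#define AWESOME_API __declspec(dllimport)
#endif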

And the .c:

#include "win64dll.h"
#include <windows.h>

BOOL APIENTRY DllMain(
    HMODULE hModule,
    DWORD  dwReason,
    LPVOID lpReserved
    )
{
    switch (dwReason)
    {
        case DLL_PROCESS_ATTACH:
            MessageBox(0, "Hello!", "Hello!", MB_OK);
            break;
        case DLL_THREAD_ATTACH:
        case DLL_THREAD_DETACH:
        case DLL_PROCESS_DETACH:
            break;
    }
    return TRUE;
}

Now for the hard part: compiling without the CRT. Since it’s easier to compile from the command line, or at least easier to write command line arguments than project settings/checkboxes in an article, let’s start off with compiling the loader above (presumably called loader.c) with the CRT and go from there:

cl.exe loader.c /link shell32.lib

Assuming your environment is set up correctly (see sidenote below about setting up the environment), this will produce a ~32 KB executable (on VS 2010 at least). The /link shell32.lib parameter is necessary because that lib contains the CommandLineToArgvW function we used. To get rid of the CRT, the /NODEFAULTLIB parameter is used. We’ll also need to explicitly define WinMain as the entrypoint, and add kernel32.lib to the library list, since it’s one of the “default” libs which is no longer automatically linked:

cl.exe loader.c /link shell32.lib kernel32.lib /nodefaultlib /entry:WinMain

[Sidenote] The linker parameters (the ones after /link) are NOT case-sensitive, either /nodefaultlib or /NODEFAULTLIB will work.

This should produce a 2560 byte executable, more than 12 times smaller than the CRT-based one, and doing exactly the same thing.

[Sidenote Q&A] Q. How to tell which .lib files are needed?
A. When the necessary .lib is not given as a command line argument, the linker complains with an error message such as:

loader.obj : error LNK2019: unresolved external symbol __imp__LoadLibraryW@4 referenced in function _WinMain@16

This means the LoadLibraryW() function is missing its .lib; to find it, simply search for the LoadLibraryW text in the lib folder of the SDK (in a path similar to: “C:\Program Files (x86)\Microsoft SDKs\Windows\v7.1\Lib“) and use the lib it’s found in (in this case, kernel32.lib). As an alternative to searching in the (binary) .lib files, you can check the MSDN reference for the given function (search online for something like “MSDN LoadLibraryW”), which gives the lib filename in the “Library” section of the “Requirements” table at the bottom.

Compiling the DLL needs a few more parameters:

cl.exe win64dll.c /link shell32.lib kernel32.lib user32.lib /nodefaultlib /entry:DllMain /dll /out:win64dll.dll
  • user32.lib is needed for MessageBoxA
  • the /dll switch tells the linker to create a DLL
  • the exact name must be specified (via /out:win64dll.dll), because the linker defaults to creating .exe-s

This should yield a 3072 byte DLL. It’s slightly larger than the executable because of the export section (and the fact that each section’s size is rounded up to the FileAlignment (PE header) value, which defaults to 0x200 == 512, exactly the difference which we got, which also covers the small difference in size between the actual code produced).
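
The rounding is easy to check with a few lines of Python. This is only a sketch: the align_up helper and the raw sizes fed to it are made up for illustration; only the 0x200 FileAlignment default and the 2560 / 3072 byte totals come from the text above.

```python
FILE_ALIGNMENT = 0x200  # default FileAlignment value (512 bytes)

def align_up(size, alignment=FILE_ALIGNMENT):
    """Round size up to the next multiple of alignment (hypothetical helper)."""
    return (size + alignment - 1) // alignment * alignment

# hypothetical raw sizes: anything from 1 to 512 bytes occupies one full block on disk
print(align_up(1))      # 512
print(align_up(512))    # 512
print(align_up(513))    # 1024

# both file sizes reported above are whole multiples of FileAlignment
print(2560 // FILE_ALIGNMENT, 3072 // FILE_ALIGNMENT)   # 5 6
print(3072 - 2560 == FILE_ALIGNMENT)                    # True
```

So a single extra section, however small, costs at least one more 512-byte block in the file.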

After building the executable and the DLL, running:

loader.exe win64dll.dll

should pop up a “Hello!” message box, then exit.

[Sidenote] Setting up the cl.exe environment involves:

  • adding the VC\BIN subfolder of the MS Visual Studio installation to the %PATH% environment variable (should look like “C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin”)
  • adding to the %INCLUDE% environment variable the INCLUDE subfolders of:
    • the MS Platform SDK installation (e.g. “C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Include”)
    • the MS Visual Studio installation (e.g. “C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\Include”)
  • adding to the %LIB% environment variable the LIB subfolders of the VS and PSDK

The easiest way to do this is via a setupvc.bat file containing:

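@echo off
SET VS_PATH=C:\Program Files (x86)\Microsoft Visual Studio 10.0
SET SDK_PATH=C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A
SET PATH=%VS_PATH%\VC\bin;%PATH%
SET INCLUDE=%SDK_PATH%\Include;%VS_PATH%\VC\include;%INCLUDE%
SET LIB=%SDK_PATH%\Lib;%VS_PATH%\VC\lib;%LIB%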

After adjusting the paths as needed by your setup, run this before running cl.exe from the command line.

The first part of this text deals with tiny executables in FASM.

Written by vtopan

February 17, 2014 at 10:23 PM

Posted in C, programming, Snippets


Compiling a tiny executable (or DLL) in FASM (or MSVC) + templates – part I: FASM

with one comment

Having a tiny do-nothing executable can be useful for many purposes (or at least that’s what my malware-analysis-infested-background pushes me to believe). Nowadays, I use such executables to replace part of the half-a-dozen processes some evil software packages (*cough*Acrobat Reader*cough*) install on my system just to keep themselves updated. The same malware analysis experience kicks in again with a piece of advice at this point: DON’T STOP SOFTWARE FROM UPDATING ITSELF unless you know what you’re doing!

Producing a tiny executable (32-bit PE to be precise) is easy with FASM (get FASM from here) and pretty straight-forward.

For example, save this as win32tiny.fasm:

format PE GUI 4.0
entry start

include 'win32a.inc'

section '.text' code readable executable

  start:
    push 0              ;;; pass an ExitCode of 0 to ExitProcess
    call [ExitProcess]

section '.idata' import data readable writeable
library kernel, 'KERNEL32.DLL'
import kernel, ExitProcess, 'ExitProcess'

All this program does is import the function ExitProcess from kernel32.dll and call it as soon as it is executed.
The “a” in win32a.inc stands for ASCII; we’ll go for the WideChar variant in the 64-bit executable later on.

To compile it, open up a command line (cmd.exe), add FASM’s \include subfolder to the %INCLUDE% environment variable and run FASM.EXE on the above source:

set INCLUDE=%INCLUDE%;c:\path_to_fasm\include
c:\path_to_fasm\FASM.EXE win32tiny.fasm

If all went well, you’ll end up with a win32tiny.exe (which does absolutely nothing) of only 1536 bytes.

[Sidenote] To make compiling sources with FASM easier, add a fasm.bat to your %PATH% environment variable with the following contents:

@echo off
set INCLUDE=%INCLUDE%;c:\path_to_fasm\include
c:\path_to_fasm\FASM.EXE %1

A 64-bit variant of the executable looks very similar:

format PE64 GUI 4.0
entry start

include 'win64a.inc'

section '.text' code readable executable

start:
xor rcx, rcx ;;; pass an ExitCode of 0 to ExitProcess
call [ExitProcess]

section '.idata' import data readable writeable
library kernel32,'KERNEL32.DLL'
import kernel32, ExitProcess,'ExitProcess'

The only differences are the PE64 format (which means the “PE32+” 64-bit format), the 64-bit version of the .inc file and passing the argument to ExitProcess via RCX, as per the Windows 64-bit calling convention (described here, with details about register usage here).

If we wanted to build a tiny DLL, things would actually be easier, since no imported functions are required (the DLL entrypoint simply returns “TRUE” to indicate successful loading):

format PE GUI 4.0 DLL
entry DllMain

include 'win32a.inc'

section '.text' code readable executable

proc DllMain hinstDLL, fdwReason, lpvReserved
mov eax, TRUE
ret
endp

The imports (“.idata”) section is gone, making the DLL only 1024 bytes long. The entrypoint changed to a DLL-specific one (see DllMain for details) and is now a proper function returning control to the loader (as opposed to the executables above, in which the call to ExitProcess makes any code after it irrelevant).

Building a 64-bit DLL simply requires adjusting the format to PE64 and the included header to the 64-bit variant, just like above.

Since we got this far, let’s have a look at a 64-bit DLL which might actually be used as a template, with proper imports and exports:

format PE64 GUI 4.0 DLL
entry DllMain

include 'win64a.inc'

section '.text' code readable executable

proc Howdy
invoke MessageBox, 0, [title], NULL, MB_ICONERROR + MB_OK
ret
endp

proc DllMain hinstDLL, fdwReason, lpvReserved
mov eax, TRUE
ret
endp

section '.idata' import data readable writeable

library kernel, 'KERNEL32.DLL',\
user, 'USER32.DLL'

import kernel,\
GetLastError, 'GetLastError',\
FormatMessage, 'FormatMessageA',\
LocalFree, 'LocalFree'

import user,\
MessageBox, 'MessageBoxA'

section '.data' data readable writeable
title db "Hello, world!", 0

section '.edata' export data readable

export 'win64tiny.dll',\
Howdy, 'Howdy'

data fixups
end data

It has an exported function called “Howdy” which shows a message box, and a few more imports (some unused) to show how you can have more than one imported function per DLL. It also uses “invoke” to perform the call to MessageBox to keep things simple. The “data fixups” at the end generates relocation information, without which any real-life DLL would be unusable.

Part two of this text deals with doing this in MSVC.

Written by vtopan

February 16, 2014 at 7:57 PM

Funky Python – code snippets

with 3 comments

Python is a great programming language for a number of reasons, but one of its best features is the fact that, as per Python Zen item #13, “There should be one– and preferably only one –obvious way to do it.” Working with the beast for a number of years, however, does expose one to some less pythonic and somewhat quirky design points (maybe even a few gotchas); here are some of them.

Python quirks & gotchas

1. Boolean values in numeric contexts evaluate to 0/1

It’s very intuitive (especially when coming to Python from C) for 0 values to evaluate to False in boolean contexts and non-zero values to True. Having False evaluate to 0 and True to 1 in numeric contexts is however less intuitive (and somewhat useless):

>>> a = [1, 2, 3]
>>> a[True], a[False]
(2, 1)
>>> True + True # this one was rather unexpected...
2

2. The default argument values are evaluated at the point of function definition in the defining scope

This is probably one of the most frequent gotchas out there:

>>> def a(b=[]):
...     b.append(3)
...     print b
>>> a()
[3]
>>> a()
[3, 3]

The proper way to do this is to set b’s default value to None in the declaration and set it to [] inside the function body if it’s set to None:

>>> def a(b=None):
...     if b is None: b = []        
...     b.append(3)
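
A minimal sketch of why this happens, written for Python 3 (the snippets in this post are Python 2); the function names append_shared and append_fresh are made up for the example. The default list lives on the function object itself, in its __defaults__ tuple, so every call that omits the argument sees the same list:

```python
def append_shared(b=[]):
    # the default list is created once, when the def statement runs
    b.append(3)
    return b

def append_fresh(b=None):
    # a new list is created on every call that omits b
    if b is None:
        b = []
    b.append(3)
    return b

first = append_shared()
second = append_shared()
print(first is second)               # True
print(second)                        # [3, 3]
print(append_shared.__defaults__)    # ([3, 3],)

print(append_fresh(), append_fresh())   # [3] [3]
```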

3. *Everything* is an object

Although it’s a fact which may escape even the best of programmers at the beginning, Python is actually object oriented to the core. Despite the fact that it allows you to write procedural (and even functionalish) code, everything is an object. Functions are objects, data types are objects etc. This:


doesn’t work because the period is parsed as part of the numeric token (think of 69.j or 69.e1); this however:


works. Being objects, functions can also have attributes:

>>> def a():
...    print a.x
>>> a.x = 3
>>> a()
3
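
Both points can be tried out with a short Python 3 sketch. The choice of __str__ as the attribute to look up on the integer literal is an assumption made for illustration, not necessarily what the original snippets used:

```python
# the tokenizer reads "69." as the start of a float literal, so this does not parse
try:
    compile("69.__str__()", "<example>", "eval")
    parsed = True
except SyntaxError:
    parsed = False
print(parsed)   # False

# parentheses (or a space before the dot) end the numeric token first
print((69).__str__())   # 69
print(69 .__str__())    # 69

# functions are objects too, so they accept arbitrary attributes
def a():
    return a.x

a.x = 3
print(a())   # 3
```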

This comes in handy e.g. for giving a decorated function the same internal name (for introspection purposes) as the original function:

import inspect

def print_call_decorator(fun):
    def replacement(*args, **kwargs):
        res = fun(*args, **kwargs)
        print r'Call %s.%s => %s' % (inspect.getmodule(fun).__name__, fun.__name__, res)
        return res
    replacement.__name__ = fun.__name__
    return replacement
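
As an alternative to copying the name by hand, the standard library's functools.wraps does the same bookkeeping (it also copies __doc__ and a few other attributes). A sketch in Python 3 syntax; the add function is just a made-up example target, and fun.__module__ is used here instead of inspect.getmodule to keep the block short:

```python
import functools

def print_call_decorator(fun):
    @functools.wraps(fun)   # copies __name__, __doc__ etc. from fun onto replacement
    def replacement(*args, **kwargs):
        res = fun(*args, **kwargs)
        print('Call %s.%s => %r' % (fun.__module__, fun.__name__, res))
        return res
    return replacement

@print_call_decorator
def add(x, y):
    """Return x + y."""
    return x + y

result = add(1, 2)
print(add.__name__)   # add
print(add.__doc__)    # Return x + y.
```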

4. Generators, sets & dictionaries also have comprehensions (called “displays”)

As you probably know, list comprehensions are a great way to generate a new list from another iterable. But it goes further… Generators, sets and dicts also have something similar, called displays. The basic syntax is this:
generator = (value for ... in ...)
dict = {key:value for ... in ...}
set = {value for ... in ...}


>>> a = ['a', 'b', 'c']
>>> d = {x:a.index(x) for x in a}
>>> d
{'a': 0, 'c': 2, 'b': 1}
>>> d_rev = {d[x]:x for x in d}
>>> d_rev
{0: 'a', 1: 'b', 2: 'c'}

This makes reversing a dictionary for example much cleaner.
What makes displays even more fun are the *list* displays, which are essentially list comprehensions but with unlimited depth; using them to flatten a list of lists would look something like this:

flat = [x for y in lst for x in y]

The x/y order in the example is not a mistake; that’s actually the proper order.
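
One way to remember the order is that the for clauses nest left to right, exactly as the equivalent loops would. A small sketch (the sample list is made up):

```python
lst = [[1, 2], [3], [4, 5, 6]]   # sample data, made up for the example

flat = [x for y in lst for x in y]

# the same thing spelled out: the for clauses nest left to right
flat_loop = []
for y in lst:
    for x in y:
        flat_loop.append(x)

print(flat)               # [1, 2, 3, 4, 5, 6]
print(flat == flat_loop)  # True
```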

5. GIL: Python threads aren’t

Not on multi-processor machines, at least. Yes, there is a threading module (aptly named), but due to the Global Interpreter Lock, threads of the same (Python) process can’t actually execute Python code at the same time (the lock is released while a thread waits on blocking I/O, but CPU-bound work is serialized). This becomes more of an issue when deploying native-Python servers, as they don’t get any benefit from the number of cores installed on the machine (limiting how much work a single Python process can do at the same time, as opposed to a native one written in C).

6. for has an else clause

…and so do the try…except/finally and while constructs. In all cases, the else branch is executed if all went well (break wasn’t called to stop cycles / no exception occurred). And while the else branch may be useful to perform when you want something to happen only if the cycle construct wasn’t “broken” (the classic example is handling the fact that the cycle hasn’t found the value it was looking for), try doesn’t really need an else clause, as the following are equivalent and the latter seems at least to me more readable:

  • with else:
  • without else:
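
A sketch of what such constructs can look like, in Python 3 syntax. The function names and sample values are assumptions made for illustration, not the original snippets. In the two parse_* functions the statement moved out of the else branch cannot raise the caught exception, so both behave the same:

```python
def find_index(items, wanted):
    """Return the index of wanted in items, or -1 (for/else version)."""
    for i, item in enumerate(items):
        if item == wanted:
            break
    else:
        # runs only if the loop finished without hitting break
        i = -1
    return i

def parse_with_else(text):
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        return value * 2

def parse_without_else(text):
    try:
        value = int(text)
    except ValueError:
        return None
    return value * 2

print(find_index(['a', 'b', 'c'], 'b'))   # 1
print(find_index(['a', 'b', 'c'], 'z'))   # -1
print(parse_with_else('21'), parse_without_else('21'))   # 42 42
print(parse_with_else('x'), parse_without_else('x'))     # None None
```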

7. Tuple assignment order

In tuple assignments, the left-values are eval’ed & assigned in order:

>>> a = [1, 2, 3]
>>> i, a[i] = 1, 5 # i is set to 1 *before* a[i] is evaluated
>>> a
[1, 5, 3]

This happens because tuple assignments are equivalent to assigning the unpacked pairs in order; the second line above is therefore equivalent to:

>>> i = 1
>>> a[i] = 5

8. Scope juggling

Inside a function, variables are resolved in the global scope if no direct variable assignment appears in the function, but are local otherwise (making Python clairvoyant, as it is able to tell that something is going to happen later on inside the function, i.e. a variable will be set). Note that attributes and sequence/dict values can still be set, just not the “whole” variable…

a1 = [1, 2, 3]
a2 = [1, 2, 3]
b = 3
c = 4
def fun():
    global b
    print c # crashes, because c is resolved to the local one (which is not set at this point)
    print b # works, because the global directive above forces b to be resolved to the global value
    a1[0] = 4 # works, because a1 is not directly set anywhere inside the function
    a2[0] = 5 # crashes, because a2 is later on directly set
    c = 10
    b = 11
    a2 = 'something else'
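
Since the function above stops at the first failing line, here is a variant that can be run from start to finish. It is a sketch in Python 3 syntax with made-up function names; the "crash" is an UnboundLocalError, which is caught so the rest of the code still runs:

```python
b = 3
c = 4
a1 = [1, 2, 3]

def reads_then_assigns():
    # c is assigned further down, so the compiler treats it as local everywhere in here
    try:
        print(c)
    except UnboundLocalError as exc:
        print('failed:', type(exc).__name__)   # failed: UnboundLocalError
    c = 10
    return c

def uses_global():
    global b
    b = 11          # rebinds the module-level b
    a1[0] = 4       # mutates the global list; a1 itself is never rebound here

print(reads_then_assigns())   # 10
uses_global()
print(b, c, a1)               # 11 4 [4, 2, 3]
```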

Bonus facts

As a bonus for making it through to the end, here are some lesser known / less frequently pondered upon facts about Python:

  1. a reference to the current list comprehension (from the inside) can be obtained with:
  2. list comprehensions can contain any number of for…in levels (this is actually documented). Flattening lists:
    flat = [x for y in lst for x in y]
  3. range() actually builds a list, which can be slow and memory-consuming for large values; use xrange()
  4. modules have a dict attribute .__dict__ with all global symbols
  5. the sys.path list can be tampered with before some imports to selectively import modules from dynamically-generated paths
  6. flushing file I/O must be followed by an os.fsync(…) to actually work:
  7. after instantiation, object methods have the read-only attributes .im_self and .im_func set to the object the method is bound to and the implementing function respectively (the class is available as .im_class)
  8. some_set.discard(x) removes x from the set only if present (without raising an exception otherwise)
  9. when computing actual indexes for sequences, negative indexes have the sequence length added to them; if the result is still negative, IndexError is raised (so [1, 2, 3][-2] is 2 and [1, 2, 3][-4] raises IndexError)
  10. strings have the .center(width[, fillchar]) method, which pads them centered with fillchar (defaults to space) to the length given in width
  11. inequality tests can be chained: 1 < x < 2 works
  12. the minimum number of bits required to represent an integer (or long) can be obtained with the integer’s .bit_length() method
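
A few of these facts can be checked quickly. The sketch below uses Python 3 syntax, so the Python 2 specific items (range vs. xrange, .im_self / .im_func) are left out; the temporary file and the sample values are made up for the example:

```python
import os
import tempfile

# fact 6: flush() empties Python's buffer, os.fsync() asks the OS to write to disk
with tempfile.NamedTemporaryFile('w', delete=False) as f:
    f.write('data')
    f.flush()
    os.fsync(f.fileno())
    path = f.name
os.remove(path)

# fact 8: discard() is silent when the element is missing, remove() is not
s = {1, 2}
s.discard(5)
print(s)                      # {1, 2}

# fact 9: negative indexes
print([1, 2, 3][-2])          # 2

# fact 10: center()
print('ab'.center(6, '*'))    # **ab**

# fact 11: chained comparisons
x = 1.5
print(1 < x < 2)              # True

# fact 12: bit_length()
print((255).bit_length(), (256).bit_length())   # 8 9
```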

Written by vtopan

March 17, 2011 at 12:37 AM

Posted in Python, Snippets