Vlad Ioan Topan

My playground

Writing a hex viewer / editor widget with Python 3 and Qt5 (PySide2)


What & why

Unlike most other kinds of widgets (which can be found ready-made on the web), hex viewers seem to be in short supply. Needing one for a binary analysis tool I’m working on, I jumped through the hoops of writing one and decided to document the trickier parts of the process. I’ll first review Python GUIs in general from a bird’s-eye view and get into the specifics of the hex viewer / editor later on; feel free to skip to “Designing & implementing the hex viewer / editor” if you’re familiar with PySide2.

Tools

Like most of the code I write nowadays, all that follows is written in Python 3. Most of the concepts work similarly in Python 2, but, publishing this in late 2018, I hope that’s a moot point.

Python GUIs

Building a GUI in Python nowadays gives you a lot of options (quite against Python’s “one way to do it” mantra, but GUIs have never been Python’s strong point). Some “classic” UI toolkits:

  • Tcl/Tk, aka tkinter, is built in on Windows (and readily available on other OSes via something analogous to sudo apt install python3-tk); it’s easy to use, but less popular nowadays, so not much of an ecosystem exists
  • wxWidgets / wxPython, a strong contender; it uses native UI components for those OSes which provide them (making it blend in better on Windows and macOS); somewhat less popular than its rival Qt
  • Qt 5 (PySide2 in Python, see below for details) is the most popular option, mostly due to its excellent documentation and immense ecosystem

Some domain-specific and / or higher level options exist as well, among which:

  • pyforms – an interesting attempt to expose the same interface as a GUI (PyQt-based), a terminal console and a web application
  • toga – a “Python native, OS native GUI toolkit” which will likely be a great choice once it matures and forms an ecosystem
  • pywebview – a web-based, lightweight, cross-platform UI

My personal preference is Qt mainly due to the afore-mentioned ecosystem and documentation, which mean most things will be readily available and thoroughly tested / used in production, but I’ve also learned to love its API (which may seem weird initially if it’s your first proper GUI API, but starts to make a lot of sense and becomes very predictable once you get used to it).

PySide2

The Python bindings for Qt come in two flavors: PyQt and PySide. A while ago they could (in most cases) be used as drop-in replacements for each other, but they grew apart (as most people do) – see this page for a (one-sided) view of the details. The PySide variant is slightly more pythonic (though there’s less of a difference nowadays) and it’s licensed LGPL, so it’s easier (read: cheaper, free in fact) to use in commercial products. The original PySide (for which far more documentation is available online) was a binding to Qt4; the contemporary one is PySide2, which binds to Qt5 – keep this in mind when looking at code examples online, as most will be for Qt4 / PySide.

PySide2 application skeleton

First, install PySide2 (pip3 install pyside2 should work on most platforms). To kick things off, we’ll start with a generic PySide2 application skeleton.

#!/usr/bin/env python3

from PySide2 import QtGui, QtWidgets
from PySide2.QtCore import Qt, SIGNAL

import sys

class MyMainWindow(QtWidgets.QMainWindow):
    def __init__(self):
        super().__init__()
        self.main = QtWidgets.QWidget()  # central widget
        self.setCentralWidget(self.main)

def run_ui():
    app = QtWidgets.QApplication()
    app.setApplicationName('myapp')
    window = MyMainWindow()
    window.show()
    sys.exit(app.exec_())


if __name__ == '__main__':
    run_ui()

Running the above program should produce a GUI titled “myapp”; if it doesn’t, some error message should explain why (probably PySide2 or Python itself is not installed, or the latter is the wrong version).

Hint: Save the source with a .pyw extension on Windows to get rid of the console app that pops up in the background.

The above code can be used as a skeleton for the final app (with proper spacing and docstrings of course, which I’ll ignore for brevity). The imports are all we’ll need until the end:

  • QtWidgets has all the UI components (“widgets”)
  • QtGui has the other UI stuff like fonts (QFont) and, for some reason, shortcuts (QKeySequence)
  • Qt has general-purpose constants, like CaseInsensitive
  • SIGNAL is used to bind components to user interaction events (e.g. receiving focus); note: more “native” alternatives do exist to do that, but signals are an important enough concept in Qt (and GUI programming in general) to be worth a mention

The main window is defined by the MyMainWindow class, which inherits from QtWidgets.QMainWindow. Any other widget would have worked: a simple button app needs only hello = QtWidgets.QPushButton("Hello!"); hello.show() and no new class at all, and examples sometimes inherit from the generic QWidget instead. Further down the line, however, inheriting the main application window from QMainWindow will become useful. QMainWindow needs a “central widget”, for which we supply a generic QWidget; this will contain all the useful UI components (think of the “main form” in VC or Delphi). Once we define the hex viewer widget, we can alternatively use that as the main window (just instantiate it and call .show() on it instead of declaring and .show()ing MyMainWindow).

After defining the main window we:

  • instantiate an application and set its title
  • show the main window
  • pass control to the application’s event handling loop by running app.exec_()

Hooking a keypress event works like this: self.connect(QtWidgets.QShortcut(QtGui.QKeySequence("Esc"), self), SIGNAL('activated()'), sys.exit)

Designing & implementing the hex viewer / editor

The full source code is available for reference on github; to keep things readable I’ll only provide brief snippets throughout the text, but they should match the full source code.

Getting the data to be displayed

The easier thing to get out of the way is the data source: we want the component to receive either a block of binary data (a bytes or bytearray variable) or a filename (and read from / write to that). We create a DataView class which can be initialized with a data or filename variable and with a readonly flag which controls what sort of access we want to the data. At initialization we simply store the values and set a ._data field to None; we’ll lazily map that to a mmapped view of the file on the first access.

We expose the actual data as a .data property (a .data() method decorated with @property). The function checks if the private ._data field is set, and if so, returns that. Otherwise, for files, it opens the file and mmaps it (respecting the readonly flag) and points the ._data field to the mmap object; for raw buffers, ._data is pointed at the bytes buffer (transformed into a bytearray if readonly is false, to allow modification). A .close() method closes the open file and mmap (if any) and points ._data back to None, which means it can be used to release the file handle when not in use (it will be automatically reopened on the next access).

To allow the object to be indexed and sliced directly (as opposed to using the .data property) we override __setitem__, __getitem__ and __len__.
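To make the description above concrete, here is a minimal sketch of such a class. It is not the code from the repository: the private names (_src, _fh) and the exact argument handling are my assumptions, and it only uses the standard mmap module. Note that mmap refuses to map an empty file, which this sketch does not handle.

```python
import mmap


class DataView:
    """Lazily opened view over a raw buffer or a file (illustrative sketch)."""

    def __init__(self, data=None, filename=None, readonly=True):
        if (data is None) == (filename is None):
            raise ValueError('pass exactly one of data / filename')
        if data is not None:
            # Keep a private copy: immutable when readonly, mutable otherwise,
            # so edits to a raw buffer survive a close() / reopen cycle.
            data = bytes(data) if readonly else bytearray(data)
        self._src = data          # assumption: hypothetical private name
        self.filename = filename
        self.readonly = readonly
        self._data = None         # filled in lazily by the .data property
        self._fh = None           # open file handle, if file-backed

    @property
    def data(self):
        if self._data is None:
            if self.filename is not None:
                self._fh = open(self.filename, 'rb' if self.readonly else 'r+b')
                access = mmap.ACCESS_READ if self.readonly else mmap.ACCESS_WRITE
                # Length 0 maps the whole file.
                self._data = mmap.mmap(self._fh.fileno(), 0, access=access)
            else:
                self._data = self._src
        return self._data

    def close(self):
        """Release the mmap and file handle; the next access reopens them."""
        if self._fh is not None:
            if self._data is not None:
                self._data.close()
            self._fh.close()
            self._fh = None
        self._data = None

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __len__(self):
        return len(self.data)
```

With this in place, DataView(filename='some.bin')[0:16] reads only the first 16 bytes through the mapping, and writing through a readonly view raises a TypeError (both for bytes and for a read-only mmap).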

The hex viewer / editor widget

Design

The most common layout of a hex viewer involves three synchronized vertical sections: addresses (we’ll call this “addr”), hex values of each byte (“hex”) and the ASCII interpretation of each byte, where possible (“text”). The number of bytes represented per line should be configurable, but it’s usually a power of two (8 and 16 are the most common values). A status bar (QStatusBar) at the bottom should indicate (at least) the currently selected offset in the file.

For a given GUI window size we’ll have a fixed number of displayable rows based on the font’s max height and a preconfigured number of columns (“cols”), so we can display (at most) a buffer of rows x cols bytes (“perpage”) starting at a given offset in the file (we’ll call that “offs”). We only want to read perpage bytes from the file at a time (imagine viewing a 4 GB file by first loading all of it in memory to see why), and that will be the hex viewer’s current “data”. This means that whichever component we use to display the data, the vertical scrollbar will need to be controlled separately, based on the actual file size.
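The windowing arithmetic is easy to get off by one, so here is a small sketch of it in plain Python. The function names are made up for illustration, and measuring the scrollbar range in rows rather than bytes is my assumption (it keeps the range small even for multi-gigabyte files); the actual widget may do this differently.

```python
def page_geometry(size, rows, cols):
    """Return (perpage, total_rows, max_top_row) for a size-byte source."""
    perpage = rows * cols
    total_rows = (size + cols - 1) // cols    # ceiling division
    max_top_row = max(0, total_rows - rows)   # scrollbar maximum, in rows
    return perpage, total_rows, max_top_row


def clamp_offset(offs, size, rows, cols):
    """Align offs down to a row boundary and keep the last page in range."""
    _, _, max_top_row = page_geometry(size, rows, cols)
    top_row = min(max(offs, 0) // cols, max_top_row)
    return top_row * cols


def read_page(data, offs, rows, cols):
    """Return (aligned offset, at most rows * cols bytes starting there)."""
    offs = clamp_offset(offs, len(data), rows, cols)
    return offs, bytes(data[offs:offs + rows * cols])
```

For a 1000-byte source shown as 10 rows of 16 bytes, page_geometry(1000, 10, 16) gives 160 bytes per page, 63 rows in total and a maximum top row of 53; asking for offset 995 is clamped to 848, so the last page holds the final 152 bytes.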

I first tried three separate components for the three columns (“addr”, “hex” and “text”) and consistently failed to synchronize their rows (slight padding, border and behavior differences between the component types’ rows make that impossible), so I resorted to a single table component of type QTableView. (I tried QTableWidget first, which inherits from QTableView and is higher level, but that made simulating the data window very complicated without providing enough benefits.) The addresses are rendered as the table’s vertical labels, the first cols columns contain the hex-encoded values of each byte and then cols more columns contain each byte’s textual representation. This simplifies editing individual bytes and synchronizing selections between hex and text.
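The resulting table layout can be sketched without any Qt at all: one label per row, then 2 x cols cell strings. build_rows is a hypothetical helper (not a function from the repository), and the 8-digit address format and the “.” placeholder for non-printable bytes are my choices for the example.

```python
def build_rows(page, offs, cols=16):
    """Return (labels, rows) for page, which starts at file offset offs.

    Each row holds 2 * cols strings: cols hex cells, then cols text cells,
    so the text cell of the byte in hex column c sits in column cols + c.
    """
    labels, rows = [], []
    for start in range(0, len(page), cols):
        chunk = page[start:start + cols]
        pad = [''] * (cols - len(chunk))   # empty cells on a short last row
        hex_cells = ['%02X' % b for b in chunk] + pad
        text_cells = [chr(b) if 32 <= b < 127 else '.' for b in chunk] + pad
        labels.append('%08X' % (offs + start))
        rows.append(hex_cells + text_cells)
    return labels, rows
```

For example, build_rows(b'Hi\x00\xff', 0x10, cols=4) yields the label 00000010 and the single row 48, 69, 00, FF, H, i, ., . which maps directly onto the vertical header and the cells of the table.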

Qt organizes widgets by grouping them in “layouts” rather than by fixed coordinates inside the window (it can also do that to some level, but prefers not to). We therefore group the “hexview” table and its separate vertical scrollbar (QtWidgets.QScrollBar(Qt.Vertical)) into a horizontal layout (QtWidgets.QHBoxLayout) called “l2”, and then group that and the bottom status bar into a vertical QVBoxLayout which we set as the main layout of our widget.

To proxy the data (the contents) of the hexview table we create a QStandardItemModel object (called “dm”) through which we control the labels displayed in the table (vertical and horizontal labels), the contents of the cells and the colors.

After the UI components are set up in the class initialization, the workhorse is the .jump(offset) method, which loads perpage bytes from the data source and fills in the hexview table, updating the address labels accordingly. It is called initially to populate the hexview and bound to scroll up / down and window resize events to update the data contents when needed.

Bits and pieces

The actual implementation is, as mentioned, available online; I’ll only go through some interesting details and gotchas.

  • figuring the row height and count – we compute this initially and after each resize (we get row_height based on the font height plus padding):

font = QtGui.QFont('Terminal', 8)
metrics = QtGui.QFontMetrics(font)
row_height = metrics.height() + 4
rows = (hexview.height() - hexview.horizontalHeader().height() - 6) // row_height

  • the table by default resizes itself to the cell contents; we don’t want that: hexview.horizontalHeader().setSectionResizeMode(QtWidgets.QHeaderView.Fixed)
  • on selection change (see the .sel_changed() method), to avoid flickering, we compute the difference between the previous selection and the new one and highlight accordingly
  • styling the table is partly done through CSS using pseudo-elements like ::section, ::section:checked and ::section:hover for the header and ::item, ::item:focus and ::item:hover for the cells, and partly via directly coloring the background of selected cells using e.g. dm.itemFromIndex(hexview.selectionModel().currentIndex()).setBackground(QtGui.QColor('#FFFF00')) and setting the text alignment and brush color when filling in the table cells (in .jump()) – this can certainly be implemented a bit more cleanly, but neither CSS alone nor method calls on objects alone allow proper access to all colors
  • the formatting is applied each time the contents change because the actual cell objects are recreated; this can probably be avoided
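The selection-change item in the list above boils down to set arithmetic on cell coordinates. The sketch below is my own illustration of that idea, not the body of .sel_changed(): both function names are hypothetical, and cells are assumed to be (row, column) tuples in the 2 x cols table described earlier.

```python
def mirror(cells, cols):
    """Add the matching text cell for each hex cell and vice versa."""
    return cells | {(r, (c + cols) % (2 * cols)) for r, c in cells}


def selection_delta(old, new):
    """Return (to_clear, to_highlight) for two sets of (row, col) cells.

    Cells present in both sets are left alone, which is what avoids
    repainting (and flickering) the unchanged part of the selection.
    """
    return old - new, new - old
```

If the selection moves from {(0, 0), (0, 1)} to {(0, 1), (0, 2)}, only (0, 0) needs its background reset and only (0, 2) needs highlighting; passing both sets through mirror() first keeps the hex and text halves in sync.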

End

The implementation is (almost) an MVP at this point – it works, but it does the absolute bare minimum and the UI looks less than great. Popup menus to allow better control over copy & pasting, more information in the status bar etc. would be welcome and will be added someday. It is however the only hex view / edit widget for Qt (or tkinter / wxPython for that matter) easily available online (that I could find), so it has that going for it. Now that this text exists, it can also serve as a (counter?)example on how to do it and hopefully help usher in the appearance of a proper hex widget someday.

Written by vtopan

November 22, 2018 at 5:14 PM
