Vlad Ioan Topan

My playground

Setting Up a Pentesting Lab


October, 2015 – version 1.0


There’s a recent surge of information about pentesting on the Internet (as opposed to, say, a few years ago, when finding relevant information took you on an exciting ride through the dark side of the Internet, occasionally grabbing a few parasites along the way). Pentesting is a relatively new business, "extorting" amazing amounts of money from organizations on the premise of bringing safety to their data. I say extorting because, on one hand, most evil "hacking" nowadays involves breaking your website rather than your Internet-exposed, non-web servers via vulnerabilities in software out of your control, so the better solution is to permanently audit your code, not to check whether it can be broken at a specific moment in time. On the other hand, given the volume of software you now depend on to run an Internet-facing server, the probability of exploitable vulnerabilities being present is about one, and there’s very little you can do about that. Properly configured software and piling on defensive mechanisms would help, but in any large enough network that quickly becomes unfeasible. Still, people are willing to pay astounding amounts of money for the illusion of safety, so here we are.


Pentesting is a code word for hacking performed by a "good guy" (in order to assess the security status of a network). That guy used to be called a "white hat hacker", but the less emphatic "pentester" is now used. The full name "penetration testing" is also used, but the shorthand "pentesting" is preferred, probably because it does not contain the word penetration.

A pentesting lab allows you, the "security researcher", to practice exploiting software vulnerabilities and to develop exploits for newly found vulnerabilities. It essentially involves:

  • some sort of virtualization solution (usually VirtualBox or VMWare Player, but if you’re ambitious enough, a dedicated host running something like ESXi is better);
  • inside that environment, a variety of vulnerable operating systems ("targets") are deployed; some as old as Windows XP / Ubuntu 6.06 for studying basic buffer overflows or ancient bugs, and some as recent as Windows 8.1 (or even 10) / Ubuntu 15.10 for testing the latest privilege escalation exploits;
  • on the VMs you want vulnerable software (which sometimes is the actual OS, but more often third party software) and debugging tools (such as the Immunity Debugger);
  • an additional VM is often used as a pentesting OS, because having the tools pre-packaged and playing well together out of the box is nice to have; Kali is the most common choice, but other options do exist.

Virtualization software

Installing a free virtualization solution such as VMWare Player or VirtualBox should be easy enough; for the rest of this text I’ll be describing VirtualBox, but the same concepts apply to any other similar software.

Setting up networking

The target VMs and the pentesting VM should be able to connect to each other, but only the pentesting VM may have occasional access to the Internet (for updates), preferably when not connected to the network containing the target VMs. To achieve this, an internal network is used by configuring the virtual network adapter for each target VM this way: VM right click->Settings…->Network->Attached to: Internal Network; the default "intnet" name may be used, or a custom one can be specified. All VMs connected to an internal network with the same name will "see" each other by magic performed by VirtualBox. They will also get a dynamic, unique IP address via DHCP, if it is enabled for the given internal network. To enable it, run something like this in a console:

"c:\Program Files\Oracle\VirtualBox\VBoxManage" dhcpserver add --netname "intnet" --ip --netmask --lowerip --upperip --enable

Change the "intnet" name to the same one you use for the VMs, and the IP ranges if needed. To make sure it’s created and enabled, run:

"c:\Program Files\Oracle\VirtualBox\VBoxManage" list dhcpservers

Doing this while a target VM is running may be glitchy (the DHCP server won’t work); restarting both the VM and VirtualBox might help with that.

If you want the host to be connected to the same network as the targets (use this option with great care, especially if you run untrusted binaries, such as precompiled exploits, on the target VMs), use host only adapters, which connect the VMs to each other and to the host.

On the VM side of things, enable DHCP networking and make sure the default firewall is disabled (it has been on by default since Windows XP SP2), or at least poke holes through it for the apps you want to exploit remotely.

Setting up the target OSes for virtualization

On VMs where they are available, the VirtualBox Guest Additions greatly improve performance. The Guest Additions CD image ships with VirtualBox itself; it should not be confused with the VirtualBox Extension Pack, which is a separate download (due to different licensing) adding features such as USB 2.0 support. To install the Guest Additions, run the VM and click Devices->Install Guest Additions CD Image…. The process is automatic on Windows; on Linux, you may need to mount the CDROM and run the installation manually.

Shared folders are helpful to install software on the VMs (never give read-write access to entire volumes; create a specific shared folder and use that).

The shared clipboard is useful to copy values and commands to/from the host (Devices->Shared clipboard->Bidirectional).

Virtual machines


Kali

Download Kali and install it. Configure Metasploit and you’re good to go.

If your host is also a Linux machine, don’t forget to put your SSH public key in Kali root’s authorized_keys to simplify the process.

De-ICE 1.123

As a first target VM, a pre-built VM specially designed for pentesting such as De-ICE 1.123 is a good choice, as it has a set of vulnerable applications already installed and configured. De-ICE does not need to be installed (the OS runs directly from the ISO image and thus changes to the disk are non-persistent), so the VM does not require an attached hard disk.

Windows XP

As a second VM, Windows XP should be an easy target (both for the installation process, and for any actual pentesting). You need an ISO image of the OS installation disk (as with all the other OSes for the lab). Microsoft provides images for MSDN subscribers (e.g. Windows 8.1), but only for their currently supported OSes (8.1 and 10 at this time), and obviously only if you’ve purchased them. Downloadable ISOs for Windows XP SP3 still exist (e.g. xpsp3_5512.080413-2113_usa_x86fre_spcd.iso), but since Microsoft no longer supports Windows XP, I expect them not to be available for long. Using an old installation CD you have is also an option; use a CD ripping tool to obtain the ISO image.

This version of Windows (particularly pre-SP3) employs very few defenses against software exploitation, so it’s particularly useful for learning simple overflow exploits without having to worry about DEP/ROP.

Ubuntu 15.10

A recent Linux, such as the latest version of Ubuntu, is a good choice as a target VM particularly for web vulnerabilities; Apache and nginx are easy to install/configure. Most public vulnerabilities are in older versions than what you can get from the repos, but recent privilege escalation exploits (LPE) do exist.

More Windows / Linux versions

As OSes progress, developers attempt to make software vulnerabilities harder to exploit; a wider selection of OS versions, particularly for Windows, will allow you to develop exploits from very simple (on the still-used Windows XP) to very complex (involving ROP chains and dealing with ASLR, on post-Windows 7 versions). Evolving versions of the kernel are also available this way, which is particularly relevant to privilege escalation exploits.

Pre-built targets

The go-to pentesting target OS is Metasploitable, which is specially designed for testing the Metasploit framework, but can also be attacked "manually".

A growing number of pentesting target OSes (like the previously mentioned De-ICE 1.123) are collected at vulnhub.com. Some of them (particularly the ones prepared for CTF competitions) are difficult to break, but some are easier. All of them are Linux based, so for exploiting actual vulnerable applications on Windows see the next section.

Software for the VMs

Debugging software

On Windows machines, having debugging software specifically designed for developing exploits is very useful. Immunity Debugger is a very popular choice (32-bit only), often accompanied by mona.py (to install mona, download it and copy it to the C:\Program Files\Immunity Inc\Immunity Debugger\PyCommands folder).

Vulnerable applications

Although attacking vulnerabilities you compiled yourself in a test binary does have some pedagogic merit, there’s far more to gain from attacking actual vulnerabilities in public applications. The easiest way to find such apps is to approach the problem from the other end and look for exploits on sites which collect them, and then try to find a version of the application which is vulnerable to that exploit.

  • exploit-db is the best known exploit repository (started from the now defunct Milw0rm archive, and continuously updated since).
    • Some of the exploits have locally-archived copies of the vulnerable apps; you can find them in the page header.
  • PacketStorm is another well known repository, and it allows better exploit filtering. To find remotely-exploitable overflows, for example, build a URL manually by appending the appropriate tags to the find-by-tag URL:

Old versions of applications

After finding out which version of a software you need, getting an actual copy of the installation kit may prove challenging, particularly for very old, not very well known or very large applications. The following software repositories might help, as they retain older versions:

If all else fails, search the web for the specific version (it helps a lot if you can find the actual filename the kit used to have).

Written by vtopan

October 13, 2015 at 9:27 PM

Compiling a tiny executable (or DLL) in FASM (or MSVC) + templates – part II: MSVC


Since creating tiny executables and DLLs in FASM was pretty straight-forward, we’ll challenge ourselves a bit: we’ll do it in MSVC. To be fully honest, creating a tiny executable using MSVC is actually more than a bit of a challenge, but it is doable.

An empty “int main(…) {return 0;}” program compiled with Microsoft’s Visual C++ compiler is already around 38 KB (using VS 2008, but other versions should yield similar results) due to the CRT code included in the executable (the CRT, also called crt0, implements (parts of) the C language runtime). The CRT is responsible for things such as implementing the C API for working with files (fopen & co.), strings (strcpy & co.) etc., the malloc-based memory allocator, the well-known int main(…) entrypoint, etc.

[Sidenote] Windows (PE) executables built with the CRT don’t actually start at main(): the real entrypoint is a CRT stub which, among many other things, parses the single-string command line and splits it into the “parameters” passed as argv[] to main().

If we remove the CRT, we have to use Windows API functions to replace it (on Windows, obviously).

[Sidenote] We could alternatively link against a dynamic version of the runtime, but the code would still be much bigger than it needs to be, with the added pain of having a dependency on the Microsoft Visual C++ Redistributables, which have caused plenty of annoyance due to the somewhat cryptic error messages reported about some MSVCR##.DLL missing when the packages aren’t installed on the user’s machine.

The file management functions have good equivalents in the Windows API (start from CreateFile). The strings APIs are somewhat harder to find, but present. Memory allocation APIs are plentiful (LocalAlloc or HeapAlloc are good starts, or even VirtualAlloc for more complicated things). The fact that the WinMain() standard entrypoint does not provide “digested” arguments (like C’s argv/argc main() arguments) can also be handled using Windows API calls, but for some reason only the WideChar variant of CommandLineToArgv (i.e. CommandLineToArgvW) is implemented, so we’ll work with WideChar functions in the example.

Let’s create a 64-bit basic DLL loader and a sample DLL. First, the loader’s source code:

#include <windows.h>

int WINAPI WinMain(
    _In_ HINSTANCE hInstance,
    _In_ HINSTANCE hPrevInstance,
    _In_ LPSTR lpCmdLine,
    _In_ int nCmdShow
    )
{
    WCHAR *cmdline;
    int argc;
    WCHAR **argv;

    /// we aren't interested in any of the standard WinMain parameters

    /// get the WideChar command line, split it
    cmdline = GetCommandLineW();
    argv = CommandLineToArgvW(cmdline, &argc);

    /// assume the first parameter is a DLL name, load it
    LoadLibraryW(argv[1]);

    /// free the "digested" parameters
    LocalFree(argv);
    return 0;
}
In the interest of keeping things simple, this code is fugly and evil (it doesn’t check error codes), but it does work.

We have a standard WinMain entrypoint. We get the WideChar version of the command line using GetCommandLineW(), then
split it using CommandLineToArgvW() into the more familiar (to C devs at least) argc/argv pair. We call LoadLibraryW on
the first argument, which we assume is the DLL name, then free the argv obtained from CommandLineToArgvW() and exit.

The DLL is as basic as possible, just a header and a source (the header is given to serve as a template, it’s not actually necessary in this case).

The win64dll.h file:

#pragma once

#ifdef AWESOME_EXPORTS
#define AWESOME_API __declspec(dllexport)
#else
#define AWESOME_API __declspec(dllimport)
#endif

And the .c:

#include "win64dll.h"
#include <windows.h>

BOOL WINAPI DllMain(
    HMODULE hModule,
    DWORD  dwReason,
    LPVOID lpReserved
    )
{
    switch (dwReason)
    {
        case DLL_PROCESS_ATTACH:
            MessageBox(0, "Hello!", "Hello!", MB_OK);
            break;
        case DLL_THREAD_ATTACH:
        case DLL_THREAD_DETACH:
        case DLL_PROCESS_DETACH:
            break;
    }
    return TRUE;
}

Now for the hard part: compiling without the CRT. Since it’s easier to compile from the command line, or at least easier to write command line arguments than project settings/checkboxes in an article, let’s start off with compiling the loader above (presumably called loader.c) with the CRT and go from there:

cl.exe loader.c /link shell32.lib

Assuming your environment is set up correctly (see sidenote below about setting up the environment), this will produce a ~32 KB executable (on VS 2010 at least). The /link shell32.lib parameter is necessary because that lib contains the CommandLineToArgvW function we used. To get rid of the CRT, the /NODEFAULTLIB parameter is used. We’ll also need to explicitly define WinMain as the entrypoint, and add kernel32.lib to the library list, since it’s one of the “default” libs which is no longer automatically linked:

cl.exe loader.c /link shell32.lib kernel32.lib /nodefaultlib /entry:WinMain

[Sidenote] The linker parameters (the ones after /link) are NOT case-sensitive, either /nodefaultlib or /NODEFAULTLIB will work.

This should produce a 2560 byte executable, more than 12 times smaller than the CRT-based one, and doing exactly the same thing.

[Sidenote Q&A] Q. How to tell which .lib files are needed?
A. When the necessary .lib is not given as a command line argument, the compiler complains with an error message such as:

loader.obj : error LNK2019: unresolved external symbol __imp__LoadLibraryW@4 referenced in function _WinMain@16

This means the LoadLibraryW() function is missing its .lib; to find it, simply search for the LoadLibraryW text in the lib folder of the SDK (in a path similar to: “C:\Program Files (x86)\Microsoft SDKs\Windows\v7.1\Lib“) and use the lib it’s found in (in this case, kernel32.lib). As an alternative to searching in the (binary) .lib files, you can check the MSDN reference for the given function (search online for something like “MSDN LoadLibraryW”), which gives the lib filename in the “Library” section of the “Requirements” table at the bottom.

Compiling the DLL needs a few more parameters:

cl.exe win64dll.c /link shell32.lib kernel32.lib user32.lib /nodefaultlib /entry:DllMain /dll /out:win64dll.dll
  • user32.lib is needed for MessageBoxA
  • the /dll switch tells the linker to create a DLL
  • the exact name must be specified (via /out:win64dll.dll), because the linker defaults to creating .exe-s

This should yield a 3072 byte DLL. It’s slightly larger than the executable because of the export section, and because each section’s on-disk size is rounded up to the FileAlignment value from the PE header, which defaults to 0x200 == 512; that is exactly the difference we got (the rounding also absorbs the small difference in actual code size between the two).
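The alignment arithmetic can be sketched in a few lines of Python (a side calculation to illustrate the rounding, not part of the build):

```python
FILE_ALIGNMENT = 0x200  # default PE FileAlignment value


def aligned_size(n, alignment=FILE_ALIGNMENT):
    # round n up to the next multiple of alignment, the way the linker
    # sizes each section's raw (on-disk) data
    return (n + alignment - 1) // alignment * alignment


assert aligned_size(2560) == 2560   # the .exe: already a multiple of 0x200
assert aligned_size(2561) == 3072   # one extra byte costs a whole alignment unit
assert 3072 - 2560 == 0x200         # the DLL/exe difference: exactly one unit
```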

After building the executable and the DLL, running:

loader.exe win64dll.dll

should pop up a “Hello!” message box, then exit.

[Sidenote] Setting up the cl.exe environment involves:

  • adding the VC\BIN subfolder of the MS Visual Studio installation to the %PATH% environment variable (should look like “C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin”)
  • adding to the %INCLUDE% environment variable the INCLUDE subfolders of:
    • the MS Platform SDK installation (e.g. “C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Include”)
    • the MS Visual Studio installation (e.g. “C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\Include”)
  • adding to the %LIB% environment variable the LIB subfolders of the VS and PSDK

The easiest way to do this is via a setupvc.bat file containing:

@echo off
SET VS_PATH=C:\Program Files (x86)\Microsoft Visual Studio 10.0
SET SDK_PATH=C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A
SET PATH=%PATH%;%VS_PATH%\VC\bin
SET INCLUDE=%INCLUDE%;%VS_PATH%\VC\include;%SDK_PATH%\Include
SET LIB=%LIB%;%VS_PATH%\VC\lib;%SDK_PATH%\Lib

After adjusting the paths as needed by your setup, run this before running cl.exe from the command line.

The first part of this text deals with tiny executables in FASM.

Written by vtopan

February 17, 2014 at 10:23 PM

Posted in C, programming, Snippets


Compiling a tiny executable (or DLL) in FASM (or MSVC) + templates – part I: FASM


Having a tiny do-nothing executable can be useful for many purposes (or at least that’s what my malware-analysis-infested-background pushes me to believe). Nowadays, I use such executables to replace part of the half-a-dozen processes some evil software packages (*cough*Acrobat Reader*cough*) install on my system just to keep themselves updated. The same malware analysis experience kicks in again with an advice at this point: DON’T STOP SOFTWARE FROM UPDATING ITSELF unless you know what you’re doing!

Producing a tiny executable (32-bit PE, to be precise) is easy with FASM and pretty straightforward.

For example, save this as win32tiny.fasm:

format PE GUI 4.0
entry start

include 'win32a.inc'

section '.text' code readable executable

  start:
    push 0              ;;; pass an ExitCode of 0 to ExitProcess
    call [ExitProcess]

section '.idata' import data readable writeable
library kernel, 'KERNEL32.DLL'
import kernel, ExitProcess, 'ExitProcess'

All this program does is import the function ExitProcess from kernel32.dll and call it as soon as it is executed.
The “a” in win32a.inc stands for ASCII; we’ll go for the WideChar variant in the 64-bit executable later on.

To compile it, open up a command line (cmd.exe), add FASM’s \include subfolder to the %INCLUDE% environment variable and run FASM.EXE on the above source:

set INCLUDE=%INCLUDE%;c:\path_to_fasm\include
c:\path_to_fasm\FASM.EXE win32tiny.fasm

If all went well, you’ll end up with a win32tiny.exe (which does absolutely nothing) of only 1536 bytes.

[Sidenote] To make compiling sources with FASM easier, add a fasm.bat to your %PATH% environment variable with the following contents:

@echo off
set INCLUDE=%INCLUDE%;c:\path_to_fasm\include
c:\path_to_fasm\FASM.EXE %1

A 64-bit variant of the executable looks very similar:

format PE64 GUI 4.0
entry start

include 'win64a.inc'

section '.text' code readable executable

start:
xor rcx, rcx ;;; pass an ExitCode of 0 to ExitProcess
call [ExitProcess]

section '.idata' import data readable writeable
library kernel32,'KERNEL32.DLL'
import kernel32, ExitProcess,'ExitProcess'

The only differences are the PE64 format (which means the “PE32+” 64-bit format), the 64-bit version of the .inc file and passing the argument to ExitProcess via RCX, as per the Windows 64-bit calling convention (the first four integer arguments go in RCX, RDX, R8 and R9).

If we wanted to build a tiny DLL, things would actually be easier, since no imported functions are required (the DLL entrypoint simply returns “TRUE” to indicate successful loading):

format PE GUI 4.0 DLL
entry DllMain

include 'win32a.inc'

section '.text' code readable executable

proc DllMain hinstDLL, fdwReason, lpvReserved
mov eax, TRUE
ret
endp

The imports (“.idata”) section is gone, making the DLL only 1024 bytes long. The entrypoint changed to a DLL-specific one (see DllMain for details) and is now a proper function returning control to the loader (as opposed to the executables above, in which the call to ExitProcess makes any code after it irrelevant).

Building a 64-bit DLL simply requires adjusting the format to PE64 and the included header to the 64-bit variant, just like above.

Since we got this far, let’s have a look at a 64-bit DLL which might actually be used as a template, with proper imports and exports:

format PE64 GUI 4.0 DLL
entry DllMain

include 'win64a.inc'

section '.text' code readable executable

proc Howdy
invoke MessageBox, 0, title, NULL, MB_ICONERROR + MB_OK
ret
endp

proc DllMain hinstDLL, fdwReason, lpvReserved
mov eax, TRUE
ret
endp

section '.idata' import data readable writeable

library kernel, 'KERNEL32.DLL',\
        user, 'USER32.DLL'

import kernel,\
GetLastError, 'GetLastError',\
FormatMessage, 'FormatMessageA',\
LocalFree, 'LocalFree'

import user,\
MessageBox, 'MessageBoxA'

section '.data' data readable writeable
title db "Hello, world!", 0

section '.edata' export data readable

export 'win64tiny.dll',\
Howdy, 'Howdy'

data fixups
end data

It has an exported function called “Howdy” which shows a message box, and a few more imports (some unused) to show how you can have more than one imported function per DLL. It also uses “invoke” to perform the call to MessageBox to keep things simple. The “data fixups” at the end generates relocation information, without which any real-life DLL would be unusable.

Part two of this text deals with doing this in MSVC.

Written by vtopan

February 16, 2014 at 7:57 PM

Funky Python – code snippets


Python is a great programming language for a number of reasons, but one of its best features is the fact that, as per Python Zen item #13, “There should be one– and preferably only one –obvious way to do it.” Working with the beast for a number of years, however, does expose one to some less pythonic and somewhat quirky design points (maybe even a few gotchas); here are some of them.

Python quirks & gotchas

1. Boolean values in numeric contexts evaluate to 0/1

It’s very intuitive (especially when coming to Python from C) for 0 values to evaluate to False in boolean contexts and non-zero values to True. Having False evaluate to 0 and True to 1 in numeric contexts is however less intuitive (and somewhat useless):

>>> a = [1, 2, 3]
>>> a[True], a[False]
(2, 1)
>>> True + True # this one was rather unexpected...
2

2. The default argument values are evaluated at the point of function definition in the defining scope

This is probably one of the most frequent gotchas out there:

>>> def a(b=[]):
...     b.append(3)
...     print b
>>> a()
[3]
>>> a()
[3, 3]

The proper way to do this is to set b’s default value to None in the declaration and set it to [] inside the function body if it’s set to None:

>>> def a(b=None):
...     if b is None: b = []        
...     b.append(3)

3. *Everything* is an object

Although it’s a fact which may escape even the best of programmers at the beginning, Python is actually object oriented to the core. Despite the fact that it allows you to write procedural (and even functionalish) code, everything is an object. Functions are objects, data types are objects etc. This:

>>> 69.bit_length()
SyntaxError: invalid syntax
doesn’t work because the period is parsed as part of the numeric token (think of 69.j or 69.e1); this however:

>>> (69).bit_length()
7
works. Being objects, functions can also have attributes:

>>> def a():
...    print a.x
>>> a.x = 3
>>> a()
3

This comes in handy e.g. for giving a decorated function the same internal name (for introspection purposes) as the original function:

import inspect

def print_call_decorator(fun):
    def replacement(*args, **kwargs):
        res = fun(*args, **kwargs)
        print r'Call %s.%s => %s' % (inspect.getmodule(fun).__name__, fun.__name__, res)
        return res
    replacement.__name__ = fun.__name__
    return replacement
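For completeness, here is the decorator above in action (a sketch; the add function and its expected module prefix are made up for the demo, and print is called function-style so the snippet also runs on Python 3):

```python
import inspect


def print_call_decorator(fun):
    def replacement(*args, **kwargs):
        res = fun(*args, **kwargs)
        print('Call %s.%s => %s' % (inspect.getmodule(fun).__name__, fun.__name__, res))
        return res
    replacement.__name__ = fun.__name__  # keep the original name for introspection
    return replacement


@print_call_decorator
def add(a, b):
    return a + b


add(2, 3)            # prints e.g.: Call __main__.add => 5
print(add.__name__)  # prints: add (not "replacement")
```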

4. Generators, sets & dictionaries also have comprehensions (called “displays”)

As you probably know, list comprehensions are a great way to generate a new list from another iterable. But it goes further… Generators, sets and dicts also have something similar, called displays. The basic syntax is this:
generator = (value for ... in ...)
dict = {key:value for ... in ...}
set = {value for ... in ...}


>>> a = ['a', 'b', 'c']
>>> d = {x:a.index(x) for x in a}
>>> d
{'a': 0, 'c': 2, 'b': 1}
>>> d_rev = {d[x]:x for x in d}
>>> d_rev
{0: 'a', 1: 'b', 2: 'c'}

This makes reversing a dictionary, for example, much cleaner.
What makes displays even more fun is that, like list comprehensions, they allow any number of nested for clauses; using them to flatten a list of lists would look something like this:

flat = [x for y in lst for x in y]

The x/y order in the example is not a mistake; that’s actually the proper order.
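A way to remember the order: the for clauses read left to right exactly as the equivalent nested loops read top to bottom. A quick sketch (the sample list is made up):

```python
lst = [[1, 2], [3, 4], [5]]

# comprehension form: the outer loop comes first, the inner loop second
flat = [x for y in lst for x in y]

# equivalent nested loops
flat2 = []
for y in lst:        # outer: each sublist
    for x in y:      # inner: each element of the sublist
        flat2.append(x)

print(flat)           # [1, 2, 3, 4, 5]
print(flat == flat2)  # True
```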

5. GIL: Python threads aren’t

Not on multi-processor machines, at least. Yes, there is a threading module (aptly named), but due to the Global Interpreter Lock, threads of the same (Python) process can’t actually run at the same time. This becomes more of an issue when deploying native-Python servers, as they don’t get any benefit from the number of cores installed on the machine (drastically limiting the number of open sockets a Python process can handle at the same time as opposed to a native one written in C).

6. for has an else clause

…and so do the try…except/finally and while constructs. In all cases, the else branch is executed if all went well (break wasn’t used to stop the loop / no exception occurred). While for…else is genuinely useful when you want something to happen only if the loop wasn’t “broken” (the classic example is handling the case where the loop didn’t find the value it was looking for), try doesn’t really need an else clause, as the following are equivalent and the latter seems, at least to me, more readable:

  • with else:
  • without else:
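To make the comparison concrete, here is a sketch of both try forms (the function names are made up for the demo), plus the classic for…else search:

```python
def search_demo(haystack, needle):
    # classic for...else: the else branch runs only if the loop wasn't broken out of
    for item in haystack:
        if item == needle:
            result = 'found'
            break
    else:
        result = 'not found'
    return result


def parse_with_else(text):
    # the else body runs only when no exception occurred in the try body
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        return value * 2


def parse_without_else(text):
    # same logic without else; arguably easier to read
    try:
        value = int(text)
        return value * 2
    except ValueError:
        return None
```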

7. Tuple assignment order

In tuple assignments, the left-values are eval’ed & assigned in order:

>>> a = [1, 2, 3]
>>> i, a[i] = 1, 5 # i is set to 1 *before* a[i] is evaluated
>>> a
[1, 5, 3]

This happens because tuple assignments are equivalent to assigning the unpacked pairs in order; the second line above is therefore equivalent to:

>>> i = 1
>>> a[i] = 5

8. Scope juggling

Inside a function, variables are resolved in the global scope if no direct variable assignment appears in the
function, but are local otherwise (making Python clairvoyant, as it is able to tell that something is going to happen later on inside the function, i.e. a variable will be set). Note that attributes and sequence/dict values can still be set, just not the “whole” variable…

a1 = [1, 2, 3]
a2 = [1, 2, 3]
b = 3
c = 4
def fun():
    global b
    print c # crashes, because c is resolved to the local one (which is not set at this point)
    print b # works, because the global directive above forces b to be resolved to the global value
    a1[0] = 4 # works, because a1 is not directly set anywhere inside the function
    a2[0] = 5 # crashes, because a2 is later on directly set
    c = 10
    b = 11
    a2 = 'something else'
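The same behavior can be verified with a print-free variant of the example above (so it also runs on Python 3), catching the error instead of crashing:

```python
a1 = [1, 2, 3]
a2 = [1, 2, 3]
b = 3


def fun():
    global b
    a1[0] = 4       # works: a1 is never directly assigned inside fun
    b = 11          # works: the global directive makes b refer to the module-level b
    try:
        a2[0] = 5   # raises UnboundLocalError: a2 IS assigned below, so it's local
    except UnboundLocalError:
        caught = True
    a2 = 'something else'
    return caught


result = fun()  # True: the a2 access did raise
```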

Bonus facts

As a bonus for making it through to the end, here are some lesser known / less frequently pondered upon facts about Python:

  1. a reference to the current list comprehension (from the inside) can be obtained with:
  2. list comprehensions can contain any number of for…in levels (this is actually documented). Flattening lists:
    flat = [x for y in lst for x in y]
  3. range() actually builds a list, which can be slow and memory-consuming for large values; use xrange()
  4. modules have a dict attribute .__dict__ with all global symbols
  5. the sys.path list can be tampered with before some imports to selectively import modules from dynamically-generated paths
  6. flushing file I/O must be followed by an os.fsync(…) to actually work:
  7. after instantiation, object methods have the read-only attributes .im_self and .im_func set to the instance the method is bound to and the implementing function respectively (.im_class holds the class)
  8. some_set.discard(x) removes x from the set only if present (without raising an exception otherwise)
  9. when computing actual indexes for sequences, negative indexes get added with the sequence length; if the result is still negative, IndexError is raised (so [1, 2, 3][-2] is 2 and [1, 2, 3][-4] raises IndexError)
  10. strings have the .center(width[, fillchar]) method, which pads them centered with fillchar (defaults to space) to the length given in width
  11. inequality tests can be chained: 1 < x < 2 works
  12. the minimum number of bits required to represent an integer (or long) can be obtained with the integer’s .bit_length() method
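A few of the facts above can be checked directly in a quick sketch (the temp file exists only to demonstrate the flush/fsync pairing; os.fsync is the real API, the file itself is throwaway):

```python
import os
import tempfile

# 6. flush() pushes data to the OS; os.fsync() pushes it to the disk
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as f:
    f.write('data')
    f.flush()
    os.fsync(f.fileno())
os.remove(path)

# 8. discard() never raises, unlike remove()
s = {1, 2, 3}
s.discard(99)                       # no exception for a missing element
assert s == {1, 2, 3}

# 9. negative indexes: the sequence length is added once
assert [1, 2, 3][-2] == 2           # -2 + 3 == 1
# [1, 2, 3][-4] raises IndexError (-4 + 3 is still negative)

# 10. centering with a fill character
assert 'ab'.center(6, '*') == '**ab**'

# 11. chained comparisons
x = 1.5
assert 1 < x < 2

# 12. minimum bits needed to represent an integer
assert (255).bit_length() == 8
assert (256).bit_length() == 9
```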

Written by vtopan

March 17, 2011 at 12:37 AM

Posted in Python, Snippets

Switching to Linux (1)


Since my first Linux “experience” (which happened some eight years ago during a CS lab at UTCN), every couple of years I spend somewhere between a few days to several weeks trying to switch over to “the other side”. The reasons behind these (as of yet futile) attempts revolve mostly around the concept of “freedom”, and are beyond the scope of this writing. What I find of greater interest is the evolution of Linux-based OSes, and in particular of their target audience, which shifts more and more from hardcore enthusiasts willing to spend countless hours setting up a new machine toward average computer users (even the ones of the “point-and-clicky” variety).

The “sparse” look

My first Linux (Mandriva) was text-only; X-Window was in an “almost-working” state (crashing often, and even more often not being able to start at all) on most Linux machines I touched back then. Some time later most distributions had GUIs, but all the relevant work was still done behind the scenes by console programs, which is still the case today. The myriad of “flavors” (and window managers) has made it practically impossible to write even remotely portable GUI interfaces for Linux, so graphic interfaces get “strapped” (pardon the low-brow hint; it truly feels like le mot juste) on console programs.

And that’s precisely how most Linux GUIs look and feel: painful. Most of the window space is simply wasted: text is larger than necessary (for most people) and controls are separated by vast amounts of empty space, giving the interfaces a very “sparse” look. And deeper on the causality chain of problems is the fact that there simply isn’t that much exposed functionality in most Linux GUIs. Although to a much lesser extent than was the case years ago, you still have to drill down way beyond the graphical interface in order to accomplish most non-trivial tasks. And then there’s responsiveness. After being spoiled by native graphical interfaces (some optimized to the point of writing machine assembly-level code) with excellent responsiveness (as is more often than not the case on Windows), the noticeable lag experienced on most interactions with Linux GUIs tends to annoy me in a very subtle manner. Then there’s Java and Java-based GUIs, which make Linux GUIs feel lightning fast, but I won’t go there.

Along came Ubuntu

My most recent attempt at Linux started a couple of months ago, but was interrupted by even-more-work-than-usual at the job, and would have been completely forgotten and abandoned if not for a Linux-vs-Windows themed conversation with my coworkers. I complained about most of the things that annoyed me about Linux (no decent music player is near the top of my list), and after getting many answers along the main theme of “software X has come a long way since then“, I decided to actually give it (yet) another chance.

My previous experience with Ubuntu (8.04, I think) had been almost pleasant, by far more so than any other previously-tried flavor (the most notable being Mandriva, back when it was called Mandrake, and Red Hat at home, plus CentOS at work), so Ubuntu 10.10 felt like the way to go. After some research regarding “the most popular Linux”, Linux Mint popped up as a tempting Ubuntu/Debian-based alternative. The sheer volume of documentation and user assistance available for vanilla Ubuntu convinced me to stick with it, and so far it has been the right decision: as I’ve come to expect when setting up a Linux OS, I’ve had problems requiring “workarounds” from day one.

The good

In spite of the minor technical “misadventures” during setup, the Ubuntu 10.10 GUI finally feels mature (and almost responsive *enough*). The themes look good, the fonts are readable even at smaller sizes etc. And then there are the repositories: thanks to my recently-acquired 100Mbit Internet connection, within a few hours of the installation I was already playing a pretty good-looking FPS (Assault Cube) and enjoying it. I’m not much of a gamer, but on one hand I was curious how far free games have come, and on the other I had a lot of blood-spill-requiring frustration left over from working out what should have been minor kinks but turned into major research themes.

I actually managed to set up both my PPP Internet connection and a VPN to my workplace without much hassle, which is a notable first. The VPN actually works better than on Windows because I have a convenient checkbox option to only route traffic going towards the VPN server’s network through it (as opposed to manually deleting the route through the VPN server on Windows, because some clever bloke figured I *must* want *all* my Internet traffic to be routed through a gateway which only knows private addresses).

The bad

The NTFS driver (ntfs-3g). It’s not bad per se; in fact it has also “come a long way”, and when it works, it works fine. But in one instance it *chose* not to work for me, which I found very frustrating and annoying. My problem (and it seems to be a rather common one) is that on a recently-acquired USB hard disk, Windows appears to have messed up either the partition table or the NTFS filesystem; the catch is that it only appears that way to the ntfs-3g driver. Which is not to say that it’s wrong (from what I could gather, the size of the filesystem is set to a larger value than there actually is room for on the disk, the difference being a few sectors, i.e. a few KB); it’s just that Windows doesn’t seem to mind and reads from/writes to the disk without problems. I imagine that if I were to write to those last few KB the data would be lost, but at least I can access the data on the disk, which ntfs-3g won’t allow: it wouldn’t mount the disk even in read-only mode. Adding insult to injury, the Tuxera CTO (an otherwise friendly and helpful person) suggests (here) that the only way to have the warning ignored is to “change the sourcecode”. Booting back into Windows, backing up the data and reformatting the drive to a smaller size fixed the problem, but it shouldn’t have been necessary, and the “I know what’s right for you better than you ’cause I’m the pro” attitude was somewhat disappointing.

Another problem is the lack of a decent file manager. After using all the “commanders” (Norton Commander, then the amazing Dos Navigator, and nowadays Total Commander), I’m used to having software which can handle all file-related operations (and I do a lot of them for my day job) easily and efficiently. TC, which I wholeheartedly recommend on Windows, handles everything just fine. On Linux, so far I haven’t even been able to find a (GUI) file manager with an actual “brief” view mode; all of them insist on giving me a long line of information about each file, whether I actually need it or not, and waste about two thirds of the available screen space in the process. All the features offered by TC, not to mention the plethora of plugins available for it, are still far, far away. And since we’ve hit the sensitive point of software equivalents for Linux, here’s what I’ve managed to find so far.

Software alternatives for Linux

File manager

As mentioned above, I’m profoundly dissatisfied with what I’ve found so far. MC is a must, but lacks many features. Double Commander seems to be the best contender, and is built to be similar to TC (going as far as plugin interchangeability, if only there were any ELF plugins for TC…), which is a plus.

Music player

Finding a decent music player (i.e. one which is stable and has a compact interface like WinAMP and the other *AMPs on Windows) had been a seemingly impossible feat for years; then along came Audacious, and all became well.

Image viewer

If good file managers and music players are hard to come by on Linux, image viewers are far more challenging. None of them seems to grasp the basic concept of viewing a folder of images; all of them insist on “organizing my collection of photos” (’cause it’s trendy to index collections of stuff), and offer either very cumbersome methods of simply browsing image folders, or no way at all (except for, of course, indexing/organising the folders into a collection). The excellent XnView image viewer for Windows has a multi-platform version aptly called XnView MP, with the downside that development favors the Windows version and Linux builds don’t ship for every release.


I’m still looking into options for a development IDE (for C and Python in particular), with no luck as of yet.

As far as web browsing is concerned, all the relevant alternatives for Windows (Opera, Firefox and Chrome) are present on Linux, and from the order of the above enumeration my browser of choice should be obvious enough.

For an office suite I use OpenOffice on Windows, which is also available on most platforms.

I strongly recommend Guake as a terminal and Gnome Do as a generic application/document opening method.

[To be continued]

Written by vtopan

November 11, 2010 at 2:56 AM

Mafia Wars tips, tricks & fixing problems

leave a comment »

Mafia wars

I, like many others, have fallen victim to Zynga’s Mafia Wars, the “game” that looks like just-another-text-based-massively-multiplayer-something-something not worth clicking into, and yet becomes highly addictive if given the chance (in fact, it’s how I got to have a Facebook account and the only actual use I’ve found for the afore-mentioned account). This is not a rant on its impressive success, however, but a few useful things I’ve learned about the game.

Job help / boss defeat links

I got to writing this text because one of my Bangkok boss defeat links didn’t publish. And since that would’ve brought me ~500K – 1 million “Baht”, I *had* to figure out how to fix it. Since it might actually be of use to other MW-victims, here goes the info.

Your Facebook ID

You need your Facebook ID; hover over a link to your name/homepage anywhere on Facebook and get it from the link. The link should look something like: http://www.facebook.com/profile.php?id=your-facebook-ID&ref=nf

Bangkok – boss

The Bangkok boss defeat links provide the clickers with B$1000 and 6 XP, and the publisher with $500.000-5.000.000 Baht, so they’re a great thing.

Cool tools

I’ve found some very useful tools for speeding up routine operations (collecting bonuses, sending gifts etc.) in Mafia Wars made by MW-enthusiasts. I’m not sure they respect Zynga’s TOS (actually I have a strong hunch they don’t), so use with care (and measure, to keep the game fun for everyone):

Some of the bookmarklets help you choose what items to buy to improve your attack/defense, find ongoing wars in which you might help or advise you on whom to promote to your Top Mafia to maximize the bonuses.

Misc hints

  • the Zynga toolbar (yes, like everyone else, I hate toolbars) is safe to install and provides an extra “mini energy pack” of 25% of your energy every 8 hours (it also gives an item of less use). To actually get the bonus you have to start MW by clicking the “Play Now” button on the toolbar when the last text on the toolbar says “Ready” in green (instead of a countdown timer in red). For some reason, I also have to login fresh (delete cookies) to Facebook every time for it to work.
    Note: some methods have been published to get the mini pack without the toolbar; the loopholes have been patched, and if new ones will be found, they will be patched as well. Since it’s not that much of a hassle, you’re better off just installing the toolbar.
  • in Bangkok, the final Boss job gives a payout between $500.000-5.000.000 “Baht”, which can really help in starting your businesses, so make sure you publish it and enough friends help you
  • if you’re like me and only use your Facebook account for playing MW, you can find plenty of friends to add to your mafia in the comments on the MafiaWarsFans fan page; following that page can also keep you informed on game news and might get you extra Godfather points, offered there on random occasions by Zynga

Later edit

After Mafia Wars I played Travian, and then Lord of Ultima (which has amazing graphics for a browser game), and then I simply decided my time was pointlessly wasted on them, so I stopped. You can too! 🙂

Written by vtopan

January 25, 2010 at 7:33 PM

Recovering data from a dead Windows (NTFS) disk using Linux

with one comment

At some point in your IT-enthusiast life you must’ve had at least one dead HDD, off of which Windows wouldn’t boot anymore. Up until a while ago, particularly if the partitions were formatted with NTFS, the situation was pretty much hopeless. Nowadays, with very-much-improved NTFS support under Linux (rather tolerant to faults compared to its native counterpart under Windows), it isn’t always so. If the HDD is in a “coma” (i.e. almost dead, but still “sort of” kicking), booting off a Linux live CD might still help recover (some of) the data. Basic steps:

  1. Get a Linux live CD distribution which has good built in NTFS support (most of them have basic support by now) and ddrescue
  2. Boot off the live CD and use ddrescue to get a binary image of each partition or mount the partition(s) and copy the files to a safe place
  3. [If using the dd(rescue) approach] mount the images as drives under Windows and copy the files or be brave and mount the partition in a VM and try to actually boot it, at least as far as a command prompt (safe mode) or use a backup/partitioning tool to write the images to another disk

If you’re not paranoid about security (by nature or by job description), i.e. you don’t use EFS for your most sensitive data, you’re pretty much off the hook. If you’ve made the punishable-by-huge-amounts-of-pain mistake of using EFS and your disk crashed, as is my case, hope is as dimmed as the foresight of the folks who designed NTFS and used more than the actual user password to encrypt the data. As it turns out, to decrypt the files you need a certificate which can only be generated on the machine which encrypted the files, which is, of course, precisely the machine that will no longer boot.

Linux live CDs with NTFS support

I’ve tried SystemRescueCd, Trinity Rescue Kit, RIP Linux and plain vanilla Knoppix, and Trinity Rescue Kit appears to be the best: it has ntfstools / Linux-NTFS installed, and it didn’t hang on boot because of the failing HDD (other distros did). As a sidenote, I haven’t managed to boot the GUI (X) of any of the distros, as my laptop monitor/graphics card seems to be uncooperative with the standard drivers/VESA mode, but apart from the visual partition manager, everything works fine from the console anyway.

When choosing a distro, the main points to check are whether it has the ntfs-3g driver (as recent a version as possible, as it keeps getting better at a fast pace) and the ntfstools / Linux-NTFS suite I mentioned earlier, especially if you’ve used EFS to encrypt your data (in which case the only viable solution appears to be ntfsdecrypt from that suite, which needs the certificate with which the files were encrypted, which in its turn needs you to boot the (dead) machine; still, it appears to be the only way to get the data back).

Using dd/ddrescue to recover (NTFS) partitions

dd / ddrescue

The tool to move binary data from one place to another under Linux is dd. It also has a data-recovery-oriented cousin called ddrescue, which basically does the same thing, but is more fault-tolerant.
Basic dd usage:

dd if=/source of=/destination

if stands for input file and of for output file, and neither of them has to be a regular file (in the Windows sense): device nodes work just as well; /dev/sda1, for example, is the first partition on the sda disk.
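Since dd treats everything as a stream of bytes, its basic usage can be rehearsed safely on plain files first; a minimal sketch (the /tmp paths are placeholders for this example only):

```shell
# dd copies plain files just as happily as devices -- a device-free illustration:
printf 'hello' > /tmp/dd_src.txt
dd if=/tmp/dd_src.txt of=/tmp/dd_dst.txt 2>/dev/null
cat /tmp/dd_dst.txt   # the copy holds the same bytes as the source
```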
To back up just the MBR of the disk (the first 512 bytes) use:

dd if=/dev/sda of=/mnt/sdb1/saved/mbr.bin bs=512 count=1

This assumes that the source disk is sda and that sdb1 is the partition to which you want to back up the data, so in your particular case they may need to be changed. See the next section if you’re not sure which disk is mapped to which name.
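Restoring the saved MBR later is the same command with if and of swapped, plus conv=notrunc so dd overwrites the first 512 bytes in place instead of truncating the target. A sketch rehearsed on scratch files instead of real devices (all /tmp paths and the device names in the comment are placeholders):

```shell
# On a real disk the restore would be:
#   dd if=/mnt/sdb1/saved/mbr.bin of=/dev/sda bs=512 count=1 conv=notrunc
# Rehearsal on scratch files, so nothing can be damaged:
dd if=/dev/zero of=/tmp/fake_disk.bin bs=512 count=4 2>/dev/null   # a tiny stand-in "disk"
printf 'MBR!' > /tmp/fake_mbr.bin                                  # a stand-in saved MBR
dd if=/tmp/fake_mbr.bin of=/tmp/fake_disk.bin bs=512 count=1 conv=notrunc 2>/dev/null
head -c 4 /tmp/fake_disk.bin   # the "MBR" is now at the start of the "disk"
```

Note that conv=notrunc leaves the rest of the target untouched; without it, dd would shrink the output file to the size of the input.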
ddrescue uses fixed-position input (first) and output (second) arguments:

ddrescue -v /source /destination

The -v option makes ddrescue verbose (i.e. periodically print progress). ddrescue also accepts an optional third argument naming a mapfile in which it records its progress; re-running the same command later resumes the rescue where it left off instead of starting over, which matters a lot on a disk that takes days to read.
Note: by default, dd prints no progress/info until its job is finished. To check up on its progress, open another console (the terminals are mapped to Alt+N shortcuts in Linux, N >= 1, usually up to 4) and send it the USR1 signal. To do that, first you need to find its PID using ps:

ps -A|grep dd

Then, assuming the PID of the dd process is 3456, use kill:

kill -USR1 3456

That won’t actually kill the process, in spite of its name; it will just send it the USR1 signal, which makes dd print its current status (switch to the dd terminal to see it). The command’s name (“kill”) comes from its most frequent usage, which is to send a process the KILL signal (i.e. “kill” it).
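The ps and kill steps can be collapsed into one line. A self-contained sketch, assuming GNU dd (which handles USR1 this way) and the pidof utility are available:

```shell
# Start a deliberately long dd in the background, capturing its stderr:
dd if=/dev/zero of=/dev/null bs=1M count=500000 2>/tmp/dd_status.log &
sleep 1
# ps + kill combined: signal the running dd by name (pidof is assumed available):
kill -USR1 "$(pidof dd | awk '{print $1}')"
sleep 1
kill "$(pidof dd | awk '{print $1}')" 2>/dev/null   # stop the demo dd
cat /tmp/dd_status.log   # GNU dd wrote its "... bytes ... copied" status here
```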

Linux drive mapping

Linux maps your disks under /dev with names following the (“regex-like”) pattern [hs]d[abcd]. An h prefix means an (older) IDE disk, while an s prefix means a serial disk (usually an internal SATA or external USB disk). The individual partitions follow the disk naming plus a digit designating the partition number. So, for example, if you have a SATA disk with two partitions, the disk would be /dev/sda, the first partition /dev/sda1 and the second partition /dev/sda2.
To see the available disks/partitions, use ls (the Linux equivalent of dir):

ls /dev/sd*
ls /dev/hd*

To get extended disk info, use hdparm:

hdparm -I /dev/sda

The disks (actually the partitions) found under /dev need to be mounted before the files on them can be read/written; up until that point they are just huge blobs of binary data.
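As an alternative to listing /dev, the kernel's own inventory of block devices can be read directly (Linux-specific, works without root):

```shell
# Every block device and partition the kernel currently knows about:
cat /proc/partitions
```

The major/minor columns are the kernel's device numbers and #blocks is the size in 1 KB blocks, so this is also a quick sanity check that the disk's reported size matches what you expect.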

Note: for the rest of this writing, for simplicity’s sake, I’ll assume that sda is the broken disk and it has two partitions, and that the recovered files/image go to sdb.

There are two ways to mount NTFS partitions: either using the default NTFS driver which comes with mount (ignores many problems, doesn’t care if Windows was improperly stopped & the drive was left “unclean”, read-only mode by default) or the ntfs-3g driver (more sensitive, read-write by default). Use the plain mount for the broken disk and the ntfs-3g version for the drives to which you need read-write access.
First off, you need to make appropriate folders for the partitions to be mounted under; standard practice is to do it under the /mnt folder. e.g.:

mkdir /mnt/sda1
mkdir /mnt/sda2
mkdir /mnt/sdb1

Note that the /mnt folder may not exist, in which case it must be created first: mkdir /mnt
Next, mount partitions from the broken disk (read-only):

mount /dev/sda1 /mnt/sda1
mount /dev/sda2 /mnt/sda2

The syntax of the mount command is straightforward: mount /what /where; /what is the device, /where is the mount point in the filesystem. It takes other arguments, such as -t type to set the filesystem type, but NTFS is (nowadays) recognized automatically. The naming convention for the mount points is your choice (you could mount the thing on something like /my/broken/disk/partition/number/1), but sticking to the “standard” /mnt path and using the original device’s name (or the partition letter if you’re more accustomed to that and a lazy typist, e.g. /mnt/c) is easier, and the help you find on the net will make more sense.
Last step in the mount process: mounting the destination disk in read-write mode (default for ntfs-3g):

ntfs-3g /dev/sdb1 /mnt/sdb1

or:

mount -t ntfs-3g /dev/sdb1 /mnt/sdb1

The syntax is similar to the mount command; to check if the distro you chose has the ntfs-3g command built in, simply try to run it. If it doesn’t, choose another distro.

Copying the data

Run either dd or ddrescue (the latter is preferred if the disk is only partially readable):

dd if=/dev/sda1 of=/mnt/sdb1/saved/part1.bin

or:

ddrescue -n /dev/sda1 /mnt/sdb1/saved/part1.bin

WARNING: never pass an entire disk as the destination to dd/ddrescue unless you actually want its contents overwritten (which will be the case when you restore the saved image to a new disk); be sure to add a file name otherwise. The -n option prevents ddrescue from retrying error areas, which is usually what you want; if your disk does yield data after enough retries, omit it.

Mounting the (NTFS) partition(s) from Linux/Windows

You can mount the newly backed-up partitions from Linux using the loop feature:

mount -o loop /mnt/sdb1/saved/part1.bin /mnt/part1

The partition image can also be mounted directly from Windows using the ImDisk Virtual Disk Driver (free) or using some rather expensive commercial tools (google for alternatives).

Backing up/restoring partitions/whole disks

Alternatively, you can use the dd command to copy the entire disk and write the image to a fresh (identical) disk. Writing an image to a partition/disk using dd simply requires passing the disk as the of argument:
Restoring a partition:

dd if=/mnt/sdb1/saved/part1.bin of=/dev/sdc1

Restoring an entire disk:

dd if=/mnt/sdb1/saved/whole-disk.bin of=/dev/sdc

WARNING: be careful when overwriting raw partition/disk contents; choose other recovery methods unless you understand exactly what you’re doing.
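Before running these against real hardware, the whole back-up/restore cycle can be rehearsed on scratch files; the commands are identical, only the paths differ (all paths below are placeholders):

```shell
# A pretend 64 KB "disk" filled with random data:
dd if=/dev/urandom of=/tmp/orig_disk.bin bs=1K count=64 2>/dev/null
# "Back up" the disk to an image, then "restore" the image to a new "disk":
dd if=/tmp/orig_disk.bin of=/tmp/saved_image.bin 2>/dev/null
dd if=/tmp/saved_image.bin of=/tmp/new_disk.bin 2>/dev/null
# Verify the round trip was lossless before trusting the procedure:
cmp /tmp/orig_disk.bin /tmp/new_disk.bin && echo "round trip OK"
```

The cmp step is worth keeping even on real disks (comparing the source partition against the saved image) before the original is wiped or discarded.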

Recovering files from raw data/deleted files: data carving

If the partition table/NTFS structure is broken and you can’t mount the partitions but you can read the binary data, you can use TestDisk (or PhotoRec, its file-carving companion shipped in the same package) to recover some of the files (the ones with a specific structure, such as images and music, are more likely to be found than, say, plain text files). This is basically the same thing that file recovery programs (such as Recuva) do on the unused space of a disk to recover deleted files.

Recovering EFS encrypted files

As I’ve mentioned in the opening paragraph, to recover EFS encrypted files, even under Linux, you need a recovery certificate. If you don’t have one, EFS file recovery software might help, but I’ve had little luck with such tools. I know of no open source/free software which does this, so you’ll probably have to use commercial software such as Advanced EFS Data Recovery from ElcomSoft (demo version available). The link called “encrypted file system recovery” from the following section details the process of manually extracting the required information for EFS recovery.

Further reading

Moral of the story

  • ALWAYS BACK UP YOUR IMPORTANT DATA. Seriously. Now. Go get some storage space (USB flash drive, external hard disk, even DVDs if you make a new one often enough, as they tend not to last very long) and copy your data on it. GO!
  • Don’t use EFS under NTFS. Use an alternative encryption solution, e.g. TrueCrypt. There are portable (i.e. works-from-flash-drive) editions of most encryption tools should the need arise, and they are reliable (I’ve used TrueCrypt without problems for quite a while now).
  • If you MUST use EFS, create a recovery certificate using CIPHER /R:filename (details here) and store it in a safe place.

Written by vtopan

November 15, 2009 at 11:51 PM