RAM Scraping: Reading passwords off your browser

In this post I want to point out, how it is possible to recover sensitive information like PayPal passwords from computer memory with a specific example. The idea relies on sensitive data still being stored in the process memory, even after TLS/SSL-encryption.

Intuitive approach
The attacking program is supposed to iterate through all memory pages of interesting processes, where interesting processes are browsers (on Windows being named “iexplore.exe”, “firefox.exe” and “chrome.exe”), in order to scan for essential HTTP parameters like “login_email=” and “login_password=” in order to find HTTP POST data like “login_email=john.doe@example.com&login_password=secret” and extract the email address and the password.

The search algorithm
Although the attack itself is very interesting but not too challenging, let us begin with the somewhat more challenging and – from the perspective of a computer scientist – more elementary part: the search algorithm. As we want to scan one large amount of binary data for a bunch of parameters, an algorithm allowing to scan for multiple words at the same time is required, preferably with a runtime of O(n). The intuitive idea is to create a deterministic finite automaton (DFA) from the given set of words. For example, if we search for the words “ab” and “car”, our automaton might look like this:

AutomatonThe programmatic creation of this automaton is difficult to some extent because it does not only need to match “ab” and “car” but also “cab”. If there is no edge from one state to another, the algorithm continues to search from the initial state. Please be aware, that every state which is not an accepting state still has edges for all characters of its input alphabet to the initial state. Those are simply not shown in the illustration for reasons of readability. This described algorithm is not new but a slight modification of an algorithm being known since 1975 as Aho-Corasick-algorithm.
The automaton is programmatically built by creating a non-deterministic finite automaton (NFA) which is then converted to the desired DFA using the powerset construction method. Conversion takes place so that only one pointer at a time is required to hold the current state which again allows to fulfill the runtime requirement of O(n). Furthermore, we convert the DFA (which is in my implementation based on the map-STL-library) to an array structure so that we are really sure to meet the runtime of O(n) as close as possible in order to make iterations over the maps unnecessary as you can see in the table below.

Array index Input character Projection index
-1 (not an element of the input alphabet)
0x61 (97) a 0
0x62 (98) b 1
0x63 (99) c 2
-1 (not an element of the input alphabet)
0x72 (114) r 3
-1 (not an element of the input alphabet)

The above table basically shows an array from 0x00 (0) to 0xFF (255) holding the indices of the input symbols. Knowing this projection, this allows us to implement an automaton using the following recursive structure:

struct StateArray
{
public:
	StateArray** Transitions;
};

In our case, an accepting state is represented by a NULL-pointer which saves us a byte (which is the size of a boolean) per state. The first asterisk is used to point to a dynamically allocated array which is possible as soon as the cardinality of the input alphabet is known and is based on the table above, the second one indicates a pointer to the successing state. In case of the example automaton, this means that any state of the example automaton is represented by an array consisting of four pointers.

Reading the process memory
To be honest, this is what I consider the ugliest part, although I do not really know the source of my aversion for the Windows API. What I (and many other people) find very surprising, is the fact that the Windows API allows to open and read the process memory of foreign processes without elevated priviliges. Practically, this means that it is possible to just read out any browser’s process memory without requiring administrator privileges. This makes it easy for trojan horses to just read off the unencrypted data. At the moment, I do not wish to further elaborate on this part, as you will be able to inspect the corresponding class (MemorySpy) in the code below.

Video example

Continue reading / Weiterlesen