Understanding RVAs and Import Tables - by Sunshine

Download Example File: tutrvait.zip (including this tut and file discussed in this tut; right click -> save target)

1. Introduction

After finally understanding the section table of a PE file, I started to look at the Import Table. The Import Table stores which functions from which DLLs are used by the program. It's quite interesting, but much more complicated than the section table because we have to use RVAs quite often. I will say a few words about them before we really start examining the Import Table.
The only tool needed is a hex editor (I use Hex Workshop). I will describe Import Tables in general, and after that we will test our knowledge with an example. You should also have some knowledge of the PE file format. I'm a beginner too, so don't blame me if not everything is absolutely right; I just want to help other newbies understand the PE file format, and I would be happy if someone told me what is wrong! Last word here: sorry for my bad English, it's not my mother tongue! OK, let's start.

2. RVA

An RVA ("relative virtual address") describes a memory address relative to the base address at which the file is loaded, so it can be used even if the base address is unknown. So it's not the same as a file offset! If you have the section table, it's easy to calculate a file offset from an RVA. I'll explain with my example file from the zip archive (see above).
The section table of this file looks like the following:

Section  Virtual Size  Virtual Offset  Raw Size  Raw Offset  Characteristics
.text    0000024C      00001000        00000400  00000400    60000020
.rdata   000001DC      00002000        00000200  00000800    40000040
.data    000000E0      00003000        00000200  00000A00    C0000040

OK, first you must find out which section contains the given RVA. Then you calculate the file offset like this:
File Offset = RVA - Virtual Offset + Raw Offset
An example: assume we have the RVA 0x11A0. This RVA is in the .text section (0x11A0 is at least 0x1000 and smaller than 0x2000). The raw offset of the .text section is 0x400, so the file offset is 0x11A0 - 0x1000 + 0x400 = 0x5A0.
If the RVA is, for example, 0x30D2 (it's in the .data section), the file offset is 0x30D2 - 0x3000 + 0xA00 = 0xAD2. That's it!
I suggest using an RVA calculator (if you don't have one, get the one from my site) because you have to convert many RVAs into file offsets.
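
If you would rather script the conversion, here is a minimal Python sketch of the formula above. The names SECTIONS and rva_to_offset are made up for this illustration, and the rule "an RVA belongs to a section if it lies inside the section's virtual size" is my assumption about how to pick the section.

```python
# Section table of the example file: (name, virtual size, virtual offset, raw offset).
SECTIONS = [
    (".text",  0x24C, 0x1000, 0x400),
    (".rdata", 0x1DC, 0x2000, 0x800),
    (".data",  0x0E0, 0x3000, 0xA00),
]

def rva_to_offset(rva, sections=SECTIONS):
    """Return the file offset for rva, or raise ValueError if no section holds it."""
    for name, virtual_size, virtual_offset, raw_offset in sections:
        # Assumption: the RVA is inside a section if it falls within its virtual size.
        if virtual_offset <= rva < virtual_offset + virtual_size:
            return rva - virtual_offset + raw_offset
    raise ValueError("RVA 0x%X is not inside any section" % rva)

print(hex(rva_to_offset(0x11A0)))  # 0x5a0 (the .text example)
print(hex(rva_to_offset(0x30D2)))  # 0xad2 (the .data example)
```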

3. Import Table

At offset +0x80 after the PE signature (in a 32-bit PE file) there is the RVA of the Import Directory. The Import Directory is an array of so-called IMAGE_IMPORT_DESCRIPTORs. There is one IMAGE_IMPORT_DESCRIPTOR for each dll which is used by our PE file. One IMAGE_IMPORT_DESCRIPTOR looks like this:

+00 DWORD OriginalFirstThunk
+04 DWORD TimeDateStamp
+08 DWORD ForwarderChain
+0C DWORD Name
+10 DWORD FirstThunk

OriginalFirstThunk: it's an RVA which points to an array of IMAGE_THUNK_DATAs. These are also RVAs, one for each imported function. This array never changes. Note: some linkers set OriginalFirstThunk to zero; then we use FirstThunk.
TimeDateStamp and ForwarderChain: we won't look at them, this is advanced stuff.
Name: it's an RVA to the name of the dll.
FirstThunk: it's an RVA which points to an array of IMAGE_THUNK_DATAs. These are also RVAs, one for each imported function. This array changes!
(Note: I won't explain why there are 2 arrays because it's not necessary for reading the Import Table.)
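
To make the layout concrete, here is a small Python sketch that unpacks one descriptor with the struct module. ImportDescriptor and parse_descriptor are names I made up; the 0x14 sample bytes are the first descriptor of the example file, taken from the hex dump at offset 0x844 shown further down.

```python
import struct
from collections import namedtuple

# Field order follows the table above; every field is a little-endian DWORD.
ImportDescriptor = namedtuple(
    "ImportDescriptor",
    "OriginalFirstThunk TimeDateStamp ForwarderChain Name FirstThunk",
)

def parse_descriptor(raw):
    """Unpack the first 0x14 bytes of raw into an ImportDescriptor."""
    return ImportDescriptor(*struct.unpack("<5I", raw[:0x14]))

# First descriptor of the example file (file offset 0x844).
raw = bytes.fromhex("90200000 00000000 00000000 8E210000 10200000")
desc = parse_descriptor(raw)
print(hex(desc.OriginalFirstThunk), hex(desc.Name), hex(desc.FirstThunk))
```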

An IMAGE_THUNK_DATA RVA points to a Hint (WORD), which is an index into the exporting DLL's name table, but we won't look at it further. The name of the imported function follows directly after the Hint.
One thing I want to mention: sometimes functions are imported not by name but by ordinal. That is the case when the highest bit of the IMAGE_THUNK_DATA DWORD is set (0x80000000). Then the value is not an RVA to a Hint and name; instead its low WORD is the ordinal.
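
As a rough sketch, telling the two kinds of IMAGE_THUNK_DATA apart could look like this in Python. In a 32-bit PE file the by-ordinal case is flagged by the highest bit (0x80000000) of the thunk DWORD itself, with the ordinal in the low WORD. decode_thunk and ORDINAL_FLAG are names invented for this illustration.

```python
ORDINAL_FLAG = 0x80000000  # highest bit of a 32-bit IMAGE_THUNK_DATA

def decode_thunk(value):
    """Classify one IMAGE_THUNK_DATA DWORD as an ordinal import or a name import."""
    if value & ORDINAL_FLAG:
        # Imported by ordinal: the low WORD holds the ordinal number.
        return ("ordinal", value & 0xFFFF)
    # Imported by name: the value is an RVA to a Hint (WORD) followed by the name.
    return ("name", value)

print(decode_thunk(0x2118))      # an RVA to Hint + name
print(decode_thunk(0x80000005))  # made-up value: import by ordinal 5
```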

Ok, this was the theory. I just mentioned the most important things, just enough to get the imported functions of a PE file. If you wanna read a detailed description, then you have to read The PE File Format by LUEVELSMEYER!

Now let's start with our example file. You probably haven't understood everything so far; perhaps it gets clearer once you use your new knowledge.
OK, open utility.exe with the hex editor. The PE signature is at offset 0xB0, so the Import Directory RVA is at offset 0xB0 + 0x80 = 0x130. There you find the bytes 44 20 00 00, which read in the correct (little-endian) order give 0x00002044. RVA 0x2044 lies in the .rdata section, so the file offset is 0x2044 - 0x2000 + 0x800 = 0x844.
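
The byte swapping is easy to get wrong by hand, so here is a small Python sketch of reading that DWORD. read_dword is a made-up helper, and the zero-filled buffer only stands in for the real file: just the four bytes at 0x130 are copied from the tut.

```python
import struct

PE_SIGNATURE_OFFSET = 0xB0                          # from the example file
IMPORT_DIR_RVA_OFFSET = PE_SIGNATURE_OFFSET + 0x80  # = 0x130

def read_dword(data, offset):
    """Read a little-endian DWORD from data at the given file offset."""
    return struct.unpack_from("<I", data, offset)[0]

# Stand-in for the file contents: zeros, plus the 4 bytes found at offset 0x130.
data = bytearray(0x200)
data[0x130:0x134] = bytes.fromhex("44200000")

print(hex(read_dword(data, IMPORT_DIR_RVA_OFFSET)))  # 0x2044
```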

00000840 0000 0000 9020 0000 0000 0000 0000 0000 ..... ..........
00000850 8E21 0000 1020 0000 8020 0000 0000 0000 .!... ... ......
00000860 0000 0000 CE21 0000 0020 0000 0000 0000 .....!... ......
00000870 0000 0000 0000 0000 0000 0000 0000 0000 ................

So our Import Table begins at offset 0x844. One IMAGE_IMPORT_DESCRIPTOR is 0x14 bytes long, so the first IMAGE_IMPORT_DESCRIPTOR runs from 0x844 to 0x857 and the second one from 0x858 to 0x86B. After that you see 0x14 zero bytes, which mark the end of our array of IMAGE_IMPORT_DESCRIPTORs. We now know that our file imports functions from 2 different dlls.
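
Here is a short Python sketch of that walk over the descriptor array, run on the 0x40 bytes of the dump above. DUMP, DUMP_BASE and read_descriptors are names I made up for the illustration.

```python
import struct

DUMP_BASE = 0x840  # file offset of the first byte of the dump
DUMP = bytes.fromhex(
    "0000 0000 9020 0000 0000 0000 0000 0000"
    "8E21 0000 1020 0000 8020 0000 0000 0000"
    "0000 0000 CE21 0000 0020 0000 0000 0000"
    "0000 0000 0000 0000 0000 0000 0000 0000"
)

def read_descriptors(dump, start, base=DUMP_BASE):
    """Collect 0x14-byte descriptors from file offset start until an all-zero one."""
    descriptors = []
    pos = start - base
    while pos + 0x14 <= len(dump):
        fields = struct.unpack_from("<5I", dump, pos)
        if not any(fields):  # 0x14 zero bytes end the array
            break
        descriptors.append(fields)
        pos += 0x14
    return descriptors

for oft, _, _, name, ft in read_descriptors(DUMP, 0x844):
    print("OriginalFirstThunk=0x%X Name=0x%X FirstThunk=0x%X" % (oft, name, ft))
```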
Let's have a look at the first one:
The RVA of the dll's name is at offset 0x844 + 0x0C = 0x850. It's 0x0000218E, which is file offset 0x98E.

00000980 5570 6461 7465 5769 6E64 6F77 0000 5553 UpdateWindow..US
00000990 4552 3332 2E64 6C6C 0000 7500 4578 6974 ER32.dll..u.Exit
            

The first dll is USER32.DLL.
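
Reading such a zero-terminated name can be sketched in Python like this, using the two dump lines above. read_cstring is a made-up helper name, and I assume the names are plain ASCII.

```python
DUMP_BASE = 0x980  # file offset of the first byte of the dump
DUMP = bytes.fromhex(
    "5570 6461 7465 5769 6E64 6F77 0000 5553"
    "4552 3332 2E64 6C6C 0000 7500 4578 6974"
)

def read_cstring(data, pos):
    """Return the ASCII string that starts at pos and ends before the next zero byte."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("ascii")

print(read_cstring(DUMP, 0x98E - DUMP_BASE))  # USER32.dll
```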
The OriginalFirstThunk RVA is at offset 0x844; it's 0x2090 -> file offset 0x890.

00000890 1821 0000 2421 0000 D620 0000 0A21 0000 .!..$!... ...!..
000008A0 FC20 0000 5C21 0000 6A21 0000 7E21 0000 . ..\!..j!..~!..
000008B0 E820 0000 3621 0000 C420 0000 4A21 0000 . ..6!... ..J!..
000008C0 0000 0000 5800 4372 6561 7465 5769 6E64 ....X.CreateWind


As you can see, there are 12 RVAs between offset 0x890 and 0x8C0 (which means 12 functions are imported from USER32.DLL). The DWORD at 0x8C0 is 0, so that's the end of this array. The first RVA is 0x2118 -> file offset 0x918. There you have

00000910 4375 7273 6F72 4100 9B01 4C6F 6164 4963 CursorA...LoadIc
00000920 6F6E 4100 DD01 506F 7374 5175 6974 4D65 onA...PostQuitMe


So the Hint is 0x019B and the name of the function is LoadIconA. That is the first function of the first dll.
The second RVA is at offset 0x894; it's 0x2124 -> file offset 0x924. The Hint is 0x01DD and the name of the function is PostQuitMessage. So you check every RVA to get every function name of the first dll.
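
Here is a minimal Python sketch of reading one Hint and name pair from the dump at 0x910. read_hint_name is a made-up helper. Only the first entry is decoded, because the dump lines shown in this tut stop in the middle of the second name.

```python
import struct

DUMP_BASE = 0x910  # file offset of the first byte of the dump
DUMP = bytes.fromhex(
    "4375 7273 6F72 4100 9B01 4C6F 6164 4963"
    "6F6E 4100 DD01 506F 7374 5175 6974 4D65"
)

def read_hint_name(data, pos):
    """Return (hint, name): a little-endian WORD followed by a zero-terminated name."""
    (hint,) = struct.unpack_from("<H", data, pos)
    end = data.index(b"\x00", pos + 2)
    return hint, data[pos + 2:end].decode("ascii")

hint, name = read_hint_name(DUMP, 0x918 - DUMP_BASE)
print(hex(hint), name)  # 0x19b LoadIconA
```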

It's the same with the next dll.
Its name RVA is at offset 0x844 + 0x14 + 0x0C = 0x864. The RVA there is 0x21CE -> file offset 0x9CE. As you can see in the hex editor, it's KERNEL32.DLL. The OriginalFirstThunk of the second dll is at offset 0x844 + 0x14 = 0x858. The RVA is 0x2080 -> file offset 0x880.

00000880 BA21 0000 A821 0000 9A21 0000 0000 0000 .!...!...!......

There are 3 RVAs, which means 3 functions are imported from this dll.
You do the same for every IMAGE_IMPORT_DESCRIPTOR to get all functions from all dlls.
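
Counting the RVAs in such an array can be sketched in Python as follows, using the dump line at 0x880. read_thunks is a made-up name; the conversion to file offsets uses the .rdata values from the section table (virtual offset 0x2000, raw offset 0x800).

```python
import struct

DUMP = bytes.fromhex("BA21 0000 A821 0000 9A21 0000 0000 0000")  # file offset 0x880

def read_thunks(data, pos=0):
    """Collect IMAGE_THUNK_DATA DWORDs starting at pos until a zero DWORD."""
    thunks = []
    while pos + 4 <= len(data):
        (value,) = struct.unpack_from("<I", data, pos)
        if value == 0:  # a zero DWORD ends the array
            break
        thunks.append(value)
        pos += 4
    return thunks

for rva in read_thunks(DUMP):
    # All three RVAs lie in .rdata, so: file offset = RVA - 0x2000 + 0x800.
    print("RVA 0x%X -> file offset 0x%X" % (rva, rva - 0x2000 + 0x800))
```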

4. Summary

1. Go to offset 0x80 after the PE signature to get the RVA of the Import Directory.
2. There you find an array of IMAGE_IMPORT_DESCRIPTORs, each one 0x14 bytes long. A descriptor of 0x14 zero bytes ends this array. The number of IMAGE_IMPORT_DESCRIPTORs is the number of imported dlls.
3. At offset 0x0C in every descriptor you have an RVA which points to the name of the imported dll.
4. At the beginning of every descriptor you find the OriginalFirstThunk RVA. If it's zero, use the FirstThunk RVA at offset 0x10 in the descriptor instead.
5. Go there and you find an array of DWORDs; each one is an RVA that points to a Hint (WORD) followed by the name of the imported function. A zero DWORD ends this array.
6. If the highest bit of such a DWORD is set (0x80000000), the function is imported by ordinal, so there is no name; the low WORD is the ordinal.
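
Put together, the whole walk could look like the following Python sketch. It is only an illustration under some assumptions: a 32-bit PE file, ASCII names, and a caller that already knows the offset of the PE signature and the section table (for the example file: 0xB0 and the table from section 2). All function names are made up.

```python
import struct

def rva_to_offset(rva, sections):
    """sections: list of (virtual offset, virtual size, raw offset) tuples."""
    for virtual_offset, virtual_size, raw_offset in sections:
        if virtual_offset <= rva < virtual_offset + virtual_size:
            return rva - virtual_offset + raw_offset
    raise ValueError("RVA 0x%X is not inside any section" % rva)

def read_cstring(data, pos):
    """Return the ASCII string at pos, up to the next zero byte."""
    return data[pos:data.index(b"\x00", pos)].decode("ascii")

def list_imports(data, pe_offset, sections):
    """Return {dll name: [function names]} following the six steps above."""
    imports = {}
    (dir_rva,) = struct.unpack_from("<I", data, pe_offset + 0x80)       # step 1
    desc = rva_to_offset(dir_rva, sections)
    while True:
        fields = struct.unpack_from("<5I", data, desc)                  # step 2
        if not any(fields):
            break
        oft, _, _, name_rva, ft = fields
        dll = read_cstring(data, rva_to_offset(name_rva, sections))     # step 3
        thunk = rva_to_offset(oft or ft, sections)                      # step 4
        names = []
        while True:
            (value,) = struct.unpack_from("<I", data, thunk)            # step 5
            if value == 0:
                break
            if value & 0x80000000:                                      # step 6
                names.append("ordinal %d" % (value & 0xFFFF))
            else:
                hint_pos = rva_to_offset(value, sections)
                names.append(read_cstring(data, hint_pos + 2))  # skip the Hint WORD
            thunk += 4
        imports[dll] = names
        desc += 0x14
    return imports
```

For the example file you would read utility.exe in binary mode and call list_imports(data, 0xB0, [(0x1000, 0x24C, 0x400), (0x2000, 0x1DC, 0x800), (0x3000, 0xE0, 0xA00)]).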

OK, that's all. And don't forget: there is much more to say about Import Tables, and I only mentioned the things required to "read" the Import Table. Nevertheless, I hope you were able to follow my tut and understand everything. Perhaps you have to read it a few more times. It's also helpful to read two or three other tuts on the same topic in order to really get it.
If you have questions, ideas or suggestions, don't hesitate to mail me.

Sunshine, May 2002

