Introduction
WAV files are just one example of a large class of files called
structured files. While we will use WAV files as an example, the
techniques discussed here can be applied to all structured files. In my office,
the WAV files are created by Avaya's IP Office, a business-class phone
system that includes the ability to save recordings of phone
conversations. No doubt you have heard the disclaimer "Phone calls may
be recorded for quality assurance". That is exactly what we do.
Plain text files, such as the ones Notepad creates, are unstructured: other
than the actual characters of the note meaning something to the reader, the
file contents are simply a collection of bytes carrying no other specific
information.
The contents of a structured file, by contrast, are arranged in a
specific, defined way. WAV files, image files, Word documents, Access
databases, and executables are examples of structured files. In addition
to the sound, the pixels of the image, the text of the document, the database
objects, or the machine instructions, these files carry a lot more
information.
The structure of a WAV file
Although there is some human-readable text (and a lot of funny characters),
if you open a WAV file in Notepad, the structure may not immediately be
evident. This is an example of what you might see:

If, however, the same file is opened in a hex editor such as 010 Editor, the
structure becomes apparent:
[Screenshot: the WAV file opened in 010 Editor]
This editor has built-in intelligence about many structured file types, so
I did not have to look on the Internet for the structure of a WAV file
(e.g. WaveFormat), which was nice. The image shows the
individual bytes in hexadecimal notation. At the top right we see the same data
in ASCII. A period is displayed for bytes that don't have a printable ASCII
equivalent.
The bottom pane shows that the file is made up of five structures. The first
structure is called a WAVRIFFHEADER and it is highlighted. This structure starts
at 0 hex and has a size of C hex (12 decimal). After that comes a FORMATCHUNK
starting at C hex with a length of 18 hex (24 decimal), then an UNKNOWNCHUNK
(more about this later), a FACTCHUNK, and finally the actual sound bytes in the
DATACHUNK.
If we expand these five structures by clicking the triangle to the left of
the entry, we see:
[Screenshot: the five structures expanded in 010 Editor]
Great! Now we can see the actual structure of the file, not just seemingly
random bytes. Each chunk starts with the name of the chunk, followed by the
length of the chunk, followed by more data. The name and size would come in
handy if I wanted to process the file a byte at a time. Later we will see a more
efficient way of processing structured files. We can also see that some fields
have a fixed size and others a variable size. For example, the first field,
groupID, is always 4 bytes and always set to RIFF (there is a finer point here,
but it is out of scope for this article). The last field, "samples", is an
example of a variable-length field: its size depends on the amount of sound
contained in the file, and is given by the preceding chunkSize field.
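To make the "name, then size, then data" layout concrete, here is a minimal sketch that walks the chunks of a file and lists them. ListChunks is a hypothetical helper, not part of the downloadable sample. It assumes a well-formed file with a 12-byte header, and it relies on the RIFF convention that chunk data is padded to an even number of bytes. Note that the size it reports is the value of the chunkSize field, which does not include the 8 bytes of name and size themselves.

```vba
Option Explicit

' Hypothetical helper for illustration; not part of the sample download.
' Returns one string per chunk found after the 12-byte RIFF header.
Public Function ListChunks(ByVal strPath As String) As Collection
    Dim colChunks As New Collection
    Dim intFile As Integer
    Dim strGroupID As String * 4    ' expected to be "RIFF"
    Dim strChunkID As String * 4
    Dim lngChunkSize As Long
    Dim lngPos As Long              ' 1-based position, as Get expects
    Dim lngFileLen As Long

    intFile = FreeFile
    Open strPath For Binary Access Read As #intFile
    lngFileLen = LOF(intFile)

    If lngFileLen >= 12 Then Get #intFile, 1, strGroupID
    If strGroupID = "RIFF" Then
        lngPos = 13                 ' first chunk starts right after the header
        Do While lngPos + 7 <= lngFileLen
            Get #intFile, lngPos, strChunkID
            Get #intFile, , lngChunkSize
            ' Stop on a size that cannot be right (corrupt or truncated file).
            If lngChunkSize < 0 Or lngChunkSize > lngFileLen Then Exit Do
            colChunks.Add strChunkID & ": " & lngChunkSize & _
                          " data bytes, header at offset " & (lngPos - 1)
            ' Skip name (4) + size (4) + data, plus one pad byte if the size is odd.
            lngPos = lngPos + 8 + lngChunkSize + (lngChunkSize And 1)
        Loop
    End If

    Close #intFile
    Set ListChunks = colChunks
End Function
```

Looping over the returned collection with For Each and Debug.Print shows the chunk names in file order. The offsets are zero-based so they can be compared with what the hex editor displays.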
In the case of my application, information like the number of channels or
the sample rate is interesting but not essential. My objective in wanting to
know how to get information from the WAV file is to find out who called
whom. I can then associate each phone recording file with the correct
Company and Contact in our custom CRM application.
It turns out this information is available in the UNKNOWNCHUNK. This can
already be seen, to some extent, in the first screenshot above. Unfortunately,
as the name suggests, nothing was known about this part of the file. It is a
vendor-specific chunk that Avaya and others can use, or not, as they see fit.
Avaya does not publicly document this chunk, so now it's time for some
detective work.
[Screenshot: hex view of the UNKNOWNCHUNK data]
The data shows some values that we recognize as internal 3-digit
extensions, external phone numbers (9 for an outside line, followed by a
10-digit phone number), or the date and time of the call. Most of the text
seems to fall in 32-byte blocks. A few bytes, like those at 008Ch, we have no
idea about. After comparing several files and making some experimental
recordings of our own (e.g. what would happen if we forward an incoming call to
another extension) we were able to derive quite a bit about the structure of
this unknown chunk.
First is the ChunkID, which is set to ALCH rather than UNKNOWN, probably
because Avaya used to have a phone product named ALCHEMY.
Next is the Chunk Size, showing us this chunk is roughly 3 KB in size.
That is quite large, and it tells us that even a file with only one millisecond
of sound will be at least 3 KB in size.
Next is a 32-byte block with the name of the manufacturer. It is always set
to "Avaya" for these phone system WAV files. Your WAV files may have a different
manufacturer, or may not even include the UNKNOWNCHUNK.
Next is another 32-byte block with either the phone extension, for an
outgoing call, or the caller's phone number, for an incoming call.
Next is the name of the user for that extension, or the caller ID of the
incoming call.
After that comes a 4-byte block whose purpose we were not able to determine.
In our type definition we call it "unknown1".
We were able to recognize several other fields. The details are in the type
definition for the VBA language:
Private Type ALCHCHUNK
    chunkID As String * 4 'ALCH (Avaya used to have a product called Alchemy)
    chunkSize As Long
    manufacturer As String * 32 'Avaya
    caller As String * 32 'Extension (out) or Phone number (in)
    callerDisplay As String * 32 'User name (out) or caller id max 15 chars (in)
    unknown1 As String * 4 'NOTE: binary data
    unknown2 As String * 2 'always null?
    numberDialed As String * 32
    numberForward As String * 32
    unknown3 As String * 4 'always null?
    callerDisplay2 As String * 32
    forwardDisplay As String * 32
    unknown4 As String * 32
    unknown5 As String * 32
    unknown6 As String * 24
    unknown7 As String * 32
    forwardDisplay2 As String * 32
    unknown8 As String * 4 'NOTE: binary data
    voice As String * 32 'Voice
    Direction As String * 32 'AutoRecordingIncomingUser or AutoRecordingOutgoingUser
    date As String * 32
    time As String * 32
    extension As String * 32
    unknown9 As String * 232
    machineName As String * 32
    unknown10 As String * 2176
End Type
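Note the use of fixed-size strings for the 32-byte text blocks. A Long in VBA
is four bytes, so it can be used for the chunkSize. Together the fields add up
to 2,998 bytes, which agrees with the roughly 3 KB chunk size noted earlier.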
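One practical consequence of fixed-size strings is that a short value such as a 3-digit extension still occupies all 32 characters, so the unused part has to be removed before the value is stored or compared. The sketch below assumes the unused bytes are null characters, which the "always null?" notes in the type hint at but do not prove, and it also trims spaces in case a field is space-padded. TrimNulls is a hypothetical name, not a function from the sample.

```vba
Option Explicit

' Hypothetical helper: returns the text of a fixed-size field up to the first
' null character, with surrounding spaces removed.
' Assumption: unused bytes in the 32-byte blocks are nulls (or spaces).
Public Function TrimNulls(ByVal strField As String) As String
    Dim lngPos As Long
    Dim strResult As String

    lngPos = InStr(1, strField, vbNullChar, vbBinaryCompare)
    If lngPos > 0 Then
        strResult = Left$(strField, lngPos - 1)   ' keep only what precedes the null
    Else
        strResult = strField                      ' no null found, keep everything
    End If

    TrimNulls = Trim$(strResult)
End Function
```

For example, TrimNulls("201" & String$(29, vbNullChar)) returns "201". The fields marked as binary data in the type should not be passed through a helper like this, since a zero byte there is data rather than padding.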
With enough of the structure defined to start reading data, we moved on to
enhancing our CRM application.
Document Links Application
We have a database with companies and contacts. By decoding the WAV file
structure and then writing VBA to store each piece in the correct field, we are
able to associate phone calls (documents) with the correct records. In this
screenshot you can see the finished product. On the main form we have a new
"Docs" column with associated documents, including the count of documents. If
the user clicks on the paperclip icon, the Recordings form opens, where
basic information is displayed and the recording can be played.
[Screenshot: the main form with the Docs column, and the Recordings form]
Let's look at how we created this new functionality.
First we need a table to store the Recordings information. Then we need a way
to fill it with available recordings. That's what the "Process Files" button on
the main form does:
Private Sub cmdProcessFiles_Click()
    Dim strFile As String
    Dim objWav As New clsWavFile
    Dim rsRecordings As DAO.Recordset

    Set rsRecordings = CurrentDb.OpenRecordset("Recordings", dbOpenDynaset)

    strFile = Dir$(RECORDINGS_FOLDER & "*.WAV")
    While strFile <> ""
        'Check if new file. Apostrophes are doubled so the criteria stays valid.
        rsRecordings.FindFirst "FileName='" & Replace(strFile, "'", "''") & "'"
        If rsRecordings.NoMatch Then
            'It's a new file. Read it and save important fields to table.
            objWav.Read RECORDINGS_FOLDER & strFile

            rsRecordings.AddNew
            rsRecordings!CustomerID = GetCustomerID(IIf(objWav.Direction = "I", objWav.FromPhoneNumber, objWav.ToPhoneNumber))
            rsRecordings!FileName = strFile
            rsRecordings!Direction = objWav.Direction
            rsRecordings!FromPhone = objWav.FromPhoneNumber
            rsRecordings!ToPhone = objWav.ToPhoneNumber
            rsRecordings.Update
        End If

        'Prepare for next iteration
        strFile = Dir$
    Wend

    'Requery the form so the recordings will show.
    Me.Requery

    'Final cleanup
    rsRecordings.Close
    Set rsRecordings = Nothing
    Set objWav = Nothing
End Sub
After declaring some variables (we will discuss clsWavFile shortly), we open
a recordset on the Recordings table.
The first Dir$ call then sets up a loop over all *.WAV files in the folder
with recordings.
The Avaya phone system ensures filenames are unique, so FindFirst is used to
check whether the current file has already been processed. If it's a new file,
the Read method of the WAV File object is used to read the file.
Starting with AddNew, a new record is added to the Recordings
table, setting the field values equal to the properties of the WAV File
object. Note that the code here knows nothing about the structure of a WAV file;
it only knows how to use some methods and properties of the WAV File object.
The second Dir$ call, at the bottom of the loop, moves to the next file, if
any.
After some cleanup we are done and the main form will display the latest
recording information!
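The article does not show GetCustomerID, but whatever it does, it has to match a number from the recording against numbers stored in the CRM. Since dialed numbers carry a leading 9 for the outside line, some cleanup before the lookup seems likely to help. The following is only a sketch of that idea under stated assumptions: NormalizePhone is a hypothetical helper, and the rule "11 digits starting with 9 means outside-line prefix plus 10-digit number" is taken from the pattern described earlier, not from Avaya documentation.

```vba
Option Explicit

' Hypothetical helper: reduces a phone number to its digits so that differently
' formatted versions of the same number compare equal.
Public Function NormalizePhone(ByVal strRaw As String) As String
    Dim i As Long
    Dim strChar As String
    Dim strDigits As String

    ' Keep only the characters 0 through 9 (character codes 48 to 57).
    For i = 1 To Len(strRaw)
        strChar = Mid$(strRaw, i, 1)
        Select Case AscW(strChar)
            Case 48 To 57
                strDigits = strDigits & strChar
        End Select
    Next i

    ' Assumption: 11 digits starting with 9 is the outside-line prefix
    ' followed by a 10-digit number, so the prefix is dropped.
    If Len(strDigits) = 11 And Left$(strDigits, 1) = "9" Then
        strDigits = Mid$(strDigits, 2)
    End If

    NormalizePhone = strDigits
End Function
```

With this rule, NormalizePhone("96025551234") and NormalizePhone("(602) 555-1234") both return "6025551234", while a 3-digit extension such as "201" is returned unchanged.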
WAV File class
It is a good idea to encapsulate specific functionality like processing a WAV
file in its own class. When you download and read the code in this class you
will see that there is very little of it. This is the beauty of
structured files: once you know the structure and have defined it, reading the
many fields of information takes only one line of code, using VBA's "Get"
statement. For example, to read the UNKNOWNCHUNK we use the structure defined
as the Private Type ALCHCHUNK listed above. We then define a variable of that
type:
Private m_ac As ALCHCHUNK
Then we read the entire chunk (all 3 KB) with one line of code:
Get #intFile, , m_ac
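If you want to convince yourself that a single Get really fills every field of a type, a small round trip is an easy experiment. The sketch below uses a made-up three-field type (DEMOCHUNK is hypothetical and much smaller than ALCHCHUNK), writes it to a scratch file with Put, and reads it back with one Get. The scratch file location in the TEMP folder is also an assumption. In the real class the Get is positioned at the chunk's location in the WAV file rather than at the first byte.

```vba
Option Explicit

' Hypothetical miniature of a chunk layout, used only for this demonstration.
Private Type DEMOCHUNK
    chunkID As String * 4
    chunkSize As Long
    caller As String * 32
End Type

' Writes a DEMOCHUNK with Put, reads it back with a single Get, and returns
' the caller field that was read.
Public Function RoundTripDemo() As String
    Dim dcOut As DEMOCHUNK
    Dim dcIn As DEMOCHUNK
    Dim strPath As String
    Dim intFile As Integer

    dcOut.chunkID = "DEMO"
    dcOut.chunkSize = Len(dcOut) - 8    ' bytes that follow the ID and size fields
    dcOut.caller = "201"                ' VBA pads fixed-length strings with spaces

    strPath = Environ$("TEMP") & "\demo_chunk.bin"   ' assumed scratch location
    If Dir$(strPath) <> "" Then Kill strPath

    intFile = FreeFile
    Open strPath For Binary Access Write As #intFile
    Put #intFile, , dcOut               ' all fields are written in declaration order
    Close #intFile

    intFile = FreeFile
    Open strPath For Binary Access Read As #intFile
    Get #intFile, 1, dcIn               ' position 1 is the first byte of the file
    Close #intFile

    Kill strPath
    Debug.Print "Bytes per record on disk: " & Len(dcIn)
    RoundTripDemo = RTrim$(dcIn.caller)
End Function
```

Calling RoundTripDemo from the Immediate window returns "201". Len applied to a variable of a user-defined type reports the size as written to the file, which makes it a handy check that a type definition matches the size seen in the hex editor.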
The current WAV File class only contains the methods and properties I needed.
However, it is very easy to add your own. Say you want to show how many
channels of sound there are, or what the sample rate is. You simply
create additional Property Get procedures and return the values from the
FORMATCHUNK:
Public Property Get SamplesPerSec() As Long
    SamplesPerSec = m_fc.dwSamplesPerSec
End Property
Summary
Structured files usually contain lots of information in addition to the raw
bytes for the file type at hand. We extracted caller information from WAV files
for use in our CRM application. Now that you have seen how easy it is to
look inside files with a hex editor, you should try it! It takes a little
time to figure things out, but the rewards are great.
Through the link below you can download the sample program in Access
2007 format.
DocumentLinks.zip
(509.34 kb)