BinaryImporter
Author : Ian Wang
Input Types : None
Output Types : VectorType
Date : 11 Jun 2003
Contents
Description of BinaryImporter
The Binary Importer reads a stream of bytes from an input file and converts them into a Vector type. It is a very
powerful tool that can read almost any input file format, but it requires a little understanding to be used
correctly. The most important feature is that the data items (bytes) read from the file are considered to form a
table of a specified number of rows and columns. For our examples we shall assume that twenty data items are being
read from the file, as follows:
d1 d2 d3 d4 d5 ... d19 d20
This input could be considered to be a data set with 5 columns and 4 rows:
example 1:
d1 d2 d3 d4 d5
d6 d7 d8 d9 d10
d11 d12 d13 d14 d15
d16 d17 d18 d19 d20 (cols=5 rows=4)
Or as a data set with 10 columns and 2 rows:
example 2:
d1 d2 d3 d4 d5 d6 d7 d8 d9 d10
d11 d12 d13 d14 d15 d16 d17 d18 d19 d20 (cols=10 rows=2)
Or as 4 data sets, each with 5 columns and 1 row:
example 3:
d1 d2 d3 d4 d5 (cols=5 rows=1)
+
d6 d7 d8 d9 d10 (cols=5 rows=1)
+
d11 d12 d13 d14 d15 (cols=5 rows=1)
+
d16 d17 d18 d19 d20 (cols=5 rows=1)
The Binary Importer reads one data set each time it is run, so in the last example data set 1 would be read the
first time Binary Importer was run, set 2 the second time and so on (this assumes that the file is not rewound on
every run, see 'Rewind Input Stream').
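The three groupings above can be sketched in a few lines of Python. This is only an illustration of the idea, not Binary Importer's own code; the function name split_into_data_sets and the use of strings for the data items are assumptions made for the example.

```python
def split_into_data_sets(items, cols, rows):
    """Group a flat sequence of data items into consecutive data sets,
    each holding `rows` rows of `cols` items (hypothetical helper)."""
    size = cols * rows
    data_sets = []
    # Step through the stream one whole data set at a time; a trailing
    # partial data set is ignored in this sketch.
    for start in range(0, len(items) - size + 1, size):
        chunk = items[start:start + size]
        data_sets.append([chunk[r * cols:(r + 1) * cols] for r in range(rows)])
    return data_sets


# Twenty data items d1..d20, as in the examples above.
items = [f"d{i}" for i in range(1, 21)]

example_1 = split_into_data_sets(items, cols=5, rows=4)   # one data set
example_2 = split_into_data_sets(items, cols=10, rows=2)  # one data set
example_3 = split_into_data_sets(items, cols=5, rows=1)   # four data sets
```

Each element of the returned list corresponds to what one run of Binary Importer would read, provided the file is not rewound between runs.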
Once the dimensions of the data set have been specified, Binary Importer can import either rows or columns from that
data set, and this can be all the rows/columns or just selected ones. So, assuming the dimensions from example 1
(cols=5 rows=4), we could just import column 2, or just import row 3:
import column 2:
d2 d7 d12 d17 (cols=5 rows=4)
import row 3:
d11 d12 d13 d14 d15 (cols=5 rows=4)
Alternatively, we could import columns 1-3 (columns 1, 2 and 3):
import columns 1-3:
d1 d6 d11 d16
+
d2 d7 d12 d17
+
d3 d8 d13 d18 (cols=5 rows=4)
In this situation, the columns read in are output by Binary Importer one after another within the same run, so a
single run of Binary Importer would cause three vectors to be output.
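As a rough sketch of column and row extraction, assuming the data set from example 1 is held as a list of rows (the helper names extract_columns and extract_rows are hypothetical and not part of Binary Importer):

```python
def extract_columns(table, columns):
    """Return one vector per requested column (1-based column numbers)."""
    return [[row[c - 1] for row in table] for c in columns]


def extract_rows(table, rows):
    """Return one vector per requested row (1-based row numbers)."""
    return [list(table[r - 1]) for r in rows]


# The data set from example 1 (cols=5 rows=4).
table = [[f"d{r * 5 + c}" for c in range(1, 6)] for r in range(4)]

column_2 = extract_columns(table, [2])             # a single vector
row_3 = extract_rows(table, [3])                   # a single vector
columns_1_to_3 = extract_columns(table, [1, 2, 3])  # three vectors
```

Each inner list stands for one output vector, so columns_1_to_3 holds the three vectors that a single run would output.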
Extending this a little further, we could just import rows 4,1,2 from columns 3+ (3 onwards):
import rows 4,1,2 from columns 3+
d18 d3 d8
+
d19 d4 d9
+
d20 d5 d10 (cols=5 rows=4)
Again, in this situation a single run of Binary Importer would lead to three vectors being output.
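The combined selection can be sketched as follows. The sketch assumes that columns are being extracted, so each output vector is one selected column restricted to the selected rows, in the order the rows were listed; the function name extract is hypothetical.

```python
def extract(table, columns, rows):
    """Return one vector per selected column, keeping only the selected
    rows in the order given (1-based numbers, hypothetical helper)."""
    return [[table[r - 1][c - 1] for r in rows] for c in columns]


# The data set from example 1 (cols=5 rows=4).
table = [[f"d{r * 5 + c}" for c in range(1, 6)] for r in range(4)]

# Rows 4,1,2 from columns 3+ (columns 3, 4 and 5 of a 5-column data set).
vectors = extract(table, columns=[3, 4, 5], rows=[4, 1, 2])
```

Note that the row order in the selection is preserved, which is why each vector starts with the item from row 4.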
Using BinaryImporter
Once you understand how to specify the dimensions of the data sets and the columns/rows to be imported, the rest of
the options should make sense. Binary Importer offers the following options:
Filename |
The name of the binary input file. |
Data type |
The type of data to be read in, e.g. an 8 byte Double or a 4 byte Integer (default=Double (8bytes))
|
Bytes per column |
Usually the number of bytes per column will be the same as the size of the data type. For example, if reading a
file of Doubles (8 bytes), then column 1 would start at byte 0, column 2 at byte 8, column 3 at byte 16 and so on.
However, if the binary file contains a mix of data types, then the 'one byte per column' option sets column
1 to start at byte 0, column 2 to start at byte 1, column 3 to start at byte 2 and so on (default=Same as
for data type).
|
Extract |
Import either columns or rows (default=Columns). |
Header offset |
The number of bytes that are skipped at the start of the input file (default=0). |
Number of columns |
The number of columns in a data set as described above (default=1). |
Number of rows |
The number of rows in a data set as described above. If the number of rows is not specified then Binary
Importer reads until the end of file is reached (default=1).
|
Extract columns |
The columns to be extracted, e.g. 1,3-12,15+, where 3-12 means extract all columns between 3 and 12
inclusive, and 15+ means extract all columns from 15 until the end of the data set. If not specified then
all columns are extracted (1+). (default=not specified)
|
Extract rows |
The rows to be extracted, e.g. 1,3-12,15+, where 3-12 means extract all rows between 3 and 12 inclusive, and
15+ means extract all rows from 15 until the end of the data set. If not specified then all rows are
extracted (1+). (default=not specified)
|
Reverse byte order |
Whether data is read from high byte to low byte (standard) or from low byte to high byte (reverse) (default=not
reverse)
|
Output on multiple nodes |
Whether each column/row imported is output on a separate node (multi-node), or output in turn from a single
node (default=output on single node)
|
Header offset every iteration |
Whether the header offset is applied every time the importer is run, or only when the file is first loaded (default=not
offset every iteration)
|
Rewind input stream |
When the file is reset to the start: Every run means that the file is reopened from the start
every time the importer is run; Automatic means the file is only reopened from the start when the end
is reached; Never means the file is never reopened from the start, so once the end is reached further
runs of the importer do nothing (default=Every run).
|
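Two of the options above lend themselves to a short illustration: the selection syntax used by 'Extract columns' and 'Extract rows', and the combination of 'Data type', 'Header offset' and 'Reverse byte order'. The following is a minimal sketch under stated assumptions, not the tool's implementation. The names parse_selection, decode_items and FORMATS are hypothetical, and the sketch assumes that the standard high-byte-first order corresponds to big-endian and the reverse order to little-endian.

```python
import struct

# Hypothetical mapping from two of the data types named above to struct codes.
FORMATS = {"Double": "d", "Integer": "i"}


def parse_selection(spec, count):
    """Expand a selection such as '1,3-12,15+' into 1-based indices.

    `count` is the number of columns (or rows) in the data set, which is
    needed to resolve an open-ended entry such as '15+'.
    """
    if not spec:
        return list(range(1, count + 1))  # not specified: same as '1+'
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if part.endswith("+"):
            indices.extend(range(int(part[:-1]), count + 1))
        elif "-" in part:
            first, last = part.split("-")
            indices.extend(range(int(first), int(last) + 1))
        else:
            indices.append(int(part))
    return indices


def decode_items(data, data_type="Double", header_offset=0, reverse=False):
    """Skip the header, then decode the remaining bytes as data items."""
    code = FORMATS[data_type]
    order = "<" if reverse else ">"  # assumption: standard = big-endian
    size = struct.calcsize(order + code)
    payload = data[header_offset:]
    count = len(payload) // size  # ignore any trailing partial item
    return list(struct.unpack(f"{order}{count}{code}", payload[:count * size]))


# A 4-byte header followed by four big-endian Doubles.
raw = b"HEAD" + struct.pack(">4d", 1.0, 2.0, 3.0, 4.0)
values = decode_items(raw, header_offset=4)
selected = parse_selection("1,3-5,7+", 8)
```

Here values holds the four decoded Doubles, and selected holds the indices 1, 3, 4, 5, 7 and 8 for a data set with eight columns.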