The examples in this section are divided into three groups: single variable (a simplified case of record style), record style, and spreadsheet style. A review of these examples will provide a good grounding in the use of the Data Prompter and the creation of header files for importing data with the General Array Importer.
The examples in the first group are generally more detailed than those in the second and third groups. Since examples often build on previous examples, it is recommended that you start at the beginning of a group.
The instructional sequence in each example begins with the initial dialog box of the Data Prompter. Most examples use the Data Prompter to create a header file, and each example shows the header file produced. (For the syntax of keyword statements in a header file, see 5.3, "Header File Syntax: Keyword Statements".) The command that invokes the Data Prompter and generates the initial dialog box is:
dx -prompter
It is recommended that you treat the first four examples as a unit and review them in sequence.
This example illustrates how a simple floating-point scalar field, on a regular grid, might be imported from a text file named "record_scalar". The origin of the grid is [1 3 2], with deltas of 0.5, 1, and 0.75 in the x, y, and z directions respectively.
/usr/local/dx/samples/data/record_scalar
Since the data order is Row (i.e., last index varies fastest), the first six data values are associated with positions [1 3 2], [1 3 2.75], [1 3 3.5 ], [1 3 4.25], [1 3 5], and [1 3 5.75 ]. (If the data are stored so that "last index varies slowest," Data order should be set to Column.)
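The correspondence between data order and grid positions can be illustrated with a short Python sketch. It is not part of Data Explorer; the function name `regular_positions` is hypothetical, and the grid, origin, and deltas are those given for this example.

```python
from itertools import product

def regular_positions(counts, origin, deltas):
    """Yield the positions of a regular grid in row-majority order
    (last index varies fastest)."""
    # itertools.product advances its last argument fastest,
    # which matches row majority.
    for idx in product(*(range(n) for n in counts)):
        yield [o + i * d for o, i, d in zip(origin, idx, deltas)]

# Grid, origin, and deltas from Example 1.
positions = list(regular_positions((5, 8, 6), (1.0, 3.0, 2.0), (0.5, 1.0, 0.75)))
first_six = positions[:6]
print(first_six)
```

The first six positions differ only in z, which is what "last index varies fastest" means for a 5 x 8 x 6 grid.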
file = /usr/local/dx/samples/data/record_scalar
grid = 5 x 8 x 6
format = text
interleaving = record
majority = row
field = field0
structure = scalar
type = float
dependency = positions
positions = regular, regular, regular, 1.0, 0.5, 3.0, 1.0, 2.0, .75
end
Note the information that you have supplied directly (lines 1, 2, and 10). You can visualize the data file using the Visualize Data button in the initial Data Prompter window.
This example involves modifying the header file created in Example 1. The important difference is that the data here are cell-centered (connection dependent): instead of 240 data values (one for each of the 5 x 8 x 6 positions), there are 140 values (one for each of the 4 x 7 x 5 connections). The format is text.
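The two counts quoted above follow directly from the grid dimensions. The following Python sketch shows the arithmetic; the function name `value_count` is hypothetical and is used only for illustration.

```python
from math import prod

def value_count(grid, dependency):
    """Number of data values expected for a regular grid."""
    if dependency == "positions":
        # One value per grid point.
        return prod(grid)
    if dependency == "connections":
        # One value per cell: one fewer than the point count in each dimension.
        return prod(n - 1 for n in grid)
    raise ValueError(f"unknown dependency: {dependency!r}")

print(value_count((5, 8, 6), "positions"))    # 240
print(value_count((5, 8, 6), "connections"))  # 140
```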
/usr/local/dx/samples/data/record_depconnections
file = /usr/local/dx/samples/data/record_depconnections
grid = 5 x 8 x 6
format = text
interleaving = record
majority = row
field = field0
structure = scalar
type = float
dependency = connections
positions = regular, regular, regular, 1, .5, 3, 1.0, 2.0, .75
end
Note the information that you have supplied directly or changed (lines 1 and 9).
A data file may contain descriptive information in addition to the data to be imported. To import only the data, therefore, it is necessary to "skip" such information when the file is read. The header keyword statement enables you to do just that, by specifying a number of bytes or lines to be skipped or a string to be searched for. For example, suppose the scalar data field of Example 1 had 3 lines of descriptive text preceding the data.
/usr/local/dx/samples/data/record_withheader
file = /usr/local/dx/samples/data/record_withheader
grid = 5 x 8 x 6
format = text
interleaving = record
majority = row
header = lines 3
field = field0
structure = scalar
type = float
dependency = positions
positions = regular, regular, regular, 1, .5, 3, 1.0, 2.0, .75
end
Note the addition of a header keyword statement (line 6).
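The effect of a statement such as header = lines 3 can be pictured with the Python sketch below. It only illustrates the idea of skipping lines before reading values and is not how the Importer itself is implemented; the function name `read_values` and the sample contents are hypothetical.

```python
import io

def read_values(stream, skip_lines=0):
    """Skip a number of leading lines, then read whitespace-separated floats."""
    for _ in range(skip_lines):
        stream.readline()
    return [float(tok) for tok in stream.read().split()]

# Hypothetical file contents: three descriptive lines, then data.
sample = io.StringIO(
    "descriptive line 1\n"
    "descriptive line 2\n"
    "descriptive line 3\n"
    "1.5 2.5\n"
    "3.5 4.5\n"
)
values = read_values(sample, skip_lines=3)
print(values)
```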
By default, the Data Prompter names data fields in numerical order: field0, field1, and so on. But a data field can be named with a field keyword statement.
Once the data are imported into Data Explorer, you can, for example, extract the name (using the Attribute module) and include it in a caption (using the Caption module). So if there are two types of data (e.g., temperature and pressure), each can be automatically and appropriately labeled with an identifying name, thereby "tagging" the associated data for future reference. As a result, it is also possible to import a field by name when there is more than one field.
For this example suppose that the data in Example 1 are temperature values (see To save the header file).
file = /usr/local/dx/samples/data/record_scalar
grid = 5 x 8 x 6
format = text
interleaving = record
majority = row
field = Temperature
structure = scalar
type = float
dependency = positions
positions = regular, regular, regular, 1, .5, 3, 1.0, 2.0, .75
end
Note the change in the field keyword statement (line 6).
Being able to derive grid information directly from a data file is particularly useful if you import data with a standard format but with grid dimensions that vary from data set to data set. For example, if the first line of the data file is:
dimensions 100 300
you can use any of the following grid keyword statements to obtain the grid dimensions from the data file.
This statement says
This statement says to skip 11 characters and begin reading.
This statement says to start reading after the string marker "dimensions".
See "grid". See also B.1, "General Array Importer: Keyword Information from Data Files" in IBM Visualization Data Explorer User's Guide.
Note: This derivation feature is not available with the Data Prompter.
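Two of the approaches described above (skipping a fixed number of characters, or searching for a marker string) can be mimicked in Python for the sample first line `dimensions 100 300`. This is a sketch of the idea only; the function names are hypothetical and the code makes no claim about how the Importer parses the file.

```python
def dims_after_skip(line, nchars):
    """Skip nchars characters, then read the integers that follow."""
    return [int(tok) for tok in line[nchars:].split()]

def dims_after_marker(line, marker):
    """Read the integers that follow the first occurrence of marker."""
    start = line.index(marker) + len(marker)
    return [int(tok) for tok in line[start:].split()]

first_line = "dimensions 100 300"
print(dims_after_skip(first_line, 11))              # [100, 300]
print(dims_after_marker(first_line, "dimensions"))  # [100, 300]
```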
The General Array Importer supports three representations, or "styles," of vector data: record, record-vector, and series-vector. The first two are illustrated here. For the third, see "interleaving".
Which representation matches the data depends on a characteristic called interleaving. In record interleaving, the data for each vector component are stored together in individual blocks (e.g., X0, X1,..., Xn, Y0, Y1,..., Yn). In record-vector interleaving, the components of each vector are stored consecutively (e.g., X0Y0, X1Y1,..., XnYn).
The following pair of examples illustrates the differences between the two representations and between the header files used to import them. The header files are identical in that they both specify a unit 2-vector that parallels the x-axis and is defined on a 5 x 4 regular grid. That is, the data consists of 20 instances of the vector [1 0].
In Example 6, the interleaving style of the data file is record:
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
/usr/local/dx/samples/data/record_vectordata1.
Note: To implement this change, you must click on Modify at the bottom of the dialog box. However, you can delay implementation to Step 5, and implement both steps at the same time.
file = /usr/local/dx/samples/data/record_vectordata1
grid = 5 x 4
format = text
interleaving = record
majority = row
field = field0
structure = 2-vector
type = int
dependency = positions
positions = regular, regular, 0, 1, 0, 1
end
In Example 7, the interleaving style of the data file is record-vector:
1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
/usr/local/dx/samples/data/record_vectordata2.
file = /usr/local/dx/samples/data/record_vectordata2
grid = 5 x 4
format = text
interleaving = record-vector
majority = row
field = field0
structure = 2-vector
type = int
dependency = positions
positions = regular, regular, 0, 1, 0, 1
end
Note: If the interleaving is not specified, the default is record-vector.
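The difference between the two layouts in Examples 6 and 7 can be made concrete with a small Python sketch that turns each flat list of values back into 2-vectors. The function names are hypothetical; the input values are those of the two data files shown above.

```python
def from_record(values, ncomp):
    """Record interleaving: all values of component 0, then component 1, ..."""
    n = len(values) // ncomp
    return [[values[c * n + i] for c in range(ncomp)] for i in range(n)]

def from_record_vector(values, ncomp):
    """Record-vector interleaving: the components of each vector are adjacent."""
    return [values[i:i + ncomp] for i in range(0, len(values), ncomp)]

record_data = [1] * 20 + [0] * 20   # Example 6: X0..X19, then Y0..Y19
record_vector_data = [1, 0] * 20    # Example 7: X0 Y0 X1 Y1 ...

vectors_a = from_record(record_data, 2)
vectors_b = from_record_vector(record_vector_data, 2)
print(vectors_a == vectors_b)  # True: both give 20 copies of [1, 0]
```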
This example illustrates how a 7-step time series of a single scalar field might be imported. The field is on a regular 5 × 5 grid, the data are connections dependent, and the style is record.
# Time-Series Data
Time Step 1
12 9 14 1 10 16 7 20 19 6 11 15 18 8 13 17
Time Step 2
12 9 1 21 10 16 7 1 19 6 11 15 18 8 13 17
Time Step 3
12 1 14 21 10 16 1 20 19 6 11 1 18 8 13 17
Time Step 4
1 9 14 21 16 1 7 20 19 6 1 15 18 8 13 1
Time Step 5
12 9 14 21 1 16 7 20 1 6 11 15 18 1 13 17
Time Step 6
12 9 14 21 10 16 7 20 1 6 11 15 18 1 13 17
Time Step 7
12 9 14 21 10 16 7 20 19 6 11 15 1 8 13 17
/usr/local/dx/samples/data/record_series
Note: The new-line character "\n" must be included in the specification, and the spacing between it and the marker must match that in the data file (e.g., if "Time Step 1" and "\n" are separated by three spaces in the file, they must be separated by three spaces in the specification). This spacing is easily determined in the Data Browser by placing the cursor at each point and reading the corresponding offset value (see Figure 18).
file = /usr/local/dx/samples/data/record_series
grid = 5 x 5
format = text
interleaving = record
majority = row
header = marker "Time Step 1 \n"
series = 7, 1, 1, separator = lines 1
field = field0
structure = scalar
type = float
dependency = connections
positions = regular, regular, 0, 1, 0, 1
end
Note: For scalar data, as in this example, the interleaving keyword is not required (it defaults to record). However, when series data include vectors, this keyword must be included and the appropriate value specified. For more information, see "interleaving".
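How a header marker and a series separator work together can be sketched in Python as follows. The sketch assumes that each series member ends at a line boundary, and it uses a hypothetical two-step sample with four values per step rather than the record_series file; `read_series` is an illustrative name, not a Data Explorer function.

```python
def read_series(text, marker, steps, count, separator_lines):
    """Read `steps` series members of `count` values each, skipping
    `separator_lines` lines between successive members."""
    # Everything up to and including the marker is treated as header material.
    body = text[text.index(marker) + len(marker):]
    lines = iter(body.splitlines())
    series = []
    for step in range(steps):
        values = []
        while len(values) < count:
            values.extend(float(tok) for tok in next(lines).split())
        series.append(values)
        if step < steps - 1:
            for _ in range(separator_lines):
                next(lines)
    return series

# Hypothetical sample: a comment line, then two labeled time steps.
sample = (
    "# comment\n"
    "Time Step 1\n"
    "1 2\n"
    "3 4\n"
    "Time Step 2\n"
    "5 6\n"
    "7 8\n"
)
series = read_series(sample, "Time Step 1\n", steps=2, count=4, separator_lines=1)
print(series)
```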
The header and end keywords make it possible to combine header information and data in the same file.
Note: Although the General Array Importer is designed to process files that contain both header information and data, the Data Prompter is not. It cannot create them or read them in. Such files, like the one in this example, must be created with an editor.
# The Importer disregards this line, since it is a comment line.
grid = 5 x 5
dependency = connections
type = byte
structure = scalar
format = ascii
header = marker "Start data \n"
end
# There may be comments about the data here (e.g., who created it and
# when). These will be passed over because of the marker specified
# in the header keyword statement.
Start data
17811 41218 3 9 1021913 1 71420
The end keyword marks the end of the header section. The header keyword statement specifies "Start data" as the search string and the next line as the start of the actual data. Note that if the data starts on the same line, the new-line character (\n) is not required as part of the marker (see also Step 6 of Example 8, where the marker "Time Step 1 \n" is entered).
The "positions" keyword, omitted in this example, defaults to an origin of [0 0] and deltas of [1 1].
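The division of such a file into a header section and a data section can be sketched in Python. This is an illustration only: `split_embedded` is a hypothetical helper, the sample text is a shortened stand-in for a combined file, and the three data values are invented for the sketch.

```python
def split_embedded(text):
    """Split a combined file into header statements and the remaining text."""
    header = {}
    lines = text.splitlines(keepends=True)
    for n, line in enumerate(lines):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment lines are ignored
        if stripped == "end":
            return header, "".join(lines[n + 1:])
        key, _, value = stripped.partition("=")
        header[key.strip()] = value.strip()
    raise ValueError("no end keyword found")

# Hypothetical combined file (values after the marker are made up).
sample = (
    "# comment\n"
    "grid = 5 x 5\n"
    "dependency = connections\n"
    'header = marker "Start data \\n"\n'
    "end\n"
    "# notes about the data\n"
    "Start data \n"
    "1 2 3\n"
)
header, rest = split_embedded(sample)

# The marker named in the header statement, written with a real newline.
marker = "Start data \n"
data = rest[rest.index(marker) + len(marker):].split()
print(header["grid"], data)
```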
To import record-style data, you must set the interleaving keyword to record, record-vector, or series-vector. (When using the Data Prompter, select Block for Field interleaving.) If the data includes vectors, select the appropriate vector interleaving, as discussed in "Examples 6 and 7. Vector Data"; see also "interleaving".
This example illustrates the importation of multiple scalar fields. The grid is 4 × 2 × 3, with an origin of [0 0 0] and deltas of [1 1 1]. The three data variables are scalar. The data file looks like:
Energy
2.158719 1.45419 1.566509 1.551361 2.215095 1.726923 2.080461 1.418617 1.373206 2.231642 1.316575 1.445211 1.673182 1.445737 1.820333 2.167849 1.721611 1.554906 1.604594 2.061092 1.398391 2.062042 1.996196 1.50964
Pressure
34.81398 18.81529 29.65139 42.499 22.96053 31.41604 19.92936 27.79935 26.34873 28.91081 21.17855 28.89354 6.320079 43.9068 6.597938 20.41342 14.83351 43.53309 16.36901 18.19812 4.628566 43.64742 44.99699 26.32183
Temperature
295.3329 302.5431 301.835 296.0127 297.8344 295.5451 301.6786 298.4496 302.0944 296.7458 296.3459 296.4179 303.1223 300.3094 297.9714 300.0774 299.1322 296.9368 302.096 294.8137 300.662 299.5744 304.1986 302.4216
The header file to import this data should look like:
file = /usr/local/dx/samples/data/record_multiscalar
grid = 4 x 2 x 3
format = text
interleaving = record
majority = row
header = lines 1
field = Energy, Pressure, Temperature
recordseparator = lines 1
end
This example is identical to the preceding one except that each of the data variables is dependent on the connections between data points rather than on their positions. Thus there are only six data values per field (3 × 1 × 2). The data file looks like:
Energy
2.158719 1.45419 1.566509 1.551361 2.215095 1.726923
Pressure
34.81398 18.81529 29.65139 42.499 22.96053 31.41604
Temperature
295.3329 302.5431 301.835 296.0127 297.8344 295.5451
The header file to import this data should look like:
file = /usr/local/dx/samples/data/record_multiscalardepconn
grid = 4 x 2 x 3
format = text
interleaving = record
majority = row
header = lines 1
field = Energy, Pressure, Temperature
recordseparator = lines 1
dependency = connections
end
This example differs from the preceding one in that Energy and Temperature are dependent on the positions of the grid, while Pressure is dependent on the grid elements (connection dependent). The data file looks like:
Energy
2.158719 1.45419 1.566509 1.551361 2.215095 1.726923 2.080461 1.418617 1.373206 2.231642 1.316575 1.445211 1.673182 1.445737 1.820333 2.167849 1.721611 1.554906 1.604594 2.061092 1.398391 2.062042 1.996196 1.50964
Pressure
34.81398 18.81529 29.65139 42.499 22.96053 31.41604
Temperature
295.3329 302.5431 301.835 296.0127 297.8344 295.5451 301.6786 298.4496 302.0944 296.7458 296.3459 296.4179 303.1223 300.3094 297.9714 300.0774 299.1322 296.9368 302.096 294.8137 300.662 299.5744 304.1986 302.4216
The header file should look like:
file = /usr/local/dx/samples/data/record_multiscalarmixed
grid = 4 x 2 x 3
format = text
interleaving = record
majority = row
header = lines 1
field = Energy, Pressure, Temperature
dependency = positions, connections, positions
recordseparator = lines 1
end
This example uses the same grid as the previous three examples, but here the second data field (Velocity) consists of 2-vectors. In Example 4, all the x-components of the 2-vectors are listed first, followed by all the y-components. For example, the x- and y-components of the first 2-vector are 34.81398 and 2.158719, respectively.
Energy
2.158719 1.45419 1.566509 1.551361 2.215095 1.726923 2.080461 1.418617 1.373206 2.231642 1.316575 1.445211 1.673182 1.445737 1.820333 2.167849 1.721611 1.554906 1.604594 2.061092 1.398391 2.062042 1.996196 1.509640
Velocity
34.81398 18.81529 29.65139 42.499 22.96053 31.41604 19.92936 27.79935 26.34873 28.91081 21.17855 28.89354 6.320079 43.9068 6.597938 20.41342 14.83351 43.53309 16.36901 18.19812 4.628566 43.64742 44.99699 26.32183
2.158719 1.45419 1.566509 1.551361 2.215095 1.726923 2.080461 1.418617 1.373206 2.231642 1.316575 1.445211 1.673182 1.445737 1.820333 2.167849 1.721611 1.554906 1.604594 2.061092 1.398391 2.062042 1.996196 1.509640
Temperature
295.3329 302.5431 301.835 296.0127 297.8344 295.5451 301.6786 298.4496 302.0944 296.7458 296.3459 296.4179 303.1223 300.3094 297.9714 300.0774 299.1322 296.9368 302.0960 294.8137 300.662 299.5744 304.1986 302.4216
The header file should look like:
file = /usr/local/dx/samples/data/record_scalarvector1
grid = 4 x 2 x 3
format = text
interleaving = record
majority = row
header = lines 1
field = Energy, Velocity, Temperature
structure = scalar, 2-vector, scalar
recordseparator = lines 1, lines 0, lines 1
end
Note that the interleaving specified for the vectors (line 4) is record (see "interleaving") and that the record separator (line 9) specifies: one (1) line separating the Energy and Velocity data; no lines separating the records containing the components of the Velocity data; and one (1) line separating the Velocity and the Temperature data (see "recordseparator").
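The count of separators in this header file (three, for four records) can be checked with a short Python sketch. Under record interleaving a 2-vector field occupies two records, one per component. The function name `record_layout` is hypothetical, and the rule used to pick "lines 0" or "lines 1" reflects the layout of this particular data file, not a general property of the Importer.

```python
def record_layout(fields, structures):
    """List the records in file order under record interleaving."""
    records = []
    for name, structure in zip(fields, structures):
        ncomp = 1 if structure == "scalar" else int(structure.split("-")[0])
        for c in range(ncomp):
            records.append((name, c))
    return records

records = record_layout(
    ["Energy", "Velocity", "Temperature"],
    ["scalar", "2-vector", "scalar"],
)

# One separator between each pair of successive records. In this data file,
# a label line precedes each new field, but nothing separates the two
# component records of Velocity.
separators = ["lines 0" if a[0] == b[0] else "lines 1"
              for a, b in zip(records, records[1:])]
print(records)
print(separators)
```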
The data values in Example 5 are the same as those in Example 4, but the components of each vector in the Velocity field appear together (e.g., 34.813980 is followed by 2.158719 in the same row):
Energy
2.158719 1.454190 1.566509 1.551361 2.215095 1.726923 2.080461 1.418617 1.373206 2.231642 1.316575 1.445211 1.673182 1.445737 1.820333 2.167849 1.721611 1.554906 1.604594 2.061092 1.398391 2.062042 1.996196 1.509640
Velocity
34.813980 2.158719 18.815290 1.454190 29.651390 1.566509 42.499001 1.551361 22.960529 2.215095 31.416040 1.726923 19.929359 2.080461 27.799351 1.418617 26.348730 1.373206 28.910810 2.231642 21.178551 1.316575 28.893539 1.445211
6.320079 1.673182 43.906799 1.445737 6.597938 1.820333 20.413420 2.167849 14.833510 1.721611 43.533089 1.554906 16.369011 1.604594 18.198120 2.061092 4.628566 1.398391 43.647419 2.062042 44.996990 1.996196 26.321831 1.509640
Temperature
295.332886 302.543091 301.834991 296.012695 297.834412 295.545105 301.678589 298.449585 302.094391 296.745789 296.345886 296.417908 303.122314 300.309387 297.971405 300.077393 299.132202 296.936798 302.096008 294.813690 300.661987 299.574402 304.198608 302.421600
The header file should look like:
file = /usr/local/dx/samples/data/record_scalarvector2
grid = 4 x 2 x 3
format = text
interleaving = record-vector
majority = row
header = lines 1
structure = scalar, 2-vector, scalar
field = Energy, Velocity, Temperature
recordseparator = lines 1
end
Note that the interleaving specified for the vectors (line 4) has been changed to record-vector and that the record separator (line 9) specifies one (1) line separating successive records.
A deformed regular grid (sometimes referred to as a warped grid) is one in which the positions are irregular but the connections are regular. In this example the grid is 5 × 4. The data consists of three records, the first two of which contain scalar data defined on the grid. The third contains 2-vector values defining the grid positions. The Data Prompter uses the reserved word locations as a field name for the x,y values of the grid positions. The data file contains no descriptive information.
/usr/local/dx/samples/data/record_deformed
file = /usr/local/dx/samples/data/record_deformed
grid = 5 x 4
format = text
interleaving = record-vector
majority = row
field = rainfall, temperature, locations
structure = scalar, scalar, 2-vector
type = float, int, float
dependency = positions, positions, positions
end
This example illustrating the importation of scattered data differs from Example 6 in only a few details, mainly in specifying the number of data points instead of the dimensions of a data grid.
file = /usr/local/dx/samples/data/record_deformed
points = 20
format = text
interleaving = record-vector
field = rainfall, temperature, locations
structure = scalar, scalar, 2-vector
type = float, int, float
dependency = positions, positions, positions
end
The block keyword is used with record-style, fixed-format ASCII data to skip information in a block of data. For example, consider the following data file:
row 1 temperature 39 29 33 56 32
row 2 temperature 32 33 25 33 22
row 3 temperature 31 23 41 53 19
row 4 temperature 43 59 43 21 28
row 5 temperature 23 19 35 46 32
/usr/local/dx/samples/data/block_example.data
file = /usr/local/dx/samples/data/block_example.data
grid = 5 x 5
format = text
interleaving = record
majority = row
header = lines 1
field = field0
structure = scalar
type = int
dependency = positions
block = 17, 5, 3
positions = regular, regular, 0, 1, 0, 1
end
The block statement instructs the importer to skip 17 characters and read 5 (temperature) values (per line in this case), reading each value from a field of three characters.
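The skip, count, and width values of the block statement can be tried out on the first data line with a few lines of Python. This is a sketch of fixed-width slicing, not the Importer's own code; `parse_block_line` is a hypothetical name.

```python
def parse_block_line(line, skip, count, width):
    """Skip `skip` characters, then read `count` integers,
    each from a field `width` characters wide."""
    return [int(line[skip + i * width: skip + (i + 1) * width])
            for i in range(count)]

# "row 1 temperature" is 17 characters long; each value occupies 3 characters.
line = "row 1 temperature 39 29 33 56 32"
values = parse_block_line(line, skip=17, count=5, width=3)
print(values)  # [39, 29, 33, 56, 32]
```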
Importing columnar-style data requires setting the interleaving keyword to "field": Activate the Columnar toggle button in the Data Prompter initial dialog box or select "Field" for the Field interleaving option in the full prompter.
This example illustrates the importation of a data file that contains two variables (pressure and velocity) in spreadsheet style. The data are in row majority order (last index varies fastest) and organized in four columns: the first contains the pressure data; the other three, the velocity components. The grid is 5 × 8 × 6.
/usr/local/dx/samples/data/spreadsheet_2var
file = /usr/local/dx/samples/data/spreadsheet_2var
grid = 5 x 8 x 6
format = text
interleaving = field
majority = row
field = temperature, velocity
structure = scalar, 3-vector
type = float, float
dependency = positions, positions
positions = regular, regular, regular, 0, 1, 0, 1, 0, 1
end
This example differs from Example 6 in the preceding section ("Example 6. Deformed (Warped) Regular Grid") in its data style (spreadsheet), smaller data grid (3 × 4), and number of variables (1). Follow the first 7 steps of that example, except for the following:
/usr/local/dx/samples/data/spreadsheet_deformed.
file = /usr/local/dx/samples/data/spreadsheet_deformed
grid = 3 x 4
format = text
interleaving = field
majority = row
field = locations, field0
structure = 2-vector, scalar
type = float, float
end
This example uses the same data set as Example 2 but treats the values as scattered data points. The data file contains an x,y position followed by a data value. There are no implied connections for these data.
file = /usr/local/dx/samples/data/spreadsheet_deformed
points = 12
format = text
interleaving = field
field = locations, field0
structure = 2-vector, scalar
type = float, float
end
The layout keyword is used to specify which locations in a data file are to be read, thereby avoiding interspersed text. In the example data file shown here, there are no implied connections between data values.
/usr/local/dx/samples/data/CO2fragment.lis
Note: The asterisk (*) at the beginning of the first data line and the interval scale following the data are for reference purposes only and do not appear in the actual file (see Steps 3 and 5f-j in this example).
VARIABLES AND SPECIFIED RANGES
_________________________________________________________________________
EPOCH 01-Jul-1983 00:00:00.000 31-Dec-1987 00:00:00.000
LATITUDE -90.00 90.00
LONGITUD -180.00 180.00
CO2_CONC -10000.0 10000.0
EPOCH LATITUDE LONGITUD CO2_CONC
*01-Jul-1983 00:00:00.000 -37.95 77.53 341.4
01-Jul-1983 00:00:00.000 -89.98 -24.80 341.0
01-Jul-1983 00:00:00.000 -7.92 -14.42 343.4
01-Jul-1983 00:00:00.000 -40.68 144.68 -100.0
01-Jul-1983 00:00:00.000 19.52 -154.82 341.9
01-Jul-1983 00:00:00.000 -14.25 -170.57 342.0
01-Jul-1983 00:00:00.000 2.00 -157.30 -100.0
01-Jul-1983 00:00:00.000 55.20 -162.72 335.3
01-Jul-1983 00:00:00.000 -75.67 -27.00 341.7
01-Jul-1983 00:00:00.000 -43.83 -172.63 341.3
01-Jul-1983 00:00:00.000 25.67 -80.17 343.8
01-Jul-1983 00:00:00.000 -4.67 55.17 339.1
01-Jul-1983 00:00:00.000 13.43 144.78 344.0
01-Jul-1983 00:00:00.000 19.53 -155.58 343.5
01-Jul-1983 00:00:00.000 76.23 -119.33 339.8
01-Jul-1983 00:00:00.000 40.05 -105.63 339.5
01-Jul-1983 00:00:00.000 66.00 2.00 338.7
01-Jul-1983 00:00:00.000 -64.92 -64.00 341.4
01-Jul-1983 00:00:00.000 71.32 -156.60 340.1
01-Jul-1983 00:00:00.000 17.75 -64.77 342.3
01-Jul-1983 00:00:00.000 38.75 -27.08 341.1
|-----------skip 33-------------| width 12 | width 12 | width 10 |
file = /usr/local/dx/samples/data/CO2fragment.lis
points = 21
format = text
interleaving = field
header = marker "CO2_CONC \n"
field = locations, CO2_concentration
structure = 2-vector, scalar
type = float, float
layout = 33, 12, 0, 10
end
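The idea of reading only selected columns can be sketched in Python using the reference scale from the listing (skip 33 characters, then fields of width 12, 12, and 10). The sketch is not the Importer's implementation: `parse_layout_line` is a hypothetical name, and the exact column alignment of the sample line is an assumption, built here to match the scale.

```python
def parse_layout_line(line, skip, widths):
    """Skip `skip` characters, then read one float from each fixed-width field."""
    values, pos = [], skip
    for w in widths:
        values.append(float(line[pos:pos + w]))
        pos += w
    return values

# Assumed alignment: the time stamp padded to 33 characters, followed by
# right-justified fields of width 12, 12, and 10.
line = f"{'01-Jul-1983 00:00:00.000':<33}{-37.95:>12.2f}{77.53:>12.2f}{341.4:>10.1f}"
latitude, longitude, co2 = parse_layout_line(line, skip=33, widths=[12, 12, 10])
print(latitude, longitude, co2)
```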
The 21 lines of data in the preceding example represent a portion of a larger file (/usr/local/dx/samples/data/CO2.lis) containing a time series with 53 members.
/usr/local/dx/samples/data/CO2.lis
file = /usr/local/dx/samples/data/CO2.lis
points = 21
format = text
interleaving = field
header = marker "CO2_CONC \n"
series = 53, 1, 1
field = locations, CO2_concentration
structure = 2-vector, scalar
type = float, float
dependency = positions, positions
layout = 33, 12, 0, 10
end
The General Array Importer assumes that the order of the data it imports is row majority (last index varies fastest). That is, on a 2-dimensional n × m grid, the order of data is:
f(X0,Y0), f(X0,Y1), ..., f(X0,Ym), f(X1,Y0), f(X1,Y1), ...
If the order of data is column majority (first index varies fastest), the order of data is:
f(X0,Y0), f(X1,Y0), ..., f(Xn,Y0), f(X0,Y1), f(X1, Y1), ...
The General Array Importer will accept column-majority data if you select "Column" for the Data order option in the Data Prompter.
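The two orderings can be compared with a short Python sketch that lists the (x, y) index pairs in the order the data values would appear, and reorders column-majority values into row-majority order. The function names are hypothetical and serve only to illustrate the definitions above.

```python
def index_order(n, m, majority="row"):
    """Return (i, j) index pairs in the order the values appear in the file."""
    if majority == "row":
        # Last index varies fastest.
        return [(i, j) for i in range(n) for j in range(m)]
    if majority == "column":
        # First index varies fastest.
        return [(i, j) for j in range(m) for i in range(n)]
    raise ValueError(f"unknown majority: {majority!r}")

def to_row_majority(values, n, m):
    """Reorder column-majority values into row-majority order."""
    return [values[j * n + i] for i in range(n) for j in range(m)]

row_order = index_order(2, 3, "row")
column_order = index_order(2, 3, "column")
print(row_order)     # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(column_order)  # [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]
```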
The file /usr/local/dx/samples/data/temp_wind.lis contains data in column-majority order. A header file that imports this data is:
file = temp_wind.lis
grid = 144 x 73
format = text
interleaving = field
majority = column
header = lines 9
field = temperature, wind_velocity
structure = scalar, 2-vector
type = float, float
dependency = positions, positions
layout = 39, 14, 0, 14
positions = regular, regular, -178.75, 2.5, 90.0, -2.5
end