Merge Files

Overview

This application merges sets of sections from one or more dataset files into a new dataset and can also reorder the sections. The program has two interfaces. The graphical interface assumes that the datasets can be stacked along the z, wavelength, or time dimension (the sets must have the same sizes in the other four dimensions). The command line interface offers the same capabilities and additionally allows merging sequences of sections without regard to how they are structured in z, wavelength, or time. In either case, the merge/reorder operation cannot be done in place: the output file must be different from every input file.

Topics

Overview | Graphical interface | Command line interface

Related Priism Topics

Priism | CopyRegion


Graphical Interface

You can start the graphical interface by selecting the Merge/Reorder option from the File menu in Priism or by entering mergemrc_i at the command line.

The key part of the interface is the central section; it is there that you construct the list of ranges of sections to combine in the output file. Once you have selected that list, enter the name of the output file in the top line of the dialog (or you can press the Output file button to bring up a file selection dialog) and then press the DoIt button at the bottom right to generate the result.

Topics

Overview | Graphical interface | Command line interface | Pixel mode | Interleaving | Extended header | Datasets to merge | Append mode | Sizes | Z range | Wavelength range | Time range | X/Y range


Pixel Mode

When working with the graphical interface, use this menu to modify the numeric format used to store the value at each pixel in the output file. The options are:

Same as first input file
The format used is the one used in the first file listed under Datasets to merge.
Unsigned 8 bit integer
This is also referred to as Byte mode.
Signed 16 bit integer
This is also referred to as Short mode.
Floating-point
Each pixel is stored as a 32 bit IEEE floating-point value.
Complex (2 signed integers)
Each pixel is treated as a complex number; the real and imaginary parts are each stored as 16 bit signed integers.
Complex (2 floating-point values)
Each pixel is treated as a complex number; the real and imaginary parts are each stored as 32 bit IEEE floating-point values.
Unsigned 16 bit integer
This format is new in IVE 4 and is not a standard MRC format.
Signed 32 bit integer
This format is new in IVE 4 and is not a standard MRC format.
Unsigned 4 bit integer
This format is new in IVE 4.4.0 and is not a standard MRC format.

When converting to complex values from a real value, mergemrc sets the imaginary part to zero. When converting from a complex value to a real value, mergemrc discards the imaginary part.


Interleaving

When working with the graphical interface, use this menu to set how the z, wavelength, and time dimensions are arranged in the output file. The options are:

Non-interleaved (ZTW)
Sections with the same time and wavelength indices are adjacent in the output file and appear in order of increasing z index. Given two sections with the same wavelength indices and different time indices, the one with the smaller time index appears first. Given two sections with different wavelength indices, the one with the smaller wavelength index appears first.
Interleaved (WZT)
Sections with the same z and time indices are adjacent in the output file and appear in order of increasing wavelength index. Given two sections with the same time indices and different z indices, the one with the smaller z index appears first. Given two sections with different time indices, the one with the smaller time index appears first.
Interleaved (ZWT)
Sections with the same wavelength and time indices are adjacent in the output file and appear in order of increasing z index. Given two sections with the same time indices and different wavelength indices, the one with the smaller wavelength index appears first. Given two sections with different time indices, the one with the smaller time index appears first.
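
The three orderings can be summarized as index arithmetic. The following is only an illustrative sketch: the function name section_index and its arguments are hypothetical and are not part of mergemrc. It assumes sections are numbered from zero and that the first letter of each interleave name is the fastest-varying dimension, as the descriptions above imply.

```python
def section_index(z, w, t, nz, nw, nt, interleave="ztw"):
    """Return the zero-based position of section (z, w, t) in the output file.

    nz, nw, nt are the sizes of the z, wavelength, and time dimensions.
    """
    if interleave == "ztw":   # z varies fastest, then time, then wavelength
        return z + nz * (t + nt * w)
    if interleave == "wzt":   # wavelength varies fastest, then z, then time
        return w + nw * (z + nz * t)
    if interleave == "zwt":   # z varies fastest, then wavelength, then time
        return z + nz * (w + nw * t)
    raise ValueError(f"unknown interleave type: {interleave}")


# Example: 3 z sections, 2 wavelengths, 2 time points.
nz, nw, nt = 3, 2, 2
for name in ("ztw", "wzt", "zwt"):
    # List the (z, w, t) triples in the order they would be stored.
    order = sorted(
        ((z, w, t) for t in range(nt) for w in range(nw) for z in range(nz)),
        key=lambda s: section_index(*s, nz, nw, nt, interleave=name),
    )
    print(name, order[:4])
```

In every case the dimension named last changes most slowly, which is why two sections that differ in that dimension are never interleaved with each other.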


Extended Header

You can perform two operations on the extended header when merging and reordering files. The first is to copy, for each section written to the output file, part or all of the extended header entries that were associated with that section. In the graphical interface, enable this operation by turning on the toggle labeled Copy from input data.

The second operation is to reserve space in the output file for an extended header. In the graphical interface, when the toggle labeled Set size per section is off, the space reserved depends on whether or not you have enabled copying of the input extended header. If you have enabled copying, mergemrc reserves sufficient space for that header (based on the number of entries per section in the first file listed in the Datasets to merge section); otherwise, mergemrc reserves no space for the extended header. When the Set size per section toggle is on, mergemrc reserves space for the given number of integers and floating-point values per section. If copying the extended header from the input files and the reserved space is insufficient, mergemrc does not transfer the extra entries for each section. If the reserved space is bigger than the copied extended header or you request not to copy the extended header, mergemrc sets the additional extended header entries to zero.
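
The sizing and copy rules above can be restated as a short sketch. This is not mergemrc's code: the function names reserved_entries and transfer_entries are hypothetical, and each size is modeled as a pair (integers per section, floating-point values per section).

```python
def reserved_entries(copy_from_input, set_size_per_section, requested, first_file):
    """Return (n_ints, n_floats) reserved per section in the output header.

    requested  -- the values entered when "Set size per section" is on
    first_file -- the per-section counts of the first dataset in the list
    """
    if set_size_per_section:
        return requested
    if copy_from_input:
        return first_file
    return (0, 0)


def transfer_entries(entries, reserved_count):
    """Truncate or zero-pad one section's entries to the reserved count."""
    kept = list(entries[:reserved_count])
    return kept + [0] * (reserved_count - len(kept))


# Copying enabled, explicit size off: use the first file's layout.
print(reserved_entries(True, False, (8, 2), (4, 4)))
# Too little space drops the extra entries; too much space is zero filled.
print(transfer_entries([11, 12, 13], 2))
print(transfer_entries([11, 12, 13], 5))
```

When copying is disabled, the same padding rule applies with an empty list of entries, so every reserved entry is zero.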


Datasets to Merge

The merge/reorder process works with a list of regions selected from one or more files. For the graphical interface, each region in the list has an entry in the box in the center of the dialog. When you press the DoIt button, mergemrc merges the regions along either the z, wavelength, or time dimension but preserves the other four dimensions (to set the dimension along which mergemrc merges the regions, use the menu labeled Append datasets along). The order of the regions in the list determines the order in which they appear in the output file. For instance, if mergemrc merges two regions along their wavelength dimension, the n wavelengths in the first region become the first n wavelengths in the output file. The m wavelengths in the second region become the n+1 through n+m wavelengths in the output file.

You can perform five basic operations on the list of regions:


Append Mode

When combining and reordering datasets with the graphical interface, the selected regions from each dataset must have the same sizes in four of the five dimensions: x, y, and two of the remaining three (z, wavelength, and time). The output file will have the same sizes in these four dimensions, but in the fifth dimension its size will be the sum of the sizes of the selected regions in that dimension.

Use the menu labeled Append datasets along to select which dimension should be the fifth dimension along which mergemrc combines the selected regions.
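
As a rough illustration of the size rule, the sketch below computes the output dimensions for a list of regions. The function merged_sizes and the dictionary layout are assumptions made for this example only; they do not correspond to anything in mergemrc.

```python
DIMENSIONS = ("x", "y", "z", "w", "t")


def merged_sizes(regions, append_along):
    """Return the output sizes when regions are appended along one dimension.

    regions      -- list of dicts mapping each name in DIMENSIONS to a size
    append_along -- "z", "w", or "t"
    """
    if append_along not in ("z", "w", "t"):
        raise ValueError("can only append along z, w, or t")
    first = regions[0]
    out = dict(first)
    for region in regions[1:]:
        for dim in DIMENSIONS:
            # The four dimensions that are not appended must match exactly.
            if dim != append_along and region[dim] != first[dim]:
                raise ValueError(f"size mismatch in {dim}")
        out[append_along] += region[append_along]
    return out


a = {"x": 512, "y": 512, "z": 10, "w": 3, "t": 1}
b = {"x": 512, "y": 512, "z": 30, "w": 3, "t": 1}
print(merged_sizes([a, b], "z"))
```

Appending the same two regions along the wavelength dimension would fail the check, because their z sizes differ.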


Z, wave, and time sizes

When working with the graphical interface, this read-only field shows the dimensions of the dataset that is the source for the currently selected region. If for some reason the dataset is not readable, this field will be blank.


Z start index, size, step

When working with the graphical interface, this field shows which z sections from the source dataset of the currently selected region are to be used. The first value is the index of the first section to use; valid indices range from zero to the z dimension of the dataset minus one (the z dimension is the first value shown in the Z, wave and time sizes field). The second value is the number of sections to use. The third value is the amount by which the z index will be incremented to find the next section that will be read from the dataset. Any value for the increment and any positive value for the size are allowed as long as

     (start index) + (size - 1) * step

falls between zero and the number of z sections in the dataset minus one. In particular, you can use a negative step to reverse the order in which the z sections will appear in the output file or a step of zero to copy a single section to multiple sections in the output file.

The Reverse button next to the field is just a way to change the values in the field to cover the same set of z sections in reverse order.

When you select a new file with the File button or by changing the contents of the field next to that button, mergemrc changes the start index, size, and step to values that select all z sections in the file: the start index will be zero, the size will be the z dimension of the file, and the step will be one.
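
The start index, size, and step arithmetic is the same for the z, wavelength, and time fields, so one sketch covers all three. The helper names below are hypothetical; the sketch only restates the rules given in the text, including what the Reverse button is described as doing.

```python
def selected_indices(start, size, step):
    """List the indices read from the dataset, in the order they are used."""
    return [start + i * step for i in range(size)]


def is_valid_range(start, size, step, n):
    """Check a range against a dimension that holds n entries (0 .. n-1)."""
    if size <= 0:
        return False
    last = start + (size - 1) * step
    return 0 <= start <= n - 1 and 0 <= last <= n - 1


def reverse_range(start, size, step):
    """Return the (start, size, step) that covers the same indices backwards."""
    return (start + (size - 1) * step, size, -step)


print(selected_indices(0, 5, 2))                  # every other index
print(selected_indices(*reverse_range(0, 5, 2)))  # same indices, reversed
print(selected_indices(3, 4, 0))                  # one index repeated
print(is_valid_range(0, 5, 2, 8))                 # last index 8 is out of range
```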


Wavelength start index, size, step

When working with the graphical interface, this field shows which wavelengths from the source dataset for the currently selected region are to be used. The first value is the index of the first wavelength to use; valid indices range from zero to the number of wavelengths in the dataset minus one (the number of wavelengths is the second value shown in the Z, wave and time sizes field). The second value is the number of wavelengths to use. The third value is the amount by which the wavelength index will be incremented to find the next wavelength that will be read from the dataset. Any value for the increment and any positive value for the size are allowed as long as

     (start index) + (size - 1) * step

falls between zero and the number of wavelengths in the dataset minus one. In particular, you can use a negative step to reverse the order in which the wavelengths will appear in the output file or a step of zero to copy a single wavelength to multiple wavelengths in the output file.

The Reverse button next to the field is just a way to change the values in the field to cover the same set of wavelengths in reverse order.

When you select a new file with the File button or by changing the contents of the field next to that button, mergemrc changes the start index, size, and step to values that select all wavelengths in the file: the start index will be zero, the size will be the number of wavelengths in the file, and the step will be one.


Time start index, size, step

When working with the graphical interface, this field shows which time points from the source dataset for the currently selected region are to be used. The first value is the index of the first time point to use; valid indices range from zero to the number of time points in the dataset minus one (the number of time points is the third value shown in the Z, wave and time sizes field). The second value is the number of time points to use. The third value is the amount by which the time index will be incremented to find the next time point that will be read from the dataset. Any value for the increment and any positive value for the size are allowed as long as

     (start index) + (size - 1) * step

falls between zero and the number of time points in the dataset minus one. In particular, you can use a negative step to reverse the order in which the time points will appear in the output file or a step of zero to copy a single time point to multiple time points in the output file.

The Reverse button next to the field is just a way to change the values in the field to cover the same set of time points in reverse order.

When you select a new file with the File button or by changing the contents of the field next to that button, mergemrc changes the start index, size, and step to values that select all time points in the file: the start index will be zero, the size will be the number of time points in the file, and the step will be one.


X/Y Range

In the process of merging and/or reordering the input sections, a rectangular region of each section can be transferred to the output file, with the restriction that the region must have the same size and the same indices for the lower left corner for all sections. For the graphical interface, when the toggle labeled Use full X/Y from first file is on, the x and y indices for the lower left corner are zero, and the size of the region is the size of a section from the first file listed in the Datasets to merge section. Otherwise, the indices for the lower left corner are the values in the X start index and Y start index fields, and the x and y dimensions of the rectangle are the values in the X size and Y size fields.

The input files can differ in size, but the selected rectangular region must fall within the bounds of every input file; otherwise mergemrc reports an error and does not generate an output file.
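
The bounds requirement amounts to a simple check against each input file. The sketch below is illustrative only; region_fits is a hypothetical helper, and file sizes are given as (x size, y size) pairs.

```python
def region_fits(x_start, x_size, y_start, y_size, file_sizes):
    """Return True if the rectangle lies inside every (nx, ny) in file_sizes."""
    return all(
        0 <= x_start and x_start + x_size <= nx
        and 0 <= y_start and y_start + y_size <= ny
        for nx, ny in file_sizes
    )


files = [(512, 512), (256, 512)]
print(region_fits(0, 256, 0, 512, files))   # fits in both files
print(region_fits(0, 512, 0, 512, files))   # too wide for the second file
```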


Command Line Interface

The syntax for running the merge/reorder application at the command line is (optional parts are included in brackets):

  mergemrc [-interleave=interleaveType] [-mode=pixelType] \
       [-x=xStart:xSize] [-y=yStart:ySize] \
       [-no_copy_extended] \
       [-extended_size=nIntegers:nFloats] \
       [-specify_out] [-append_waves] [-append_z] [-append_times] \
       outputFile \
       [-f=file] \
       [inputFile1 [-in_sections=start[:size[:step]]] \
       [-in_z=start[:size[:step]]] [-in_w=start[:size[:step]]] \
       [-in_t=start[:size[:step]]] \
       [-out_sections=start[:size[:step]]] \
       [-out_z=start[:size[:step]]] [-out_w=start[:size[:step]]] \
       [-out_t=start[:size[:step]]]] \
       ...

For examples of how mergemrc could be used, see Command Line Examples. Below are descriptions of the input options and arguments:

-interleave
-interleave=interleaveType specifies how the z, wavelength, and time dimensions should be organized when stored in the output file. Valid options for interleaveType are ztw (this is the default when -interleave is not used), wzt, and zwt.
-mode
-mode=pixelType specifies the numeric format to use for each output pixel. Valid options for pixelType are unsigned8, signed16, float32, complex_signed16, complex_float32, unsigned16, signed32, and unsigned4. When -mode is not used, the program uses the same format as was used in the first input file on the command line.
-x and -y
-x=xStart:xSize and -y=yStart:ySize can be used to only output a rectangular portion of the input sections. Only pixels whose x index is in the range [xStart, xStart + xSize - 1] and whose y index is in the range [yStart, yStart + ySize - 1] are written to the output file. Indices start from zero. If you do not specify -x, mergemrc assumes a value of zero for xStart and a value equal to the number of pixels in x for xSize. Similarly, mergemrc uses zero for yStart and the number of input pixels in y for ySize if you do not specify -y.
-no_copy_extended
When -no_copy_extended appears on the command line, mergemrc does not copy the extended header entries from the input files and fills with zeroes any extended header reserved with the -extended_size option. If -no_copy_extended does not appear on the command line, mergemrc transfers the extended header entries for the sections written to the output. When the output extended header has fewer integer entries per section than the input header, mergemrc does not transfer the additional entries in the input header. If the output extended header has more integer entries per section than the input header, mergemrc fills the extra entries with zeroes. mergemrc truncates or pads the floating-point values in the extended header in a similar fashion.
-extended_size
Use -extended_size=nIntegers:nFloats to reserve space for the extended header, where nIntegers is the number of integers per section in the extended header and nFloats is the number of floating-point values per section. When -no_copy_extended appears on the command line and -extended_size does not, mergemrc assumes that nIntegers and nFloats are both zero. If you use neither -no_copy_extended nor -extended_size, the number of extended header entries per section is the same as in the first input file on the command line.
-specify_out
To use -out_z, -out_t, -out_w, or -out_sections, you must specify -specify_out on the command line. You may not use -specify_out with -append_waves, -append_z, or -append_times. If you use -specify_out, mergemrc does not transfer wavelength information from the input files to the output file. If you do not use -specify_out, mergemrc, by default, creates an output file with wavelength and time dimensions of 1, appends the input ranges along the z dimension, and does not preserve the wavelength information from the original data.
-append_z, -append_waves, and -append_times
If you do not use -specify_out, mergemrc creates, by default, an output file with wavelength and time dimensions of 1, appends the input ranges along the z dimension, and does not preserve the wavelength information from the original data. You can use -append_waves, -append_z, or -append_times to override this default and preserve more of the arrangement of the input ranges. In general, the -append_x option, where x is z, waves, or times, causes mergemrc to combine the input files along the x dimension but leave the other dimensions alone (you can still use the -in_z, -in_w, or -in_t options to select sections from the input). The order in which the input files appear on the command line is important because mergemrc appends the files along the dimension that is changed in that order. Also, the input ranges must have the same size along the other dimensions (this restriction is applied after the -in_z, -in_w, or -in_t options are applied). As an example, if you want to create a file which has wavelength 1 from file A, wavelength 3 from file B, and wavelengths 2 and 3 from file A in that order,
mergemrc -append_waves outputFile A -in_w=0:1 B -in_w=2:1 A -in_w=1:2
does the trick as long as A and B have the same dimensions in z and time.
outputFile
outputFile is the name mergemrc will use for the output data file. If a file with that name exists, mergemrc will overwrite it; this implies that the merge operation cannot be done in place.
-f
A -f=file option supplies the name of a text file that mergemrc will read for input section ranges. Use a single dash for the file name to cause mergemrc to read the information from standard input. The format of the file is just like specifying the section ranges on the command line except that newlines are treated as spaces and you may not use -f options in the file (i.e. you may not nest -f options). You may use any number of -f options on the command line, and you may mix them with directly specified input ranges.
inputFile
On the command line, specify an input range of sections by supplying a filename followed directly by options to control the sections to extract and, if you use the -specify_out option, options to specify where to write the output sections.
-in_z, -in_w, -in_t, and -in_sections
The options to set the input range of sections are -in_sections, -in_z, -in_w, and -in_t. You may not use -in_sections if you use -append_waves, -append_z, or -append_times, and for one range, you may not use -in_sections in combination with -in_z, -in_w, or -in_t. Finally, you may not use one of the -in_z, -in_w, or -in_t options more than once for a single range. With those restrictions, the following are not allowed: The last two can be expressed in alternate ways which are allowed:
For a given -in_x option, you may omit the step, or both the size and the step. If you omit both, the range runs from the starting point through the maximum index in the dimension with a step of one. If you omit only the step, mergemrc uses a step of one. A negative step is allowed (useful for reversing the order), and a zero step is allowed (useful for filling the output with repeated copies of a section or set of sections).
If you do not specify any -in_x option for a file, mergemrc uses all of its contents; that is equivalent to -in_z=0 -in_w=0 -in_t=0 if you use -append_waves, -append_z, or -append_times, and to -in_sections=0 otherwise. If you use one or more of -in_z, -in_w, or -in_t but omit one or more, mergemrc assumes the omitted ones cover the full range of the file.
If you do not use -append_waves, -append_z, or -append_times and you use -in_z, -in_w, or -in_t options for a range, the order in which the -in_x options appear is important because it influences the order in which mergemrc writes the sections to the output file. The first that appears is the innermost loop (it varies fastest), the second is the next innermost, and the third is the outer loop. When ordering is important, be careful if you use just one of the -in_z, -in_w, or -in_t options for a range: if at least one of the other dimensions does not have a size of one, the ordering is underdetermined and mergemrc will report an error.
-out_z, -out_w, -out_t, and -out_sections
You can only use -out_sections, -out_z, -out_w, or -out_t if you have specified -specify_out. All of the -out_x options set where in the output file mergemrc writes the input sections. As with the -in_x options, you may not specify the same -out_z, -out_w, or -out_t more than once per range, and you may not mix -out_sections with -out_z, -out_w, or -out_t for any one range. If you do not supply any -out_x options for a range and you use -specify_out, mergemrc writes the selected input sections to the output file in the order they were selected, starting at the first section in the file (i.e. supplying no -out_x options is the same as -out_sections=0). If you specify one or more of the -out_z, -out_w, or -out_t options but omit one or more of the others, mergemrc uses a starting index of zero, a size of one, and a step of one for the missing options. If you omit the step for any of the -out_z, -out_w, or -out_t options, mergemrc assumes a step of one; if you omit the size, mergemrc assumes a size of one. For the -out_sections option, an omitted step is equivalent to one and an omitted size is equivalent to the number of input sections in the range.
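
The loop-nesting rule for the -in_z, -in_w, and -in_t options (the first option on the command line varies fastest) can be pictured with the sketch below. It is a model of the described ordering only, not mergemrc's implementation; read_order and its tuple format are assumptions made for the illustration.

```python
from itertools import product


def read_order(in_options):
    """Return the order in which sections would be read for one range.

    in_options -- list of (dimension, start, size, step) tuples in the order
                  the -in_x options appear on the command line
    """
    names = [name for name, _start, _size, _step in in_options]
    ranges = [
        [start + i * step for i in range(size)]
        for _name, start, size, step in in_options
    ]
    order = []
    # itertools.product varies its last argument fastest, so the options are
    # passed in reverse to make the first command-line option the inner loop.
    for combo in product(*reversed(ranges)):
        order.append(dict(zip(reversed(names), combo)))
    return order


# Equivalent to "-in_w=0:2 -in_z=0:3": wavelength is the innermost loop.
for section in read_order([("w", 0, 2, 1), ("z", 0, 3, 1)]):
    print(section)
```

Swapping the two options in the list makes z the innermost loop instead, which changes the order of the sections without changing which sections are selected.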

Topics

Overview | Graphical interface | Command line interface | Command line examples


Command Line Examples

File a.dat has a series of z-sections as does b.dat. A combined file, c.dat, with the z-sections from a.dat followed by those in b.dat is desired.

        mergemrc c.dat a.dat b.dat

File a.stk has 60 sections representing tilt angles from 2.5 degrees to 65 degrees; file b.stk has 63 sections representing tilt angles from 0 to -65 degrees (in that order). A file, c.stk, is desired which has the tilts in increasing order but excludes the 5th and 30th sections in b.stk since they are corrupt frames.

        mergemrc c.stk b.stk -in_sections=62:33:-1 \
            b.stk -in_sections=28:24:-1 \
            b.stk -in_sections=3:4:-1 \
            a.stk
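
A quick way to sanity-check ranges like these before running the command is to expand them and see which sections are left out. The helper expand_range below is hypothetical and only mirrors the start[:size[:step]] rules described for -in_sections.

```python
def expand_range(spec, n_sections):
    """Expand 'start[:size[:step]]' into the list of section indices it selects."""
    parts = [int(p) for p in spec.split(":")]
    start = parts[0]
    step = parts[2] if len(parts) > 2 else 1
    # With the size omitted, the range runs through the last section.
    size = parts[1] if len(parts) > 1 else n_sections - start
    return [start + i * step for i in range(size)]


used = []
for spec in ("62:33:-1", "28:24:-1", "3:4:-1"):
    used.extend(expand_range(spec, 63))

skipped = sorted(set(range(63)) - set(used))
print(skipped)    # [4, 29]: the 5th and 30th sections, counting from one
print(len(used))  # 61
```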

File a.dat has ten z sections, three wavelengths, and one time. It is to be combined with b.dat which has the same number of wavelengths and times but has 30 sections which should appear contiguous with the ones from the matching wavelength in a.dat.

        mergemrc -append_z c.dat a.dat b.dat

File c.dat was collected by repeatedly moving the stage up and down while collecting data by taking quick exposures at fixed intervals. There are 5 up and down sequences with 40 frames apiece. Because of backlash in the stage motor, the 19th, 20th, 39th, and 40th frames of each of these sequences are not to be included in the final output file, b.dat, which will have the sequences properly grouped in time and reordered in z.

        mergemrc -specify_out b.dat \
            c.dat -in_sections=0:18 -out_z=0:18 -out_t=0 \
            c.dat -in_sections=37:18:-1 -out_z=0:18 -out_t=1 \
            c.dat -in_sections=40:18 -out_z=0:18 -out_t=2 \
            c.dat -in_sections=77:18:-1 -out_z=0:18 -out_t=3 \
            c.dat -in_sections=80:18 -out_z=0:18 -out_t=4 \
            c.dat -in_sections=117:18:-1 -out_z=0:18 -out_t=5 \
            c.dat -in_sections=120:18 -out_z=0:18 -out_t=6 \
            c.dat -in_sections=157:18:-1 -out_z=0:18 -out_t=7 \
            c.dat -in_sections=160:18 -out_z=0:18 -out_t=8 \
            c.dat -in_sections=197:18:-1 -out_z=0:18 -out_t=9
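
Because the ten ranges follow a regular pattern, a long command like this one can be generated rather than typed. The sketch below rebuilds the same argument list; the variable names are made up for the example, and the numbers (20 frames per pass, 18 kept, 10 passes) are taken from the command above.

```python
frames_per_pass = 20   # each up or down pass of the stage
kept = 18              # frames kept per pass; the last two are dropped
passes = 10            # 5 up/down sequences give 10 passes

args = ["mergemrc", "-specify_out", "b.dat"]
for t in range(passes):
    first = t * frames_per_pass
    if t % 2 == 0:
        # Upward pass: keep the first 18 frames in their original order.
        in_spec = f"{first}:{kept}"
    else:
        # Downward pass: read the first 18 frames backwards so z increases.
        in_spec = f"{first + kept - 1}:{kept}:-1"
    args += ["c.dat", f"-in_sections={in_spec}", f"-out_z=0:{kept}", f"-out_t={t}"]

print(" ".join(args))
```

The resulting list could be passed to a process launcher such as subprocess.run, or the ranges could be written to a text file and supplied with the -f option.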
