alignframes(1)              General Commands Manual              alignframes(1)

NAME
       alignframes - Aligns and sums camera movie frames and stacks sums

SYNOPSIS
       alignframes  options

DESCRIPTION
       Alignframes aligns frames from direct electron detectors and other cameras that can output a set of frames from an acquisition. It can take input from multiple frame files and produce a single stack that is ready to use for tilt series processing. It implements two basic alternatives for an initial alignment that can optionally be refined: aligning each frame to a reference accumulated from previously aligned frames; or solving for the shifts of individual images by fitting to shifts measured between many pairs of frames. The latter approach has several advantages and can be applied in various ways. The advantages are:

       1) It involves multiple measurements for every image and thus can average over more information for aligning the first few frames than the cumulative method can.

       2) Robust regression is used for the fitting, so a small proportion of bad alignments can be tolerated and should not degrade the result. Thus, this method should be more resistant to occasional failed correlations due to fixed pattern noise.

       3) The fitting yields an estimate of the residual error for each measurement, from which it is possible to derive an error measure that reflects the overall quality of the alignment and that can be used to compare results with different parameters.

       The fitting to shifts between pairs of frames is done for successive sets of frames (7 frames per set by default). With that setting, each frame is aligned to all preceding frames for the first 7 frames, then the first fit is done and a best shift determined for the first frame. The 8th frame is then aligned to frames 2 through 7, and the next fit yields a shift for the second frame.
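       The order of correlations for the successive-set scheme can be illustrated with a short Python sketch. This is not the program's code: the function name pairwise_schedule is hypothetical, and the rule for frames beyond the 8th is an assumption extrapolated from the two cases described above.

```python
def pairwise_schedule(n_frames, set_size=7):
    """Yield (frame, partners) using 1-based frame numbers.

    Assumption: each new frame is correlated with the earlier frames that
    share a fitting set with it, i.e. at most set_size - 1 preceding frames.
    """
    for frame in range(2, n_frames + 1):
        first = max(1, frame - set_size + 1)
        yield frame, list(range(first, frame))


def count_correlations(n_frames, set_size=7):
    """Total number of pairwise correlations for the sliding-set scheme."""
    return sum(len(partners) for _, partners in pairwise_schedule(n_frames, set_size))


if __name__ == "__main__":
    schedule = dict(pairwise_schedule(10))
    print("frame 7 is aligned to", schedule[7])
    print("frame 8 is aligned to", schedule[8])
    print("sliding sets:", count_correlations(10), "correlations")
    print("all pairs:   ", 10 * 9 // 2, "correlations")
```

       With sets of 7, the number of correlations grows linearly with the number of frames once the first set is filled, which is why this scheme stays cheap for long frame stacks.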
       It is possible to align every frame to every other one and do one fit to find all of the shifts at once, but this approach does not seem to give any advantage in the typical case, and the number of correlations can become quite large because it is proportional to the square of the number of frames. When pairwise fits with sets of more than 7 frames are indicated, another alternative strategy is to do pairwise fits with sets of half the frames (see below).

       The program allows one to test the quality of the fits with different high frequency filter cutoffs, and even with different binnings. Multiple filters can be tested quickly in one run through the data, but testing multiple binnings requires multiple runs through the data. As of IMOD 4.12.19, every fit to the pairwise shifts includes a cross-validation step: the fit is done multiple times with one or a few pairwise shifts omitted each time, and the solved frame shifts are used to predict the pairwise shifts that were left out. The result is a mean "leave-out" error, which is a much more reliable indicator than the mean residual error of whether one set of fitting parameters is better than another. The mean residual can be misleadingly low when the ratio of measurements to values being solved for is too low to average out the random error in the values being fit.

       After the initial alignment, it is possible to realign each frame to a reference consisting of all other aligned frames. (For high-noise data, it is essential to leave the frame being aligned out of the reference, or it would dominate the alignment.) This refinement can help after alignment to a cumulative reference, but has seemed superfluous in most tests with initial alignments from pairwise shifts.

       All frame data are maintained as Fourier transforms, so each additional alignment only involves one inverse FFT.
       Frames are shifted into alignment in Fourier space to avoid losses from interpolation, and they are reduced in size for the final sum (if at all) by cropping in Fourier space to avoid aliasing.

       If you want to compute a Fourier ring correlation, enter either the -lines option with a value of 3 or the -frc option with a filename. Fourier transforms of even and odd frames are summed separately and a Fourier ring correlation is computed. The program reports the frequencies at which the FRC crosses 0.5 and 0.25 in cycles/pixel of the summed images, and also reports the mean value around a frequency of 0.25/pixel (half-Nyquist). The FRC is the only tool for comparing results with the cumulative alignment to those with pairwise alignment or for assessing the change from refining a pairwise alignment at the end. The FRC can also be used for validating the choice of filter or binning suggested by trying multiple values. However, changes in the FRC may generally be quite small, so it will usually be helpful to assess a change in parameters with a number of frame stacks. The program Subtractcurves can help in this assessment as described below.

       Quite strong high-frequency filtering is needed for typical frame alignments. The filter cutoff is entered in frequency units of the unbinned data so that a particular value has about the same effect at different binnings. The default filter (0.06/pixel) is close to what is typically needed, but smaller values (down to 0.05) or possibly larger values (up to ~0.1) may give better results. Binning (actually antialiased reduction in size) accomplishes most of the removal of high-frequency noise prior to the application of the frequency filter, so these two operations are partly redundant. Most of the motivation for binning is to speed up the alignment; however, after a certain point, additional binning will somewhat reduce the accuracy with which the correlation peak position can be measured.
       Thus, the program facilitates testing with different binnings, although such testing is probably only needed when getting started with a particular class of data, whereas testing with multiple filters is likely to be used more routinely.

   Alignment Strategies
       The big challenges in aligning frames are the low signal-to-noise ratio and interference from fixed pattern noise. There are some significant distinctions between tilt series and single particle data when considering what methods to apply. First, tilt series may have a lower dose per frame, but they are likely to have more features in the image than some single-particle images, and thus more signal to align with. Second, there can be beam damage such as doming of the ice within a series of single-particle frames, which would not be apparent for a set of tilt series frames with a low dose per tilt image.

       One can think of a series of possible strategies for dealing with increasingly difficult data:

       1) When there is strong signal and no appreciable fixed pattern noise, the simple method of aligning to a cumulative reference of already aligned frames may be adequate. Here, refinement at the end may help if it improves the shifts for the first few frames, which were subject to the noisiest correlations.

       2) Images with reasonable SNR and little fixed pattern noise should work with the default method of aligning all pairs among successive sets of 7 frames.

       3) If the noise is higher or signal lower, or if there is serious fixed pattern noise, then pairwise alignment among much larger sets of frames will be needed.

       3A) For tilt series data, using all pairs of frames would generally be appropriate.

       3B) For single particle data, it is better to do pairwise alignment among half the total number of frames, to avoid correlating across large changes in the specimen.
       However, if fixed pattern noise is a problem, it may be necessary to use all pairs instead, so that some correlations with large shifts will be available for all frames. (Fixed pattern noise makes the correlations unreliable or inaccurate for small shifts; even when the filtered peak at the origin is smaller than the true correlation peak, it can displace the peak position if the two peaks overlap.)

       3C) Refinement at the end is risky if there is serious fixed pattern noise.

       4) If the noise is just too high for alignments between single frames to work, then grouping can be used. A group size of 3 can help considerably. Refinements can be done at the end by correlating either single frames or, if necessary, just the grouped frames with sums of other frames.

   Text Output with Single Parameter Settings
       In cases where there is one set of frames per file, output for a file starts with a line like

           File 1 (Feb21_10.12.15.mrc): 11 frames

       When multiple sets of frames are contained in a single input file, the output for each set starts with a line with "Set", such as for a fast-incremental single-exposure tilt series:

           Set 3 (pilY13_005.tif): 10 frames from 803 to 812 (-54.0 deg)

       When there is only one filter and binning being used, the following two lines present summary statistics for the results of robust fitting to the shifts for each set of pairwise alignments. All of the values and distances are in unbinned pixels. The first line has these items:

         Weighted residual mean (abbreviated as Wgt resid mean): The mean residual error value, averaged over all of the fits. This is a weighted error, so aberrant shifts that are completely down-weighted are not reflected here.
         mean max: The mean value of the maximum weighted residual, averaged over all the fits

         max max: The maximum weighted residual seen in any of the fits

         leave-out error (abbreviated as l-o err): The mean leave-out error from the cross-validation fits

       The second line has these items:

         Max unweighted resid mean: The average of the maximum unweighted residual values seen in the fits

         max: The maximum unweighted residual seen in any fit

         Dist: Raw sum of distances from one frame position to the next

         smoothed: Sum of smoothed distances. If spline smoothing is used, this is the distance for the smoothed shifts that are used to sum the images. Otherwise, this is based on a local polynomial smoothing that is not very good.

       When the -lines option is entered with a 3, there is also a summary of the output from Fourier ring correlation. This line reports the frequencies (in cycles/pixel) at which the curve crosses below 0.5, 0.25, and 0.125. The last number on the line is the mean value of the curve around 0.25/pixel (half-Nyquist). If the sum is reduced in size, these frequencies are in terms of the binned pixels.

       The most important values on the first line are the weighted residual mean and the leave-out error. After each fit, the shift between each pair of images is computed from the solved shifts of the individual images, and the residual for that shift is the difference between the computed and measured values, multiplied by the weighting factor applied to the measurement. The leave-out errors were explained above, except for the fact that the error for each predicted pairwise shift is multiplied by the same weighting factor that was obtained for that shift in the solution with all data included. The means of both of these errors give some indication of how accurate the shifts should be, on average, but the leave-out error is a more reliable indicator. Good values for the mean residual are in the range of 0.05 to ~0.3, but some sets may give values in the range of 1-3.
       The latter cases are a sign that one should try analyzing a higher number of pairwise shifts or even try grouping. Using more pairwise shifts will generally increase the mean residual but will allow greater averaging over these random errors and thus reduce the leave-out error; grouping should improve the residuals and leave-out errors.

       The maximum residual values on the first line should not be many times larger than the mean; these values indicate that there may be a shift in error by that amount.

       The Max unweighted resid mean on the second line reflects how often there are bad shifts measured between pairs of images. With no bad shifts, it should not be much bigger than the maximum weighted residual; values above a few pixels may indicate that there are bad correlations in most fits. Reducing the filter cutoff, reducing the maximum allowed shift, and grouping may improve this value; increasing the number of pairs being fit will increase the ability of the program to reject the bad shifts as outliers. (As long as the weighted maximum residuals are not high, the program is already able to reject the bad shifts.)

       The distance values on the second line reflect the total specimen movement, and comparison between raw and smoothed distances gives some indication of how jittery the solved shifts are. They become more important when trying different filter settings, as explained below.

   Text Output with Tests of Binning and/or Filters
       If you specify either multiple filters with the -vary option or multiple binnings with the -test option, there will be output similar to what was just described for each condition being tested. An initial line for each condition shows the binning, the filter cutoff value, and the sigma for the filter falloff. The latter is varied in proportion to the cutoff value.
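       The leave-out error reported for each condition comes from the cross-validation of the pairwise fits. The following Python code is a minimal sketch of that idea, not the program's implementation. It assumes one-dimensional shifts, ordinary unweighted least squares in place of robust regression, and exactly one pairwise shift left out per refit; the function names solve_shifts and mean_leave_out_error are hypothetical.

```python
from itertools import combinations


def solve_shifts(n_frames, measurements):
    """Least-squares frame shifts from pairwise shifts, with frame 0 fixed at 0.

    measurements is a list of (i, j, d), where d is the measured s[j] - s[i].
    """
    m = n_frames - 1                       # unknowns are s[1] .. s[n_frames - 1]
    a = [[0.0] * m for _ in range(m)]      # normal-equation matrix
    b = [0.0] * m                          # right-hand side
    for i, j, d in measurements:
        # Design-matrix row: +1 for frame j, -1 for frame i (frame 0 has no column).
        terms = [(k, c) for k, c in ((j - 1, 1.0), (i - 1, -1.0)) if k >= 0]
        for k, c in terms:
            b[k] += c * d
            for l, e in terms:
                a[k][l] += c * e
    # Gaussian elimination with partial pivoting, then back substitution.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    s = [0.0] * m
    for r in range(m - 1, -1, -1):
        acc = b[r] - sum(a[r][c] * s[c] for c in range(r + 1, m))
        s[r] = acc / a[r][r]
    return [0.0] + s


def mean_leave_out_error(n_frames, measurements):
    """Refit without each measurement in turn and average the prediction errors."""
    errors = []
    for idx, (i, j, d) in enumerate(measurements):
        rest = measurements[:idx] + measurements[idx + 1:]
        s = solve_shifts(n_frames, rest)
        errors.append(abs((s[j] - s[i]) - d))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    true = [0.0, 1.5, 2.0, 4.0, 3.0]       # made-up frame positions
    exact = [(i, j, true[j] - true[i]) for i, j in combinations(range(5), 2)]
    # Perturb a single pairwise measurement to mimic one bad correlation.
    noisy = [(i, j, d + (0.2 if (i, j) == (1, 3) else 0.0)) for i, j, d in exact]
    print("leave-out error, exact data:", mean_leave_out_error(5, exact))
    print("leave-out error, one bad pair:", mean_leave_out_error(5, noisy))
```

       Because each left-out shift is predicted from a fit that never saw it, this error cannot be driven down simply by having too few measurements per unknown, which is the weakness of the mean residual.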
       When there are multiple filters, the program will compose a "hybrid" solution that is based on the filter that gives the lowest leave-out error after each fit to a set of pairwise shifts. The set of results from this hybrid solution appears after the ones for the various filters, with an initial line showing "Hybrid results, bin =". This solution is not used by default, only if the -hybrid option is given.

       After all of the results, there will be a line indicating which condition is considered best, such as:

           File 1: Best at bin = 8  rad2 = 0.060  sig 0.009  mean res = 0.121  l-o = 0.180

       However, the selection of a best solution must be treated with caution. If fixed pattern noise is significant, the fit may appear to improve dramatically with high filter cutoffs. One sign that fixed pattern noise could be taking over is a substantial decline in the distance travelled with higher filter settings; the program will not consider these filters to be better if the decline is too great.

       If the program is run on more than one file, then at the end there will be a report of the number of times each combination of binning and filter cutoff gave the best solution. For example:

           Number of times each condition is best (rad2 in parentheses):
           bin = 6     3 (0.050)    1 (0.060)    0 (0.080)
           bin = 8     3 (0.050)    1 (0.060)    1 (0.080)

       This indicates that a cutoff of 0.05 is generally better than 0.06, and that there is not much difference between binnings 6 and 8.

       Not all combinations of binning and filter cutoff are meaningful. Filter cutoffs that are at or above the Nyquist frequency for a particular binning will have little or no effect.
       Here are the Nyquist frequencies for common binnings:

           Binning     Nyquist frequency
              2           0.25
              3           0.167
              4           0.125
              6           0.083
              8           0.062

   Support for Frame Files with Extended Header Data
       If the frame files have an extended header (as files saved by UCSFtomo do), the program will look for several features:

       1) If the header is large enough to contain a gain reference, then these data will be extracted and used to gain-normalize the frames, unless a different gain reference is supplied with the -gain option. The reference will be assumed to be in the correct orientation to apply to the frames; but if it is not, the -rotation option can be used to reorient it.

       2) If the header appears to contain valid tilt angles, the program will use these values to break the frames into separate sets for aligning and summing, unless the -break option is given. It will recognize one place where there are two sets at the same angle and make two sums there, provided that there are at least 5 sets of frames of the same size prior to that place. Thus, if you use -frame to try aligning a subset of tilt angles right around the starting point of a tilt series, you may have to use -break to prevent these two images from being combined. The tilt angles will be placed into the extended header of the output file unless provided by another source (the -tilt or -stack options).

       3) If the extended header contains valid entries for pixel size and tilt axis rotation angle, the pixel size will be placed into the standard header location and a title will be added with the rotation angle.

       Note that although multiple files can still be entered when there is an extended header, the program will insist that their properties match.

   Dose Weight Filtering
       The program can do dose weight filtering of frames before summing for sets of frames from either tilt series or single-particle acquisition.
       The filter is as described in Grant and Grigorieff, 2015 (DOI:10.7554/eLife.06980) and the implementation follows that in their "unblur" program. At any frequency, the filter follows an exponential decay with dose, where the exponent is the dose divided by 2 times a "critical dose" for that frequency. This critical dose was empirically found to be approximated by a * k^b + c, where k is frequency; the values of a, b, c in that paper are used by default but can be modified with the -critical option. This filter function is applied for all frequencies (complete attenuation above the "optimal dose" is no longer considered appropriate).

       For each set of frames, the program will apply a weighting that changes through the set of frames as dose accumulates. If just an overall dose for the image is specified, this dose is divided equally among the frames. However, if frames are not of equal duration, it is possible to specify the different doses of the frames. For a tilt series, the weighting also changes through the frame sets according to the amount of dose prior to each set of frames.

       A variety of options are described below for providing dose information to the program. The ".mdoc" file from SerialEM can be a convenient source of such information, particularly if frames are not of equal duration. If an .mdoc is supplied for single particle data with the -dfile option, it can contain entries for many images; the program will pick out the dose information that applies to the frame sets being aligned. For tilt series, the .mdoc provides cumulative dose information directly if created by SerialEM in 2017 or later; if that information is missing, it can be derived from the date-time stamps of the frames.

       When the -adjust option is used to write a new .mdoc file, the program makes several changes to make the new file usable or more robust for dose weighting. If there are dose entries but no cumulative (prior) dose entries, prior dose entries will be added.
       If some tilt images are subsequently removed with Excludeviews, the .mdoc file made in that process will still be valid for dose weighting, whereas prior dose values derived from date-time stamps instead would then be incorrect. If you enter a fixed total dose per frame stack with -dtotal, either because there is no dose information in the .mdoc or because you wish to override the dose value in the file, the program will add or replace both the dose per tilt image and the prior dose entries so that they are correct for the new value. Again, this allows Excludeviews to be used.

   Computer and GPU Memory Usage
       The amount of computer memory required for this processing depends mostly on the size of the images and whether all of the frames will be held in memory until all alignments are completed. Each frame held in memory requires 4 times as many bytes as pixels. Frames will be held in memory if only one binning is used and any of the following options are used: multiple filters (unless the -hybrid option is used to allow the final setting of one shift after each pairwise fit), pairwise alignment among all frames, refinement after initial alignment, or smoothing of shifts (which occurs by default with 15 or more frames). Otherwise, the number of frames held until the end will equal the number of frames used for pairwise fits (plus the group size minus one, if grouping). However, when more than one binning is tested, the program will not hold any frames but instead read them in again on a second pass.

       The amount of memory needed for the binned images being aligned can also become large if the binning is small. These images are all held until the end if there is refinement of the initial alignment; otherwise the number retained will be the number used for pairwise fits.

       When using a GPU, the size of unbinned images dominates the usage, but space is needed for only 3 or 4 image-sized arrays. This can approach 1 GB for ~8K images.
       The frames are actually held in computer memory until their shifts are finalized. Binned images for alignment will also consume significant amounts of the GPU memory; these images stay on the GPU instead of in computer memory, and the requirements there are the same as they would be in computer memory. If binning is only to ~2Kx2K and there are 60 frames, with refinement at the end, then 1 GB would be needed; fewer frames or more binning (both typical) would result in half this requirement for the aligned frames. Thus, 2 GB of memory should suffice for typical usage, but 4 GB would handle almost any anticipated need.

OPTIONS
       Alignframes uses the PIP package for input (see the manual page for pip). Options can be specified either as command line arguments (with the -) or one per line in a command file (without the -). Options can be abbreviated to unique letters; the currently valid abbreviations for short names are shown in parentheses.

   INPUT CONTROL OPTIONS
       These options specify the input to the program or are related to input options

       -input (-in) OR -InputFile     File name
              Input file with images to correlate. Non-option arguments will also be used for input files, with those entries used after any names entered with this option. If -foutput is entered, all non-option arguments will be used for input files; otherwise all but the last will be. Input files need not be entered if an .mdoc file is entered with the names of the frame files. (Successive entries accumulate)

       -output (-ou) OR -OutputImageFile     File name
              If this option is not entered, the last non-option argument will be used for this output file. An output file is required unless -nosum is entered.

       -list (-lis) OR -ListOfInputFiles     File name
              Name of file with list of input files, one per line. Filenames entered this way are equivalent to ones entered with -input or as non-option arguments; the latter two entries cannot be used along with a file list.
       -break (-br) OR -BreakFramesIntoSets     Integer
              If the input consists of a series of single-frame files, this option must be used to combine them into one or more sets of frames to be aligned and summed. Additionally, the option can be used to break a file with many frames into multiple sets of frames, each of which will be aligned and summed. The input frames (either the whole collection of single-frame files, or the frames in one multi-frame input file) will be divided into groups of the given size, with any extra frames distributed among the initial groups. For example, for 50 single-image files or one file with 50 frames, and an entry of 8, there will be 6 summed images, with 9 frames in the first two and 8 frames in the rest. There must be at least as many single-image files, or as many frames in each input file, as the number given. For single-image files, this option cannot be entered with the -frame or -assess options. For multi-frame files, the option cannot be entered with -assess and will work with -frame. In either case, the option should work with -stack or -mdoc unless frame filenames are being taken from the mdoc file. Frame files with tilt angles in the extended header will automatically be broken into sets by tilt angle, so this option is not needed in that case.

       -saved (-sa) OR -SavedFrameListFile     File name
              Name of file with list of frames saved from a fast-incremental single exposure tilt series where blanked frames were not saved. There must be only a single frame input file and it must have the same number of frames as lines in this list file. Every set of contiguous values is assumed to be from one tilt angle. There are several changes in IMOD 4.10.36:
              1) A negative entry means that the frame should be skipped; this should occur only at the start or end of a frame set.
              2) The frame numbers need not increase within a frame set, so the list could use the same number for all frames in a set.
              3) The frame numbers need not be sequential within a frame set; there can be gaps as large as the value entered with the -gap option, and the program will not start a new frame set at such a gap.
              4) The program will automatically eliminate the first or last frame if there is a large enough gap between it and the adjacent frame and the average size of frame sets is at least 4, and if there are no negative values in the file. SerialEM has to write these frames, and as of 3.8beta7 it will mark them with negative frame numbers, but this automatic detection should work with older frame lists.
              If data are lost from entire tilt angles because they fell below the threshold for saving, Alignframes can also use relative starting and ending frame numbers from a tilt angle file or an .mdoc file to sort out which angles were lost (this information is output by SerialEM as of 12/16/19). This entry cannot be used with -frame, -assess, or -break.

       -gap OR -MaxGapWithinFrameSet     Integer
              Maximum size of a gap in sequential frame numbers allowed before starting a new frame set, when using a saved frame list. The value is thus the number of frames that might have been lost within a frame set because the threshold was set too high during acquisition. The default is 0.

       -skip (-sk) OR -SkipFileChecks
              Skip initial check that all input files have the same size and data mode; this check can take significant time with many non-MRC single-frame files. This option is allowed only with single-frame input files.

       -stack (-sta) OR -CorrespondingStack     File name
              Name of image stack of sums corresponding to the input files, such as a tilt series where each image is a sum of unaligned frames. This file will be used for the basic header information of the output file, thus preserving titles and extended header data.

       -mdoc (-mdo) OR -MetadataFile     File name
              Name of a metadata autodoc (mdoc) file with a section for each input file to be aligned and stacked.
              This file is an alternative way to get basic header information for the output file, as well as tilt angles into the extended header. In addition, if there are no input filenames entered as arguments, input filenames will be obtained from all of the sections in the .mdoc file with "SubFramePath" entries. However, the paths in those entries are ignored; the frame files must all be in the current directory unless the -path option is entered with an alternative path. This capability is useful for bidirectional tilt series or if a Record image was acquired more than once at a tilt angle, since only the frame file for the last Record image will be used. If input filenames are entered as arguments, there must be at least as many sections in the .mdoc file as input files if tilt angles are to be obtained from the .mdoc file (i.e., if -tilt is not entered). The exception to this is when a saved frame list file is entered with the single input frame stack from a tilt series. In this case, the .mdoc file from SerialEM is a frame stack mdoc file with a "FrameSet" section followed by a brief section for each tilt, with the angle, expected relative starting and ending frames, and dose information.

       -path (-pat) OR -PathToFramesInMdoc     Text string
              Current path to the frame files listed in an .mdoc file, when these are being used as the input filenames. If this option is not entered, the program must be run in the directory where the frames are located to access files listed in an .mdoc file.

       -ignore (-ig) OR -IgnoreZvaluesInMdoc
              Take sections in order from the .mdoc file instead of by Z value. With this option, .mdoc file sections can be removed or rearranged to control which frame files are stacked. Otherwise, sections must exist for all Z values being accessed, starting at 0.
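       The way frame filenames are taken from "SubFramePath" entries, with the recorded directory discarded and an optional replacement directory applied as with -path, can be sketched in Python. This is only an illustration under stated assumptions: the function name frame_files_from_mdoc and the sample file names are hypothetical, the .mdoc layout is assumed to be "[ZValue = n]" section headers followed by "key = value" lines, and sections are simply taken in file order.

```python
import ntpath
import os


def frame_files_from_mdoc(mdoc_text, frame_dir="."):
    """Return frame file paths for image sections that have a SubFramePath entry."""
    files = []
    in_z_section = False
    for raw in mdoc_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Only image sections are assumed to carry frame paths.
            in_z_section = line.startswith("[ZValue")
        elif in_z_section and line.startswith("SubFramePath"):
            _, _, value = line.partition("=")
            # Discard the recorded directory; ntpath handles both \ and / separators.
            name = ntpath.basename(value.strip())
            if name:
                files.append(os.path.join(frame_dir, name))
    return files


# Hypothetical .mdoc fragment used only for this demonstration.
SAMPLE = r"""
PixelSpacing = 2.1

[ZValue = 0]
TiltAngle = -60.0
SubFramePath = X:\frames\ts01_000.tif

[ZValue = 1]
TiltAngle = -58.0
SubFramePath = X:\frames\ts01_001.tif
"""

if __name__ == "__main__":
    for path in frame_files_from_mdoc(SAMPLE, frame_dir="frames"):
        print(path)
```

       Keeping only the base name mirrors the statement above that the paths stored in the .mdoc are ignored, since those paths usually refer to the acquisition computer rather than the processing machine.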
       -adjust (-ad) OR -AdjustAndWriteMdoc
              Correct entries in the input .mdoc file for changes in image size, binning, pixel size, data mode, or dose, and write a new file with the name of the output file plus .mdoc. This option has no effect unless an .mdoc is entered. If tilt series data are processed in order by tilt angle, the new file will be written in the new order and it can be used later for dose-weighting in Mtffilter. If the dose entries in the .mdoc file are superseded by a fixed total dose entry with -dtotal, the ExposureDose lines will be modified or added to contain the new dose value, and PriorRecordDose lines will be modified or added as well.

       -reorder (-reo) OR -ReorderByTiltAngle     Integer
              Process sets of frames in order by increasing or decreasing tilt angle. Enter 0 for no reordering, 1 to reorder from negative to positive unless the angles already decrease monotonically, 2 to reorder always from negative to positive, -1 to reorder from positive to negative unless angles already increase monotonically, or -2 to reorder always from positive to negative. The default is 1 unless -ignore is entered, in which case it is 0 and this option cannot be entered.

       -pixel (-pi) OR -PixelSize     Floating point
              Pixel size in nanometers. A pixel size is needed for dose weighting. This entry is needed only if the pixel size in the image file header or files entered with -stack or -mdoc is incorrect; it overrides any other source of pixel size.

       -eer (-ee) OR -EERSuperResZSumPadding     Three integers
              Amount of super-resolution to retain (0 for none, 1 for 2x, or 2 for 4x), an entry controlling the summing of frames into successive groups, and the amount of "padding" for row/column defects when reading from an EER file. The super-resolution reduction and the Z summing both occur in the TIFF reading module and these two entries determine what size the EER file appears to be to Alignframes, both the size in X and Y and the number of frames.
              A positive Z summing value specifies the number of frames to sum into each group, where the groups become the frames that Alignframes reads. A negative value directly specifies the number of grouped frames for the reading module to produce. All pixel-related entries and outputs from Alignframes are in terms of the pixels returned to the program from that module, not the original 4X super-resolution pixels. When the summing value does not evenly divide the total number of frames, the specified frame summing is the maximum summing that will occur, and the frames will be distributed as evenly as possible, with the summing lower by 1 at the beginning of the stack. The default for super-resolution is 1, or a value set with the environment variable IMOD_DFLT_EER_SUPER_RES. The default for summing is 10, or a value set with the environment variable IMOD_DFLT_EER_Z_SUMMING.
              When a gain reference is provided with a defect list included, the third entry controls the kind and extent of padding around row and column defects; -1 can be entered to use the default padding. A number less than 10 specifies the additional physical pixels to be corrected for extreme super-resolution bias by averaging over the super-resolution pixels in a physical pixel in the direction perpendicular to the defect. When not reading with antialiasing, a value of 1 appears to be sufficient and is used when the entry is -1. A value over 10 specifies 10 times the number of pixels to widen the defect by on each side. When reading with antialiasing, a value of 20 (widening by 2 pixels on each side) is necessary and is used when the entry is -1. If you need to adjust the defect list more than that, you will have to run Clip with the -ed option to output the defect list in SerialEM format, modify that, and enter it here with the -defect option.

       -aaeer (-aa) OR -ReadEERWithAntialiasing     Integer
              Type of antialias filter to use when reading an EER file.
This filtering is done by adding a packet of 100 counts to the image for each electron event. The packet is centered on the super-resolution location of the event and distributed among 16 pixels of the image being composed according to the filter function. Enter 0 to disable this filtering, 1 for a Lanczos 2 filter, or 2 for a Mitchell filter (which should not be as good). The default is 1.

-super (-su) OR -SuperGainFactorFile   File name
File with factors for adjusting the gain reference for an EER file to correct for bias among the super-resolution pixels within each physical pixel. These factors can be calculated from all the pixels in one or more EER files using the "clip supergain" command, which produces the file to be input here. The adjustment is quite minor when reading with 2x2 super-resolution, where the factors range up to ~1.7%, but may be more significant with 4x4 super-resolution, where the factors range up to ~7%. See the FILES section of the Clip man page for the format of this file.

OUTPUT CONTROL OPTIONS
These options specify the output of the program or are related to output options.

-binning (-bin) OR -AlignAndSumBinning   Two integers
Image reductions to apply when aligning and when summing. The default for summing is 1, and the default for aligning is chosen by seeing which binning out of 2, 3, 4, 6, or 8 brings the size being correlated closest to 1250, or to 1560 for frames in a size range that could come from a K3 camera. Enter a negative number for the alignment binning to use the default instead of having to specify it. If -test is entered with one or more binnings, an entry for alignment binning is ignored.

-target (-tar) OR -TargetAlignSize   Integer
Use a reduction factor that brings images for alignment close to the given size. The default is 1250 pixels, or 1560 pixels for frames recognized as from a K3 camera.

-frames (-fra) OR -StartingEndingFrames   Two integers
First and last frame in each file to align and sum, numbered from 1.
The default is to do all the frames. The starting frame number must be no bigger than the smallest number of frames in any file.

-partial (-par) OR -PartialFrameThresholds   Two floats
Relative thresholds for skipping one frame at the start and the end of a frame set, when using a saved frame list. The values must be less than 1 and greater than 0. The mean of the middle frame of the set is taken as a reference; the first frame is skipped if its mean is less than the first threshold times the reference; the last frame is skipped if its mean is less than the second threshold times the reference. A higher threshold might be appropriate for the first frame if there tends to be excessive drift then. The only reason to skip a partial frame at the end of a set is to avoid including a frame with insufficient signal-to-noise ratio. There is no test for statistical significance, so do not set the threshold so high (e.g., 0.99) that you risk dropping frames just due to random variations in the means.

-drift (-dr) OR -DriftLimitDistAndNumber   Two floats
Maximum shift between frames above which to drop initial frames of the set, and either the maximum fraction or the maximum number of frames to drop. The shift is in unbinned pixels. If a maximum fraction is entered, it must be between 0 and 0.5. With this option, the program will do the alignment, then if initial shifts are above threshold, it will redo the alignment with those frames dropped.

-sets (-se) OR -RangeOfSetsToDo   Two integers
First and last frame set or file to process, numbered from 1. The default is to process all input data. The file or set numbers correspond to the numbers shown in the text output and apply after reordering by tilt angle, if any. This option is here for convenience in testing and assessment, and it may not work in all cases. The correct tilt angles should be placed in the file header, but an adjusted .mdoc file will not be correct.
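The skipping rule described under -partial can be illustrated with a short Python sketch. This is not code from Alignframes: the function name, the choice of index n // 2 as the "middle frame", and the example means are all assumptions made for illustration.

```python
def frames_to_keep(frame_means, first_threshold, last_threshold):
    """Return (start, end), the 0-based inclusive range of frames kept.

    Hypothetical helper mirroring the -partial description: the mean of the
    middle frame is the reference, and the first or last frame is skipped
    when its mean falls below its threshold times that reference.
    """
    if not (0 < first_threshold < 1 and 0 < last_threshold < 1):
        raise ValueError("thresholds must be greater than 0 and less than 1")
    n = len(frame_means)
    reference = frame_means[n // 2]  # assumption: middle frame is index n // 2
    start, end = 0, n - 1
    if frame_means[0] < first_threshold * reference:
        start = 1          # skip a partial first frame
    if frame_means[-1] < last_threshold * reference:
        end = n - 2        # skip a partial last frame
    return start, end


# Made-up frame means: weak first and last frames around a steady exposure.
means = [6.1, 10.0, 10.2, 9.9, 10.1, 4.0]
print(frames_to_keep(means, 0.7, 0.5))
```

With these made-up means both end frames fall below their thresholds, so only the four middle frames would be kept.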
-ddrop (-dd) OR -DropAndReplacementDoses   Two floats
Dose in an .mdoc file below which the frame set will be dropped, and a dose to assume when recomputing the prior accumulated dose. Some .mdoc files show dose at the camera instead of dose to the specimen and can include dose values of 0 where the beam is occluded by a grid bar. When doing dose-weighting, zero doses will produce an error and this option is necessary to avoid them. In any case, the option will also trim out images with nominal doses below the threshold. If the second value is greater than 0.01, the PriorRecordDose entries in the .mdoc will be ignored and accumulated doses will be recomputed, with each dose below the threshold replaced by the maximum of the replacement value and the nominal dose. A replacement value comparable to the typical dose would be appropriate if the specimen area is above the grid bar when the grid bar occludes the beam. If all the dropped frames are at the end of the tilt series, the replacement entry does not matter. This option cannot be entered with the -sets or -mdrop options.

-mdrop (-mdr) OR -DropSetIfMeanBelow   Floating point
Exclude frame sets with a mean in the .mdoc file below the given value. This option provides a way to trim out images where the beam was occluded when the dose entries are not based on dose at the camera. It cannot be entered with the -sets or -ddrop options.

-mode (-mo) OR -ModeToOutput   Integer
Mode for output image file: 0 for bytes, 1 or 6 for signed or unsigned integers, or 2 or 12 for 32-bit or 16-bit floating point. The default is to use the mode of the input file unless it is 0, in which case the default is to use mode 1; however, the default mode of floating point output for MRC files is governed by the value of environment variable IMOD_WRITE_FLOATS_16BIT.

-scale (-sc) OR -ScalingOfSum   Floating point
Amount to scale summed values before output.
The default is no scaling; however, note that reduction of the output size will scale the data up by the square of the reduction factor. Such scaling mimics the summing of counts by binning during data acquisition.

-total (-to) OR -TotalScalingOfData   Floating point
Search the titles of the first input file for a scaling factor, and apply an additional scaling to the summed values to bring the total scaling to the amount entered. If no scaling is found in a title, it is assumed to be 1 and the full scaling specified here will be applied. A default total scaling of 30 or 32 will be applied if the input data consists of bytes or 4-bit values, gain normalization is being applied, and the output mode is not set to 2. A default of 32 is used if the gain reference is clearly from a K3 camera.

-meansd (-mea) OR -MeanAndSDtoScaleTo   Two floats
Scale each summed image to the given mean and standard deviation if the given SD is greater than 0, or just scale to the given mean if SD is 0 or less.

-rfsum (-rf) OR -SumRotationAndFlip   Integer
Rotation and flip operation to apply to sum before output. Enter a number from 0 to 7, chosen by taking the rotation angle counterclockwise divided by 90, plus 4 for a flip around the Y axis before the rotation. (This corresponds to the RotationAndFlip property used in SerialEM for several kinds of cameras.) Enter -1 to have this value taken from a "need" entry in a title of the first input file from SerialEM, or from the orientation tag in an EER file, or to treat MRC frame files from FEI software appropriately (i.e., invert in Y with a value of 6). If there is no "need" entry for a SerialEM file, this option with -1 is ignored, but if there is no orientation entry for an EER file, it is an error.

-tilt (-til) OR -TiltAngleFile   File name
File with tilt angles to insert into the header of the output file. The file should have one tilt angle per line, and must have at least as many angles as frame files or sets being stacked.
Tilt angles will be placed into the extended header in the UCSF/FEI format, one floating point value per section. With this entry, tilt angles will not be used from a corresponding stack or mdoc file. If this entry is used with a list of saved frames and a single input frame stack, each line of the file can also have the expected relative starting and ending frame numbers for that tilt following the tilt angle.

-axis (-ax) OR -AxisRotationAngle   Floating point
Rotation angle from the Y axis to the tilt axis, counterclockwise positive. This angle will be placed into a title in the output file readable by Etomo. It will override an angle from an mdoc file or from a corresponding tilt series.

-xfext (-x) OR -TransformExtension   Text string
Extension for output file(s) with image transformations having shifts in columns 5 and 6. One file will be produced for each input file, with the input file extension replaced by the given extension. These files have the absolute shifts being applied to each frame, in unbinned pixels, not relative shifts between successive frames.

-frc OR -FRCOutputFile   File name
Output file for Fourier ring correlations between sums of even and odd frames, which are computed when a sum is produced. The file will have a series of lines, each with the file number, the frequency at the center of the ring, and the correlation coefficient. When a GPU is used, the program may not compute the FRC if there is only enough memory to sum into one buffer on the GPU instead of two.

-ring (-ri) OR -RingSpacingForFRC   Floating point
Spacing between the rings of the Fourier ring correlation, in cycles/pixel of the summed images. The default is 0.005, which is needed for resolving closely spaced CTF oscillations. Larger values like 0.02 - 0.05 will provide more averaging for situations with more widely spaced oscillations. See the section below, Evaluating and Visualizing Differences with FRC Curves.
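Because the files written with -xfext hold absolute shifts rather than frame-to-frame shifts, a small amount of post-processing is needed to see the drift between successive frames. The Python sketch below assumes only what is stated above (shifts in columns 5 and 6, one line per frame); the function names and the sample lines, including their first four columns, are made up for illustration.

```python
def read_shifts(lines):
    """Collect (dx, dy) from columns 5 and 6 of each transform line."""
    shifts = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or short lines
        shifts.append((float(fields[4]), float(fields[5])))
    return shifts


def successive_differences(shifts):
    """Convert absolute per-frame shifts into shifts between neighbors."""
    return [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(shifts, shifts[1:])]


# Hypothetical file contents for three frames, shifts in unbinned pixels.
example = [
    "1.0 0.0 0.0 1.0  0.00  0.00",
    "1.0 0.0 0.0 1.0  1.50 -0.75",
    "1.0 0.0 0.0 1.0  2.25 -1.25",
]
absolute = read_shifts(example)
print(absolute)
print(successive_differences(absolute))
```

To use it on a real file, pass an open file object in place of the example list, since iterating over a file yields its lines.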
-evenodd (-ev) OR -EvenAndOddSumOutput   Integer
Output two additional files, with sums of even and odd frames, where frames are numbered from 0. Thus, if there is an odd number of frames, the "even" sum will have one more frame than the "odd" sum. If the output file name does not have an extension of 4 or fewer characters, names will be the output filename with "_even" and "_odd" appended. Otherwise, with an entry of 1, an output name of "rootname.ext" will give names "rootname_even.ext" and "rootname_odd.ext". With an entry greater than 1, the program will use names appropriate for a dual-axis tilt series if the root of the output filename ends in "a" or "b". For example, an output file name of "setnamea.ext" will give names of "setname_evena.ext" and "setname_odda.ext".

-lines (-lin) OR -LinesOfAlignSummary   Integer
Number of lines summarizing the fitting and FRC results. Set to 3 for full output, 2 to eliminate the FRC line, or 1 for a condensed output with the weighted mean residual, maximum of maximum weighted residual, and the raw and smoothed distances. When using the -saved option to enter a saved frame list file, the output will also include the shifts of the first four frames; enter a negative value to suppress this output.

-plottable (-pl) OR -PlottableShiftFile   File name
Filename for output file with raw and smoothed shifts in unbinned pixels. The smoothed shifts will be from spline smoothing if it was done, otherwise from local polynomial smoothing. The shifts will be put into the file one per line, starting with a type number of 10 times the file number for the raw shifts or that value plus 1 for the smoothed shifts (e.g., 10 and 11 for the first file).

-nosum (-nos) OR -NoSumsOutput
Do alignments without making a summed image; no output filename should be entered.
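The naming rules given for -evenodd can be restated as a short Python sketch. It is a reading of the text above, not the program's code: the function name is hypothetical, the extension length is assumed to be counted without the dot, and an entry greater than 1 is assumed to fall back to the single-axis naming when the root does not end in "a" or "b".

```python
import os


def even_odd_names(output_name, entry=1):
    """Return the (even, odd) file names implied by the -evenodd rules."""
    root, ext = os.path.splitext(output_name)
    # No extension, or one longer than 4 characters: append to the whole name.
    if not ext or len(ext) - 1 > 4:
        return output_name + "_even", output_name + "_odd"
    axis = ""
    if entry > 1 and root[-1:] in ("a", "b"):
        # Dual-axis naming: keep the axis letter after "_even" / "_odd".
        axis = root[-1]
        root = root[:-1]
    return f"{root}_even{axis}{ext}", f"{root}_odd{axis}{ext}"


print(even_odd_names("rootname.ext", 1))
print(even_odd_names("setnamea.ext", 2))
print(even_odd_names("summedstack"))
```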
PREPROCESSING OPTIONS
These options provide for initial processing of the data before aligning.

-titles (-tit) OR -RefAndDefectFromTitles
Normalize the data using a gain reference and possible defect file listed in the header titles of the first frame file from SerialEM, or use a gain reference listed in the metadata of an EER file. For a SerialEM file, the title with the reference name must contain "ref" or "Ref" and end in ".mrc", ".dm4", or ".tif". The title with the defect name must contain "defect" and end in ".txt". Each file must exist in the directory with the frame file. This entry also causes the rotation and flip operation found in a title to be applied, so -rotation -1 need not be entered. An entry with the -gain, -defect, or -rotation option overrides the respective information found in a title. For an EER file it will be an error if the metadata is not found or does not contain a gain reference name. Use "header -t 65001 filename" to see the metadata in an EER file.

-gain (-gai) OR -GainReferenceFile   File name
Gain reference for normalizing unprocessed or dark-subtracted frames. The gain reference should be a floating point file with a mean of 1. If this option is entered, it supersedes a gain reference found in the extended header of the frame files. If the gain reference is a TIFF file or the frame input is an EER file, then the gain is allowed to be exactly 2 or 4 times smaller than the frames. In that case, the gain reference is expanded by replicating the value for a pixel to all the super-resolution pixels within it.

-rotation (-ro) OR -RotationAndFlip   Integer
Rotation and flip operation that needs to be applied to the gain reference to match the orientation of the frames being corrected. Enter a number from 0 to 7 by taking the rotation angle counterclockwise divided by 90, plus 4 for a flip around the Y axis before the rotation.
(This corresponds to the RotationAndFlip property used in SerialEM for a K2 camera, but it is also possible to save such frames without the rotation and flip.) Enter -1 or -2 to have this number taken from an "r/f" entry in the title of the first input file, in which case it is an error for the "r/f" entry to be absent. Entering -2 is equivalent to entering -1 for the -rfsum option and makes it take the rotation and flip for output from a "need" entry in that title unless the -rfsum option was entered. A negative value is ignored for EER files to accommodate the default option in the Alignframes interface in Etomo.

-dark (-da) OR -DarkReferenceFile   File name
Dark reference to be subtracted before multiplying by a gain reference when the frames are saved as unprocessed data.

-defect (-def) OR -CameraDefectFile   File name
File of camera defects to correct. The defect file is put out by SerialEM for versions of DigitalMicrograph from GMS 2.3.1 and higher when frames are not gain-normalized. The program will determine the binning of the image relative to these defect coordinates by assuming that the images are more than half the camera size. It will decide to scale the coordinates in the defect list up by 2 if necessary for super-resolution frames. These decisions will be reported and can be overridden with the next two options. If this option is entered, it supersedes the defect list found in a TIFF gain reference file.

-double (-do) OR -DoubleDefectCoords
Scale camera defect coordinates by 2 if they are not already scaled. This option should not be needed.

-imagebinned (-im) OR -ImagesAreBinned   Floating point
Binning of images, which could be needed for defect correction if frames are not bigger than half the camera size.

-truncate (-tru) OR -TruncateAbove   Floating point
Replace values above a limit with the mean of surrounding values. The mean is taken from pixels in a 7x7 area, excluding the center 9.
Enter a positive number to specify an absolute limit in counts that applies to all frames being processed. Enter a negative number to specify the number of standard deviations above the mean at which to truncate (e.g., -8 for 8 SD's above the mean). This limit will be determined separately for each frame file or set of frames, using the first frame of the set. When the -saved option is entered with a saved frame list file, the limit is determined separately again for the second frame and that limit is used for the rest of the frames in the set, in case the first frame is a partial exposure.

ALIGNMENT OPTIONS
These options control the strategy and main parameters of the alignment.

-pair (-pai) OR -PairwiseFrames   Integer
Number of frames or groups to use in successive pairwise alignments, or 0 to use alignment to a cumulative reference of already-aligned frames. The default is 7. With an entry of -1 or a value equal to or bigger than the number of frames, the program will align all pairs of frames or groups and do a single fit. With an entry of -2, -3, or -4, it will do pairwise alignments among sets of one-half, one-third, or one-fourth of the frames or groups, but with a minimum of 7 included.

-reverse (-rev) OR -ReverseOrder
Reverse order of processing and start with the last image. This should make very little difference when using pairwise alignments, but is a potentially useful option when using alignment to a cumulative reference, unless there is substantial fixed pattern noise.

-shift (-shi) OR -ShiftLimit   Integer
Limit on distance to search for correlation peak, in unbinned pixels. If the previous frame was aligned to the same reference being aligned to, the center of the region searched corresponds to the peak position for the previous frame. The default is 20.
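The meaning of the different -pair entries can be summarized in a small Python sketch. It only restates the description above; the function names are hypothetical, and the use of integer division for the fractional entries and the cap at the total number of frames are assumptions, since the text does not say how the program rounds.

```python
def pairwise_set_size(entry, num_frames, minimum=7):
    """Frames (or groups) per pairwise fit; 0 means cumulative reference."""
    if entry == 0:
        return 0
    if entry == -1 or entry >= num_frames:
        return num_frames            # all pairs, one single fit
    if entry in (-2, -3, -4):
        # One-half, one-third, or one-fourth of the frames, at least 7.
        # Assumption: integer division, capped at the number of frames.
        return min(num_frames, max(minimum, num_frames // -entry))
    if entry > 0:
        return entry
    raise ValueError("entry must be -4 to -1, 0, or a positive number")


def pairs_in_one_fit(set_size):
    """Number of distinct frame pairs among set_size frames."""
    return set_size * (set_size - 1) // 2


for entry in (7, -1, -2, -4):
    size = pairwise_set_size(entry, 40)
    print(entry, size, pairs_in_one_fit(size))
```

For 40 frames, the default entry of 7 gives 21 pairs per fit, while aligning all pairs with -1 requires 780 correlations, which illustrates why the all-pairs approach grows with the square of the number of frames.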
-group (-gr) OR -GroupSize   Integer
Number of frames to sum for correlations between groups of frames; such groups are needed when correlations between single frames are too noisy to give reliable results. Since correlations are done only between non-overlapping groups, grouping reduces the number of measured shifts from which each frame's shift can be determined. Frames will be grouped in one of two ways: in non-overlapping blocks, or in successive overlapping groups, referred to as "slide grouping". The latter is used only when the total number of frames is large enough to allow a linear equation to be fit to the shifts; for example, this requires 8 frames for a group size of 3. With slide grouping, each frame will have a different shift. If the program has to drop back to block grouping, all frames in a block will have the same shift.

-radius2 OR -FilterRadius2   Floating point
High spatial frequencies in the cross-correlations will be attenuated by a Gaussian curve that is 1 at this cutoff radius and falls off above this radius with a standard deviation specified by FilterSigma2. Unlike in other applications, this value is entered in frequency units (1/pixel) of the input frames, not of the reduced images being correlated. It is scaled by the reduction before being applied to the reduced image, which means that a particular value will give about the same amount of filtering regardless of the binning. The default is 0.06.

-vary (-va) OR -VaryFilter   Multiple floats
Set of radius2 filter values to test. This option can be entered separately for each binning, but that should not be necessary, for two reasons. First, because these values are in unbinned frequency units, each one would have about the same effect for the different binnings. Second, there is little cost to applying extra filters, because different filters are applied to a small subarea of an unfiltered correlation.
Sigma2 will automatically be set for each filter so that it is in the same ratio to the particular radius2 value as the basic sigma2 is to the basic radius2 value. Thus, to provide a different set of sigma2 values for these filters, you need to enter -radius2 or -sigma2. (Successive entries accumulate.)

-hybrid (-hy) OR -UseHybridShifts
Derive a set of shifts while alignments are being done by using the results from the best filter after each individual fit. By default, when given multiple filters to test, the program will decide on the best overall filter after all fits are done and use the shifts from that filter. This option will reduce memory requirements, unless the alignment is being refined at the end.

-refine (-ref) OR -RefineAlignment   Integer
Refine an initial alignment based on pairwise correlations by correlating each frame with an aligned sum of all but that frame. The entry gives the maximum number of iterations that will be run, but iterations will stop if the biggest change in shift falls below a threshold.

-rgroup (-rg) OR -RefineWithGroupSums
When using group sums for the initial alignment, refine the alignment with group sums as well, instead of single frames. This may be needed if the signal-to-noise ratio is too low even for the correlation between a single frame and the sum of other frames. The shifts converge more slowly in this case, so more iterations may be needed.

-stop (-sto) OR -StopIterationsAtShift   Floating point
Maximum change in shift at which to stop iterating the refinement of initial shifts by correlating with the sum of frames. The default is 0.1.

-rrad2 (-rr) OR -RefineRadius2   Floating point
High frequency filter cutoff (radius 2) for refining the alignment. The default is to use the same filter that was used to obtain the alignment, or the filter that gave the best overall error value when a hybrid alignment was used.
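The automatic sigma2 scaling described under -vary amounts to keeping a constant sigma2/radius2 ratio. The Python sketch below shows the arithmetic using the documented defaults for -radius2 (0.06) and -sigma2 (0.0086); the function names are hypothetical, and the conversion to reduced-image units by multiplying by the alignment binning is an interpretation of "scaled by the reduction", not a statement of the program's internals.

```python
def scaled_sigma2(radius2_values, base_radius2=0.06, base_sigma2=0.0086):
    """Sigma2 for each tested radius2, keeping the base sigma2/radius2 ratio."""
    ratio = base_sigma2 / base_radius2
    return [r * ratio for r in radius2_values]


def in_reduced_units(value, alignment_binning):
    """Convert cycles per unbinned pixel to cycles per reduced pixel.

    Assumption: a frequency entered in unbinned units is multiplied by the
    reduction factor to express it per pixel of the reduced image.
    """
    return value * alignment_binning


filters = [0.05, 0.06, 0.08, 0.1]
for radius2, sigma2 in zip(filters, scaled_sigma2(filters)):
    print(f"radius2 {radius2:.3f}  sigma2 {sigma2:.4f}  "
          f"radius2 at binning 3: {in_reduced_units(radius2, 3):.3f}")
```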
-smooth (-sm) OR -MinForSplineSmoothing   Integer
Smooth the shifts with a spline curve whose smoothing parameter is found with generalized cross-validation, but only if the number of frames is at least as big as the entered value. This method requires a fairly large number of frames to be reliable; the documentation for the cross-validation code being used suggests 20 frames may be needed. Smoothing should not be used with fewer than 10 frames. For numbers between 10 and ~20, a minority of images may come out slightly worse with smoothing, so it would be advisable to evaluate results with and without smoothing, such as with an FRC. The default is currently 20; enter 0 to disable smoothing.

-gpu (-gp) OR -UseGPU   Integer
Use the GPU (graphical processing unit) for computations if possible; enter 0 to use the best GPU on the system, or the number of a specific GPU (numbered from 1). If GPU memory is a limitation, the program will prioritize forming the sum on the GPU over doing the alignment there, and will compute odd and even sums as the lowest priority. If alignment becomes possible on the GPU only by deferring the summing, and if CPU memory is sufficient for that, then it will keep the entire stack of frames in memory and sum them after aligning.

-memory (-mem) OR -MemoryLimitGB   Multiple floats
Limit on CPU memory usage, and optionally a limit on GPU memory usage, in gigabytes or as a negative fraction of total memory. Enter a value > 0.05 to specify that number of GB, or a value between -0.05 and -0.95 to specify between 0.05 and 0.95 of total memory. The default for the CPU limit is 3/4 of system memory for system memory less than 16 GB, half of system memory for system memory more than 24 GB, and 12 GB between those limits or when system memory cannot be determined. When the memory usage would exceed the limit for a set of input frames, the program will run through the data in two passes, one to get the alignment and one to make the sum.
This may not be possible if the -assess option is used. The default GPU limit is 0.85 times the GPU memory.

DOSE WEIGHT FILTERING OPTIONS
These options control "dose weighting", which filters out high frequencies as a function of the dose already applied to a cryo-specimen.

-dtype (-dty) OR -TypeOfDoseFile   Integer
This option both enables dose weighting and indicates what kind of file is being provided with the dose information. Type 4 indicates key-value information in the autodoc format, such as an .mdoc file produced by SerialEM. If an .mdoc file has been entered with the -mdoc option, this is the file that will be used, and no filename can be entered with the -dfile option. The other four types are simple text files with a line for each output image (i.e. each set of frames being aligned). These types are:
1: A single value per line, just the dose for each image;
2: Two values, the prior accumulated dose followed by the image dose;
3: The prior dose followed by the cumulative dose at the end of that image;
5: One or more pairs of entries indicating the dose per frame and the number of frames with that dose. For example, "0.5 15 0.75 10 1.0 15" specifies a dose of 0.5 for 15 frames, 0.75 for 10 frames, and 1.0 for 15 frames. The total number of frames on such a line must match the number of frames in the set.
For these four types of text files, if the files or frame sets are being processed in order by tilt angle, the lines should be in the same order as the processing would occur without this reordering, i.e., the original order of the data, or of the Z values in an mdoc file.

-dfile (-dfi) OR -DoseWeightingFile   File name
Name of file with dose information for dose weighting. This entry is required if -dtype is entered, unless a 4 is entered there and a file was entered with the -mdoc option.

-dtotal (-dto) OR -FixedTotalDose   Floating point
Total dose for each set of frames, in electrons/square Angstrom.
This option independently enables dose weighting and cannot be entered with -dtype or -dframe.

-dframe (-dfr) OR -FixedFrameDoses   Floating point
List of frame doses and numbers that applies to all sets of frames. The list is a set of paired numbers: a dose in electrons/square Angstrom and the number of frames taken at that dose (e.g., enter "2,10,4,5" for 10 frames of 2 e/A^2 and 5 frames of 4 e/A^2). This option independently enables dose weighting and cannot be entered with -dtype or -dtotal.

-dprior (-dp) OR -InitialPriorDose   Floating point
Dose applied before any of the images in each frame set were taken; this value will be added to all the prior dose values (if any), however they were obtained.

-bidir (-bid) OR -BidirectionalNumViews   Integer
Number of views in the first part of a bidirectional tilt series, where the order of images in the input file is inverted from their order of acquisition. This entry is essential for a bidirectional series if a fixed dose is entered with -dtotal or if a dose file of type 1 or 5 is entered. It is ignored for types 2 and 3. It is not needed for type 4 and an .mdoc file entered with the -mdoc option, where the correct doses are available either from PriorRecordDose entries or from analysis of DateTime entries. However, it would be needed for an .mdoc file entered with -dfile unless the -accum option is entered with a 2 to use prior dose information from the .mdoc file.

-accum (-ac) OR -DoseAccumulates   Integer
This option can be used to override the program's assumptions about whether images are from a tilt series where dose should be accumulated between successive images. By default, dose accumulates whenever -stack, -tilt, or -saved is entered, or when -mdoc is entered with an .mdoc file having tilt angles that vary more than 1 degree; otherwise it does not.
When there is such an .mdoc file, the cumulative dose information is taken from either the prior Record dose (if present) or the sum of doses of earlier images, even when only a subset of frame sets (images) are being used. However, if an .mdoc file is entered as a dose file, the dose will not accumulate. For this option, enter 0 to prevent accumulating dose or 1 or 2 to accumulate dose. The latter will have the same effect except when a subset of images is being aligned and an .mdoc file is entered with -dfile instead of -mdoc. In that case 1 will make it accumulate dose by summing the doses over the images being aligned, while 2 will make it use cumulative dose information from the whole set of images in the .mdoc file.

-normalize (-nor) OR -NormalizeDoseWeighting
Normalize the dose weight filters so that they add up to 1.0 within each set of frames. This option is designed for tilt series where the true dose weighting of the tilt images is to be done later (e.g., with Mtffilter). It compensates for the difference in dose effect among the frames of an individual image, which can be significant for some early images if the dose is high enough. However, it results in no net filtering of the summed tilt image, leaving this dose weighting for a later stage. If this option is applied to a single-particle frame stack, the high frequencies will be boosted in early frames to compensate for their being attenuated in later frames.

-volt (-vo) OR -Voltage   Integer
Microscope voltage in kV; this value must be either 200 or 300. The default is 300; if 200 is entered, the computed critical and optimal doses are multiplied by 0.8.

-optimal (-op) OR -OptimalDoseScaling   Floating point
Factor by which to scale the computed optimal and critical doses that determine how much to attenuate a spatial frequency for a particular dose. Enter a factor greater than or less than 1 to indicate that the specimen is more or less resistant to damage than the equations indicate.
Another use for this entry would be to adjust the critical dose for a voltage other than 200 or 300 kV.

-critical (-c) OR -CriticalDoseFactors   Three floats
Replacement factors a, b, and c in the equation
    critical_dose = a * k**b + c
where k is frequency in reciprocal Angstroms. The default factors are directly from Grant and Grigorieff and the unblur program. Enter 0 for any of the factors to use the default for that factor.

-unweight (-un) OR -UnweightedOutputFile   File name
File name for output of summed images that have not been dose weighted in addition to the dose-weighted images placed in the main output file.

OPTIONS FOR ASSESSING ALIGNMENT AND OTHER TEST PURPOSES
This section includes options that are useful for assessing alignment parameters (mainly binning and filters) and advanced options that should not need adjustment.

-test (-te) OR -TestBinnings   Multiple integers
Set of binnings at which to test pairwise alignments. Each binning involves a separate pass through the frames of an input file, plus another pass to make a sum at the end with the best binning.

-assess (-as) OR -AssessWithFrames   Two integers
Starting and ending frame to use during testing of parameters with multiple binnings or filters, numbered from 1. The default is to use all frames.

-good (-go) OR -GoodEnoughError   Floating point
Combined error measure that is sufficient to stop testing binnings. This error is a weighted sum of two measures: the mean of the weighted mean residual errors from the set of fits, and the maximum weighted residual seen in any fit. The latter is weighted by the value entered with the -weight option.

-weight (-w) OR -MaxResidualWeight   Floating point
Weighting applied to maximum weighted residual from all fits when combining with the mean weighted residual to obtain a single error measure. The default is 0.1.

-trim (-tri) OR -TrimFraction   Floating point
Fraction of image size to trim off each edge for correlations.
The default is 0.02, which is the same amount as the padding.

-taper (-tap) OR -TaperFraction   Floating point
Fraction of image size to taper on each edge for correlations. The image is tapered down to the mean at its edge. This tapering is usually an important component for reliable correlations. The default is 0.1. If this fraction is set to 0 and there is no trimming, the program obtains the FFT for cross-correlation by extracting it from the FFT of the full image instead of by reducing, padding and tapering the image and taking an FFT of that. The padding/tapering extent of the full image is increased from 2% on each edge to 5% in this case.

-antialias (-an) OR -AntialiasFilter   Integer
Type of filter for image reduction when trimming or tapering. The standard values of 1 to 6 are available as in Newstack, with 1 corresponding to binning. The default is 4 for a Mitchell filter, which seems to be optimal on average for this application.

-radius1 OR -FilterRadius1   Floating point
Low spatial frequencies in the cross-correlations will be attenuated by a Gaussian curve that is 1 at this cutoff radius and falls off below this radius with a standard deviation specified by FilterSigma2. Spatial frequency units range from 0 to 0.5. This option is here for the sake of completeness; use FilterSigma1 instead of this entry for more predictable attenuation of low frequencies.

-sigma1 OR -FilterSigma1   Floating point
Sigma value to filter low frequencies in the correlations with a curve that is an inverted Gaussian. This filter is 0 at 0 frequency and decays up to 1 with the given sigma value. However, if a negative value of radius1 is entered, this filter will be zero from 0 to |radius1| then decay up to 1. The default is 0.03, expressed in frequency units (1/pixel) of the reduced images being correlated.

-sigma2 OR -FilterSigma2   Floating point
Sigma value for the Gaussian rolloff below and above the cutoff frequencies specified by FilterRadius1 and FilterRadius2.
              Like radius 2, this value is entered in unbinned frequency units and will be scaled by the reduction being applied for alignment. The default is 0.0086.

       -kfactor (-k) OR -KFactorForFits      Floating point
              K factor for robust fitting to pairwise alignments. The default is 4.5; a smaller value will down-weight more outliers in the fits.

       -debug (-deb) OR -DebugOutput      Integer
              This entry is a sum of flags for particular kinds of output: the last digit controls the printed output from the program (1 for basic output, 2 for more verbose output from each fit and dose weighting filters); 10 will give timing output; 100 will give output of cross-correlations during initial alignment; 1000 will give output of refining correlations; 10000 will give output of the sums of odd and even frames. Images are output with names "faimg-n.mrc" where n is increased sequentially.

       -flags (-fl) OR -FlagsForGPU      Integer
              Flags to control which kinds of initial processing occur on the GPU. These flags are for testing and supersede the decisions that the program would make based on the memory available. Enter a sum of
                 1 to do noise-taper-padding of full images on the GPU
                 10 to do reduction and taper-padding of alignment images on the GPU
                 100 to stack full-sized, unpadded frames on the GPU, provided that both the noise padding and reduction padding are to be done there
                 1000 to do gain normalization, defect correction, and truncation on the GPU, provided that noise padding or reduction padding are to be done there
                 10000 times the maximum number of frames to stack on the GPU

       -shrmem (-shr) OR -ShrMemTest
              Use shrmemframe to align frames (special Windows build only)

       -help (-he) OR -usage
              Print help output

       -StandardInput
              Read parameter entries from standard input

EXAMPLES
       The example commands below would all be entered in one line, or as multiple lines with a backslash at the end of every line except the last, as they are shown here due to line length limits.
       Most options could be abbreviated more than they are.

   Frames from Tilt Series
       An image from a tilt series with ~4Kx~4K images might align well with a command as simple as
          alignframes Feb21_10.43.50.mrc Feb21_10.43.50_ali.mrc
       where you can add "-gpu 0" to use a GPU on this or any of the following commands. This will use a default binning of 3 and filter cutoff of 0.06/pixel. If you already know that you want a different binning (say, 4) or filter (say, 0.05), then the easiest way to enter this is with
          alignframes -bin 4,1 -vary 0.05 Feb21_10.43.50.mrc \
             Feb21_10.43.50_ali.mrc
       where using -vary will scale the sigma for the filter rolloff automatically.

       If the frame files for your tilt series list in order from one high tilt to the other and the tilt series file is available (say, cell4.mrc),
          alignframes -bin 4,1 -vary 0.05 -stack cell4.mrc Feb21_*.mrc \
             cell4_ali.mrc
       will process the entire tilt series and move information from the header of the tilt series file into the new file. However, if the tilt series was taken bidirectionally and the frame files do not list in order, or if there were any duplicate images taken, then you want to supply the .mdoc file instead of the stack and input files:
          alignframes -bin 4,1 -vary 0.05 -mdoc cell4.mrc.mdoc cell4_ali.mrc
       which will process the frames in the right order and place essential information in the output file header.

       To explore filter settings, it is best to run on a collection of files, perhaps about 20. If "Feb21_10.4*.mrc" lists the desired number of files, you could explore a range of filters at binning 3 with
          alignframes -vary 0.05,.06,.08,.1 -nosum Feb21_10.4*.mrc
       The summary at the end will tell which filter gave the lowest mean residual. Be sure to scan through the results and look at whether either the mean residual or the distance moved declines a lot with increased filtering, which may be a sign that fixed pattern noise is a problem. Also note whether the maximum unweighted residual is high.
       If it goes over 5-10 pixels, you can probably reduce the occurrence of bad fits by restricting the allowed shift with "-shift 10" or even "-shift 5". Obviously, you do not want to set this limit lower than the maximum possible shift from one frame to the next.

   Evaluating and Visualizing Differences with FRC Curves
       Suppose you want to use FRCs to evaluate the difference between two conditions, such as filter 0.05 versus 0.06, or with and without refinement. You need to make output sums to get an FRC.
          alignframes -vary 0.05 -frc sample.frc Feb21_10.4*.mrc sample.mrc
          alignframes -vary 0.05 -ref 5 -frc sample-refine.frc \
             Feb21_10.4*.mrc sample.mrc
       The ".frc" files have all of the FRC curves for a run, each with a "type" number equal to its file number in the alignframes text output. You can plot one curve (say, the fourth one) with
          onegenplot -ty 4 -sym 0 sample.frc
       To compare FRCs, run
          subtractcurves -ave 10 -rad 2 sample-refine.frc sample.frc \
             refine-diff.dat
       which will subtract each pair of corresponding curves and average over 10 points to reduce the noise in the differences. You could have avoided the averaging here with the option "-ring .05" to alignframes, which will make the FRC curve much less noisy in Onegenplot. (The default FRC output is good for seeing the CTF oscillations in single-particle data, but a larger ring size will often be more appropriate for tilt series.) You could examine each difference in a separate graph with
          onegenplot -ty 1 -sym 0 refine-diff.dat &
          onegenplot -ty 2 -sym 0 refine-diff.dat &
          etc.
       But to assess the overall benefit of the difference in conditions, it is more efficient to look at a lot of points at once:
          onegenplot -ty 1,2,3,4,5,6,7,8,9,10 refine-diff.dat &
          onegenplot -ty 11,12,13,14,15,16,17,18,19,20 refine-diff.dat &
          etc.
       When looking at these curves, treat any improvements below about 1.3 times the cutoff frequency with caution because they could arise from overfitting.
       Improvements past this point would not reflect overfitting, although if the fit is locking in on fixed pattern noise, it would boost high frequency correlations.

   Frames for Single-Particle Reconstruction
       For single-particle data, the best initial approach is to fit to pairwise shifts among about half of the frames. So for a set of 34 frames, if you wanted to do an initial assessment with two filters and look at the FRC, you could use
          alignframes -vary 0.05,.06 -pair 17 -frc test.frc \
             Feb15_10.13.15.mrc test.mrc
       which will use a default binning of 6 for ~8K frames. A more flexible command for fitting to shifts among half the frames, regardless of the exact frame count, is
          alignframes -vary 0.05,.06 -pair -2 -frc test.frc \
             Feb15_10.13.15.mrc test.mrc
       If you had some indication that fixed pattern noise was a problem, or if the data seemed particularly noisy, giving mean residuals above 1, you could use pairwise shifts among all frames with
          alignframes -vary 0.05 -pair -1 -frc test.frc Feb15_10.13.15.mrc \
             test.mrc
       For even noisier situations, you can use grouping to reduce the mean residual and improve the fits:
          alignframes -vary 0.05 -pair -1 -group 3 -refine 5 -frc test.frc \
             Feb15_10.13.15.mrc test.mrc
       where the refinement at the end may or may not be beneficial, depending on just how noisy the single frames are. The option specifies up to 5 iterations of refinement, which is almost always sufficient unless refining the data as groups with the -rgroup option.

   Processing of Raw Frames
       If you have collected dark-subtracted data from a K2 camera into MRC or compressed TIFF files, you can process them with
          alignframes -gain SuperRef_Feb20_20.26.21.dm4 -scale 39.3 -rot -1 \
             -defect defects_Feb20_20.26.21.txt -pair 17 Feb21_10.32.56.tif \
             Feb21_10.32.56_ali.mrc
       using the gain reference copied into the data directory and the defect file written there by SerialEM.
       The option "-rot -1" specifies that the gain reference needs to be rotated by the value for rotation and flip indicated by "r/f" in a label in the file header. MRC files started having this label in SerialEM 3.4, TIFF files in SerialEM 3.5, so check either kind of file by running "header" on the file. If the label does not show "r/f", the number to use in the "-rot" entry would be the RotationAndFlip value in the SerialEM properties file. If frames were saved without rotation, do not include this option.

       Either the "-scale" option as shown here, or the "-total" option with a total scaling, or the option "-mode 2", should be entered to preserve the precision of the data when they are written. The program will use a default total scaling of 30 when normalizing if the input data are byte values and data are not being written as floating point with "-mode 2". However, it is probably better not to rely on the default and instead scale the electron counts by the same factor that is applied in SerialEM (39.3 in this example, a typical value when not dividing by 2).

       You could add an option such as "-trunc 7" to replace values above 7 with the local mean. Use the command
          clip hist Feb21_10.32.56.tif
       to see the distribution of pixel values. There will be a point at which the values stop falling rapidly and then have a long tail; removal of values above there is indicated.

   Assessing Fixed Pattern Noise
       This procedure is not convenient, but does work. The first step when fixed pattern noise is suspected to affect the correlations is to output the individual shifts with the option "-deb 2". Use only a single filter for simpler output.
       You will see a series of lines like
          1 to 0   -1.95  -0.17  near   0.00   0.00
          2 to 0   -1.57   0.02  near  -1.95  -0.17
          2 to 1   -0.60  -0.77  near   0.00   0.00
          3 to 0   -1.79   0.44  near  -1.90  -0.30
          3 to 1    0.44  -1.32  near  -0.28  -0.45
          3 to 2    0.36   0.02  near   0.00   0.00
       Each line shows a shift between a pair of frames (numbered from 0), then the shift that it was assumed to be "near" when comparing with the maximum allowed shift. If you see many shifts close to 0 (but not exactly 0), then they could be due to fixed pattern noise. (Shifts of exactly 0 might indicate that the maximum shift was too low.) Try with some higher filter values and see if the incidence of near-zero values increases. Set a binning lower than the default (e.g., 4 instead of 6, 2 instead of 3) if necessary to keep a filter value below Nyquist. In the above example, a higher filter gave:
          1 to 0   -0.80  -0.22  near   0.00   0.00
          2 to 0   -0.71   0.07  near  -0.80  -0.22
          2 to 1   -0.08  -0.38  near   0.00   0.00
          3 to 0   -0.92   0.42  near  -0.77  -0.15
          3 to 1    0.17  -0.81  near  -0.02  -0.15
          3 to 2    0.30  -0.29  near   0.00   0.00

       To visualize the correlation peak from fixed pattern noise, you need to run the program with options that will make it save unbinned correlations with no high-frequency filtering. This is available only when not using a GPU and when specifying more than one filter.
          alignframes -bin 1,1 -vary 0.05,0.06 -deb 102 -frame 1,5 -nosum \
             Feb15_10.13.15.mrc
       There will be a series of lines indicating the names of saved images and the measured shift between a pair of frames. Pick one with more than a few pixels of shift, which may end up being the one between frames 0 and 4. For example:
          Saved lf correlation image frame 4 in ./faimg-18.mrc  mean -0.00
          Saved correlation image frame 4 in ./faimg-19.mrc  mean -0.03
          4 to 0   -0.59  -2.70  near  -1.90   0.14
       The last line shows the shift between frames 4 and 0, then the shift that it was assumed to be "near" when comparing with the maximum allowed shift.
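       Counting near-zero shifts by eye becomes tedious with many frames. The following is a minimal Python sketch, not part of IMOD, that tallies them from "-deb 2" output captured as text. It assumes the lines have the "n to m dx dy near x y" form of the listings above. The function names and the 0.5-pixel threshold for "near zero" are illustrative assumptions, not values defined by alignframes.

```python
import math
import re

# Assumed line format, inferred from the sample listings:
#   "3 to 1    0.44  -1.32  near  -0.28  -0.45"
SHIFT_LINE = re.compile(
    r"\b(\d+)\s+to\s+(\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)"
    r"\s+near\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)"
)


def parse_shifts(text):
    """Return (frame, reference, dx, dy) for each pairwise shift in text."""
    return [
        (int(m.group(1)), int(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in SHIFT_LINE.finditer(text)
    ]


def count_near_zero(shifts, threshold=0.5):
    """Count shifts shorter than threshold pixels, excluding exact zeros.

    The threshold is a hypothetical choice; exact zeros are skipped because
    they may instead indicate a maximum shift that was set too low.
    """
    count = 0
    for _, _, dx, dy in shifts:
        length = math.hypot(dx, dy)
        if 0.0 < length < threshold:
            count += 1
    return count


# The two sample listings from this section.
LOW_FILTER = """
1 to 0   -1.95  -0.17  near   0.00   0.00
2 to 0   -1.57   0.02  near  -1.95  -0.17
2 to 1   -0.60  -0.77  near   0.00   0.00
3 to 0   -1.79   0.44  near  -1.90  -0.30
3 to 1    0.44  -1.32  near  -0.28  -0.45
3 to 2    0.36   0.02  near   0.00   0.00
"""

HIGH_FILTER = """
1 to 0   -0.80  -0.22  near   0.00   0.00
2 to 0   -0.71   0.07  near  -0.80  -0.22
2 to 1   -0.08  -0.38  near   0.00   0.00
3 to 0   -0.92   0.42  near  -0.77  -0.15
3 to 1    0.17  -0.81  near  -0.02  -0.15
3 to 2    0.30  -0.29  near   0.00   0.00
"""

for label, text in (("lower filter", LOW_FILTER), ("higher filter", HIGH_FILTER)):
    shifts = parse_shifts(text)
    print(label, count_near_zero(shifts), "of", len(shifts), "shifts near zero")
```

       A count that rises as the filter value is raised matches the pattern described above as a possible sign of fixed pattern noise. The threshold should be chosen with the expected frame-to-frame drift in mind.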
       You want to examine the "lf correlation image" in "faimg-18.mrc" ("faimg-19.mrc" is a small subarea with high-frequency filtering, centered on the expected shift, so is not suitable). You can open this file directly in 3dmod, but it will be easier to load a centered subarea. Run "header faimg-18.mrc" to see its size, NX and NY. Determine these values:
          xst = NX / 2 - 200
          xnd = NX / 2 + 199
          yst = NY / 2 - 200
          ynd = NY / 2 + 199
       and open the file with
          3dmod -x xst,xnd -y yst,ynd faimg-18.mrc
       Open a slicer window and close it; this will place the current point marker on the middle pixel. Zoom the Zap window up to about 6 and turn off the high-quality display (the checkerboard) to see individual pixels. Adjust black and white levels to 0 and 255. The degree to which this central pixel stands out indicates the amount of fixed pattern noise. Increase the black level to see just how much it stands out. If you can see 4 brighter pixels around the central one, the situation is particularly bad.

       You may wonder where the real correlation peak went! It is very diffuse and the high-frequency filtering is essential. Use Edit-Image-Process to open the processing dialog and select the Fourier filtering panel. Set the high frequency cutoff to your filter value and the falloff to one-seventh of that and press "Apply" to filter. The real peak will now be prominent. If it is close to the origin, you might see its position change with different filters as it merges with the fixed pattern peak.

AUTHOR
       David Mastronarde

SEE ALSO
       Email bug reports to mast at colorado dot edu.

IMOD 5.0.1                                                      alignframes(1)