findsection(1)            General Commands Manual            findsection(1)

NAME
       findsection - Locates section boundaries in tomograms

SYNOPSIS
       findsection [options] tomogramFile(s)

DESCRIPTION
       Findsection will analyze an entire tomogram, or separate samples from
       a tomogram, to determine the boundaries of sectioned material.  It
       can create a simple model of straight lines along the top and bottom
       surfaces, suitable for use in Tomopitch.  It can also create a model
       with contours along the surface, suitable for use with Flattenwarp.
       It will also estimate the top and bottom Z levels at which various
       amounts of the material are present, which may be useful for setting
       the Z limits when combining two tomograms or when trimming a
       tomogram.  In a largely separate mode of operation, it can analyze
       the structure in a cryo-tomogram to determine the best orientation
       angles and lower and upper limits in the thickness dimension, then
       produce a model for Tomopitch.  See the EXAMPLES section for
       command-line entries in some of these different cases.

   Detection of Surfaces in Sectioned Material
       The detection of surfaces begins with an analysis of the standard
       deviation of image intensity in an array of overlapping boxes.  This
       analysis is done simultaneously at a number of scales or binnings.
       The motivation for this multi-scale analysis is to find and use the
       scale at which structure is most distinct from background noise.
       Although the options provide complete control over binning in each
       dimension, only isotropic binning makes sense from the standpoint of
       noise suppression.  However, that does not mean that the boxes
       themselves need to be cubes; instead, they can be very thin in the
       thickness dimension to provide the maximum resolution for detecting
       the section boundary.  The boxes also do not need to contain the
       same number of binned pixels, as long as they are big enough to
       contain enough pixels for a good estimate of SD at the highest
       binning.
       It thus works to have boxes occupy about the same volume at the
       different scales, so this happens by default.  An overlap of 50%
       between boxes is also the default, so overlap need not be specified.
       In short, the two important entries that are needed are the number
       of scales to analyze and the unbinned box size.  Four scales are
       appropriate unless data are already binned.  Box sizes of 32x32x1 or
       bigger are useful.

       After SD is measured in each box, the program measures the median
       and the normalized median absolute deviation (MADN) of SD in two
       regions: a central region where SD appears to be high (which varies
       in depth across the area), and thin regions near the top and bottom
       in the depth dimension.  The number of boxes, median, and MADN of
       each region are shown for each scaling in a table.  On the far right
       of the table is a measure of how distinct the central region is from
       the edge at that scaling; these numbers are used to pick the best
       scaling.

       Further analysis is done by grouping columns of the overlapping
       boxes into blocks, ideally consisting of about 25 boxes in X and Y.
       The extent of a block in unbinned pixels is controlled by the -block
       option, which has situation-dependent defaults.  In a block, the
       median of the SD values is determined at each Z level; this median
       is insensitive to the presence of gold particles outside the section
       as long as the gold does not contribute to too many boxes.  The Z
       levels at which the SD falls off the fastest are taken as the
       boundaries of the section at the X/Y location of the block.  The
       collection of boundary points on each surface is then smoothed with
       a local robust regression; this smoothing can thus eliminate
       aberrant points arising either from gold outside the section or a
       small hole in density at the surface.  The smoothed points are used
       directly if a boundary model is being created.
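
       The per-block boundary detection described above can be illustrated
       with a minimal Python sketch.  It is not the program's code: the
       names madn and block_boundaries are hypothetical, the factor 1.4826
       is the usual scaling that makes the MADN comparable to a standard
       deviation for normal data (this page does not state the constant),
       and the boundary is taken at the single steepest step in the median
       profile, whereas the program fits the fall-off using the control
       parameters listed under -control.

```python
from statistics import median


def madn(values):
    """Normalized median absolute deviation of a sequence of numbers.

    Assumes the common 1.4826 scaling factor (an assumption, see lead-in).
    """
    values = list(values)
    med = median(values)
    return 1.4826 * median([abs(v - med) for v in values])


def block_boundaries(sd_by_level):
    """Locate the two section surfaces in one block (simplified sketch).

    sd_by_level[z] holds the SD values of all boxes in the block at Z
    level z.  The median at each level is insensitive to a few outlying
    boxes, such as boxes containing gold outside the section.

    Returns (bottom, top): the bottom surface lies between levels
    bottom and bottom + 1, the top surface between top and top + 1.
    """
    profile = [median(level) for level in sd_by_level]
    steps = [profile[z + 1] - profile[z] for z in range(len(profile) - 1)]
    bottom = max(range(len(steps)), key=lambda z: steps[z])  # steepest rise
    top = min(range(len(steps)), key=lambda z: steps[z])     # steepest fall
    return bottom, top
```

       In this sketch a single high-SD box in an edge level does not move
       the boundary, because only the median of each level enters the
       profile.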

       To make a model for Tomopitch when there are multiple (sample)
       tomograms, the program fits a pair of lines to the points on the two
       surfaces for each sample tomogram.  When there is a single tomogram,
       the program fits a pair of lines at a specified number of locations
       in Y; the fitted points may extend over more than one block in Y.
       In either case, the lines are then spread apart so that they contain
       all of the points, with a small margin added based on how fast
       density falls at the surface.

       Regardless of whether there is a model output, when given a single
       tomogram, the program computes the median Z value at each surface,
       the Z values that contain all boundaries, and Z limits suitable for
       automatic patch fitting with Autopatchfit.  By default, the latter
       limits will contain at least 90% of the boundary points and the
       limits plus or minus 20 pixels will contain at least 99% of the
       points.

   Analysis of High SD Values in Cryo-tomograms
       When the -high option is entered, the program uses a much less
       ambitious analysis.  The computation of SD values in boxes proceeds
       as usual, but the statistics of the central region are obtained from
       a sample with a much bigger extent in depth, and there is no attempt
       to vary the depth of the sample.  The measure of the distinctness
       between center and edge for each scale is the fraction of boxes in
       the center that deviate by the criterion number of MADNs from the
       edge median.  The best scale is chosen from this distinctness
       measure.

       The program takes the lowest and highest box in each column that
       deviates sufficiently from the edge as boundary points, then
       searches for the rotation angles that minimize a measure of spread.
       At each rotation angle, the program measures the MADN of the Z
       values separately for the rotated lower and upper boundary points.
       The two MADN values are normalized by their values at zero rotation
       and averaged.
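
       The spread measure described above can be sketched as follows.
       This is an illustration under stated assumptions, not the program's
       implementation: it searches a single rotation angle over a list of
       candidates instead of the program's search over its rotation
       angles, the rotation convention in rotated_z is chosen only for the
       example, bead positions are not included, and all function names
       are hypothetical.  The 1.4826 factor in madn is the usual scaling
       and is likewise an assumption.

```python
import math
from statistics import median


def madn(values):
    """Normalized median absolute deviation (assumed 1.4826 scaling)."""
    values = list(values)
    med = median(values)
    return 1.4826 * median([abs(v - med) for v in values])


def rotated_z(points, angle_deg):
    """Z coordinates of (x, z) points after rotating by angle_deg.

    The sign convention here is an assumption made for this sketch.
    """
    a = math.radians(angle_deg)
    return [z * math.cos(a) - x * math.sin(a) for x, z in points]


def spread(lower, upper, angle_deg):
    """Average of the two MADNs, each normalized by its unrotated value.

    Assumes the MADN at zero rotation is nonzero for both surfaces.
    """
    ratios = []
    for points in (lower, upper):
        at_zero = madn(rotated_z(points, 0.0))
        ratios.append(madn(rotated_z(points, angle_deg)) / at_zero)
    return sum(ratios) / len(ratios)


def best_angle(lower, upper, candidates):
    """Candidate angle (degrees) that gives the smallest spread."""
    return min(candidates, key=lambda a: spread(lower, upper, a))
```

       For boundary points lying near two parallel planes tilted by some
       angle, the spread in this sketch is smallest when the candidate
       angle matches that tilt, because the rotated Z values on each
       surface then differ only by their residual scatter.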

       If a model of gold bead positions is supplied with the -bead option,
       then the lower and upper Z values of these positions are determined
       at each rotation, with a certain fraction of most extreme values
       ignored.  The difference between lower and upper Z values is
       normalized by its value at zero rotation and averaged in to the
       spread measure.

       Once the rotation angles are known, the program analyzes a histogram
       of Z values for the rotated lower boundary points, finding the point
       where it rises above its baseline level when going from low to high
       Z.  It does the same analysis for the upper boundary points, but
       going from high to low Z.  These two boundary values are then spread
       apart by the amount indicated by the -boost option, if any.  If gold
       bead positions are also available, upper and lower limits are found
       for the rotated positions and combined with the limits from the
       boundary points to give the reported boundaries.  A model for
       Tomopitch is then created using the angles and boundaries.

       Other methods for analyzing high SD values are still present in the
       code and there are control values for activating them, but these
       alternatives did not perform well and are not described here.

OPTIONS
       Findsection uses the PIP package for input (see the manual page for
       pip).  Options can be specified either as command line arguments
       (with the -) or one per line in a command file (without the -).
       Options can be abbreviated to unique letters; the currently valid
       abbreviations for short names are shown in parentheses.

       -tomo (-to) OR -TomogramFile      File name
              Name of image file to analyze.  At least one image file must
              be entered either with this option or as a non-option
              argument.  All non-option arguments are taken to be tomograms
              for analysis.  (Successive entries accumulate)

       -surface (-su) OR -SurfaceModel      File name
              File name for output of a surface model that can be used for
              flattening in Flattenwarp.

       -pitch (-pi) OR -TomoPitchModel      File name
              File name for output of boundary model for use with
              Tomopitch(1).  Such a model will have two straight lines at
              each sample position.  If multiple tomograms are analyzed,
              each is assumed to be a separate sample, and there will be a
              pair of lines for each sample, with the contours assigned to
              times corresponding to the sample number.  If there is only a
              single tomogram, then the -samples option must be entered to
              indicate how many regions in Y to sample.

       -separate (-se) OR -SeparatePitchLineFits
              When making a model of paired lines for use in Tomopitch, the
              default is to fit a pair of parallel lines to the top and
              bottom surfaces.  With this option, lines will be fit
              separately to points on each surface.  This method would be
              suitable if every sample has material across a wide enough
              area so that the fit will not be thrown off by a few aberrant
              points.

       -samples (-sa) OR -NumberOfSamples      Integer
              Number of positions to sample in a single tomogram to obtain
              lines for Tomopitch.  The position and spacing between these
              samples is determined by their number, the sample extent, and
              the total number of blocks of analyzed boxes in Y.

       -extent (-ex) OR -SampleExtentInY      Integer
              Approximate extent in Y analyzed at sampled positions from a
              single tomogram, in unbinned pixels.  Since a surface
              position is estimated only for each block of boxes, spanning
              100 unbinned pixels by default, the actual extent included in
              the analysis will be based on an integral number of blocks.
              The default is to analyze one block at each sample position.
              When multiple tomogram samples are analyzed, all of the
              available positions will be used.

       -high (-hi) OR -HighSDboxCriterion      Floating point
              Determine sample extent and orientation from boxes with SD
              values that are the given criterion number of MADNs above the
              edge median.

       -bead (-be) OR -BeadModelFile      File name
              When finding extent and orientation from boxes with high SD
              values, a set of fiducial positions will also be taken into
              account if this option is entered with a model file
              containing bead positions.

       -diameter (-di) OR -BeadDiameter      Floating point
              Diameter of beads to assume when including bead positions in
              the analysis.  Half the diameter will be subtracted or added
              to the lower or upper position based on beads, respectively.
              The default is 5.

       -boost (-bo) OR -BoostHighSDThickness      Floating point
              Fraction by which to increase the thickness from the analysis
              of high SD.  The lower and upper boundaries from this
              analysis will each be moved by half of this amount, or less
              if that brings one of them to the edge of the volume.  If
              bead positions are being considered also, limits based on
              them are applied after this increase.

       -lowest (-l) OR -LowestSDforEdges
              Scan through Z planes for the ones with the lowest SD values
              and use these to measure the statistics of the edge, instead
              of using planes at the surfaces of the volume.

       -scales (-sc) OR -NumberOfDefaultScales      Integer
              This option can be used to specify how many default binnings
              to analyze, instead of entering each one with the -binning
              option.  These binnings are isotropic (the same in each
              dimension).  The default binnings available are 1, 2, 3, 4,
              6, 8, 12, 16, 24, 32, 48, and 64.  The default is to do a
              single scale at binning 1.

       -binning (-bi) OR -BinningInXYZ      Three integers
              Binning in X, Y, and Z for each scale to analyze.  Multiple
              binning entries should be in order by increased binning.
              This option cannot be entered with -scales.  (Successive
              entries accumulate)

       -size (-si) OR -SizeOfBoxesInXYZ      Three integers
              Size in X, Y, and Z of boxes in which to measure mean and SD,
              in binned pixels.  This option can be entered multiple times,
              up to once per scaling, but one entry seems to be sufficient.
              For scalings past the last one for which a size was entered,
              the size in each dimension will be set to span about the same
              extent in unbinned pixels as for the last binning for which
              size was entered.  The entry is required.  (Successive
              entries accumulate)

       -spacing (-sp) OR -SpacingInXYZ      Three integers
              Spacing in X, Y, and Z between boxes, in binned pixels.  This
              option can be entered multiple times, only once, or not at
              all; the default is to set the spacing to half of the size.
              For scalings past the last one for which a spacing was
              entered, the spacing in each dimension will be set to give
              the same overlap between boxes as for the last binning for
              which a spacing was entered.  (Successive entries accumulate)

       -block (-bl) OR -BlockSize      Integer
              Size of block in which to consolidate the boxes for further
              analysis, in unbinned pixels.  If this option is not entered,
              the program will start with a size of 100 pixels, or 200 if
              making a surface model, and then increase the size to get an
              equivalent area if there are too few boxes in one direction
              (specifically, when using multiple tomogram samples, the size
              generally gets increased to ~300).  If the option is entered,
              the number is used as is, without such an adjustment.

       -xminmax (-x) OR -XMinAndMax      Two integers
              Minimum and maximum X coordinate to include in the analysis.
              The default is to trim off 2.5% of the extent on each end
              when outputting a surface model, otherwise 5%.

       -yminmax (-y) OR -YMinAndMax      Two integers
              Minimum and maximum Y coordinate to include in the analysis.
              If Y is the thickness dimension, the default is to use the
              whole extent; otherwise the default is to trim off either
              2.5% or 5% of the extent on each end, depending on whether a
              surface model is being made.

       -zminmax (-z) OR -ZMinAndMax      Two integers
              Minimum and maximum Z coordinate to include in the analysis.
              If Z is the thickness dimension, the default is to use the
              whole extent; otherwise the default is to trim off either
              2.5% or 5% of the extent on each end, depending on whether a
              surface model is being made.

       -flipped (-f) OR -ThickDimensionIsY      Integer
              This option can be used to specify which axis of a single
              tomogram is the thickness dimension, if necessary.  The
              default is to assume that the shorter of the Y and Z
              dimensions is the thickness dimension.  Multiple tomograms
              are assumed to be samples as built by Tilt and must have
              their thickness in Y.

       -axis (-a) OR -AxisRotationAngle      Floating point
              Rotation angle from Y axis to tilt axis in the raw tilt
              series, counterclockwise positive.  With this entry, the
              program will avoid analyzing regions outside the area that
              can be well-reconstructed from the original images.  However,
              the correct region is identified only if the aligned stack
              and reconstruction were centered on the original tilt series.

       -tilt (-ti) OR -TiltSeriesSizeXY      Two integers
              Size in X and Y of raw tilt series for volume being analyzed,
              divided by the binning applied to make its aligned stack.
              When the -axis option is entered, this option should be
              entered if this size differs from that of the reconstruction.

       -edge (-ed) OR -EdgeExtentInXYZ      Three integers
              Approximate # of pixels in X, Y, and Z to use for getting
              statistics about the edge of the volume in the thickness
              dimension.  The default is to use 2.5% of the extent in the
              thickness dimension and 50% of the extent in the other two
              dimensions.

       -center (-ce) OR -CenterExtentInXYZ      Three integers
              Approximate # of pixels in X, Y, and Z to use for getting
              statistics about the center of the volume.  The default is to
              use 10% of the extent in the thickness dimension and 33% of
              the extent in the other two dimensions.

       -control (-co) OR -ControlValue      Two floats
              Parameter number and value for setting algorithm control
              parameters.
              Parameters and their numbers (and default values in
              parentheses) are:
                1: Minimum # of points for using robust fit to get pitch
                    line on one surface (6)
                21: Fraction that the difference between distinguishability
                    of center from edge points must improve to adopt a
                    higher scaling for analysis (0.33)
                22: Threshold weight from robust fit for including a point
                    in the final smoothing fit (0.2)
                23: Threshold weight from robust fit for counting a point
                    as "good" (0.6)
                27: Fraction of depth extent to use for center samples
                    (0.1, or 0.4 if analyzing high SD)
                30: Take square root of SD values if > 0 (0, or 1 when
                    analyzing high SD)
              Parameters for finding midpoint
                3: Number of edge MADNs above edge median that maximum
                    value must be to proceed (2.)
                4: Fraction of maximum - edge difference to achieve (0.5)
                5: Number of edge MADNs above edge to achieve as well (3.)
                6: Number of box medians that need to be above those
                    criteria (3)
              Parameters for fitting boundaries of columns
                7: Number of center MADNs below the center median for
                    inside median to be too low (5.)
                8: Fraction of inside - edge median difference that it must
                    fall toward edge median (0.3)
                9, 10: Low and high limits of range of fractions of
                    inside - edge median difference to fit (0.2 and 0.8)
                11: Fraction of inside - edge median difference at which to
                    save boundary (0.5)
                12: Fraction of difference at which to estimate extra
                    boundary distance for pitch output (0.25)
                13: Minimum fraction of boxes in column that must yield
                    boundaries (0.5)
              Parameters for checking block thickness
                14: Criterion fraction of median thickness for considering
                    block too thin (0.5)
                15: Drop a boundary if it is this much farther from local
                    mean than other boundary is (2.)
                16: Drop a boundary if its difference from the mean is this
                    fraction of median thickness (0.35)
              Robust fitting parameters
                17: K-factor for the weighting function (4.68)
                18: Maximum change in weights for termination (0.02)
                19: Maximum change in weights for terminating on an
                    oscillation (0.05)
                20: Maximum iterations (30)
              Parameters for estimate of Z limits for combine with
              Autopatchfit
                24: Fraction for percentile of positions included in the
                    limits (0.10)
                25: Fraction for lower percentile of positions that can be
                    partly outside the limits (0.01)
                26: Number of pixels outside the limits the latter
                    positions can be (20)
              Parameters for analysis of high SD
                28: Fraction of extreme beads to exclude when finding
                    orientation that gives minimum spread (0.04)
                29: Basic amount to weight bead separation in the measure
                    of spread; they will also be weighted less if they
                    occupy less lateral area (0.33)
                31: Type of data to use for analyzing projections of SD at
                    each layer in depth; 1: mean, 2: median, 3: 75th
                    percentile, 4: fraction of boxes above criterion (0)
                32: Fraction of the way from baseline to peak for finding
                    rising point of layer projections (0.1)
                33: Use extrapolation to baseline rather than point where
                    layer projection crosses the criterion (0)
                34: Use second or fourth moment (1 or 2) of layer
                    projection values as measure of spread (0)
                35: Fraction of extreme beads to exclude when determining
                    low and high boundaries (0.01)
              (Successive entries accumulate)

       -volume (-v) OR -VolumeRootname      Text string
              Root name for output of mean and SD volumes at each scale.
              Each pixel in such volumes corresponds to an individual box
              within which mean and SD were measured.  The volume names
              will have the form "rootname#-scale#.means" and
              "rootname#-scale#.SDs", where the first # is the tomogram
              number and the second is the scale index (both numbered from
              0).

       -point (-po) OR -PointRootname      Text string
              Root name for output of models with raw positions along the
              surfaces of the section, and with points after smoothing the
              surface.  The models will be named "rootname#-colbound.mod"
              and "rootname#-smooth.mod", respectively, where # is the
              tomogram number.  There will be two scattered point objects,
              one for each surface.

       -debug (-de) OR -DebugOutput      Integer
              1 or 2 for debugging output; 2 gives output about individual
              smoothing fits.

       -help (-he) OR -usage
              Print help output

       -StandardInput
              Read parameter entries from standard input

EXAMPLES
       To generate a model to use in Tomopitch(1), given a whole sample
       tomogram that has been binned by 3 or more, still in its original
       orientation:

          findsection -scal 2 -size 16,1,16 -block 48 -samp 5 -pitch tomopitch.mod filename.rec

       To generate a model to use in Tomopitch(1), given three unbinned
       samples (bot.rec, mid.rec, top.rec) that each have 20 slices:

          findsection -scal 4 -size 50,1,20 -pitch tomopitch.mod bot.rec mid.rec top.rec

       To generate a model to use in Flattenwarp:

          findsection -scal 4 -size 32,32,1 -surf setname_flat.mod -axis -12 setname.rec

       where the volume should already be post-processed so that Z is the
       depth dimension, setname is the name of the dataset, and the number
       after "-axis" should be your tilt axis rotation angle.  If the
       aligned stack or tomogram was bigger or smaller than a full-sized
       aligned stack in X and Y, then you also need to add "-tilt nx,ny"
       where "nx" and "ny" are the size of the raw tilt series, divided by
       the binning if any.

HISTORY
       Written by David Mastronarde, September 2014, to replace an earlier
       Fortran program of the same name.

BUGS
       Email bug reports to mast at colorado dot edu.

IMOD                                5.2.0                    findsection(1)