findsection(1)              General Commands Manual             findsection(1)



NAME
       findsection - Locates section boundaries in tomograms

SYNOPSIS
       findsection [options] tomogramFile(s)

DESCRIPTION
       Findsection will analyze an entire tomogram, or separate samples from a
       tomogram, to determine the boundaries of sectioned material.  It can
       create a simple model of straight lines along the top and bottom sur-
       faces, suitable for use in Tomopitch.  It can also create a model
       with contours along the surface, suitable for use with Flattenwarp.
       It will also estimate the top and bottom Z levels at which various
       amounts of the material are present, which may be useful for setting
       the Z limits when combining two tomograms or when trimming a tomogram.
       In a largely separate mode of operation, it can analyze the structure
       in a cryo-tomogram to determine the best orientation angles and lower
       and upper limits in the thickness dimension, then produce a model for
       Tomopitch.

       See the EXAMPLES section for command-line entries in some of these dif-
       ferent cases.

   Detection of Surfaces in Sectioned Material
       The detection of surfaces begins with an analysis of the standard devi-
       ation of image intensity in an array of overlapping boxes.  This analy-
       sis is done simultaneously at a number of scales or binnings.  The
       motivation for this multi-scale analysis is to find and use the scale
       at which structure is most distinct from background noise.  Although
       the options provide complete control over binning in each dimension,
       only isotropic binning makes sense from the standpoint of noise sup-
       pression.  However, that does not mean that the boxes themselves need
       to be cubes; instead, they can be very thin in the thickness dimension
       to provide the maximum resolution for detecting the section boundary.
       The boxes also do not need to contain the same number of binned pixels,
       as long as they are big enough to contain enough pixels for a good
       estimate of SD at the highest binning.  It therefore works well for
       boxes to occupy about the same unbinned volume at each scale, and this
       is what happens by default.  An overlap of 50% between boxes is also
       the default, so overlap need not be specified.  In short, the two
       important entries that are needed are the number of scales to analyze
       and the unbinned box size.  Four scales are appropriate unless data are
       already binned.
       Box sizes of 32x32x1 or bigger are useful.
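
       As a rough illustration (not IMOD code), the default box geometry
       described above could be derived as in this Python sketch.  It assumes
       that -scales N selects the first N of the default binnings and that
       sizes are rounded to the nearest binned pixel; the function name
       boxes_per_scale and the rounding rules are hypothetical.

```python
# Hypothetical sketch; names and rounding rules are assumptions, not IMOD code.
DEFAULT_BINNINGS = [1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64]  # listed under -scales


def boxes_per_scale(num_scales, unbinned_size):
    """Return (binning, box size, box spacing) per scale, in binned pixels."""
    plan = []
    for binning in DEFAULT_BINNINGS[:num_scales]:
        # Keep about the same unbinned extent, never smaller than 1 pixel.
        size = tuple(max(1, round(n / binning)) for n in unbinned_size)
        # 50% overlap: spacing is half the box size, at least 1 pixel.
        spacing = tuple(max(1, s // 2) for s in size)
        plan.append((binning, size, spacing))
    return plan


for binning, size, spacing in boxes_per_scale(4, (32, 32, 1)):
    print(binning, size, spacing)
```

       With four scales and an unbinned box of 32x32x1, the boxes stay one
       pixel thick at every scale while shrinking in the other two dimensions.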

       After SD is measured in each box, the program measures the median and
       the normalized median absolute deviation (MADN) of SD in two regions: a
       central region where SD appears to be high (which varies in depth
       across the area), and thin regions near the top and bottom in the depth
       dimension.  The number of boxes, median, and MADN of each region are
       shown for each scaling in a table.  On the far right of the table is a
       measure of how distinct the central region is from the edge at that
       scaling; these numbers are used to pick the best scaling.
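
       For readers unfamiliar with the MADN, the following Python sketch
       computes the three per-region quantities shown in the table.  It uses
       the standard definition (median absolute deviation scaled by 1.4826);
       the function names and sample values are made up for illustration and
       are not taken from the program.

```python
from statistics import median


def madn(values):
    """Normalized median absolute deviation of a sequence of numbers."""
    center = median(values)
    # 1.4826 scales the MAD to match the SD of normally distributed data.
    return 1.4826 * median([abs(v - center) for v in values])


def region_stats(box_sds):
    """Return the (count, median, MADN) triple for one region's box SDs."""
    return len(box_sds), median(box_sds), madn(box_sds)


center_region = [9.0, 10.0, 11.0, 10.5, 30.0]  # made-up SDs, one outlier box
edge_region = [2.0, 2.2, 1.9, 2.1, 2.0]        # made-up SDs near the edge
print(region_stats(center_region))
print(region_stats(edge_region))
```

       The outlier of 30 in the central sample barely affects either the
       median or the MADN, which is why these robust statistics are useful.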

       Further analysis is done by grouping columns of the overlapping boxes
       into blocks, ideally consisting of about 25 boxes in X and Y.  The
       extent of a block in unbinned pixels is controlled by the -block
       option, which has situation-dependent defaults.  In a block, the median
       of the SD values is determined at each Z level; this median is insensi-
       tive to the presence of gold particles outside the section as long as
       the gold does not contribute to too many boxes.  The Z-levels at which
       the SD falls off the fastest are taken as the boundaries of the section
       at the X/Y location of the block.
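
       The idea can be sketched in Python as follows.  This is a simplified
       stand-in, not the program's method: it takes the largest step in the
       median profile as the boundary, whereas the program fits the falloff
       using the control parameters listed under -control.  The function name
       block_boundaries and the data layout are assumptions.

```python
from statistics import median


def block_boundaries(sd_columns):
    """Find boundaries in one block.

    sd_columns: list of columns, each a list of box SD values by Z level.
    Returns (bottom, top) as Z positions halfway between adjacent levels.
    """
    num_z = len(sd_columns[0])
    # Median across the block's columns at each Z level.
    profile = [median([col[z] for col in sd_columns]) for z in range(num_z)]
    steps = [profile[z + 1] - profile[z] for z in range(num_z - 1)]
    bottom = max(range(len(steps)), key=lambda z: steps[z])  # steepest rise
    top = min(range(len(steps)), key=lambda z: steps[z])     # steepest fall
    return bottom + 0.5, top + 0.5


clean = [1, 1, 5, 5, 5, 1, 1]
with_gold = [9, 1, 5, 5, 5, 1, 1]  # a gold particle below the section
print(block_boundaries([clean, with_gold, clean]))
```

       Because the median is taken across columns, the single column with
       gold at the lowest level does not shift the detected bottom boundary.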

       The collection of boundary points on each surface is then smoothed with
       a local robust regression; this smoothing can thus eliminate aberrant
       points arising either from gold outside the section or a small hole in
       density at the surface.  The smoothed points are used directly if a
       boundary model is being created.  To make a model for tomopitch when
       there are multiple (sample) tomograms, the program fits a pair of lines
       to the points on the two surfaces for each sample tomogram.  When there
       is a single tomogram, the program fits a pair of lines at a specified
       number of locations in Y; the fitted points may extend over more than
       one block in Y.  In either case, the lines are then spread apart so
       that they contain all of the points, with a small margin added based on
       how fast density falls at the surface.
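
       A minimal Python sketch of fitting a pair of parallel lines and then
       spreading them apart is given below.  It uses ordinary least squares
       with a shared slope rather than the robust fit the program applies, and
       the margin is passed in as a plain number; both simplifications, and
       the function name, are assumptions made for illustration.

```python
def fit_parallel_lines(bottom_pts, top_pts, margin=0.0):
    """Fit lines z = slope * x + intercept with one shared slope.

    Points are (x, z) pairs.  Returns (slope, bottom_icpt, top_icpt).
    """
    def centered(pts):
        mean_x = sum(x for x, _ in pts) / len(pts)
        mean_z = sum(z for _, z in pts) / len(pts)
        return [(x - mean_x, z - mean_z) for x, z in pts]

    # Pooled least-squares slope with a separate intercept per surface.
    both = centered(bottom_pts) + centered(top_pts)
    slope = sum(x * z for x, z in both) / sum(x * x for x, _ in both)
    # Spread the lines apart so that every point lies between them.
    bottom_icpt = min(z - slope * x for x, z in bottom_pts) - margin
    top_icpt = max(z - slope * x for x, z in top_pts) + margin
    return slope, bottom_icpt, top_icpt


bottom = [(0, 10), (100, 12), (200, 13)]
top = [(0, 50), (100, 51), (200, 54)]
print(fit_parallel_lines(bottom, top))
```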

       Regardless of whether there is a model output, when given a single
       tomogram, the program computes the median Z value at each surface, the
       Z values that contain all boundaries, and Z limits suitable for auto-
       matic patch fitting with Autopatchfit.  By default, the latter lim-
       its will contain at least 90% of the boundary points and the limits
       plus or minus 20 pixels will contain at least 99% of the points.
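
       One way to read the default rule is sketched below in Python: each
       limit is the 10th percentile of boundary positions on its surface,
       pushed outward if needed so that the 1st percentile lies within 20
       pixels of it.  This interpretation, the simple percentile rule, and the
       function names are assumptions; the program's exact computation may
       differ.

```python
def low_percentile(values, frac):
    """Value with at most a fraction `frac` of the values below it."""
    ordered = sorted(values)
    return ordered[int(frac * (len(ordered) - 1))]


def patch_z_limits(bottom_z, top_z, frac=0.10, outer_frac=0.01, outer_pixels=20):
    """Return (lower, upper) Z limits from boundary Z values on each surface."""
    lower = min(low_percentile(bottom_z, frac),
                low_percentile(bottom_z, outer_frac) + outer_pixels)
    # Mirror the top surface so the same percentile rule applies to it.
    neg_top = [-z for z in top_z]
    upper = max(-low_percentile(neg_top, frac),
                -low_percentile(neg_top, outer_frac) - outer_pixels)
    return lower, upper


bottom_z = [-5] + list(range(20, 30))   # one boundary point far below the rest
top_z = list(range(70, 80)) + [120]     # one boundary point far above the rest
print(patch_z_limits(bottom_z, top_z))
```

       In this made-up data the outlying points pull each limit outward, but
       only as far as needed to bring them within 20 pixels of it.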

   Analysis of High SD Values in Cryo-tomograms
       When the -high option is entered, the program uses a much less ambi-
       tious analysis. The computation of SD values in boxes proceeds as
       usual, but the statistics of the central region are obtained from a
       sample with a much bigger extent in depth, and there is no attempt to
       vary the depth of the sample.  The measure of the distinctness between
       center and edge for each scale is the fraction of boxes in the center
       that deviate by the criterion number of MADNs from the edge median.
       The best scale is chosen from this distinctness measure.
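
       The distinctness measure for one scale could be computed as in this
       Python sketch, which counts center boxes whose SD exceeds the edge
       median by the criterion number of edge MADNs.  The function names and
       sample values are hypothetical.

```python
from statistics import median


def madn(values):
    """Normalized median absolute deviation (1.4826 * MAD)."""
    center = median(values)
    return 1.4826 * median([abs(v - center) for v in values])


def distinctness(center_sd, edge_sd, criterion):
    """Fraction of center boxes more than `criterion` MADNs above the edge median."""
    threshold = median(edge_sd) + criterion * madn(edge_sd)
    above = sum(1 for v in center_sd if v > threshold)
    return above / len(center_sd)


edge = [2.0, 2.2, 1.9, 2.1, 2.0]     # made-up edge box SDs
center = [2.1, 2.6, 3.0, 2.3, 4.0]   # made-up center box SDs
print(distinctness(center, edge, criterion=3.0))
```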

       The program takes the lowest and highest box in each column that devi-
       ates sufficiently from the edge as boundary points, then searches for
       the rotation angles that minimize a measure of spread.  At each rota-
       tion angle, the program measures the MADN of the Z values separately
       for the rotated lower and upper boundary points.  The two MADN values
       are normalized by their values at zero rotation and averaged.  If a
       model of gold bead positions is supplied with the -bead option, then
       the lower and upper Z values of these positions are determined at each
       rotation, with a certain fraction of most extreme values ignored.  The
       difference between lower and upper Z values is normalized by its value
       at zero rotation and averaged in to the spread measure.
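
       The search can be illustrated with a simplified Python sketch that
       rotates about a single axis and ignores bead positions.  The program
       searches over more than one rotation angle; the one-angle geometry,
       the brute-force search, and all names here are assumptions.

```python
import math
from statistics import median


def madn(values):
    """Normalized median absolute deviation (1.4826 * MAD)."""
    center = median(values)
    return 1.4826 * median([abs(v - center) for v in values])


def rotated_z(points, angle_deg):
    """Z coordinates of (x, z) points after rotating by -angle_deg."""
    a = math.radians(angle_deg)
    return [z * math.cos(a) - x * math.sin(a) for x, z in points]


def spread(lower, upper, angle_deg):
    """Mean of the two MADNs, each normalized by its value at zero rotation.

    Assumes the MADN at zero rotation is nonzero for both surfaces.
    """
    total = 0.0
    for pts in (lower, upper):
        total += madn(rotated_z(pts, angle_deg)) / madn(rotated_z(pts, 0.0))
    return total / 2.0


def best_angle(lower, upper, angles):
    """Angle from `angles` that minimizes the spread measure."""
    return min(angles, key=lambda a: spread(lower, upper, a))


tilt = math.tan(math.radians(5.0))
lower = [(x, x * tilt + 20.0) for x in range(0, 1000, 100)]
upper = [(x, x * tilt + 80.0) for x in range(0, 1000, 100)]
print(best_angle(lower, upper, range(-10, 11)))
```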

       Once the rotation angles are known, the program analyzes a histogram of
       Z values for the rotated lower boundary points, finding the point where
       it rises above its baseline level when going from low to high Z.  It does
       the same analysis for the upper boundary points, but going from high to
       low Z.  These two boundary values are then spread apart by the amount
       indicated by the -boost option, if any.  If gold bead positions are
       also available, upper and lower limits are found for the rotated posi-
       tions and combined with the limits from the boundary points to give the
       reported boundaries.  A model for Tomopitch is then produced using the
       angles and boundaries.
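
       The histogram step for the lower boundary might look like the Python
       sketch below.  The bin width, the way the baseline is estimated (median
       of the first few bins), and the rise fraction are all assumptions
       chosen for illustration; the upper boundary would be handled the same
       way but scanning from high to low Z.

```python
from statistics import median


def rising_point(z_values, bin_width=1.0, rise_fraction=0.1, baseline_bins=5):
    """Lowest Z at which the histogram clearly rises above its baseline."""
    z_min = min(z_values)
    num_bins = int((max(z_values) - z_min) / bin_width) + 1
    counts = [0] * num_bins
    for z in z_values:
        counts[int((z - z_min) / bin_width)] += 1
    # Assumed baseline: median count in the first few bins.
    baseline = median(counts[:baseline_bins])
    threshold = baseline + rise_fraction * (max(counts) - baseline)
    for index, count in enumerate(counts):
        if count > threshold:
            return z_min + index * bin_width
    return z_min  # histogram never rises above the threshold


stray = [3, 7, 11]                                      # isolated low points
bulk = [z for z in range(20, 30) for _ in range(20)]    # main population
print(rising_point(stray + bulk))
```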

       Other methods for analyzing high SD values are still present in the
       code and there are control values for activating them, but these alter-
       natives did not perform well and are not described here.

OPTIONS
       Findsection uses the PIP package for input (see the manual page for
       pip).  Options can be specified either as command line arguments
       (with the -) or one per line in a command file (without the -).
       Options can be abbreviated to unique letters; the currently valid
       abbreviations for short names are shown in parentheses.

       -tomo (-to) OR -TomogramFile   File name
              Name of image file to analyze.  At least one image file must be
              entered either with this option or as a non-option argument.
              All non-option arguments are taken to be tomograms for analysis.
              (Successive entries accumulate)

       -surface (-su) OR -SurfaceModel     File name
              File name for output of a surface model that can be used for
              flattening in Flattenwarp.

       -pitch (-pi) OR -TomoPitchModel     File name
              File name for output of boundary model for use with Tomo-
              pitch(1).  Such a model will have two straight lines at each
              sample position.  If multiple tomograms are analyzed, each is
              assumed to be a separate sample, and there will be a pair of
              lines for each sample, with the contours assigned to times cor-
              responding to the sample number.  If there is only a single
              tomogram, then the -samples option must be entered to indicate
              how many regions in Y to sample.

       -separate (-se) OR -SeparatePitchLineFits
              When making a model of paired lines for use in Tomopitch, the
              default is to fit a pair of parallel lines to the top and bottom
              surfaces.  With this option, lines will be fit separately to
              points on each surface.  This method would be suitable if every
              sample has material across a wide enough area so that the fit
              will not be thrown off by a few aberrant points.

       -samples (-sa) OR -NumberOfSamples       Integer
              Number of positions to sample in a single tomogram to obtain
              lines for Tomopitch.  The position and spacing between these
              samples is determined by their number, the sample extent, and
              the total number of blocks of analyzed boxes in Y.

       -extent (-ex) OR -SampleExtentInY   Integer
              Approximate extent in Y analyzed at sampled positions from a
              single tomogram, in unbinned pixels.  Since a surface position
              is estimated only for each block of boxes, spanning 100 unbinned
              pixels by default, the actual extent included in the analysis
              will be based on an integral number of blocks.  The default is
              to analyze one block at each sample position.  When multiple
              tomogram samples are analyzed, all of the available positions
              will be used.
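
              As an illustration only, the Python sketch below converts a
              requested extent into a whole number of blocks and spaces the
              samples evenly across the blocks in Y.  Even spacing, the
              rounding, and the function name are assumptions; the manual
              does not spell out the exact placement rule.

```python
def sample_block_ranges(num_samples, extent, block_size, total_blocks):
    """Return inclusive (first, last) block indices in Y for each sample."""
    # Round the requested extent to a whole number of blocks, at least one.
    per_sample = max(1, min(total_blocks, int(extent / block_size + 0.5)))
    last_start = total_blocks - per_sample
    if num_samples == 1:
        starts = [last_start // 2]
    else:
        starts = [int(i * last_start / (num_samples - 1) + 0.5)
                  for i in range(num_samples)]
    return [(s, s + per_sample - 1) for s in starts]


# Five samples of about 300 unbinned pixels, 100-pixel blocks, 23 blocks in Y.
print(sample_block_ranges(5, 300, 100, 23))
```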

       -high (-hi) OR -HighSDboxCriterion       Floating point
              Determine sample extent and orientation from boxes with SD val-
              ues that are the given criterion number of MADNs above the edge
              median.

       -bead (-be) OR -BeadModelFile       File name
              When finding extent and orientation from boxes with high SD val-
              ues, a set of fiducial positions will also be taken into account
              if this option is entered with a model file containing bead
              positions.

       -diameter (-di) OR -BeadDiameter    Floating point
              Diameter of beads to assume when including bead positions in the
              analysis.  Half the diameter will be subtracted or added to the
              lower or upper position based on beads, respectively.  The
              default is 5.

       -boost (-bo) OR -BoostHighSDThickness    Floating point
              Fraction by which to increase the thickness from the analysis of
              high SD.  The lower and upper boundaries from this analysis will
              each be moved by half of this amount, or less if that brings one
              of them to the edge of the volume.  If bead positions are being
              considered also, limits based on them are applied after this
              increase.

       -lowest (-l) OR -LowestSDforEdges
              Scan through Z planes for the ones with the lowest SD values and
              use these to measure the statistics of the edge, instead of
              using planes at the surfaces of the volume.

       -scales (-sc) OR -NumberOfDefaultScales       Integer
              This option can be used to specify how many default binnings to
              analyze, instead of entering each one with the -binning option.
              These binnings are isotropic (the same in each dimension).  The
              default binnings available are 1, 2, 3, 4, 6, 8, 12, 16, 24, 32,
              48, and 64.  The default is to do a single scale at binning 1.

       -binning (-bi) OR -BinningInXYZ     Three integers
              Binning in X, Y, and Z for each scale to analyze.  Multiple bin-
              ning entries should be in order of increasing binning.  This
              option cannot be entered with -scales.  (Successive entries accu-
              mulate)

       -size (-si) OR -SizeOfBoxesInXYZ    Three integers
              Size in X, Y, and Z of boxes in which to measure mean and SD, in
              binned pixels.  This option can be entered multiple times, up to
              once for each scaling, but one entry seems to be sufficient.
              For scalings past the last one for which a size was entered, the
              size in each dimension will be set to span about the same extent
              in unbinned pixels as for the last binning for which size was
              entered. The entry is required.  (Successive entries accumulate)

       -spacing (-sp) OR -SpacingInXYZ     Three integers
              Spacing in X, Y, and Z between boxes, in binned pixels.  This
              option can be entered multiple times, only once, or not at all;
              the default is to set the spacing to half of the size.  For
              scalings past the last one for which a spacing was entered, the
              spacing in each dimension will be set to give the same overlap
              between boxes as for the last binning for which a spacing was
              entered.  (Successive entries accumulate)

       -block (-bl) OR -BlockSize     Integer
              Size of block in which to consolidate the boxes for further
              analysis, in unbinned pixels.  If this option is not entered,
              the program will start with a size of 100 pixels, or 200 if mak-
              ing a surface model, and then increase the size to get an equiv-
              alent area if there are too few boxes in one direction (specifi-
              cally, when using multiple tomogram samples, the size generally
              gets increased to ~300).  If the option is entered, the number
              is used as is, without such an adjustment.

       -xminmax (-x) OR -XMinAndMax   Two integers
              Minimum and maximum X coordinate to include in the analysis.
              The default is to trim off 2.5% of the extent on each end when
              outputting a surface model, otherwise 5%.

       -yminmax (-y) OR -YMinAndMax   Two integers
              Minimum and maximum Y coordinate to include in the analysis.  If
              Y is the thickness dimension, the default is to use the whole
              extent; otherwise the default is to trim off either 2.5% or 5%
              of the extent on each end, depending on whether a surface model
              is being made.

       -zminmax (-z) OR -ZMinAndMax   Two integers
              Minimum and maximum Z coordinate to include in the analysis.  If
              Z is the thickness dimension, the default is to use the whole
              extent; otherwise the default is to trim off either 2.5% or 5%
              of the extent on each end, depending on whether a surface model
              is being made.
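
              The default limits for -xminmax, -yminmax, and -zminmax can be
              summarized by the Python sketch below.  Coordinates numbered
              from 0 and truncation of the trimmed amount are assumptions, as
              is the function name.

```python
def default_limits(extent, is_thickness=False, surface_model=False):
    """Return the default (min, max) coordinate for one dimension."""
    if is_thickness:
        return 0, extent - 1              # whole extent in the thickness dimension
    per_thousand = 25 if surface_model else 50   # 2.5% or 5% on each end
    trim = extent * per_thousand // 1000
    return trim, extent - 1 - trim


print(default_limits(1000))                      # 5% trimmed on each end
print(default_limits(1000, surface_model=True))  # 2.5% trimmed on each end
print(default_limits(200, is_thickness=True))    # nothing trimmed
```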

       -flipped (-f) OR -ThickDimensionIsY      Integer
              This option can be used to specify which axis of a single tomo-
              gram is the thickness dimension, if necessary.  The default is
              to assume that the shorter of Y and Z is the thickness
              dimension.  Multiple tomograms are assumed to be samples as
              built by Tilt and must have their thickness in Y.

       -axis (-a) OR -AxisRotationAngle    Floating point
              Rotation angle from Y axis to tilt axis in the raw tilt series,
              counterclockwise positive.  With this entry, the program will
              avoid analyzing regions outside the area that can be well-recon-
              structed from the original images.  However, the correct region
              is identified only if the aligned stack and reconstruction were
              centered on the original tilt series.

       -tilt (-ti) OR -TiltSeriesSizeXY    Two integers
              Size in X and Y of raw tilt series for volume being analyzed,
              divided by the binning applied to make its aligned stack.  When
              the -axis option is entered, this option should be entered if
              this size differs from that of the reconstruction.

       -edge (-ed) OR -EdgeExtentInXYZ     Three integers
              Approximate # of pixels in X, Y, and Z to use for getting sta-
              tistics about the edge of the volume in the thickness dimension.
              The default is to use 2.5% of the extent in the thickness dimen-
              sion and 50% of the extent in the other two dimensions.

       -center (-ce) OR -CenterExtentInXYZ      Three integers
              Approximate # of pixels in X, Y, and Z to use for getting sta-
              tistics about the center of the volume.  The default is to use
              10% of the extent in the thickness dimension and 33% of the
              extent in the other two dimensions.

       -control (-co) OR -ControlValue     Two floats
              Parameter number and value for setting algorithm control parame-
              ters.  Parameters and their numbers (and default values in
              parentheses) are:
                1: Minimum # of points for using robust fit to get pitch line
              on one surface (6)
                21: Fraction that the difference between distinguishability of
              center from edge points must improve to adopt a higher scaling
              for analysis (0.33)
                22: Threshold weight from robust fit for including a point in
              the final smoothing fit (0.2)
                23: Threshold weight from robust fit for counting a point as
              "good" (0.6)
                27: Fraction of depth extent to use for center samples (0.1,
              or 0.4 if analyzing high SD)
                30: Take square root of SD values if > 0 (0, or 1 when analyz-
              ing high SD)
                     Parameters for finding midpoint
                3: Number of edge MADNs above edge median that maximum value
              must be to proceed (2.)
                4: Fraction of maximum - edge difference to achieve (0.5)
                5: Number of edge MADNs above edge to achieve as well (3.)
                6: Number of box medians that need to be above those criteria
              (3)
                     Parameters for fitting boundaries of columns
                7: Number of center MADNs below the center median for inside
              median to be too low (5.)
                8: Fraction of inside - edge median difference that it must
              fall toward edge median (0.3)
                9, 10: Low and high limits of range of fractions of inside -
              edge median difference to fit (0.2 and 0.8)
                11: Fraction of inside - edge median difference at which to
              save boundary (0.5)
                12: Fraction of difference at which to estimate extra boundary
              distance for pitch output (0.25)
                13: Minimum fraction of boxes in column that must yield bound-
              aries (0.5)

                     Parameters for checking block thickness
                14: Criterion fraction of median thickness for considering
              block too thin (0.5)
                15: Drop a boundary if it is this much farther from local mean
              than other boundary is (2.)
                16: Drop a boundary if its difference from the mean is this
              fraction of median thickness (0.35)
                     Robust fitting parameters
                17: K-factor for the weighting function (4.68)
                18: Maximum change in weights for termination (0.02)
                19: Maximum change in weights for terminating on an oscilla-
              tion (0.05)
                20: Maximum iterations (30)
                     Parameters for estimate of Z limits for combine with
              Autopatchfit
                24: Fraction for percentile of positions included in the lim-
              its (0.10)
                25: Fraction for lower percentile of positions that can be
              partly outside the limits (0.01)
                26: Number of pixels outside the limits the latter positions
              can be (20)

                     Parameters for analysis of high SD
                28: Fraction of extreme beads to exclude when finding orienta-
              tion that gives minimum spread (0.04)
                29: Basic amount to weight bead separation in the measure of
              spread; they will also be weighted less if they occupy less lat-
              eral area (0.33)
                31: Type of data to use for analyzing projections of SD at
              each layer in depth; 1: mean, 2: median, 3: 75th percentile, 4:
              fraction of boxes above criterion (0)
                32: Fraction of the way from baseline to peak for finding ris-
              ing point of layer projections (0.1)
                33: Use extrapolation to baseline rather than point where
              layer projection crosses the criterion (0)
                34: Use second or fourth moment (1 or 2) of layer projection
              values as measure of spread (0)
                35: Fraction of extreme beads to exclude when determining low
              and high boundaries (0.01) (Successive entries accumulate)

       -volume (-v) OR -VolumeRootname     Text string
              Root name for output of mean and SD volumes at each scale.  Each
              pixel in such volumes corresponds to an individual box within
              which mean and SD were measured.  The volume names will have
              the form "rootname#-scale#.means" and "rootname#-scale#.SDs",
              where the first # is the tomogram number and the second is the
              scale index (both numbered from 0).
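
              When collecting these volumes in a script, the file names can
              be built from the stated pattern, as in this small Python
              sketch.  The helper name volume_names is hypothetical.

```python
def volume_names(rootname, tomo_index, scale_index):
    """Return the (means, SDs) file names; both indexes are numbered from 0."""
    stem = f"{rootname}{tomo_index}-scale{scale_index}"
    return stem + ".means", stem + ".SDs"


# Names for the first tomogram at the third scale, with root name "sd".
print(volume_names("sd", 0, 2))
```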

       -point (-po) OR -PointRootname      Text string
              Root name for output of models with raw positions along the sur-
              faces of the section, and with points after smoothing the sur-
              face.  The models will be named "rootname#-colbound.mod" and
              "rootname#-smooth.mod", respectively, where # is the tomogram
              number.  There will be two scattered point objects, one for each
              surface.

       -debug (-de) OR -DebugOutput   Integer
              1 or 2 for debugging output; 2 gives output about individual
              smoothing fits.

       -help (-he) OR -usage
              Print help output

       -StandardInput
              Read parameter entries from standard input


EXAMPLES
       To generate a model to use in Tomopitch(1), given a whole sample tomo-
       gram that has been binned by 3 or more, still in its original orienta-
       tion:
         findsection -scal 2 -size 16,1,16 -block 48 -samp 5 -pitch tomopitch.mod filename.rec

       To generate a model to use in Tomopitch(1), given three unbinned sam-
       ples (bot.rec, mid.rec, top.rec) that each have 20 slices:
         findsection -scal 4 -size 50,1,20 -pitch tomopitch.mod bot.rec mid.rec top.rec

       To generate a model to use in Flattenwarp:
         findsection -scal 4 -size 32,32,1 -surf setname_flat.mod -axis -12 setname.rec
       where the volume should already be post-processed so that Z is the
       depth dimension, setname is the name of the dataset, and the number
       after "-axis" should be your tilt axis rotation angle.  If the aligned
       stack or tomogram was bigger or smaller than a full-sized aligned stack
       in X and Y, then you also need to add "-tilt nx,ny" where "nx" and "ny"
       are the size of the raw tilt series, divided by the binning if any.
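
       For scripted use, the Flattenwarp example can be assembled from Python
       as sketched below.  The wrapper function is hypothetical and only
       builds the argument list from options documented on this page; running
       it (for example with subprocess.run) requires findsection on the PATH.

```python
import shlex


def flatten_command(setname, axis_angle, tilt_size=None):
    """Build the findsection argument list for making a surface model."""
    cmd = ["findsection", "-scales", "4", "-size", "32,32,1",
           "-surface", f"{setname}_flat.mod", "-axis", str(axis_angle)]
    if tilt_size is not None:
        # Raw tilt series size in X and Y, divided by the binning if any.
        cmd += ["-tilt", f"{tilt_size[0]},{tilt_size[1]}"]
    cmd.append(f"{setname}.rec")
    return cmd


print(shlex.join(flatten_command("setname", -12)))
print(shlex.join(flatten_command("setname", -12, tilt_size=(2048, 2048))))
```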

HISTORY
       Written by David Mastronarde, September 2014, to replace an earlier
       Fortran program of the same name.

BUGS
       Email bug reports to mast at colorado dot edu.



IMOD                                 5.2.0                      findsection(1)