findsection(1) General Commands Manual findsection(1)
NAME
findsection - Locates section boundaries in tomograms
SYNOPSIS
findsection [options] tomogramFile(s)
DESCRIPTION
Findsection will analyze an entire tomogram, or separate samples from a
tomogram, to determine the boundaries of sectioned material. It can
create a simple model of straight lines along the top and bottom sur-
faces, suitable for use in Tomopitch. It can also create a model
with contours along the surface, suitable for use with Flattenwarp.
It will also estimate the top and bottom Z levels at which various
amounts of the material are present, which may be useful for setting
the Z limits when combining two tomograms or when trimming a tomogram.
In a largely separate mode of operation, it can analyze the structure
in a cryo-tomogram to determine the best orientation angles and lower
and upper limits in the thickness dimension, then produce a model for
Tomopitch.
See the EXAMPLES section for command-line entries in some of these dif-
ferent cases.
Detection of Surfaces in Sectioned Material
The detection of surfaces begins with an analysis of the standard devi-
ation of image intensity in an array of overlapping boxes. This analy-
sis is done simultaneously at a number of scales or binnings. The
motivation for this multi-scale analysis is to find and use the scale
at which structure is most distinct from background noise. Although
the options provide complete control over binning in each dimension,
only isotropic binning makes sense from the standpoint of noise sup-
pression. However, that does not mean that the boxes themselves need
to be cubes; instead, they can be very thin in the thickness dimension
to provide the maximum resolution for detecting the section boundary.
The boxes also do not need to contain the same number of binned pixels,
as long as they are big enough to contain enough pixels for a good
estimate of SD at the highest binning. It thus works to have boxes
occupy about the same volume at the different scales, so this happens
by default. An overlap of 50% between boxes is also a default, so
overlap need not be specified. In short, the two important entries
that are needed are the number of scales to analyze and the unbinned
box size. Four scales are appropriate unless data are already binned.
Box sizes of 32x32x1 or bigger are useful.
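The relationship between the number of scales, the unbinned box size, and the resulting boxes can be illustrated with a short Python sketch. This is not the program's code. It assumes that the first N default binnings listed under the -scales option are used, that box sizes are rounded to the nearest binned pixel with a floor of 1, and that spacing is half the size. The name scale_plan is hypothetical.

```python
# Default isotropic binnings listed under the -scales option.
DEFAULT_BINNINGS = [1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64]


def scale_plan(num_scales, unbinned_size):
    """Return (binning, box size, box spacing) per scale, in binned pixels.

    Assumptions of this sketch: the first num_scales default binnings
    are used, sizes are rounded to the nearest pixel, and spacing is
    half the size (50% overlap), each with a floor of 1 pixel.
    """
    plan = []
    for binning in DEFAULT_BINNINGS[:num_scales]:
        # Keep roughly the same unbinned extent at every scale.
        size = tuple(max(1, round(s / binning)) for s in unbinned_size)
        spacing = tuple(max(1, s // 2) for s in size)
        plan.append((binning, size, spacing))
    return plan


# Four scales with an unbinned box of 32x32x1 (thin in the depth dimension).
for binning, size, spacing in scale_plan(4, (32, 32, 1)):
    print(binning, size, spacing)
```

Under these assumptions the box stays one pixel thin in depth at every scale, while its lateral size in binned pixels shrinks as the binning grows.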
After SD is measured in each box, the program measures the median and
the normalized median absolute deviation (MADN) of SD in two regions: a
central region where SD appears to be high (which varies in depth
across the area), and thin regions near the top and bottom in the depth
dimension. The number of boxes, median, and MADN of each region are
shown for each scaling in a table. On the far right of the table is a
measure of how distinct the central region is from the edge at that
scaling; these numbers are used to pick the best scaling.
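For readers unfamiliar with the MADN, the following Python sketch shows the usual definition: the median absolute deviation scaled by 1.4826, which makes it comparable to a standard deviation for normally distributed data. The function name median_and_madn and the sample values are illustrative only, and the sketch does not reproduce the program's distinctness measure.

```python
from statistics import median

# Scale factor that makes the MAD a consistent estimate of the standard
# deviation for normally distributed data.
MADN_FACTOR = 1.4826


def median_and_madn(values):
    """Return the median and normalized median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med, MADN_FACTOR * mad


# Made-up SD values for edge boxes; the last one mimics a gold particle.
edge_sd = [1.0, 1.1, 0.9, 1.2, 1.0, 8.0]
print(median_and_madn(edge_sd))
```

Because both statistics are medians, a single extreme box such as the last value above has little effect on either one.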
Further analysis is done by grouping columns of the overlapping boxes
into blocks, ideally consisting of about 25 boxes in X and Y. The
extent of a block in unbinned pixels is controlled by the -block
option, which has situation-dependent defaults. In a block, the median
of the SD values is determined at each Z level; this median is insensi-
tive to the presence of gold particles outside the section as long as
the gold does not contribute to too many boxes. The Z-levels at which
the SD falls off the fastest are taken as the boundaries of the section
at the X/Y location of the block.
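The idea of taking a median at each Z level and then locating the steepest changes can be sketched in Python as follows. This is a deliberate simplification: it picks the single largest step between adjacent levels, whereas the program fits the fall-off using the fractions set by the control parameters listed under -control. The name block_boundaries and the data are hypothetical.

```python
from statistics import median


def block_boundaries(sd_columns):
    """Estimate bottom and top boundaries for one block.

    sd_columns holds one list of SD values per column of boxes, indexed
    by Z level. Returns the Z positions, halfway between two levels,
    where the per-level median rises fastest and falls fastest.
    """
    num_levels = len(sd_columns[0])
    # Median across the columns at each Z level.
    profile = [median([col[z] for col in sd_columns]) for z in range(num_levels)]
    steps = [profile[z + 1] - profile[z] for z in range(num_levels - 1)]
    bottom = max(range(len(steps)), key=lambda z: steps[z]) + 0.5
    top = min(range(len(steps)), key=lambda z: steps[z]) + 0.5
    return bottom, top


columns = [
    [1, 1, 5, 5, 5, 1, 1],
    [1, 1, 5, 5, 5, 1, 1],
    [1, 9, 5, 5, 5, 1, 1],  # high SD at Z level 1 mimics gold outside the section
]
print(block_boundaries(columns))
```

The high value in the third column does not move the bottom boundary, because the median at that level is still set by the other two columns.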
The collection of boundary points on each surface is then smoothed with
a local robust regression; this smoothing can thus eliminate aberrant
points arising either from gold outside the section or a small hole in
density at the surface. The smoothed points are used directly if a
boundary model is being created. To make a model for tomopitch when
there are multiple (sample) tomograms, the program fits a pair of lines
to the points on the two surfaces for each sample tomogram. When there
is a single tomogram, the program fits a pair of lines at a specified
number of locations in Y; the fitted points may extend over more than
one block in Y. In either case, the lines are then spread apart so
that they contain all of the points, with a small margin added based on
how fast density falls at the surface.
Regardless of whether there is a model output, when given a single
tomogram, the program computes the median Z value at each surface, the
Z values that contain all boundaries, and Z limits suitable for auto-
matic patch fitting with Autopatchfit. By default, the latter lim-
its will contain at least 90% of the boundary points and the limits
plus or minus 20 pixels will contain at least 99% of the points.
Analysis of High SD Values in Cryo-tomograms
When the -high option is entered, the program uses a much less ambi-
tious analysis. The computation of SD values in boxes proceeds as
usual, but the statistics of the central region are obtained from a
sample with a much bigger extent in depth, and there is no attempt to
vary the depth of the sample. The measure of the distinctness between
center and edge for each scale is the fraction of boxes in the center
that deviate by the criterion number of MADN's from the edge median.
The best scale is chosen from this distinctness measure.
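A minimal Python sketch of this distinctness measure follows. It assumes that "deviate" means lying above the edge median, as in the description of the -high option, and that the MADN used is that of the edge region. The function names and sample values are hypothetical.

```python
from statistics import median


def madn(values):
    """Normalized median absolute deviation."""
    med = median(values)
    return 1.4826 * median([abs(v - med) for v in values])


def high_sd_fraction(center_sd, edge_sd, criterion):
    """Fraction of center boxes more than `criterion` edge MADNs
    above the edge median."""
    threshold = median(edge_sd) + criterion * madn(edge_sd)
    return sum(1 for v in center_sd if v > threshold) / len(center_sd)


edge_sd = [1.0, 2.0, 3.0, 4.0, 5.0]
center_sd = [3.0, 6.0, 7.0, 8.0]
# The criterion of 2.0 is an arbitrary value chosen for illustration.
print(high_sd_fraction(center_sd, edge_sd, 2.0))
```

Computing this fraction at each scale and comparing the results mirrors how the best scale is described as being chosen.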
The program takes the lowest and highest box in each column that devi-
ates sufficiently from the edge as boundary points, then searches for
the rotation angles that minimize a measure of spread. At each rota-
tion angle, the program measures the MADN of the Z values separately
for the rotated lower and upper boundary points. The two MADN values
are normalized by their values at zero rotation and averaged. If a
model of gold bead positions is supplied with the -bead option, then
the lower and upper Z values of these positions are determined at each
rotation, with a certain fraction of most extreme values ignored. The
difference between lower and upper Z values is normalized by its value
at zero rotation and averaged into the spread measure.
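The search for an orientation that minimizes spread can be sketched in Python for a single rotation angle acting on (x, z) points. This is an illustration under several assumptions: the program searches over more than one angle, its search strategy is not described here, and the sign convention, the grid of trial angles, and all names below are choices made for this sketch. Bead positions are left out.

```python
import math
from statistics import median


def madn(values):
    """Normalized median absolute deviation."""
    med = median(values)
    return 1.4826 * median([abs(v - med) for v in values])


def rotated_z(points, angle_deg):
    """Z coordinates of (x, z) points after rotating them by angle_deg."""
    a = math.radians(angle_deg)
    return [x * math.sin(a) + z * math.cos(a) for x, z in points]


def best_rotation(lower, upper, angles):
    """Return the trial angle giving the smallest normalized MADN spread."""
    # MADN of each surface at zero rotation, used for normalization.
    base = [madn(rotated_z(pts, 0.0)) for pts in (lower, upper)]

    def spread(angle):
        ratios = [madn(rotated_z(pts, angle)) / b
                  for pts, b in zip((lower, upper), base)]
        return sum(ratios) / len(ratios)

    return min(angles, key=spread)


# Synthetic boundary points on two surfaces tilted by 4 degrees.
tilt = math.tan(math.radians(4.0))
xs = range(-50, 51, 10)
lower = [(x, x * tilt - 20.0) for x in xs]
upper = [(x, x * tilt + 20.0) for x in xs]
angles = [i * 0.5 for i in range(-20, 21)]
print(best_rotation(lower, upper, angles))
```

For this synthetic data the search selects -4.0, the rotation that levels both surfaces under the sketch's sign convention.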
Once the rotation angles are known, the program analyzes a histogram of
Z values for the rotated lower boundary points, finding the point where
the histogram rises above its baseline level when going from low to high
Z. It does
the same analysis for the upper boundary points, but going from high to
low Z. These two boundary values are then spread apart by the amount
indicated by the -boost option, if any. If gold bead positions are
also available, upper and lower limits are found for the rotated posi-
tions and combined with the limits from the boundary points to give the
reported boundaries. A model for Tomopitch is then made using the angles
and boundaries.
Other methods for analyzing high SD values are still present in the
code and there are control values for activating them, but these alter-
natives did not perform well and are not described here.
OPTIONS
Findsection uses the PIP package for input (see the manual page for
pip). Options can be specified either as command line arguments
(with the -) or one per line in a command file (without the -).
Options can be abbreviated to unique letters; the currently valid
abbreviations for short names are shown in parentheses.
-tomo (-to) OR -TomogramFile File name
Name of image file to analyze. At least one image file must be
entered either with this option or as a non-option argument.
All non-option arguments are taken to be tomograms for analysis.
(Successive entries accumulate)
-surface (-su) OR -SurfaceModel File name
File name for output of a surface model that can be used for
flattening in Flattenwarp.
-pitch (-pi) OR -TomoPitchModel File name
File name for output of boundary model for use with Tomo-
pitch(1). Such a model will have two straight lines at each
sample position. If multiple tomograms are analyzed, each is
assumed to be a separate sample, and there will be a pair of
lines for each sample, with the contours assigned to times cor-
responding to the sample number. If there is only a single
tomogram, then the -samples option must be entered to indicate
how many regions in Y to sample.
-separate (-se) OR -SeparatePitchLineFits
When making a model of paired lines for use in Tomopitch, the
default is to fit a pair of parallel lines to the top and bottom
surfaces. With this option, lines will be fit separately to
points on each surface. This method would be suitable if every
sample has material across a wide enough area so that the fit
will not be thrown off by a few aberrant points.
-samples (-sa) OR -NumberOfSamples Integer
Number of positions to sample in a single tomogram to obtain
lines for Tomopitch. The position of and spacing between these
samples are determined by their number, the sample extent, and
the total number of blocks of analyzed boxes in Y.
-extent (-ex) OR -SampleExtentInY Integer
Approximate extent in Y analyzed at sampled positions from a
single tomogram, in unbinned pixels. Since a surface position
is estimated only for each block of boxes, spanning 100 unbinned
pixels by default, the actual extent included in the analysis
will be based on an integral number of blocks. The default is
to analyze one block at each sample position. When multiple
tomogram samples are analyzed, all of the available positions
will be used.
-high (-hi) OR -HighSDboxCriterion Floating point
Determine sample extent and orientation from boxes with SD val-
ues that are the given criterion number of MADNs above the edge
median.
-bead (-be) OR -BeadModelFile File name
When finding extent and orientation from boxes with high SD val-
ues, a set of fiducial positions will also be taken into account
if this option is entered with a model file containing bead
positions.
-diameter (-di) OR -BeadDiameter Floating point
Diameter of beads to assume when including bead positions in the
analysis. Half the diameter will be subtracted from the lower
position and added to the upper position based on beads. The
default is 5.
-boost (-bo) OR -BoostHighSDThickness Floating point
Fraction by which to increase the thickness from the analysis of
high SD. The lower and upper boundaries from this analysis will
each be moved by half of this amount, or less if that brings one
of them to the edge of the volume. If bead positions are being
considered also, limits based on them are applied after this
increase.
-lowest (-l) OR -LowestSDforEdges
Scan through Z planes for the ones with the lowest SD values and
use these to measure the statistics of the edge, instead of
using planes at the surfaces of the volume.
-scales (-sc) OR -NumberOfDefaultScales Integer
This option can be used to specify how many default binnings to
analyze, instead of entering each one with the -binning option.
These binnings are isotropic (the same in each dimension). The
default binnings available are 1, 2, 3, 4, 6, 8, 12, 16, 24, 32,
48, and 64. The default is to do a single scale at binning 1.
-binning (-bi) OR -BinningInXYZ Three integers
Binning in X, Y, and Z for each scale to analyze. Multiple bin-
ning entries should be in order of increasing binning. This
option cannot be entered with -scales. (Successive entries accu-
mulate)
-size (-si) OR -SizeOfBoxesInXYZ Three integers
Size in X, Y, and Z of boxes in which to measure mean and SD, in
binned pixels. This option can be entered multiple times, up to
once per scaling, but one entry seems to be sufficient.
For scalings past the last one for which a size was entered, the
size in each dimension will be set to span about the same extent
in unbinned pixels as for the last binning for which size was
entered. The entry is required. (Successive entries accumulate)
-spacing (-sp) OR -SpacingInXYZ Three integers
Spacing in X, Y, and Z between boxes, in binned pixels. This
option can be entered multiple times, only once, or not at all;
the default is to set the spacing to half of the size. For
scalings past the last one for which a spacing was entered, the
spacing in each dimension will be set to give the same overlap
between boxes as for the last binning for which a spacing was
entered. (Successive entries accumulate)
-block (-bl) OR -BlockSize Integer
Size of block in which to consolidate the boxes for further
analysis, in unbinned pixels. If this option is not entered,
the program will start with a size of 100 pixels, or 200 if mak-
ing a surface model, and then increase the size to get an equiv-
alent area if there are too few boxes in one direction (specifi-
cally, when using multiple tomogram samples, the size generally
gets increased to ~300). If the option is entered, the number
is used as is, without such an adjustment.
-xminmax (-x) OR -XMinAndMax Two integers
Minimum and maximum X coordinate to include in the analysis.
The default is to trim off 2.5% of the extent on each end when
outputting a surface model, otherwise 5%.
-yminmax (-y) OR -YMinAndMax Two integers
Minimum and maximum Y coordinate to include in the analysis. If
Y is the thickness dimension, the default is to use the whole
extent; otherwise the default is to trim off either 2.5% or 5%
of the extent on each end, depending on whether a surface model
is being made.
-zminmax (-z) OR -ZMinAndMax Two integers
Minimum and maximum Z coordinate to include in the analysis. If
Z is the thickness dimension, the default is to use the whole
extent; otherwise the default is to trim off either 2.5% or 5%
of the extent on each end, depending on whether a surface model
is being made.
-flipped (-f) OR -ThickDimensionIsY Integer
This option can be used to specify which axis of a single tomo-
gram is the thickness dimension, if necessary. The default is
to assume that the shorter of the Y and Z dimensions is the thickness
dimension. Multiple tomograms are assumed to be samples as
built by Tilt and must have their thickness in Y.
-axis (-a) OR -AxisRotationAngle Floating point
Rotation angle from Y axis to tilt axis in the raw tilt series,
counterclockwise positive. With this entry, the program will
avoid analyzing regions outside the area that can be well-recon-
structed from the original images. However, the correct region
is identified only if the aligned stack and reconstruction were
centered on the original tilt series.
-tilt (-ti) OR -TiltSeriesSizeXY Two integers
Size in X and Y of raw tilt series for volume being analyzed,
divided by the binning applied to make its aligned stack. When
the -axis option is entered, this option should be entered if this
size differs from that of the reconstruction.
-edge (-ed) OR -EdgeExtentInXYZ Three integers
Approximate # of pixels in X, Y, and Z to use for getting sta-
tistics about the edge of the volume in the thickness dimension.
The default is to use 2.5% of the extent in the thickness dimen-
sion and 50% of the extent in the other two dimensions.
-center (-ce) OR -CenterExtentInXYZ Three integers
Approximate # of pixels in X, Y, and Z to use for getting sta-
tistics about the center of the volume. The default is to use
10% of the extent in the thickness dimension and 33% of the
extent in the other two dimensions.
-control (-co) OR -ControlValue Two floats
Parameter number and value for setting algorithm control parame-
ters. Parameters and their numbers (and default values in
parentheses) are:
1: Minimum # of points for using robust fit to get pitch line
on one surface (6)
21: Fraction that the difference between distinguishability of
center from edge points must improve to adopt a higher scaling
for analysis (0.33)
22: Threshold weight from robust fit for including a point in
the final smoothing fit (0.2)
23: Threshold weight from robust fit for counting a point as
"good" (0.6)
27: Fraction of depth extent to use for center samples (0.1,
or 0.4 if analyzing high SD)
30: Take square root of SD values if > 0 (0, or 1 when analyz-
ing high SD)
Parameters for finding midpoint
3: Number of edge MADN's above edge median that maximum value
must be to proceed (2.)
4: Fraction of maximum - edge difference to achieve (0.5)
5: Number of edge MADNs above edge to achieve as well (3.)
6: Number of box medians that need to be above those criteria
(3)
Parameters for fitting boundaries of columns
7: Number of center MADN's below the center median for inside
median to be too low (5.)
8: Fraction of inside - edge median difference that it must
fall toward edge median (0.3)
9, 10: Low and high limits of range of fractions of inside -
edge median difference to fit (0.2 and 0.8)
11: Fraction of inside - edge median difference at which to
save boundary (0.5)
12: Fraction of difference at which to estimate extra boundary
distance for pitch output (0.25)
13: Minimum fraction of boxes in column that must yield bound-
aries (0.5)
Parameters for checking block thickness
14: Criterion fraction of median thickness for considering
block too thin (0.5)
15: Drop a boundary if it is this much farther from local mean
than other boundary is (2.)
16: Drop a boundary if its difference from the mean is this
fraction of median thickness (0.35)
Robust fitting parameters
17: K-factor for the weighting function (4.68)
18: Maximum change in weights for termination (0.02)
19: Maximum change in weights for terminating on an oscilla-
tion (0.05)
20: Maximum iterations (30)
Parameters for estimate of Z limits for combine with
Autopatchfit
24: Fraction for percentile of positions included in the lim-
its (0.10)
25: Fraction for lower percentile of positions that can be
partly outside the limits (0.01)
26: Number of pixels outside the limits the latter positions
can be (20)
Parameters for analysis of high SD
28: Fraction of extreme beads to exclude when finding orienta-
tion that gives minimum spread (0.04)
29: Basic amount to weight bead separation in the measure of
spread; they will also be weighted less if they occupy less lat-
eral area (0.33)
31: Type of data to use for analyzing projections of SD at
each layer in depth; 1: mean, 2: median, 3: 75th percentile, 4:
fraction of boxes above criterion (0)
32: Fraction of the way from baseline to peak for finding ris-
ing point of layer projections (0.1)
33: Use extrapolation to baseline rather than point where
layer projection crosses the criterion (0)
34: Use second or fourth moment (1 or 2) of layer projection
values as measure of spread (0)
35: Fraction of extreme beads to exclude when determining low
and high boundaries (0.01)
(Successive entries accumulate)
-volume (-v) OR -VolumeRootname Text string
Root name for output of mean and SD volumes at each scale. Each
pixel in such volumes corresponds to an individual box within
which mean and SD were measured. The volume names will have
the form "rootname#-scale#.means" and "rootname#-scale#.SDs",
where the first # is the tomogram number and the second is the
scale index (both numbered from 0).
-point (-po) OR -PointRootname Text string
Root name for output of models with raw positions along the sur-
faces of the section, and with points after smoothing the sur-
face. The models will be named "rootname#-colbound.mod" and
"rootname#-smooth.mod", respectively, where # is the tomogram
number. There will be two scattered point objects, one for each
surface.
-debug (-de) OR -DebugOutput Integer
1 or 2 for debugging output; 2 gives output about individual
smoothing fits.
-help (-he) OR -usage
Print help output
-StandardInput
Read parameter entries from standard input
EXAMPLES
To generate a model to use in Tomopitch(1), given a whole sample tomogram
that has been binned by 3 or more, still in its original orientation:
findsection -scal 2 -size 16,1,16 -block 48 -samp 5 -pitch tomopitch.mod filename.rec
To generate a model to use in Tomopitch(1), given three unbinned samples
(bot.rec, mid.rec, top.rec) that each have 20 slices:
findsection -scal 4 -size 50,1,20 -pitch tomopitch.mod bot.rec mid.rec top.rec
To generate a model to use in Flattenwarp:
findsection -scal 4 -size 32,32,1 -surf setname_flat.mod -axis -12 setname.rec
where the volume should already be post-processed so that Z is the
depth dimension, setname is the name of the dataset, and the number
after "-axis" should be your tilt axis rotation angle. If the aligned
stack or tomogram was bigger or smaller than a full-sized aligned stack
in X and Y, then you also need to add "-tilt nx,ny" where "nx" and "ny"
are the size of the raw tilt series, divided by the binning if any.
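When this command is run from a script, the rule about adding "-tilt" can be written down once. The Python sketch below assembles the argument list for the Flattenwarp example using only the options shown above. The function name, its parameters, and the sizes in the usage lines are hypothetical, and the sketch only builds the list without running the program.

```python
def flatten_model_command(setname, axis_angle, recon_xy=None, tilt_xy=None):
    """Build the argument list for the Flattenwarp example above.

    recon_xy and tilt_xy are (nx, ny) sizes; tilt_xy is the raw tilt
    series size divided by the binning of the aligned stack.
    """
    cmd = ["findsection", "-scal", "4", "-size", "32,32,1",
           "-surf", f"{setname}_flat.mod", "-axis", str(axis_angle)]
    # Add -tilt only when the tilt series size differs from the reconstruction.
    if tilt_xy is not None and tilt_xy != recon_xy:
        cmd += ["-tilt", f"{tilt_xy[0]},{tilt_xy[1]}"]
    cmd.append(f"{setname}.rec")
    return cmd


print(" ".join(flatten_model_command("setname", -12)))
print(" ".join(flatten_model_command("setname", -12,
                                     recon_xy=(1024, 1024),
                                     tilt_xy=(2048, 2048))))
```

The returned list could be passed to subprocess.run on a system where IMOD is installed.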
HISTORY
Written by David Mastronarde, September 2014, to replace an earlier
Fortran program of the same name.
BUGS
Email bug reports to mast at colorado dot edu.
IMOD 5.2.6 findsection(1)