imodholefinder(1)          General Commands Manual          imodholefinder(1)

NAME
       imodholefinder - Find regularly spaced holes in carbon film

SYNOPSIS
       imodholefinder  options  input_image  output_model

DESCRIPTION
       Imodholefinder finds regularly spaced holes in a carbon film in order
       to locate target positions for cryoEM data acquisition.  It is
       principally designed as a test harness for the Holefinder module
       shared with SerialEM, but may also be useful for offline analysis of
       target positions.  It supports square and hexagonal lattices.

       The main output is an IMOD model file in which the first object
       contains points for the set of holes found on each section, the
       second object contains positions located within the regular pattern
       for which a point was not found with sufficiently low error, and the
       third object contains points that were found close to some of those
       missing positions.  Each point is in a separate contour, and one
       general value is stored for each contour.  This value will be the
       relative correlation peak strength by default, but with the
       -intensity option, the value can instead be the mean or standard
       deviation of pixels within the hole, or the fraction of dark pixels
       that appear to be outliers.  Points below a threshold value can be
       turned off and deleted using the Bead Fixer module in 3dmod, and
       points above a value can also be turned off with the Values panel of
       the Edit-Objects dialog of the Model View window.  However, there is
       currently no way to delete the points above that threshold in 3dmod.

       A text file with information about each located point, including all
       four of those value metrics, can be output with the -summary option.
       See its description below.

       When the input images are based on a montage, the raw montage file
       can also be provided in order to improve the result and overcome bad
       alignments between overlapping pieces.
       In this case, the summary output file will also contain the
       coordinates within a piece for each point, probably the best form
       for importing these positions into SerialEM.

       The program locates holes in images that are scaled down if
       necessary so that the holes are no bigger than 50 pixels.  It starts
       with a standard method: filtering to reduce noise, application of a
       Sobel filter and a Canny edge detector, and then a Hough transform of
       this edge image to find circular features.  Filtering can be done
       either by Gaussian smoothing with a specified sigma value or by
       median filtering with a given number of iterations.  The Hough
       transform is implemented by cross-correlation of the edge image with
       a circular template via Fourier transforms, which is essentially
       equivalent to but much faster than direct summation and allows
       further efficiencies when transforms are cached for reuse.  To
       overcome uncertainty in the actual hole diameter and variability in
       the appearance of hole edges, the program does a series of scans with
       circles of given thickness.  By default, it starts with a fairly
       thick circle and a potentially wide-ranging scan to find the best
       diameter; it then refines the diameter with three steps using a
       circle of medium thickness, and finally repeats the process with a
       thinner circle at the refined size.  To improve the result, an
       average is generated from the edge image at the half of the locations
       with the strongest correlations, and this edge average is correlated
       with the edge image to get a new set of positions.

       The positions obtained from searching for peaks in these
       cross-correlations are constrained to be separated by at least 3/4 of
       the nominal spacing between holes, which greatly reduces the number
       of false correlation peaks being considered.  The next stage is to
       analyze these positions to determine the orientation and spacing of
       the regular grid and eliminate points that do not fall near positions
       of that grid.
       In addition, a robust fit to the positions of up to a 5 by 5 array of
       points is used to generate a predicted position for each location on
       the grid if possible, and a point is eliminated if it is not close
       enough to its respective predicted position.  Now that most points
       can confidently be considered actual holes, an alternative analysis
       is done if possible by averaging from the 90% of the positions with
       top correlation scores after removing bright outliers, which should
       correspond to empty holes.  In this case the actual image is
       averaged, and the average template is correlated with that image.
       The detected points are subjected to the same analysis to determine
       which ones fall on a regular grid.  Results from these two different
       average template correlations are then merged and analyzed again to
       find the best of both sets of positions.

       All of the above depends on two critical parameters at the start of
       the process: the initial filtering and the threshold for the Canny
       edge detector.  The best results are obtained by a brute force method
       of doing the entire analysis with a range of values for each
       parameter and picking the values that give the most points.  Once the
       final set of points is obtained, bright outliers are removed, and
       dark outliers are also removed with a very conservative outlier
       criterion.

       When the input file contains montage images and a raw montage file is
       also supplied, the program follows the above with a simplified
       analysis of each montage piece.  It applies the filter and edge
       detector with the final parameters used for the full image, then
       correlates the existing edge average template with the edge image and
       the existing image average template with the actual piece image.
       The results of these correlations are separately analyzed for points
       being on a regular grid, results are merged the same way as for the
       full image, and the grid analysis is repeated on the merged points.
       Outlying bright and dark holes are removed with the same intensity
       criteria determined in the outlier removal for the full image.  After
       all pieces are analyzed, the positions in the pieces are considered
       in conjunction with those from the full image.  Positions within the
       overlap zones between pieces are substituted from the appropriate
       piece if possible to correct for any distortions of the hole image in
       those locations of the full image.  Points found in the piece
       analysis but not in the full image analysis are added to the
       collection.

OPTIONS
       Imodholefinder uses the PIP package for input (see the manual page
       for pip).  Options can be specified either as command line arguments
       (with the -) or one per line in a command file (without the -).
       Options can be abbreviated to unique letters; the currently valid
       abbreviations for short names are shown in parentheses.

       -input (-inp) OR -InputImageFile    File name
              Name of input image file.  If it is not entered with this
              option it must be entered with the first non-option argument.

       -output (-o) OR -OutputModelFile    File name
              Name of output model file.  If it is not entered with this
              option it must be entered with the second non-option argument.

       -boundary (-bo) OR -BoundaryModel   File name
              Model with contours enclosing areas to analyze, one per
              section to be constrained.  A contour will be applied on the
              sections where it is drawn.

       -summary (-su) OR -SummaryOutputFile        File name
              Name of output file for points and values found on each
              section.
              The columns of this file are:
                 X position in input image in pixels
                 Y position in input image in pixels
                 Z value (section number, from 0)
                 Correlation peak value, relative to highest one found on
                   section
                 The mean value within the hole
                 The standard deviation of the pixels in the hole smoothed
                   by 3x3 averaging
                 The fraction of smoothed pixels in the hole considered to
                   be negative outliers
                 The X position index within a regular grid, numbered from 0
                 The Y position index within a regular grid, numbered from 0
                 If raw montage entered, the montage piece on which the
                   point is found
                 If raw montage entered, the X coordinate on that piece, in
                   piece pixels
                 If raw montage entered, the Y coordinate on that piece, in
                   piece pixels

       -diameter (-di) OR -DiameterOfHoles         Floating point
              Nominal diameter of holes.  Enter a positive value to specify
              the diameter in microns, or the negative of the number of
              pixels to specify it in pixels.

       -spacing (-sp) OR -SpacingOfHoles   Floating point
              Nominal spacing between holes.  Enter a positive value for a
              spacing in microns, or the negative of the number of pixels
              for a spacing in pixels.

       -hex OR -HexagonalGrid
              Holes are arranged in a hexagonal rather than a square
              lattice.

       -error (-e) OR -MaximumError        Floating point
              Maximum error allowed between a point and a prediction of
              where it should be located, based on fitting to up to a 5x5
              array of surrounding points.  Enter a positive value for the
              number of microns, or the negative of the number of pixels.
              If this option is not entered, the default is 0.05 times the
              nominal hole diameter, or a different fraction if entered
              with -control.

       -thresh (-thr) OR -ThresholdPercentiles     Multiple floats
              Set of thresholds to try for the Canny edge detector.  The
              thresholds are specified as the percentile of the highest
              edge gradients to consider as strong edges; thus small values
              like 2-6 are appropriate.  The lower threshold, for
              considering points as weak edges, is taken as twice this
              value.  The default is 2.,3.2,4.4.
              When both multiple thresholds and multiple filters are to be
              tried, the program exhaustively tries all combinations.

       -filter (-f) OR -FilterSigmaOrIterations    Multiple floats
              Set of filter values to try: a positive value specifies the
              sigma in pixels to use for Gaussian smoothing; a negative
              value specifies the negative of the number of iterations for
              a 3x3 median filter.  The default is 1.5,2,3,-3, which means
              try Gaussian smoothing with sigmas of 1.5, 2, and 3, then a
              median filter with 3 iterations.  If multiple median filter
              iterations are to be tried, enter them in order of increasing
              iteration number for maximum efficiency.

       -thickness (-thi) OR -CircleThicknesses     Multiple floats
              Set of thicknesses of circles to correlate with the edge
              image, in pixels.  The default is 4,2,1.5.  Multiple values
              should be entered in order of decreasing thickness.  If the
              number of values entered is not 3, then corresponding entries
              must be made for -number and -step.

       -number (-n) OR -NumberOfCircles    Multiple integers
              Set of numbers of circle sizes to try for each thickness.
              The default is 7,3,1.  When more than 5 is specified, as few
              as 5 will be tried if the summed correlation score drops
              monotonically to both sides of the first one tried.  If the
              number of values entered is not 3, then corresponding entries
              must be made for -thickness and -step.

       -step (-st) OR -CircleStepSize      Multiple floats
              Set of increments between circle sizes to try for each
              thickness, in pixels.  The default is 3,1.5,1.  The increment
              is irrelevant when only one size is being used.  Increments
              should be less than their respective thicknesses.  If the
              number of values entered is not 3, then corresponding entries
              must be made for -number and -thickness.
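       The default search space implied by -thresh, -filter, -thickness,
       -number and -step can be tabulated with a short script.  The
       following Python sketch is only an illustration and is not part of
       IMOD; the helper names are hypothetical, and it assumes that the
       circle sizes for each thickness are spread symmetrically about the
       starting diameter, which the text above does not state explicitly.

```python
from itertools import product

# Default values quoted from the option descriptions above.
THRESHOLDS = [2.0, 3.2, 4.4]      # -thresh: percentiles for strong edges
FILTERS = [1.5, 2.0, 3.0, -3.0]   # -filter: >0 Gaussian sigma, <0 median iterations
THICKNESSES = [4.0, 2.0, 1.5]     # -thickness: circle thickness in pixels
NUMBERS = [7, 3, 1]               # -number: circle sizes tried per thickness
STEPS = [3.0, 1.5, 1.0]           # -step: increment between sizes in pixels


def describe_filter(value):
    """Translate one -filter entry into words."""
    if value > 0:
        return f"Gaussian sigma {value:g}"
    return f"median x{int(-value)}"


def parameter_combinations(thresholds, filters):
    """All (filter, threshold) pairs, since every combination is tried."""
    return list(product(filters, thresholds))


def scan_diameters(center, number, step):
    """Diameters for one thickness.

    ASSUMPTION: sizes are placed symmetrically about 'center'.
    """
    half = (number - 1) / 2.0
    return [center + (i - half) * step for i in range(number)]


def steps_are_valid(steps, thicknesses):
    """Check the advice that each increment be less than its thickness."""
    return all(s < t for s, t in zip(steps, thicknesses))


if __name__ == "__main__":
    combos = parameter_combinations(THRESHOLDS, FILTERS)
    print(len(combos), "filter/threshold combinations")
    for filt, thresh in combos:
        print(f"  {describe_filter(filt)}, threshold {thresh:g}")
    # 40 pixels is an arbitrary example diameter, not a program default.
    for thick, num, step in zip(THICKNESSES, NUMBERS, STEPS):
        print(f"thickness {thick:g}:", scan_diameters(40.0, num, step))
```

       With the defaults, the brute force search covers 12 filter and
       threshold combinations, each followed by the three-stage circle
       scan.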

       -sections (-se) OR -SectionsToDo    List of integer ranges
              List of sections to analyze, numbered from 0 (ranges allowed)

       -intensity (-int) OR -DiamForIntensityStats         Two floats
              The diameter of hole region to get statistics from, in
              microns, and the type of value to store in the model in place
              of the correlation peak strength: 0 for the mean, 1 for the
              SD of values smoothed by averaging a 3x3 set of pixels, or -1
              for the fraction of those smoothed values that appear to be
              dark outliers.  If a negative value (e.g., -1) is entered for
              the diameter, it will use twice the difference between the
              determined hole size and maximum error.

       -montage (-m) OR -RawMontageFile    File name
              Original montage file that is the basis for the input images.
              With this entry, the program will find holes on each montage
              piece and use this information to improve the number and
              location of the holes found on the full images.  The -piece
              option must also be entered.

       -piece (-p) OR -PieceListFile       File name
              File with list of nominal piece coordinates for the raw
              montage

       -aligned (-a) OR -AlignedPieceList          File name
              File with list of aligned piece coordinates for the montage,
              as put out with the -AlignedPieceCoordFile option to
              Blendmont, so that the program can relate locations in the
              full images to those in the pieces.  If pieces were not
              aligned to produce the full images (e.g., by Reducemont),
              then this entry is not needed.

       -binned (-bi) OR -FullImageIsBinned         Integer
              Binning of the input images relative to the raw montage.
              This entry is not needed if both image file headers have
              valid pixel spacings that differ by an integer factor.

       -retain (-r) OR -RetainDuplicates
              Keep all points that correspond in overlapping pieces in the
              output model.  Without this option, the point from one piece
              of such a pair will be removed, leaving only the one that is
              potentially substituted or added to the set of points from
              the full image.
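
       The montage options above are meant to be used together.  The
       following Python sketch assembles such a command line using only
       option names documented on this page; the helper function and all
       file names are hypothetical, and the script only prints the command
       rather than running it.

```python
import shlex


def montage_command(image, model, diameter, spacing, raw_montage,
                    piece_list, aligned_list=None, binning=None,
                    retain=False):
    """Build an argument list for a montage-assisted run (hypothetical helper).

    diameter and spacing follow the sign convention of -diameter and
    -spacing: positive for microns, negative for pixels.
    """
    args = ["imodholefinder",
            "-diameter", str(diameter),
            "-spacing", str(spacing),
            # -piece is required whenever -montage is entered.
            "-montage", raw_montage,
            "-piece", piece_list]
    if aligned_list is not None:
        # Only needed if pieces were aligned to make the full image.
        args += ["-aligned", aligned_list]
    if binning is not None:
        # Only needed if the headers lack usable pixel spacings.
        args += ["-binned", str(binning)]
    if retain:
        args.append("-retain")
    # Input image and output model as the two non-option arguments.
    args += [image, model]
    return args


if __name__ == "__main__":
    # All file names below are placeholders.
    cmd = montage_command("grid.mrc", "holes.mod", 1.2, 2.5,
                          "raw.st", "raw.pl",
                          aligned_list="aligned.pl", binning=2)
    print(shlex.join(cmd))
```

       The resulting list could be passed to subprocess.run() on a system
       where IMOD is installed.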

       -show (-sh) OR -ShowPieceObjects
              Turn on the objects containing points from the montage pieces
              and turn off the objects containing the actual set of found
              points, missing, and nearby ones.  By default the latter are
              on and the montage objects are off.

       -control (-c) OR -ControlValue      Two floats
              Parameter number and value for setting algorithm control
              parameters.  Numbers in parentheses are default values.
                1: Target hole diameter to reduce images to if holes are
              bigger than it (50.)
                2: Fraction of nominal diameter to use as maximum error
              when analyzing grid, if the maximum error is not entered
              (0.05)
                3: 0 to use filter/threshold values from first section for
              the rest; 1 to use filter value from first section and keep
              varying threshold; 2 to vary on all sections (2)
                4: Retain FFTs of circle and average templates when
              appropriate (1)
                5: Minimum number of total points for making an average (5)
                6: Fraction of points to average if there are enough; more
              will be done if < minimum (0.5)
                7: Maximum change between nominal and found diameter
              allowed; above this it does the next set of circles at the
              nominal diameter instead of the found one (0.2)
                8: Criterion for omitting positive outliers in mean value
              from the raw average (4.5)
                9: Criterion for removal of negative mean outliers after
              the final grid analysis (9.0)
                10: Criterion for removal of positive mean outliers after
              the final grid analysis (4.5)
                11: Fraction of radius to take pixels from for this outlier
              removal (0.9)
                12: Fraction of average spacing below which points found on
              separate montage pieces are considered to be the same (0.5)
                13: Fraction of average spacing below which a point found
              on a montage piece is considered to be the same as a point
              found in the full image (0.5)
                14: Fraction of radius used as criterion distance from edge
              of overlap zone for substituting a point found on a piece for
              the same point on the full image; 0 means a point right on
              edge qualifies, a positive value admits points not in overlap
              (0.)
                15: Fraction of radius used as criterion distance from edge
              of image for using a point found only on a piece (adding or
              substituting it) (0.25)
                16: Maximum fractional extent into overlap for adding a
              point from a piece not found on the full image (and not
              eliminated because it was identified as the same as one on
              the overlapping piece) (0.75)
              (Successive entries accumulate)

       -verbose (-v) OR -VerboseOutput     Integer
              1 for verbose output; 2 for more verbose output

       -debug (-de) OR -DebugImages        Integer
              1-3 for output of intermediate images.  Most images come out
              with any value; the splitfill and crosscorr files are
              specific to the value: 1 for crosscorr from ideal hole
              templates, 2 or 3 for splitfill and crosscorr from the
              average edge or raw average templates.

       -help (-hel) OR -usage
              Print help output

       -StandardInput
              Read parameter entries from standard input

AUTHOR
       David Mastronarde

SEE ALSO
       blendmont, reducemont

BUGS
       Email bug reports to mast at colorado dot edu.

IMOD                                 5.0.2                  imodholefinder(1)