pickbestseed(1)             General Commands Manual            pickbestseed(1)

NAME
       pickbestseed - Selects best seed points for autofidseed

SYNOPSIS
       pickbestseed  options

DESCRIPTION
       Pickbestseed is used by Autofidseed to select an optimal set of seed
       points of the desired number, using bead models tracked starting from
       several different views and using information on the quality of the
       tracking, the shape of the beads, and on which of two surfaces they
       are located.  The program operates with the following steps:

       1) It reads in the different tracked models and identifies cases where
       the same bead was tracked in different models, as signified by tracks
       that are sufficiently close to each other over the whole range of
       views tracked (controlled by parameters 1 and 6 below).

       2) Each unique bead becomes a candidate, and a single score is
       obtained by taking a weighted mean of four measures of the consistency
       and quality of tracking: the mean residual from alignment during
       tracking; the fraction of models in which the bead is missing; the
       fraction of points missing from each track of the bead; and the mean
       distances between corresponding points in the tracks, computed from
       each pair of tracks.  By default, these measures are given equal
       weighting.  A lower score is better.  When beads are located on two
       surfaces, each candidate is assigned to a surface based on which
       assignment predominates in the multiple surface analyses available.

       3) The total area to be analyzed is measured, so that it is possible
       to convert between number of beads and average density per unit area.

       4) Information on the elongation of beads is analyzed, namely
       statistics on the standard deviation of pixels around the bead and on
       the elongation.
       An adjusted value for the mean of the elongation over the 11 views of
       a track is computed by, in effect, rotating a plot of mean elongation
       versus the standard deviation of the elongation by an angle (parameter
       18 below) that removes much of the variability due to the SD of the
       elongation, and using the new Y coordinate as the adjusted value.  The
       same operation is performed with the SD of pixels around the beads to
       obtain an adjusted edge SD (rotation angle controlled by parameter
       19).  The latter values are scaled to have the same standard deviation
       as the adjusted elongation values, and then a plot of adjusted
       elongation versus scaled, adjusted SD is rotated so as to combine them
       into a single measure (parameter 20 below), with a final elongation
       measure taken as the Y coordinate after rotation.  This measure is
       analyzed by taking the median, finding the median absolute deviation,
       and considering values as outliers if their deviation from the median
       is more than the normalized median absolute deviation (MADN) times a
       criterion (parameter 12 below).  A candidate is marked as elongated if
       it is an outlier or if the median elongation exceeds an absolute
       threshold (parameter 16 below).

       5) Beads are identified as clustered if they are within a criterion
       distance (parameter 5 below) of any other bead on the same surface.
       This distance is evaluated at the highest tilt angle of the series,
       assuming that the distance between beads perpendicular to the tilt
       axis is foreshortened by tilting.

       6) Elongation is analyzed again, considering only points that are not
       identified as clustered.  When there are many clustered points, which
       also tend to have high elongation values, this can skew the criterion
       for identifying outliers, so analyzing unclustered points separately
       can identify additional outliers.  Each unclustered point is given the
       maximum of the elongation score from the two analyses.
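
       The outlier test described in step 4 can be sketched as follows.  This
       is only an illustration, not the program's actual code: the function
       names are hypothetical, the MADN is assumed to use the conventional
       normalization (MAD / 0.6745), and only values above the median are
       flagged, which is one reading of "deviation from the median".  The
       defaults mirror parameters 12 and 16.

```python
from statistics import median


def madn_outliers(values, criterion=2.24):
    """Return one flag per value: True if it lies more than
    criterion * MADN above the median (hypothetical helper)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    # Conventional normalization of the MAD; assumed, not taken from the source.
    madn = mad / 0.6745
    # Assumption: only deviations above the median count as outliers.
    return [(v - med) > criterion * madn for v in values]


def mark_elongated(final_measures, median_elongations,
                   criterion=2.24, abs_threshold=2.5):
    """Mark a candidate as elongated if its final measure is an outlier
    or its median elongation exceeds the absolute threshold."""
    flags = madn_outliers(final_measures, criterion)
    return [is_out or elong > abs_threshold
            for is_out, elong in zip(flags, median_elongations)]


if __name__ == "__main__":
    measures = [1.0, 1.1, 0.9, 1.05, 0.95, 3.0]
    elongations = [1.0, 2.6, 1.0, 1.0, 1.0, 1.0]
    print(mark_elongated(measures, elongations))
```

       In this sketch the last candidate is flagged because its measure is
       far above the median, and the second because its median elongation
       exceeds the absolute threshold.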

       7) Beads are sorted in order by their overall score and two different
       procedures are used to accept points in the final model, in a set of
       phases.  Only unclustered, unelongated beads are considered for the
       first 4 phases.  The median and MADN of all the scores are used to
       define a maximum acceptable score in these phases (8 MADNs from the
       median, by default).  Once some points have been accepted, the program
       computes a continuous 2D density function from the points using kernel
       density estimation with a triweight kernel.  Specifically, for each
       bead, a component proportional to (1 - (dist / H)^2)^3 is added for a
       point at distance "dist" from the bead.  H is chosen based on the
       target spacing to be achieved between points in the current phase,
       times parameter 10 below.

       7.1) Phase 1: The target density is converted to an equivalent spacing
       and beads are accepted in order by their score, provided that they are
       not closer to an already-added bead than a certain fraction of this
       spacing (parameter 14 below).  This procedure is then repeated with
       the best half of the candidates, adding them if they are not closer to
       an added bead than a lower fraction of the target spacing (parameter
       15 below).  When there are beads on two surfaces, this procedure is
       done separately for each surface.

       7.2) Phase 2: A gap-filling routine is used to add further points up
       to a desired density, if necessary.  This routine repeatedly finds the
       point with lowest density, then searches out from that point in
       successively wider rings for a bead to add.  The ring spacing is the
       target spacing times parameter 3 below.  If multiple beads are found
       in a ring, they are prioritized by an adjusted score, which is their
       overall score divided by the distance to the nearest accepted bead.
       In addition, if clustered and overlapped points are being accepted (in
       a later phase), the score is increased for a clustered or overlapped
       point, and a point within the clustering distance of another accepted
       point is simply excluded.  After a search is done in one location,
       points within a certain distance of the density minimum are excluded
       from further consideration on that call of the gap-filling routine.
       (This criterion distance is the target spacing times parameter 2
       below.)  The search is terminated when all density minima below a
       fraction of the target density are examined.  This fraction is
       parameter 8 below, but if there are two surfaces and the ratio of
       minority to majority surface is less than parameter 13 below, it uses
       the higher fraction in parameter 9 instead for the minority surface.
       The gap-filling routine is called twice, once allowing two rings, then
       allowing the number of rings in parameter 4 below.

       7.3) Phase 3: If there are two surfaces, it now tries to beef up the
       number on the majority surface to make up for a deficiency on the
       minority surface.  The target number for this surface is the full
       target number minus the number on the minority surface, unless the
       "-nobeef" option is entered, in which case the target is still half
       the full target.  First it calls the routine that considers points in
       order by score and adds them if their distance from other points is
       high enough.  Then it calls the gap-filling routine, but now density
       is computed from points on both surfaces so that it can fill gaps left
       by the beads on the minority surface preferentially.  Again, the
       gap-filling routine is called twice with two different numbers of
       rings.

       7.4) Phase 4: If points are still deficient, it calls the gap-filling
       routine, examining points with density up to the higher fraction
       (parameter 9) of the target density.  A revised target is used for the
       majority surface unless "-nobeef" is entered, and the original target
       is used for the minority surface.
       Densities are computed per surface, and the routine is called only
       once with the full number of rings.

       7.5) Phases 5-8: If clustered and/or elongated points are allowed to
       be included, then it runs the same procedure as in phase 4, first
       allowing clustered points if they are allowed, and then elongated
       points with progressively higher elongation numbers, which are based
       on the fraction of tracked models in which the bead was identified as
       elongated.  In these phases, a larger maximum score is allowed (12
       MADNs above the median by default), since most beads with high scores
       fall in these categories.

       8) Accepted points are put into the output model, along with a general
       value equal to the inverse of the score.  With two surfaces, points on
       the top surface are given surface number 1, which is assigned magenta
       color.

OPTIONS
       Pickbestseed uses the PIP package for input (see the manual page for
       pip).  Options can be specified either as command line arguments (with
       the -) or one per line in a command file (without the -).  Options can
       be abbreviated to unique letters; the currently valid abbreviations
       for short names are shown in parentheses.

       -tracked (-tr) OR -TrackedModel     File name
              Name of tracked model file from one Beadtrack run.  This entry
              is needed for each run to be included in the analysis.
              (Successive entries accumulate)

       -surface (-su) OR -SurfaceFile      File name
              Name of file with surface information from one Sortbeadsurfs
              run.  There must be the same number of surface file entries as
              tracked models.  (Successive entries accumulate)

       -resid (-re) OR -ElongationFile     File name
              Name of file with residual and elongation data from one
              Beadtrack run.  There must be the same number of elongation
              file entries as tracked models.  (Successive entries
              accumulate)

       -output (-o) OR -OutputSeedModel    File name
              Name of final output model file

       -append (-a) OR -AppendToSeedModel
              Read in existing output seed model and add points to it.  All
              points will be retained from this model.
              Candidate points that match these points will be accepted
              before phase 1, then the regular sequence of phases will be
              followed to reach the target number.

       -size (-si) OR -BeadSize    Floating point
              Diameter of beads in pixels in the images where beads were
              found

       -image (-i) OR -ImageSizeXandY      Two integers
              X and Y dimensions of image file used for finding and tracking
              beads

       -border (-bor) OR -BordersInXandY   Two integers
              Number of pixels to exclude on each side in X and in Y

       -middle (-m) OR -MiddleZvalue       Integer
              Z value of middle section for tracking, numbered from 0

       -zseed (-z) OR -SeedZvalue          Integer
              Z value of seed for one Beadtrack run, numbered from 0.  If
              this option is entered at all, it must be entered for each
              tracked model.  (Successive entries accumulate)

       -two (-tw) OR -TwoSurfaces
              Try to sort beads onto two surfaces then select a seed model
              that has equal numbers of beads on the two surfaces if
              possible.

       -boundary (-bou) OR -BoundaryModel          File name
              Name of model file whose first object contains contours
              enclosing areas in which to use or to exclude beads, depending
              on whether -exclude is entered.  If more than one contour is
              drawn on a view, points inside any one of the contours will be
              considered inside the area.  This program will use only the
              contours on the view closest to the middle section for
              tracking.

       -exclude (-ex) OR -ExcludeInsideAreas
              Use the contours in the boundary model to define regions to
              exclude from analysis rather than regions to include.

       -counting (-cou) OR -BoundaryForCounting
              Use the contours in the boundary model just for counting
              candidates inside and outside the boundary when outputting a
              candidate model.

       -number (-nu) OR -TargetNumberOfBeads       Integer
              Desired total number of beads to choose for output seed model.
              If beads are on two surfaces, the program will seek to find
              half the target number on each surface, then pick more beads on
              either surface to reach the target.  Either this option or
              -density must be entered.

       -density (-d) OR -TargetDensityOfBeads      Floating point
              Desired density of beads in final seed model per 1000 square
              pixels of area, excluding the area outside boundary contours if
              any.  This option provides an alternative way of specifying the
              target that is independent of data set size.

       -nobeef (-no) OR -LimitMajorityToTarget
              Do not increase the number of beads on the surface with more
              beads to make up for a deficiency on the other surface.
              Autofidseed(1) uses this option to limit the number of beads on
              the majority surface in response to its -ratio option.

       -elongated (-el) OR -ElongatedPointsAllowed         Integer
              Enter 1, 2, or 3 to include beads identified as elongated in up
              to 1/3, up to 2/3, or all of the Beadtrack runs, respectively.

       -cluster (-cl) OR -ClusteredPointsAllowed   Integer
              Enter 1 to include clustered beads, i.e., ones that appear to
              be located within 2 diameters of other beads, where
              foreshortening perpendicular to the tilt axis is taken into
              account in computing this separation.  Only one of a pair of
              clustered points will be accepted.  If -elongated is not
              entered, 2, 3, or 4 can be entered to also include beads
              identified as elongated in up to 1/3, up to 2/3, or all of the
              Beadtrack runs, respectively.

       -lower (-l) OR -LowerTargetForClustered     Floating point
              Include clustered and elongated points as allowed by the
              -cluster and -overlap options only when the total number of
              beads is still below the reduced target given here.  The value
              entered should be in the same form as the regular target was
              specified, i.e., a number of beads if -number was entered or a
              bead density if -density was entered.

       -rotation (-rot) OR -RotationAngle          Floating point
              Angle of rotation of the tilt axis in the images; specifically,
              the angle from the vertical to the tilt axis (counterclockwise
              positive).

       -highest (-hi) OR -HighestTiltAngle         Floating point
              Absolute value of highest tilt angle

       -weights (-w) OR -WeightsForScore   Multiple floats
              Alternative weights for composing a score for each candidate
              bead.  Enter 4 weights: for fraction of points missing in a
              track; for fraction of Beadtrack runs from which the point is
              missing; for mean residual during bead tracking; and for the
              mean deviation between the different tracks of the same bead.
              The default weights are all 1.

       -control (-con) OR -ControlValue    Two floats
              Parameter number and value for setting algorithm control
              parameters.  Parameters and their numbers (and default values
              in parentheses; float parameters have decimal points) are:
                1: Deviation between points as fraction of bead diameter for
              tracks to be close (0.5)
                2: Multiple of target spacing at which to exclude points from
              further searches (0.75)
                3: Width of rings for finding points when filling gaps, as
              fraction of target spacing (0.25)
                4: Number of rings to search (4)
                5: Maximum # of bead diameters separation for points to be
              considered clustered (1.375)
                6: Fraction of points that must be close in two tracks for
              them to be considered same (0.6)
                7: Scaling factor for the two elongation criteria (parameters
              12 and 16), applied to the default or entered values (1.0).
                8: Maximum fraction of target density at which to add points
              in initial phase (0.9)
                9: Higher fraction of target at which to add points in more
              desperate searches (1.1)
                10: Scaling from desired spacing to H for kernel density
              computation (1.3)
                11: Scaling from desired spacing to density grid spacing
              (0.2)
                12: Criterion for edge SD values or elongations to be
              considered outliers (2.24)
                13: Ratio of minority to majority for using higher density
              factor (0.65)
                14: Fraction of nominal spacing allowed for initial addition
              of points (0.85)
                15: Fraction of spacing for adding best half of points on
              next phase (0.7)
                16: Absolute threshold for elongation to be considered
              overlap (2.5)
                17: Option flags: the sum of 1 for fitting elongation
              measures versus bead integral and replacing measures with the
              residual of the fit (which does not help), and 2 for analyzing
              the elongation measure in successive groups of at least 50
              values, when values are arranged in order by bead integral
              (which does not help)
                18: Angle to rotate plot of mean of edge SD versus SD of edge
              SD to obtain an adjusted edge SD (-59.0 degrees, the mean from
              7 data sets)
                19: Angle to rotate plot of mean versus SD of elongation
              measure to obtain an adjusted elongation measure (-67.0
              degrees, the mean of 8 data sets)
                20: Angle to rotate plot of adjusted elongation measure
              versus adjusted edge SD to obtain the final elongation measure
              to analyze for outliers (45.0 degrees, corresponds to simply
              averaging the two adjusted values)
                21: Maximum number of MADNs above the median score allowed to
              accept a candidate point in phases 1-4 (8.0).
                22: Maximum number of MADNs above the median score allowed to
              accept a clustered or elongated candidate point in phases 5-8
              (12.0).
              (Successive entries accumulate)

       -phase (-p) OR -PhaseOutput
              Color output points by the phase in which they were added as
              well as by their surface.
              Available colors in order are green, magenta, yellow, cyan,
              red, solid blue, orange, purple, dark blue, salmon, dark red.
              One or two colors will be used for each phase, depending on
              whether beads are sorted into two surfaces.  If more than 10
              colors are needed, the 9 after green are reused.

       -root (-roo) OR -DensityOutputRootname      Text string
              Root name for output of density maps in gnuplot format

       -candidate (-ca) OR -CandidateModel         File name
              Filename for an output model with all candidate beads sorted by
              clustering and elongation scores.  Points will be assigned to
              up to 8 model surfaces.  Surface numbers 0 to 3 are for
              non-clustered points with elongation values of 0 to 3, and are
              colored dark green, magenta, bright green, and yellow,
              respectively.  Numbers 4 to 7 are for clustered points with
              elongation values of 0 to 3, and are colored mustard green,
              red, light blue, and orange.  Open the Surface/Contour/Point
              dialog in 3dmod (Edit-Surface-Go To) to navigate to contours
              within and between surfaces and to see labels for the surfaces.

       -verbose (-v) OR -VerboseOutput     Integer
              1 for verbose output including lists of candidates and their
              properties; 2 for more verbose output from addPointsInGaps
              routine.

       -help (-he) OR -usage
              Print help output

       -StandardInput
              Read parameter entries from standard input

AUTHOR
       David Mastronarde

SEE ALSO
       autofidseed

       Email bug reports to mast at colorado dot edu.

IMOD                                 5.2.0                     pickbestseed(1)