Name

sxk_means - K-means classification of a set of images

Usage

Usage on the command line:

sxk_means.py stack outdir <maskfile> --K=10 --trials=2 --debug --maxit=100 --rand_seed=10 --crit='all' --init_method='rnd' --normalize --CTF --MPI

Usage in Python programming:

k_means_main(stack, out_file, maskname,"SSE", K1, K2, rand_seed, maxit, trials, CTF=False, MPI=False, DEBUG=False, flagnorm=False)

Usage of MPI:

1. Set the --MPI flag on the command line.

2. Launch the program through mpirun: mpirun -np 4 sxk_means.py followed by the remaining parameters.

The above example is for mympi.

Example:

sxk_means.py hri_stack.hdf RES mask2d_23.hdf --K=128 --maxit=500 --crit="D"

mpirun -np 4 sxk_means.py bdd:hri_stack RES mask2d_23.hdf --K=128 --maxit=1000 --rand_seed=100 --MPI

Note 1: if the 2D input images have been aligned (see sxali2d), the program applies the 2D alignment parameters (xform.align2d) stored in the image headers before clustering.

Note 2: CTF is not implemented.

Input

stack
The input stack of images
maskfile
optional mask file to be used
outdir
name of the directory where the results are written
  • The parameters preceded by -- are optional; default values are given in parentheses.

  • K
    The requested number of clusters (default 2).
    trials
    number of trials of K-means (see description below) (default one trial). The MPI version ignores --trials; the number of trials equals the number of CPUs used.
    maxit
    maximum number of iterations the program will perform (default 100)
    CTF
    if set, CTF information stored in file headers will be used (default no CTF).
    rand_seed
    the seed used to generate random numbers (-1 means a different pseudo-random seed each time)
    crit

    names of the criteria to use: 'all' for all criteria, 'C' for Coleman, 'H' for Harabasz, or 'D' for Davies-Bouldin. Each criterion returns a measure of classification quality (see also sxk_means_groups). Any combination is accepted, e.g. 'CD', 'HC', 'CHD'.

    MPI
    use the MPI version of k-means (default False). The MPI version is parallelized over trials.
    normalize
    Normalize images under the mask
    init_method
    Method used to initialize the partition: "rnd" for random initialization or "d2w" for d2-weighting initialization (default "rnd")
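To make the roles of K, maxit, rand_seed and init_method concrete, here is a minimal NumPy sketch of a single K-means trial. It is an illustration only, not the code in sxk_means.py. The function names init_centers and k_means are hypothetical, "rnd" is assumed to mean K distinct images drawn at random as seeds, and "d2w" is assumed to mean seeding with probability proportional to the squared distance to the nearest seed already chosen.

```python
import numpy as np

def init_centers(data, k, method, rng):
    """Pick k initial centers (hypothetical helper, not the SPARX code)."""
    n = data.shape[0]
    if method == "rnd":
        # Assumption: "rnd" draws k distinct images at random.
        return data[rng.choice(n, size=k, replace=False)].copy()
    # Assumption: "d2w" draws each new center with probability proportional
    # to its squared distance from the nearest center chosen so far.
    centers = [data[rng.integers(n)]]
    for _ in range(k - 1):
        d2 = np.min([((data - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(data[rng.choice(n, p=d2 / d2.sum())])
    return np.array(centers)

def k_means(data, k, maxit=100, method="rnd", seed=10):
    """One K-means trial on rows of `data`; returns labels, centers, SSE."""
    data = np.asarray(data, dtype=float)
    # Mirrors the documented rand_seed convention: -1 gives a fresh seed.
    rng = np.random.default_rng(None if seed == -1 else seed)
    centers = init_centers(data, k, method, rng)
    labels = np.full(data.shape[0], -1)
    for _ in range(maxit):
        # Squared distance of every image to every center.
        d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the trial has converged
        labels = new_labels
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave an empty class where it is
                centers[j] = members.mean(axis=0)
    sse = ((data - centers[labels]) ** 2).sum()
    return labels, centers, sse
```

Each row of data would hold the pixels of one image, for instance only the pixels under the mask. Running several trials with different seeds and comparing the returned SSE is one way to judge how stable a partition is.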

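Two of the criteria named under crit have widely used textbook definitions, sketched below with NumPy. This is a hedged illustration: the exact formulas and scaling used by sxk_means may differ, the function names harabasz and davies_bouldin are hypothetical, and the Coleman criterion is left out because its definition is not given here.

```python
import numpy as np

def harabasz(data, labels):
    """Textbook Calinski-Harabasz index: larger means better separation."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n, k = data.shape[0], len(classes)
    grand_mean = data.mean(axis=0)
    tr_b = tr_w = 0.0
    for c in classes:
        members = data[labels == c]
        center = members.mean(axis=0)
        # Between-class and within-class sums of squares.
        tr_b += len(members) * ((center - grand_mean) ** 2).sum()
        tr_w += ((members - center) ** 2).sum()
    return (tr_b / (k - 1)) / (tr_w / (n - k))

def davies_bouldin(data, labels):
    """Textbook Davies-Bouldin index: smaller means better separation."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centers = np.array([data[labels == c].mean(axis=0) for c in classes])
    # Mean distance of each class's members to that class's center.
    scatter = np.array([
        np.linalg.norm(data[labels == c] - centers[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])
    k = len(classes)
    worst = []
    for i in range(k):
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(k) if j != i
        ]
        worst.append(max(ratios))
    return float(np.mean(worst))
```

Both functions take the image data as rows and one class label per row. Both need at least two classes, and harabasz also needs more images than classes.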
    Output

    outdir
    The directory to which the averages and the variances of the K clusters are written. The classification charts are written to the logfile. Warning: if the output directory already exists, the program aborts with an error message. Choose a different directory name and restart the program.

    The program will write two kinds of image stack files:

    The averages have the following attributes set:

    The variances have the following attributes set:

    Description

    Reference

    Author / Maintainer

    Julien Bert, Guozhi Tao

    Keywords

    category 1
    APPLICATIONS

    Files

    statistics.py, sxk_means.py

    See also

    sxk_means_groups sxk_means_stable

    Maturity

    beta
    works for author, often works for others.

    Bugs

    HDF file: an HDF header can hold only a limited number of items (~16000). If 'members' (the list of images assigned to each class) contains more than 16000 elements, all assignments are automatically exported to text files: kmeans_grp_00.txt, kmeans_grp_01.txt, etc. Each file contains the list of image IDs assigned to that class.
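When the assignments end up in text files, they can be collected with a few lines of Python. The sketch below is an assumption-laden example: it supposes the kmeans_grp_NN.txt files sit in the output directory and contain whitespace-separated integer image IDs, and the helper name read_groups is hypothetical.

```python
import glob
import os

def read_groups(outdir):
    """Map class number -> list of image IDs from kmeans_grp_NN.txt files."""
    prefix, suffix = "kmeans_grp_", ".txt"
    groups = {}
    for path in sorted(glob.glob(os.path.join(outdir, prefix + "*" + suffix))):
        name = os.path.basename(path)
        # The class number is the part between the prefix and the extension.
        class_id = int(name[len(prefix):-len(suffix)])
        with open(path) as handle:
            # Assumption: IDs are integers separated by spaces or newlines.
            groups[class_id] = [int(tok) for tok in handle.read().split()]
    return groups
```

For example, read_groups("RES") would return a dictionary such as {0: [...], 1: [...]} if RES holds kmeans_grp_00.txt and kmeans_grp_01.txt.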

    sxk_means (last edited 2015-05-28 20:14:50 by penczek)