Feature extraction
In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction.
When the input data to an algorithm is too large to be processed and is suspected to be highly redundant (much data, but not much information), the input data is transformed into a reduced representation: a set of features, also called a feature vector. Transforming the input data into the set of features is called feature extraction. If the features are chosen carefully, the feature set is expected to capture the relevant information in the input data, so that the desired task can be performed using this reduced representation instead of the full-size input.
General
Feature extraction involves reducing the amount of resources required to describe a large set of data accurately. When performing analysis of complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computation power, and it may cause a classification algorithm to overfit the training sample and generalize poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables to get around these problems while still describing the data with sufficient accuracy.
Best results are achieved when an expert constructs a set of application-dependent features. If no such expert knowledge is available, general dimensionality reduction techniques may help. These include:
- Principal components analysis
- Semidefinite embedding
- Multifactor dimensionality reduction
- Nonlinear dimensionality reduction
- Isomap
- Kernel PCA
- Latent semantic analysis
- Partial least squares
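As an illustration of the first technique in the list above, the following is a minimal sketch of principal components analysis reduced to its first component. It assumes small in-memory data and uses power iteration on the covariance matrix rather than a full eigendecomposition; the function names `first_principal_component` and `project` and the sample data are hypothetical and chosen only for this example.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the highest-variance axis of `rows`."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by cov converges to the
    # eigenvector with the largest eigenvalue (assumed to be unique here).
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all samples identical: no variance to explain
            break
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, direction):
    """Map each sample to a single feature: its coordinate along `direction`."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]

# Illustrative 2-D samples lying close to the line y = 2x.
samples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0), (5.0, 10.1)]
mean, pc = first_principal_component(samples)
features = project(samples, mean, pc)  # one number per sample instead of two
```

Here each two-variable sample is replaced by one extracted feature. Keeping more components would require deflation or a library eigensolver, which this sketch leaves out.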
Image processing
Feature extraction is used in image processing, where algorithms detect and isolate desired portions or shapes (features) of a digitized image or video stream. It is particularly important in optical character recognition.
Low-level
Curvature
- Edge direction, changing intensity, autocorrelation.
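Edge direction and changing intensity can be estimated from the local intensity gradient. The sketch below assumes a grayscale image stored as a list of rows and uses central differences on interior pixels only; the function name `gradient_features` and the test image are hypothetical. Note that the gradient direction it reports points across the edge, that is, perpendicular to the edge itself.

```python
import math

def gradient_features(image):
    """Return {(row, col): (magnitude, direction_in_radians)} for interior pixels."""
    height, width = len(image), len(image[0])
    result = {}
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the partial derivatives.
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            result[(y, x)] = (math.hypot(gx, gy), math.atan2(gy, gx))
    return result

# Illustrative image: a vertical step edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
edge_features = gradient_features(image)
magnitude, direction = edge_features[(1, 1)]  # magnitude 5.0, direction 0.0 rad
```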
Image motion
- Motion detection. Area based, differential approach. Optical flow.
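One of the simplest differential approaches to motion detection is frame differencing, sketched below. This is not optical flow: it only marks pixels whose intensity changed between two frames and says nothing about the direction of movement. The function name `motion_mask`, the threshold value, and the sample frames are assumptions made for the example.

```python
def motion_mask(prev_frame, next_frame, threshold=20):
    """Mark pixels whose absolute intensity change exceeds `threshold`."""
    return [[abs(b - a) > threshold for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(prev_frame, next_frame)]

# Illustrative 3x3 grayscale frames: only the center pixel changes.
prev_frame = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
next_frame = [[10, 10, 10], [10, 200, 10], [10, 12, 10]]
mask = motion_mask(prev_frame, next_frame)
changed_pixels = sum(cell for row in mask for cell in row)
```

The threshold keeps small intensity fluctuations, such as the change from 10 to 12 above, from being reported as motion.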
Shape-based
Thresholding
Blob extraction
Template matching
Hough transform
- Lines
- Circles and ellipses
- Arbitrary shapes (Generalized Hough Transform)
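For the line case listed above, a Hough transform lets each edge point vote for every line that could pass through it, using the normal form rho = x cos(theta) + y sin(theta). The following is a minimal sketch under simplifying assumptions: edge points are already extracted, theta is sampled in one-degree steps, and rho is binned by rounding. The function name `hough_lines` and its parameters are hypothetical.

```python
import math
from collections import Counter

def hough_lines(points, theta_steps=180, rho_resolution=1.0):
    """Accumulate votes in (rho_bin, theta_bin) space for the given points."""
    votes = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_resolution), t)] += 1
    return votes

# Illustrative edge points: ten on the horizontal line y = 3, plus two outliers.
points = [(x, 3) for x in range(10)] + [(7, 8), (2, 6)]
votes = hough_lines(points)
strongest = max(votes.values())
# Bins that collected the most votes; they cluster around theta = 90 degrees.
peaks = sorted(key for key, count in votes.items() if count == strongest)
```

Because of the coarse binning, several neighboring theta bins tie for the maximum. A practical detector would typically apply non-maximum suppression to the accumulator, which this sketch omits.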
Flexible methods
- Deformable, parameterized shapes
- Active contours (snakes)
See also
- Dimensionality reduction
- Feature detection
- Feature selection
- Data mining
- Connected component labeling

