Knowledge discovery

Knowledge discovery is a concept in the field of computer science that describes the process of automatically searching large volumes of data for patterns that can be considered knowledge about the data. The best-known application of knowledge discovery is data mining, also known as Knowledge Discovery in Databases (KDD).

Another promising application of knowledge discovery is software modernization, which involves understanding existing software artifacts. This process is related to the concept of reverse engineering. The knowledge obtained from existing software is usually presented in the form of models that can be queried when necessary. An entity-relationship model is a common format for representing knowledge obtained from existing software. The Object Management Group (OMG) developed the Knowledge Discovery Metamodel (KDM) specification, which defines an ontology for software assets and their relationships for the purpose of performing knowledge discovery in existing code. Knowledge discovery from existing software systems, also known as software mining, is closely related to data mining, since existing software artifacts contain enormous business value that is key to the evolution of software systems. Instead of mining individual data sets, software mining focuses on metadata, such as database schemas.
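As a rough illustration of mining metadata rather than data, the following sketch recovers a simple entity-relationship view from a database schema. It assumes an SQLite database and uses Python's standard sqlite3 module; the function name extract_er_model and the customer/invoice tables are hypothetical, chosen only for this example, and the result is not a KDM model.

```python
import sqlite3

def extract_er_model(conn):
    """Recover entities and relationships from schema metadata only.

    Hypothetical helper for illustration: it never reads table rows,
    only the catalog and PRAGMA metadata that SQLite exposes.
    """
    entities = {}
    relationships = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        entities[table] = [col[1] for col in columns]
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            relationships.append((table, fk[3], fk[2], fk[4]))
    return entities, relationships

# Made-up schema standing in for an existing system's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
""")
entities, relationships = extract_er_model(conn)
```

Here entities maps each table to its column names, and each relationship is a tuple of (source table, source column, target table, target column). A model in this form can then be queried without touching the underlying data.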

Knowledge discovery is the process of deriving knowledge from input data. Some forms of knowledge discovery create abstractions of the input data. In some scenarios, the discovered knowledge itself becomes additional data that can be used for further discovery.
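A minimal sketch of deriving a pattern from input data is shown below. It counts how often pairs of items occur together and keeps the pairs that reach a support threshold. This is only one simple form of pattern search, and the function name frequent_pairs, the min_support parameter, and the shopping-basket data are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Made-up input data: each inner list is one transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
]
patterns = frequent_pairs(transactions, min_support=3)
```

The returned dictionary is an abstraction of the input: it no longer lists transactions, only the co-occurrence patterns found in them, and it could itself serve as input to a later discovery step.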

Knowledge discovery is a complex topic that can be further categorized according to (1) what kind of data is searched and (2) in what form the result of the search is represented.

Input data for knowledge discovery

 * Databases
 * Relational data
 * Database
 * Document warehouse
 * Data warehouse
 * Software
 * Text
 * Concept mining
 * Graphs
 * Molecule mining
 * Sequences
 * Data stream mining
 * Learning from time-varying data streams under concept drift
 * Web

Output formats for discovered knowledge

 * Data model
 * Metadata
 * Metamodels
 * Ontology
 * Knowledge representation
 * Business rule
 * Knowledge Discovery Metamodel (KDM)
 * Business Process Modeling Notation (BPMN)
 * Intermediate representation
 * Resource Description Framework (RDF)
 * Software metrics
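
To illustrate one of the output formats listed above, the sketch below writes discovered facts as subject-predicate-object triples in N-Triples syntax, a line-based serialization of RDF. It is a simplified sketch that handles only IRI terms, not literals or blank nodes. The helper name to_ntriples and the http://example.org/schema/ identifiers are hypothetical.

```python
def to_ntriples(triples):
    """Serialize (subject, predicate, object) IRI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

# Hypothetical namespace and facts, used only for illustration.
EX = "http://example.org/schema/"
triples = [
    (EX + "invoice", EX + "references", EX + "customer"),
]
document = to_ntriples(triples)
```

Each output line states one fact, so results from separate discovery runs can be combined by concatenating their lines.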