(2015). Data Mining in Scientific Databases for Knowledge Discovery, the Case of Interpreting Support Vector Machines via Genetic Programming as Simple Understandable Terms.

Data Mining in Scientific Databases for Knowledge Discovery, the Case of Interpreting Support Vector Machines via Genetic Programming as Simple Understandable Terms

TALEBZADEH, SAEED
2015-01-01

Abstract

The Support Vector Machine (SVM) is a powerful classification technique in data mining. Over recent decades it has been applied in scientific research across a wide variety of fields, where it has proved efficient and accurate. Despite these advantages, SVM has an important drawback: its results are difficult to interpret. This disadvantage is most pronounced in scientific applications whose main purpose is knowledge discovery from databases rather than classification alone. This research attempts to address that weakness with an innovative and practical technique. In this methodology, an adequate number of points on the classification decision boundary are first identified, and a Symbolic Regression (SR) technique is then applied to those points. In the method as currently developed, Genetic Programming (GP), an evolutionary algorithm, performs the symbolic regression step. The resulting procedure and algorithm, called the SVM-GP methodology, presents the results of a Support Vector Machine as a simple, easily understandable algebraic equation. The performance and efficiency of the methodology were tested on several synthetic databases under different conditions, which led to versatile code for solving high-dimensional problems. The algorithm was then applied to the classification of three real-world cases, and interesting knowledge was extracted from them. The discovered information is useful for research in the related fields, and it also provides evidence of the importance of the SVM-GP methodology and of its performance, efficiency, and accuracy in scientific applications.
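
The two steps named in the abstract (locating points on the decision boundary, then running symbolic regression on them) can be illustrated with a minimal sketch. It is not the thesis's implementation. The `decision` function is a hypothetical stand-in for a trained SVM's decision function, the boundary search uses simple bisection along one axis, and the genetic programming loop (tuple-based expression trees, tournament selection, subtree crossover and mutation, elitism) is a toy version with assumed parameter values.

```python
import random

def decision(x, y):
    # Hypothetical stand-in for a trained SVM's decision function.
    # Its sign gives the class; the hidden boundary here is y = x*x + x.
    return y - (x * x + x)

def boundary_points(f, xs, y_lo=-10.0, y_hi=10.0, iters=50):
    """Step 1: bisect along y at each x to find where f changes sign."""
    pts = []
    for x in xs:
        lo, hi = y_lo, y_hi
        if f(x, lo) * f(x, hi) > 0:
            continue  # no sign change on this segment, so no boundary point
        for _ in range(iters):
            mid = (lo + hi) / 2
            if f(x, lo) * f(x, mid) <= 0:
                hi = mid
            else:
                lo = mid
        pts.append((x, (lo + hi) / 2))
    return pts

OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}

def random_tree(rng, depth):
    """Random expression tree: a leaf ('x' or a constant) or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.7 else float(rng.randint(-2, 2))
    return (rng.choice(sorted(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    if isinstance(tree, tuple):
        return OPS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))
    return x if tree == "x" else tree

def mse(tree, pts):
    return sum((evaluate(tree, x) - y) * (evaluate(tree, x) - y) for x, y in pts) / len(pts)

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the tree."""
    yield path, tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def to_str(tree):
    if isinstance(tree, tuple):
        sym = {"add": "+", "sub": "-", "mul": "*"}[tree[0]]
        return f"({to_str(tree[1])} {sym} {to_str(tree[2])})"
    return str(tree)

def evolve(pts, pop_size=200, generations=30, seed=0, max_nodes=25):
    """Step 2: toy genetic programming for symbolic regression on pts."""
    rng = random.Random(seed)
    pop = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda t: mse(t, pts))
        history.append(mse(pop[0], pts))
        nxt = pop[:10]  # elitism: the best trees survive unchanged
        while len(nxt) < pop_size:
            # Tournament of 3 on the sorted population: lowest index wins.
            a = pop[min(rng.randrange(pop_size) for _ in range(3))]
            b = pop[min(rng.randrange(pop_size) for _ in range(3))]
            path, _ = rng.choice(list(subtrees(a)))
            if rng.random() < 0.7:
                _, donor = rng.choice(list(subtrees(b)))  # subtree crossover
            else:
                donor = random_tree(rng, 2)               # subtree mutation
            child = replace(a, path, donor)
            # Reject oversized children to keep expressions simple and bounded.
            nxt.append(child if len(list(subtrees(child))) <= max_nodes else a)
        pop = nxt
    best = min(pop, key=lambda t: mse(t, pts))
    return best, history + [mse(best, pts)]

pts = boundary_points(decision, [i / 5 for i in range(-10, 11)])
best, history = evolve(pts)
print("boundary approximated by: y =", to_str(best), "| MSE:", history[-1])
```

In practice the stand-in `decision` would be replaced by the decision function of a fitted classifier, and the node limit acts as a crude preference for short, readable equations.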
2015
2015/2016
Civil Engineering
29.
Sector ING-IND/16 - Manufacturing Technologies and Systems
English
Doctoral thesis


Use this identifier to cite or link to this document: https://hdl.handle.net/2108/202257