
Donini, Michele (2016) Exploiting the structure of feature spaces in kernel learning. [PhD thesis]

Full text available as:

PDF document, 4Mb

Abstract (English)

The problem of learning the optimal representation for a specific task has recently become an important and non-trivial topic in the machine learning community.
In this field, deep architectures are the current gold standard among machine learning algorithms: they generate models with several levels of abstraction, able to discover very complicated structures in large datasets. Kernels and Deep Neural Networks (DNNs) are the principal methods for handling the representation problem in a deep manner.
DNNs exploit the well-known back-propagation algorithm and have improved the state-of-the-art performance in several real-world applications, e.g. speech recognition, object detection and signal processing.
Nevertheless, DNN algorithms have some drawbacks, inherited from standard neural networks, and they are not well understood theoretically. The main problems are: the complex structure of the solution, the unclear decoupling between the representation learning phase and the model generation phase, long training times, and convergence to sub-optimal solutions (because of local minima and the vanishing gradient).
For these reasons, in this thesis we propose new ideas to obtain an optimal representation by exploiting kernel theory. Kernel methods have an elegant framework that decouples learning algorithms from data representations. On the other hand, kernels also have some weaknesses: for example, they do not scale well and they generally yield a shallow representation.
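To make this decoupling concrete: in dual formulations, the learning algorithm touches the data only through kernel evaluations, so changing the representation means swapping one kernel function for another while the algorithm stays identical. The following minimal sketch (illustrative only, not an algorithm from the thesis) trains a dual perceptron on XOR, which the RBF kernel separates although no linear representation can:

```python
import math

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_perceptron(X, y, kernel, epochs=10):
    """Dual perceptron: the data representation enters only via kernel(x, x')."""
    alpha = [0.0] * len(X)
    for _ in range(epochs):
        for i, (xi, yi) in enumerate(zip(X, y)):
            pred = sum(a * yj * kernel(xj, xi) for a, yj, xj in zip(alpha, y, X))
            if yi * pred <= 0:  # mistake-driven update
                alpha[i] += 1.0
    return alpha

def predict(X, y, alpha, kernel, x):
    score = sum(a * yj * kernel(xj, x) for a, yj, xj in zip(alpha, y, X))
    return 1 if score > 0 else -1

# XOR: not linearly separable, but separable under the RBF feature map
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]
alpha = kernel_perceptron(X, y, rbf_kernel)
print([predict(X, y, alpha, rbf_kernel, x) for x in X])  # → [-1, 1, 1, -1]
```

The same kernel_perceptron runs unchanged with linear_kernel; it simply never converges on XOR, which is exactly the part of the problem the choice of representation (the kernel) is responsible for.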
In this thesis, we propose new theory and algorithms to fill this gap, making kernel learning able to generate deeper representations and to scale better. In this scenario, we propose a different point of view on the Multiple Kernel Learning (MKL) framework, starting from the idea of a deeper kernel.
We propose an algorithm able to combine thousands of weak kernels with low computational and memory complexity. This procedure, called EasyMKL, outperforms state-of-the-art methods by combining the fragmented information into an optimal kernel for the given task.
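EasyMKL learns the combination weights by solving a margin-based optimization problem; the sketch below is deliberately simpler and is not that algorithm. It only illustrates the combination form the paragraph refers to, K = Σ_r η_r K_r with η_r ≥ 0 and Σ_r η_r = 1, using a kernel-target-alignment heuristic (a standard measure, not specific to EasyMKL) to weight each weak kernel; all function names are hypothetical:

```python
import math

def frob(A, B):
    # Frobenius inner product between two kernel matrices
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def target_alignment(K, y):
    # similarity between K and the ideal kernel y y^T, for labels y in {-1, +1}
    Y = [[yi * yj for yj in y] for yi in y]
    return frob(K, Y) / math.sqrt(frob(K, K) * frob(Y, Y))

def combine_weak_kernels(kernels, y):
    # convex combination K = sum_r eta_r K_r, eta_r >= 0, sum_r eta_r = 1
    eta = [max(target_alignment(K, y), 0.0) for K in kernels]
    total = sum(eta) or 1.0
    eta = [e / total for e in eta]
    n = len(y)
    K = [[sum(e * Kr[i][j] for e, Kr in zip(eta, kernels)) for j in range(n)]
         for i in range(n)]
    return K, eta

y = [1, 1, -1, -1]
K_informative = [[yi * yj for yj in y] for yi in y]  # perfectly aligned with the task
K_noise = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # identity: no label information
K, eta = combine_weak_kernels([K_informative, K_noise], y)
print(eta)  # → [0.666..., 0.333...]
```

Here the informative weak kernel receives twice the weight of the uninformative one; EasyMKL replaces this heuristic with a principled margin-based objective while keeping the same convex-combination form.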
Pursuing the idea of creating an optimal family of weak kernels, we introduce a new measure of kernel expressiveness, called spectral complexity. Exploiting this measure, we are able to generate families of kernels with a hierarchical structure of the features, by defining a new property concerning the monotonicity of the spectral complexity.
We demonstrate the quality of these families of weak kernels by developing a new MKL methodology. First, we create an optimal family of weak kernels by using the monotonically spectral-complex property; then we combine the optimal family of kernels by exploiting EasyMKL, obtaining a new kernel that is specific to the task; finally, we generate the model by using a kernel machine.
Moreover, we highlight the connection among distance metric learning, feature learning and kernel learning by proposing a method to learn the optimal family of weak kernels for an MKL algorithm in a different context, in which the combination rule is the element-wise product of the kernel matrices. This algorithm is able to generate the best parameters for an anisotropic RBF kernel; therefore, a connection naturally appears among feature weighting, kernel combination and metric learning.
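The connection can be checked directly: the element-wise (Hadamard) product of one-dimensional RBF weak kernels, one per feature with width β_i, equals exp(-Σ_i β_i (x_i - z_i)²), i.e. an anisotropic RBF kernel in which β_i acts as the weight of feature i (β_i = 0 discards the feature, and the β_i induce a learned metric). A small numerical check, with the β_i fixed by hand rather than learned as in the thesis:

```python
import math

def feature_rbf(x, z, i, beta):
    # weak kernel acting on the single feature i
    return math.exp(-beta * (x[i] - z[i]) ** 2)

def hadamard_combination(x, z, betas):
    # element-wise product of the per-feature weak kernels
    k = 1.0
    for i, beta in enumerate(betas):
        k *= feature_rbf(x, z, i, beta)
    return k

def anisotropic_rbf(x, z, betas):
    # exp(-sum_i beta_i * (x_i - z_i)^2): one width per feature
    return math.exp(-sum(b * (a - c) ** 2 for b, a, c in zip(betas, x, z)))

x, z, betas = (1.0, 2.0, 0.5), (0.0, 2.5, 3.0), (2.0, 0.1, 0.0)
# product of weak kernels == anisotropic RBF
assert abs(hadamard_combination(x, z, betas) - anisotropic_rbf(x, z, betas)) < 1e-12
# beta_3 = 0: the third feature is ignored, i.e. its weight is zero
assert hadamard_combination(x, z, betas) == hadamard_combination(x, (0.0, 2.5, -7.0), betas)
```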
Finally, the importance of the representation is also taken into account in three real-world tasks, where we tackle different issues such as noisy data, real-time applications and big data.

Abstract (Italian)

The problem of learning the optimal representation for a specific task has become an important topic in the machine learning community.
In this field, deep architectures are currently the most advanced among machine learning algorithms. They generate models that use high degrees of abstraction and are able to discover complicated structures even in very large datasets. Kernels and Deep Neural Networks (DNNs) are the principal methods for learning a rich (i.e. deep) representation of a problem.
DNNs exploit the well-known back-propagation algorithm, improving the state-of-the-art performance in several real-world applications, such as speech recognition, object recognition and signal processing.
Nevertheless, DNN algorithms also have issues, inherited from classical neural networks and stemming from the fact that they are not fully understood theoretically. The main problems are: the complexity of the structure of the solution, the unclear separation between the learning of the optimal representation and the learning of the model, long training times, and convergence to solutions that are only locally optimal (because of local minima and the vanishing gradient).
For these reasons, in this thesis we propose new ideas to obtain optimal representations by exploiting kernel theory. Kernel methods have an elegant framework that separates the learning algorithm from the representation of the information. On the other hand, kernels also have some weaknesses: for example, they do not scale well and, as usually employed, they carry a poor (shallow) representation.
In this thesis, we propose new theoretical results and new algorithms that address these problems, making kernel learning able to generate richer (deeper) representations and to scale better.
We then present a new algorithm able to combine thousands of weak kernels with low computational and memory cost. This procedure, called EasyMKL, outperforms the current state-of-the-art methods by combining fragments of information, thereby creating the optimal kernel for a specific task.
Pursuing the idea of creating an optimal family of weak kernels, we have introduced a new measure of kernel expressiveness, called spectral complexity. Exploiting this measure, we are able to generate families of weak kernels with a hierarchical structure of the features, by defining a new property concerning the monotonicity of the spectral complexity.
We show the quality of our weak kernels by developing a new methodology for Multiple Kernel Learning (MKL). First, we create an optimal family of weak kernels by exploiting the monotonicity property of the spectral complexity; we then combine the optimal family of weak kernels by exploiting EasyMKL, obtaining a new kernel that is specific to the single task; finally, we generate a model by exploiting the new kernel and a kernel machine (for example an SVM).
Moreover, in this thesis we highlight the connections among distance metric learning, feature learning and kernel learning by proposing a method to learn the optimal family of weak kernels for an MKL algorithm in a different context, in which the combination rule is the element-wise product of the kernel matrices. This algorithm is able to generate the optimal parameters of an anisotropic RBF kernel. Consequently, a natural connection emerges among feature weighting, kernel combination and the learning of the optimal metric for the task.
Finally, the importance of the representation is also taken into account in three real-world tasks, where we tackle different issues, including noise in the data, real-time applications and big data.

EPrint type: PhD thesis
Supervisor: Aiolli, Fabio
PhD programme (courses and schools): Ciclo 28 > Scuole 28 > SCIENZE MATEMATICHE > INFORMATICA
Thesis deposit date: 20 January 2016
Publication date: 21 January 2016
Keywords (Italian / English): representation learning, kernel learning, multiple kernel learning, multitask learning
MIUR scientific-disciplinary sectors: Area 01 - Mathematical and computer sciences > INF/01 Informatica
Reference structure: Departments > Dipartimento di Matematica
ID code: 9062
Deposited on: 07 Oct 2016 13:27

Bibliografia

I riferimenti della bibliografia possono essere cercati con Cerca la citazione di AIRE, copiando il titolo dell'articolo (o del libro) e la rivista (se presente) nei campi appositi di "Cerca la Citazione di AIRE".
Le url contenute in alcuni riferimenti sono raggiungibili cliccando sul link alla fine della citazione (Vai!) e tramite Google (Ricerca con Google). Il risultato dipende dalla formattazione della citazione.

[1] Fabio Aiolli, Matteo Ciman, Michele Donini, and Ombretta Gaggi. Climbtheworld: Real-time stairstep counting to increase physical activity. In Proceedings of the 11th International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, pages 218–227. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2014. Cerca con Google

[2] Fabio Aiolli and Michele Donini. Easy multiple kernel learning. In 22th European Symposium on Artificial Neural Networks, ESANN 2014, Bruges, Belgium, April 23-25, 2014, 2014. Cerca con Google

[3] Fabio Aiolli and Michele Donini. Learning anisotropic RBF kernels. In Artificial Neural Networks and Machine Learning - ICANN 2014 - 24th International Conference on Artificial Neural Networks, Hamburg, Germany, September 15-19, 2014. Proceedings, pages 515–522, 2014. Cerca con Google

[4] Fabio Aiolli and Michele Donini. Easymkl: a scalable multiple kernel learning algorithm. Neurocomputing, 169:215–224, 2015. Cerca con Google

[5] Fabio Aiolli, Michele Donini, Enea Poletti, and Enrico Grisan. Stacked models for efficient annotation of brain tissues in MR volumes. IFMBE Proceedings, 41:261–264, 2014. Cerca con Google

[6] Fabio Aiolli, Giovanni Da San Martino, and Alessandro Sperduti. A kernel method for the optimization of the margin distribution. In ICANN (1), pages 305–314, 2008. Cerca con Google

[7] Erin L. Allwein, Robert E. Schapire, and Yoram Singer. Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1:113–141, 2000. Cerca con Google

[8] Ethem Alpaydin. Introduction to machine learning. MIT press, 2014. Cerca con Google

[9] Aouatif Amine, Ali El Akadi, Mohammed Rziza, and Driss Aboutajdine. Ga-svm and mutual information based frequency feature selection for face recognition. GSCM-LRIT, Faculty of Sciences, Mohammed V University, BP, 1014, 2009. Cerca con Google

[10] Davide Anguita, Alessandro Ghio, L. Oneto, and Sandro Ridella. A Deep Connection Between the Vapnik-Chervonenkis Entropy and the Rademacher Complexity. IEEE Transactions on Neural Networks and Learning Systems, (12):2202–2211. Cerca con Google

[11] J. Ashburner and K.J. Friston. Unified segmentation. NeuroImage, 26(3):839–851, 2005. cited By (since 1996) 1330. Cerca con Google

[12] E Ataer-Cansizoglu, J Kalpathy-Cramer, S You, K Keck, D Erdogmus, MF Chiang, et al. Analysis of underlying causes of inter-expert disagreement in retinopathy of prematurity diagnosis. Methods Inf Med, 54(1):93–102, 2015. Cerca con Google

[13] Francis Bach. Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning. (2). Cerca con Google

[14] Francis Bach. Hierarchical kernel learning. Nips, (May):1–18. Cerca con Google

[15] Francis Bach, Julien Mairal, and Jean Ponce. Convex sparse matrix factorizations. arXiv preprint arXiv:0812.1869, 2008. Cerca con Google

[16] Francis R. Bach. Exploring large feature spaces with hierarchical multiple kernel learning. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 105–112. Curran Associates, Inc., 2009. Cerca con Google

[17] Francis R Bach, Gert RG Lanckriet, and Michael I Jordan. Multiple kernel learning, conic duality, and the smo algorithm. In Proceedings of the twenty-first international conference on Machine learning, page 6. ACM, 2004. Cerca con Google

[18] K. Bache and M. Lichman. Uci machine learning repository, 2013. [19] Jing Bai, Ke Zhou, Guirong Xue, Hongyuan Zha, Gordon Sun, Belle Tseng, Zhaohui Zheng, and Yi Chang. Multi-task learning for learning to rank in web search. In Proceedings of the 18th ACM conference on Information and knowledge management, pages 1549–1552. ACM, 2009. Cerca con Google

[20] Peter L. Bartlett, Olivier Bousquet, and Shahar Mendelson. Local Rademacher complexities. The Annals of Statistics, (4):1497–1537. Cerca con Google

[21] Pl Peter L Bartlett and Shahar Mendelson. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results. Journal of Machine Learning Research, (3):463–482. Cerca con Google

[22] Jonathan Baxter. A model of inductive bias learning. J. Artif. Intell. Res.(JAIR), 12:149–198, 2000. Cerca con Google

[23] Lluís a. Belanche and Alessandra Tosi. Averaging of kernel functions. Neurocomputing, 112(April):19–25, 2013. Cerca con Google

[24] Aurélien Bellet, Amaury Habrard, and Marc Sebban. A survey on metric learning for feature vectors and structured data. arXiv preprint arXiv:1306.6709, 2013. Cerca con Google

[25] R. Bellman. Adaptive control processes: A guided tour. Princeton University Press, New Jersey, 1961. Cerca con Google

[26] Shai Ben-David and Reba Schuller. Exploiting task relatedness for multiple task learning. In Learning Theory and Kernel Machines, pages 567–580. Springer, 2003. Cerca con Google

[27] Yoshua Bengio, Yoshua Bengio, Olivier Delalleau, Olivier Delalleau, Nicolas Le Roux, Nicolas Le Roux, Downtown Branch, and Downtown Branch. The Curse of Dimensionality for Local Kernel Machines. 2(2):1–17, 2005. Cerca con Google

[28] Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation learning: A review and new perspectives. Pattern Analysis and . . . , (1993):1–30. Cerca con Google

[29] Yoshua Bengio, Pascal Lamblin, Dan Popovici, and Hugo Larochelle. Greedy Layer-Wise Training of Deep Networks. Advances in neural information processing systems, 19(1):153, 2007. Cerca con Google

[30] Yoshua Bengio, Jean-Francois Paiement, and Pascal Vincent. Out-of-sample extensions for lle, isomap, mds, eigenmaps, and spectral clustering. In In Advances in Neural Information Processing Systems, pages 177–184. MIT Press, 2003. Cerca con Google

[31] Adriana Birlutiu, Perry Groot, and Tom Heskes. Efficiently learning the preferences of people. Machine Learning, 90(1):1–28, 2013. Cerca con Google

[32] V Bolón-Canedo, N Sánchez-Maroño, A Alonso-Betanzos, JM Benítez, and F Herrera. A review of microarray datasets and applied feature selection methods. Information Sciences, 282:111–135, 2014. Cerca con Google

[33] Verónica Bolón-Canedo, Noelia Sánchez-Maroño, and Amparo Alonso-Betanzos. A review of feature selection methods on synthetic data. Knowledge and information systems, 34(3):483–519, 2013. Cerca con Google

[34] Verónica Bolón-Canedo, Noelia Sánchez-Maroño, and Amparo Alonso-Betanzos. Data classification using an ensemble of filters. Neurocomputing, 135:13–20, 2014. Cerca con Google

[35] Olivier Bousquet and Léon Bottou. The tradeoffs of large scale learning. In Advances in neural information processing systems, pages 161–168, 2008. Cerca con Google

[36] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends R in Machine Learning, 3(1):1–122, 2011. Cerca con Google

[37] Thorsten Brants and Alex Franz. fWeb 1T 5-gram Version 1g. 2006. Cerca con Google

[38] Gavin Brown, Adam Pocock, Ming-Jie Zhao, and Mikel Luján. Conditional likelihood maximisation: a unifying framework for information theoretic feature selection. The Journal of Machine Learning Research, 13(1):27–66, 2012. Cerca con Google

[39] Serhat S Bucak, Rong Jin, and Anil K Jain. Multiple kernel learning for visual object recognition: A review. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 36(7):1354–1369, 2014. Cerca con Google

[40] S.S. Bucak, R. Jin, and Ak. Jain. Multiple Kernel Learning for Visual Object Recognition: A Review. IEEE Transactions on Pattern Analysis and Machine Intelligence, (7):1354–1369. Cerca con Google

[41] Evan Calabrese, Alexandra Badea, Charles Watson, and G Allan Johnson. A quantitative magnetic resonance histology atlas of postnatal rat brain development with regional estimates of growth and variability. NeuroImage, 71C:196–206, January 2013. Cerca con Google

[42] Rich Caruana. Multitask learning. Machine learning, 28(1):41–75, 1997. Cerca con Google

[43] Eduardo Castro, Vanessa Gómez-Verdejo, Manel Martínez-Ramón, Kent a. Kiehl, and Vince D. Calhoun. A multiple kernel learning approach to perform classification of groups from complex-valued fMRI data analysis: Application to schizophrenia. NeuroImage, pages 1–17. Cerca con Google

[44] O. Chapelle, B. Schölkopf, and A. Zien, editors. Semi-Supervised Learning. MIT Press, Cambridge, MA, 2006. Cerca con Google

[45] William W. Cohen. Stacked sequential learning. In International Joint Conference on Artificial Intelligence, pages 671–676, 2005. Cerca con Google

[46] Michael Collins and Nigel" Duffy. Convolution Kernels for Natural Language. Advances in Neural Information Processing Systems, 14:625–632, 2001. Cerca con Google

[47] Corinna Cortes. Learning Kernels Using Local Rademacher Complexity. Nips, pages 1–9, 2013. Cerca con Google

[48] Corinna Cortes, Marius Kloft, and Mehryar Mohri. Learning kernels using local rademacher complexity. In C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger, editors, Advances in Neural Information Processing Systems 26, pages 2760–2768. Curran Associates, Inc., 2013. Cerca con Google

[49] Corinna Cortes, M Mohri, and a Rostamizadeh. Learning non-linear combinations of kernels. Advances in Neural Information . . . , pages 1–9. Cerca con Google

[50] Corinna Cortes, Mehryar Mohri, and Afshin Rostamizadeh. Learning sequence kernels. In Machine Learning for Signal Processing, 2008. MLSP 2008. IEEE Workshop on, pages 2–8. IEEE, 2008. Cerca con Google

[51] Corinna Cortes, Mehryar Mohri, and Afshin Rostamizadeh. Generalization bounds for learning kernels. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21-24, 2010, Haifa, Israel, pages 247–254, 2010. Cerca con Google

[52] Corinna Cortes, Mehryar Mohri, and Afshin Rostamizadeh. Algorithms for learning kernels based on centered alignment. The Journal of Machine Learning Research, 13(1):795–828, 2012. Cerca con Google

[53] Andrew Cotter, Joseph Keshet, and Nathan Srebro. Explicit Approximations of the Gaussian Kernel. arXiv preprint arXiv:1109.4603, page 11. Cerca con Google

[54] Andrew Cotter, Joseph Keshet, and Nathan Srebro. Explicit approximations of the gaussian kernel. arXiv preprint arXiv:1109.4603, 2011. Cerca con Google

[55] Nello Cristianini, John Shawe-Taylor, André Elisseeff, and Jaz S. Kandola. On kernel-target alignment. In Dietterich et al. [62], pages 367–373. Cerca con Google

[56] Giovanni Da San Martino, Nicolo Navarin, and Alessandro Sperduti. A memory efficient graph kernel. In The 2012 International Joint Conference on Neural Networks (IJCNN). Ieee, June 2012. Cerca con Google

[57] Giovanni Da San Martino, Nicolò Navarin, and Alessandro Sperduti. A Tree-Based Kernel for Graphs. In Proceedings of the Twelfth SIAM International Conference on Data Mining, pages 975–986, 2012. Cerca con Google

[58] Giovanni Da San Martino, Nicolò Navarin, and Alessandro Sperduti. Exploiting the ODD framework to define a novel effective graph kernel. In 23th European Symposium on Artificial Neural Networks, ESANN 2015, Bruges, Belgium, April 22-24, 2015, 2015. Cerca con Google

[59] Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, and Michael W Mahoney. Feature selection methods for text classification. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 230–239. ACM, 2007. Cerca con Google

[60] Manoranjan Dash and Huan Liu. Consistency-based search in feature selection. Artificial intelligence, 151(1):155–176, 2003. Cerca con Google

[61] Costanza D’Avanzo, Anahita Goljahani, Gianluigi Pillonetto, Giuseppe De Nicolao, and Giovanni Sparacino. A multi-task learning approach for the extraction of single-trial evoked potentials. Computer methods and programs in biomedicine, 110(2):125–136, 2013. Cerca con Google

[62] Thomas G Dietterich, Suzanna Becker, and Zoubin Ghahramani, editors. Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, December 3-8, 2001, Vancouver, British Columbia, Canada]. MIT Press, 2001. Cerca con Google

[63] Paul D Dobson and Andrew J Doig. Distinguishing Enzyme Structures from Nonenzymes Without Alignments. Journal of Molecular Biology, 330(4):771–783, 2003. Cerca con Google

[64] Carlotta Domeniconi and Dimitrios Gunopulos. Adaptive nearest neighbor classification using support vector machines. In Dietterich et al., pages 665–672. Cerca con Google

[65] Richard O Duda, Peter E Hart, and David G Stork. Pattern classification. John Wiley & Sons„ 1999. Cerca con Google

[66] Jonathan Eckstein and Dimitri P Bertsekas. On the douglas—rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming, 55(1-3):293–318, 1992. Cerca con Google

[67] Dumitru Erhan, Pierre-Antoine Manzagol, Yoshua Bengio, Samy Bengio, and Pascal Vincent. The difficulty of training deep architectures and the effect of unsupervised pre-training. International Conference on Artificial Intelligence and Statistics, pages 153–160. Cerca con Google

[68] Theodoros Evgeniou and Massimiliano Pontil. Regularized multi–task learning. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 109–117. ACM, 2004. Cerca con Google

[69] B. J. Fogg. A behavior model for persuasive design. In Proceedings of the 4th International Conference on Persuasive Technology, Persuasive ’09, pages 1–7, New York, NY, USA, 2009. ACM. Cerca con Google

[70] George Forman. An extensive empirical study of feature selection metrics for text classification. The Journal of ;achine Learning Research, 3:1289–1305, 2003. Cerca con Google

[71] George Forman. Feature selection for text classification. Computational Methods of Feature Selection, pages 257–276, 2008. Cerca con Google

[72] Jerome H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29:1189–1232, 2000. Cerca con Google

[73] R. Gelineau-Morel, V. Tomassini, and M. Jenkinson. The effect of hypointense white matter lesions on automated gray matter segmentation in multiple sclerosis. Hum Brain Mapp, 2011. cited By (since 1996) 4. Cerca con Google

[74] Ali Gholipour, Alireza Akhondi-Asl, Judy a Estroff, and Simon K Warfield. Multiatlas multi-shape segmentation of fetal brain MRI for volumetric and morphometric analysis of ventriculomegaly. NeuroImage, 60(3):1819–31, April 2012. Cerca con Google

[75] Amir Globerson and Sam T. Roweis. Metric learning by collapsing classes. In NIPS, 2005. Cerca con Google

[76] Jacob Goldberger, Sam T. Roweis, Geoffrey E. Hinton, and Ruslan Salakhutdinov. Neighbourhood components analysis. In NIPS, 2004. Cerca con Google

[77] Tom Goldstein, Brendan O’Donoghue, Simon Setzer, and Richard Baraniuk. Fast alternating direction optimization methods. SIAM Journal on Imaging Sciences, 7(3):1588–1623, 2014. Cerca con Google

[78] Mehmet Gönen and Ethem Alpaydin. Localized multiple kernel learning. Proceedings of the 25th international conference on Machine learning - ICML ’08, (x):352–359. Cerca con Google

[79] Mehmet Gönen and Ethem Alpaydin. Multiple kernel learning algorithms. Journal of Machine Learning Research, 12:2211–2268, 2011. Cerca con Google

[80] Ioannis S Gousias, Daniel Rueckert, Rolf a Heckemann, Leigh E Dyet, James P Boardman, a David Edwards, and Alexander Hammers. Automatic segmentation of brain MRIs of 2-year-olds into 83 regions of interest. NeuroImage, 40(2):672–84, April 2008. Cerca con Google

[81] Isabelle Guyon. Feature extraction: foundations and applications, volume 207. Springer, 2006. Cerca con Google

[82] Isabelle Guyon, Asa Ben Hur, Steve Gunn, and Gideon Dror. Result analysis of the nips 2003 feature selection challenge. In Advances in Neural Information Processing Systems 17, pages 545–552. MIT Press, 2004. Cerca con Google

[83] Isabelle Guyon, Jason Weston, Stephen Barnhill, and Vladimir Vapnik. Gene selection for cancer classification using support vector machines. Machine learning, 46(1-3):389–422, 2002. Cerca con Google

[84] Piotr a Habas, Kio Kim, James M Corbett-Detig, Francois Rousseau, Orit a Glenn, a James Barkovich, and Colin Studholme. A spatiotemporal atlas of MR intensity, tissue probability and shape of the fetal brain with application to segmentation. NeuroImage, 53(2):460–70, November 2010. Cerca con Google

[85] Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. The weka data mining software: An update. SIGKDD Explor. Newsl., 11(1):10–18, November 2009. Cerca con Google

[86] Mark A Hall. Correlation-based feature selection for machine learning. PhD thesis, The University of Waikato, 1999. Cerca con Google

[87] Robert M. Haralick. Statistical and structural approaches to texture. Proc IEEE, 67(5):786–804, 1979. cited By (since 1996) 1640. Cerca con Google

[88] Anne-Claire Haury, Pierre Gestraud, and Jean-Philippe Vert. The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures. PloS one, 6(12):e28210, 2011. Cerca con Google

[89] David Haussler. Convolution Kernels on Discrete Structures. Technical report, Department of Computer Science, University of California at Santa Cruz, 1999. Cerca con Google

[90] Bingsheng He and Xiaoming Yuan. On the o(1/n) convergence rate of the douglasrachford alternating direction method. SIAM Journal on Numerical Analysis, 50(2):700–709, 2012. Cerca con Google

[91] Christoph Helma, Tobias Cramer, Stefan Kramer, and Luc De Raedt. Data mining and machine learning techniques for the identification of mutagenicity inducing substructures and structure activity relationships of noncongeneric compounds. Journal of Chemical Information and Computer Sciences, 44(4):1402–1411, 2004. Cerca con Google

[92] Geoffrey E Hinton. A Fast Learning Algorithm for Deep Belief Nets. 1554:1527–1554, 2006. Cerca con Google

[93] S Hochreiter, Y Bengio, P Frasconi, and J Schmidhuber. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. A Field Guide to Dynamical Recurrent Networks, pages 237–243. Cerca con Google

[94] Sepp Hochreiter and Jurgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1–32, 1997. Cerca con Google

[95] Mingyi Hong and Zhi-Quan Luo. On the linear convergence of the alternating direction method of multipliers. arXiv preprint arXiv:1208.3922, 2012. Cerca con Google

[96] Cho-Jui Hsieh, Kai-Wei Chang, Chih-Jen Lin, S Sathiya Keerthi, and Sellamanickam Sundararajan. A dual coordinate descent method for large-scale linear svm. In Proceedings of the 25th international conference on Machine learning, pages 408–415. ACM, 2008. Cerca con Google

[97] Hui-Huang Hsu, Cheng-Wei Hsieh, and Ming-Da Lu. Hybrid feature selection by combining filters and wrappers. Expert Systems with Applications, 38(7):8144–8150, 2011. Cerca con Google

[98] Jianping Hua,Waibhav D Tembe, and Edward R Dougherty. Performance of featureselection methods in the classification of high-dimension data. Pattern Recognition, 42(3):409–424, 2009. Cerca con Google

[99] Cheng-Lung Huang and Cheng-Yi Tsai. A hybrid sofm-svr with a filter-based feature selection for stock market forecasting. Expert Systems with Applications, 36(2):1529–1539, 2009. Cerca con Google

[100] Shu Huang,Wei Peng, Jingxuan Li, and Dongwon Lee. Sentiment and topic analysis on social media: a multi-task multi-label classification approach. In Proceedings of the 5th annual ACM web science conference, pages 172–181. ACM, 2013. Cerca con Google

[101] Zakria Hussain and John Shawe-Taylor. Improved loss bounds for multiple kernel learning. In Proceedings of the Fourteenth International Conference on Artificial Cerca con Google

Intelligence and Statistics, AISTATS 2011, Fort Lauderdale, USA, April 11-13, 2011, pages 370–377, 2011. Cerca con Google

[102] Tâm Huynh and Bernt Schiele. Analyzing Features for Activity Recognition. Proceedings of the 2005 joint conference on Smart objects and ambient intelligence: innovative context-aware services: usages and technologies, (october):159–163. Cerca con Google

[103] G Ilczuk, R Mlynarski, W Kargul, and A Wakulicz-Deja. New feature selection methods for qualification of the patients for cardiac pacemaker implantation. In Computers in Cardiology, 2007, pages 423–426. IEEE, 2007. Cerca con Google

[104] A. Jain, S. V. N. Vishwanathan, and M. Varma. Spg-gmkl: Generalized multiple kernel learning with a million kernels. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2012. Cerca con Google

[105] Pratik Jawanpuria. Generalized Hierarchical Kernel Learning. 16:617–652, 2015. Cerca con Google

[106] Pratik Jawanpuria, Manik Varma, and Saketha Nath. On p-norm path following in multiple kernel learning for non-linear feature selection. In Tony Jebara and Eric P. Xing, editors, Proceedings of the 31st International Conference on Machine Learning (ICML-14), pages 118–126. JMLRWorkshop and Conference Proceedings, 2014. Cerca con Google

[107] Rodolphe Jenatton, Julien Mairal, Francis R Bach, and Guillaume R Obozinski. Proximal methods for sparse hierarchical dictionary learning. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 487–494, 2010. Cerca con Google

[108] Thorsten Joachims. Learning to classify text using support vector machines: Methods, theory and algorithms. Kluwer Academic Publishers, 2002. Cerca con Google

[109] Ian Jolliffe. Principal component analysis. Wiley Online Library, 2002. Cerca con Google

[110] Sham M Kakade, Shai Shalev-Shwartz, and Ambuj Tewari. Regularization techniques for learning with matrices. The Journal of Machine Learning Research, 13(1):1865–1890, 2012. Cerca con Google

[111] Hamidreza Rashidy Kanan and Karim Faez. An improved feature selection method based on ant colony optimization (aco) evaluated on face recognition system. Applied Cerca con Google

Mathematics and Computation, 205(2):716–725, 2008. Cerca con Google

[112] Purushottam Kar and Harish Karnick. Random feature maps for dot product kernels. In Proceedings of the Fifteenth International Conference on Artificial Intelligence Cerca con Google

and Statistics, AISTATS 2012, La Palma, Canary Islands, April 21-23, 2012, pages 583–591, 2012. Cerca con Google

[113] Dor Kedem, Stephen Tyree, Fei Sha, Gert R. Lanckriet, and Kilian Q. Weinberger. Non-linear metric learning. In F. Pereira, C.J.C. Burges, L. Bottou, and K.Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 2573–2581. Curran Associates, Inc., 2012.

[114] Sahand Khakabimamaghani, Farnaz Barzinpour, and Mohammad R Gholamian. Enhancing ensemble performance through feature selection and hybridization. International Journal of Information Processing and Management, 2(2), 2011.

[115] Hyunsoo Kim, Peg Howland, and Haesun Park. Dimension reduction in text classification with support vector machines. Journal of Machine Learning Research, pages 37–53, 2005.

[116] Kyoung-jae Kim and Ingoo Han. Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Systems with Applications, 19(2):125–132, 2000.

[117] Marius Kloft. Lp-Norm Multiple Kernel Learning. Journal of Machine Learning Research, 12(3):953–997, 2011.

[118] Marius Kloft and Gilles Blanchard. The local Rademacher complexity of lp-norm multiple kernel learning. In Advances in Neural Information Processing Systems, pages 2438–2446, 2011.

[119] V. Koltchinskii. Rademacher penalties and structural risk minimization. IEEE Transactions on Information Theory, 47(5):1902–1914, 2001.

[120] Vladimir Koltchinskii. Local Rademacher complexities and oracle inequalities in risk minimization. The Annals of Statistics, 34(6):2593–2656, 2006.

[121] Igor Kononenko. Estimating attributes: analysis and extensions of Relief. In Machine Learning: ECML-94, pages 171–182. Springer, 1994.

[122] Gert R G Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui, and Michael I Jordan. Learning the Kernel Matrix with Semidefinite Programming. Journal of Machine Learning Research, 5:27–72, 2004.

[123] Gert R. G. Lanckriet, Nello Cristianini, Peter L. Bartlett, Laurent El Ghaoui, and Michael I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27–72, 2004.

[124] Hugo Larochelle, Yoshua Bengio, Jérôme Louradour, and Pascal Lamblin. Exploring Strategies for Training Deep Neural Networks. Journal of Machine Learning Research, 10:1–40, 2009.

[125] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.

[126] Chien-Pang Lee and Yungho Leu. A novel hybrid feature selection method for microarray data analysis. Applied Soft Computing, 11(1):208–213, 2011.

[127] Michael C Lee, Lilla Boroczky, Kivilcim Sungur-Stasik, Aaron D Cann, Alain C Borczuk, Steven M Kawut, and Charles A Powell. Computer-aided diagnosis of pulmonary nodules using a two-step approach for feature selection and classifier ensemble construction. Artificial Intelligence in Medicine, 50(1):43–53, 2010.

[128] Ming-Chi Lee. Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert Systems with Applications, 36(8):10896–10904, 2009.

[129] Seung Ho Lee, Jae Young Choi, Konstantinos N Plataniotis, and Yong Man Ro. Color component feature selection in feature-level fusion based color face recognition. In Fuzzy Systems (FUZZ), 2010 IEEE International Conference on, pages 1–6. IEEE, 2010.

[130] Guy Lever, Tom Diethe, and John Shawe-Taylor. Data-dependent kernels in nearly linear time. arXiv preprint arXiv:1110.4416, 2011.

[131] H. Liu and R. Setiono. Chi2: Feature selection and discretization of numeric attributes. In J.F. Vassilopoulos, editor, Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence, November 5-8, 1995, pages 388–391, Herndon, Virginia, 1995. IEEE Computer Society.

[132] Wei Liu, Buyue Qian, Jingyu Cui, and Jianzhuang Liu. Spectral kernel learning for semi-supervised classification. In IJCAI, pages 1150–1155, 2009.

[133] Xiao Liu, Jun Shi, and Qi Zhang. Tumor classification by deep polynomial network and multiple kernel learning on small ultrasound image dataset. In Machine Learning in Medical Imaging, pages 313–320. Springer, 2015.

[134] Yijuan Lu, Ira Cohen, Xiang Sean Zhou, and Qi Tian. Feature selection using principal feature analysis. In Proceedings of the 15th international conference on Multimedia, pages 301–304. ACM, 2007.

[135] James Martens. Deep learning via Hessian-free optimization. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 735–742, 2010.

[136] Andreas Maurer. Bounds for linear multi-task learning. The Journal of Machine Learning Research, 7:117–139, 2006.

[137] Debasis Mazumdar, Soma Mitra, and Sushmita Mitra. Evolutionary-rough feature selection for face recognition. In Transactions on Rough Sets XII, pages 117–142. Springer, 2010.

[138] Donald G McLaren, Kristopher J Kosmatka, Terrance R Oakes, Christopher D Kroenke, Steven G Kohama, John A Matochik, Don K Ingram, and Sterling C Johnson. A population-average MRI-based atlas collection of the rhesus macaque. NeuroImage, 45(1):52–9, March 2009.

[139] Tom Mitchell, William Cohen, Estevam Hruschka, Partha Talukdar, Justin Betteridge, Andrew Carlson, Bhavana Dalvi Mishra, Matthew Gardner, Bryan Kisiel, Jayant Krishnamurthy, Ni Lao, Kathryn Mazaitis, Thahir Mohamed, Ndapa Nakashole, Emmanouil Platanios, Alan Ritter, Mehdi Samadi, Burr Settles, Richard Wang, Derry Wijaya, Abhinav Gupta, Xinlei Chen, Abulhair Saparov, Malcolm Greaves, and Joel Welling. Never-Ending Learning, 2015.

[140] Radhika Mittal, Aman Kansal, and Ranveer Chandra. Empowering developers to estimate app energy consumption. In Proceedings of the 18th annual International Conference on Mobile Computing and Networking (MobiCom ’12), pages 317–328, 2012.

[141] David Mizell. Using gravity to estimate accelerometer orientation. In Proceedings of the 7th IEEE International Symposium on Wearable Computers (ISWC’03), page 252, 2003.

[142] Gregoire Montavon, Mikio Braun, and Klaus-Robert Müller. Kernel Analysis of Deep Networks. Journal of Machine Learning Research, 12:2563–2581, 2011.

[143] João FC Mota, João MF Xavier, Pedro MQ Aguiar, and Markus Püschel. A proof of convergence for the alternating direction method of multipliers applied to polyhedral-constrained functions. arXiv preprint arXiv:1112.2295, 2011.

[144] K. Nakamura and E. Fisher. Segmentation of brain magnetic resonance images for measurement of gray matter atrophy in multiple sclerosis patients. NeuroImage, 44(3):769–776, 2009.

[145] Andrew Y Ng. On feature selection: Learning with exponentially many irrelevant features as training examples. In Proceedings of the 15th International Conference on Machine Learning, pages 404–412, 1998.

[146] Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. On the difficulty of training recurrent neural networks. In International Conference on Machine Learning, pages 1310–1318, 2013.

[147] Abhinav Pathak, Y. Charlie Hu, and Ming Zhang. Where is the energy spent inside my app?: Fine grained energy accounting on smartphones with eprof. In Proceedings of the 7th ACM European Conference on Computer Systems (EuroSys’12), pages 29–42, 2012.

[148] Paul Pavlidis, Jason Weston, Jinsong Cai, and William Noble Grundy. Gene functional classification from heterogeneous data. In Proceedings of the fifth annual international conference on Computational biology, pages 249–255. ACM, 2001.

[149] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Édouard Duchesnay. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.

[150] Hanchuan Peng, Fuhui Long, and Chris Ding. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 27(8):1226–1238, 2005.

[151] Susanna Pirttikangas, Kaori Fujinami, and Tatsuo Nakajima. Feature selection and activity recognition from wearable sensors. In HeeYong Youn, Minkoo Kim, and Hiroyuki Morikawa, editors, Ubiquitous Computing Systems, volume 4239 of Lecture Notes in Computer Science, pages 516–527. Springer Berlin Heidelberg, 2006.

[152] E. Poletti, E. Veronese, M. Calabrese, A. Bertoldo, and E. Grisan. Supervised classification of brain tissues through local multi-scale texture analysis by coupling DIR and FLAIR MR sequences. Proceedings of SPIE, 8314:83142T, 2012.

[153] Marcel Prastawa, Elizabeth Bullitt, and Guido Gerig. Synthetic ground truth for validation of brain tumor MRI segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2005, pages 26–33. Springer, 2005.

[154] Shibin Qiu and Terran Lane. A framework for multiple kernel support vector regression and its applications to siRNA efficacy prediction. Computational Biology and Bioinformatics, IEEE/ACM Transactions on, 6(2):190–199, 2009.

[155] J. Ross Quinlan. Induction of decision trees. Machine learning, 1(1):81–106, 1986.

[156] Alexander Rakhlin, Ohad Shamir, and Karthik Sridharan. Making gradient descent optimal for strongly convex stochastic optimization. arXiv preprint arXiv:1109.5647, 2011.

[157] A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet. SimpleMKL. Journal of Machine Learning Research, 9:2491–2521, 2008.

[158] Rabab M Ramadan and Rehab F Abdel-Kader. Face recognition using particle swarm optimization-based selected features. International Journal of Signal Processing, Image Processing and Pattern Recognition, 2(2):51–65, 2009.

[159] Ronald C Read and Derek G Corneil. The graph isomorphism disease. Journal of Graph Theory, 1(4):339–363, 1977.

[160] B. Remeseiro, V. Bolon-Canedo, D. Peteiro-Barral, A. Alonso-Betanzos, B. Guijarro-Berdinas, A. Mosquera, M.G. Penedo, and N. Sanchez-Marono. A methodology for improving tear film lipid layer classification. Biomedical and Health Informatics, IEEE Journal of, 18(4):1485–1493, 2014.

[161] Y. Saeys, I. Inza, and P. Larrañaga. A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19):2507–2517, 2007.

[162] Juergen Schmidhuber. Deep Learning in Neural Networks: An Overview. pages 1–88.

[163] Isaac Jacob Schoenberg. Positive definite functions on spheres. Duke Mathematical Journal, 9(1):96–108, 1942.

[164] Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. Kernel principal component analysis. In Artificial Neural Networks ICANN97, pages 583–588. Springer, 1997.

[165] Toby Segaran and Jeff Hammerbacher. Beautiful data: the stories behind elegant data solutions. O’Reilly Media, Inc., 2009.

[166] Ahmed Serag, Paul Aljabar, Gareth Ball, Serena J Counsell, James P Boardman, Mary A Rutherford, A David Edwards, Joseph V Hajnal, and Daniel Rueckert. Construction of a consistent high-definition spatio-temporal atlas of the developing brain using adaptive kernel regression. NeuroImage, 59(3):2255–65, February 2012.

[167] Thomas Serre, Lior Wolf, and Tomaso Poggio. Object recognition with features inspired by visual cortex. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2:994–1000, 2005.

[168] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.

[169] Shai Shalev-Shwartz, Yoram Singer, and Andrew Y. Ng. Online and batch learning of pseudo-metrics. In ICML, 2004.

[170] Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, and Andrew Cotter. Pegasos: Primal estimated sub-gradient solver for svm. Mathematical programming, 127(1):3–30, 2011.

[171] J Shawe-Taylor, P L Bartlett, Robert C Williamson, and M Anthony. Structural Risk Minimization over Data-Dependent Hierarchies. IEEE Transactions on Information Theory, 44(5):1926–1940, 1998.

[172] John Shawe-Taylor and Nello Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.

[173] Feng Shi, Pew-Thian Yap, Yong Fan, John H Gilmore, Weili Lin, and Dinggang Shen. Construction of multi-region-multi-reference atlases for neonatal brain MRI segmentation. NeuroImage, 51(2):684–93, June 2010.

[174] Si Si, Cho-Jui Hsieh, and Inderjit Dhillon. Memory Efficient Kernel Approximation. In Proceedings of The 31st International Conference on Machine Learning, pages 701–709, 2014.

[175] Pekka Siirtola and Juha Röning. Recognizing human activities user-independently on smartphones based on accelerometer data. International Journal of Interactive Multimedia and Artificial Intelligence, 1(5):38–45, 2012.

[176] Daniel L Silver. The parallel transfer of task knowledge using dynamic learning rates based on a measure of relatedness. Connection Science, 8(2):277–294, 1996.

[177] Amit Singhal. Introducing the Knowledge Graph: things, not strings, 2012.

[178] S.M. Smith, M. Jenkinson, M.W. Woolrich, C.F. Beckmann, T.E.J. Behrens, H. Johansen-Berg, P.R. Bannister, M. De Luca, I. Drobnjak, D.E. Flitney, R.K. Niazy, J. Saunders, J. Vickers, Y. Zhang, N. De Stefano, J.M. Brady, and P.M. Matthews. Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage, 23(Suppl. 1):S208–S219, 2004.

[179] Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, and Bernhard Schölkopf. Large Scale Multiple Kernel Learning. Journal of Machine Learning Research, 7:1531–1565, 2006.

[180] Nathan Srebro. Learning with Matrix Factorizations. PhD thesis, Massachusetts Institute of Technology, 2004.

[181] Bo Sun, Di Zhang, Jun He, Lejun Yu, and Xuewen Wu. Multi-feature based robust face detection and coarse alignment method via multiple kernel learning. In SPIE Security+ Defence, pages 96520H–96520H. International Society for Optics and Photonics, 2015.

[182] Zhaonan Sun, Nawanol Ampornpunt, Manik Varma, and Svn Vishwanathan. Multiple kernel learning and the smo algorithm. In Advances in neural information processing systems, pages 2361–2369, 2010.

[183] Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton. On the importance of initialization and momentum in deep learning. JMLR W&CP, 28:1139–1147, 2013.

[184] Yuchun Tang, Cornelius Hojatkashani, Ivo D Dinov, Bo Sun, Lingzhong Fan, Xiangtao Lin, Hengtao Qi, Xue Hua, Shuwei Liu, and Arthur W Toga. The construction of a Chinese MRI brain atlas: a morphometric comparison study between Chinese and Caucasian cohorts. NeuroImage, 51(1):33–41, May 2010.

[185] Sebastian Thrun and Lorien Pratt. Learning to learn. Springer Science & Business Media, 2012.

[186] Chih-Fong Tsai and Yu-Chieh Hsiao. Combining multiple feature selection methods for stock prediction: Union, intersection, and multi-intersection approaches. Decision Support Systems, 50(1):258–269, 2010.

[187] Harun Uğuz. A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm. Knowledge-Based Systems, 24(7):1024–1032, 2011.

[188] Vladimir Naumovich Vapnik. Statistical learning theory, volume 1. Wiley, New York, 1998.

[189] Manik Varma and Bodla Rakesh Babu. More generality in efficient multiple kernel learning. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML ’09), pages 1–8, 2009.

[190] Alexander Vergara, Shankar Vembu, Tuba Ayhan, Margaret A. Ryan, Margie L. Homer, and Ramón Huerta. Chemical gas sensor drift compensation using classifier ensembles, 2012.

[191] E. Veronese, E. Poletti, M. Calabrese, A. Bertoldo, and E. Grisan. Unsupervised segmentation of brain tissues using multiphase level sets on multiple mri sequences. In Intelligent Systems and Control/742: Computational Bioscience. ACTA Press, 2011.

[192] S V N Vishwanathan and Alexander J Smola. Fast Kernels for String and Tree Matching. In NIPS, pages 569–576, 2002.

[193] A. Vovk, R.W. Cox, J. Stare, D. Suput, and Z.S. Saad. Segmentation priors from local image properties: Without using bias field correction, location-based templates, or registration. NeuroImage, 2010.

[194] Nikil Wale, Ian Watson, and George Karypis. Comparison of descriptor spaces for chemical compound retrieval and classification. Knowledge and Information Systems, 14(3):347–375, 2008.

[195] Fei Wang and Jimeng Sun. Survey on distance metric learning and dimensionality reduction in data mining. Data Mining and Knowledge Discovery, pages 1–31, 2014.

[196] Tinghua Wang, Haihui Xie, Liyun Zhong, and Shengzhou Hu. A multiple kernel learning approach to text categorization. Journal of Computational and Theoretical Nanoscience, 12(9):2121–2126, 2015.

[197] Kilian Q. Weinberger and Lawrence K. Saul. Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10:207–244, 2009.

[198] Owen S Weislow, Rebecca Kiser, Donald L Fine, John Bader, Robert H Shoemaker, and Michael R Boyd. New soluble-formazan assay for HIV-1 cytopathic effects: application to high-flux screening of synthetic and natural products for AIDS-antiviral activity. Journal of the National Cancer Institute, 81(8):577–586, 1989.

[199] Christopher Williams and Matthias Seeger. The effect of the input density distribution on kernel-based classifiers. In Proceedings of the 17th International Conference on Machine Learning, number EPFL-CONF-161323, pages 1159–1166, 2000.

[200] Juanying Xie and Chunxia Wang. Using support vector machines with a novel hybrid feature selection method for diagnosis of erythemato-squamous diseases. Expert Systems with Applications, 38(5):5809–5815, 2011.

[201] Xinxing Xu, Ivor W. Tsang, and Dong Xu. Soft margin multiple kernel learning. IEEE Trans. Neural Netw. Learning Syst., 24(5):749–761, 2013.

[202] Zenglin Xu, Rong Jin, Haiqin Yang, Irwin King, and Michael R. Lyu. Simple and efficient multiple kernel learning by group lasso. In ICML, pages 1175–1182, 2010.

[203] Feng Yang and KZ Mao. Robust feature selection for microarray data based on multicriterion fusion. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 8(4):1080–1092, 2011.

[204] Jing Yang, Dengju Yao, Xiaojuan Zhan, and Xiaorong Zhan. Predicting disease risks using feature selection based on random forest and support vector machine. In Bioinformatics Research and Applications, pages 1–11. Springer, 2014.

[205] Jingjing Yang, Yonghong Tian, Ling-Yu Duan, Tiejun Huang, and Wen Gao. Group-sensitive multiple kernel learning for object recognition. IEEE Transactions on Image Processing, pages 2838–2852.

[206] L. Yu and H. Liu. Feature selection for high-dimensional data: A fast correlation-based filter solution. In International Conference on Machine Learning, volume 20, page 856, 2003.

[207] Shi Yu, Tillmann Falck, Anneleen Daemen, Leon-Charles Tranchevent, Johan Ak Suykens, Bart De Moor, and Yves Moreau. L2-norm multiple kernel learning and its application to biomedical data fusion. BMC bioinformatics, 11:309, 2010.

[208] Xiao-tong Yuan, Zhenzhen Wang, Jiankang Deng, and Qingshan Liu. Efficient χ² Kernel Linearization via Random Feature Maps. pages 1–6, 2015.

[209] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. Recurrent Neural Network Regularization. arXiv preprint arXiv:1409.2329, 2014.

[210] Zheng Zhao and Huan Liu. Searching for interacting features. In IJCAI, volume 7, pages 1156–1161, 2007.

[211] Zheng Alan Zhao and Huan Liu. Spectral feature selection for data mining. Chapman & Hall/CRC, 2011.

[212] Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi-Supervised Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2009.

[213] Alexander Zien and Cheng Soon Ong. Multiclass multiple kernel learning. In Proceedings of the 24th international conference on Machine learning, pages 1191–1198. ACM, 2007.
