March 8, 2018

HYBRID Tutorials

Room Europa IV Ground Floor Europa II Ground Floor OCEANIA I 2nd Floor OCEANIA II 2nd Floor OCEANIA III 2nd Floor OCEANIA IV 2nd Floor OCEANIA V 2nd Floor OCEANIA VI 2nd Floor OCEANIA VII 2nd Floor OCEANIA VIII 2nd Floor OCEANIA IX 2nd Floor OCEANIA X 2nd Floor
Capacity 300 Poster Session 100 160 100 100 120 150 110 100 180 190
08:00AM-10:00AM HYB_01 – part 1 CEC_01 CEC_06 CEC_10 CEC_15 FUZZ_01 Part 1 FUZZ_05 Part 1 IJCNN_01 Part 1 IJCNN_07 Part 1 IJCNN_12 Part 1
10:00AM-10:15AM COFFEE BREAK
10:15AM-12:15PM HYB_01 – part 2 CEC_02 CEC_07 CEC_11 CEC_16 FUZZ_01 Part 2 FUZZ_05 Part 2 IJCNN_01 Part 2 IJCNN_07 Part 2 IJCNN_12 Part 2
12:15PM-1:00PM LUNCH
1:00PM-3:00PM HYB_02 CEC_03 CEC_08 Part 1 CEC_12 CEC_17 Part 1 FUZZ_02 FUZZ_06 Part 1 IJCNN_03 IJCNN_08 IJCNN_13
3:00PM-3:15PM COFFEE BREAK
3:15PM-5:15PM HYB_03 CEC_04 CEC_08 Part 2 CEC_13 CEC_17 Part 2 FUZZ_03 FUZZ_06 Part 2 IJCNN_04 IJCNN_09 IJCNN_14 Part 1
5:15PM-7:15PM IJCNN_15 CEC_05 CEC_09 CEC_14 CEC_18 FUZZ_04 IJCNN_10 IJCNN_05 IJCNN_06 IJCNN_11 IJCNN_14 Part 2
7:30PM-9:30PM Welcome Reception @ Europa Room – Ground Floor

HYB_01 Computational Intelligence for Data Science and Big Data
HYB_02 Empirical Approach: How to get Fast, Interpretable Deep Learning
HYB_03 Interactive Adaptive Learning

Title: Computational Intelligence for Data Science and Big Data (HYB_01)

Organized by Isaac Triguero, Alberto Fernández, Mikel Galar

Description:

In the era of big data, recent advances in distributed technologies enable data mining techniques to discover unknown patterns or hidden relations in voluminous data more quickly. Extracting knowledge from big data becomes a very interesting and challenging task where we must consider new paradigms to develop scalable algorithms. However, computational intelligence models for machine learning and data mining cannot be straightforwardly adapted to the new space and time requirements. Hence, existing algorithms should be redesigned, or new ones developed, in order to take advantage of their capabilities in the big data context. Moreover, real-world complex big data problems pose several issues besides computational complexity, and big data mining techniques should be able to deal with challenges such as dimensionality, class imbalance, and lack of annotated samples, among others.

Addressing Big Data becomes a very interesting and challenging task where we must consider new paradigms to develop scalable algorithms. The MapReduce framework, introduced by Google, allows us to process large amounts of information. Its open source implementation, named Hadoop, has enabled the development of scalable algorithms and has become the de facto standard for addressing Big Data problems. Recently, new alternatives to the standard Hadoop-MapReduce framework have arisen to improve performance in this scenario, the Apache Spark project being the most relevant one. Even when working on Spark, the MapReduce paradigm implies that existing algorithms need to be redesigned, or new ones developed, in order to take advantage of their capabilities in the big data context.
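As a rough illustration of the map and reduce phases mentioned above, the following minimal sketch counts class labels over several data partitions in plain Python. It is not taken from the tutorial material: the function names (`map_partition`, `merge_counts`, `class_distribution`) and the toy data are hypothetical, and the partitions are processed sequentially here, whereas a framework such as Hadoop or Spark would distribute them across a cluster.

```python
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map phase: count the class labels found in one data partition."""
    return Counter(label for _features, label in records)


def merge_counts(left, right):
    """Reduce phase: merge two partial counts into a single count."""
    return left + right


def class_distribution(partitions):
    """Apply the map phase to every partition, then reduce the partial results."""
    partial = [map_partition(p) for p in partitions]  # sequential stand-in for parallel workers
    return reduce(merge_counts, partial, Counter())


# Toy data (hypothetical): each record is a (features, label) pair.
partitions = [
    [((0.1, 0.2), "neg"), ((0.4, 0.1), "neg"), ((0.9, 0.8), "pos")],
    [((0.3, 0.3), "neg"), ((0.2, 0.5), "neg")],
    [((0.8, 0.9), "pos"), ((0.5, 0.5), "neg")],
]
counts = class_distribution(partitions)
print(counts)
```

Because each partition is summarised independently and the merge step is associative, the same structure scales to data that does not fit on one machine. A count like this is also a simple way to expose class imbalance, one of the challenges named in the description.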

In this tutorial we will first provide a gentle introduction to the problem of Big Data and present recent technologies (Hadoop ecosystem, Spark, Flink). Then, we will dive into the field of Big Data analytics, explaining the challenges it poses for Computational Intelligence techniques and introducing Machine Learning libraries such as Mahout, MLlib, and FlinkML.

Afterwards, we will cover two of the main topics of WCCI 2018, namely fuzzy modeling and evolutionary models in the Big Data context. We start by introducing the features and design of the most recent approaches for fuzzy modeling on Big Data. Then, we continue with several case studies on evolutionary instance selection/generation, feature selection/weighting and imbalanced data classification.

We aim at defining the direction for the design of powerful algorithms based on both fuzzy systems and evolutionary algorithms, and how the information extracted with these models can be useful for the experts.

Finally, we will use the software associated with the case studies presented to carry out a live demonstration of some of our most recently developed models for Big Data classification.

Intended Audience
This tutorial is aimed at all researchers involved in the development of fuzzy models and evolutionary algorithms, providing them with an overview of the existing technologies to deal with Big Data problems. The audience will also be able to understand the impact of such approaches in Data Science, in particular by means of our most recent research publications on the topic and a demonstration section intended to help participants better understand the underlying process.

Short Biography

Isaac Triguero received the M.Sc. and Ph.D. degrees in Computer Science from the University of Granada, Granada, Spain, in 2009 and 2014, respectively. He is currently an Assistant Professor in Data Science at the School of Computer Science of the University of Nottingham. He has published more than 25 international journal papers as well as more than 20 contributions to conferences. His research interests include data mining, data reduction, biometrics, optimization, evolutionary algorithms, semi-supervised learning, bioinformatics and big data learning.

Alberto Fernández received the M.Sc. and Ph.D. degrees in Computer Science from the University of Granada, Granada, Spain, in 2005 and 2010, respectively. He is currently an Assistant Professor with the Department of Computer Science and Artificial Intelligence, University of Granada, Spain. He has published more than 100 papers in highly rated JCR journals and international conferences. In 2013, 2014, and 2017 Dr. Fernández received the University of Granada Prize for Scientific Excellence Works in the field of Engineering. In 2011 he was also awarded the Lotfi A. Zadeh Best Paper prize (IFSA Association). He has recently been selected as a Highly Cited Researcher http://highlycited.com (in the field of Computer Science, 2017 Clarivate Analytics). His research interests include classification in imbalanced domains, fuzzy rule learning, evolutionary algorithms, multiclassification problems with ensembles and decomposition techniques, and data science in big data applications.

Mikel Galar received the M.Sc. and Ph.D. degrees in Computer Science in 2009 and 2012, both from the Public University of Navarre, Pamplona, Spain. He is currently an Assistant Professor at the Department of Automatic and Computation at the Public University of Navarre. He is the author of 31 published original articles in international journals and more than 45 contributions to conferences. He is also a reviewer for more than 35 international journals. His research interests are data mining, classification, multi-classification, ensemble learning, evolutionary algorithms, fuzzy systems and big data. He is a member of the IEEE, the European Society for Fuzzy Logic and Technology (EUSFLAT) and the Spanish Association of Artificial Intelligence (AEPIA). He has received the extraordinary prize for his PhD thesis from the Public University of Navarre and the 2013 IEEE Transactions on Fuzzy Systems Outstanding Paper Award for the paper “A New Approach to Interval-Valued Choquet Integrals and the Problem of Ordering in Interval-Valued Fuzzy Set Applications” (bestowed in 2016).

Title: Empirical Approach: How to get Fast, Interpretable Deep Learning (HYB_02)

Organized by Plamen Angelov and Xiaowei Gu

Description:

We are witnessing an explosion of data (streams) being generated and growing exponentially. Nowadays we carry gigabytes of data in our pockets in the form of USB flash memory sticks, smartphones, smartwatches, etc. Extracting useful information and knowledge from these big data streams is of immense importance for society, the economy and science. Deep Learning has quickly become synonymous with a powerful method for endowing items and processes with elements of AI, in the sense that it makes possible human-like performance in recognising images and speech. However, the currently used deep learning methods, which are based on neural networks (recurrent, belief, etc.), are opaque (not transparent), require huge amounts of training data and computing power (hours of training using GPUs), and work offline; their online versions based on reinforcement learning have no proven convergence and do not guarantee the same result for the same input (they lack repeatability).

The presenters recently introduced a new concept, the empirical approach to machine learning and fuzzy sets and systems, proved convergence for a class of such models, and exploited the link between neural networks and fuzzy systems (neuro-fuzzy systems are known to exhibit a duality between radial basis function (RBF) networks and fuzzy rule-based models, with the key property of universal approximation proven for both).

In this tutorial we will present in a systematic way the basics of the newly introduced Empirical Approach to Machine Learning, Fuzzy Sets and Systems and its applications to problems such as anomaly detection, clustering, classification, prediction and control. The major advantage of this new paradigm is the liberation from restrictive and often unrealistic assumptions and requirements concerning the nature of the data (random, deterministic, fuzzy), the need to formulate and assume a priori the type of distribution models and membership functions, the independence of the individual data observations, their large (theoretically infinite) number, etc. From a pragmatic point of view, this direct approach from data (streams) to a complex, layered model representation is fully automated and leads to very efficient model structures. For example, we will demonstrate and explain step by step how fast, transparent, non-parametric, re-trainable and dynamically evolving deep learning classifiers can be developed that do not require huge amounts of training data and computational power (for comparison, currently existing deep learning models require hours of training on GPUs and tens of thousands of training samples, and generate complex, cumbersome black-box representations with tens of millions of parameters or more that are not directly interpretable). Moreover, the proposed new methods can guarantee convergence and stability for the first-order models, can be highly parallelised, and are as precise as the traditional ones. Furthermore, they do not require stochastic tricks such as “elastic distortion” or the use of stochastic models, and as a result can guarantee full repeatability (the same result for the same input image no matter how many times the experiment is repeated). In addition, the proposed new concept learns in a way similar to the way people learn: it can start from a single example.
Imagine people who can recognise an object which they have seen only once and can associate with it any other previously unseen object that is similar to it. This is entirely possible in real life. However, existing machine learning approaches cannot start without prior training on labelled data. The proposed new approach makes this possible because it is prototype-based and non-parametric. We will further demonstrate semi-supervised learning where only a small fraction of the data, e.g. 5%, is labelled and the rest is associated based on its similarity to the prototypes identified from these 5% of the data. We will use a number of experimental results and examples to visualise and demonstrate this new approach and to involve the audience. A book is being prepared and software will be made available for further hands-on experience with this new methodology.
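To make the idea of associating unlabelled data with prototypes more concrete, here is a minimal sketch of a generic nearest-prototype labelling step. It is not the presenters' algorithm: it assumes, purely for illustration, that each class is represented by a single prototype equal to the mean of its labelled points and that similarity is measured with Euclidean distance. The names `build_prototypes` and `assign_label` and the toy data are hypothetical.

```python
import math


def build_prototypes(labelled):
    """Assumed scheme: one prototype per class, the mean of its labelled points."""
    sums, counts = {}, {}
    for point, label in labelled:
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def assign_label(point, prototypes):
    """Give an unlabelled point the label of its nearest prototype."""
    return min(prototypes, key=lambda label: math.dist(point, prototypes[label]))


# Toy data (hypothetical): a few labelled points and a pool of unlabelled ones.
labelled = [((0.0, 0.0), "a"), ((0.2, 0.0), "a"), ((1.0, 1.0), "b")]
unlabelled = [(0.1, 0.1), (0.9, 0.8), (0.6, 0.7)]

prototypes = build_prototypes(labelled)
predicted = [assign_label(p, prototypes) for p in unlabelled]
print(predicted)
```

Note that class "b" is represented by a single labelled example, which is enough for the nearest-prototype rule to operate. The sketch has no trainable parameters and is deterministic, so repeated runs on the same input give the same labels.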

Short Biography

Prof. Angelov (MEng 1989, PhD 1993, DSc 2015) is a Fellow of the IEEE, of the IET and of the HEA. He is Vice President of the International Neural Networks Society (INNS) for Conferences and Governor of the Systems, Man and Cybernetics Society of the IEEE. He has 25+ years of professional experience in high-level research and holds a Personal Chair in Intelligent Systems at Lancaster University, UK. He leads the Data Science group at the School of Computing and Communications, which includes over 20 academics, researchers and PhD students. He has authored or co-authored 250 peer-reviewed publications in leading journals and peer-reviewed conference proceedings, 6 patents, and two research monographs (by Wiley, 2012 and Springer, 2002), cited 6300+ times with an h-index of 39 and an i10-index of 111. His single most cited paper has 810 citations. He has an active research portfolio in the area of computational intelligence and machine learning and internationally recognised results in online and evolving learning and algorithms for knowledge extraction in the form of human-intelligible fuzzy rule-based systems. Prof. Angelov leads numerous projects (including several multimillion ones) funded by UK research councils, the EU, industry and the UK MoD. His research was recognised by ‘The Engineer Innovation and Technology 2008 Special Award’ and ‘For outstanding Services’ (2013) by IEEE and INNS. He is also the founding co-Editor-in-Chief of Springer’s journal Evolving Systems and Associate Editor of several leading international scientific journals, including IEEE Transactions on Fuzzy Systems (the IEEE Transactions with the highest impact factor) and IEEE Transactions on Systems, Man and Cybernetics, as well as several other journals such as Applied Soft Computing, Fuzzy Sets and Systems, Soft Computing, etc. He has given over a dozen plenary and keynote talks at high-profile conferences.
Prof. Angelov was General co-Chair of a number of high-profile conferences including IJCNN2013, Dallas, TX; IJCNN2015, Killarney, Ireland; the inaugural INNS Conference on Big Data, San Francisco; the 2nd INNS Conference on Big Data, Thessaloniki, Greece; and a series of annual IEEE Symposia on Evolving and Adaptive Intelligent Systems. He is the founding Chair of the Technical Committee on Evolving Intelligent Systems, SMC Society of the IEEE, and previously chaired the Standards Committee of the Computational Intelligence Society of the IEEE (2010-2012). He has also been a member of the International Program Committee of over 100 international conferences (primarily IEEE). More details can be found at www.lancs.ac.uk/staff/angelov

Xiaowei Gu received the B.Eng. and M.Eng. degrees from Hangzhou Dianzi University, Hangzhou, China. He is currently completing his Ph.D. degree in computer science at Lancaster University, Lancaster, U.K. He has co-authored over 20 publications, primarily in prestigious IEEE Transactions and conferences, and has received IEEE awards (travel award in 2016 and ALMA competition award in 2017).

Title: Interactive Adaptive Learning (HYB_03)

Organized by Adrian Calma, Daniel Kottke, Robi Polikar

Description:

Science, technology, and commerce increasingly recognize the importance of machine learning approaches for data-intensive, evidence-based decision making. This is accompanied by increasing numbers of machine learning applications and volumes of data. Nevertheless, the capacities of processing systems, human supervisors, or domain experts remain limited in real-world applications. Furthermore, applications require fast reaction to new situations, which means that predictive models need to be available even when little data is available yet. Therefore, approaches are needed that optimize the whole learning process, including the interaction with human supervisors, processing systems, and data of various kinds arriving at different times. Such approaches include (1) techniques for estimating the impact of additional resources (e.g. data) on the learning progress; (2) techniques for the active selection of the information processed or queried; (3) techniques for reusing knowledge across time, domains, or tasks, by identifying similarities and adapting to changes between them; (4) techniques for making use of different types of information, such as labeled or unlabeled data, constraints or domain knowledge. Solutions are provided in, for example, the fields of adaptive, active, semi-supervised, and transfer learning. However, this is mostly done in separate lines of research. Combinations thereof in interactive and adaptive machine learning systems that are capable of operating under various constraints, and thereby address the immanent real-world challenges of volume, velocity and variability of data and data mining systems, are rarely reported.

The importance of the interactive machine learning setting has recently been acknowledged [Holzinger, “Interactive machine learning for health informatics: when do we need the human-in-the-loop?” (Brain Informatics 3/2), 2016]. While a widely accepted taxonomy still needs to be developed, we consider the following definition based on Holzinger’s definition: “Interactive adaptive learning comprises systems and algorithms that can interact with agents, which could be humans or other smart systems, in a learning loop, observing the result of learning, optimizing their learning behavior through these interactions and providing input that improves the learning outcome”.

Therefore, this tutorial aims to bring together researchers and practitioners from these different areas, and to stimulate research in interactive and adaptive machine learning systems as a whole. To this end, we discuss several topics in the field of interactive adaptive learning, e.g. stream mining, active learning and semi-supervised learning, and point out the interdependence between these fields.
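As a small illustration of what the active selection of queried information can look like, the sketch below implements least-confidence sampling, one common active learning heuristic. It is an assumption chosen for illustration, not a method the tutorial is known to present. The function name `least_confident` and the toy probabilities are hypothetical; in practice the probabilities would come from whatever classifier is being trained.

```python
def least_confident(pool_probabilities):
    """Return the index of the pool item whose top class probability is lowest."""
    return min(
        range(len(pool_probabilities)),
        key=lambda i: max(pool_probabilities[i]),
    )


# Toy data (hypothetical): predicted class probabilities for three unlabelled items.
pool_probabilities = [(0.95, 0.05), (0.55, 0.45), (0.80, 0.20)]
query_index = least_confident(pool_probabilities)
print(query_index)
```

The selected item is the one the current model is least sure about, so asking a human supervisor to label it is expected to be more informative than labelling an item the model already classifies confidently.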

Short Biography

Robi Polikar is a Professor of Electrical and Computer Engineering at Rowan University, in Glassboro, NJ. He received his B.Sc. degree in electronics and communications engineering from Istanbul Technical University, Istanbul, Turkey in 1993, and his M.Sc. and Ph.D. degrees, both co-majors in electrical engineering and biomedical engineering, from Iowa State University, Ames, IA in 1995 and 2000, respectively. His current research interests within computational intelligence include ensemble systems, incremental and nonstationary learning, and various applications of pattern recognition in bioinformatics and biomedical engineering. He is a member of IEEE, ASEE, Tau Beta Pi and Eta Kappa Nu. His current and prior works are funded primarily through NSF’s CAREER and Energy, Power and Adaptive Systems (EPAS) programs. He has been heavily involved with IJCNN for over a decade through many special sessions, as well as by serving on the organizing committee. He is also an Associate Editor for IEEE Transactions on Neural Networks and Learning Systems. He has organized various special sessions and tutorials, including the IJCNN special sessions on Concept Drift, Domain Adaptation & Learning in Dynamic Environments.

Daniel Kottke is a PhD student at the University of Kassel in Germany. He studied Computer Science (BSc) and Data & Knowledge Engineering (MSc) at Otto von Guericke University Magdeburg and received the MSc degree with distinction in 2014. His main research interests are machine learning, active learning, probabilistic methods and applications in the field of Neuroscience, especially brain-computer interfaces. In 2016, he co-organized the iKNOW 2016 Workshop on Active Learning: Applications, Foundations and Emerging Trends in Graz, Austria. In 2017, he co-organized the combined workshop and tutorial “Interactive Adaptive Learning” and led the tutorial part on Active Learning.

Adrian Calma started his studies at the Babeș-Bolyai University, Cluj-Napoca, Romania. He moved to Germany, where he received his B.Sc. and M.Sc. in Computer Science from the University of Kassel. Since autumn 2014 he has been pursuing a PhD and has co-authored seven peer-reviewed publications, four of them in the field of active learning. He is passionate about machine learning and its applications, and focuses his research on active, semi-supervised, and collaborative learning. In 2017, he co-organized the combined workshop and tutorial “Interactive Adaptive Learning” and chaired the workshop session on Semi-supervised and Transfer Learning.