DATA SCIENCE WEEK
Time: 13:30-15:00 on 21/08/2018, 13:30-15:00 on 22/08/2018, 15:30-17:00 on 23/08/2018, 13:30-15:00 on 23/08/2018, 13:30-15:00 on 24/08/2018, 09:00-11:00 on 24/08/2018, 09:00-11:00 on 27/08/2018, 14:00-16:00 on 28/08/2018, 09:00-11:00 on 29/08/2018, 09:00-11:00 on 30/08/2018
Venue: 3rd-floor Hall, Building B1, Hanoi University of Science and Technology (Đại học Bách khoa Hà Nội).
Speakers: Nguyễn Xuân Long, Nguyễn Hùng Sơn, Hồ Tú Bảo, Lê Hồng Vân
Summary: DATA SCIENCE WEEK
DSLab/VIASM & FIRST PROJECT
Organizers: Vietnam Institute for Advanced Study in Mathematics (VIASM), Hanoi University of Science and Technology, JVN.
AUGUST 21-24, AFTERNOON (13:30 - 15:00)
Lecture: Mathematical foundations of machine learning
Lecturer: Lê Hồng Vân, Institute of Mathematics of the Czech Academy of Sciences
Abstract: Machine learning is an interdisciplinary field at the intersection of mathematical statistics and computer science. It studies statistical models and develops methods and algorithms for deriving predictors or meaningful patterns from empirical data. Machine learning techniques are applied in search engines, natural language processing, image detection, robotics, self-driving cars and artificial intelligence. In our lecture series we address the following questions:
- What is the mathematical model of learning?
- How do we quantify the difficulty/complexity of a learning problem and the success of a learning machine?
- What are the current tasks of machine learning?
- Can machines learn to think?
Lecture 1 (August 21): Learning, machine learning and artificial intelligence.
Lecture 2 (August 22): Statistical models and framework for machine learning.
Lecture 3 (August 23): PAC (probably approximately correct) learning theory.
Lecture 4 (August 24): Deep learning, pattern theory and algebra of human thoughts.
--------------------------------------------------
AUGUST 23, AFTERNOON
Seminar:
Title: Streaming dynamic and distributed inference of latent geometric structures
Speaker: Nguyễn Xuân Long, University of Michigan, Ann Arbor
Time and Place: 15:30-17:00, VIASM
Abstract:
We develop new models and algorithms for learning the temporal dynamics of the topic polytopes and related geometric objects that arise in topic-model-based inference. Our model is nonparametric Bayesian, and the corresponding inference algorithm is able to discover new topics as time progresses. By exploiting the connection between the modeling of topic polytope evolution, the Beta-Bernoulli process and the Hungarian matching algorithm, our method is shown to be several orders of magnitude faster than existing topic modeling approaches, as demonstrated by experiments working with several million documents in a dozen minutes.
-----------------------------------------------------
AUGUST 24, MORNING
Tutorial Lecture: Gaussian and Dirichlet processes for statistical learning and inference
Lecturer: Nguyễn Xuân Long, University of Michigan, Ann Arbor
Time and Place: 9:00-11:00, VIASM
Abstract:
Bayesian nonparametrics is an area of statistics that provides a fertile and powerful mathematical framework for the development of many computational and statistical modeling ideas. The spirit of Bayesian nonparametrics is to enable inferential procedures in which both the statistical modeling and the computational complexity adapt gracefully and effectively to increasingly large and complex data patterns. This circle of ideas and techniques is expected to have an increasingly large role to play in the era of big data. In this tutorial lecture, I will introduce Gaussian processes and Dirichlet processes, two of the most common tools that arise in Bayesian nonparametric model constructions for regression, classification, clustering, and density estimation problems. Computational issues and applications will be discussed.
-----------------------------------------------------
AUGUST 27, MORNING
Lecture: Ontology based Machine Learning
Lecturer: Nguyễn Hùng Sơn, Warsaw University
Time and Place: 9:00-11:00, VIASM
Abstract:
One of the common problems in many machine learning or data mining projects is related to the issue of being “information rich” but “knowledge poor”. Usually, the domain knowledge is presented either as an ontology or as a taxonomy (a lighter version of an ontology) of concepts. In this talk we present some knowledge acquisition techniques as well as the utility of domain knowledge in some machine learning techniques. We will show that an ontology can improve the accuracy of classification and clustering algorithms. Moreover, an ontology can also be used to build a semantic evaluator for clustering algorithms.
------------------------------------------------------
AUGUST 28, AFTERNOON
Lecture: Neural Random Access Machines – a deep learning technique for sequential data.
Lecturer: Nguyễn Hùng Sơn, Warsaw University
Time and Place: 14:00-16:00, VIASM
Abstract: We present a new neural network architecture inspired by Neural Turing Machines called a Neural Random Access Machine. This architecture can manipulate and dereference pointers to an external variable-size random-access memory. It has been shown that the proposed model can learn to solve algorithmic tasks and is capable of discovering simple data structures like linked lists and binary trees. For a subset of tasks, the learned solutions generalize to sequences of arbitrary length.
-----------------------------------------------------
AUGUST 29, MORNING
Public lecture: Which AI for Vietnam? (AI nào cho Việt Nam?)
Lecturer: Hồ Tú Bảo, VIASM
Time and Place: 9:00-11:00, VIASM
Abstract:
In the first part, we summarize the fundamentals of AI and the state of its development in general. In the second part, based on the country's development goals and the state of AI development in Vietnam, we share some views on which areas of AI we need to, and should, focus on in the context of Vietnam in the era of digital transformation.
------------------------------------------------------
AUGUST 30, MORNING
Lecture: Data science in health care and medical research
Lecturer: Hồ Tú Bảo, VIASM
Time and Place: 9:00-11:00, VIASM
Abstract:
Health care and medical research are important fields for every country and pose many challenges. In recent years, the volume of medical data has grown enormously, opening up the possibility of exploiting and using it for both of these goals. This lecture first presents an overview of data science in health care, in particular the exploitation and use of electronic medical records. The second part of the lecture introduces some of the tasks, problems, and results of the research projects we have carried out and are carrying out in Vietnam.
-----------------------------------------------------------------
BIO
Nguyễn Xuân Long is an associate professor of Statistics and of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor. He received his PhD degree from the University of California, Berkeley. Nguyen's interests include nonparametric Bayesian statistics, machine learning and optimization, as well as applications in signal processing and environmental sciences. He is a recipient of the Leon O. Chua Award from UC Berkeley for his PhD research, the IEEE Signal Processing Society's Young Author best paper award, the CAREER award from the NSF's Division of Mathematical Sciences, and best paper awards from the International Conference on Machine Learning (ICML) in 2004 and 2014. Nguyen is an associate editor of several journals including Bayesian Analysis, Journal of Machine Learning Research and SIAM Journal on Mathematics of Data Science.
Nguyễn Hùng Sơn received his Ph.D. in 1997 and his D.Sci. (habilitation) in 2008, and he is a professor at the University of Warsaw. His main research interests are the fundamentals and applications of rough set theory, data mining, text mining, bioinformatics, intelligent multiagent systems, soft computing, and pattern recognition. On these topics he has published more than 140 research papers in edited books, international journals, and conferences. Hung Son Nguyen is a fellow of the International Rough Set Society and a member of the editorial boards of international journals, including “Transactions on Rough Sets”, “Data Mining and Knowledge Discovery” (2005-2008), “ERCIM News”, and “Computational Intelligence”, and he is the Managing Editor of “Fundamenta Informaticae”. He has served as a program co-chair of RSCTC'06, RSKT2012, and IJCRS2018, as a PC member of various other conferences including PKDD, PAKDD, AAMAS, RSCTC, RSFDGrC, and RSKT, and as a reviewer for many other journals. He has been involved in numerous research and commercial projects, including a dialog-based search engine (Nutech), fraud detection for Bank of America (Nutech), a logistics project for General Motors (Nutech), Semantic Search Engine, Intelligent Decision Support System for Firefighting in Poland, and RID – Development of Innovative Transport System.
Hồ Tú Bảo is Professor Emeritus of the Japan Advanced Institute of Science and Technology (JAIST), Director of the Data Science Lab of the Vietnam Institute for Advanced Study in Mathematics (VIASM) and Director of the John von Neumann Institute (JVN) of Vietnam National University at Ho Chi Minh City. He graduated from Hanoi University of Technology (1978) and received his Master's (1984) and Doctorate (1987) in Artificial Intelligence from Université Paris 6, and his Habilitation à diriger des recherches (1998) from Université Paris 9. He has been doing research, application and teaching in the fields of Artificial Intelligence, Machine Learning, Data Mining, and more recently Data Science for nearly forty years. He is a member of the Steering Committees of PRICAI (Pacific Rim International Conference on Artificial Intelligence), PAKDD (Pacific-Asia Conference on Knowledge Discovery and Data Mining), and ACML (Asian Conference on Machine Learning).
Lê Hồng Vân received her Ph.D. in mathematics from Moscow State University (MSU), Moscow, in 1987, under the supervision of Anatoly Fomenko, and her DrSc. in mathematics from MSU, Moscow, in 1990. She is a senior researcher at the Institute of Mathematics of the Czech Academy of Sciences (CAS) and holds an associate professorship at Opava University, Czech Republic. Prior to joining the CAS she held research positions at the Hanoi Institute of Mathematics in Hanoi, the International Center of Theoretical Physics (ICTP) in Trieste, the Max-Planck-Institute for Mathematics in Bonn, the University of Leipzig, and the Max-Planck-Institute for Mathematics in Leipzig. She held visiting positions at the Institut Henri Poincaré in Paris, the Isaac Newton Institute in Cambridge, the Institut des Hautes Études Scientifiques in Bures-sur-Yvette and ETH Zürich. Her current research interests include differential geometry, geometric analysis, differential topology, algebraic topology, symplectic topology, representation theory, theoretical statistics and machine learning. She is a co-author, with Nihat Ay, Jürgen Jost and Lorenz Schwachhöfer, of the book “Information Geometry”, published in 2017 in the Springer series “Ergebnisse der Mathematik und ihrer Grenzgebiete”. She received the prize of the Moscow Mathematical Association in 1990, the prize of ICTP in 1991, and the Heisenberg-Fellowship of the German Research Association (DFG) for 1994-1998.