Prof. Anne Ruiz-Gazen, Toulouse School of Economics, Université Toulouse 1 Capitole, France.
Title: Multivariate functional outlier detection using Invariant Coordinate Selection
Abstract: Invariant Coordinate Selection (ICS) is an unsupervised multivariate statistical method that relies on the joint diagonalization of two scatter matrices. ICS is a dimension reduction technique used in particular for anomaly detection in multivariate data. A growing number of data sets are multivariate functional in nature, and there are several ways to extend the ICS outlier detection method to this multivariate functional framework. As usual in functional data analysis, we consider that the multivariate measurements correspond to functions observed on a discrete set of points in their domain. One possible extension of ICS consists of computing, for each component of the vector of curves, a functional approximation of the observed curves using a suitable basis and a finite number of basis elements. ICS is then applied to the stacked vector of the coordinates of the component functions in the chosen basis. Another possible extension is to compute ICS scores at each domain point and derive global outlyingness measures over the domain. In this talk, the two extensions are presented and compared on several real data examples, including flight monitoring data from the aeronautics industry.
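To make the joint diagonalization step concrete, here is a minimal sketch of ICS on ordinary multivariate data. It assumes the scatter pair is the sample covariance and the fourth-moment scatter matrix (often written COV and COV4), which is a common choice but is not specified in the abstract. The function name `ics_scores` and the simulated data are hypothetical, and the functional extensions discussed in the talk are not covered.

```python
import numpy as np

def ics_scores(X):
    """Sketch of ICS with the (assumed) scatter pair COV and COV4.

    Returns the generalized eigenvalues in decreasing order and the
    invariant coordinates (scores) of each observation.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)

    # First scatter: sample covariance; whiten the data with its inverse square root.
    S1 = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(S1)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ W

    # Second scatter on the whitened data: fourth-moment scatter matrix,
    # weighting each observation by its squared Mahalanobis distance.
    r2 = np.sum(Z ** 2, axis=1)
    S2 = (Z * r2[:, None]).T @ Z / (n * (p + 2))

    # Diagonalizing S2 in the whitened space jointly diagonalizes S1 and S2.
    rho, U = np.linalg.eigh(S2)
    order = np.argsort(rho)[::-1]
    return rho[order], Z @ U[:, order]

# Simulated example (hypothetical data): 300 Gaussian points, 3 shifted outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
X[:3, 0] += 10.0
rho, scores = ics_scores(X)
flagged = int(np.argmax(np.abs(scores[:, 0])))  # most extreme point on the first component
```

With a small proportion of outliers, the components associated with the largest eigenvalues are the ones expected to reveal them, so outlyingness is typically measured on the first few invariant coordinates.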
Prof. Josep Antoni Martin-Fernandez, University of Girona, Spain.
Title: Basic concepts on R-mode hierarchical clustering in Compositional Data
Abstract: R-mode hierarchical clustering (HC) identifies interrelationships between variables, which are useful for variable selection and dimension reduction. The application of HC in R-mode to Compositional Data (CoDa) must be consistent with the fundamental properties of the compositional geometry, also known as the Aitchison geometry. A composition is a multivariate quantitative description of the parts or components of a whole that conveys relative information, commonly expressed as a vector of proportions. The critical element of the Aitchison geometry is the inner product defined via the log-ratio coordinates of the compositions. This geometry allows a composition to be expressed as coordinates in an orthonormal basis formed by log-ratios; these are called olr-coordinates.
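As an illustration of the log-ratio inner product mentioned above, the following sketch computes the Aitchison inner product and distance through the centered log-ratio (clr) transform, one standard log-ratio representation. The function names are hypothetical, and all parts are assumed to be strictly positive.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_inner(x, y):
    """Aitchison inner product, computed as the Euclidean inner product of clr vectors."""
    return float(np.dot(clr(x), clr(y)))

def aitchison_distance(x, y):
    """Aitchison distance, computed as the Euclidean distance between clr vectors."""
    return float(np.linalg.norm(clr(x) - clr(y)))

# Only relative information matters: rescaling a composition changes nothing.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 2.0])
d_raw = aitchison_distance(x, y)
d_scaled = aitchison_distance(10.0 * x, 0.1 * y)
```

Here `d_raw` and `d_scaled` agree up to floating-point error, which reflects the scale invariance that any CoDa method is expected to respect.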
Recent publications introduce R-mode agglomerative HC methods in CoDa for creating orthonormal log-ratio bases. The HC methods form hierarchical groups of mutually exclusive subsets of parts, which can be associated with a sequential binary partition of the parts. In this talk, we explore the basic concepts of the R-mode clustering algorithms and the connections between concepts such as distance between parts, cluster representative of a group of parts, and compositional biplot. Practical examples will be presented to visually illustrate the proposed approach.
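The link between a sequential binary partition (SBP) and an orthonormal log-ratio basis can be sketched as follows. The code uses the standard balance construction, in which a step separating r parts from s parts gives coefficients +sqrt(s / (r(r+s))) and -sqrt(r / (s(r+s))). The SBP below is a hypothetical example for four parts, not one produced by the clustering methods of the talk, and the function names are assumptions.

```python
import numpy as np

def sbp_to_olr_basis(sbp):
    """Turn an SBP sign matrix (rows = partition steps, entries in {+1, -1, 0})
    into a D x (D-1) contrast matrix whose columns define olr-coordinates."""
    sbp = np.asarray(sbp, dtype=float)
    V = np.zeros_like(sbp)
    for k, row in enumerate(sbp):
        r = np.sum(row > 0)  # number of parts in the +1 group
        s = np.sum(row < 0)  # number of parts in the -1 group
        V[k, row > 0] = np.sqrt(s / (r * (r + s)))
        V[k, row < 0] = -np.sqrt(r / (s * (r + s)))
    return V.T

def olr(x, V):
    """olr-coordinates of a strictly positive composition x for contrast matrix V."""
    return np.log(np.asarray(x, dtype=float)) @ V

# Hypothetical SBP for 4 parts: {1,2} vs {3,4}, then 1 vs 2, then 3 vs 4.
sbp = [[1, 1, -1, -1],
       [1, -1, 0, 0],
       [0, 0, 1, -1]]
V = sbp_to_olr_basis(sbp)
x = np.array([0.1, 0.2, 0.3, 0.4])
coords = olr(x, V)
```

Each column of `V` sums to zero and the columns are orthonormal, so the coordinates depend only on ratios between parts and preserve Aitchison distances.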