Symbolic Data Analysis and the SODAS Software

Uploaded by: nhendoc115

Document ID: 202159

Pages: 478

Format: pdf

Category: Science and technology

Info

Databases are now ubiquitous in industrial companies and public administrations, and they often grow to an enormous size. They contain units described by variables that are often categorical or numerical (the latter can of course also be transformed into categories). It is then easy to construct categories and their Cartesian product. In symbolic data analysis these categories are considered to be the new statistical units, and the first step is to obtain these higher-level units and to describe them in a way that takes account of their internal variation. What do we mean by ‘internal variation’? For example, the age of a single player in a football team is 32, but the age of the players in the team (considered as a category) varies between 22 and 34; the height of the mushroom that I have in my hand is 9 cm, but the height of the species (considered as a category) varies between 8 and 15 cm.
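The step from individual units to higher-level units can be illustrated with a minimal Python sketch. It is not taken from the SODAS software; the `players` data and the `to_intervals` helper are hypothetical names introduced only for illustration, and the interval [min, max] is just one simple way of recording internal variation.

```python
from collections import defaultdict

# Hypothetical first-level units: (team, age) for individual players.
players = [
    ("Team A", 22),
    ("Team A", 32),
    ("Team A", 34),
    ("Team B", 19),
    ("Team B", 27),
]

def to_intervals(units):
    """Group units by category and describe each category by the
    (min, max) interval of the values observed inside it."""
    grouped = defaultdict(list)
    for category, value in units:
        grouped[category].append(value)
    return {cat: (min(vals), max(vals)) for cat, vals in grouped.items()}

# Each team is now a single higher-level unit described by an age interval.
team_ages = to_intervals(players)
print(team_ages)
```

Here the player aged 32 is a standard unit with a single value, while "Team A" becomes a new unit whose age is the interval from 22 to 34, as in the football example.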

A more general example is a clustering process applied to a huge database in order to summarize it. Each cluster obtained can be considered as a category, and therefore each variable value will vary inside each category. Symbolic data, represented by structured variables, lists, intervals, distributions and the like, store the ‘internal variation’ of categories better than standard data do, and they generalize standard data. ‘Complex data’ are defined as structured data, mixtures of images, sounds, text, categorical data, numerical data, etc. Symbolic data can therefore be induced from categories of units described by complex data (see Section 1.4.1), so complex data describing units can be considered as a special case of symbolic data describing higher-level units.
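As a rough sketch of how clusters can be turned into symbolic descriptions, the Python fragment below summarizes each cluster by an interval for a numerical variable and by a relative-frequency distribution for a categorical one. The cluster labels are assumed to be given already (no clustering algorithm is run), and the `units` data and the `describe_clusters` helper are hypothetical, not part of SODAS.

```python
from collections import Counter, defaultdict

# Hypothetical units already assigned to clusters:
# (cluster label, height in cm, colour).
units = [
    (0, 8.0, "brown"),
    (0, 15.0, "brown"),
    (0, 9.0, "white"),
    (1, 3.0, "red"),
    (1, 5.0, "red"),
]

def describe_clusters(rows):
    """Describe each cluster by an interval (numerical variable)
    and a relative-frequency distribution (categorical variable)."""
    by_cluster = defaultdict(list)
    for cluster, height, colour in rows:
        by_cluster[cluster].append((height, colour))

    description = {}
    for cluster, members in by_cluster.items():
        heights = [h for h, _ in members]
        counts = Counter(c for _, c in members)
        n = len(members)
        description[cluster] = {
            "height_cm": (min(heights), max(heights)),
            "colour": {c: k / n for c, k in counts.items()},
        }
    return description

symbolic = describe_clusters(units)
print(symbolic)
```

A cluster with a single member would yield a degenerate interval and a one-point distribution, which is one way to see standard data as a special case of symbolic data.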

The aim of symbolic data analysis is to generalize data mining and statistics to higher-level units described by symbolic data. The SODAS2 software, supported by EUROSTAT, extends the standard tools of statistics and data mining to these higher-level units. More precisely, symbolic data analysis extends exploratory data analysis (Tukey, 1958; Benzécri, 1973; Diday et al., 1984; Lebart et al., 1995; Saporta, 2006), and data mining (rule discovery, clustering, factor analysis, discrimination, decision trees, Kohonen maps, neural networks, ...) from standard data to symbolic data.
