Document ID: 202159
Pages: 478
Format: pdf
File size:
Category: Science and technology
Databases are now ubiquitous in industrial companies and public administrations, and they often grow to an enormous size. They contain units described by variables that are often categorical or numerical (the latter can of course also be transformed into categories). It is then easy to construct categories and their Cartesian products. In symbolic data analysis these categories are considered to be the new statistical units, and the first step is to obtain these higher-level units and to describe them in a way that takes their internal variation into account. What do we mean by ‘internal variation’? For example, the age of one player in a football team is 32, but the age of the players in the team (considered as a category) varies between 22 and 34; the height of the mushroom that I have in my hand is 9 cm, but the height of the species (considered as a category) varies between 8 and 15 cm.
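The football example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SODAS2 procedure: the records, the team names and the helper `to_intervals` are hypothetical, and the internal variation of each category is summarized by a simple [min, max] interval.

```python
from collections import defaultdict

# Hypothetical individual-level units: (team, player age).
players = [
    ("Team A", 22), ("Team A", 32), ("Team A", 34),
    ("Team B", 19), ("Team B", 27),
]

def to_intervals(records):
    """Aggregate (category, value) pairs into one (min, max) interval per category."""
    grouped = defaultdict(list)
    for category, value in records:
        grouped[category].append(value)
    # Each category becomes a higher-level unit described by an interval.
    return {category: (min(values), max(values)) for category, values in grouped.items()}

intervals = to_intervals(players)
print(intervals["Team A"])  # (22, 34)
```

A single player is described by one number (32), whereas the team, taken as a unit, is described by the interval (22, 34).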
A more general example is a clustering process applied to a huge database in order to summarize it. Each cluster obtained can be considered as a category, and each variable therefore takes varying values inside each category. Symbolic data, represented by structured variables, lists, intervals, distributions and the like, store the ‘internal variation’ of categories better than standard data, which they generalize. ‘Complex data’ are defined as structured data, mixtures of images, sounds, text, categorical data, numerical data, etc. Symbolic data can therefore be induced from categories of units described by complex data (see Section 1.4.1), and complex data describing units can be considered as a special case of symbolic data describing higher-level units.
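As a rough sketch of how a cluster can be described by a distribution rather than a single value, the snippet below turns a categorical variable into relative frequencies per cluster. The cluster labels are assumed to come from some earlier clustering step that is not shown, and the data and the helper `to_distributions` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical units: (cluster label, categorical value).
# The labels are assumed to be the output of a prior clustering step.
units = [
    (0, "red"), (0, "red"), (0, "blue"), (0, "red"),
    (1, "green"), (1, "blue"),
]

def to_distributions(records):
    """Describe each cluster by the relative frequencies of its categorical values."""
    counts = defaultdict(Counter)
    for cluster, value in records:
        counts[cluster][value] += 1
    return {
        cluster: {value: n / sum(counter.values()) for value, n in counter.items()}
        for cluster, counter in counts.items()
    }

distributions = to_distributions(units)
print(distributions[0])  # {'red': 0.75, 'blue': 0.25}
```

Each cluster is now a single higher-level unit whose description keeps the variation of the units it contains.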
The aim of symbolic data analysis is to generalize data mining and statistics to higher-level units described by symbolic data. The SODAS2 software, supported by EUROSTAT, extends the standard tools of statistics and data mining to these higher-level units. More precisely, symbolic data analysis extends exploratory data analysis (Tukey, 1958; Benzécri, 1973; Diday et al., 1984; Lebart et al., 1995; Saporta, 2006) and data mining (rule discovery, clustering, factor analysis, discrimination, decision trees, Kohonen maps, neural networks, etc.) from standard data to symbolic data.