Quiz Cluster analysis – Multivariate methods

Test your knowledge.

Receive immediate feedback.

You will find all the answers in the book.

Quiz | Cluster Analysis

1 / 37

In single-linkage clustering (nearest neighbor), how is the distance between a newly formed cluster and an object calculated?

By taking the average of the distances between the objects in the cluster and the object

By taking the maximum of the distances between the objects in the cluster and the object

By taking the sum of the distances between the objects in the cluster and the object

By taking the minimum of the distances between the objects in the cluster and the object

2 / 37

What does the Euclidean distance in a cluster analysis primarily consider?

Correlation between objects

Absolute differences between objects

Similarity between objects

Dissimilarity between objects

3 / 37

How are proximity measures typically categorized in cluster analysis?

As measures of variable correlations

As similarity measures only

As either similarity or distance measures

As dissimilarity measures only

4 / 37

How can the number of clusters in hierarchical cluster analysis be determined using the elbow criterion?

By identifying a "leap" in the values of the heterogeneity measure

By comparing the results of different clustering methods

By applying k-means cluster analysis

By calculating t-values for each variable

5 / 37

How do agglomerative and divisive hierarchical clustering methods differ?

Agglomerative methods are faster than divisive methods.

Agglomerative methods form different groups from a broad partition, while divisive methods divide a full sample into different groups.

Divisive methods are more commonly used in practice than agglomerative methods.

Agglomerative methods are based on a broad partition, while divisive methods form different groups from a granular partition.

6 / 37

Complete-linkage clustering (furthest neighbor) calculates distances between clusters by:

Taking the maximum of the distances between objects in the clusters.

Taking the sum of the distances between objects in the clusters.

Taking the average of the distances between objects in the clusters.

Taking the minimum of the distances between objects in the clusters.

7 / 37

What is the purpose of calculating t-values and F-values in cluster analysis?

To assess the quality of a clustering solution and characterize the clusters

To apply the elbow criterion

To identify outliers

To determine the number of clusters

8 / 37

In k-means clustering, what is the target criterion for forming clusters?

Maximum variance within clusters

Minimum variance within clusters

Maximum variance between clusters

Minimum variance between clusters

9 / 37

When should cluster analysis be used instead of factor analysis?

10 / 37

Which cluster fusion algorithm is known to provide fairly good partitions and often indicates the correct number of clusters?

Ward method

Complete linkage

Single linkage

Average linkage

11 / 37

What factors should be considered when selecting cluster variables for analysis?

The selection of a clustering method

The relevance, independence, and measurability of variables, among others

The number of clusters to be formed

The variables with the highest correlations

12 / 37

What is one of the ways to process nominally scaled variables in cluster analysis?

Convert them into ordinal variables

Use them as-is without any modification

Calculate their means and variances

Transform them into binary variables

13 / 37

Why is the single-linkage method considered suitable for identifying outliers?

It forms a few large groups with many small ones "left over".

It is not suitable for detecting outliers.

It tends to form chains, making it effective at detecting outliers.

It uses the largest value of individual distances.

14 / 37

What is the main objective of cluster analysis?

To study correlations between metric and nominal variables

To group similar objects into clusters based on similarities

To reduce the number of variables in a data set

To analyze causal relationships between variables

15 / 37

What algorithm can be used to detect outliers?

Complete-linkage algorithm

Single-linkage algorithm

Ward algorithm

16 / 37

What is the first step in performing a cluster analysis?

Determination of the number of clusters

Selection of the clustering method

Selection of cluster variables

Interpretation of a cluster solution

17 / 37

In cluster analysis, what does the Minkowski metric generalize?

The determination of the number of clusters

The Pearson correlation coefficient

The selection of cluster variables

The Euclidean distance and city block metric

18 / 37

When transforming a nominal variable into binary variables, what does the value '1' typically represent?

"Attribute value is uncertain"

"Attribute value does not exist"

"Attribute value is missing"

"Attribute value exists"

19 / 37

How can you determine the number of clusters?

K-Means

Agglomeration schedule

Dendrogram

20 / 37

What are the two types of hierarchical clustering?

Partitioning clustering

Agglomerative clustering

Divisive clustering

21 / 37

What is one limitation of agglomerative cluster procedures, especially for large case numbers?

They require calculating a distance matrix for each clustering step

They always yield accurate results

They work well with small datasets

They are computationally efficient

22 / 37

In the case of binary variables, what does the Simple Matching (SM) similarity coefficient count in the numerator?

The total number of properties

The number of properties common to both objects

The difference between the binary values

The number of properties that only one object has

23 / 37

What is the key criterion for clustering objects in Ward's method?

Maximizing the variance between clusters

Maximizing the number of clusters

Minimizing the total number of objects in each cluster

Minimizing the sum of squared distances within each cluster

24 / 37

Why is cluster analysis considered related to exploratory data analysis procedures?

It calculates the mean values of variables.

It is used for predictive modeling.

It leads to suggestions for grouping objects and discovering structures in datasets.

It helps identify the standard deviation of data.

25 / 37

When is it recommended to use a partitioning clustering algorithm like k-means or two-step clustering?

When dealing with small datasets

When there is a need for agglomerative clustering

When working with a large number of cases

When the Ward method is selected

26 / 37

What is the purpose of selecting an appropriate proximity measure in cluster analysis?

To quantify the similarity or dissimilarity between objects

To calculate the mean values of variables

To determine the number of clusters

To decide which variables are relevant for clustering

27 / 37

What should researchers always consider when presenting the results of a cluster analysis?

The data set should have at least 5,000 observations.

The number of clusters used should always be greater than 5.

There should be several clustering variables with similar meaning.

The stability of the results under different conditions.

28 / 37

What is the first step in a cluster analysis once the cluster variables have been determined?

Applying the Ward method

Creating a distance matrix between all cases

Deciding the number of clusters

Selecting the proximity measure and fusion algorithm

29 / 37

What is the primary advantage of Ward's method in cluster analysis?

It is suitable for identifying outliers.

It works best when variables are correlated.

It forms chains of objects.

It often finds good partitions and correctly assigns elements to groups.

30 / 37

What is the primary purpose of cluster analysis?

To merge objects into comparable groups based on similarities

To calculate the mean value of a dataset

To increase data heterogeneity

To identify the standard deviation of a dataset

31 / 37

What is the main characteristic of dilating clustering procedures?

They group objects into individual groups of approximately equal size.

They form a few large groups with many small ones "left over".

They tend to form chains by merging individual objects.

They show no tendency to dilate or contract.

32 / 37

Why might several iterations be required in a cluster analysis?

To save computational time

To avoid using proximity measures

To achieve a meaningful interpretation of the results

To confuse the results

33 / 37

What is the similarity coefficient that includes cases where both considered objects do not have certain attributes?

Simple Matching (SM) similarity coefficient

Russell and Rao similarity coefficient

Jaccard similarity coefficient

Euclidean similarity coefficient

34 / 37

In cluster analysis, what is "intragroup homogeneity"?

The number of groups formed

The degree of similarity within groups

The degree of dissimilarity between groups

The degree of dissimilarity within groups

35 / 37

What is the aim of cluster analysis?

36 / 37

Which similarity coefficient measures the relative proportion of common properties in relation to the number of properties that apply to at least one of the objects under consideration?

Euclidean similarity coefficient

Pearson similarity coefficient

Simple Matching (SM) similarity coefficient

Jaccard similarity coefficient

37 / 37

How is the similarity or dissimilarity between objects determined in cluster analysis?

By conducting a factor analysis

By performing discriminant analysis

By calculating the mean values of variables

By using proximity measures
