File:Datest Dendrogram PCP.JPG

Datest_Dendrogram_PCP.JPG (600 × 600 pixels, file size: 38 KB, MIME type: image/jpeg)


Summary

Description
English: This picture was created with MD*Tech XploRe.


Dendrogram

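This chart is a graphical representation of the hierarchical clustering result: it shows the order in which the countries are merged into clusters and the distance at which each merge occurs. For a more detailed interpretation, see en:Analysis of Tuberculosis.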

Parallel Coordinate Plot

This plot shows the cluster means (clusters are the groups found in the data) for each variable on a standardized vertical axis running from 0 to 1. For an interpretation, see en:Analysis of Tuberculosis.
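
The standardization behind the vertical axis can be illustrated outside XploRe. The following is a minimal Python sketch, assuming min-max scaling per variable as done for `mctrans` in the program code further down. The function name `min_max_scale_columns` and all numbers are hypothetical and not taken from the dataset.

```python
def min_max_scale_columns(rows):
    """Rescale each column of a row-major table to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        # A constant column has span 0; map it to 0.0 to avoid dividing by zero.
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


# Hypothetical cluster means: 3 clusters (rows) x 3 variables (columns).
cluster_means = [
    [2.0, 10.0, 50.0],
    [4.0, 30.0, 70.0],
    [6.0, 20.0, 90.0],
]
scaled = min_max_scale_columns(cluster_means)
```

After scaling, the smallest cluster mean of every variable sits at 0 and the largest at 1, so variables measured in different units can share one axis.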

Usage of the Program

This program can compute different numbers of clusters with several distance measures and clustering algorithms. The default settings are 3 clusters, the Euclidean distance and the Ward clustering algorithm. The output contains the text vectors of the country names falling into each cluster as well as the cluster means and variances. Furthermore, the PCP (Parallel Coordinate Plot) of the cluster means is plotted to give a first insight into the differences between the groups found in the data. Finally, the cluster information can be saved to new files for ongoing analysis (clusters could, for instance, be checked for outliers with the outlier program that can be found here).
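
The default settings (Euclidean distance, Ward's method, 3 clusters) can be illustrated with a small stand-alone example. The following is a naive Python sketch of agglomerative clustering with Ward's merge criterion. It is not the XploRe `agglom` implementation and may differ from it in details such as tie handling; the function name `ward_clusters` and the sample points are hypothetical.

```python
def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's criterion.

    Repeatedly merges the pair of clusters whose merge increases the
    within-cluster sum of squares the least, until k clusters remain.
    Returns a list of k sorted lists of point indices.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ia, ca), (ib, cb) = clusters[a], clusters[b]
                na, nb = len(ia), len(ib)
                sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Ward's merge cost: size-weighted squared Euclidean
                # distance between the two centroids.
                cost = na * nb / (na + nb) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ia, ca), (ib, cb) = clusters[a], clusters[b]
        na, nb = len(ia), len(ib)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[a] = (ia + ib, centroid)
        del clusters[b]
    return [sorted(members) for members, _ in clusters]


# Hypothetical data: six points forming three well-separated pairs.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
groups = ward_clusters(points, 3)
```

This version recomputes all pairwise costs in every step, which is fine for a handful of observations but slow for large datasets.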

Program Code

Attention! Repeating the computation requires a transformed dataset. If you have not yet computed and saved the transformation, first run the transformation program on the wiki page en:Analysis of Tuberculosis.


library("xplore")
library("stats")

; ----- Reading Data ---------------------------------------------------------------------------

choose = "Read data from:"

defaults = "C:\Dokumente und Einstellungen\All Users\Desktop\UN_data_ordered.csv"

v1 = readvalue(choose, defaults)

x = readcsvm(v1)

data = x.double
country = x.text

; ----- Cluster Analysis ----------------------------------------------------------------------

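; keep columns 4 onward (the variables used for clustering)
data2 = data[,4:cols(data)]
; scale each variable by its variance (note: the variance, not the standard deviation)
data3 = data2/var(data2)

; Euclidean distances between the countries, then Ward clustering into 3 clusters
d=distance(data3,"euclid")
t=agglom(d,"WARD",3)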

g=tree(t.g,0,"CENTER")
g=g.points
l = 5.*(1:rows(g)/5) + (0:4)' - 4

setmaskl(g, l, 0, 1, 1)
setmaskp(g, 0, 0, 0)

; ----- Graphical Options ----------------------------------------------------------------------

setsize(600, 600)
f = createdisplay(2,1)

axeson()
show(f, 1, 1, g)
title1 = "Dendrogram"
xlabel1 = "Countries"
ylabel1 = "Euclidean-distance"
setgopt(f, 1, 1, "title", title1, "xlabel", xlabel1, "ylabel", ylabel1)

; ----- Get Cluster Data and Countries ---------------------------------------------------------

cluster1=paf(data,t.pd==1)
cluster2=paf(data,t.pd==2)
cluster3=paf(data,t.pd==3)

x1=paf(data2,t.pd==1)
x2=paf(data2,t.pd==2)
x3=paf(data2,t.pd==3)

country[cluster1[,1]]
country[cluster2[,1]]
country[cluster3[,1]]

; ----- Get Basic Info about the Clusters (Mean, Variance) and Draw PCP ------------------------

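; cluster means, one column per cluster
mc=(mean(x1)')~(mean(x2)')~(mean(x3)')
mc

; cluster standard deviations (square roots of the variances), one column per cluster
vc=(sqrt(var(x1))')~(sqrt(var(x2))')~(sqrt(var(x3))')
vc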

col1  = grc.col.green-grc.col.blue
col2  = grc.col.red-grc.col.blue
col   = grc.col.blue+col1*(mc'[,1]<=min(mc'[,1]))+col2*(mc'[,1]>min(mc'[,1])&&mc'[,1]<max(mc'[,1]))

/*
mctrans = mc' - mean(mc')
mctrans = mctrans'/sqrt(var(mc'))'
mctrans
*/

; min-max scaling: rescale the cluster means of each variable to the range [0, 1]
mctrans = mc' - min(mc')
mctrans = mctrans'/(max(mc')-min(mc'))'
mctrans

gr = grpcp(mctrans',col)

; ----------- Graphical Options ----------

title2 = "Parallel Coordinate Plot of Cluster Means"
xlabel2 = "Aids"|"Mal"|"Tub"|"Con"|"Drug"|"Edu"|"Lit"|"San"|"Wat"|"CO2"|"Int"|"PC"|"Tel"
ylabel2 = "0"|"stdzd."|"1"

axesoff()
axes = graxes((0.80|13.2)~(-0.05|1.05), "origin", 7.5, "ytextpos", 9, "xtextpos", 6, "xticks", (1:13), "xtext", xlabel2, "yticks", 0|0.5|1, "ytext", ylabel2, "xtextsize", 16, "ytextsize", 16)
axes1 = graxes((0.80|13.2)~(-0.05|1.05), "origin", 1.5, "ytextpos", 3, "xtextpos", -1, "xticks", (1:13), "yticks", 0|0.5|1, "ytext", 0|0.5|1, "xtextsize", 16, "ytextsize", 16)

show(f, 2, 1, gr, axes, axes1)
setgopt(f, 2, 1, "title", title2)

; ----- Saving Options for Clusters-------------------------------------------------------------

proc()=save(c1, c2, c3)

head2 = "Save Clusters"

item2 = "Cluster1" | "Cluster2" | "Cluster3"

sel2 = selectitem(head2, item2)

switch

case(sel2[1]==1 && sel2[2]==0 && sel2[3]==0)

folder = "Save cluster1 to:"

default3 = "C:\Dokumente und Einstellungen\All Users\Desktop\Cluster1.csv"

v3=readvalue(folder, default3)


Date 30 March 2007 (original upload date)
Source Transferred from en.wikibooks to Commons.
Author Schtiwi at English Wikibooks

Licensing

Schtiwi at the English Wikipedia, the copyright holder of this work, hereby publishes it under the following license:
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled GNU Free Documentation License.
This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.
Attribution: Schtiwi at the English Wikipedia
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
This licensing tag was added to this file as part of the GFDL licensing update.

Original upload log

The original description page was here. All following user names refer to en.wikibooks.
Date/Time | Dimensions | User | Comment
2007-03-30 11:50 | 600×600 (38876 bytes) | Schtiwi | This picture has been created with MD*Tech XploRe.


File history

Date/Time | Dimensions | User | Comment
current: 14:19, 19 August 2017 | 600 × 600 (38 KB) | JackPotte | BotMoveToCommons (en.wikibooks); description copied from the original page
