Jayden`s

    Scree Plot ํ™œ์šฉ๋ฒ•

    "Scree Plot" ์— ๋Œ€ํ•ด์„œ ์•Œ์•„๋ณด๊ณ , ์œ„์—์„œ PCA๋กœ ๋งŒ๋“  ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•˜์—ฌ ๋งŒ๋“ค์–ด๋ณด์„ธ์š”. 90%์˜ ๋‚ด์šฉ์„ ์„ค๋ช…ํ•˜๊ธฐ ์œ„ํ•ด์„œ, ๋ช‡๊ฐœ์˜ PC๋ฅผ ์‚ฌ์šฉํ•ด์•ผ ํ•˜๋‚˜์š”? ์œ„์˜ ์—ฌ๋Ÿฌ ๊ณผ์ •์€ ์ƒ๋žตํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค. :) ๋จผ์ € ๊ฐ ์ฃผ์„ฑ๋ถ„์— ๋Œ€ํ•œ ์•„์ด๊ฒ๋ฒจ๋ฅ˜๊ฐ’์„ ๋ชจ๋‘ ๋”ํ•˜๊ณ  ๋‚˜๋ˆ , ๊ฐ๊ฐ์˜ proportion์„ ๊ณ„์‚ฐํ•ฉ๋‹ˆ๋‹ค. values = values / np.sum(values) # ์œ„ ์˜ ๊ฐ’์„ ์‹œ๊ฐํ™” plt.title('Scree plot') plt.xlabel('numberofcomp') plt.ylabel('proposion') plt.plot(values); ๊ฐ๊ฐ์˜ ๊ณ ์œ ๊ฐ’์˜ ๋น„์ค‘์„ ๊ณ„์‚ฐํ•ด๋ด…๋‹ˆ๋‹ค. print(values[:2].sum()) print(values[:3].sum..

    [TIL]14.Clustering(๊ตฐ์ง‘ํ™”)

    ๋ชฉํ‘œ Scree plot์˜ ์˜๋ฏธ Supervised learning(์ง€๋„ํ•™์Šต)๊ณผ Unsupervised learning(๋น„์ง€๋„ํ•™์Šต)์— ๋Œ€ํ•œ ์ดํ•ด(์ฐจ์ด์— ๋Œ€ํ•œ ์ดํ•ด) Kmeans clustering์— ๋Œ€ํ•œ ์ดํ•ด Scree Plot Machine Learning ์ง€๋„ ํ•™์Šต(Supervised learning) ํŠธ๋ ˆ์ด๋‹ ๋ฐ์ดํ„ฐ์— ๋ผ๋ฒจ(๋‹ต)์ด ์žˆ์„ ๋•Œ ์‚ฌ์šฉํ•œ๋‹ค. ๋ถ„๋ฅ˜(Classification) ๋ถ„๋ฅ˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ์˜ ์นดํ…Œ๊ณ ๋ฆฌ ํ˜น์€ ํด๋ž˜์Šค ์˜ˆ์ธก์„ ์œ„ํ•ด ์‚ฌ์šฉ ํšŒ๊ท€(Regression ; prediction) ํšŒ๊ท€ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ continuousํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ๊ฒฐ๊ณผ๋ฅผ ์˜ˆ์ธกํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉ ๋น„์ง€๋„ ํ•™์Šต(Unsupervised learning) ๋ผ๋ฒจ๋ง์ด ๋˜์–ด์žˆ์ง€์•Š์€ ๊ฒฝ์šฐ ์‚ฌ์šฉ, ๋ฐ์ดํ„ฐ์˜ ํŠน์„ฑ์„ ํ† ๋Œ€๋กœ ์•Œ์•„์„œ..

    Dendrogram์„ ํ†ตํ•œ Clustering ์‹œ๊ฐํ™” ๋ฐ Elbow Method

    1. ์ •๊ทœํ™”๋ถ€ํ„ฐ!(๊ฐ ๋ณ€์ˆ˜์˜ ๊ธฐ์ค€์„ ๋งž์ถ”๊ธฐ ์œ„ํ•ด ์ •๊ทœํ™” ์ž‘์—…์„ ํ•ด์คฌ์Šต๋‹ˆ๋‹ค.) from sklearn.preprocessing import StandardScaler scaler = StandardScaler() Z = scaler.fit_transform(df) Z 2-1. Hierarchical Clustering ๋ฐ Dendrogram์„ ํ†ตํ•œ ์‹œ๊ฐํ™” import numpy as np from matplotlib import pyplot as plt from scipy.cluster.hierarchy import linkage, dendrogram from sklearn.cluster import AgglomerativeClustering Z = linkage(Z, method='ward&#39..

    Clustering(๊ตฐ์ง‘ํ™”)

    Machine Learning์—์„œ Supervised Learning / Unsupervised Learning / Reinforce Learning 3๊ฐ€์ง€์˜ ์ฐจ์ด๋Š” ๋ฌด์—‡์ผ๊นŒ?(์˜ˆ์‹œ๋„ ํ•จ๊ป˜!) ๋จผ์ € Machine Learning(๊ธฐ๊ณ„ ํ•™์Šต)์ด๋ž€ ์ธ๊ณต์ง€๋Šฅ์˜ ํ•˜์œ„ ์ง‘ํ•ฉ์œผ๋กœ ์ปดํ“จํ„ฐ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ํ†ตํ•ด ํ•™์Šตํ•˜๊ณ  ๊ฒฝํ—˜์„ ํ†ตํ•ด ๊ฐœ์„ ํ•˜๋„๋ก ํ•™์Šต์‹œํ‚ค๋Š” ๊ฒƒ์„ ๋งํ•œ๋‹ค. ๋จธ์‹ ๋Ÿฌ๋‹์—์„œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ์—์„œ ํŒจํ„ด๊ณผ ์ƒ๊ด€๊ด€๊ณ„ ๋“ฑ์˜ ๋ถ„์„์„ ํ† ๋Œ€๋กœ ์ตœ์ ์˜ ์˜์‚ฌ๊ฒฐ์ •๊ณผ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ฒƒ์— ์ดˆ์ ์„ ๋งž์ถ˜๋‹ค. Supervised Learning(์ง€๋„ํ•™์Šต) : ์ •๋‹ต์ด ์žˆ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ํ•™์Šต์‹œํ‚ค๋Š” ๋ฐฉ๋ฒ•. ์ž…๋ ฅ๊ฐ’์ด ์ฃผ์–ด์ง€๋ฉด ์ž…๋ ฅ๊ฐ’์— ๋Œ€ํ•œ Label๋„ ์ฃผ์–ด ํ•™์Šต์‹œํ‚ค๋Š” ๊ฒƒ์œผ๋กœ ๊ทธ ์ข…๋ฅ˜์—๋Š” ๋ถ„๋ฅ˜, ํšŒ๊ท€ ๋“ฑ์ด ์žˆ๋‹ค. ์˜ˆ์‹œ) ๊ฐ•์•„์ง€ ์‚ฌ์ง„..

    [TIL]13.High Dimensional Data

    ๋ชฉํ‘œ Vector Transformation ์ดํ•ด Eigenvector / Eigenvalue์— ๋Œ€ํ•œ ์ดํ•ด ๋ฐ์ดํ„ฐ์˜ feature ์ˆ˜(์ฐจ์› ์ˆ˜)๊ฐ€ ๋Š˜์–ด๋‚˜๋ฉด ์ƒ๊ธฐ๋Š” ๋ฌธ์ œ์  ๋ฐ ์ด๋ฅผ handlingํ•˜๊ธฐ ์œ„ํ•œ ๋ฐฉ๋ฒ• PCA์˜ ๊ธฐ๋ณธ ์›๋ฆฌ์™€ ๋ชฉ์ ์— ๋Œ€ํ•œ ์ดํ•ด Vector transformation R^2 ๊ณต๊ฐ„์—์„œ ๋ฒกํ„ฐ๋ฅผ ๋ณ€ํ™˜ ์ฆ‰, ์„ ํ˜• ๋ณ€ํ™˜์€ ์ž„์˜์˜ ๋‘ ๋ฒกํ„ฐ๋ฅผ ๋”ํ•˜๊ฑฐ๋‚˜ ํ˜น์€ ์Šค์นผ๋ผ๊ฐ’์„ ๊ณฑํ•˜๋Š” ๊ฒƒ $$T(u+v)=T(u)+T(v)$$ $$T(cu)=cT(u)$$ ๋ฒกํ„ฐ๋ณ€ํ™˜์œผ๋กœ์„œ์˜ '๋งคํŠธ๋ฆญ์Šค์™€ ๋ฒกํ„ฐ์˜ ๊ณฑ' f๋ผ๋Š” transformation์„ ์‚ฌ์šฉํ•˜์—ฌ ์ž„์˜์˜ ๋ฒกํ„ฐ [x1, x2]์— ๋Œ€ํ•ด [2x1 + x2, x1 - 3x2]๋กœ ๋ณ€ํ™˜์„ ํ•œ๋‹ค. \begin{align} f(\begin{bmatrix}x_1 \\ x_2 \end..