Jayden's

    [TIL] 79. Encoder-Decoder, Attention

    ์ธ์ฝ”๋”-๋””์ฝ”๋” RNN์˜ ๋‹ค๋Œ€์ผ(many to one) ๊ตฌ์กฐ๋Š” ์ฃผ๋กœ ๋ถ„๋ฅ˜ ๋ฌธ์ œ์— ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค.(๋ฌธ์žฅ์ด ๋“ค์–ด์˜ค๋ฉด ๊ธ์ •์ธ์ง€ ๋ถ€์ •์ธ ํŒ๋‹จํ•˜๋Š” ๊ฐ์„ฑ๋ถ„์„ ๋“ฑ) ๋‹ค๋Œ€๋‹ค(many to many) ๊ตฌ์กฐ๋Š” ์ฃผ๋กœ ๊ฐœ์ฒด๋ช… ์ธ์‹, ํ’ˆ์‚ฌ ํƒœ๊น…๊ณผ ๊ฐ™์€ ๋ฌธ์ œ๋ฅผ ํ’€๊ฒŒ๋ฉ๋‹ˆ๋‹ค. ์ธ์ฝ”๋”-๋””์ฝ”๋”์˜ ๊ตฌ์กฐ๋Š” ํ•˜๋‚˜์˜ RNN์„ ์ธ์ฝ”๋”, ๋˜ ๋‹ค๋ฅธ ํ•˜๋‚˜์˜ RNN์„ ๋””์ฝ”๋”๋กœ ๋‘๋Š” ๊ตฌ์กฐ ๋‘๊ฐœ์˜ RNN์„ ์—ฐ๊ฒฐํ•ด์„œ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ์ธ์ฝ”๋”-๋””์ฝ”๋” ๊ตฌ์กฐ๋Š” ์ฃผ๋กœ ์ž…๋ ต ๋ฌธ์žฅ๊ณผ ์ถœ๋ ฅ ๋ฌธ์žฅ์˜ ๊ธธ์ด๊ฐ€ ๋‹ค๋ฅผ ๋•Œ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ๋Œ€ํ‘œ์ ์œผ๋กœ ๋ฒˆ์—ญ๊ธฐ, ํ…์ŠคํŠธ ์š”์•ฝ ๋“ฑ์ด ์žˆ์Šต๋‹ˆ๋‹ค. seq2seq(Sequence to Sequence) ์ž…๋ ฅ๋œ ์‹œํ€€์Šค๋กœ๋ถ€ํ„ฐ ๋‹ค๋ฅธ ๋„๋ฉ”์ธ์˜ ์‹œํ€€์Šค๋ฅผ ์ถœ๋ ฅํ•˜๋Š” ๋ชจ๋ธ ์ฑ—๋ด‡, ๊ธฐ๊ณ„ ๋ฒˆ์—ญ ๋“ฑ ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์—์„œ ํ™œ์šฉ๋˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ์ž…๋ ฅ ์‹œํ€€์Šค๋ฅผ ์งˆ๋ฌธ, ์ถœ๋ ฅ ์‹œํ€€์Šค๋ฅผ ๋Œ€๋‹ต ..

    [4673] Self Numbers

    rm_list = []
    for i in range(1, 10001):
        for j in str(i):          # str(i) is evaluated once, so the digit sum uses the original i
            i += int(j)
        if i < 10000:
            rm_list.append(i)     # i is now d(n): n plus the sum of its digits
    tot_list = list(range(1, 10000))
    for k in set(rm_list):
        tot_list.remove(k)        # whatever was never generated is a self number
    for n in tot_list:
        print(n)

    Baekjoon 4673, Self Numbers. Oh... used well together, int and str make this so convenient.

    [๋”ฅ๋Ÿฌ๋‹, NLP] Transformer(Positional encoding, Attention)

    Positional Encoding: Unlike an RNN, a Transformer receives all tokens at once, so it cannot capture word position and order information through recurrence. Therefore, at input time, positional information for each token is created and added to the token; this process is Positional Encoding. Self-Attention. Attention: at every time step at which the decoder predicts an output word, it consults the entire input sentence from the encoder. Rather than attending to every input token with equal weight, it focuses (attends) more heavily on the input tokens most relevant to the word being predicted at that time step. Tokens within a sentence ..
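    Both ideas above fit in a few lines of plain Python. The sinusoidal scheme is the one used in the original Transformer; the similarity scores fed to the softmax here are made-up toy values standing in for query-key dot products.

```python
import math

def positional_encoding(pos, d_model):
    # sin on even dimensions, cos on odd ones, with geometrically
    # decreasing frequencies -- each position gets a unique pattern
    pe = []
    for i in range(d_model):
        angle = pos / (10000 ** (2 * (i // 2) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

def attention_weights(scores):
    # softmax over similarity scores: a higher score means more "focus"
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(positional_encoding(0, 4))            # position 0 -> [0.0, 1.0, 0.0, 1.0]
print(attention_weights([2.0, 0.5, 0.1]))   # weights are positive and sum to 1
```

    The encoding vector is simply added to the token embedding, and the softmax turns raw relevance scores into the unequal "attention" weights described above.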

    [๋”ฅ๋Ÿฌ๋‹, NLP] RNN, LSTM, GRU

    RNN (Recurrent Neural Network): A neural network with only the input layer -> hidden layer -> output layer structure is called a feed-forward neural network. A recurrent neural network (RNN), by contrast, sends the hidden layer's value to the output layer while at the same time feeding the hidden node's result back in as input for the next computation. [Figure: RNN structure] Both the left and the right drawings depict the RNN structure. In the right drawing, the third cell (the green box) receives the result produced from the previous x2 together with the new input x3 and emits the output y3. That is, when an RNN computes the output y_t for the input x_t at time step t, it also takes the result from the previous time step (t-1) as an input and reflects it. The reason for this structure is that natural language and time-se..
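    The recurrence described above can be shown with a single cell unrolled over time. The weights below are arbitrary illustration values; the point is that y_t depends not only on x_t but on the hidden state carried over from step t-1.

```python
import math

# One vanilla RNN cell unrolled over a sequence: the hidden state h is
# the extra input that feed-forward networks do not have.
# Weights are hypothetical illustration values.

def rnn(xs, w_x=1.0, w_h=0.9, w_y=1.0):
    h, ys = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)  # new state from input + previous state
        ys.append(w_y * h)                # output at this time step
    return ys

ys = rnn([0.5, 0.5, 0.5])
print(ys)  # identical inputs, different outputs: the state carries history
```

    Feeding the same value at every step still produces different outputs, which is exactly the "previous result flows into the next computation" behavior in the description.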

    [๋”ฅ๋Ÿฌ๋‹, NLP] ๋ถ„ํฌ ๊ฐ€์„ค, Word2Vec

    ๋ถ„ํฌ ๊ฐ€์„ค(Distributed Representation) ํšŸ์ˆ˜ ๊ธฐ๋ฐ˜์ด ์•„๋‹Œ, ๋‹จ์–ด์˜ ๋ถ„ํฌ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ‘œํ˜„ํ•˜๋Š” ๋ถ„ํฌ๊ธฐ๋ฐ˜ ๋‹จ์–ดํ‘œํ˜„์˜ ๋ฐฐ๊ฒฝ์ด ๋˜๋Š” ๊ฐ€์„ค ๋น„์Šทํ•œ ์œ„์น˜์— ๋“ฑ์žฅํ•˜๋Š” ๋‹จ์–ด๋“ค์€ ๋น„์Šทํ•œ ์˜๋ฏธ๋ฅผ ์ง€๋‹Œ๋‹ค.๋Š” ๊ฐ€์„ค์ž…๋‹ˆ๋‹ค. Word2Vec ๋ง๊ทธ๋Œ€๋กœ ๋‹จ์–ด๋ฅผ ๋ฒกํ„ฐํ™”ํ•˜๋Š” ๋ฐฉ๋ฒ• ์ค‘ ํ•˜๋‚˜๋กœ ์›ํ•ซ์ธ์ฝ”๋”ฉ๊ณผ๋Š” ๋‹ค๋ฅธ ๋ถ„์‚ฐ ํ‘œํ˜„ ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค. ์›ํ•ซ์ธ์ฝ”๋”ฉ์€ ๋‹จ์–ด ๋ฒกํ„ฐ์˜ ์ฐจ์›์ด ๋‹จ์–ด ์ง‘ํ•ฉ์˜ ํฌ๊ธฐ๊ฐ€ ๋˜๋ฉฐ, ํ•ด๋‹นํ•˜์ง€ ์•Š๋Š” ์—ด์—๋Š” ์ „๋ถ€ 0 ๊ฐ’์œผ๋กœ ํฌ์†Œํ•˜๋‹ค๋ฉด, Word2Vec์€ ๋น„๊ต์  ์ €์ฐจ์›์— ๋‹จ์–ด์˜ ์˜๋ฏธ๋ฅผ ๋ถ„์‚ฐํ•˜์—ฌ ํ‘œํ˜„ํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค. ์ฃผ๋ณ€ ๋‹จ์–ด๋ฅผ input์œผ๋กœ ์ค‘์‹ฌ ๋‹จ์–ด๋ฅผ ์˜ˆ์ธกํ•˜๋Š” CBoW ๋ฐฉ๋ฒ•๊ณผ ์ค‘์‹ฌ ๋‹จ์–ด๋ฅผ input์œผ๋กœ ์ฃผ๋ณ€ ๋‹จ์–ด๋ฅผ ์˜ˆ์ธกํ•˜๋Š” Skip-gram ๋ฐฉ๋ฒ•์ด ์žˆ์Šต๋‹ˆ๋‹ค.