DATA101

[Deep Learning] ์ตœ์ ํ™” ๊ฐœ๋…๊ณผ ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•(Gradient Descent)

๐Ÿ“š ๋ชฉ์ฐจ1. ์ตœ์ ํ™” ๊ฐœ๋… 2. ๊ธฐ์šธ๊ธฐ ๊ฐœ๋… 3. ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ• ๊ฐœ๋… 4. ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•์˜ ํ•œ๊ณ„1. ์ตœ์ ํ™” ๊ฐœ๋…๋”ฅ๋Ÿฌ๋‹ ๋ถ„์•ผ์—์„œ ์ตœ์ ํ™”(Optimization)๋ž€ ์†์‹ค ํ•จ์ˆ˜(Loss Function) ๊ฐ’์„ ์ตœ์†Œํ™”ํ•˜๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๊ตฌํ•˜๋Š” ๊ณผ์ •์ž…๋‹ˆ๋‹ค(์•„๋ž˜ ๊ทธ๋ฆผ 1 ์ฐธ๊ณ ). ๋”ฅ๋Ÿฌ๋‹์—์„œ๋Š” ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ์ž…๋ ฅํ•˜์—ฌ ๋„คํŠธ์›Œํฌ ๊ตฌ์กฐ๋ฅผ ๊ฑฐ์ณ ์˜ˆ์ธก๊ฐ’(\(\hat{y}\))์„ ์–ป์Šต๋‹ˆ๋‹ค. ์ด ์˜ˆ์ธก๊ฐ’๊ณผ ์‹ค์ œ ์ •๋‹ต(\(y\))๊ณผ์˜ ์ฐจ์ด๋ฅผ ๋น„๊ตํ•˜๋Š” ํ•จ์ˆ˜๊ฐ€ ์†์‹ค ํ•จ์ˆ˜์ž…๋‹ˆ๋‹ค. ์ฆ‰, ๋ชจ๋ธ์ด ์˜ˆ์ธกํ•œ ๊ฐ’๊ณผ ์‹ค์ ฏ๊ฐ’์˜ ์ฐจ์ด๋ฅผ ์ตœ์†Œํ™”ํ•˜๋Š” ๋„คํŠธ์›Œํฌ ๊ตฌ์กฐ์˜ ํŒŒ๋ผ๋ฏธํ„ฐ(a.k.a., Feature)๋ฅผ ์ฐพ๋Š” ๊ณผ์ •์ด ์ตœ์ ํ™”์ž…๋‹ˆ๋‹ค. ์ตœ์ ํ™” ๊ธฐ๋ฒ•์—๋Š” ์—ฌ๋Ÿฌ ๊ฐ€์ง€๊ฐ€ ์žˆ์œผ๋ฉฐ, ๋ณธ ํฌ์ŠคํŒ…์—์„œ๋Š” ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•(Gradient Descent)์— ๋Œ€ํ•ด ์•Œ์•„๋ด…๋‹ˆ๋‹ค.2. ๊ธฐ์šธ๊ธฐ ๊ฐœ๋…..