[NLP]Recurrent Neural Network and Language Modeling

JH_KIM 2021. 9. 7. 13:33

RNN

- Basic structure (Vanilla RNN)

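The hidden state produced from the previous data point is fed back in as an input, together with the current input.

(Figure: rolled form on the left, unrolled form on the right.)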

- How to calculate the hidden state of RNNs

h_t = f_W(h_{t-1}, x_t)

h_{t-1}: old hidden state

h_t: new hidden state

x_t: input vector at time step t

f_W: RNN function with parameters W
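The recurrence these symbols describe can be sketched in NumPy. This is a minimal sketch that assumes the common vanilla form h_t = tanh(W_hh h_{t-1} + W_xh x_t + b_h), where x_t is the input at time step t. The weight names, sizes, and random inputs are illustrative assumptions, not something taken from the original notes.

```python
import numpy as np

def rnn_step(h_prev, x_t, W_hh, W_xh, b_h):
    """One vanilla RNN update: h_t = tanh(W_hh @ h_prev + W_xh @ x_t + b_h)."""
    return np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)

# Hypothetical sizes, chosen only for illustration.
hidden_size, input_size = 3, 4
rng = np.random.default_rng(0)
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
b_h = np.zeros(hidden_size)

h = np.zeros(hidden_size)                # initial hidden state h_0
xs = rng.normal(size=(5, input_size))    # toy sequence of 5 input vectors
hidden_states = []
for x_t in xs:
    # The same parameters W are reused at every time step.
    h = rnn_step(h, x_t, W_hh, W_xh, b_h)
    hidden_states.append(h)
hidden_states = np.stack(hidden_states)  # shape (5, hidden_size)
```

The loop is the "unrolled" view: one call to `rnn_step` per time step, always with the same weights.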

Types of RNNs

- One-to-one

Standard Neural Network

- One-to-many

Image Captioning

- Many-to-one

Sentiment Classification

- Sequence-to-sequence

Machine Translation

-> The output is produced only after the whole input has been read.

Video classification on frame level

-> An output is produced as soon as each input arrives.
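As a rough illustration of how these types differ, the sketch below runs one toy RNN over a sequence and then reads the outputs in two ways: once from the last hidden state only (many-to-one) and once from every hidden state (the per-step sequence-to-sequence case). All names and sizes are assumptions made for the example.

```python
import numpy as np

def run_rnn(xs, W_hh, W_xh):
    """Run a vanilla RNN over xs and return every hidden state, shape (T, hidden)."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in xs:
        h = np.tanh(W_hh @ h + W_xh @ x_t)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
T, input_size, hidden_size, num_classes = 6, 4, 3, 2  # hypothetical sizes
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hy = rng.normal(scale=0.1, size=(num_classes, hidden_size))

xs = rng.normal(size=(T, input_size))
states = run_rnn(xs, W_hh, W_xh)

# Many-to-one (e.g. sentiment classification): one output from the last state.
many_to_one = W_hy @ states[-1]        # shape (num_classes,)

# Per-step sequence-to-sequence (e.g. frame-level video classification):
# one output for every input, available as soon as that input is read.
many_to_many = states @ W_hy.T         # shape (T, num_classes)
```

The machine-translation variant differs in that decoding starts only after the final input has been consumed; that usually needs a separate decoder and is not shown here.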

Character-level Language Model

Example of training sequence "hello"

Predict which character comes next.

- Vocab: [h,e,l,o]

The model takes each character as a one-hot vector:

h : [1,0,0,0], e:[0,1,0,0], l:[0,0,1,0], o:[0,0,0,1]
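The encoding above, and the input/target pairs for the training sequence "hello", can be built in a few lines. The helper names `char_to_idx` and `one_hot` are made up for this sketch.

```python
vocab = ["h", "e", "l", "o"]
char_to_idx = {ch: i for i, ch in enumerate(vocab)}

def one_hot(ch):
    """Return the one-hot vector for a character in the vocabulary."""
    vec = [0] * len(vocab)
    vec[char_to_idx[ch]] = 1
    return vec

text = "hello"
# At each step the model sees one character and must predict the next one.
inputs = [one_hot(ch) for ch in text[:-1]]       # "h", "e", "l", "l"
targets = [char_to_idx[ch] for ch in text[1:]]   # "e", "l", "l", "o"
```

Note that the input "l" appears twice with different targets ("l", then "o"), so the model has to rely on the hidden state, not just the current character.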

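Why the output is called a logit -> exactly one character has to be picked from it, so the output scores are turned into a probability distribution over the vocabulary (softmax).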
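A minimal sketch of turning the output-layer scores (logits) into a single predicted character. The logit values below are hypothetical numbers picked for illustration, not results from a trained model.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

vocab = ["h", "e", "l", "o"]
logits = np.array([1.0, 2.2, -3.0, 4.1])   # hypothetical output-layer scores
probs = softmax(logits)
predicted = vocab[int(np.argmax(probs))]    # pick the most probable character
```

During training, the same probabilities would typically be compared with the one-hot target using a cross-entropy loss.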

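Used this way, with each predicted character fed back in as the next input, the model can generate sequences of unbounded length.

-> Long sequence data can be predicted in the manner of a recurrence relation, as in stock price prediction.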
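The feed-the-output-back-in loop can be sketched as follows. The weights here are random and untrained, so the generated text is meaningless; the point is only the control flow. The hidden size, weight names, and the choice to sample (rather than take the argmax) are assumptions of this sketch.

```python
import numpy as np

vocab = ["h", "e", "l", "o"]
V, H = len(vocab), 8                        # H is a hypothetical hidden size
rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.5, size=(H, V))   # untrained, random weights
W_hh = rng.normal(scale=0.5, size=(H, H))
W_hy = rng.normal(scale=0.5, size=(V, H))

def generate(first_char, length):
    """Generate `length` characters, feeding each output back in as the next input."""
    h = np.zeros(H)
    idx = vocab.index(first_char)
    out = [first_char]
    for _ in range(length - 1):
        x = np.zeros(V)
        x[idx] = 1.0                         # one-hot encode the current character
        h = np.tanh(W_hh @ h + W_xh @ x)
        logits = W_hy @ h
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                 # softmax over the vocabulary
        idx = int(rng.choice(V, p=probs))    # sample the next character
        out.append(vocab[idx])
    return "".join(out)

sample = generate("h", 20)
```

Nothing in the loop depends on a fixed sequence length, which is why the same model can keep generating for as long as needed.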

Backpropagation through time (BPTT)

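-> In practice, when the sequence gets long it is hard to process all of it in one pass.

Training therefore uses truncation: the sequence is cut into chunks and trained chunk by chunk.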
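A minimal sketch of truncated BPTT, assuming a scalar RNN h_t = tanh(w * h_{t-1} + u * x_t) with a squared-error loss so that the gradients fit in a few lines. The model, data, chunk length, and learning rate are all made up for illustration. The hidden state is carried across chunks, but gradients stop at each chunk boundary.

```python
import numpy as np

def tbptt_chunk(w, u, h0, xs, ys):
    """Forward and backward pass over one chunk.
    Loss is sum of 0.5 * (h_t - y_t)^2; h0 is treated as a constant,
    so no gradient flows into earlier chunks."""
    hs = [h0]
    for x in xs:
        hs.append(np.tanh(w * hs[-1] + u * x))
    loss = 0.5 * sum((h - y) ** 2 for h, y in zip(hs[1:], ys))
    dw = du = 0.0
    dh_next = 0.0                            # gradient arriving from step t+1
    for t in reversed(range(len(xs))):
        dh = (hs[t + 1] - ys[t]) + dh_next
        da = dh * (1.0 - hs[t + 1] ** 2)     # backprop through tanh
        dw += da * hs[t]
        du += da * xs[t]
        dh_next = da * w
    return loss, dw, du, hs[-1]

rng = np.random.default_rng(0)
seq_len, chunk_len = 100, 20                 # hypothetical lengths
xs = rng.normal(size=seq_len)
ys = rng.uniform(-0.5, 0.5, size=seq_len)    # toy targets
w, u, lr = 0.1, 0.1, 0.01
h = 0.0
num_updates = 0
for start in range(0, seq_len, chunk_len):
    x_chunk = xs[start:start + chunk_len]
    y_chunk = ys[start:start + chunk_len]
    # h carries the forward state into the next chunk, without gradients.
    loss, dw, du, h = tbptt_chunk(w, u, h, x_chunk, y_chunk)
    w -= lr * dw                             # parameter update after each chunk
    u -= lr * du
    num_updates += 1
```

Each backward pass touches at most `chunk_len` steps, which keeps memory and compute bounded regardless of the total sequence length.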

Vanishing/Exploding Gradient Problem in RNN

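RNNs are excellent models, but...

Because the same matrix is multiplied in at every time step, gradients vanish or explode during backpropagation.

If w is greater than 1 (in magnitude) -> exploding

If w is less than 1 (in magnitude) -> vanishing

--> Because the same value keeps being multiplied over and over!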
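The effect is easy to see in the simplest possible case. The sketch below assumes a scalar, linear recurrence h_t = w * h_{t-1} (no tanh, no input), so the gradient flowing back over a number of steps is just w multiplied by itself that many times. The values 1.5, 0.5, and 50 steps are arbitrary.

```python
def gradient_factor(w, steps):
    """Factor picked up by a gradient flowing back `steps` time steps
    through the linear recurrence h_t = w * h_{t-1}."""
    factor = 1.0
    for _ in range(steps):
        factor *= w          # d h_t / d h_{t-1} = w at every step
    return factor

exploding = gradient_factor(1.5, 50)   # grows with every extra step
vanishing = gradient_factor(0.5, 50)   # shrinks towards zero
```

With a weight matrix instead of a scalar the same argument applies roughly to its largest singular value, and the tanh derivative (at most 1) only pushes further towards vanishing.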

