8 posts tagged with "Natural Language Processing"

LaMDA is a family of Transformer-based neural language models specialized for dialog, with up to 137B parameters, pre-trained on 1.56T words of public dialog data and web text.
The first challenge, safety, involves ensuring that the model’s responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias.

Few-Shot Question Answering by Pretraining Span Selection (Splinter)

January 1, 2021 · 2 min read

Jisu Lim

AI Engineer

We explore the more realistic few-shot setting, where only a few hundred training examples are available, and observe that standard models perform poorly, highlighting the discrepancy between current pretraining objectives and question answering.
We propose a new pretraining scheme tailored for question answering: recurring span selection. Given a passage with multiple sets of recurring spans, we mask in each set all recurring spans but one, and ask the model to select the correct span in the passage for each masked span.
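The masking step described above can be illustrated with a small sketch. It is not the authors' implementation: the placeholder token name `MASK`, the helper names, and the choice to keep the first occurrence of each span are assumptions made for illustration, and the candidate spans are supplied by hand and assumed not to overlap.

```python
MASK = "[QUESTION]"  # assumed placeholder token name, for illustration only


def find_occurrences(tokens, span):
    """Return the start index of every occurrence of `span` in `tokens`."""
    n = len(span)
    return [i for i in range(len(tokens) - n + 1)
            if tuple(tokens[i:i + n]) == tuple(span)]


def mask_recurring_spans(tokens, spans):
    """Mask all but the first occurrence of each recurring span.

    Returns the masked token list and, for each mask, a tuple
    (mask_position, answer_start, answer_end) indexed into the masked list.
    Spans are assumed to be non-overlapping.
    """
    replace = {}  # original start index -> (span length, span id)
    keep = {}     # span id -> (original start index, span length)
    for sid, span in enumerate(spans):
        occ = find_occurrences(tokens, span)
        if len(occ) < 2:
            continue  # not a recurring span, nothing to mask
        keep[sid] = (occ[0], len(span))
        for start in occ[1:]:
            replace[start] = (len(span), sid)

    masked, new_index, mask_sids = [], {}, []
    i = 0
    while i < len(tokens):
        new_index[i] = len(masked)
        if i in replace:
            length, sid = replace[i]
            mask_sids.append((len(masked), sid))
            masked.append(MASK)  # the whole span collapses to one mask token
            i += length
        else:
            masked.append(tokens[i])
            i += 1

    targets = []
    for pos, sid in mask_sids:
        start, length = keep[sid]
        s = new_index[start]
        targets.append((pos, s, s + length - 1))
    return masked, targets


if __name__ == "__main__":
    text = "Roosevelt was president . Roosevelt led the New Deal . The New Deal was popular"
    masked, targets = mask_recurring_spans(text.split(),
                                           [("Roosevelt",), ("New", "Deal")])
    print(masked)
    print(targets)
```

Each target tuple pairs a mask position with the start and end of the span the model is asked to select, which is the same form of output a span-selection question answering model produces.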

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

March 23, 2020 · 2 min read

Jisu Lim

AI Engineer

ELECTRA : PRE-TRAINING TEXT ENCODERS AS DISCRIMINATORS RATHER THAN GENERATORS

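Masked language modeling (MLM) methods generally require a large amount of computation. As an alternative, this paper proposes replaced token detection, a more compute-efficient pre-training task. Instead of masking the input, some tokens are replaced with tokens generated by a small generator model. Rather than predicting the original identity of the corrupted tokens, the model then predicts whether each token was generated or not.
As a result, with the same model size, data, and training compute as BERT, it achieves better performance. It reaches results comparable to RoBERTa and XLNet with 1/4 of the compute, and outperforms them when given the same compute.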
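As a rough illustration of how training examples for replaced token detection could be built, the sketch below corrupts a token sequence and produces one binary label per token. It is a toy, not the paper's code: the function name `corrupt`, the `mask_prob` default of 0.15, and the uniform random sampler standing in for the small generator model are all assumptions.

```python
import random


def corrupt(tokens, vocab, mask_prob=0.15, rng=None):
    """Build a replaced-token-detection example.

    Each selected position is replaced by a sampled token. The label is 1
    when the token differs from the original (replaced) and 0 otherwise,
    so a sample that happens to equal the original counts as original.
    """
    rng = rng or random.Random(0)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            sample = rng.choice(vocab)  # stand-in for the generator's sample
            corrupted.append(sample)
            labels.append(int(sample != tok))
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels


if __name__ == "__main__":
    sentence = "the chef cooked the meal".split()
    vocab = ["the", "chef", "cooked", "ate", "meal", "a"]
    print(corrupt(sentence, vocab, mask_prob=0.5))
```

The discriminator is trained on every position of the corrupted sequence, not only on the replaced ones, which is where the efficiency gain over MLM comes from.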

Transformer and BERT

December 6, 2017 · 11 min read

Jisu Lim

AI Engineer

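In 2018, Oren Etzioni, chief executive of the Allen Institute for Artificial Intelligence, told [The New York Times] that although machines still cannot represent ordinary human common sense, BERT marks a moment of explosive progress. This post looks at the [Transformer], the foundation of the BERT model, which uses the attention mechanism in an Encoder-Decoder structure.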

End-to-End Neural Coreference Resolution

July 26, 2017 · 3 min read

Jisu Lim

AI Engineer

Coreference Resolution

Coreference resolution is an NLP task whose goal is to find the spans in a text that are mentions referring to the same entity.

Neural Machine Translation of Rare Words with Subword Units

August 31, 2016 · 4 min read

Jisu Lim

AI Engineer

Neural Machine Translation of Rare Words with Subword Units

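This paper applies BPE (byte pair encoding), originally used for data compression, to natural language. By translating with subword units smaller than words, which capture phonological and morphological regularities, it introduces an open-vocabulary NMT model.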
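To make the idea concrete, here is a minimal sketch of learning BPE merges from word frequencies. It is written for this summary, not taken from the paper's released code: the function names and the `</w>` end-of-word marker are illustrative choices, and ties between equally frequent pairs are broken by first occurrence.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a word-frequency dict."""
    # Start from characters, plus an end-of-word marker on each word.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab


if __name__ == "__main__":
    merges, vocab = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 10)
    print(merges)
```

Because rare words are split into frequent subword units rather than mapped to an unknown token, the learned merge list gives the translation model an open vocabulary with a fixed symbol inventory.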
