김규진 (Kyujin Kim)

Majored in Linguistics; currently seeking a position as an AI engineer

Tag

paper 5
hands-on 3

paper

[paper] E-BRANCHFORMER: BRANCHFORMER WITH ENHANCED MERGING FOR SPEECH RECOGNITION

2 minute read

E-Branchformer [Kim22] is a speech recognition model that stands comparison with Conformer, a state-of-the-art model in the field.

[paper] Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

1 minute read

VITS is a one-stage TTS model that generates reasonably natural-sounding audio.

[paper] Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, ICASSP 2018

1 minute read

Tacotron2 is a speech synthesis paper presented by Google.

[paper] ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification

3 minute read

This post covers a paper that achieved strong results in speaker verification.

[paper] ChildAugment: Data Augmentation Methods for Zero-Resource Children’s Speaker Verification

less than 1 minute read

To be updated soon.

hands-on

[hands-on] E-Branchformer vs. Conformer: ASR training comparison

less than 1 minute read

Post in preparation.

[hands-on] Training and evaluating the VITS model (TTS) on Chinese data

2 minute read

This post records the process of training the VITS model (TTS) on Chinese data.

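[hands-on] Speaker diarization experiment using the ECAPA-TDNN model

less than 1 minute read

Building on the previous post, this post works through a hands-on speaker diarization exercise.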
