Posts in the 'ML' category (page 2)

ML 20

200701

Disentangling style factors from speaker representations 1. Abstract and contribution: separates speaking style using i-vectors or x-vectors; the latent space is disentangled into two subspaces, with a classifier attached. Contributions: 1) shows that i/x-vectors contain information about speaking style and emotion 2) demonstrates the disentanglement through classification accuracy 3) shows that both style and residual information are essential for reconstruction. Assumption: embeddings such as i-vectors..

ML 2020.07.01

200701

Frequently used Ubuntu commands, written down so I can finally stop looking them up
1) Check GPU memory: nvidia-smi -l
2) Check system memory: free -m, top (shift+m, shift+p), htop (sudo apt-get install htop)
3) Check remaining disk space: df -h
4) Find files with 3 digits: find . -name "*m_[0-9][0-9][0-9].*" -type f
: Find files whose name contains a given string, including subfolders: find . -name "*STR*"
: Find files whose name contains a given string and delete them: find . -name "*STR*" -delete
: Search for a string inside the files that were found: find -name [FILE] -exec grep "..
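For times when the same search has to happen inside a script, here is a minimal Python sketch of what `find . -name PATTERN -type f` does. The function name `find_files` is made up for illustration. It relies on `fnmatch`, whose shell-style patterns support the same `*` and `[0-9]` syntax used above.

```python
import fnmatch
import os


def find_files(root, pattern):
    """Yield regular files under root whose base name matches a shell-style pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatchcase keeps the match case-sensitive on every platform.
            if not fnmatch.fnmatchcase(name, pattern):
                continue
            path = os.path.join(dirpath, name)
            # Mirror -type f: skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                yield path
```

Calling `list(find_files(".", "*m_[0-9][0-9][0-9].*"))` should return the same files as the first `find` command in item 4. Deleting matches, like `-delete`, would be an `os.remove(path)` on each result.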

ML 2020.07.01

200630

Attention.. time to finally stop forgetting this... * Think of a Python dictionary, dict={"key1": "value1", "key2": "value2"}. Attention weight: for a given query (decoder hidden state at timestep t), compute the similarity (attention score) with every key (all encoder hidden states), apply each score to the value mapped to that key (usually key == value), then sum these values and return the result. Attention input - Q: decoder hidden state at timestep t - K, V: all encoder hidden states. Steps for computing attention.. 1) sc..
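A minimal pure-Python sketch of the query/key/value description above. The excerpt is cut off before it names the score function, so the dot-product score and softmax normalization here are assumptions, and the names `softmax` and `attend` are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Return (attention weights, context vector) for a single query.

    query:  decoder hidden state at timestep t, a list of floats
    keys:   encoder hidden states, one list of floats per source position
    values: usually the same encoder hidden states as keys
    """
    # Assumed score function: dot product between the query and each key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Weighted sum of the values, one output dimension at a time.
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context
```

With `keys == values`, the context vector is a blend of the encoder states, weighted toward the positions most similar to the query.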

ML 2020.06.30

200628

Effective Emotion Transplantation in an End-to-End Text-to-Speech System. Application: emotional text-to-speech. Scenario: emotional source, 11 hours total; neutral target, 1 hour total. Training method: 1) train the TTS on emo/src data 2-a) on neu/tgt data, freeze M_emo and train only M_TTS 2-b) generate the true emotional embedding e from emo/src data, use this e and the tgt text to generate an emo/tgt spectrogram, and with this generated spectrogram ..

ML 2020.06.29

200625

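https://evols-atirev.tistory.com/28 Working on a remote server from VS Code: With Microsoft's Visual Studio Code, you can use not only the files on your local computer but also a remote server as your working directory. This method also makes it easy to connect to and use WSL. For WSL.. evols-atirev.tistory.com I was trying to debug with PyCharm Pro and ended up switching to this.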

ML 2020.06.29

200624

Please let this be the time I stop forgetting it: tf.layers.conv2d(inputs, filters, kernel_size, strides, padding, data_format='channels_last', dilation_rate=1) in TensorFlow 1.x - inputs: [N, H, W, C] with data_format='channels_last' (that is, NHWC) - filters: integer, the dimensionality of the output space... quickest to think of it as the number of output channels - kernel_size: the size of the patch used for the 2D convolution; so with a (3, 3) kernel, 32 filters, and C input channels, the weight tensor has shape [3, 3, C, 32] - strides: list or integer; if dilation != 1, always st..
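The shape bookkeeping above can be checked without TensorFlow. This is a minimal sketch, assuming NHWC input and the usual 'valid'/'same' output-size rules. The helper names `_pair` and `conv2d_shapes` are made up for illustration and are not part of the TensorFlow API.

```python
import math


def _pair(value):
    """Expand an int to (value, value); pass a 2-element list or tuple through."""
    if isinstance(value, int):
        return value, value
    return tuple(value)


def conv2d_shapes(input_shape, filters, kernel_size, strides=1,
                  padding="valid", dilation_rate=1):
    """Return (kernel_shape, output_shape) for a 2D convolution on NHWC input."""
    n, h, w, c_in = input_shape
    kh, kw = _pair(kernel_size)
    sh, sw = _pair(strides)
    dh, dw = _pair(dilation_rate)

    def out_len(size, k, s, d):
        effective_k = (k - 1) * d + 1  # kernel extent once dilation is applied
        if padding == "same":
            return math.ceil(size / s)
        return math.ceil((size - effective_k + 1) / s)  # 'valid': no padding

    kernel_shape = (kh, kw, c_in, filters)
    output_shape = (n, out_len(h, kh, sh, dh), out_len(w, kw, sw, dw), filters)
    return kernel_shape, output_shape
```

For a batch of 8 RGB images of size 32x32, `conv2d_shapes((8, 32, 32, 3), filters=32, kernel_size=3)` gives a kernel of shape (3, 3, 3, 32) and an output of shape (8, 30, 30, 32).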

ML 2020.06.24

200623

Style transfer network: preserving the content while the style.. 1) early(?) methods 2) connection with domain adaptation 3) connection with GANs: CycleGAN.... and after that StarGAN... Reference: https://blog.lunit.io/2017/04/27/style-transfer/ Style Transfer Introduction: style transfer means that, given two images (a content image and a style image), the main structure of the result stays similar to the content image while only the style is changed to resemble the desired style image.. blog.lunit.io Joint Speaker Counting, Speech Recogniti..

ML 2020.06.24

200618

On Enhancing Speech Emotion Recognition using Generative Adversarial Networks. Previous work: Adversarial auto-encoders for speech based emotion recognition. Motivation: improve SER performance using a GAN structure. Approach: using GAN. DB: IEMOCAP, MSP-IMPROV; features are 1582-dim from the openSMILE toolkit. Background: AAE, GAN 1) a vanilla GAN alone does not converge, so an AAE is used to compress the information : input of D: real sample and output of G, input of G: s..

ML 2020.06.23

200612

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. A famous paper... there are a few parts I need to revisit. The reference encoder is a stack of 2D conv layers with stride 2, followed by a GRU. The resulting embedding is used as the query of a multi-head attention, where it measures similarity against randomly initialized token embeddings. The output, a weighted sum, is used as the style embedding, which is concatenated with the text encoder output and decoder atten..
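A rough sketch of the token-weighting and concatenation steps described above. It is simplified to a single attention head with plain dot-product scores and no learned projections, which is an assumption for illustration rather than the paper's exact layer. The names `style_embedding` and `condition_text_encoder` are made up.

```python
import math


def style_embedding(reference, tokens):
    """Weighted sum of style tokens, weighted by similarity to the reference embedding."""
    # Similarity between the reference embedding (the query) and each token.
    scores = [sum(r * t for r, t in zip(reference, token)) for token in tokens]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(tokens[0])
    return [sum(w * token[i] for w, token in zip(weights, tokens)) for i in range(dim)]


def condition_text_encoder(encoder_outputs, style):
    """Concatenate the same style embedding onto every text encoder timestep."""
    return [list(frame) + list(style) for frame in encoder_outputs]
```

At inference time the weights can also be set by hand instead of computed from a reference, which is how a single token can be chosen to control the style.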

ML 2020.06.12

200609

DiscreTalk: Text-to-Speech as a Machine Translation Problem: GAN-based VQ-VAE + NMT-Transformer 1) GAN-based VQ-VAE - loss: uses the VQ-VAE loss, but the encoder and decoder structure comes from MelGAN. There are K discriminators. So the reconstruction loss consists of two spectral losses, a codebook loss, a commitment loss, and an adversarial loss. Techniques traditionally used in the ASR field, such as beam search, were applied to the NMT model. Adversarial Auto-encoders for Speech Based Emo..

ML 2020.06.09


Mostly reading notes, sometimes commands and papers, often talking to myself


stri.destride
