Japanese Society for Artificial Intelligence (JSAI), SIG Technical Report SIG-KST-035-06 (2018-11)

Generation of Mechanical Sound Using Generative Adversarial Networks

Abstract: DCGAN pix2pix pix2pix

1
[16] [13, 14, 15] variational autoencoder (VAE) [2], generative adversarial network (GAN) [1] GAN [8, 9, 10, 11] [12] GAN deep convolutional GAN (DCGAN) [3] pix2pix [4] 2 GAN DCGAN pix2pix 3 4

2 GAN
2.1 GAN
GAN 1 Generator Discriminator 2 1 Generator G Discriminator D Generator 1

* The copyright of this material belongs to the authors.
Figure 1: GAN (A) D (B) G

GAN (1) z Z P Z X
Generator $G : Z \to X$
Discriminator $D : X \to [0, 1]$

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{1}$$

Generator Discriminator GAN DCGAN pix2pix 2

2.2
10 512 × 32 pixel 2 10

2.3 DCGAN
DCGAN [3] Radford GAN Generator, Discriminator

Figure 2:

CNN (Convolutional Neural Network) 3 4 Generator, Discriminator. ReLU Generator Sigmoid 512 × 8 pixel 5

2.4 pix2pix
pix2pix [4] Isola pix2pix Generator U-Net [6] U-Net 6 7, pix2pix Generator, Discriminator LeakyReLU 512 × 8 pixel 8 × 8 pixel 8 × 8 pixel 5 5 5 pix2pix 1 1000 10 10000
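As a numerical illustration of the objective in Eq. (1), the value function V(D, G) can be estimated by averaging over samples. The sketch below is not the authors' implementation. The helper name `gan_value`, the clipping constant `eps`, and the toy discriminator outputs are assumptions introduced here for illustration only.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of V(D, G) from Eq. (1).

    d_real: discriminator outputs D(x) for samples x from the data.
    d_fake: discriminator outputs D(G(z)) for samples z from the noise prior.
    eps:    hypothetical clipping constant that keeps log() finite.
    """
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    # First term: E[log D(x)]; second term: E[log(1 - D(G(z)))].
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# Toy values (not from the paper).
# A discriminator that outputs 0.5 everywhere gives V = 2 * log(0.5).
v_chance = gan_value([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates real from generated samples gives a larger V.
v_good = gan_value([0.9, 0.8], [0.1, 0.2])
```

The discriminator is trained to increase this quantity and the generator to decrease it, which is the min-max structure of Eq. (1).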
Figure 3: DCGAN Generator
Figure 4: DCGAN Discriminator
Figure 5: pix2pix

3
pix2pix DCGAN pix2pix 8 9 DCGAN 10 pix2pix 11 DCGAN pix2pix 512 8 1000 10000. 10 10000 dynamic time warping (DTW) [7] DCGAN pix2pix 1 DCGAN pix2pix pix2pix

Table 1: 1 2,, 1 2
DCGAN pix2pix
0.016 0.068 0.016 0.006 0.067 0.004 0.011 0.067 0.005
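The evaluation above relies on dynamic time warping (DTW) [7]. A minimal sketch of the classic dynamic-programming form of DTW for two one-dimensional sequences is given below. The surviving text does not state which local cost, step pattern, or normalization was used, so the absolute-difference cost and the unnormalized result here are assumptions, and the function name `dtw_distance` is hypothetical. The sketch is not expected to reproduce the values in Table 1.

```python
import numpy as np

def dtw_distance(a, b):
    """Unnormalized DTW distance between 1-D sequences a and b.

    Assumption: local cost is |a[i] - b[j]| and the allowed steps are
    (i-1, j), (i, j-1), and (i-1, j-1).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(cost[i - 1, j],      # advance in a only
                                     cost[i, j - 1],      # advance in b only
                                     cost[i - 1, j - 1])  # advance in both
    return float(cost[n, m])

# Toy sequences (not from the paper).
same = dtw_distance([1, 2, 3], [1, 2, 3])
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because the alignment may repeat elements, a sequence and a time-stretched copy of it have distance zero, whereas a pointwise comparison would penalize the length mismatch.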
Figure 6: pix2pix Encoder-Decoder (Generator)
Figure 9:
Figure 11: pix2pix
Figure 10: DCGAN

pix2pix 8 × 8 pixel 2 4 8 16 32. 10 10000. pix2pix dynamic time warping (DTW) [7] 8 8 (Figure 12)
Figure 7: pix2pix Discriminator
Figure 12: DTW (a) DCGAN (b) pix2pix
Figure 8:

4
GAN DCGAN pix2pix 2. DCGAN pix2pix

References
[1] Goodfellow, I., Pouget-Abadie, J., Mirza, M., et al.: Generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2672-2680 (2014)
[2] Kingma, D. P., Welling, M.: Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114 (2013)
[3] Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)
[4] Isola, P., Zhu, J. Y., Zhou, T., et al.: Image-to-image translation with conditional adversarial networks. arXiv preprint (2017)
[5] Xi, X., Keogh, E., Shelton, C.: Fast time series classification using numerosity reduction. In Proceedings of the 23rd International Conference on Machine Learning, pp. 1033-1040 (2006)
[6] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234-241 (2015)
[7] Bellman, R., Kalaba, R.: On adaptive control processes. In IRE Transactions on Automatic Control, pp. 1-9 (1959)
[8] Denton, E., Chintala, S., Szlam, A., Fergus, R.: Deep generative image models using a Laplacian pyramid of adversarial networks. arXiv preprint (2015)
[9] Ledig, C., Theis, L., Huszar, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., Shi, W.: Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint (2016)
[10] Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., Metaxas, D.: StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. arXiv preprint
[11] Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint
[12] Donahue, C., McAuley, J., Puckette, M.: Synthesizing audio with generative adversarial networks. arXiv preprint (2018)
[13] Salamon, J., Bello, J.: Deep convolutional neural networks and data augmentation for environmental sound classification. In IEEE Signal Processing Letters, pp. 279-283 (2017)
[14] McFee, B., Humphrey, E., Bello, J.: A software framework for musical data augmentation. In The International Society for Music Information Retrieval, pp. 248-254 (2015)
[15] Simard, P., Steinkraus, D., Platt, J.: Best practices for convolutional neural networks applied to visual document analysis. In International Conference on Document Analysis and Recognition, pp. 958-962 (2003)
[16] Feng, X., Zhang, Y., Glass, J.: Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition. In International Conference on Acoustics, Speech and Signal Processing, pp. 1759-1763 (2014)