Publications

2024

    1. SLT
      Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings
      Shota Horiguchi, Atsushi Ando, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, and Marc Delcroix
      In IEEE Spoken Language Technology Workshop (SLT), Dec 2024
    2. SLT
      Investigation of Speaker Representation for Target-Speaker Speech Processing
      Takanori Ashihara, Takafumi Moriya, Shota Horiguchi, Junyi Peng, Tsubasa Ochiai, Marc Delcroix, Kohei Matsuura, and Hiroshi Sato
      In IEEE Spoken Language Technology Workshop (SLT), Dec 2024
    3. INTERSPEECH
      Factor-Conditioned Speaking-Style Captioning
      Atsushi Ando, Takafumi Moriya, Shota Horiguchi, and Ryo Masumura
      In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2024
    4. INTERSPEECH
      SpeakerBeam-SS: Real-Time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
      Hiroshi Sato, Takafumi Moriya, Masato Mimura, Shota Horiguchi, Tsubasa Ochiai, Takanori Ashihara, Atsushi Ando, Kentaro Shinayama, and Marc Delcroix
      In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2024
    5. ICASSP
      Streaming Active Learning for Regression Problems Using Regression via Classification
      Shota Horiguchi, Kota Dohi, and Yohei Kawaguchi
      In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024
    1. CHiME
      NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge
      Naoyuki Kamo*, Naohiro Tawara*, Atsushi Ando, Takatomo Kano, Hiroshi Sato, Rintaro Ikeshita, Takafumi Moriya, Shota Horiguchi, Kohei Matsuura, Atsunori Ogawa, Alexis Plaquet, Takanori Ashihara, Tsubasa Ochiai, Masato Mimura, Marc Delcroix, Tomohiro Nakatani, Taichi Asami, and Shoko Araki
      In The 8th International Workshop on Speech Processing in Everyday Environments (CHiME-2024), Sep 2024
      (*) Equal contribution
    1. Preprint
      Mamba-based Segmentation Model for Speaker Diarization
      Alexis Plaquet, Naohiro Tawara, Marc Delcroix, Shota Horiguchi, Atsushi Ando, and Shoko Araki
      arXiv:2410.06459, Oct 2024
    2. Preprint
      Guided Speaker Embedding
      Shota Horiguchi, Takafumi Moriya, Atsushi Ando, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, and Marc Delcroix
      arXiv:2410.12182, Oct 2024
    3. Preprint
      Alignment-Free Training for Transducer-Based Multi-Talker ASR
      Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, and Masato Mimura
      arXiv:2409.20301, Sep 2024
    4. Preprint
      Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits
      Hiroyuki Namba, Shota Horiguchi, Masaki Hamamoto, and Masashi Egi
      arXiv:2402.08209, Feb 2024

2023

    1. TASLP
      Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors
      Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, and Yohei Kawaguchi
      IEEE/ACM Transactions on Audio, Speech, and Language Processing, Jan 2023
    1. APSIPA ASC
      Synthetic Data Augmentation for ASR with Domain Filtering
      Tuan Vu Ho, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Takashi Sumiyoshi
      In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Nov 2023
    2. INTERSPEECH
      Spoofing Attacker Also Benefits from Large-Scale Self-Supervised Models
      Aoi Ito* and Shota Horiguchi*
      In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2023
      (*) Equal contribution
    3. INTERSPEECH
      CAPTDURE: Captioned Sound Dataset of Single Sources
      Yuki Okamoto, Kanta Shimonishi, Keisuke Imoto, Kota Dohi, Shota Horiguchi, and Yohei Kawaguchi
      In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2023
    4. SLT
      Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization
      Shota Horiguchi, Yuki Takashima, Shinji Watanabe, and Paola García
      In IEEE Spoken Language Technology Workshop (SLT), Jan 2023

2022

        1. TASLP
          Encoder-Decoder Based Attractors for End-to-End Neural Diarization
          Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Paola García
          IEEE/ACM Transactions on Audio, Speech, and Language Processing, Mar 2022
          🏆 Itakura Prize Innovative Young Researcher Award
        1. INTERSPEECH
          Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models
          Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Yohei Kawaguchi
          In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2022
        2. ICML
          Rethinking Fano’s Inequality in Ensemble Learning
          Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, and Nobuo Nukaga
          In International Conference on Machine Learning (ICML), Jul 2022
        3. Odyssey
          Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization
          Natsuo Yamashita, Shota Horiguchi, and Takeshi Homma
          In The Speaker and Language Recognition Workshop (Odyssey), Jun 2022
        4. ICASSP
          Multi-Channel End-to-End Neural Diarization with Distributed Microphones
          Shota Horiguchi, Yuki Takashima, Paola García, Shinji Watanabe, and Yohei Kawaguchi
          In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022
        5. ICASSP
          Environmental Sound Extraction Using Onomatopoeic Words
          Yuki Okamoto, Shota Horiguchi, Masaaki Yamamoto, Keisuke Imoto, and Yohei Kawaguchi
          In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022
          🏆 IEEE SPS Japan Student Conference Paper Award

2021

              1. ASRU
                Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors
                Shota Horiguchi, Paola García, Shinji Watanabe, Yawen Xue, Yuki Takashima, and Yohei Kawaguchi
                In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2021
              2. INTERSPEECH
                Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers
                Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2021
              3. INTERSPEECH
                Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization
                Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2021
              4. ICASSP
                End-to-End Speaker Diarization as Post-Processing
                Shota Horiguchi, Paola García, Yusuke Fujita, Shinji Watanabe, and Kenji Nagamatsu
                In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2021
              5. SLT
                End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection
                Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola Garcia, and Kenji Nagamatsu
                In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
              6. SLT
                Online End-to-End Neural Diarization with Speaker-Tracing Buffer
                Yawen Xue, Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
              7. SLT
                Block-Online Guided Source Separation
                Shota Horiguchi, Yusuke Fujita, and Kenji Nagamatsu
                In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
              1. DIHARD
                The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-vector Clustering Systems Combined by DOVER-Lap
                Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, and Sanjeev Khudanpur
                In The Third DIHARD Speech Diarization Challenge (DIHARD III), Jan 2021

2020

                1. TPAMI
                  Significance of Softmax-based Features in Comparison to Distance Metric Learning-based Features
                  Shota Horiguchi, Daiki Ikami, and Kiyoharu Aizawa
                  IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2020
                1. INTERSPEECH
                  Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones
                  Shota Horiguchi, Yusuke Fujita, and Kenji Nagamatsu
                  In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Oct 2020
                2. INTERSPEECH
                  End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors
                  Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Kenji Nagamatsu
                  In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Oct 2020
                3. ICRA
                  Anticipating the Start of User Interaction for Service Robot in the Wild
                  Koichiro Ito, Quan Kong, Shota Horiguchi, Takashi Sumiyoshi, and Kenji Nagamatsu
                  In IEEE International Conference on Robotics and Automation (ICRA), Jun 2020
                1. SemEval
                  Hitachi at SemEval-2020 Task 8: Simple but Effective Modality Ensemble for Meme Emotion Recognition
                  Terufumi Morishita*, Gaku Morio*, Shota Horiguchi, Hiroaki Ozaki, and Toshinori Miyoshi
                  In The Fourteenth Workshop on Semantic Evaluation (SemEval), Dec 2020
                  (*) Equal contribution
                2. CHiME
                  CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings
                  Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, and Neville Ryant
                  In The 6th International Workshop on Speech Processing in Everyday Environments (CHiME-2020), May 2020
                1. Preprint
                  Neural Speaker Diarization with Speaker-Wise Chain Rule
                  Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, and Kenji Nagamatsu
                  arXiv:2006.01796, Jun 2020
                2. Preprint
                  End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-Label Classification
                  Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, and Kenji Nagamatsu
                  arXiv:2003.20966, Feb 2020

2019

                  1. ASRU
                    Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models
                    Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
                    In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2019
                  2. ASRU
                    End-to-End Neural Speaker Diarization with Self-Attention
                    Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
                    In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2019
                  3. INTERSPEECH
                    End-to-End Neural Speaker Diarization with Permutation-Free Objectives
                    Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, and Shinji Watanabe
                    In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                  4. INTERSPEECH
                    Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation
                    Shota Horiguchi, Naoyuki Kanda, and Kenji Nagamatsu
                    In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                  5. INTERSPEECH
                    Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition
                    Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, and Shinji Watanabe
                    In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                  6. INTERSPEECH
                    Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party Scenario
                    Naoyuki Kanda, Christoph Boeddeker, Jens Heitkaemper, Yusuke Fujita, Shota Horiguchi, Kenji Nagamatsu, and Reinhold Haeb-Umbach
                    In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                  7. ICASSP
                    Acoustic Modeling for Distant Multi-Talker Speech Recognition with Single- and Multi-Channel Branches
                    Naoyuki Kanda, Yusuke Fujita, Shota Horiguchi, Rintaro Ikeshita, Kenji Nagamatsu, and Shinji Watanabe
                    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2019
                  8. WACV
                    Omnidirectional Pedestrian Detection by Rotation Invariant Training
                    Masato Tamura, Shota Horiguchi, and Tomokazu Murakami
                    In IEEE Winter Conference on Applications of Computer Vision (WACV), Jan 2019

2018

                      1. TMM
                        Personalized Classifier for Food Image Recognition
                        Shota Horiguchi, Sosuke Amano, Makoto Ogawa, and Kiyoharu Aizawa
                        IEEE Transactions on Multimedia, Oct 2018
                      1. ACMMM
                        Face-Voice Matching Using Cross-Modal Embeddings
                        Shota Horiguchi, Naoyuki Kanda, and Kenji Nagamatsu
                        In ACM International Conference on Multimedia (ACMMM), Oct 2018
                      1. CHiME
                        The Hitachi/JHU CHiME-5 System: Advances in Speech Recognition for Everyday Home Environments Using Multiple Microphone Arrays
                        Naoyuki Kanda, Rintaro Ikeshita, Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu, Xiaofei Wang, Vimal Manohar, Nelson Enrique Yalta Soplin, Matthew Maciejewski, Szu-Jui Chen, Aswin Shanmugam Subramanian, Ruizhi Li, Zhiqi Wang, Jason Naradowsky, L. Paola Garcia-Perera, and Gregory Sell
                        In The 5th International Workshop on Speech Processing in Everyday Environments (CHiME-2018), Sep 2018

2016

                          1. MADiMa
                            Food Search Based on User Feedback to Assist Image-Based Food Recording Systems
                            Sosuke Amano, Shota Horiguchi, Kiyoharu Aizawa, Kazuki Maeda, Masanori Kubota, and Makoto Ogawa
                            In International Workshop On Multimedia Assisted Dietary Management (MADiMa), Oct 2016
                          2. ICIP
                            The Log-Normal Distribution of the Size of Objects in Daily Meal Images and Its Application to the Efficient Reduction of Object Proposals
                            Shota Horiguchi, Kiyoharu Aizawa, and Makoto Ogawa
                            In IEEE International Conference on Image Processing (ICIP), Sep 2016