Publications

2025

    1. ICASSP
      Guided Speaker Embedding
      Shota Horiguchi, Takafumi Moriya, Atsushi Ando, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, and Marc Delcroix
      In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
    2. ICASSP
      Multi-channel Speaker Counting for EEND-VC-based Speaker Diarization on Multi-domain Conversation
      Naohiro Tawara, Atsushi Ando, Shota Horiguchi, and Marc Delcroix
      In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
    3. ICASSP
      Mamba-based Segmentation Model for Speaker Diarization
      Alexis Plaquet, Naohiro Tawara, Marc Delcroix, Shota Horiguchi, Atsushi Ando, and Shoko Araki
      In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
    4. ICASSP
      Alignment-Free Training for Transducer-Based Multi-Talker ASR
      Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, and Masato Mimura
      In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
      1. Preprint
        Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge
        Naoyuki Kamo, Naohiro Tawara, Atsushi Ando, Takatomo Kano, Hiroshi Sato, Rintaro Ikeshita, Takafumi Moriya, Shota Horiguchi, Kohei Matsuura, Atsunori Ogawa, Alexis Plaquet, Takanori Ashihara, Tsubasa Ochiai, Masato Mimura, Marc Delcroix, Tomohiro Nakatani, Taichi Asami, and Shoko Araki
        arXiv:2502.09859, Feb 2025

      2024

        1. SLT
          Investigation of Speaker Representation for Target-Speaker Speech Processing
          Takanori Ashihara, Takafumi Moriya, Shota Horiguchi, Junyi Peng, Tsubasa Ochiai, Marc Delcroix, Kohei Matsuura, and Hiroshi Sato
          In IEEE Spoken Language Technology Workshop (SLT), Dec 2024
        2. SLT
          Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings
          Shota Horiguchi, Atsushi Ando, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, and Marc Delcroix
          In IEEE Spoken Language Technology Workshop (SLT), Dec 2024
          🏆 Honorable Mention Award @IEEE SLT 2024
        3. INTERSPEECH
          Factor-Conditioned Speaking-Style Captioning
          Atsushi Ando, Takafumi Moriya, Shota Horiguchi, and Ryo Masumura
          In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2024
        4. INTERSPEECH
          SpeakerBeam-SS: Real-Time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
          Hiroshi Sato, Takafumi Moriya, Masato Mimura, Shota Horiguchi, Tsubasa Ochiai, Takanori Ashihara, Atsushi Ando, Kentaro Shinayama, and Marc Delcroix
          In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2024
        5. ICASSP
          Streaming Active Learning for Regression Problems Using Regression via Classification
          Shota Horiguchi, Kota Dohi, and Yohei Kawaguchi
          In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024
        1. CHiME
          NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge
          Naoyuki Kamo*, Naohiro Tawara*, Atsushi Ando, Takatomo Kano, Hiroshi Sato, Rintaro Ikeshita, Takafumi Moriya, Shota Horiguchi, Kohei Matsuura, Atsunori Ogawa, Alexis Plaquet, Takanori Ashihara, Tsubasa Ochiai, Masato Mimura, Marc Delcroix, Tomohiro Nakatani, Taichi Asami, and Shoko Araki
          In The 8th International Workshop on Speech Processing in Everyday Environments (CHiME-2024), Sep 2024
          (*) Equal contribution
        1. Preprint
          Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits
          Hiroyuki Namba, Shota Horiguchi, Masaki Hamamoto, and Masashi Egi
          arXiv:2402.08209, Feb 2024

        2023

        1. TASLP
          Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors
          Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, and Yohei Kawaguchi
          IEEE/ACM Transactions on Audio, Speech, and Language Processing, Jan 2023
        1. APSIPA ASC
          Synthetic Data Augmentation for ASR with Domain Filtering
          Tuan Vu Ho, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Takashi Sumiyoshi
          In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Nov 2023
        2. INTERSPEECH
          Spoofing Attacker Also Benefits from Large-Scale Self-Supervised Models
          Aoi Ito* and Shota Horiguchi*
          In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2023
          (*) Equal contribution
        3. INTERSPEECH
          CAPTDURE: Captioned Sound Dataset of Single Sources
          Yuki Okamoto, Kanta Shimonishi, Keisuke Imoto, Kota Dohi, Shota Horiguchi, and Yohei Kawaguchi
          In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2023
        4. SLT
          Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization
          Shota Horiguchi, Yuki Takashima, Shinji Watanabe, and Paola García
          In IEEE Spoken Language Technology Workshop (SLT), Jan 2023

            2022

            1. TASLP
              Encoder-Decoder Based Attractors for End-to-End Neural Diarization
              Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Paola García
              IEEE/ACM Transactions on Audio, Speech, and Language Processing, Mar 2022
              🏆 Itakura Prize Innovative Young Researcher Award
            1. INTERSPEECH
              Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models
              Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Yohei Kawaguchi
              In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2022
            2. ICML
              Rethinking Fano’s Inequality in Ensemble Learning
              Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, and Nobuo Nukaga
              In International Conference on Machine Learning (ICML), Jul 2022
            3. Odyssey
              Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization
              Natsuo Yamashita, Shota Horiguchi, and Takeshi Homma
              In The Speaker and Language Recognition Workshop (Odyssey), Jun 2022
            4. ICASSP
              Multi-Channel End-to-End Neural Diarization with Distributed Microphones
              Shota Horiguchi, Yuki Takashima, Paola García, Shinji Watanabe, and Yohei Kawaguchi
              In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022
            5. ICASSP
              Environmental Sound Extraction Using Onomatopoeic Words
              Yuki Okamoto, Shota Horiguchi, Masaaki Yamamoto, Keisuke Imoto, and Yohei Kawaguchi
              In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022
              🏆 IEEE SPS Japan Student Conference Paper Award

                2021

                  1. ASRU
                    Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors
                    Shota Horiguchi, Paola García, Shinji Watanabe, Yawen Xue, Yuki Takashima, and Yohei Kawaguchi
                    In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2021
                  2. INTERSPEECH
                    Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers
                    Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                    In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2021
                  3. INTERSPEECH
                    Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization
                    Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                    In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2021
                  4. ICASSP
                    End-to-End Speaker Diarization as Post-Processing
                    Shota Horiguchi, Paola García, Yusuke Fujita, Shinji Watanabe, and Kenji Nagamatsu
                    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2021
                  5. SLT
                    End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection
                    Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola Garcia, and Kenji Nagamatsu
                    In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
                  6. SLT
                    Online End-to-End Neural Diarization with Speaker-Tracing Buffer
                    Yawen Xue, Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                    In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
                  7. SLT
                    Block-Online Guided Source Separation
                    Shota Horiguchi, Yusuke Fujita, and Kenji Nagamatsu
                    In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
                  1. DIHARD
                    The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-vector Clustering Systems Combined by DOVER-Lap
                    Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, and Sanjeev Khudanpur
                    In The Third DIHARD Speech Diarization Challenge (DIHARD III), Jan 2021

                    2020

                    1. TPAMI
                      Significance of Softmax-based Features in Comparison to Distance Metric Learning-based Features
                      Shota Horiguchi, Daiki Ikami, and Kiyoharu Aizawa
                      IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2020
                    1. INTERSPEECH
                      Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones
                      Shota Horiguchi, Yusuke Fujita, and Kenji Nagamatsu
                      In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Oct 2020
                    2. INTERSPEECH
                      End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors
                      Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Kenji Nagamatsu
                      In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Oct 2020
                    3. ICRA
                      Anticipating the Start of User Interaction for Service Robot in the Wild
                      Koichiro Ito, Quan Kong, Shota Horiguchi, Takashi Sumiyoshi, and Kenji Nagamatsu
                      In IEEE International Conference on Robotics and Automation (ICRA), Jun 2020
                    1. SemEval
                      Hitachi at SemEval-2020 Task 8: Simple but Effective Modality Ensemble for Meme Emotion Recognition
                      Terufumi Morishita*, Gaku Morio*, Shota Horiguchi, Hiroaki Ozaki, and Toshinori Miyoshi
                      In The Fourteenth Workshop on Semantic Evaluation (SemEval), Dec 2020
                      (*) Equal contribution
                    2. CHiME
                      CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings
                      Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, and Neville Ryant
                      In The 6th International Workshop on Speech Processing in Everyday Environments (CHiME-2020), May 2020
                    1. Preprint
                      Neural Speaker Diarization with Speaker-Wise Chain Rule
                      Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, and Kenji Nagamatsu
                      arXiv:2006.01796, Jun 2020
                    2. Preprint
                      End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-Label Classification
                      Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, and Kenji Nagamatsu
                      arXiv:2003.02966, Feb 2020

                    2019

                      1. ASRU
                        Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models
                        Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
                        In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2019
                      2. ASRU
                        End-to-End Neural Speaker Diarization with Self-Attention
                        Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
                        In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2019
                        Best Paper Finalist
                      3. INTERSPEECH
                        End-to-End Neural Speaker Diarization with Permutation-Free Objectives
                        Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, and Shinji Watanabe
                        In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                      4. INTERSPEECH
                        Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation
                        Shota Horiguchi, Naoyuki Kanda, and Kenji Nagamatsu
                        In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                      5. INTERSPEECH
                        Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition
                        Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, and Shinji Watanabe
                        In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                      6. INTERSPEECH
                        Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party Scenario
                        Naoyuki Kanda, Christoph Boeddeker, Jens Heitkaemper, Yusuke Fujita, Shota Horiguchi, Kenji Nagamatsu, and Reinhold Haeb-Umbach
                        In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                      7. ICASSP
                        Acoustic Modeling for Distant Multi-Talker Speech Recognition with Single- and Multi-Channel Branches
                        Naoyuki Kanda, Yusuke Fujita, Shota Horiguchi, Rintaro Ikeshita, Kenji Nagamatsu, and Shinji Watanabe
                        In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2019
                      8. WACV
                        Omnidirectional Pedestrian Detection by Rotation Invariant Training
                        Masato Tamura, Shota Horiguchi, and Tomokazu Murakami
                        In IEEE Winter Conference on Applications of Computer Vision (WACV), Jan 2019

                          2018

                          1. TMM
                            Personalized Classifier for Food Image Recognition
                            Shota Horiguchi, Sosuke Amano, Makoto Ogawa, and Kiyoharu Aizawa
                            IEEE Transactions on Multimedia, Oct 2018
                          1. ACMMM
                            Face-Voice Matching Using Cross-Modal Embeddings
                            Shota Horiguchi, Naoyuki Kanda, and Kenji Nagamatsu
                            In ACM International Conference on Multimedia (ACMMM), Oct 2018
                          1. CHiME
                            The Hitachi/JHU CHiME-5 System: Advances in Speech Recognition for Everyday Home Environments Using Multiple Microphone Arrays
                            Naoyuki Kanda, Rintaro Ikeshita, Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu, Xiaofei Wang, Vimal Manohar, Nelson Enrique Yalta Soplin, Matthew Maciejewski, Szu-Jui Chen, Aswin Shanmugam Subramanian, Ruizhi Li, Zhiqi Wang, Jason Naradowsky, L. Paola Garcia-Perera, and Gregory Sell
                            In The 5th International Workshop on Speech Processing in Everyday Environments (CHiME-2018), Sep 2018

                            2016

                              1. MADiMa
                                Food Search Based on User Feedback to Assist Image-Based Food Recording Systems
                                Sosuke Amano, Shota Horiguchi, Kiyoharu Aizawa, Kazuki Maeda, Masanori Kubota, and Makoto Ogawa
                                In International Workshop On Multimedia Assisted Dietary Management (MADiMa), Oct 2016
                              2. ICIP
                                The Log-Normal Distribution of the Size of Objects in Daily Meal Images and Its Application to the Efficient Reduction of Object Proposals
                                Shota Horiguchi, Kiyoharu Aizawa, and Makoto Ogawa
                                In IEEE International Conference on Image Processing (ICIP), Sep 2016