Publications

2026

  1. CSL
    Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge
    Naoyuki Kamo, Naohiro Tawara, Atsushi Ando, Takatomo Kano, Hiroshi Sato, Rintaro Ikeshita, Takafumi Moriya, Shota Horiguchi, Kohei Matsuura, Atsunori Ogawa, Alexis Plaquet, Takanori Ashihara, Tsubasa Ochiai, Masato Mimura, Marc Delcroix, Tomohiro Nakatani, Taichi Asami, and Shoko Araki
    Computer Speech & Language, Jan 2026

2025

          1. INTERSPEECH
            Mitigating Non-Target Speaker Bias in Guided Speaker Embedding
            Shota Horiguchi, Takanori Ashihara, Marc Delcroix, Atsushi Ando, and Naohiro Tawara
            In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2025
          2. INTERSPEECH
            Pretraining Multi-Speaker Identification for Neural Speaker Diarization
            Shota Horiguchi, Atsushi Ando, Naohiro Tawara, and Marc Delcroix
            In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2025
          3. INTERSPEECH
            Analysis of Semantic and Acoustic Token Variability Across Speech, Music, and Audio Domains
            Takanori Ashihara, Marc Delcroix, Tsubasa Ochiai, Kohei Matsuura, and Shota Horiguchi
            In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2025
          4. INTERSPEECH
            Voice Impression Control in Zero-Shot TTS
            Kenichi Fujita, Shota Horiguchi, and Yusuke Ijima
            In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2025
          5. ICASSP
            Guided Speaker Embedding
            Shota Horiguchi, Takafumi Moriya, Atsushi Ando, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, and Marc Delcroix
            In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
          6. ICASSP
            Multi-channel Speaker Counting for EEND-VC-based Speaker Diarization on Multi-domain Conversation
            Naohiro Tawara, Atsushi Ando, Shota Horiguchi, and Marc Delcroix
            In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
          7. ICASSP
            Mamba-based Segmentation Model for Speaker Diarization
            Alexis Plaquet, Naohiro Tawara, Marc Delcroix, Shota Horiguchi, Atsushi Ando, and Shoko Araki
            In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
          8. ICASSP
            Alignment-Free Training for Transducer-Based Multi-Talker ASR
            Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, and Masato Mimura
            In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2025
            1. Preprint
              Dissecting the Segmentation Model of End-to-End Diarization with Vector Clustering
              Alexis Plaquet, Naohiro Tawara, Marc Delcroix, Shota Horiguchi, Atsushi Ando, Shoko Araki, and Hervé Bredin
              arXiv:2506.11605, Jun 2025

2024

              1. SLT
                Investigation of Speaker Representation for Target-Speaker Speech Processing
                Takanori Ashihara, Takafumi Moriya, Shota Horiguchi, Junyi Peng, Tsubasa Ochiai, Marc Delcroix, Kohei Matsuura, and Hiroshi Sato
                In IEEE Spoken Language Technology Workshop (SLT), Dec 2024
              2. SLT
                Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings
                Shota Horiguchi, Atsushi Ando, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, and Marc Delcroix
                In IEEE Spoken Language Technology Workshop (SLT), Dec 2024
                🏆 Honorable Mention Award @IEEE SLT 2024
              3. INTERSPEECH
                Factor-Conditioned Speaking-Style Captioning
                Atsushi Ando, Takafumi Moriya, Shota Horiguchi, and Ryo Masumura
                In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2024
              4. INTERSPEECH
                SpeakerBeam-SS: Real-Time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
                Hiroshi Sato, Takafumi Moriya, Masato Mimura, Shota Horiguchi, Tsubasa Ochiai, Takanori Ashihara, Atsushi Ando, Kentaro Shinayama, and Marc Delcroix
                In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2024
              5. ICASSP
                Streaming Active Learning for Regression Problems Using Regression via Classification
                Shota Horiguchi, Kota Dohi, and Yohei Kawaguchi
                In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024
              1. CHiME
                NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge
                Naoyuki Kamo*, Naohiro Tawara*, Atsushi Ando, Takatomo Kano, Hiroshi Sato, Rintaro Ikeshita, Takafumi Moriya, Shota Horiguchi, Kohei Matsuura, Atsunori Ogawa, Alexis Plaquet, Takanori Ashihara, Tsubasa Ochiai, Masato Mimura, Marc Delcroix, Tomohiro Nakatani, Taichi Asami, and Shoko Araki
                In The 8th International Workshop on Speech Processing in Everyday Environments (CHiME-2024), Sep 2024
                (*) Equal contribution
              1. Preprint
                Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits
                Hiroyuki Namba, Shota Horiguchi, Masaki Hamamoto, and Masashi Egi
                arXiv:2402.08209, Feb 2024

2023

              1. TASLP
                Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors
                Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, and Yohei Kawaguchi
                IEEE/ACM Transactions on Audio, Speech, and Language Processing, Jan 2023
              1. APSIPA ASC
                Synthetic Data Augmentation for ASR with Domain Filtering
                Tuan Vu Ho, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Takashi Sumiyoshi
                In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Nov 2023
              2. INTERSPEECH
                Spoofing Attacker Also Benefits from Large-Scale Self-Supervised Models
                Aoi Ito* and Shota Horiguchi*
                In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2023
                (*) Equal contribution
              3. INTERSPEECH
                CAPTDURE: Captioned Sound Dataset of Single Sources
                Yuki Okamoto, Kanta Shimonishi, Keisuke Imoto, Kota Dohi, Shota Horiguchi, and Yohei Kawaguchi
                In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2023
              4. SLT
                Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization
                Shota Horiguchi, Yuki Takashima, Shinji Watanabe, and Paola García
                In IEEE Spoken Language Technology Workshop (SLT), Jan 2023

2022

                  1. TASLP
                    Encoder-Decoder Based Attractors for End-to-End Neural Diarization
                    Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Paola García
                    IEEE/ACM Transactions on Audio, Speech, and Language Processing, Mar 2022
                    🏆 Itakura Prize Innovative Young Researcher Award
                  1. INTERSPEECH
                    Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models
                    Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Yohei Kawaguchi
                    In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2022
                  2. ICML
                    Rethinking Fano’s Inequality in Ensemble Learning
                    Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, and Nobuo Nukaga
                    In International Conference on Machine Learning (ICML), Jul 2022
                  3. Odyssey
                    Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization
                    Natsuo Yamashita, Shota Horiguchi, and Takeshi Homma
                    In The Speaker and Language Recognition Workshop (Odyssey), Jun 2022
                  4. ICASSP
                    Multi-Channel End-to-End Neural Diarization with Distributed Microphones
                    Shota Horiguchi, Yuki Takashima, Paola García, Shinji Watanabe, and Yohei Kawaguchi
                    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022
                  5. ICASSP
                    Environmental Sound Extraction Using Onomatopoeic Words
                    Yuki Okamoto, Shota Horiguchi, Masaaki Yamamoto, Keisuke Imoto, and Yohei Kawaguchi
                    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022
                    🏆 IEEE SPS Japan Student Conference Paper Award

2021

                        1. ASRU
                          Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors
                          Shota Horiguchi, Paola García, Shinji Watanabe, Yawen Xue, Yuki Takashima, and Yohei Kawaguchi
                          In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2021
                        2. INTERSPEECH
                          Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers
                          Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                          In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2021
                        3. INTERSPEECH
                          Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization
                          Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                          In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2021
                        4. ICASSP
                          End-to-End Speaker Diarization as Post-Processing
                          Shota Horiguchi, Paola García, Yusuke Fujita, Shinji Watanabe, and Kenji Nagamatsu
                          In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2021
                        5. SLT
                          End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection
                          Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola Garcia, and Kenji Nagamatsu
                          In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
                        6. SLT
                          Online End-to-End Neural Diarization with Speaker-Tracing Buffer
                          Yawen Xue, Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Paola Garcia, and Kenji Nagamatsu
                          In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
                        7. SLT
                          Block-Online Guided Source Separation
                          Shota Horiguchi, Yusuke Fujita, and Kenji Nagamatsu
                          In IEEE Spoken Language Technology Workshop (SLT), Jan 2021
                        1. DIHARD
                          The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-vector Clustering Systems Combined by DOVER-Lap
                          Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, and Sanjeev Khudanpur
                          In The Third DIHARD Speech Diarization Challenge (DIHARD III), Jan 2021

2020

                          1. TPAMI
                            Significance of Softmax-based Features in Comparison to Distance Metric Learning-based Features
                            Shota Horiguchi, Daiki Ikami, and Kiyoharu Aizawa
                            IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2020
                          1. INTERSPEECH
                            Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones
                            Shota Horiguchi, Yusuke Fujita, and Kenji Nagamatsu
                            In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Oct 2020
                          2. INTERSPEECH
                            End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors
                            Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Kenji Nagamatsu
                            In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Oct 2020
                          3. ICRA
                            Anticipating the Start of User Interaction for Service Robot in the Wild
                            Koichiro Ito, Quan Kong, Shota Horiguchi, Takashi Sumiyoshi, and Kenji Nagamatsu
                            In IEEE International Conference on Robotics and Automation (ICRA), Jun 2020
                          1. SemEval
                            Hitachi at SemEval-2020 Task 8: Simple but Effective Modality Ensemble for Meme Emotion Recognition
                            Terufumi Morishita*, Gaku Morio*, Shota Horiguchi, Hiroaki Ozaki, and Toshinori Miyoshi
                            In The Fourteenth Workshop on Semantic Evaluation (SemEval), Dec 2020
                            (*) Equal contribution
                          2. CHiME
                            CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings
                            Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, and Neville Ryant
                            In The 6th International Workshop on Speech Processing in Everyday Environments (CHiME-2020), May 2020
                          1. Preprint
                            Neural Speaker Diarization with Speaker-Wise Chain Rule
                            Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, and Kenji Nagamatsu
                            arXiv:2006.01796, Jun 2020
                          2. Preprint
                            End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-Label Classification
                            Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, and Kenji Nagamatsu
                            arXiv:2003.02966, Feb 2020

2019

                            1. ASRU
                              Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models
                              Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
                              In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2019
                            2. ASRU
                              End-to-End Neural Speaker Diarization with Self-Attention
                              Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
                              In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2019
                              Best Paper Finalist
                            3. INTERSPEECH
                              End-to-End Neural Speaker Diarization with Permutation-Free Objectives
                              Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, and Shinji Watanabe
                              In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                            4. INTERSPEECH
                              Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation
                              Shota Horiguchi, Naoyuki Kanda, and Kenji Nagamatsu
                              In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                            5. INTERSPEECH
                              Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition
                              Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, and Shinji Watanabe
                              In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                            6. INTERSPEECH
                              Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party Scenario
                              Naoyuki Kanda, Christoph Boeddeker, Jens Heitkaemper, Yusuke Fujita, Shota Horiguchi, Kenji Nagamatsu, and Reinhold Haeb-Umbach
                              In The Annual Conference of the International Speech Communication Association (INTERSPEECH), Sep 2019
                            7. ICASSP
                              Acoustic Modeling for Distant Multi-Talker Speech Recognition with Single- and Multi-Channel Branches
                              Naoyuki Kanda, Yusuke Fujita, Shota Horiguchi, Rintaro Ikeshita, Kenji Nagamatsu, and Shinji Watanabe
                              In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2019
                            8. WACV
                              Omnidirectional Pedestrian Detection by Rotation Invariant Training
                              Masato Tamura, Shota Horiguchi, and Tomokazu Murakami
                              In IEEE Winter Conference on Applications of Computer Vision (WACV), Jan 2019

2018

                                1. TMM
                                  Personalized Classifier for Food Image Recognition
                                  Shota Horiguchi, Sosuke Amano, Makoto Ogawa, and Kiyoharu Aizawa
                                  IEEE Transactions on Multimedia, Oct 2018
                                1. ACMMM
                                  Face-Voice Matching Using Cross-Modal Embeddings
                                  Shota Horiguchi, Naoyuki Kanda, and Kenji Nagamatsu
                                  In ACM International Conference on Multimedia (ACMMM), Oct 2018
                                1. CHiME
                                  The Hitachi/JHU CHiME-5 System: Advances in Speech Recognition for Everyday Home Environments Using Multiple Microphone Arrays
                                  Naoyuki Kanda, Rintaro Ikeshita, Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu, Xiaofei Wang, Vimal Manohar, Nelson Enrique Yalta Soplin, Matthew Maciejewski, Szu-Jui Chen, Aswin Shanmugam Subramanian, Ruizhi Li, Zhiqi Wang, Jason Naradowsky, L. Paola Garcia-Perera, and Gregory Sell
                                  In The 5th International Workshop on Speech Processing in Everyday Environments (CHiME-2018), Sep 2018

2016

                                    1. MADiMa
                                      Food Search Based on User Feedback to Assist Image-Based Food Recording Systems
                                      Sosuke Amano, Shota Horiguchi, Kiyoharu Aizawa, Kazuki Maeda, Masanori Kubota, and Makoto Ogawa
                                      In International Workshop on Multimedia Assisted Dietary Management (MADiMa), Oct 2016
                                    2. ICIP
                                      The Log-Normal Distribution of the Size of Objects in Daily Meal Images and Its Application to the Efficient Reduction of Object Proposals
                                      Shota Horiguchi, Kiyoharu Aizawa, and Makoto Ogawa
                                      In IEEE International Conference on Image Processing (ICIP), Sep 2016