cv
Education
- 2015.4 - 2017.3
M.E., Information Science and Technology
The University of Tokyo, Tokyo, Japan
- Supervisor: Prof. Kiyoharu Aizawa
- Thesis title: Personalized Object Recognition
- 2011.4 - 2015.3
B.E., Information and Communication Engineering
The University of Tokyo, Tokyo, Japan
- Supervisor: Prof. Kiyoharu Aizawa
Work Experience
- 2024.2 - Present
Research Specialist
NTT Corporation, Human Informatics Laboratories
- Research topic
- Speech technology
- 2021.10 - 2024.1
Senior Researcher
Hitachi, Ltd. Research and Development Group
- Team leader (2021.10-2022.9, 2023.4-2024.1)
- Tech lead (2022.10-2023.3)
- Research topic
- Speaker diarization (2021-2023)
- Streaming active learning (2022-2024)
- 2017.4 - 2021.9
Researcher
Hitachi, Ltd. Research and Development Group
- Research topic
- Multimodal environmental recognition for human-robot interaction (2017-2019)
- Meeting transcription using distributed microphones (2019-2021)
- Speaker diarization (2019-2021)
Honors and Awards
- 2024
2nd prize in The 8th CHiME Speech Separation and Recognition Challenge (CHiME-8) Task 1
- As the NTT team
- In charge of preparation of simulated mixtures and pretraining of EEND-VC
- 2023
Itakura Prize Innovative Young Researcher Award, The Acoustical Society of Japan
- For research on overlap-aware speaker diarization for unknown numbers of speakers
- 2021
2nd prize in The Third DIHARD Speech Diarization Challenge (DIHARD III)
- As the Hitachi-JHU team
- Technical report
- In charge of EEND-EDA, EEND as post-processing, and system ensemble
- 2018
2nd prize in The 5th CHiME Speech Separation and Recognition Challenge (CHiME-5)
- As the Hitachi-JHU team
- Technical report
- In charge of speech separation and server cooling 🌬️
- 2017
Outstanding Research Presentation Award, The Institute of Image Information and Television Engineers
- For the presentation at PRMU, Feb. 2017
Invited Talks
- 2023.8
Speaker Diarization: A Key to Solving Cocktail Party Problem
- 2019.7
Face-Voice Matching Using Cross-Modal Embeddings (In Japanese)
Membership
- Institute of Electrical and Electronics Engineers (IEEE)
- IEEE Signal Processing Society (SPS)
- Acoustical Society of Japan (ASJ): No. 22704
Academic Services
- Session chair
- IEEE ICASSP 2022 (in Singapore)
- Review experience
- IEEE/ACM Transactions on Audio, Speech, and Language Processing (2021-2024)
- IEEE Transactions on Multimedia (2020-2021)
- IEEE Transactions on Neural Networks and Learning Systems (2021)
- IEEE Transactions on Pattern Analysis and Machine Intelligence (2020)
- IEEE Transactions on Affective Computing (2023)
- IEEE Open Journal of Signal Processing (2023-2024)
- IEEE Signal Processing Letters (2024)
- Computer Speech & Language (2021-2022)
- Speech Communication (2022-2024)
- EURASIP Journal on Audio, Speech and Music Processing (2023)
- Neural Networks (2022)
- ICASSP (2019-2024)
- INTERSPEECH (2024)
- ASRU (2023)
- EUSIPCO (2021-2023)
- SLT (2022, 2024)
- WASPAA (2023)
- DCASE (2020-2021, 2023-2024)
- MLSP (2021, 2023)
- MMSP (2023)
- APSIPA ASC (2021)
- RO-MAN (2022)
- ICML (2020)
Internship Supervision
- Natsuo Yamashita (The University of Tokyo, 2021.8 - 2022.2) @Hitachi, Ltd.
- Aoi Ito (Hosei University, 2022.11 - 2024.1) @Hitachi, Ltd.