Details of the Researcher

Sho Yokoi
Section
Center for Language AI Research
Job title
Associate Professor
Degree
  • Doctor of Information Science (Tohoku University)

  • Master of Information Science (Tohoku University)

e-Rad No.
60888949
Profile

Almost all information about me can be found at http://www.cl.ecei.tohoku.ac.jp/~yokoi/index_ja.html.

Research Interests 2

  • Representation Learning

  • Natural Language Processing

Awards 1

  1. The 31st Annual Meeting of the Association for Natural Language Processing (NLP), Committee Special Award

    2025/03 The Association for Natural Language Processing, "Semantic Change Detection via Unbalanced Optimal Transport"

Papers 34

  1. On Entity Identification in Language Models International-journal International-coauthorship Peer-reviewed

    Masaki Sakata, Sho Yokoi, Benjamin Heinzerling, Takumi Ito, Kentaro Inui

    Findings of the Association for Computational Linguistics: ACL 2025 2025/07

  2. Quantifying Lexical Semantic Shift via Unbalanced Optimal Transport International-journal Peer-reviewed

    Ryo Kishino, Hiroaki Yamagiwa, Ryo Nagata, Sho Yokoi, Hidetoshi Shimodaira

    Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL) 2025/07

  3. SoftMatcha: A Soft and Fast Pattern Matcher for Billion-Scale Corpus Searches International-journal International-coauthorship Peer-reviewed

    Hiroyuki Deguchi, Go Kamoda, Yusuke Matsushita, Chihiro Taguchi, Kohei Suenaga, Masaki Waga, Sho Yokoi

    The Thirteenth International Conference on Learning Representations (ICLR) 2025/04

  4. TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models International-journal Peer-reviewed

    Makoto Shing, Kou Misaki, Han Bao, Sho Yokoi, Takuya Akiba

    The Thirteenth International Conference on Learning Representations (ICLR) 2025/04

  5. Zipfian Whitening International-journal Peer-reviewed

    Sho Yokoi, Han Bao, Hiroto Kurita, Hidetoshi Shimodaira

    Advances in Neural Information Processing Systems 37 (NeurIPS) 2024/12

  6. Subspace Representations for Soft Set Operations and Sentence Similarities International-journal Peer-reviewed

    Yoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura

    Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL) 3512-3524 2024/06

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2024.naacl-long.192  

  7. Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps International-journal Peer-reviewed

    Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    The Twelfth International Conference on Learning Representations (ICLR) 2024/05

  8. Transformer Language Models Handle Word Frequency in Prediction Head.

    Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    ACL (Findings) 4523-4535 2023

    DOI: 10.18653/v1/2023.findings-acl.276  

  9. Norm of word embedding encodes information gain

    Momose Oyama, Sho Yokoi, Hidetoshi Shimodaira

    2022/12/19

    DOI: 10.48550/arXiv.2212.09663  

    Distributed representations of words encode lexical semantic information, but how is that information encoded in word embeddings? Focusing on the skip-gram with negative-sampling method, we show theoretically and experimentally that the squared norm of a word embedding encodes the information gain defined by the Kullback-Leibler divergence from the co-occurrence distribution of a word to the unigram distribution of the corpus. Furthermore, through experiments on keyword extraction, hypernym prediction, and part-of-speech discrimination, we confirmed that the KL divergence and the squared norm of the embedding work as a measure of the informativeness of a word, provided that the bias caused by word frequency is adequately corrected. (A notational restatement of this relation is sketched after this publication list.)

  10. Improving word mover's distance by leveraging self-attention matrix

    Hiroaki Yamagiwa, Sho Yokoi, Hidetoshi Shimodaira

    CoRR abs/2211.06229 2022

    DOI: 10.48550/arXiv.2211.06229  

  11. Subspace-based Set Operations on a Pre-trained Word Embedding Space

    Yoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura

    CoRR abs/2210.13034 2022

    DOI: 10.48550/arXiv.2210.13034  

  12. Instance-Based Neural Dependency Parsing Peer-reviewed

    Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Masashi Yoshikawa, Kentaro Inui

    Transactions of the Association for Computational Linguistics 9 1493-1507 2021/12/17

    Publisher: MIT Press - Journals

    DOI: 10.1162/tacl_a_00439  

    eISSN: 2307-387X

    Interpretable rationales for model predictions are crucial in practical applications. We develop neural models that possess an interpretable inference process for dependency parsing. Our models adopt instance-based inference, where dependency edges are extracted and labeled by comparing them to edges in a training set. The training edges are explicitly used for the predictions; thus, it is easy to grasp the contribution of each edge to a prediction. Our experiments show that our instance-based models achieve accuracy competitive with standard neural models while offering reasonably plausible instance-based explanations. (A schematic sketch of this comparison step appears after this publication list.)

  13. Incorporating Residual and Normalization Layers into Analysis of Masked Language Models.

    Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    CoRR abs/2109.07152 2021

  14. Instance-Based Neural Dependency Parsing.

    Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Masashi Yoshikawa, Kentaro Inui

    Transactions of the Association for Computational Linguistics 9 1493-1507 2021

    DOI: 10.1162/tacl_a_00439  

  15. Computationally Efficient Wasserstein Loss for Structured Labels

    Ayato Toyokuni, Sho Yokoi, Hisashi Kashima, Makoto Yamada

    CoRR abs/2103.00899 2021

  16. Efficient Estimation of Influence of a Training Instance Peer-reviewed

    Sosuke Kobayashi, Yokoi Sho, Suzuki Jun, Inui Kentaro

    Journal of Natural Language Processing 28 (2) 573-597 2021

    Publisher: The Association for Natural Language Processing

    DOI: 10.5715/jnlp.28.573  

    ISSN: 1340-7619

    eISSN: 2185-8314

    Understanding the influence of a training instance on a machine-learning model is important for interpreting the behavior of the model. However, evaluating this influence has been difficult and inefficient, because it requires considering how the prediction of a model would change if a training instance were not used. This has prevented the application of influence estimation to neural networks with large numbers of parameters. In this paper, we propose an efficient method for estimating the influence for neural networks. The method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance. By switching between dropout masks, we can use sub-networks that learned or did not learn each training instance and estimate its influence based on the difference between the sub-networks. Through experiments with BERT and VGGNet on classification datasets, it was demonstrated that the proposed method can enhance the interpretability of error predictions. Quantitative evaluations were also performed by analyzing learning curves of sub-networks and applying the method to data filtering. (A conceptual sketch of the mask-switching comparison appears after this publication list.)

  17. Incorporating Residual and Normalization Layers into Analysis of Masked Language Models Peer-reviewed

    Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 4547-4568 2021

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2021.emnlp-main.373  

  18. Computationally Efficient Wasserstein Loss for Structured Labels Peer-reviewed

    Ayato Toyokuni, Sho Yokoi, Hisashi Kashima, Makoto Yamada

    Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop 1-7 2021

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2021.eacl-srw.1  

  19. Evaluation of Similarity-based Explanations Peer-reviewed

    Kazuaki Hanawa, Sho Yokoi, Satoshi Hara, Kentaro Inui

    The Ninth International Conference on Learning Representations (ICLR) 2021

  20. Modeling Event Salience in Narratives via Barthes' Cardinal Functions

    Takaki Otake, Sho Yokoi, Naoya Inoue, Ryo Takahashi, Tatsuki Kuribayashi, Kentaro Inui

    CoRR abs/2011.01785 2020

  21. Word Rotator's Distance: Decomposing Vectors Gives Better Representations

    Sho Yokoi, Ryo Takahashi, Reina Akama, Jun Suzuki, Kentaro Inui

    CoRR abs/2004.15003 2020

  22. Utterance Pair Scoring for Noisy Dialogue Data Filtering

    Reina Akama, Sho Yokoi, Jun Suzuki, Kentaro Inui

    CoRR abs/2004.14008 2020

  23. Efficient Estimation of Influence of a Training Instance Peer-reviewed

    Sosuke Kobayashi, Sho Yokoi, Jun Suzuki, Kentaro Inui

    Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing 2020

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2020.sustainlp-1.6  

  24. Attention is Not Only a Weight: Analyzing Transformers with Vector Norms Peer-reviewed

    Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 7057-7075 2020

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2020.emnlp-main.574  

  25. Word Rotator's Distance Peer-reviewed

    Sho Yokoi, Ryo Takahashi, Reina Akama, Jun Suzuki, Kentaro Inui

    Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2944-2960 2020

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2020.emnlp-main.236  

  26. Filtering Noisy Dialogue Corpora by Connectivity and Content Relatedness Peer-reviewed

    Reina Akama, Sho Yokoi, Jun Suzuki, Kentaro Inui

    Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 941-958 2020

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2020.emnlp-main.68  

  27. Modeling Event Salience in Narratives via Barthes' Cardinal Functions Peer-reviewed

    Takaki Otake, Sho Yokoi, Naoya Inoue, Ryo Takahashi, Tatsuki Kuribayashi, Kentaro Inui

    Proceedings of the 28th International Conference on Computational Linguistics (COLING) 1784-1794 2020

    Publisher: International Committee on Computational Linguistics

    DOI: 10.18653/v1/2020.coling-main.160  

  28. Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition. Peer-reviewed

    Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Ryuto Konno, Kentaro Inui

    Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL) 6452-6459 2020

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/2020.acl-main.575  

  29. Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions

    Sho Yokoi, Sosuke Kobayashi, Kenji Fukumizu, Jun Suzuki, Kentaro Inui

    CoRR abs/1809.00800 2018

  30. Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions Peer-reviewed

    Sho Yokoi, Sosuke Kobayashi, Kenji Fukumizu, Jun Suzuki, Kentaro Inui

    Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP) 1763-1775 2018

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/d18-1203  

  31. Unsupervised Learning of Style-sensitive Word Vectors Peer-reviewed

    Reina Akama, Kento Watanabe, Sho Yokoi, Sosuke Kobayashi, Kentaro Inui

    Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL) 572-578 2018

    Publisher: Association for Computational Linguistics

    DOI: 10.18653/v1/P18-2091  

  32. Learning Co-Substructures by Kernel Dependence Maximization Peer-reviewed

    Sho Yokoi, Daichi Mochihashi, Ryo Takahashi, Naoaki Okazaki, Kentaro Inui

    Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI) 3329-3335 2017

    Publisher: ijcai.org

    DOI: 10.24963/ijcai.2017/465  

  33. Link Prediction in Sparse Networks by Incidence Matrix Factorization Peer-reviewed

    Sho Yokoi, Hiroshi Kajino, Hisashi Kashima

    Journal of Information Processing 25 477-485 2017

    DOI: 10.2197/ipsjjip.25.477  

  34. Link Prediction by Incidence Matrix Factorization

    Sho Yokoi, Hiroshi Kajino, Hisashi Kashima

    ECAI 2016 - 22nd European Conference on Artificial Intelligence (ECAI) 1730-1731 2016

    Publisher: IOS Press

    DOI: 10.3233/978-1-61499-672-9-1730  
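
Entry 9 above reports that the squared norm of a skip-gram-with-negative-sampling word embedding encodes the information gain of the word. The display below restates that claim in notation; the symbols u_w (the embedding of word w), p(· | w) (the co-occurrence distribution of w), and p(·) (the corpus unigram distribution) are introduced here purely for illustration, and the relation is schematic, holding only up to the word-frequency correction discussed in the abstract, not the paper's exact statement.

    \[
      \|u_w\|^{2} \;\propto\; \mathrm{KL}\bigl(p(\cdot \mid w)\,\|\,p(\cdot)\bigr)
      \;=\; \sum_{w'} p(w' \mid w)\,\log\frac{p(w' \mid w)}{p(w')}
    \]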
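
Entry 12 above performs instance-based inference: a candidate dependency edge is scored and labeled by comparing it with edges observed in the training set, so the supporting training edge can be inspected directly. The following NumPy sketch illustrates that comparison step under assumptions of my own (precomputed edge vectors, cosine similarity, a single nearest neighbor); it is a minimal illustration, not the paper's implementation.

    import numpy as np

    def label_edge_by_nearest_training_edge(candidate_vec, train_edge_vecs, train_edge_labels):
        """Label a candidate dependency edge by comparison with training-set edges.

        candidate_vec     : (d,) vector representing the candidate head-dependent edge
        train_edge_vecs   : (n, d) vectors of edges observed in the training set
        train_edge_labels : length-n list of dependency labels for those edges
        """
        # Cosine similarity between the candidate edge and every training edge.
        c = candidate_vec / np.linalg.norm(candidate_vec)
        T = train_edge_vecs / np.linalg.norm(train_edge_vecs, axis=1, keepdims=True)
        sims = T @ c
        best = int(np.argmax(sims))
        # Return both the predicted label and the index of the supporting training edge,
        # so the contribution of that edge to the prediction is easy to grasp.
        return train_edge_labels[best], best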
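
Entry 16 above estimates the influence of a training instance from the difference between two sub-networks selected by that instance's dropout mask: one that learned the instance and one that did not. The sketch below shows only this final comparison, with parameter-level masks and all names assumed for illustration; the training procedure that ties each instance to its mask is omitted, so this is a conceptual sketch rather than the proposed method itself.

    import numpy as np

    def estimated_influence(loss_fn, params, mask_i, test_example):
        """Influence of training instance i on a test example, estimated as the loss
        difference between the sub-network that did not learn i and the one that did.

        loss_fn(params, example) -> scalar loss of the (sub-)network on the example
        params  : dict of trained parameter arrays
        mask_i  : dict of 0/1 arrays; parameters zeroed by the mask are assumed,
                  by the training procedure, never to have been updated on instance i
        """
        learned     = {k: p * mask_i[k]         for k, p in params.items()}
        not_learned = {k: p * (1.0 - mask_i[k]) for k, p in params.items()}
        # A positive value means the sub-network deprived of instance i does worse,
        # i.e. instance i was helpful for this test example.
        return loss_fn(not_learned, test_example) - loss_fn(learned, test_example)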

Misc. 5

  1. Context-Mixing and Context-Preserving Effects in Transformers

    Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    Proceedings of the Annual Meeting of the Association for Natural Language Processing (Web) 27th 2021

    ISSN: 2188-4420

  2. Unsupervised Filtering of Low-Quality Dialogue Data Focusing on Phrase Correspondence and Topic Consistency

    Reina Akama, Jun Suzuki, Sho Yokoi, Kentaro Inui

    Proceedings of the Annual Meeting of the Association for Natural Language Processing (Web) 26th 2020

    ISSN: 2188-4420

  3. A Sentence Similarity Measure Based on Optimal Transport on the Hypersphere

    Sho Yokoi, Ryo Takahashi, Reina Akama, Jun Suzuki, Kentaro Inui

    Proceedings of the Annual Meeting of the Association for Natural Language Processing (Web) 26th 2020

    ISSN: 2188-4420

  4. Language-independent Dialogue Data Filtering for Neural Dialogue Response Generation

    Reina Akama, Sho Yokoi, Jun Suzuki, Kentaro Inui

    Proceedings of the Annual Conference of the Japanese Society for Artificial Intelligence (Web) 34th 2020

  5. Optimal Transport Cost between Texts via Norm-Direction Decomposition

    Sho Yokoi, Ryo Takahashi, Reina Akama, Jun Suzuki, Kentaro Inui

    Proceedings of the Annual Conference of the Japanese Society for Artificial Intelligence (Web) 34th 2020

Presentations 1

  1. An Introduction to Semantic Analysis of Corpora Using Large Language Models: Computing Meaning beyond the Sentence Invited

    Sho Yokoi

    Japan Association for English Corpus Studies (JAECS), 2025 Spring Research Meeting and General Assembly 2025/06

Research Projects 4

  1. Creating a Geometry of Language Connecting Meaning, Data, and Models

    Sho Yokoi

    Offer Organization: Japan Science and Technology Agency (JST)

    System: FOREST (Fusion Oriented REsearch for disruptive Science and Technology)

    Institution: National Institute for Japanese Language and Linguistics (NINJAL)

    2024/10 - 2030

    We will construct a new mathematical foundation for describing how the diverse kinds of "meaning" carried by language are held inside models trained on text. In particular, we will build a new theory that connects three things: concepts related to meaning, the statistics of text data, and the geometric structure of models' internal representations. Building on this, we will also provide general-purpose techniques for improving the interpretability of black-box models and for stably adding and removing functions and knowledge.

  2. Frontiers of Data Science by Probabilistic Description and Inference of Dynamics

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Transformative Research Areas (A)

    Institution: The Institute of Statistical Mathematics

    2022/06 - 2027/03

  3. Knowledge inference system for robot integrating commonsense in natural language with real-world observation

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Institute of Physical and Chemical Research

    2022/04/01 - 2026/03/31

  4. Connecting the Shape of the Space in Which Words Are Embedded with the Meaning of Words

    Sho Yokoi

    Offer Organization: Japan Science and Technology Agency (JST)

    System: Strategic Basic Research Programs, ACT-X

    Institution: Tohoku University

    2020 - 2022

    Natural language processing has advanced remarkably through methods built on word vectors. However, no established way exists to represent or compare units of text larger than a word, such as sentences, and even detecting critical errors made by language processing systems remains difficult. This project aims to solve this problem by reducing the geometric information carried by word vectors to linguistic information and, using optimal transport as a foothold, constructing a sentence similarity measure that is sensitive to differences in the information contained in sentences.
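
The description above (together with entry 25 and Misc entry 5 in the lists above) builds a sentence similarity measure by decomposing word vectors into norm and direction and comparing sentences with optimal transport. The NumPy sketch below illustrates that idea under simplifications of my own: word-vector norms serve as transport mass, cosine distance between directions serves as transport cost, and an entropy-regularized Sinkhorn solver stands in for exact optimal transport. All function names and parameters are illustrative, not the published method.

    import numpy as np

    def sinkhorn_cost(a, b, C, reg=0.1, n_iter=300):
        """Entropy-regularized optimal transport cost between weight vectors a (n,)
        and b (m,) with cost matrix C (n, m), via plain Sinkhorn iterations."""
        K = np.exp(-C / reg)
        u, v = np.ones_like(a), np.ones_like(b)
        for _ in range(n_iter):
            u = a / (K @ v)
            v = b / (K.T @ u)
        P = u[:, None] * K * v[None, :]      # approximate transport plan
        return float(np.sum(P * C))

    def norm_direction_ot_distance(X, Y):
        """Distance between two sentences given their word vectors X (n, d), Y (m, d).
        Norms act as word importance (mass); directions determine the cost."""
        nx, ny = np.linalg.norm(X, axis=1), np.linalg.norm(Y, axis=1)
        a, b = nx / nx.sum(), ny / ny.sum()            # mass distributions over words
        Xd, Yd = X / nx[:, None], Y / ny[:, None]      # unit-norm directions
        C = 1.0 - Xd @ Yd.T                            # cosine distance between directions
        return sinkhorn_cost(a, b, C)

A smaller value indicates more similar sentences; because high-norm words carry more transport mass, the distance is more sensitive to differences in the informative words of each sentence, which is the kind of sensitivity the project description calls for.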