Publications
(*=Equal Contribution)
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision
Yunyi Zhang, Ruozhen Yang*, Xueqiang Xu*, Rui Li*, Jinfeng Xiao, Jiaming Shen, Jiawei Han.
arXiv:2403.00165
A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion
Yanzhen Shen, Yu Zhang, Yunyi Zhang, Jiawei Han.
arXiv:2402.13405
Unsupervised Episode Detection for Large-Scale News Events
Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, Jiawei Han.
arXiv:2408.04873
ACER: Automatic Language Model Context Extension via Retrieval
Luyu Gao, Yunyi Zhang, Jamie Callan.
arXiv:2410.09141
Taxonomy-guided Semantic Indexing for Academic Paper Search
SeongKu Kang, Yunyi Zhang, Pengcheng Jiang, Dongha Lee, Jiawei Han, Hwanjo Yu.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
Automated Mining of Structured Knowledge from Text in the Era of Large Language Models [Website]
Yunyi Zhang, Ming Zhong, Siru Ouyang, Yizhu Jiao, Sizhe Zhou, Linyi Ding, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2024. (Tutorial)
Ontology Enrichment for Effective Fine-grained Entity Typing [Paper]
Siru Ouyang, Jiaxin Huang, Pranav Pillai, Yunyi Zhang, Yu Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2024.
Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains [Paper] [Code]
Yu Zhang*, Yunyi Zhang*, Yanzhen Shen, Yu Deng, Lucian Popa, Larisa Shwartz, ChengXiang Zhai, and Jiawei Han.
AAAI Conference on Artificial Intelligence (AAAI), 2024.
PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training [Paper] [Code]
Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers [Paper] [Code]
Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, and Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023.
Pretrained Language Representations for Text Understanding: A Weakly-Supervised Perspective [Website]
Yu Meng, Jiaxin Huang, Yu Zhang, Yunyi Zhang, and Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023. (Tutorial)
Unsupervised Story Discovery from Continuous News Streams via Scalable Thematic Embedding [Paper] [Code]
Susik Yoon, Dongha Lee, Yunyi Zhang and Jiawei Han.
ACM SIGIR International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
Unsupervised Event Chain Mining from Multiple Documents [Paper]
Yizhu Jiao, Ming Zhong, Jiaming Shen, Yunyi Zhang, Chao Zhang and Jiawei Han.
The ACM Web Conference (WWW), 2023.
Mining Structures from Massive Texts by Exploring the Power of Pre-trained Language Models [Website]
Yu Zhang, Yunyi Zhang, and Jiawei Han.
The International Conference on Extending Database Technology (EDBT), 2023. (Tutorial)
Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts [Paper] [Code]
Yu Zhang*, Yunyi Zhang*, Martin Michalski*, Yucheng Jiang*, Yu Meng*, and Jiawei Han.
ACM International Conference on Web Search and Data Mining (WSDM), 2023.
Entity Set Co-Expansion in StackOverflow [Paper]
Yu Zhang*, Yunyi Zhang*, Yucheng Jiang, Martin Michalski, Yu Deng, Lucian Popa, ChengXiang Zhai, Jiawei Han.
Workshop on Knowledge Discovery and Data Mining in IT Operations (BigData-IT@IEEE BigData), 2022.
Unsupervised Key Event Detection from Massive Text Corpora [Paper] [Code]
Yunyi Zhang, Fang Guo, Jiaming Shen, and Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022.
Topic Discovery via Latent Space Clustering of Language Model Representations [Paper] [Code]
Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang and Jiawei Han.
The ACM Web Conference (WWW), 2022.
Corpus-based Open-Domain Event Type Induction [Paper] [Code]
Jiaming Shen, Yunyi Zhang, Heng Ji, and Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [Paper] [Code]
Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji and Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup [Paper] [Code: GradCache, GC-DPR]
Luyu Gao, Yunyi Zhang, Jiawei Han, Jamie Callan.
The 6th Workshop on Representation Learning for NLP (RepL4NLP@ACL), 2021.
Text Classification Using Label Names Only: A Language Model Self-Training Approach [Paper] [Code]
Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding [Paper] [Code]
Yu Meng*, Yunyi Zhang*, Jiaxin Huang, Yu Zhang, Chao Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020.
CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring [Paper] [Code]
Jiaxin Huang, Yiqing Xie, Yu Meng, Yunyi Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020.
Empower Entity Set Expansion via Language Model Probing [Paper] [Code]
Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han.
Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
Guiding Corpus-based Set Expansion by Auxiliary Sets Generation and Co-Expansion [Paper] [Code]
Jiaxin Huang, Yiqing Xie, Yu Meng, Jiaming Shen, Yunyi Zhang, Jiawei Han.
The ACM Web Conference (WWW), 2020.
Complexity of Leading Digit Sequences [Paper]
Xinwei He*, A. J. Hildebrand*, Yuchen Li*, Yunyi Zhang*.
Discrete Mathematics & Theoretical Computer Science, vol. 22 no. 1, April 2020.