publications

A full list of publications can be found on my Google Scholar page.

2023

  1. arXiv
    StarCoder: May the source be with you!
    Li, Raymond, Allal, Loubna Ben, Zi, Yangtian, ..., Zhu, Jian, and others
    arXiv preprint arXiv:2305.06161 2023
  2. MLHC
    Dialogue-Contextualized Re-ranking for Medical History-Taking
    Zhu, Jian, Valmianski, Ilya, and Kannan, Anitha
    Machine Learning for Health Care 2023
  3. AMPPS
    Multidimensional signals and analytic flexibility: Estimating degrees of freedom in human speech analyses
    Coretta, Ste, Casillas, Joseph V, ..., Zhu, Jian, ..., and Roettger, Timo B
    Advances in Methods and Practices in Psychological Science 2023

2022

  1. arXiv
    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
    Scao, Teven Le, Fan, Angela, Akiki, Christopher, ..., Zhu, Jian, and others
    arXiv preprint arXiv:2211.05100 2022
  2. EMNLP Findings
    Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
    Zhu, Jian, Tian, Zuoyu, Liu, Yadong, Zhang, Cong, and Lo, Chia-wen
    In Findings of Empirical Methods in Natural Language Processing 2022
  3. NeurIPS
    The BigScience ROOTS Corpus: A 1.6 TB Composite Multilingual Dataset
    Laurençon, Hugo, Saulnier, Lucile, Wang, Thomas, ..., Zhu, Jian, and others
    In NeurIPS Datasets and Benchmarks Track 2022
  4. Interspeech
    ByT5 model for massively multilingual grapheme-to-phoneme conversion
    Zhu, Jian, Zhang, Cong, and Jurgens, David
    Interspeech 2022
  5. ICASSP
    Phone-to-audio alignment without text: A Semi-supervised Approach
    Zhu, Jian, Zhang, Cong, and Jurgens, David
    IEEE International Conference on Acoustics, Speech and Signal Processing 2022

2021

  1. EMNLP
    Idiosyncratic but not Arbitrary: Learning Idiolects in Online Registers Reveals Distinctive yet Consistent Individual Styles
    Zhu, Jian, and Jurgens, David
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing Nov 2021
  2. NAACL
    The structure of online social networks modulates the rate of lexical change
    Zhu, Jian, and Jurgens, David
    In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Jun 2021
  3. Interspeech
    Synchronising Speech Segments with Musical Beats in Mandarin and English Singing
    Zhang, Cong, and Zhu, Jian
    In Proc. Interspeech 2021

2020

  1. SpeechProsody
    Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
    Zhu, Jian
    In Proc. Speech Prosody 2020

2019

  1. SemEval
    UM-IU@LING at SemEval-2019 Task 6: Identifying Offensive Tweets Using BERT and SVMs
    Zhu, Jian, Tian, Zuoyu, and Kübler, Sandra
    In Proceedings of the 13th International Workshop on Semantic Evaluation Jun 2019
  2. ICASSP
    Predicting tongue motion in unlabeled ultrasound videos using convolutional LSTM neural networks
    Zhao, Chaojie, Zhang, Peng, Zhu, Jian, Wu, Chengrui, Wang, Huaimin, and Xu, Kele
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
  3. ICASSP
    Denoising convolutional autoencoder based B-mode ultrasound tongue image feature extraction
    Li, Bo, Xu, Kele, Feng, Dawei, Mi, Haibo, Wang, Huaimin, and Zhu, Jian
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
  4. arXiv
    A CNN-based tool for automatic tongue contour tracking in ultrasound images
    Zhu, Jian, Styler, Will, and Calloway, Ian
    arXiv preprint arXiv:1907.10210 2019

2016

  1. JASA
    Effect of several acoustic cues on perceiving Mandarin retroflex affricates and fricatives in continuous speech
    Zhu, Jian, and Chen, Yaping
    The Journal of the Acoustical Society of America 2016