Ruixiang (Ryan) Tang
About Me
I am a final-year Ph.D. student in the Department of Computer Science at Rice University. My advisor is Prof. Xia Hu, who leads the DATA Lab at Rice and the Rice D2K Lab. I obtained my bachelor's degree from the Department of Automation, Tsinghua University, where I was advised by Prof. Jiwen Lu.
My core research focus is Trustworthy AI, a critical area that demands trust be built into every stage of the AI lifecycle, from data acquisition and model development to deployment and user interaction. Within this overarching theme, I specialize in issues related to privacy, security, and explainability. Additionally, I collaborate closely with health informaticians from UTHealth and Baylor College of Medicine to deploy reliable AI for healthcare applications.
Recently, the emergence of Large Language Models (LLMs) such as GPT-4 has brought new and complex challenges to the trustworthiness of AI. My long-term objective is to develop computational regulatory frameworks for these generative AI systems to prevent potential misuse and ensure their responsible use.
I’m graduating in 2024 and am actively seeking a tenure-track faculty position.
Research Overview
Interests
Large Language Model Security & Privacy
ML Backdoor Learning and its Applications in Copyright Protection
Explainable AI (XAI) and Fairness
Text Mining in Healthcare Applications
Education
Ph.D. in Computer Science, 2021 - present
Rice University
Ph.D. in Computer Engineering, 2019 - 2021
Texas A&M University
BSc in Automation, 2014 - 2019
Tsinghua University
Research Experiences
Microsoft Research, Redmond, WA, May 2023 - Aug. 2023
Team: AI for Health Team
Role: Applied Scientist Intern
Project: Medical Dialogue Generation, Personally Identifiable Information (PII) Detection
Mentor(s): Gord Lueck, Rodolfo Quispe, Huseyin Inan, Janardhan Kulkarni
Microsoft Research, Remote, May 2022 - Aug. 2022
Team: AI for Health Team
Role: Research Intern
Project: Medical Dialogue Summarization, Privacy Risk Analysis in Large Language Models
Mentor(s): Gord Lueck, Rodolfo Quispe, Huseyin Inan, Janardhan Kulkarni
Adobe Research, Remote, May 2021 - Aug. 2021
Team: Document Intelligence Team
Role: Research Intern
Project: Watermarking Computer Vision Foundation Models and APIs
Mentor(s): Curtis Wigington, Rajiv Jain
Microsoft Research Asia, Beijing, China, Mar. 2019 - May 2019
Team: Social Computing Group
Role: Research Intern
Project: Interpretable Recommendation Systems
Mentor(s): Xiting Wang, Xing Xie
Duke University, Durham, NC, May 2018 - Aug. 2018
Team: Radiology Research
Role: Summer Intern
Project: Classification of Chest CT using Case-level Weak Supervision
Mentor: Joseph Y. Lo
News (2023)
10/2023: Our paper about using LLMs for patient-trial matching was selected for the AMIA 2023 Best Student Paper Award and the KDDM Student Innovation Award.
10/2023: Our paper about building knowledge refinement and retrieval systems to support interdisciplinarity in biomedical research received the CIKM 2023 Best Demo Paper Honorable Mention.
10/2023: One paper has been accepted by EMNLP 2023. We introduce a membership inference attack aimed at Large Language Models to analyze associated privacy risks.
09/2023: Two papers have been accepted by NeurIPS 2023. We proposed a honeypot mechanism to defend against backdoor attacks in language models.
09/2023: Our paper "The Science of Detecting LLM Generated Texts" has been accepted by Communications of the ACM. We provide an overview of existing techniques for detecting LLM-generated text, aiming to enhance the control and regulation of language generation models.
08/2023: One paper has been accepted by ECML-PKDD 2023. We proposed a serial key protection mechanism for safeguarding DNN models.
08/2023: Two papers have been accepted by CIKM 2023. We proposed a transferable watermark for defending against model extraction attacks.
07/2023: Three papers have been accepted by the AMIA 2023 Annual Symposium. We investigated methods for harnessing the capabilities of LLMs in several healthcare tasks, such as Named Entity Recognition, Relation Extraction, and Patient-Trial Matching.
05/2023: One paper has been accepted by ACL 2023. We reveal that LLMs are "lazy learners" that tend to exploit shortcuts in prompts for downstream tasks. Surprisingly, we also find that larger models are more likely to exploit such shortcuts during inference.
04/2023: One paper has been accepted by SIGKDD Explorations 2023. We proposed a clean-label backdoor-based watermarking framework for safeguarding training datasets.
03/2023: One paper has been accepted by ICHI 2023. We proposed a deep ensemble framework for improving phenotype prediction from multi-modal data.