CV
Education
| University | Major | Period |
|---|---|---|
| Tsinghua University | Computer Science and Technology | 2013.09 - 2017.07 |
Work Experience
Huami Technology - AI Large Model Expert (2024.11 - Present)
- Led cloud-side AI development for Flow, the voice assistant on Amazfit smartwatches, delivering LLM-based intelligent interaction
- Defined product features and designed technical solutions, coordinating cross-department teams to deliver requirements
- Built an LLM evaluation system, improving model performance metrics by more than 30%
- Developed an automated testing framework, increasing model iteration efficiency by 40%
Atom Echo - Large Model Lead & Co-founder (2022.04 - Present)
- Led development of the Atom-1B/7B/13B series of large language models, covering lightweight to general-purpose use cases
- Built a cleaning pipeline for 200TB+ of Chinese corpus data, with multi-round deduplication and sensitive-information filtering
- Built an end-to-end technical stack from data cleaning to model deployment, raising team R&D efficiency by 30%+
- Co-initiated the Llama Chinese community (14k+ GitHub stars), contributing core open-source code and Chinese pretraining approaches
Recurrent AI - Algorithm Researcher (2020.03 - 2022.04)
- Developed NLP algorithms, improving text classification and NER accuracy by 15%
- Designed a large-model training framework supporting distributed training of models at the 100-billion-parameter scale
- Led delivery of multiple enterprise NLP projects, including intelligent customer service and content moderation
Projects
Atom Series Large Language Models
- Led development of the Atom-1B/7B/13B models, covering lightweight to general-purpose use cases
- Built a cleaning pipeline for 200TB+ of Chinese corpus data, with multi-round deduplication and sensitive-information filtering
- Built an inference serving engine supporting a 32K context length for efficient deployment
- Models downloaded more than 200,000 times on HuggingFace and applied to knowledge-base report generation, content generation, and other scenarios
AskOnce Personalized Knowledge Base
- Enterprise knowledge management platform built on the Atom LLM
- Parses multiple document formats (PDF, Word, Excel, and others) and builds a dynamic knowledge-base index
- Developed a natural-language interface with responses returned within seconds
- Established a standardized knowledge-extraction process, improving employees' problem-resolution efficiency by 60%+
BMW Internal Document Q&A System
- Automated summarization, classification, and structuring of enterprise documents
- Developed a customized retrieval-augmented generation (RAG) architecture with industry-leading answer accuracy
- Supports integration with enterprise OA/CRM systems, closing the loop on internal knowledge flow
Internal Document Q&A System for a Confidential Project
- Automated parsing of multilingual confidential documents at a volume of millions per day
- Integrated tiered access controls for classified data to keep sensitive-information handling compliant
- Built a data provenance tracking system that logs the full document-processing chain
General-Purpose Data Governance Algorithm Platform
- Supports the full data governance workflow for two modalities, text and audio
- Integrates 20+ core algorithm capabilities, including text classification, entity extraction, translation, and compliance detection
- Covers data governance needs across finance, legal, healthcare, and other domains
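Classification-Driven Real-Time Script Recommendation System for Telesales
- Preprocessed millions of historical dialogue texts and extracted 500+ high-frequency response scripts
- Achieved millisecond-level recommendation latency, reducing agents' manual decision time by 30%
- Deployed in outbound telesales and customer-service IM scenarios, improving communication efficiency and standardization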
Skills
- Natural Language Processing
- Large Language Model Training & Fine-tuning
- Multimodal Model Development
- Team Management & Collaboration
Publications
- Yongle Li, Zheng Zhang, et al. AtomTool: Empowering Large Language Models with Tool Utilization Skills. PRCV 2024.
- Yongle Li, Bo Liu, Sheng Huang, Zheng Zhang, Xiaotong Yuan, Richang Hong. Communication-Efficient and Personalized Federated LLM Fine-Tuning via Tri-Matrix Adaptation.
- Junqi Zhang, Zheng Zhang, et al. Llama 大模型实践指南 (Llama Large Model Practice Guide). Publishing House of Electronics Industry, 2024.
- Zijun Chen, Zheng Zhang, et al. Unveiling Uncertainty: Calibration of Multimodal LLMs.