YuyaoGe's Website
Reasoning
Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models
We propose PRISM-Δ, a differential subspace steering method for prompt highlighting that matches or exceeds the best existing method on 19 of 20 configurations with relative gains up to +10.6%, while halving the fluency cost.
Yuyao Ge 葛钰峣, Shenghua Liu, Yiwei Wang, Tianyu Liu, Baolong Bi, Lingrui Mei, Jiayu Yao, Jiafeng Guo, Xueqi Cheng
Do Large Language Models Already Know the Answer Before They Finish Thinking?
Probing hidden states during reasoning reveals that LLMs already know the answer before finishing thinking. We detect overthinking via ‘jumps’ and intervene during inference to improve reasoning.
Yuyao Ge 葛钰峣, Shenghua Liu, Yiwei Wang, Tianyu Liu, Lingrui Mei, Baolong Bi, Jiayuan Guo, Jiayu Yao, Jiafeng Guo, Xueqi Cheng
PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding
Abstract: We present PromptCD, a test-time method for controlling LLM behavior without additional training. The approach creates paired positive and negative guiding prompts for a target behavior and contrasts model responses at the token-probability level for LLMs and through visual attention patterns for VLMs.
Baolong Bi, Yuyao Ge 葛钰峣, Shenghua Liu, Yuchen He, Siqian Tong, Lizhe Chen, Lingrui Mei, Zehao Li, Yiwei Wang, Yujun Cai, Ming-Hsuan Yang, Xueqi Cheng
Reward and Guidance through Rubrics: Promoting Exploration to Improve Multi-Domain Reasoning
Abstract: Reinforcement learning (RL) has shown great promise in enhancing LLM reasoning, but current approaches mainly focus on single domains with verifiable rewards. We propose RGR-GRPO, a rubric-driven RL framework for multi-domain reasoning that uses rubrics to provide fine-grained reward signals and offline guidance.
Baolong Bi, Shenghua Liu, Yiwei Wang, Siqian Tong, Lingrui Mei, Yuyao Ge 葛钰峣, Yilong Xu, Jiafeng Guo, Xueqi Cheng
A Survey of Vibe Coding with Large Language Models
The advancement of large language models (LLMs) has catalyzed a paradigm shift from code generation assistance to autonomous coding …
Yuyao Ge 葛钰峣, Lingrui Mei, Zenghao Duan, Tianhao Li, Yujia Zheng, Yiwei Wang, Lexin Wang, Jiayu Yao, Tianyu Liu, Yujun Cai, Baolong Bi, Fangda Guo, Jiafeng Guo, Shenghua Liu, Xueqi Cheng
Focusing by Contrastive Attention: Enhancing VLMs' Visual Reasoning
Vision-Language Models (VLMs) have demonstrated remarkable success across diverse visual tasks, yet their performance degrades in …
Yuyao Ge 葛钰峣, Shenghua Liu, Yiwei Wang, Lingrui Mei, Baolong Bi, Xuanshan Zhou, Jiayu Yao, Jiafeng Guo, Xueqi Cheng
Can Graph Descriptive Order Affect Solving Graph Problems with LLMs?
We present the first comprehensive analysis of how the order of graph descriptions impacts LLM performance, evaluating four graph description orders across six graph problems using six mainstream LLMs.
Yuyao Ge 葛钰峣, Shenghua Liu, Baolong Bi, Yiwei Wang, Lingrui Mei, Wenjie Feng, Lizhe Chen, Xueqi Cheng
ACL Anthology
a1: Steep Test-time Scaling Law via Environment Augmented Generation
Large Language Models (LLMs) have made remarkable breakthroughs in reasoning, yet continue to struggle with hallucinations, logical …
Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Yuyao Ge 葛钰峣, Jun Wan, Yurong Wu, Xueqi Cheng
Innate Reasoning is Not Enough: In-Context Learning Enhances Reasoning Large Language Models with Less Overthinking
We present the first comprehensive analysis of the impacts of CoT prompting on Reasoning LLMs, finding that one-shot CoT consistently enhances performance and reduces excessive reflections by approximately 90%.
Yuyao Ge 葛钰峣, Shenghua Liu, Yiwei Wang, Lingrui Mei, Lizhe Chen, Baolong Bi, Xueqi Cheng
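EMNLP 2024 Paper Sharing | Fewer is More: CoT Examples Should Be Few but Well Chosen
The authors propose CoT-Influx, a method that improves the reasoning ability of LLMs by optimizing the examples and content of CoT prompts. Its core idea is to use pruning to maximize the useful information in the input.
Yuyao Ge 葛钰峣
Oct 24, 2024
2 min read
Paper Sharing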
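Paper Review | TTA: A New Method for Estimating the Confidence of LLM Answers
This paper proposes a new method that comprehensively evaluates the trustworthiness of multiple candidate answers from an LLM, in order to reduce the model's overconfidence in wrong answers.
Yuyao Ge 葛钰峣
Mar 25, 2024
2 min read
Paper Sharing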
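Paper Review | The Latest Survey (March) of LLM-Based Agents for Games
A review of the latest survey, published in March, of LLM-based agents for games.
Yuyao Ge 葛钰峣
Mar 21, 2024
1 min read
Paper Sharing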
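Paper Review | Auto-CoT: Automatically Generating CoT via Clustering
CoT prompting has had two paradigms. One is Zero-shot CoT, which appends "Let's think step by step" to the end of the question. The other is Manual CoT (Few-shot CoT), in which each example consists of a question and a reasoning chain. How well the second approach performs depends on the quality of the CoT demonstrations, which have to be written by hand. This paper proposes Auto-CoT, which generates Few-shot CoT demonstrations automatically and removes that manual work.
Yuyao Ge 葛钰峣
Mar 2, 2024
1 min read
Paper Sharing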
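Paper Review | Do Longer Chains of Thought Make LLMs Smarter?
Chain of thought (CoT) has been shown in practice to significantly improve the reasoning ability of large models. However, no prior work has explained the relationship between chain-of-thought length and reasoning ability. This paper runs systematic CoT experiments around that core question and reports many interesting and counterintuitive conclusions.
Yuyao Ge 葛钰峣
Feb 26, 2024
1 min read
Paper Sharing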
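Paper Review | Graph-Guided Reasoning for Multi-Hop Question Answering in Large Language Models
The paper proposes an LLM-based, graph-guided reasoning approach for multi-step reasoning problems. It makes two main contributions: the reasoning approach itself, and an in-context learning method for knowledge-triple extraction that allows variable definitions.
Yuyao Ge 葛钰峣
Nov 20, 2023
2 min read
Paper Sharing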
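Paper Review | ReAct: An LLM Reasoning Paradigm Combining Reasoning and Acting
The ReAct paradigm combines reasoning and acting in large language models.
Yuyao Ge 葛钰峣
Oct 27, 2023
1 min read
Paper Sharing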