benty-fields - Search paper

1601. Layer-Wise Evolution of Representations in Fine-Tuned Transformers: Insights from Sparse AutoEncoders

Suneel Nadipalli

arXiv:2502.16722v1

1602. Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective

Weizhong Huang, Yuxin Zhang, Xiawu Zheng, Fei Chao, Rongrong Ji

arXiv:2502.14770v1

1603. Exploring RWKV for Sentence Embeddings: Layer-wise Analysis and Baseline Comparison for Semantic Similarity

Xinghan Pan

arXiv:2502.14620v1

1604. Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning

Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu

arXiv:2502.14356v1

1605. No-reference geometry quality assessment for colorless point clouds via list-wise rank learning

Zheng Li, Bingxu Xie, Chao Chu, Weiqing Li, Zhiyong Su

Computers & Graphics, 127, 104176 (2025)

arXiv:2502.11726v1

1606. DreamDDP: Accelerating Data Parallel Distributed LLM Training with Layer-wise Scheduled Partial Synchronization

Zhenheng Tang, Zichen Tang, Junlin Huang, Xinglin Pan, Rudan Yan, Yuxin Wang, Amelie Chi Zhou, Shaohuai Shi et al.

arXiv:2502.11058v1

1607. FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching

Hui Wang, Shujie Liu, Lingwei Meng, Jinyu Li, Yifan Yang, Shiwan Zhao, Haiyang Sun, Yanqing Liu et al.

arXiv:2502.11128v1

1608. LayAlign: Enhancing Multilingual Reasoning in Large Language Models via Layer-Wise Adaptive Fusion and Alignment Strategy

Zhiwen Ruan, Yixia Li, He Zhu, Longyue Wang, Weihua Luo, Kaifu Zhang, Yun Chen, Guanhua Chen

arXiv:2502.11405v1

1609. Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation

Toshinori Kitamura, Arnob Ghosh, Tadashi Kozuno, Wataru Kumagai, Kazumi Kasaura, Kenta Hoshino, Yohei Hosoe, Yutaka Matsuo

arXiv:2502.10138v2

1610. Multi-agent systems with multiple-wise interaction: Propagation of chaos and macroscopic limit

Thierry Paul, Stefano Rossi, Emmanuel Trélat, Eth Zurich

arXiv:2502.09098v1

1611. PLayer-FL: A Principled Approach to Personalized Layer-wise Cross-Silo Federated Learning

Ahmed Elhussein, Gamze Gürsoy

arXiv:2502.08829v1

1612. Non-Monetary Mechanism Design without Distributional Information: Using Scarce Audits Wisely

Yan Dai, Moise Blanchard, Patrick Jaillet

arXiv:2502.08412v2

We study a repeated resource allocation problem with strategic agents where monetary transfers are disallowed and the central planner has no prior information on agents' utility distributions. In light of Arrow's impossibility theorem, acquiring information about agent preferences through some form of feedback is necessary. We assume that the central planner can request powerful but expensive audits on the winner in any round, revealing the true utility of the winner in that round. We design a mechanism achieving $T$-independent $O(K^2)$ social welfare regret while only requesting $O(K^3 \log T)$ audits in expectation, where $K$ is the number of agents and $T$ is the number of rounds. We also show an $\Omega(K)$ lower bound on the regret and an $\Omega(1)$ lower bound on the number of audits when having low regret. Algorithmically, we show that incentive-compatibility can be mostly enforced via the imposition of adaptive future punishments, where the audit probability is inversely proportional to the winner's future winning probability. To accurately estimate such probabilities in presence of strategic agents, who may adversely react to any potential misestimate, we introduce a flagging component that allows agents to flag any biased estimate (we show that doing so aligns with individual incentives). On the technical side, without a unique and known distribution, one cannot apply the revelation principle and conclude that truthful reporting is exactly an equilibrium. Instead, we characterize the equilibrium via a reduction to a simpler auxiliary game, in which agents cannot strategize until close to the end of the game; we show equilibria in this game can induce equilibria in the actual, fully strategic game. The tools developed therein may be of independent interest for other mechanism design problems in which the revelation principle cannot be readily applied.
Authors' comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2025

1613. ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources

Jason Wu, Kang Yang, Lance Kaplan, Mani Srivastava

arXiv:2502.07862v1

1614. Column-wise Quantization of Weights and Partial Sums for Accurate and Efficient Compute-In-Memory Accelerators

Jiyoon Kim, Kang Eun Jeon, Yulhwa Kim, Jong Hwan Ko

arXiv:2502.07842v2

1615. Learning Accurate, Efficient, and Interpretable MLPs on Multiplex Graphs via Node-wise Multi-View Ensemble Distillation

Yunhui Liu, Zhen Tao, Xiang Zhao, Jianhua Zhao, Tao Zheng, Tieke He

arXiv:2502.05864v1

1616. Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models

Jialiang Wu, Yi Shen, Sijia Liu, Yi Tang, Sen Song, Xiaoyi Wang, Longjun Cai

arXiv:2502.03199v1

1617. Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection

Yassine El Kheir, Youness Samih, Suraj Maharjan, Tim Polzehl, Sebastian Möller

arXiv:2502.03559v2

1618. Detection and estimation of vertex-wise latent position shifts across networks

Runbing Zheng

arXiv:2502.01947v1

1619. On The Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning

Thomas T. Zhang, Behrad Moniri, Ansh Nagwekar, Faraz Rahman, Anton Xue, Hamed Hassani, Nikolai Matni

arXiv:2502.01763v1

1620. Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss

Sangyeon Park, Isaac Han, Seungwon Oh, Kyung-Joong Kim

Proceedings of the 42nd International Conference on Machine Learning (ICML), 2025

arXiv:2502.01342v2
