Publications

Matryoshka: Learning to Drive Black-Box LLMs with LLMs

Published in 2024

The paper proposes a lightweight white-box LLM controller that guides a large-scale black-box LLM generator by decomposing complex tasks. It enhances the capabilities of black-box LLMs on reasoning, planning, and personalization tasks.
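
Below is a minimal sketch of the controller-generator loop this describes, assuming a hypothetical `generate(prompt) -> str` interface for both models; the function names and prompts are illustrative, not the paper's actual code:

```python
from typing import Protocol


class LLM(Protocol):
    """Hypothetical interface: any model exposing generate(prompt) -> str."""
    def generate(self, prompt: str) -> str: ...


def decompose(controller: LLM, task: str) -> list[str]:
    """The lightweight white-box controller proposes an ordered subtask list."""
    plan = controller.generate(f"Decompose into ordered subtasks:\n{task}")
    return [line.strip() for line in plan.splitlines() if line.strip()]


def solve(controller: LLM, generator: LLM, task: str) -> str:
    """Route each controller-proposed subtask to the black-box generator,
    accumulating intermediate answers as context for the next step."""
    context = task
    for subtask in decompose(controller, task):
        answer = generator.generate(f"{context}\n\nSubtask: {subtask}\nAnswer:")
        context += f"\n{subtask} -> {answer}"
    return generator.generate(f"{context}\n\nFinal answer:")
```

The key design point the sketch captures is that the controller only steers through decomposition: the black-box generator's weights and internals are never accessed.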

Download here

Training Transformers with 4-bit Integers

Published in NeurIPS, 2023

The paper proposes a method for training Transformer models with 4-bit integer arithmetic. It can be implemented on the current generation of GPUs and is faster than its FP16 counterpart.
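
Below is a minimal NumPy sketch of the quantize-multiply-rescale pattern underlying a 4-bit integer matrix multiplication. It shows plain symmetric quantization only, not the paper's full method (which adds techniques for handling activation outliers and gradient structure); the function names are assumptions:

```python
import numpy as np


def quantize_int4(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization to 4-bit integers in [-8, 7]."""
    scale = float(np.max(np.abs(x))) / 7.0 + 1e-12
    # int8 is used as a carrier type; values stay within the 4-bit range.
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale


def int4_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Quantize both operands to 4 bits, multiply with integer
    accumulation, then rescale the result back to floating point."""
    qa, sa = quantize_int4(a)
    qb, sb = quantize_int4(b)
    acc = qa.astype(np.int32) @ qb.astype(np.int32)  # integer accumulation
    return acc.astype(np.float32) * (sa * sb)


# Usage: the result approximates the FP32 product.
a = np.random.randn(64, 128).astype(np.float32)
b = np.random.randn(128, 32).astype(np.float32)
print(np.abs(int4_matmul(a, b) - a @ b).mean())
```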

Download here