Commit dd5ad86

Squash commits
0 parents  commit dd5ad86

65 files changed, 9383 additions, 0 deletions

.gitignore

+18
@@ -0,0 +1,18 @@
.ipynb_checkpoints
__pycache__
*/__pycache__/
models.zip
/models/
/data/
/figs/

.DS_Store
src/slurm*.out
src/figs/
src/*.png
notebooks/figs/
notebooks/**/figs/
notebooks/**/*.png
src/datagen/test/figs/
src/conf/wandb.yaml
scripts/*.png

LICENSE

+21
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 HazyResearch

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

+63
@@ -0,0 +1,63 @@
# Towards Learning High-Precision Least Squares Algorithms with Sequence Models

![Prior work focuses on statistical least squares: Transformers approximate Bayes-optimal estimators (left, adapted from Garg et al. 2022). In this work, we focus on numerical least squares: Transformers struggle to obtain precise solutions (inset). Using a high-precision training recipe, we train two polynomial architectures to perform high-precision gradient descent iterates on least squares (right): applied iteratively, they reach $10^{-13}$ MSE.](assets/banner_fig.png)

This repository contains code for the following paper:

> **Towards Learning High-Precision Least Squares Algorithms with Sequence Models.**
>
> Jerry Liu, Jessica Grogan, Owen Dugan, Ashish Rao, Simran Arora, Atri Rudra, Chris Ré.
> ICLR 2025.

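For readers unfamiliar with the numerical setting named in the banner caption, the following is a minimal reference sketch of plain gradient descent on a small, noiseless least squares problem. It is not the repository's model or training code: the matrix `A`, the vector `x_true`, the step-size rule, and all function names are illustrative assumptions chosen so the example runs with the standard library only.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def gradient(A, b, x):
    """Gradient of 0.5 * ||Ax - b||^2 with respect to x, i.e. A^T (Ax - b)."""
    residual = [p - t for p, t in zip(matvec(A, x), b)]
    return [sum(A[i][j] * residual[i] for i in range(len(A)))
            for j in range(len(x))]


def gradient_descent(A, b, steps, lr):
    """Run `steps` gradient descent iterates starting from the zero vector."""
    x = [0.0] * len(A[0])
    for _ in range(steps):
        g = gradient(A, b, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def mse(A, b, x):
    """Mean squared residual of the least squares problem at x."""
    residual = [p - t for p, t in zip(matvec(A, x), b)]
    return sum(r * r for r in residual) / len(residual)


# Illustrative (assumed) problem data: 4 equations, 2 unknowns, no noise.
A = [[2.0, 1.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
x_true = [1.0, -2.0]
b = matvec(A, x_true)

# Safe step size: trace(A^T A) upper-bounds the largest eigenvalue of A^T A.
lr = 1.0 / sum(a * a for row in A for a in row)

x_hat = gradient_descent(A, b, steps=200, lr=lr)
final_mse = mse(A, b, x_hat)
print(x_hat, final_mse)
```

In double precision, this run ends well below the $10^{-13}$ MSE level mentioned in the caption, which gives a sense of the precision that an exact iterative solver reaches on a well-conditioned problem.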
## Dependencies
Install dependencies with
```
conda create -n "precision-ls" python=3.10
conda activate precision-ls
pip install -r requirements.txt
```

## Code structure
The code is organized as follows:
- `notebooks/`: contains notebooks for the experiments
- `replication_scripts/`: contains scripts for running the experiments
- `src/datagen/`: contains code for data generation (tasks and samplers)
- `src/models/`: contains code for the models, including different sequence mixers
- `src/schedulers/`: contains code for the learning rate schedulers, including the adaptive one used in the paper

## Running experiments
To test the data generation, run
```
bash src/datagen/test.sh
```

To run the least squares experiments, run
```
bash replication_scripts/least_squares.sh
```

To run the experiments and ablations with linear algebra primitives, run
```
bash replication_scripts/primitives.sh
```

To run the experiments and ablations with the explicit gradient and k-iterate gradient descent tasks, run
```
bash replication_scripts/explicit_gradient.sh
```

To run the experiments with in-context ODEs, run
```
bash replication_scripts/odes.sh
```

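The four replication commands above could also be driven from one place. The sketch below is a hypothetical convenience helper, not something shipped in the repository: the script names come from the README, while `build_commands`, `run_all`, and the `dry_run` flag are assumptions made for illustration. By default it only prints the commands.

```python
import subprocess
from pathlib import Path

# Script names are taken from the README; the helper itself is hypothetical.
REPLICATION_SCRIPTS = [
    "least_squares.sh",
    "primitives.sh",
    "explicit_gradient.sh",
    "odes.sh",
]


def build_commands(script_dir="replication_scripts"):
    """Return one `bash <script>` command per replication script."""
    return [["bash", (Path(script_dir) / name).as_posix()]
            for name in REPLICATION_SCRIPTS]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)


run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` from the repository root would run the scripts one after another, assuming the environment from the Dependencies section is active.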
## Citation
If you find this work useful, please cite it as follows:
```
@article{TODO,
title={TODO},
author={TODO},
journal={TODO},
year={TODO},
}
```

assets/banner_fig.png

602 KB