
Online List Labeling with Predictions: What's wrong with LearnedLLA

June 23, 2025

For a detailed explanation of the LearnedLLA, take a look at my previous blog post here.

LearnedLLA: Summarizing The Results

Here’s the gist:

  • With arbitrary predictions, their insertion algorithm achieves an $O(\log^2 \eta)$ amortized cost per insertion, where $\eta$ is the maximum prediction error (see the sketch after this list for how I read that quantity).
  • With perfect predictions (every predicted rank is spot on), the cost drops to $O(1)$.
  • With maximally bad predictions (as far off as possible), the cost falls back to $O(\log^2 n)$, the same as classic, non-predictive LLA.
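
To pin down what these bounds depend on, here is a minimal Python sketch (my own illustration, not code from the paper) of how I read the error measure: an element's error is the absolute difference between its predicted rank and its true rank among the final set of elements, and $\eta$ is the maximum such error.

```python
def prediction_errors(items):
    """Per-element prediction errors for a list of (value, predicted_rank) pairs.

    An element's error is |predicted rank - true rank|, where the true rank
    is its 1-based rank among all elements that end up in the list.
    """
    final_rank = {v: i + 1 for i, v in enumerate(sorted(v for v, _ in items))}
    return [abs(r_hat - final_rank[v]) for v, r_hat in items]


# Tiny demo with one slightly-off prediction.
errors = prediction_errors([(10, 1), (20, 3), (30, 3)])
eta = max(errors)                    # maximum error: drives the O(log^2 eta) bound
eta_bar = sum(errors) / len(errors)  # average error: discussed later in this post
print(errors, eta, eta_bar)          # [0, 1, 0] 1 0.333...
```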

They went a step further and proved a stronger result:

  • When prediction errors come from an unknown distribution with mean $\mu$ and variance $s^2$, the expected cost is $O(\log^2(|\mu| + s^2))$.
    In plain terms: $O(1)$ if the mean and variance are constant.

And more generally:

  • For any LLA insertion algorithm with an amortized cost of $C(n)$ (where $C$ is a reasonable, admissible list labeling cost function), plugging it into the LearnedLLA model as a black box yields an amortized insertion cost of $O(C(\eta))$. A quick check of this and the previous bound follows below.
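
Here is that check, spelled out with my own arithmetic (not quoted from the paper):

```latex
% Distribution result: constant mean and variance give a constant expected cost.
\mu = O(1), \quad s^2 = O(1)
  \;\Longrightarrow\;
  \mathbb{E}[\mathrm{cost}] = O\!\left(\log^2\bigl(|\mu| + s^2\bigr)\right) = O(1).

% Black-box result: plugging in the classic LLA, with C(n) = \log^2 n,
% recovers the O(\log^2 \eta) bound from the first list above.
C(n) = \log^2 n
  \;\Longrightarrow\;
  O\bigl(C(\eta)\bigr) = O\bigl(\log^2 \eta\bigr).
```

(If the newer $O(\log^{3/2} n)$ list labeling structure of Bender et al. satisfies the paper's admissibility requirements, the same reduction should give $O(\log^{3/2} \eta)$, though I haven't verified that.)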

Where did LearnedLLA fall short?

Even if only a few predictions are maximally bad while most are optimal, their algorithm still incurs $O(\log^2 n)$ amortized cost.

This appears somewhat counterintuitive, as one bad prediction should not cause the performance of LearnedLLA to degrade by a factor of $\log^2 n$.

A simple example:

Consider a LearnedLLA instance that spans ranks $1, \ldots, 4$. Now, consider the following insertion sequence of $(\text{value}, \text{predicted rank})$ tuples:

$$(1, 4) \rightarrow (2, 2) \rightarrow (3, 3) \rightarrow (4, 4)$$

In our example, the values 1, 2, 3 would all go in $P_4$, and then we would have to merge $P_3$ and $P_4$ when inserting value 4 ($P_4$ would exceed the density of $\frac{1}{2}$).

Let $\eta$ and $\bar{\eta}$ denote the maximum error and the arithmetic mean of the errors, respectively. In our simple example, the per-element errors are $|4 - 1| = 3$, $|2 - 2| = 0$, $|3 - 3| = 0$, and $|4 - 4| = 0$, so we have:

$\eta = 3$, $\bar{\eta} = \frac{3}{4}$.
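
A quick way to double-check those numbers (a small verification snippet of mine, not part of the paper's machinery):

```python
# Toy sequence from above: (value, predicted rank) pairs.
items = [(1, 4), (2, 2), (3, 3), (4, 4)]

# True rank of each value in the final sorted list.
final_rank = {v: i + 1 for i, v in enumerate(sorted(v for v, _ in items))}
errors = [abs(r_hat - final_rank[v]) for v, r_hat in items]

print(errors)                     # [3, 0, 0, 0]
print(max(errors))                # eta = 3
print(sum(errors) / len(errors))  # eta_bar = 0.75
```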

A More General Example

For any LearnedLLA instance that spans $n$ elements, consider an insertion sequence of tuples $\{(x_i, \hat{r}_{x_i})\}$ such that the smallest element has the largest predicted rank $n$, while the rest of the predictions are exact. In other words, one prediction is far off, making the maximal error $\eta$ equal to $n - 1$, and the rest of the predictions are perfect.

In that case, we have:

$$\eta = n - 1, \qquad \bar{\eta} = \frac{0 + \eta}{n} = \frac{n - 1}{n} < 1 = O(1).$$

Yet our LearnedLLA amortized insertion cost is $O(C(\eta)) = O(C(n - 1))$, which with the classic LLA as the black box is $O(\log^2 n)$, even though the average error is constant.
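
To see how lopsided this can get, here is the same error computation (again just an illustrative sketch of mine) on the one-bad-prediction sequence for, say, $n = 1024$:

```python
n = 1024

# Values 1..n: the smallest element gets predicted rank n,
# every other element gets its exact rank as the prediction.
items = [(1, n)] + [(v, v) for v in range(2, n + 1)]

final_rank = {v: i + 1 for i, v in enumerate(sorted(v for v, _ in items))}
errors = [abs(r_hat - final_rank[v]) for v, r_hat in items]

print(max(errors))                # eta = n - 1 = 1023, so the bound is O(C(n - 1))
print(sum(errors) / len(errors))  # eta_bar = (n - 1) / n ~= 0.999, essentially constant
```

The maximum error grows linearly in $n$, while the average error never even reaches 1.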

My Thoughts!

I’ve been struggling with this question for the past year. It only feels natural that we can do better. Maybe there’s a smarter algorithm out there — perhaps a randomized LLA algorithm or a new framework that relies on the average error instead.

References

  1. McCauley, S., Moseley, B., Niaparast, A., & Singh, S. (2023). Online List Labeling with Predictions. arXiv:2305.10536.

  2. Bender, M. A., Conway, A., Farach-Colton, M., Komlós, H., Koucký, M., Kuszmaul, W., & Saks, M. (2024). Nearly Optimal List Labeling. arXiv:2405.00807. https://doi.org/10.48550/arXiv.2405.00807

  3. Bender, M. A., Demaine, E. D., & Farach-Colton, M. (2000). Cache-oblivious B-trees. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS) (pp. 399–409). IEEE.