
Online List Labeling with Predictions: What's wrong with LearnedLLA

June 23, 2025

For a detailed explanation of the LearnedLLA, take a look at my previous blog post here.

LearnedLLA: Summarizing The Results

Here’s the gist:

  • With arbitrary predictions, their insertion algorithm achieves an $O(\log^2 \eta)$ amortized cost per insertion, where $\eta$ is the maximum prediction error (see the sketch after this list for how I read that quantity).
  • With perfect predictions (every predicted rank is spot on), the cost drops to $O(1)$.
  • With maximally bad predictions (as far off as possible), the cost falls back to $O(\log^2 n)$, the same as classic, non-predictive LLA.
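
To pin down what these bounds depend on, here is a minimal Python sketch (my own illustration, not code from the paper) of how I read the error measure: an element's error is the absolute difference between its predicted rank and its true rank among the final set of elements, and $\eta$ is the maximum such error.

```python
def prediction_errors(items):
    """Per-element prediction errors for a list of (value, predicted_rank) pairs.

    An element's error is |predicted rank - true rank|, where the true rank
    is its 1-based rank among all elements that end up in the list.
    """
    final_rank = {v: i + 1 for i, v in enumerate(sorted(v for v, _ in items))}
    return [abs(r_hat - final_rank[v]) for v, r_hat in items]


# Tiny demo with one slightly-off prediction.
errors = prediction_errors([(10, 1), (20, 3), (30, 3)])
eta = max(errors)                    # maximum error: drives the O(log^2 eta) bound
eta_bar = sum(errors) / len(errors)  # average error: discussed later in this post
print(errors, eta, eta_bar)          # [0, 1, 0] 1 0.333...
```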

They went a step further and proved a stronger result:

  • When prediction errors come from an unknown distribution with mean $\mu$ and variance $s^2$, the expected cost is $O(\log^2(|\mu| + s^2))$.
    In plain terms: $O(1)$ if the mean and variance are constant.

And more generally:

  • For any LLA insertion algorithm with an amortized cost of $C(n)$ (where $C$ is a reasonable, admissible list labeling cost function), plugging it into the LearnedLLA model as a black box yields an amortized insertion cost of $O(C(\eta))$. A quick check of this and the previous bound follows below.
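
Here is that check, spelled out with my own arithmetic (not quoted from the paper):

```latex
% Distribution result: constant mean and variance give a constant expected cost.
\mu = O(1), \quad s^2 = O(1)
  \;\Longrightarrow\;
  \mathbb{E}[\mathrm{cost}] = O\!\left(\log^2\bigl(|\mu| + s^2\bigr)\right) = O(1).

% Black-box result: plugging in the classic LLA, with C(n) = \log^2 n,
% recovers the O(\log^2 \eta) bound from the first list above.
C(n) = \log^2 n
  \;\Longrightarrow\;
  O\bigl(C(\eta)\bigr) = O\bigl(\log^2 \eta\bigr).
```

(If the newer $O(\log^{3/2} n)$ list labeling structure of Bender et al. satisfies the paper's admissibility requirements, the same reduction should give $O(\log^{3/2} \eta)$, though I haven't verified that.)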

Where did LearnedLLA fall short?

Even if only a few predictions are maximally bad while most are optimal, their algorithm still incurs $O(\log^2 n)$ amortized cost.

This appears somewhat counterintuitive, as one bad prediction should not cause the performance of LearnedLLA to degrade by a factor of $\log^2 n$.

A simple example:

Consider a LearnedLLA instance that spans ranks $1, \ldots, 4$. Now, consider the following insertion sequence of $(\text{value}, \text{predicted rank})$ tuples:

$$(1, 4) \rightarrow (2, 2) \rightarrow (3, 3) \rightarrow (4, 4)$$

In our example, the values 1, 2, 3 would all go in $P_4$, and then we would have to merge $P_3$ and $P_4$ when inserting value 4 ($P_4$ would exceed the density of $\frac{1}{2}$).

Let $\eta$ and $\bar{\eta}$ denote the maximum error and the arithmetic mean of the errors, respectively. In our simple example, the per-element errors are $|4 - 1| = 3$, $|2 - 2| = 0$, $|3 - 3| = 0$, and $|4 - 4| = 0$, so we have:

$\eta = 3$, $\bar{\eta} = \frac{3}{4}$.
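
A quick way to double-check those numbers (a small verification snippet of mine, not part of the paper's machinery):

```python
# Toy sequence from above: (value, predicted rank) pairs.
items = [(1, 4), (2, 2), (3, 3), (4, 4)]

# True rank of each value in the final sorted list.
final_rank = {v: i + 1 for i, v in enumerate(sorted(v for v, _ in items))}
errors = [abs(r_hat - final_rank[v]) for v, r_hat in items]

print(errors)                     # [3, 0, 0, 0]
print(max(errors))                # eta = 3
print(sum(errors) / len(errors))  # eta_bar = 0.75
```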

A More General Example

For any LearnedLLA instance that spans $n$ elements, consider an insertion sequence of tuples $\{(x_i, \hat{r}_{x_i})\}$ such that the smallest element has the largest predicted rank $n$, while the rest of the predictions are exact. In other words, one prediction is far off, making the maximal error $\eta$ equal to $n - 1$, and the rest of the predictions are perfect.

In that case, we have:

$$\eta = n - 1, \qquad \bar{\eta} = \frac{0 + \eta}{n} = \frac{n - 1}{n} < 1 = O(1).$$

Yet our LearnedLLA amortized insertion cost is $O(C(\eta)) = O(C(n - 1))$, which with the classic LLA as the black box is $O(\log^2 n)$, even though the average error is constant.
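
To see how lopsided this can get, here is the same error computation (again just an illustrative sketch of mine) on the one-bad-prediction sequence for, say, $n = 1024$:

```python
n = 1024

# Values 1..n: the smallest element gets predicted rank n,
# every other element gets its exact rank as the prediction.
items = [(1, n)] + [(v, v) for v in range(2, n + 1)]

final_rank = {v: i + 1 for i, v in enumerate(sorted(v for v, _ in items))}
errors = [abs(r_hat - final_rank[v]) for v, r_hat in items]

print(max(errors))                # eta = n - 1 = 1023, so the bound is O(C(n - 1))
print(sum(errors) / len(errors))  # eta_bar = (n - 1) / n ~= 0.999, essentially constant
```

The maximum error grows linearly in $n$, while the average error never even reaches 1.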

My Thoughts!

I’ve been struggling with this question for the past year. It only feels natural that we can do better. Maybe there’s a smarter algorithm out there — perhaps a randomized LLA algorithm or a new framework that relies on the average error instead.

References

  1. McCauley, S., Moseley, B., Niaparast, A., & Singh, S. (2023). Online List Labeling with Predictions. arXiv:2305.10536.

  2. Bender, M. A., Conway, A., Farach-Colton, M., Komlós, H., Koucký, M., Kuszmaul, W., & Saks, M. (2024). Nearly Optimal List Labeling. arXiv:2405.00807. https://doi.org/10.48550/arXiv.2405.00807

  3. Bender, M. A., Demaine, E. D., & Farach-Colton, M. (2000). Cache-oblivious B-trees. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS) (pp. 399–409). IEEE.