Hacker News

Something that I didn't know, and may help others understand this paper better, is that there's a way to define the sorting of a vector through creative use of the Birkhoff–von Neumann theorem:

https://en.wikipedia.org/wiki/Doubly_stochastic_matrix

which is better explained here:

https://cs.stackexchange.com/questions/4805/sorting-as-a-lin...

where the sorting operation is defined as a linear program. Evidently, this has been known for at least half a century. That said, if a solution to a linear program can be computed in a differentiable way, then the sorting operation becomes differentiable as well. This appears to be the trick in the paper, and the authors also seem to have a relatively fast way to compute the solution, which I think is interesting.
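To make the idea concrete, here is a minimal sketch (not the paper's algorithm) of sorting as a linear program, using NumPy and SciPy's `linprog`. The function name `sort_via_lp` and the choice of increasing position weights are my own illustrative assumptions:

```python
# Sketch: sorting as a linear program over the Birkhoff polytope.
# By the Birkhoff-von Neumann theorem, the doubly stochastic matrices
# are the convex hull of the permutation matrices, so an LP with a
# unique permutation optimum returns a hard sort.
import numpy as np
from scipy.optimize import linprog

def sort_via_lp(x):
    n = len(x)
    c = np.arange(n, dtype=float)          # increasing "position" weights
    # Maximize sum_ij c_i * x_j * P_ij  <=>  minimize its negation.
    obj = -np.outer(c, x).ravel()
    # Doubly stochastic constraints: every row and column sums to 1.
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # row i sums to 1
        A_eq[n + i, i::n] = 1.0            # column i sums to 1
    b_eq = np.ones(2 * n)
    res = linprog(obj, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    P = res.x.reshape(n, n)                # (near-)permutation matrix
    return P @ x                           # x sorted ascending

print(sort_via_lp(np.array([3.0, 1.0, 2.0, 0.5])))  # approx [0.5 1. 2. 3.]
```

Because the objective pairs larger values with larger position weights, the unique optimizer (for distinct inputs) is the permutation matrix that sorts `x` ascending.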



We did sorting-via-linear-programming as a simple exercise when I was studying maths in the mid 2000s. It was certainly not seen as a research grade problem.

In general, if memory serves right, linear programming can solve every problem that's in P with some suitable linear-time preprocessing. See https://en.wikipedia.org/wiki/P-complete for some background.

Interior-point methods for solving linear programs should then give you the bridge to continuous domains.


While the roots have been known for a long time, my impression is that the key paper that started this line of thought was Marco Cuturi's NIPS 2013 paper "Sinkhorn Distances", which is, IMHO, a very nice read.
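For flavor, here is a tiny sketch in the spirit of Sinkhorn-based soft sorting: entropy-regularized transport of the values onto evenly spaced anchors yields a doubly stochastic plan that depends smoothly on the input. The function name, anchor choice, and hyperparameters are illustrative assumptions, not details from the paper:

```python
# Sketch: entropy-regularized (Sinkhorn) soft sort. The Sinkhorn
# iterations produce a doubly stochastic transport plan P that is
# differentiable in x; as eps -> 0 it approaches a hard sort.
import numpy as np

def soft_sort(x, eps=0.05, iters=200):
    n = len(x)
    y = np.linspace(x.min(), x.max(), n)   # increasing anchor points
    C = (y[:, None] - x[None, :]) ** 2     # squared-distance cost
    K = np.exp(-C / eps)                   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(n)
    a = b = np.full(n, 1.0 / n)            # uniform marginals
    for _ in range(iters):                 # Sinkhorn fixed-point updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]        # doubly stochastic plan
    return n * (P @ x)                     # row i ~ i-th smallest value

print(soft_sort(np.array([3.0, 1.0, 2.0])))
```

With a small `eps` and well-separated values the output is close to `np.sort(x)`, but unlike a hard sort, gradients with respect to `x` flow through every step.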


Certainly I may be missing something, but it seems like the advance in this series of papers is that they figured out a way to calculate a differentiable solution to the sorting problem quickly, whereas it was already known that a differentiable solution existed, no?


Ugh, I tried to read the Stack Exchange answer, but it immediately devolved into gobbledygook. Incredibly frustrating.



