> As a non Rust man, how real are the problems in this article?
Real, but of more concern to folks designing widely-used libraries than to folks using said libraries.
> Anyone can give me a good read what Traits even are?
You can think of traits as analogous to interfaces in OOP languages (i.e. pure virtual abstract classes in C++ terminology).
They just define a set of methods that types can implement to conform to the trait, and then consumers can treat implementing types as if they were the trait.
The major differences are: traits are implemented outside the actual type implementation, so arbitrary trait implementations can be added after the type has been written (this is why we need coherence), and Rust uses traits as compile-time bounds for generics (templates).
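To make that concrete, here is a minimal sketch. The `Describe` trait, the `Paper` type and both helper functions are made up for illustration; none of them come from the article or from any real library. It shows a trait implemented for a standard-library type long after that type was written, a trait used as a compile-time bound on a generic, and a trait used like an interface through dynamic dispatch.

```rust
// Hypothetical trait, for illustration only.
trait Describe {
    fn describe(&self) -> String;
}

// A type we define ourselves.
struct Paper {
    title: String,
}

// The impl lives outside the type definition.
impl Describe for Paper {
    fn describe(&self) -> String {
        format!("paper: {}", self.title)
    }
}

// A standard-library type, written long before our trait existed.
// This is allowed because the trait is defined in this crate.
impl Describe for i32 {
    fn describe(&self) -> String {
        format!("integer: {}", self)
    }
}

// Trait as a compile-time bound on a generic, resolved per concrete type.
fn describe_all<T: Describe>(items: &[T]) -> Vec<String> {
    items.iter().map(|item| item.describe()).collect()
}

// Trait as an interface: the caller treats any implementor as "a Describe".
fn describe_dyn(item: &dyn Describe) -> String {
    item.describe()
}

fn main() {
    let papers = [Paper { title: String::from("On Coherence") }];
    println!("{:?}", describe_all(&papers));
    println!("{}", describe_dyn(&42_i32));
}
```

Coherence is what stops two different crates from both providing `impl Describe for i32`: roughly, either the trait or the type has to be local to the crate writing the impl.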
> Wikipedia only exists because they refuse to sell out
Technically, they already did a long time ago. Jimmy Wales spun up a for-profit arm using the Wikipedia tech stack, and it's now everyone's least favourite ad-ridden pop culture wiki[1]...
Don’t worry, it’ll come back in a couple of months. Not sure if it has a timeout, or if it gets reset by app updates, but that checkbox is only sticky enough to gaslight you into thinking it works.
I’m often annoyed at the 10 second timeout when installing Firefox extensions - 24 hours is beyond egregious. Telling me to come back tomorrow to install software on a device I own is a giant “fuck you”. Pretty sure I’d rather they banned side loading outright than this
The "essentially static hosting" isn't the cost centre (although with 5 million MAU, it's nothing to sneeze at). The real costs are on the input side - they have an ingestion pipeline that ensures standardised paper formatting and so on, plus at least some degree of human review.
No, I mean that the pipeline requires software engineers to build/maintain, and salaries are (as in basically every tech organisation) the dominant cost
Make it an external service then, and leave the thing that's already working great to just be.
The reason authors like and use arxiv is that it gives 1) a timestamp, 2) a standardized citable ID, and 3) stable hosting of the pdf. And readers like the no-nonsense single click download of the pdf and a barebones consistent website look.
You have to keep in mind that an increasing portion of their time and labor is going towards moderation and filtering due to a mass influx of nonsensical AI generated papers, non-academic numerology-tier hackery, and other useless drivel.
Spinning the service off pushes the labor out onto other universities rather than leaving it solely to Cornell.
Is the problem the storage cost for hosting them, the HDDs? I'm sure they can be offloaded to cold storage because most of that slop won't be opened by anyone.
Arxiv doesn't need moderation. Nobody is asking for Arxiv moderation. It needs minimal checks to remove overtly illegal content.
When you stop moderating input, that's when someone builds a FUSE filesystem on top of it. We had those for discord (dsfs), twitterfs, redditfs, yt-media-storage, etc. It's also when someone starts using it to distribute malware, like websites built on a combination of GitHub and a CDN.
We are talking about a different kind of moderation. People want to filter out incorrect information that in their opinion damages the reputation of Arxiv, e.g. covid stuff. It's not about dumping binary data.
This is a motte-and-bailey fallacy. The real question is about moderation with the goal of checking truth and the scientific content. Obviously illegal content and DDoS-type overloading attacks need to be blocked.
Very different philosophies are clashing here. Arxiv came about in an age of different zeitgeist. We may never get back to that moment.
> Is the problem the storage cost for hosting them, the HDDs?
No. Around half the cost is infrastructure. The other half of the cost is people. i.e. engineers to maintain infra and build mod tools for moderators to operate.
> Arxiv doesn't need moderation. Nobody is asking for Arxiv moderation.
This is just not true. Tons of people ask for arxiv to have moderation, especially since covid, when antivaxxers and alternative medicine peddlers started trying to pump the medical categories of arxiv with quack science preprints, then used the arxiv preprint and its DOI to take advantage of non-academics who don't really understand what arxiv is, other than that it looks vaguely like a journal.
And doubly so now that people keep submitting AI generated slop papers to the service, trying to flood the different categories so they can pad their resumes or CVs. On top of that, people who don't actually understand the fields they are trying to write papers in are using AI to generate "innovative papers" that are completely nonsensical but vaguely parrot the terms of art.
The only reason you don't see more people calling for arxiv moderation is because they already spend so much time on it. If they were to stop moderating the site it would overflow into an absolute nightmare of garbage near overnight. And people wouldn't be upset with the users uploading this of course, they'd be upset with arxiv for failing to take action.
Moderation is inherently unappreciated because in the ideal form it should be effectively invisible (which arxiv's mostly is).
If you want to see the type of stuff that arxiv keeps out, go over to viXra [1], or you can watch k-theory's video [2] having fun digging through some of the quality posts that live over on that site.
The PDF formatting is anything but standardised. They ingest LaTeX sources, which are formatted according to the authors' whims (most likely, according to whatever journal or conference they just submitted the manuscript to).
I'll concede that the (relatively novel) HTML formatter gives papers a more uniform appearance. They also integrate a bunch of external services for, e.g., citation metrics and cross-references. Still hard to justify such a high cost to operate, but eh.
Also, the "human review" is a simple moderation process [1]. It usually does not dig into the submission's scientific merits.
> raised concerns about the proposed $300,000 salary for arXiv’s new CEO, saying it seemed high
Is a mid-to-high engineering salary outlandish for a CEO of what is likely to be a fairly major non-profit? Even non-profits have to be somewhat competitive when it comes to salary, and the ideal candidate is likely someone who would be balancing this against a tenured position at a major university
Even in the states, it’s more a distortion caused by the big tech centres. A software engineer in Ohio doesn’t command that kind of salary, but in San Francisco or Seattle that’ll buy you a moderately-senior engineer.
And while academic salaries are generally not great, tenured professors at big universities tend to make a fair bit (plus a lot more vacation time and perks than is normal in the US)
> A software engineer in Ohio doesn’t command that kind of salary, but in San Francisco or Seattle that’ll buy you a moderately-senior engineer.
On the other hand, a CEO of a well-known nonprofit might command that kind of salary in Ohio. People often underestimate how much the leaders of nonprofits pay themselves.
I'm not entirely convinced that this is entirely some sort of widespread bad behavior. Many non-profit boards conduct research on salaries and essentially size their organization and pay something akin to a market rate for the given size and scope.
However, even a small percentage of bad actors finding a way to inflate their salaries will, as a side effect, inflate salaries across the board because it influences the process that sets the salaries for the honest organizations.
I suspect abuse is more prevalent at the low end, among nonprofits that don’t do much.
I stand by the point of my original post: People often underestimate how much the leaders of nonprofits pay themselves. These are figures you can look up and quiz your friends to test the hypothesis, if they’re into that sort of thing. For a good time include some nonprofit hospitals.
That's fair, but the boards of nonprofits are as corruptible (I'm reluctant to use that word since we're talking about fairly standard practices, not outright crime, but whatever) as those in the corporate world. But I wouldn't want to keep talking about this situation as if it's all theoretical. In contrast with a lot of the corporate world, with nonprofits you can just go and look at what their officers are paid (it's public record) and decide for yourself what you feel about the figures.
Sure, but the cost of living there is significantly higher as well. Anyway, I can hardly even comprehend these kinds of sums, though I am a bit of an outlier, as I earn around $27,700 as an SWE in Europe, which is low even by the standards of companies in my own country.
> Sure, but the cost of living there is significantly higher as well.
The US is huge though, and the cost of living is astronomically lower outside of those big tech hub cities. I live in a tiny town in the midwest with a big house and a big yard that we bought for $89k USD in 2016[†]. I'm able to support myself and my wife comfortably on just my (self-employed) SWE salary.
[†] Real estate inflation index for our area says the house would have cost us around $130-$150k USD in 2026.
Silicon Valley is the only place in the United States where $300K is even close to the "middle" of anything.
I just moved to SV a few months ago from the Midwest (and not a particularly cheap part of it). Telling my coworkers who aren't from the US what a house costs in Wisconsin, you'd have thought I was the one who moved from a foreign country.
As a datapoint, I get paid just under 250k/yr and I'm an above average developer in his very late career, at a midwest company. 300k avg for SV is about right.
The local college and medical administrators are the ones that own the mansions in my city. I have a family, house and mortgage, plus my large medical expenses (cardiac), which I can handle... until I can't.
Holy moly, $250k in the midwest? Where do I get your job?
For reference, I just left a position in the Midwest for a job in SV that pays a little more than you're getting paid. $250k but with Midwestern rent would be life-changing. Sounds like we're in very different stages of our careers, though.
> Silicon Valley is the only place in the United States where $300K is even close to the "middle" of anything.
It does heavily cluster around SV, for sure, but Seattle/New York/Boston/Arlington will all get you there, and Chicago/Austin/etc. aren't all that far behind at this point.
Not everywhere. Switzerland exists. Also cost of living is a thing so if anything US/CH just ramp up to match that. The rest of Europe has high CoL but terrible salaries. Asia has bad salaries but low CoL (on average).
According to swissdevjobs.ch[1], the top 10% salary for a senior software developer in Switzerland is 135,000 Swiss francs; that's roughly $170,000 per year.
So if this is correct, then even in Switzerland, it seems like $300,000 per year would be an obscenely high salary for a senior developer.
Oh right, well it depends on CoL doesn't it? You can reframe European salaries as 'obscene' by world standards too. Both the US and Europe have totally broken and unaffordable housing markets, for example, but at least the Bay Area compensates with salary. I would say that relative to costs it's more that other salaries are obscenely low, if anything. People in Europe should be rioting, but unfortunately only the home owners are politically active.
Do cities like San Francisco not have janitors? Waiters? Food delivery drivers? Or do those jobs command a six-figure salary too? If they can live comfortably in the city on a five-figure salary, maybe the argument that "cost of living is so high in SF that you can't live without a $300,000/year salary" is just a little bit overblown?
I cannot imagine what one could possibly need $300,000 per year for unless an apartment costs like $200,000 per year.
> Do cities like San Francisco not have janitors? Waiters?
When I used to visit the Meta campus in Menlo Park, the QA folk I worked with were commuting 2 hours each way just to be able to afford housing. I've no idea how far away the janitorial staff must have lived to do the same
> I cannot imagine what one could possibly need $300,000 per year for unless an apartment costs like $200,000 per year.
Being able to afford unpredictable expenses and not have it bankrupt you. In the US, that would include healthcare. Everywhere in the world, that would be useful if you were laid off.
To build an emergency fund, you just need an income that's a bit higher than your expenses. If you earn $60,000 after tax per year, and spend $50,000 per year, you have a decent $10,000 emergency fund after one year and a massive $100,000 emergency fund after a decade. You don't need $300,000 per year to save.
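As a quick check of that arithmetic, here is a minimal sketch. The `savings_after` helper is hypothetical, and it assumes income and spending stay flat, with no interest, inflation or raises.

```rust
// Hypothetical helper: cumulative savings after a number of years,
// assuming flat income and spending and no interest or inflation.
fn savings_after(years: u32, income_after_tax: u64, spending: u64) -> u64 {
    // saturating_sub keeps the result at zero if spending exceeds income.
    income_after_tax.saturating_sub(spending) * u64::from(years)
}

fn main() {
    let (income, spending) = (60_000, 50_000);
    for years in [1, 10] {
        println!(
            "after {} year(s): ${}",
            years,
            savings_after(years, income, spending)
        );
    }
}
```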
You get by on a low salary by living with multiple people in the same apartment. Or you live far away and commute. Or both.
Not really a tenable long-term situation for a senior employee with plans to start a family. Family homes of decent size and area are literally millions of dollars.
> The cost of living is so high in part because so many have ridiculously high salaries
Bigger problem in the SF area is that a bunch of folks who owned property before the gold rush have ended up real-estate-rich, and formed a voting bloc that actively prevents the construction of new housing (on the basis that it might devalue their accidental real estate investment).
It's not about deserving, programmers just have enough market power to be able to choose to go elsewhere. Janitors and other more fungible employees do not.
Besides, I did already say that everyone else was underpaid relative to costs. But that's not unique to the Bay Area. Cost of housing relative to income is terrible in almost all of the major European cities too.
Once cities become wealthy enough to develop a home owning class, they seem to cease being able to provision adequate housing supply in general.
I worked at Redwood Shores. On a walk across the 101, I discovered where the cleaning staff and food workers lived. In cars, under the bridge or parked in a quiet corner of the street next to industrial or commercial property.
To some extent, maybe, but often not. For example, London has similar cost of living to the Bay Area, and when I was at Meta experienced folks like Dan Abramov over in London were making about the same as fresh college hires in Menlo Park...
Yeah I was talking more about the definition of obscene. Like is it obscene to make 300k if housing is so expensive? I say no, and that London salaries are just bad. Although it would be preferable to fix the housing market.
To be fair though, Dan specifically is kind of notorious for messing up his comp negotiation. Did you not see the Twitter pile on at the time?
The net salary in France might be low but the overall cost of hiring is quite high. Besides, why go to the middle when you can just find even cheaper places, if that's your prime metric?
The reason the French can’t build these things is the same reason they shouldn’t be allowed to be in charge. It’s a preprint PDF host. Just make your own if you can run this one.
HAL is decidedly second-tier. Given the option, everyone would pick arXiv over HAL. Hence, HAL hosts lots of stuff that didn't (even) make it to arXiv => lots of subpar dreck.
Turns out that "better" for many people means "better moderated", since static hosting is hard to differentiate. And at present Arxiv is winning that one (at the expense of considerably higher running costs due to said moderation)
The traditional definition of high income starts at 2x the median. Looking at the US as a whole, anything above $125k should be considered high income. But it doesn't feel like that, because median wages are unusually low in the US relative to mean wages. Upper middle class salaries, on the other hand, have grown very high, and they have distorted people's perceptions. Even now, we are debating whether almost 5x the median should be considered high income.
Considering the value and prominence of arxiv to the world, this seems low to me. Although more importantly the rest of the staff needs to be well paid too, and if that's the ceiling it's a bit concerning. It's crazy to me that people thought this was too high.
arXiv doesn't need much. All they do is host static pdfs uploaded by someone else with free CDN services from Fastly [0]. I'm sure they could get academics to volunteer moderation services as well.
In reality you could host the entire thing for well under $50k/year in hardware and storage if someone else is providing a free CDN. Their costs could be incredibly low.
But just like Wikipedia, I see them very likely very quickly becoming a money hole that pretends to barely be kept afloat by donations, when in reality what's actually happening is that a ridiculous number of rent seekers managed to ride the coattails of being the de facto preprint server for AI papers to land themselves cushy jobs at a place that spends 90+% of its money on flights, hotels and wages for its staff.
I'm already expecting their financial reports to look ridiculously headcount-heavy, with Personnel Expenses, Meetings and Travel blowing up. As well as the classic Wikipedia-style "we spend a ton of money on unclear costs" [1].
What's already sad is that they stopped publishing a real broken-down report that actually showed things. Like, look at this beautiful screenshot of an Excel sheet. Imagine if Wikipedia produced anything this clear. [2]
> arXiv doesn't need much. All they do is host static pdfs uploaded by someone else with free CDN services from Fastly [0]. I'm sure they could get academics to volunteer moderation services as well.
This just isn't true. arXiv nowadays has to deal with major moderation demands due to the influx of absolute drivel, spam, and slop that non-academics and less-than-quality academics have been uploading to the site.
Moderation for arXiv isn't perfect or comprehensive but they put so much work into trying to keep the worst of the content off their site. At this point while they aren't doing full blown peer review, they are putting a lot of work into providing first pass moderation that ensures the content in their academic categories is of at least some level of respectable academic quality.
Volunteer moderators are a valid option; however, this is also the way peer review works, and that system is unfortunately very problematic and exploitative.
First pass sanity checks are also a lot less fun than proper peer review so paying moderators to do it is probably safer in the long run or else you end up with cliques of moderators who only keep moderating out of spite/personal vendettas against certain groups or fields.
> In reality you could host the entire thing for well under $50k/year in hardware
I could pay Anthropic $400 to write more code than you have in your entire lifetime.
Sure, you're able to operate a website acting as essentially the most important and highest volume venue for sharing academic research in the world, but come on, why couldn't I just ask Claude Code or some web developer in a foreign country to do the same thing?
$300k for a top executive position isn't especially high for anywhere in the US. That's around what the administrative director of a hospital would be making, which seems like a much smaller scope than leading arXiv. For comparison, my roommate works for a non-profit that serves Philadelphia whose CEO's salary is $1.1 million. The CEO of the Wikimedia Foundation, which is similar in terms of role, has a salary of $450k. The general average for US CEOs, including for-profits, is around $800k, and for large organizations tens of millions is not atypical.
Non-profits aren't maximizing stock value, but they do need to optimize for stakeholder value - you want to maximize the amount of money being donated in and you want to make the most of the donations you receive, both to advance the primary mission of the non-profit and to instill confidence in donors. This demands competent leadership. The idea that just because something is not being done for profit means the value of the person's contributions is worth less is absurd. So long as the CEO provides more than $300k of value by leading the organization, which might include access to their personal connections, then the salary is sensible.
When I was involved it was an x86 machine in a rack in Rhodes Hall.
I had a copy of the whole thing under my desk though in Olin Library on a Pentium 3 machine from IBM that was built like a piece of military hardware. In April the sun would shine in the windows of my office, the HVAC system was unable to cool my office, and temperatures would soar above 100F and I'd be sitting there in a tank top and drinking a lot of water and sports drinks and visitors would ask me how I could stand it.
The S3 API/UX/cost model is so seductively simple for static hosting though. I kind of think they deserve their ubiquity. Not on 90% of their products though.
I could even make those cards tradeable like NFTs, use DynamoDB as the ledger, and not worry about the cost at all.
On the other hand, if you are talking about something bandwidth-heavy, forget about AWS. Video hosting with CloudFront doesn't seem that difficult; even developing a YouTube clone where anybody could upload a video and it gets hosted seems like a moderate-sized project. But with the bandwidth meter always running, that kind of system could put you into the poorhouse pretty quickly if it caught on. Much of why YouTube doesn't have competition is exactly that: Google's costs are very low and they have an established system of monetization.
I am keeping my photo albums on Behance rather than self-hosting because I lost enough money on a big photo site in AWS that it drove my wife furious and it took me a few years to pay off the debt.
> the one child policy has backed them into a corner
A policy that ended a decade ago, and was only ever marginally successful (even at the height of the restrictions their birth rate was nearer 1.4 than 1.0)
The one child policy was only for cities anyway. Agricultural areas were permitted, even encouraged, to have more children. There were other exceptions, like twins (obviously), if the first baby was disabled, etc. Later on, couples were allowed two children if both parents came from single-child families.
This feels like absolute best-case scenario for an open-source clone interacting with a rights holder looking to re-release the original. Really glad to see their willingness to work together, instead of just torpedoing the open-source project