I think 8 months is a little short for the utility of a new technology to be fully realized. I'm pretty sure there were still horses on the roads well more than 8 months after the Model T first went on sale.
I've somewhat had this phenomenon, with the added observation that if I listen to the recording multiple times it will "click" and will sound good again. Then I wonder if it actually sounds good or what.
"Recurrent neural net" is the general term for nets with memory as you describe. Indeed, LSTMs, a type of recurrent net, used to be state of the art on language tasks until the GPT transformer models. I'm sure somebody somewhere is working to make a transformer with recurrency. The neural Turing machine mentioned in another comment is such an example but it seems to have been abandoned.
The main problem with recurrent models is that it's hard to train them with backprop. For example, GPT-3 can handle sequences of up to ~2000 tokens. I'm not sure what the largest sequence length LSTMs could be trained on, but it was probably less.
LSTMs typically forget anything more than a few hundred tokens back (vanishing gradients?), so while you could probably BPTT through 2000+ steps these days, there wouldn't be much point.
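To make the recurrence concrete, here's a toy sketch of a plain Elman-style RNN forward pass (not an LSTM; all sizes and weights are made-up illustrative values). The hidden state `h` is the "memory" being discussed, and BPTT would have to walk this whole loop in reverse, which is where gradients vanish or explode over long sequences:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, T = 4, 8, 2000           # input dim, hidden dim, sequence length

W_xh = rng.normal(scale=0.1, size=(d_h, d_in))  # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))   # hidden-to-hidden (the recurrence)

h = np.zeros(d_h)
for t in range(T):
    x_t = rng.normal(size=d_in)     # stand-in for the token embedding at step t
    # each step folds the new input into the carried-over hidden state;
    # backprop-through-time must traverse all T of these steps in reverse
    h = np.tanh(W_xh @ x_t + W_hh @ h)

print(h.shape)  # (8,)
```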
> I'm sure somebody somewhere is working to make a transformer with recurrency. The neural Turing machine mentioned in another comment is such an example but it seems to have been abandoned.
Yeah, there's a bunch of Transformer variants which either use recurrency, compression for long-range context, or efficient attention approximations with windows so large as to obviate recurrency. The NTM hasn't been shown to be useless so much as alternatives like Transformers have proven way easier to implement & scale up for similar performance, but it pops up occasionally; a particularly surprising recent appearance was Nvidia's GameGAN, which uses an NTM-like memory module to learn to model Pac-Man: https://nv-tlabs.github.io/gameGAN/
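For a flavor of the "efficient attention" trick, here's a toy sketch of windowed (local) self-attention: each position only attends to the last `w` positions, so the cost is O(T·w) rather than O(T²). The function name, sizes, and setup are all illustrative inventions of mine, not taken from any specific paper:

```python
import numpy as np

def local_attention(Q, K, V, w):
    """Causal attention restricted to a sliding window of width w."""
    T, d = Q.shape
    out = np.zeros_like(V)
    for t in range(T):
        lo = max(0, t - w + 1)                     # window start
        scores = Q[t] @ K[lo:t + 1].T / np.sqrt(d) # scaled dot-product scores
        weights = np.exp(scores - scores.max())    # numerically stable softmax
        weights /= weights.sum()
        out[t] = weights @ V[lo:t + 1]             # weighted mix of window values
    return out

rng = np.random.default_rng(0)
Q = rng.normal(size=(16, 4))
K = rng.normal(size=(16, 4))
V = rng.normal(size=(16, 4))
print(local_attention(Q, K, V, w=4).shape)  # (16, 4)
```

Real variants batch this with masking instead of a Python loop, but the window idea is the same.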
I recently read a paper that enables very long unrolls in RNNs thanks to O(1) memory requirements (in the number of unroll steps): https://arxiv.org/pdf/2005.11362.pdf
It's a gray area because "hate speech" isn't a valid concept. The First Amendment already carves out an exception for incitement to violence, but that's not what radicalized censorship extremists mean when they use the term. They specifically mean anything that is not canon to their current worldview.
Does anybody have a definition of "hate speech" other than "points of view that I disagree with"? And if your definition is going to be something like "encouraging harm against groups of people" can you then point me to a single quote from a "hate speech" site demonstrating this?
> evidence indicated that rationality and clear arguments weren't winning
On what grounds do you base your own sense of "rationality and clear argument" as superior to that of the apparent majority of people that you refer to as "the cesspool"?
The whole conversation is a bit abstract at this point, so maybe it wasn't clear: the "cesspool" I'm referring to are virulent racists.
If you believe either that (a) arguments against such people are less clear and rational than the racists' arguments, or that (b) such people are actually in the majority, then I believe we disagree on things so fundamental that we can't have a constructive debate about Reddit's policies.
Whoever is in power determines what is "ethical society" or what makes society "better". Censorship is a form of physically violent oppression and is never justifiable.
Society is nothing more than the aggregate form of its constituent parts, i.e. all of us. It’s not separate from us, it is us, and we all have an obligation to act to improve it. Censorship is a tool, just like free speech or liberty or law or justice or rhetoric or tradition, that we have to wield to those ethical ends.
Yes, parts of society absolutely have the right to censor other parts of society. Just as we have the right to imprison, or regulate, or chide, or squelch. None of these things are sacred and inviolable lines. All of them are tools, which we have the moral obligation to use, judiciously, to improve the human condition.
Free speech improves the human condition. Therefore, by your own argument, the authorities have the moral obligation to forcibly censor your own anti-free speech comments.
Free speech is an amoral tool, which can be used to improve or degrade the human condition, when wielded appropriately. It's our job to decide when and how to use it, and not act naïvely and pretend it's a virtue in itself.
Which means censorship vectors get to wrap things they want censored as "racist".
Soviet censorship vectors were able to wrap dissenting viewpoints as "counter-revolutionary" or "bourgeois" without regard for the consistency of content to the wrapper.
You want it to be, but in practice it's not. Enforcement of immigration policy is seen as racist by some and not by others, for example. So though _you_ claim to know (because obviously _your_ beliefs are correct), people disagree on this point.
Burning a book is physical violence. Banning a sub is physical violence. In censorship you are forcibly destroying a communication between two nonviolent third parties.
In "Algorithms" [Dasgupta, Papadimitriou, Vazirani] they state that memoization (e.g. querying a hash table) can have significant overhead, leading to a large constant factor in the big-O analysis. On the other hand, the bottom-up approach using a table solves all possible subproblems, including ones that are not needed and that the recursive approach would avoid. So from a big-O perspective the recursive top-down and bottom-up table approaches are the same, but there can be significant constant-factor differences.
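The trade-off is easy to see in a toy sketch (Fibonacci as a stand-in DP problem): both versions below run in O(n), but the top-down one pays for hashing and recursive-call overhead, while the bottom-up one fills a plain array, including every slot whether or not it's needed:

```python
from functools import lru_cache

# Top-down: recursion + memo table (lru_cache hashes arguments on every call).
@lru_cache(maxsize=None)
def fib_topdown(n):
    if n < 2:
        return n
    return fib_topdown(n - 1) + fib_topdown(n - 2)

# Bottom-up: fill a plain table in order; no recursion, no hashing,
# but every subproblem 0..n gets solved regardless.
def fib_bottomup(n):
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

print(fib_topdown(30), fib_bottomup(30))  # 832040 832040
```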
Then mention this in the interview. I ask an algorithms question that would likely be hated on HN, but when candidates tell me the practical differences (API cleanliness, maintainability, performance, extensibility) between approaches that ultimately don't impact asymptotic runtime, I love it.
Yeah, I thought recursion+memo wasn't actually "DP" until I looked it up. Recursion without the memo is not DP, however, since you hit exponential running time/memory, and the whole point of DP is to avoid that.
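To show just how badly the memo-less version blows up, here's a quick sketch (the global call counter is only there for demonstration): naive recursion re-solves the same overlapping subproblems exponentially many times, whereas a memoized version would solve each one exactly once.

```python
calls = 0

def fib_naive(n):
    """Plain recursion, no memo: O(phi^n) calls."""
    global calls
    calls += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

fib_naive(25)
print(calls)  # 242785 calls for n=25 -- exponential blowup
```

With a memo, the same n=25 would take only 26 subproblem solves.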