Hacker News | rmelhem's comments

nice one. are you using gpt3 under the hood?


I'm not that smart - my site is basically just doing some calculations on word frequencies. You can read https://academic.oup.com/dsh/article-abstract/17/3/267/92927... and https://www.tandfonline.com/doi/abs/10.1080/09296174.2011.53... and https://news.ycombinator.com/item?id=33755898 for more information.
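For readers wondering what "calculations on word frequencies" can look like in practice, here is a minimal sketch. It is not the site's actual code, and the linked papers may use a different measure. It assumes a simple regex tokenizer and cosine similarity over raw word counts; the function names `word_frequencies` and `cosine_similarity` are made up for illustration.

```python
import math
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count lowercase word tokens; punctuation is ignored (an assumption)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    # Only words present in `a` can contribute to the dot product;
    # Counter returns 0 for words missing from `b`.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    first = word_frequencies("I think the simplest thing usually works")
    second = word_frequencies("I think the simplest approach usually wins")
    print(cosine_similarity(first, second))
```

Two comments with no words in common score 0.0, and a comment compared with itself scores 1.0 up to floating-point error.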


As you mention on the site, you don't do punctuation. But I'm guessing there are some pretty good fingerprints like:

two spaces after a period

Whether someone uses an em-dash/single hyphen/double hyphens (which may correspond to house style they're used to)

Whether they use semi-colons

(Presumably harder) but consistent substitutions like loose for lose, break for brake, etc.

Use of accents
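Most of the fingerprints listed above can be counted with a few lines of code. The sketch below is only an illustration: the feature names and the helper `punctuation_fingerprint` are hypothetical, and the word substitutions (loose/lose) are left out because they need context to detect.

```python
import re
import unicodedata


def punctuation_fingerprint(text: str) -> dict:
    """Count a handful of punctuation habits in a piece of text."""
    # NFD splits an accented letter into base letter + combining mark,
    # so counting combining marks counts accented letters.
    decomposed = unicodedata.normalize("NFD", text)
    return {
        "double_space_after_period": len(re.findall(r"\.  +\S", text)),
        "em_dash": text.count("\u2014"),
        "double_hyphen": text.count("--"),
        "spaced_single_hyphen": len(re.findall(r"\s-\s", text)),
        "semicolon": text.count(";"),
        "accented_letters": sum(1 for ch in decomposed if unicodedata.combining(ch)),
    }


if __name__ == "__main__":
    sample = "I agree.  But wait -- the caf\u00e9 is closed; try later."
    for feature, count in punctuation_fingerprint(sample).items():
        print(feature, count)
```

Raw counts grow with text length, so comparing authors would likely need the counts divided by the number of characters or sentences.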


I manually determined there was an individual posing as two people (playing both the antagonist and the adversary) because they consistently misspelt certain words such as "definitely" as "defiantly".

Fingerprinting certain linguistic traits, mapping them to time zones, and confirming that the two accounts' posting times overlapped partially but never exactly worked exceedingly well. Someone can't easily maintain a fluent conversation with themselves across two accounts, but they can get close, either through unnatural delays between sentences or by never interacting with the "other" party at the same time.


Simplicity is the greatest form of sophistication! Great work!

One small nit from a user experience point of view: it'd be easier on the eyes if you just truncated those cosine similarity scores (or whatever score you're using) after, say, the 5th digit. Showing the entire float is kinda messy to my eyes.
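A minimal sketch of that suggestion, assuming the scores are plain Python floats (the site's actual stack isn't stated in this thread, and both helper names are made up):

```python
import math


def truncate(score: float, digits: int = 5) -> float:
    """Drop everything after `digits` decimal places, without rounding."""
    factor = 10 ** digits
    # Binary floats can make some inputs land one unit low
    # (e.g. 0.29 * 100 is slightly below 29), so this is display-grade only.
    return math.trunc(score * factor) / factor


def display(score: float, digits: int = 5) -> str:
    """Format with a fixed number of decimals; this rounds, it doesn't truncate."""
    return f"{score:.{digits}f}"


if __name__ == "__main__":
    print(truncate(0.123456789))
    print(display(0.123456789))
```

For display purposes, rounding with a format string is usually enough, and it keeps the column width constant.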


Don’t sell yourself short. Simplicity is smart. It’s astonishing how often the simplest thing turns out to be exponentially more effective than the so-called smart thing.

I can’t get over how phenomenal this is. Please put every one of your side project ideas into production!


I am curious whether it could pick GPT3 out of the crowd.


It's easy to write complicated systems; it takes a genius to make them simple.


cool, and thanks for the clarification. I ask mainly because of OpenAI's request limit, which makes many scalable ideas unfeasible


was just telling my wife a few days ago that my dream was to watch him live one day. it's just sad that it won't be possible anymore.


In a way we are so lucky that we now have recording technologies for both video and audio, and literally hundreds of hours of people like him to enjoy. Imagine what it would have been like to attend a concert where JS Bach was playing. Back then there was no amplification, and really only the lucky few got to see this up close. The very best your average person could do would be to go to church for some half-decent music.


yes, I completely understand your point. I'm a huge fan of Erik Satie, for example, and I'm lucky enough to be able to play 4 of his songs on the piano. I'd love to at least hear him playing his own compositions, but ofc it's impossible lol.


The Amsterdam pianola museum has some Rachmaninov recorded by Rachmaninov. I wrote about it here:

https://jacquesmattheij.com/rachmaninoff-plays-rachmaninoff-...

If you ever visit Amsterdam, definitely go see it; it is quite an experience.


Oh that's dope, will read it! And I'm pretty sure I read your blog once, because of the pianojacq! PS: I'm planning a visit to Amsterdam this year for a recommender systems conference, so I'll for sure visit the museum if I'm able to go. Thanks!


yep, I suddenly fell in love with music after my dad passed away. I was in college studying materials engineering, and I made a living out of music for 6 years (until the pandemic)


Where I work, our stack is all about GCP/Airflow/Python/BigQuery ML, for recommender systems. I'm now playing around with Turi Create (Apple) to compare it with BQML.


that's really cool. I'd like to help somehow, as a volunteer if needed.


Hi! Sorry, just saw this. Please send an email to filippo@discofm.co


Having completely changed professions 3 times in the last 10 years, I have to agree completely with the article.


How could one prepare for such interviews? I know Google is the place to start, but if you could point to some sources I'd really appreciate it!


I think you are missing one of the two pieces the question builds on: linear algebra (do you know how a dot product is defined?) or data structures, including their complexity. You would prepare by reading up on whichever of these two you are missing.

If you know how it's supposed to work in theory but struggle to bang it out, you could practice on, e.g., LeetCode.
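As a refresher on the two pieces mentioned, here is a small sketch. The interview question itself isn't quoted in this thread, so the sparse variant is only a guess at the kind of data-structure twist such questions tend to add; the function names are made up.

```python
def dot(u: list, v: list) -> float:
    """Dot product of two equal-length vectors: sum of u[i] * v[i]. O(n) time."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(u, v))


def sparse_dot(u: dict, v: dict) -> float:
    """Dot product of sparse vectors stored as {index: value} dicts.

    Iterates over the smaller dict and looks up the larger one, so the cost is
    O(min(len(u), len(v))) average-case hash lookups instead of O(n).
    """
    if len(u) > len(v):
        u, v = v, u
    return sum(value * v[index] for index, value in u.items() if index in v)


if __name__ == "__main__":
    print(dot([1, 2, 3], [4, 5, 6]))
    print(sparse_dot({0: 1, 5: 2}, {5: 3, 7: 4}))
```

Being able to state the complexity of each version out loud is probably as important as getting the code right.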


lol


Lol, yep. I always use my gf for that purpose and vice-versa.


KIC 11145123 is a good example.


"However, for a data science beginner, SQL is the best place to start."

I totally agree with that statement. Being a beginner in the DS field myself, I'm living through this right now in my job. And, as a plus, working with SQL every day is also giving me different perspectives on handling pandas DataFrames in Python.

