Hacker News | eli-bryan's comments

Dataset:

https://www.kaggle.com/newshaikus/dataset

Search / Browse Haikus here:

https://doomhaikus.3iap.co

Context:

Dataset from an attempt to teach computers to write silly poems, given a prompt / topic.

I wrote a script to post each day's top news stories to Mechanical Turk, asking turkers to summarize each article as a haiku. I verified the syllable counts for each haiku against a syllable dictionary and/or manually (for unrecognized words).
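A minimal sketch of what that syllable check could look like, assuming a plain word-to-count lookup. The tiny `SYLLABLES` table, the function names and the sample haiku are all made up for illustration; they are not the actual script or dataset. Lines containing words missing from the dictionary are flagged for manual review rather than rejected.

```python
import re

# Hypothetical mini syllable dictionary, for illustration only. A real run
# would load a full pronunciation dictionary instead.
SYLLABLES = {
    "markets": 2, "fall": 1, "again": 2,
    "everyone": 3, "stays": 1, "home": 1, "today": 2,
    "bread": 1, "is": 1, "rising": 2, "now": 1,
}


def count_syllables(line, dictionary):
    """Return (syllables in known words, list of words not in the dictionary)."""
    total = 0
    unknown = []
    for word in re.findall(r"[a-z']+", line.lower()):
        if word in dictionary:
            total += dictionary[word]
        else:
            unknown.append(word)
    return total, unknown


def check_haiku(lines, dictionary, pattern=(5, 7, 5)):
    """Classify a haiku as 'valid', 'invalid', or 'needs_review'."""
    if len(lines) != len(pattern):
        return "invalid"
    needs_review = False
    for line, expected in zip(lines, pattern):
        total, unknown = count_syllables(line, dictionary)
        if total > expected:
            # Known words alone already overshoot the line's budget.
            return "invalid"
        if unknown:
            # Unrecognized words: a human has to count these by hand.
            needs_review = True
        elif total != expected:
            return "invalid"
    return "needs_review" if needs_review else "valid"


example = ["markets fall again", "everyone stays home today", "bread is rising now"]
print(check_haiku(example, SYLLABLES))
```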

It's been running since March. About 2,000 people have responded and there are now ~2,700 haikus, forever memorializing the worst year of our lives as punchy/gloomy sets of 5, 7, 5 syllables.

Semi-plausible use cases: Data art; Language models; Translation (with unusual constraints); Summarization


Yikes! I'd assumed you might reach toxicity before running out of digits! If you post the "share" link from the notebook I'll take a look. Or feel free to email me, my address is in my profile.


To reproduce: load the defaults then click the 8:00am plus button

Once: you might expect restful sleep after 5am

Twice: you might expect restful sleep after 6am

Three: you might expect restful sleep after 12am

But 12am (midnight) is before 5am :-), so I assume it overflowed, or it means 12pm (the following day).
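For illustration only, here is one way a readout can keep times past midnight unambiguous. This is a guess at the kind of wrap-around involved, not the notebook's actual code; `format_clock` is a hypothetical helper that takes minutes since midnight of the first day and adds a day offset instead of silently wrapping.

```python
def format_clock(total_minutes):
    """Format minutes since midnight (day 0) as a 12-hour clock time.

    Times on later days get an explicit "(+Nd)" suffix so that, for
    example, midnight at the end of day 0 is not mistaken for an
    earlier time on the same day.
    """
    days, remainder = divmod(int(total_minutes), 24 * 60)
    hour, minute = divmod(remainder, 60)
    suffix = "am" if hour < 12 else "pm"
    hour12 = hour % 12 or 12  # 0 -> 12 for both midnight and noon
    text = f"{hour12}:{minute:02d}{suffix}"
    if days:
        text += f" (+{days}d)"
    return text


print(format_clock(5 * 60))   # 5:00am
print(format_clock(24 * 60))  # 12:00am (+1d)
```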


The defaults are based on published papers, but should be interpreted loosely given the variability. A few considerations:

1. Even though some companies list precise #s for caffeine content, actual caffeine content can vary widely from day to day (259mg to 564mg) according to [1].

2. Caffeine half-life varies from person to person, from ~1.5 to ~9.5 hours. About 5 hours is typical according to [2], but unless you're a smoker or on certain birth control, I haven't found anything yet that says which end of the distribution to expect for yourself.

3. Caffeine sensitivity also seems to vary from person to person. And the "sleep threshold" is more of an upper bound than a target, based on working backwards to figure out how much caffeine subjects still had in their system when they saw sleep disruption in [3,4,5]. So participants in [5] slept worse when you'd expect them to have ~25mg of caffeine remaining in their system on avg, but that doesn't necessarily mean they'd have slept normally at 24mg.
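To get a feel for how much these ranges matter, here is a rough sketch using a plain exponential half-life model. It assumes a single dose that is absorbed instantly, which is a simplification, and `hours_until_below` is a hypothetical helper name. The inputs are the figures quoted above: the 259mg to 564mg content range, half-lives of 1.5, 5 and 9.5 hours, and the ~25mg level from [5].

```python
import math


def hours_until_below(dose_mg, threshold_mg, half_life_hours):
    """Hours for a single dose to decay to the threshold.

    Solves dose * 0.5 ** (t / half_life) = threshold for t.
    Assumes instant absorption and simple first-order decay.
    """
    if dose_mg <= threshold_mg:
        return 0.0
    return half_life_hours * math.log2(dose_mg / threshold_mg)


# Sweep the ranges quoted above to see how wide the spread is.
for half_life in (1.5, 5.0, 9.5):
    for dose in (259.0, 564.0):
        hours = hours_until_below(dose, 25.0, half_life)
        print(f"half-life {half_life}h, dose {dose:.0f}mg: ~{hours:.1f}h to reach 25mg")
```

Because the waiting time scales linearly with the half-life, the slow end of the range takes a bit over six times as long as the fast end for the same dose.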

These are covered in a bit more detail in the writeup: https://towardsdatascience.com/interactive-visualizing-caffe...

[1] https://doi.org/10.1093/jat/27.7.520

[2] https://www.ncbi.nlm.nih.gov/books/NBK223808/

[3] https://doi.org/10.5664/jcsm.3170

[4] https://doi.org/10.1111/j.1365-2869.2006.00518.x

[5] https://doi.org/10.1016/0006-8993(95)00040-W


I've never been a great sleeper. I knew caffeine could be a factor and I understood, at least intellectually, that caffeine’s half-life clearance meant that some of it might still be floating around my brain later in the day. But I didn't fully grok the dynamics until seeing it play out visually.

So this is a (simple) simulator, built to visually explore 2 questions:

1. How long can caffeine stay in your system?

2. When will caffeine levels fall back below a threshold with minimal sleep disruption?

The answer is, it depends... so I parameterized the notebook to try different options. Notes on the parameters / lessons learned are here: https://towardsdatascience.com/interactive-visualizing-caffe...
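As a rough illustration of the kind of model involved (not the notebook's actual code), here is a minimal half-life simulation for several doses. It assumes each dose is absorbed instantly and decays independently. The function names, dose amounts, times and the 25mg threshold in the example are placeholders to play with.

```python
def caffeine_remaining(doses, t, half_life_hours=5.0):
    """Total mg remaining at time t (hours) from a list of (time, mg) doses.

    Each dose decays independently; doses taken after t are ignored.
    """
    return sum(
        mg * 0.5 ** ((t - t0) / half_life_hours)
        for t0, mg in doses
        if t >= t0
    )


def first_time_below(doses, threshold_mg, half_life_hours=5.0,
                     step_hours=0.25, horizon_hours=72.0):
    """Earliest grid time at or after the last dose when the total is below the threshold.

    Scans forward in fixed steps; returns None if the level never drops
    below the threshold within the horizon.
    """
    start = max(t0 for t0, _ in doses)
    steps = int(horizon_hours / step_hours)
    for i in range(steps + 1):
        t = start + i * step_hours
        if caffeine_remaining(doses, t, half_life_hours) < threshold_mg:
            return t
    return None


# Placeholder day: 100mg at 8:00 and another 100mg at 13:00 (hours since midnight).
doses = [(8.0, 100.0), (13.0, 100.0)]
t = first_time_below(doses, threshold_mg=25.0)
print(f"below threshold at hour {t} (values past 24 fall on the next day)")
```

Changing `half_life_hours` or the threshold and re-running is the text equivalent of turning the dials.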

Take this with a big grain of salt. I'm not a pharmacist, physician, chemist, etc, and this hasn't been reviewed by one. But at least for me, being able to turn the dials on a simple half-life model and see the consequences has been eye-opening.

Default params are chosen based on:

* https://www.ncbi.nlm.nih.gov/books/NBK223808/

* https://doi.org/10.1038/clpt.1987.126

* https://doi.org/10.1002/cpt197824140

* https://doi.org/10.5664/jcsm.3170

* https://doi.org/10.1111/j.1365-2869.2006.00518.x

* https://doi.org/10.1016/0006-8993(95)00040-W


There's a researcher, Ethan Bernstein, who has looked into this quite a bit and has some great stories about the ways close monitoring can (sometimes) go sideways. He did a fascinating history of it here: https://www.hbs.edu/faculty/Publication%20Files/BernsteinE-M...

A representative anecdote re: a factory work floor: "First the [embedded researchers] were quietly shown ‘better ways’ of accomplishing tasks by their peers - a ‘ton of little tricks’ that ‘kept production going’ or enabled ‘faster, easier, and / or safer production.’ Then they were told, ‘Whenever the [customers / managers / leaders] come around, don’t do that, because they’ll get mad.’ Instead, when under observation, embeds were trained in the art of appearing to perform the task the way it was ‘meant’ to be done according to the codified process rules posted for each task. Because many of these performances were not as productive as the ‘little tricks,’ I observed line performance actually dropping when lines were actively supervised." From "The Transparency Paradox": https://journals.sagepub.com/doi/abs/10.1177/000183921245302...

I've also interviewed a few engineers about this recently and there are plenty of horror stories. My favorite 2 quotes (from the same person) about screenshot monitoring freelancers: "It almost uniformly led to worse work..." and "... but my boss loved it." https://medium.com/@elibryan/employee-performance-tracking-d...


Ouch...


Definitely (down the road). And not just health-specific things. There's all kinds of data that can serve as a (proxy) signal for health. e.g. physical activity is correlated with weather: http://rd.springer.com/article/10.1186/1479-5868-3-21 and it's suggested that your diet can be affected by a bunch of different things: http://eab.sagepub.com/content/39/1/106.abstract


Thanks! Bruno Barros did the original (@IlustreBOB). He's crazy talented.


agreed. that was definitely a concern, but the name seems to resonate with people (and short, monosyllabic domain names are scarce). maybe we'll post as "Other Notch" in gamer contexts...


=X good catch! will fix in the next build.

