ts_zip ( https://bellard.org/ts_server/ts_zip.html ) already implements lossless compression with language models and is fast enough to be usable for some applications.
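For intuition on why a good predictor makes a good compressor: an arithmetic coder driven by a model spends roughly -log2(p) bits on a symbol the model gave probability p. The sketch below is not ts_zip's code. It assumes a tiny adaptive byte-frequency model in place of the neural network, and the hypothetical `ideal_compressed_bits` only computes the ideal output size instead of producing a bitstream.

```python
import math


def ideal_compressed_bits(data: bytes) -> float:
    """Ideal arithmetic-coded size of `data` under an adaptive order-0 model.

    Stand-in for a language model (an assumption, not what ts_zip uses):
    each byte's probability comes from Laplace-smoothed counts of the bytes
    seen so far. A decoder can rebuild the same counts while decoding,
    which is what keeps the scheme lossless.
    """
    counts = [1] * 256  # one pseudo-count per possible byte value
    total = 256
    bits = 0.0
    for byte in data:
        p = counts[byte] / total
        bits += -math.log2(p)  # cost of coding this byte
        counts[byte] += 1      # update the model only after coding the byte
        total += 1
    return bits


repetitive = b"Planet X " * 50
varied = bytes(range(256))
print("repetitive bits/byte:", ideal_compressed_bits(repetitive) / len(repetitive))
print("varied bits/byte:", ideal_compressed_bits(varied) / len(varied))
```

The repetitive input costs fewer than 8 bits per byte because the model learns its statistics, while the input with no repeats costs more than 8. A stronger model, such as a large language model, pushes the predictable case lower still.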
I gave it this input: Planet X has been discovered
And got this output:
From ‘Astronomer’: In September 1998, at 2.01 AM Eastern time, a team of astronomers led by T. T. Von Eltz observed a previously unknown stellar object in the constellation Cassiopeia, in the night sky. What the astronomers had discovered was a large celestial body approaching the Earth and the Sun at incredible speed. The distance of this planet from the Earth, however, was too great for it to be seen from the Earth, but not too far for it to be seen from the Moon, a mere 170,000 km away. This object, which was discovered to have a mass approximately equal to that of Earth, was named “Planet X.” For many decades, scientists on the planet were unable to determine what it was, and some even believed that it might be a natural phenomenon. However, in the past few years, it has become evident that the truth was something much different. As a result, the truth is now being revealed to the world.
Planet X is not a planet.
Planet X is an object on the ecliptic, or the same orbit as the Earth, and is one of the planets of our solar system. The object is much larger than any of the other planets, having a mass about equal to that of the Earth. It was at the center of the Great Darkness and will make a complete orbit of the Sun once every 25,000 years. Its orbit takes it through the plane of the ecliptic, which passes through the Earth’s orbit. As it approaches the Sun, Planet X starts to heat up, as the Sun’s warmth increases, and slowly rises to the Earth’s orbit. The planet rises to a height of about 10,000,000 km above the Earth’s surface. At the time of its discovery in 1998, Planet X was over 6,000,000 km above the surface of the Sun.
Due to the size of the body, its gravitational effect is not negligible. This gives rise to an interesting phenomenon: it turns out that as the planet approaches the Sun, the Earth will begin to experience its gravitational pull. Once this happens, the gravity exerted by the planet will cause a tidal bulge in the oceans and seas of Earth. According to the calculations, the Earth will have experienced an increase in water mass of about 300 billion tons over the period of about one hundred years. This is approximately ten times the combined total of water deposits on Earth’s continental shelves.
The existence of Planet X, and its possible influence on the Earth, may have some practical implications for our planet. If it were to continue in this trajectory, this could cause the planet to rotate more rapidly, something that could give rise to severe hurricanes. This means that there would be periods of high probability for earthquakes and volcanoes. The effect of the gravity would also cause a rotation of the Earth, increasing its distance from the Sun by 200 million miles. This would cause the loss of all of the continents, including Africa. The reason for this is that the loss of water in the oceans, together with the loss of the continental shelf, would reduce the Earth’s overall gravitational force, leaving the planet exposed. It could also cause the polar ice caps to melt. The presence of Planet X will also cause fluctuations in the Sun’s brightness, which would have serious effects on the plants and trees on Earth.
The discovery of Planet X shows that there is an intelligent intelligence that is controlling the destiny of our planet. However, it remains to be seen if it is a hostile, malevolent force, or if it is something else entirely. This question, and others, will be decided only after we have examined the discovery of Planet X, and its implications.
The existence of Planet X also shows the power and influence that the intelligence, or beings, of other planets have on our own. It shows how one planet’s discovery can have an effect on a neighboring planet, and eventually on another planet further away. These forces are not necessarily benevolent. But they may also be benevolent. We will soon have a better idea of what lies in store for us.
FYI, the corresponding standalone Linux command line version is available at https://bellard.org/nncp/gpt2tc.html . It also does text completion and file compression.
Here's another one, seed text in italics. Clearly some Star Trek was used in the training data.
"Engage", said Captain Picard from the bridge of the Enterprise. "But sir", Commander Data began, "this must have been part of something called the Borg."
"You know I didn't tell them that," said Picard, "I have a very hard time talking to them."
- Picard, Data, and Worf while the ship is under attack by Borg drones at Federation border
"The Borg don't know that they're being infected with something dangerous. And they're not too fond of it. There's a saying: 'Borgs eat Borges. Borges eat Borges.' They'll get tired of eating their own."
The right place for a chill after work beer. Free peanuts, but don’t order “the other” in the name of science. Free peanuts, but don’t order “the other” in the name of science. The last thing you’ll see after work is a sign that says, “Sorry we are closed on Tuesday because we have peanuts.” The last thing you’ll see after work is a sign that says, “Sorry we are closed on Tuesday because we have peanuts.“
This is some Escher level shit. These words, just like his paintings, make sense totally within your window of vision as you move your eyes through, but make no sense on the whole.
Pretty impressive! This is what I got (my initial text is in italic):
One day I woke up suddenly after a bad dream. I was sitting on my bed alone. As soon as I turned over I found myself lying down in a strange bed. It had no bed, no mattress and no cover for me. I was just lying on that thick carpet. I looked around to discover nothing but a white bed. There was no window, no furniture. No door. The only clue was a single white mirror that stood beside my bed. And there was a single candle lying on my bedside table. I was not worried or scared. I was fascinated.
I could see the floor.
I could see the walls.
And I could see everything that was around me.
I could hear the wind.
I could see everything from the windows, from the door that was always slightly open. I could even see the light.
I was in a dream world and I knew that I was in a dream. But what I wanted to know was what was happening in this world.
I've always wondered if I could spend a weekend with some neural nets, do some light editing, and try to corner the market on some of the more obscure Amazon ebook categories (e.g. Sasquatch Erotica)
It seems like this model would be more than adequate. Other than some basic human anatomy mistakes (e.g. breasts are typically above the waist), it generates some surprisingly explicit stuff (which I've mostly elided) even with my misspelling of "caressed".
He slowly carresed her shuddering body back to her body. Her lips were wet and slick. Her breathing slowed and she began to shudder as a hand came down on her breasts. She shivered as it moved between her breasts gently. "Don't worry, I'm here. I'll let her go for you." He stroked down her body slowly as he stroked down her back. "Come here, my beautiful." He took a step down and began to push his body down into her. He gently pulled her closer against himself and held his breath as his hands came down from her waist to her breasts as his mouth came down to her p...
_Lucy turned off the light and slipped under the sheets, feeling Frank draw closer_ to her. He lay next to her and she reached up to caress his face softly and she kissed the side of his neck again, feeling him shiver against her. "You look beautiful." Frank looked down at her and smiled. "You look really good in those clothes, though." She kissed his cheek. "I know." "That's okay," he said and he reached out to kiss her, but then he slipped away. "Do you want anything else?" She smiled. "Nothing." He smiled back at her. He was still wearing his clothes from before. He wasn't as naked as he usually was. She reached around to cup his breast and she pressed her face against his. It was just her and him. "Are you ready for me?" Frank grinned. "Ready for what?" "For you," she said and she leaned over and kissed him again, her hands caressing his skin and his breasts, her mouth moving against his to taste his skin.
* His breasts? I mean sure, this is the 21st century I guess.
* Her mouth moving against his to taste his skin...
* They're both in bed fully clothed with the lights off, and lamenting how they're usually more naked.
I mean obvious inconsistencies aside, this is super fun to play with.
I wouldn’t be surprised if that’s already being pursued. Some online ebook publishers let you use their API... or you could use browser automation to crank out epubs. Rich formatting would help fool a cursory read by the filtering editors. It would be interesting to see how many 5-star reviews one would get ... perhaps from machines in a Chinese click shop.
Interesting. In both this and the parent the generation seems to center around one or two words ("bed" for the parent, "breasts" for yours) and then generate text connecting those words that is grammatically correct but is incoherent gibberish otherwise.
Just an afterthought, but you could probably cobble something together that asks someone for some basic prompts, like names, locations, etc, then uses a series of templates to generate a chapter of text that's reasonably coherent and tailored to an individual.
Feeding in the names of GOT characters seems to produce some entertaining alternate-universe stuff, looks like they're trying to generate something similar to a Wiki page:
Ned Stark (The Lord of Winterfell) is a Stark from the North of the Seven Kingdoms.
Contents [show]
Biography
Background
Ned is the father of Arya, Brandon, Robb, Rickon, and Hodor. He had married Catelyn Tully after her father died in the Battle of the Whispering Wood. During the War of the Five Kings he served as Hand of the King under Lord Eddard Stark after his father's death. Ned has two brothers, Willem and Torrhen Stark. Ned, his siblings and their children were all killed during the Red Wedding
It was [setting] in the [location]. The [creature] [action], and [intent] said "
And here are some results: the first complete sentence from a few iterations of each prompt (the intention being that the prompts can be generated at runtime).
---
It was night in the dark dungeon. The Orc came out of nowhere, and threateningly said "Who are you? Where are you coming from?"
It was night in the dark dungeon. The Orc came out of nowhere, and threateningly said "We are not going to allow the Dark Lord to rule this world. We are going to rule this world from the top. You should go back to the shadows, or you can just go to my dungeon."
It was night in the dark dungeon. The Orc came out of nowhere, and threateningly said "What are you doing?"
---
It was twilight in the enchanted forest. The white elf suddenly appeared, and invitingly said "Good evening, we are ready for our banquet."
It was twilight in the enchanted forest. The white elf suddenly appeared, and invitingly said "Hey! I've been waiting for you! This is the first time I've seen you."
It was twilight in the enchanted forest. The white elf suddenly appeared, and invitingly said "Welcome, My Lady! I am here to serve you!"
It was twilight in the enchanted forest. The white elf suddenly appeared, and invitingly said "I am Tui-Yuan. Come down and meet my parents."
---
It was damp in the filthy sewer. The mutated rat crept up, and cunningly said "I will tell you everything."
It was damp in the filthy sewer. The mutated rat crept up, and cunningly said "I have an idea" in a voice so high that all the other rats in the sewer turned pale
It was damp in the filthy sewer. The mutated rat crept up, and cunningly said "I am the rat."
It was damp in the filthy sewer. The mutated rat crept up, and cunningly said "You'll die soon".
---
It was humid in the abandoned brothel. The policeman barged in, and brusquely said "I'm a policeman".
It was humid in the abandoned brothel. The policeman barged in, and brusquely said "go to hell".
---
It was frigid in the abandoned space station. The xenomorph burst in, and acerbically said "It's warm on the other side."
It was frigid in the abandoned space station. The xenomorph burst in, and acerbically said "Hello" while it slowly closed in.
It was frigid in the abandoned space station. The xenomorph burst in, and acerbically said "I hate this cold."
---
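A minimal sketch of how prompts like these could be assembled at runtime. The slot values are copied from the examples above; `TEMPLATE`, `SCENES`, `build_prompt` and `first_quoted_sentence` are hypothetical names, and the call to the text-completion model is left out because its interface is not described here.

```python
import random

# Slot names mirror the bracketed placeholders in the template above.
TEMPLATE = (
    'It was {setting} in the {location}. '
    'The {creature} {action}, and {intent} said "'
)

# Slot values taken from three of the example prompts above.
SCENES = [
    {"setting": "night", "location": "dark dungeon", "creature": "Orc",
     "action": "came out of nowhere", "intent": "threateningly"},
    {"setting": "twilight", "location": "enchanted forest", "creature": "white elf",
     "action": "suddenly appeared", "intent": "invitingly"},
    {"setting": "damp", "location": "filthy sewer", "creature": "mutated rat",
     "action": "crept up", "intent": "cunningly"},
]


def build_prompt(scene: dict) -> str:
    """Fill the template; the prompt ends on an open quote for the model to continue."""
    return TEMPLATE.format(**scene)


def first_quoted_sentence(completion: str) -> str:
    """Keep only the model's text up to the first closing quote, if there is one."""
    end = completion.find('"')
    return completion if end == -1 else completion[:end]


prompt = build_prompt(random.choice(SCENES))
print(prompt)
# A made-up continuation, standing in for real model output:
print(first_quoted_sentence('What are you doing?" he growled.'))
```

Trimming at the closing quote keeps just the spoken line, so a game could show the line without the prompt that produced it.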
I think there's a lot of merit to this idea hey. Some of the responses are left field but could be woven into the charm. I guess the algorithm is pretty processor intensive though - is it worth it for "flavour"? It could work for a low fidelity or text based game I think.
Edit: I think it would work better if the prompt is not displayed, you just see the bit following the quote.
I got an... interesting result when I tried it. The URL was almost plausible, until the anchor came up; I also included the beginning of the next paragraph simply because it is hilariously unrelated:
A coworker of mine came up to me earlier today and started asking about the C memory model. In response, I showed him the documentation and asked that he check out the following link:
This is pretty good. I seeded GPT-2 (1.5B) with the first TWO sentences from this biomedical passage [source: https://www.ncbi.nlm.nih.gov/pubmed/19254780 ; I show the remaining original sentences, for comparison to GPT-2].
<start>
Yeast cells lacking the mitochondrial NADH kinase encoded by POS5 display increased sensitivity to hydrogen peroxide, a slow-growth phenotype, reduced mitochondrial function and increased levels of mitochondrial protein oxidation and mtDNA mutations. Here we examined gene expression in pos5Delta cells, comparing these data to those from cells containing deletions of superoxide dismutase-encoding genes SOD1 or SOD2.
Surprisingly, stress-response genes were down-regulated in pos5Delta, sod1Delta and sod2Delta cells, implying that cells infer stress levels from mitochondrial activity rather than sensing reactive oxygen species directly. Additionally, pos5Delta, but not sod1 or sod2, cells displayed an anaerobic expression profile, indicating a defect in oxygen sensing that is specific to pos5, and is not a general stress-response. Finally, the pos5Delta expression profile is quite similar to the hap1Delta expression profile previously reported, which may indicate a shared mechanism.
<end>
Here is the generated output (following those first two sentences; I needed to do this in Opera; Firefox did not work, possibly due to add-ons and settings).
<start>
The results show that POS5Delta cells exhibit decreased expression of two of the genes, SOD1 and SOD2, but increased expression of several other genes, including aldehyde dehydrogenase subunits, NADH dehydrogenase subunits 2 (ODU2), cytochrome c oxidase isoforms (OXO3), and cytochrome b oxidase/OXO1, and increased expression of aldehyde dehydrogenase subunits 1 (ODU1) and 1a2. The results also show that loss of two genes, SOA and SOA1, and two genes, SOA and SOA2, leads to decreased expression of aldehyde dehydrogenase subunits 1, 1a2 and 1a3 and increased expression of ODO1 and ODO1a1. The results show that loss of two genes, ODO2 and ODO1b1, leads to decreased expression of ODO2 and increased expression of O
<end>
The output is grammatically and syntactically correct, but otherwise nonsensical: e.g. there are no genes ODU1, ODO2 ...
My own meta-try from the textsynth.org site, using its own description as input:
Text completion using the GPT-2 language model. It is a neural network of 1.5 billion parameters. Type a text and let the neural network complete it. Each try returns a different randomly chosen completion.
The system is still far from being perfect. I need to add a bit of support for parsing and formatting of HTML, CSS, and JSON.
If anyone has ideas on what could go in the UI, I am open to suggestions.
and wait a while for results to appear. You can use the --progress ( -- ) flag to tell the training of the system to continue, or use the --stop ( -- ) flag to stop training. If you run with no options, the system will continue on its current task.
Options:
--input text is the text to be trained --output text is the path to the text as output --progress will print a summary of the progress of the system every 15 seconds --stop will stop training the system and print a summary of
The answer to life, the universe and everything is: "I dont fucking care." The answer to life, the universe and everything is this: "Fuck you!" -Derek Smart
I started with "Donald Trump is". Ended up with this. Looks plagiarized to me :)
---
Donald Trump is a serial liar, and we're going to prove it," the group said. "You can take it to the bank: his word is worthless. His mouth is a hotbed of vulgarity and hate."
The group released a statement on Twitter, saying that it would not stop trying to make Trump accountable for his words.
Trump has also been attacked over allegations that he sexually harassed his former wife Ivana, the billionaire's first wife, during their divorce in the 1990s. Trump vehemently denied he'd ever done such a thing.
To give some perspective, the setup of http://textsynth.org consists of a single 250 KB C executable for Linux on the server and 150 lines of JavaScript on the client, without any dependency on other libraries...