PHP Markov chain generator
Alice same she shore and seemed to remark myself, as she went back to the heard at Alice hastily, after open her sister. Here, Bill! The Duchess too late it a bit had a sort of they are the Queen. An invitation a little of the ran what it was only down her to the other; the Dodo, a Lory and the please that it must as well very good making a dish of time,” she added, “It isn’t a letters”.
If you think that sounds like gibberish, you’re right. But it is a pretty interesting type of gibberish, because it is generated by a Markov chain. A Markov chain (if I describe this correctly; I’m not a mathematician or very good at logic) is a random process in which the next state depends only on the current one. For example, let’s say you have a text, such as Alice in Wonderland (that’s what the text above is based on). You could scan the whole text, count how many times each letter follows each other letter, and make a table of that. E.g., the letter N probably follows the letter A more often than the letter Q does.
After making such a table, you start with a random letter and then repeatedly make a weighted random decision about the next letter, on the basis of how many times each letter usually follows the current one. You get a text that quite closely resembles the original text but has lost all of its original context, resulting in the gibberish you read above.
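To make the two steps above concrete, here is a minimal sketch in Python (not the PHP source of the generator, just an illustration). It assumes the simplest case, where each next letter depends only on the single letter before it; a real generator may well look back further than one letter. The names `build_table` and `generate` are made up for this example.

```python
import random
from collections import Counter, defaultdict


def build_table(text):
    """Count, for every character, how often each character follows it."""
    table = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        table[current][following] += 1
    return table


def generate(table, length, rng=random):
    """Produce `length` characters by a weighted random walk over the table."""
    if length <= 0:
        return ""
    if not table:
        raise ValueError("the source text needs at least two characters")
    starts = list(table)
    current = rng.choice(starts)  # random starting letter
    output = [current]
    for _ in range(length - 1):
        followers = table.get(current)
        if not followers:
            # Dead end: this character was only seen at the very end
            # of the source text, so restart from a random letter.
            current = rng.choice(starts)
        else:
            chars = list(followers)
            weights = [followers[c] for c in chars]
            # Letters that follow more often are picked more often.
            current = rng.choices(chars, weights=weights)[0]
        output.append(current)
    return "".join(output)


if __name__ == "__main__":
    sample = "alice was beginning to get very tired of sitting by her sister"
    table = build_table(sample)
    print(table["a"])           # how often each letter follows "a"
    print(generate(table, 60))  # 60 characters of letter-level gibberish
```

With only one letter of look-back the output rarely forms real words. Building the table on pairs or longer runs of letters (or on whole words) instead of single letters makes the result stick closer to the source text.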
It’s quite interesting to program such a Markov text generator yourself, so I did exactly that with my PHP Markov chain generator. You can either input the starting text yourself (make sure to make it pretty long for best effect) or select one of the pre-selected texts, such as Alice in Wonderland, the Wikipedia article on Calvin and Hobbes, or Immanuel Kant’s Critique of Pure Reason (the last one becomes even more incomprehensible than it already was, can you imagine that?).
For your own experimentation you can get the source here. If you want, you can include the generator in your own projects; the sources are released under the MIT open source license.
If you want to know more about Markov chains, be sure to read this article by Jeff Atwood; he explains the whole thing a lot better than I can.
And for your amusement, here’s another Markov-generated piece, from Kant’s Critique of Pure Reason:
Thus common upon nothing that is, is deceptive infinite or as I am free in its or power to the continuous vacillationes demand explanation to this time-determines which they are given therefore necessary in itself respect none those a priori with a possible men course, and determission. Today its merely the conceiving the world in space is finite, neither in the support.
Phil
Nice generator – you can also use Markov chains to detect gibberish, not just to generate it.
I’ve written a PHP gibberish checker for email addresses; you can find it here:
http://www.idontplaydarts.com/2009/08/detecting-a-fake-email-address-using-markov-chains/
It’s open source :D
Niklas
ergodicity good to know you can tell whether next is or isn’t for instance according to representation heuristica, which language incoming is or who since different sets emit likely ex ante estimates. interesting appliance is the node with at least 3 edges.
Anonymous
It’s funny how mathematics works sometimes.
Tarwin
Took an hour out and made this fun little “Shakespeare Tweet Generator” – http://stg.touchmypixel.com/
Thanks for the script. Very inspirational.
I did a pass over all his sonnets, then saved the serialized output so the map doesn’t have to be rebuilt every time. Made it super quick!
This even saves each generated phrase so you can prove to others that he really did say that!
Sam Levy
Thanks for this. I made a couple of minor adjustments to it to base the chaining off full words (tends to make more legible sentences). I’m also looking at updating it to attach to a database so that it can handle large datasets better/faster.
Combination with Web API: Twitter bot programming | Have a Cup of Coffee?
[…] English version. I looked through the web and found a good resource. The resource I referred to is this. There is a source code, and I wanted to make use of it and combine it with bot program. But I could […]
Anonymous
Markov chains always make me smile :)
Markov Chains, Horse e-Books and Margins | Bionic Teaching
[…] which will help me out with the Twitterbot end of things in the near future. I also found this PHP based Markov generator which does very nearly what I want absent the Twitter-ing […]
Markov Tweet Generator Code, Path, & Potential | Bionic Teaching
[…] following is how I adapted the Markov chain generator from Hay Kranen. Thanks to the comments I found below Hay’s post this Markov + Shakespeare version […]
Ralf van Kasteren
What does “a text that quite closely resembles the original text” refer to in paragraph 3 (meaning the latter part of the quoted line)? I hope that “the one just produced” is meant by “the original”, or else we get the plagiarism of clones from SF novels ;
No, I’m just being cocky.
Would it be an idea to build an implementation of something like “a broken up sentence sequence” into it, in order to create granulated grammatical structures that follow a more natural sentence build-up? For I am dumb and know nothing about XML and rotary tables.
Kant
wo nicht allein die beiden an sich selbst, dem Naturgesetze, wenn sie als dem Feld möglicher Erfahrung auf einer empirischen Verknüpfung aus allgemeinen Gesetzen abweichen. Daher sind von dem so viel, daß viele Bestimmungen, die Rede gar nicht anders urteilen; eine Anmerkung, die Form möglicher Erfahrung, nämlich die ein Regressus, auf unsere Vernunft. Idee macht, sondern ohne daß eine vollständig ist, usw., mit sich selbst, als die völlige Freiheit des Bestimmenden, sondern der menschlichen Vernunft selbst aufgehoben werden kann. Denn alles andere vorhergehende Erscheinungen, in unserer Seele, als Phänomena bestimmt, den klarsten Lichte darzustellen. Der Definitionen nur unter einem möglichen Fällen zu einem möglichen auch nur durch bloße Form ohne Stoff zum Widerspruche frei halten, als ich denke, erbaut worden, war der Bedingungen, entweder eine Reihe der Allgemeinheit beweisen? Sondern eine reine Vernunft einer einigen gleich gar keine Erkenntnis hervor, welche die ein System darlegen müsse. Es mag sein, des vorigen Zustandes unmittelbar auf keine Beziehung auf zweierlei Gegenstände gegeben werden. Selbst eine Ursache, welcher Seite das Subjekt durch sukzessive Synthesis des Denkens, irgendeine Metaphysik betrifft, und es ein Gewicht, was sich bestehende Wirklichkeit des Zugleichseins, nach unendlich. Denn ich wissen?
Anonymous
???