Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Participants: Derya Akbaba * Ben Allen * Natalia-Rozalia Avlona * Kirill Azernyi * Erin Kathleen Bahl * Natasha Bajc * Lucas Bang * Tully Barnett * Ivette Bayo * Eamonn Bell * John Bell * kiki benzon * Liat Berdugo * Kathi Berens * David Berry * Jeffrey Binder * Philip Borenstein * Gregory Bringman * Sophia Brueckner * Iris Bull * Zara Burton * Evan Buswell * Ashleigh Cassemere-Stanfield * Brooke Cheng* Alm Chung * Jordan Clapper * Lia Coleman * Imani Cooper * David Cuartielles * Edward de Jong * Pierre Depaz * James Dobson * Quinn Dombrowski * Amanda Du Preez * Tristan Espinoza * Emily Esten * Meredith Finkelstein * Caitlin Fisher * Luke Fischbeck * Leonardo Flores * Laura Foster * Federica Frabetti * Jorge Franco * Dargan Frierson * Arianna Gass * Marshall Gillson * Jan Grant * Rosi Grillmair * Ben Grosser * E.L. (Eloisa) Guerrero * Yan Guo * Saksham Gupta * Juan Gutierrez * Gottfried Haider * Nabil Hassein * Chengbo He * Brian Heim * Alexis Herrera * Paul Hertz * shawné michaelain holloway * Stefka Hristova * Simon Hutchinson * Mai Ibrahim * Bryce Jackson * Matt James * Joey Jones * Masood Kamandy * Steve Klabnik * Goda Klumbyte * Rebecca Koeser * achim koh * Julia Kott * James Larkby-Lahet * Milton Laufer * Ryan Leach * Clarissa Lee * Zizi Li * Lilian Liang * Keara Lightning * Chris Lindgren * Xiao Liu * Paloma Lopez * Tina Lumbis * Ana Malagon * Allie Martin * Angelica Martinez * Alex McLean * Chandler McWilliams * Sedaghat Payam Mehdy * Chelsea Miya * Uttamasha Monjoree * Nick Montfort * Stephanie Morillo * Ronald Morrison * Anna Nacher * Maxwell Neely-Cohen * Gutierrez Nicholaus * David Nunez * Jooyoung Oh * Mace Ojala * Alexi Orchard * Steven Oscherwitz * Bomani Oseni McClendon * Kirsten Ostherr * Julia Polyck-O'Neill * Andrew Plotkin * Preeti Raghunath * Nupoor Ranade * Neha Ravella * Amit Ray * David Rieder * Omar Rizwan * Barry Rountree * Jamal Russell * Andy Rutkowski * samara sallam * Mark Sample * Zehra Sayed * Kalila Shapiro * Renee Shelby * Po-Jen Shih * Nick Silcox * Patricia Silva * Lyle Skains * Winnie Soon * Claire Stanford * Samara Hayley Steele * Morillo Stephanie * Brasanac Tea * Denise Thwaites * Yiyu Tian * Lesia Tkacz * Fereshteh Toosi * Alejandra Trejo Rodriguez * Álvaro Triana * Job van der Zwan * Frances Van Scoy * Dan Verständig * Roshan Vid * Yohanna Waliya * Sam Walkow * Kuan Wang * Laurie Waxman * Jacque Wernimont * Jessica Westbrook * Zach Whalen * Shelby Wilson * Avery J. Wiscomb * Grant Wythoff * Cy X * Hamed Yaghoobian * Katherine Ye * Jia Yu * Nikoleta Zampaki * Bret Zawilski * Jared Zeiders * Kevin Zhang * Jessica Zhou * Shuxuan Zhou

Guests: Kayla Adams * Sophia Beall * Daisy Bell * Hope Carpenter * Dimitrios Chavouzis * Esha Chekuri * Tucker Craig * Alec Fisher * Abigail Floyd * Thomas Forman * Emily Fuesler * Luke Greenwood * Jose Guaraco * Angelina Gurrola * Chandler Guzman * Max Li * Dede Louis * Caroline Macaulay * Natasha Mandi * Joseph Masters * Madeleine Page * Mahira Raihan * Emily Redler * Samuel Slattery * Lucy Smith * Tim Smith * Danielle Takahashi * Jarman Taylor * Alto Tutar * Savanna Vest * Ariana Wasret * Kristin Wong * Helen Yang * Katherine Yang * Renee Ye * Kris Yuan * Mei Zhang
Coordinated by Mark Marino (USC), Jeremy Douglass (UCSB), and Zach Mann (USC). Sponsored by the Humanities and Critical Code Studies Lab (USC), and the Digital Arts and Humanities Commons (UCSB).

A Travesty Generator for Micros

edited February 2020 in 2020 Code Critiques

As I mentioned in a reply to @Leonardo.Flores's code critique of some NaNoGenMo works, we can think of Hugh Kenner and Joseph O'Rourke's 1984 "Travesty" program as the ancestor of many other methods for generating text based on some source text. It's a common point of reference for discussions of computer-aided creativity (Charles Hartmann, for example, demonstrates it in Virtual Muse), and it's relatively easy to understand what it does.

Since Kenner and O'Rourke first shared Travesty by publishing it in Byte Magazine, I thought it would be interesting to take a closer look at what the code does. Does it demonstrate some ideas about the original text it works on? Since the term "travesty" is pretty dramatic, is there evidence in what it's doing that validates that idea of what it's doing to the text? Is what goes on in Travesty truly similar to processes that we think of as its descendants (Markov chain and char-rnn, mainly), or is something else at work?

Travesty is written in Pascal, a language I don't have any prior experience with. Still, getting it to run was pretty straightforward. I've transcribed it line by line (including comments) from Byte and shared it in a Github repository, along with some instructions on using it.

Since the directions say to post the code you're critiquing in your thread, I'll include it here as well. I'll post some initial thoughts in a reply to this thread.


Listing 1: Travesty, a program for generating pseudo-text. The program will scan a sample text and generate a "nonsense" imitation. For an order-n scan, every n-character sequence in the output occurs somewhere in the input.

\n
\n
PROGRAM travesty (input,output);            { Kenner/ O'Rourke, 5/9/84}

(*     This is based on Brian Hayes' article in Scientific           *)
(*     American, November 1983. It scans a text and generates        *)
(*     an n-order simulation of its letter combinations. For         *)
(*     order n, the relation of output to input is exactly:          *)
(*           "Any pattern n characters long in the output            *)
(*               has occurred somewhere in the input,                *)
(*                 and at about the same frequency."                 *)
(*     Input should be ready on disk. Program asks how many           *)
(*     characters of output you want. It next asks for the           *)
(*     "Order" -- i.e. how long a string of characters will be       *)
(*     cloned to output when found. You are asked for the            *)
(*     name of the input file, and offered a "Verse" option.         *)
(*     If you select this, and if the input has a "|" char-          *)
(*     acter at the end of each line, words that ends lines in       *)
(*     the original will terminate output lines. Otherwise,          *)
(*     output lines will average 50 characters in length.            *)

CONST 
  ArraySize = 3000;       {maximum number of text chars}
  MaxPat = 9;        {maximum Pattern length}

VAR
  BigArray : PACKED ARRAY [1..ArraySize] of CHAR;
  FreqArray, StartSkip : ARRAY[' '..'|'] of INTEGER;
  Pattern : PACKED ARRAY [1..MaxPat] of CHAR;
  SkipArray : ARRAY [1..ArraySize] of INTEGER;
  OutChars : INTEGER;    {number of characters to be output}
  PatLength : INTEGER;
  f : TEXT;
  CharCount : INTEGER; {characters so far output}
  Verse, NearEnd : BOOLEAN;
  NewChar : CHAR;
  TotalChars : INTEGER; {total chars input, + wraparound}
  Seed : INTEGER;

FUNCTION Random (VAR RandInt : INTEGER) : REAL;
BEGIN
  Random := RandInt / 1009;
  RandInt := (31 * RandInt + 11) MOD 1009
END;

PROCEDURE InParams;
  (* Obtains user's instructions *)
VAR
  InFile : STRING [12];
  Response : CHAR;
BEGIN
  WRITELN ('Enter a Seed (1..1000) for the randomizer');
  READLN (Seed);
  WRITELN ('Number of characters to be output?');
  READLN (OutChars);
  REPEAT
    WRITELN ('What order? <2-', MaxPat,'>');
    READLN (PatLength)
  UNTIL (PatLength IN [2..MaxPat]);
  PatLength := PatLength - 1;
  WRITELN ('Name of input file?');
  READLN (InFile);
  ASSIGN (f, InFile);
  RESET (f);
  WRITELN ('Prose or Verse? <p/v>');
  READLN (Response);
  IF (Response = 'V') OR (Response = 'v') THEN
    Verse := true
  ELSE Verse := false
END; {Procedure InParams}

PROCEDURE ClearFreq;
(*  FreqArray is indexed by 93 probable ASCII characters,            *)
(*  from " " to "|". Its elements are all set to zero.               *)
VAR
  ch : CHAR;
BEGIN
  FOR ch := ' ' TO '|' DO
    FreqArray[ch] := 0
END; {Procedure ClearFreq}

PROCEDURE NullArrays;
(* Fill BigArray and Pattern with nulls *)
VAR
  j : INTEGER;
BEGIN
  FOR j := 1 TO ArraySize DO
    BigArray[j] := CHR(0);
  FOR j := 1 TO MaxPat DO
    Pattern[j] := CHR(0)
END; {Procedure NullArrays}

PROCEDURE FillArray;
(*    Moves textfile from disk into BigArray, cleaning it            *)
(*    up and reducing any run of blanks to one blank.                *)
(*    Then copies to end of array a string of its opening            *)
(*    characters as long as the Pattern, in effect wrapping          *)
(*    the end to the beginning.                                      *)
VAR
  Blank : BOOLEAN;
  ch: CHAR;
  j: INTEGER;

  PROCEDURE Cleanup;
  (* Clears Carriage Returns, Linefeeds, and Tabs out of            *)
  (* input stream. All are changed to blanks.                       *)
  BEGIN
    IF ((ch = CHR(13))     {CR}
      OR (ch = CHR(10))   {LF}
      OR (ch = CHR(9)))  {TAB}
    THEN ch := ' '
  END;

BEGIN {Procedure FillArray}
  j := 1;
  Blank := false;
  WHILE (NOT EOF(f)) AND ( j <= (ArraySize - MaxPat)) DO
  BEGIN {While Not EOF}
    READ (f,ch);
    Cleanup;
    BigArray[j] := ch;                    {Place character in BigArray}
    IF ch = '' THEN Blank := true;
    j := j + 1;
    WHILE (Blank AND (NOT EOF(f))
      AND (j <= (ArraySize - MaxPat))) DO
    BEGIN {While Blank}                    {When a blank has just been}
      READ (f,ch);                            {printed, Blank is true,}
      Cleanup;                      {so succeeding blanks are skipped,}
      IF ch <> '' THEN                            {thus stopping runs.}
      BEGIN {If}
        Blank := false;
        BigArray[j] := ch;                 {To BigArray if not a Blank}
        j := j + 1
      END {If}
    END {While Blank}
  END; {While Not EOF}
  TotalChars := j - 1;
  IF BigArray[TotalChars] <> '' THEN
  BEGIN   {If no Blank at end of text, append one}
    TotalChars := TotalChars + 1;
    BigArray[TotalChars] := ' '
  END;
  {Copy front of array to back to simulate wraparound.}
  FOR j := 1 TO PatLength DO
    BigArray[TotalChars + j] := BigArray[j];
  TotalChars := TotalChars + PatLength;
  WRITELN('Characters read, plus wraparound = ',TotalChars:4)
END; {Procedure FillArray}

PROCEDURE FirstPattern;
(* User selects "order" of operation, an integer, n, in the          *)
(* range 1 .. 9. The input text will henceforth be scanned           *)
(* in n-sized chunks. The first n-1 characters of the input          *)
(* file are placed in the "Pattern" Array. The Pattern is            *)
(* written at the head of output.                                    *)
VAR
  j : INTEGER;
BEGIN
  FOR j := 1 TO PatLength DO           {Put opening chars into Pattern}
    Pattern[j] := BigArray[j];
  CharCount := PatLength;
  NearEnd := false;
  IF Verse THEN WRITE (' ');   {Align first line}
  FOR j := 1 TO PatLength DO
    WRITE (Pattern[j])
  END; {Procedure FirstPattern}

PROCEDURE InitSkip;
(*   The i-th entry of SkipArray contains the smallest index         *)
(*   j > i such that BigArray[j] = BigArray[i]. Thus SkipArray       *)
(*   links together all identical characters in BigArray.            *)
(*   StartSkip contains the index of the first occurrence of         *)
(*   each character. These two arrays are used to skip the           *)
(*   matching routine through the text, stopping only at             *)
(*   locations whose character matches the first character           *)
(*   in Pattern.                                                     *)
VAR
  ch : CHAR;
  j : INTEGER;
BEGIN
  FOR ch := ' ' TO '|' DO
    StartSkip[ch] := TotalChars + 1;
  FOR j := TotalChars DOWNTO 1 DO
  BEGIN
    ch := BigArray[j];
    SkipArray[j] := StartSkip[ch];
    StartSkip[ch] := j
  END
END; {Procedure InitSkip}

PROCEDURE Match;
(*   Checks BigArray for strings that match Pattern; for each        *)
(*   match found, notes following character and increments its       *)
(*   count in FreqArray. Position for first trial comes from         *)
(*   StartSkip; thereafter positions are taken from SkipArray.       *)
(*   Thus no sequence is checked unless its first character is       *)
(*   already known to match first character of Pattern.              *)
VAR
  i : INTEGER;     {one location before start of the match in BigArray}
  j : INTEGER; {index into Pattern}
  Found : BOOLEAN;      {true if there is a match from i+1 to i+j - 1 }
  ch1 : CHAR;       {the first character in Pattern; used for skipping}
  NxtCh : CHAR;
BEGIN {Procedure Match}
  ch1 := Pattern[1];
  i := StartSkip[ch1] - 1;         {is is 1 to left of the Match start}
  WHILE (i <= TotalChars - PatLength - 1) DO
  BEGIN {While}
    j := 1;
    Found := true;
    WHILE (Found AND (j <= PatLength)) DO
      IF BigArray[i+j] <> Pattern[j]
        THEN Found := false   {Go thru Pattern til Match fails}
        ELSE j := j + 1;
      IF Found THEN
      BEGIN            {Note next char and increment FreqArray}
        NxtCh := BigArray[i + PatLength + 1];
        FreqArray[NxtCh] := FreqArray[NxtCh] + 1
      END;
      i := SkipArray[i + 1] - 1  {Skip to next matching position}
    END {While}
  END; {Procedure Match}

PROCEDURE WriteCharacter;
(*   The next character is written. It is chosen at Random           *)
(*   from characters accumulated in FreqArray during last            *)
(*   scan of input. Output lines will average 50 character           *)
(*   in length. If "Verse" option has been selected, a new           *)
(*   line will commence after any word that ends with "|" in         *)
(*   input file. Thereafter lines will be indented until             *)
(*   the 50-character average has been made up.                      *)
VAR
  Counter, Total, Toss : INTEGER;
  ch : CHAR;
BEGIN
  Total := 0;
  FOR ch := ' ' TO '|' DO
  Total := Total + FreqArray[ch]; {Sum counts in FreqArray}
  Toss := TRUNC (Total * Random(Seed)) + 1;
  Counter := 31;
  REPEAT
    Counter := Counter + 1;                         {We begin with ' '}
    Toss := Toss - FreqArray[CHR(Counter)]
    until Toss <= 0;                                   {Char chosen by}
  NewChar := CHR(Counter);                    {successive subtractions}
  IF NewChar <> '|' THEN
    WRITE (NewChar);
  CharCount := CharCount + 1;
  IF CharCount MOD 50 = 0 THEN NearEnd := true;
  IF ((Verse) AND (NewChar = '|')) THEN WRITELN;
  IF ((NearEnd) AND (NewChar = ' ')) THEN
  BEGIN {If NearEnd}
    WRITELN;
    IF Verse THEN WRITE ('     ');
    NearEnd := false
  END {If NearEnd}
END; {Procedure Write Character}

PROCEDURE NewPattern;
(*   This removes the first character of the Pattern and             *)
(*   appends the character just printed. FreqArray is                *)
(*   zeroed in preparation for a new scan.                           *)
VAR
  j : INTEGER;
BEGIN
  FOR j := 1 to PatLength - 1 DO
    Pattern[j] := Pattern[j + 1];             {Move all chars leftward}
  Pattern[PatLength] := NewChar;                       {Append NewChar}
  ClearFreq
END; {Procedure NewPattern}

BEGIN {Main Program}
  ClearFreq;
  NullArrays;
  InParams;
  FillArray;
  FirstPattern;
  InitSkip;
  REPEAT
    Match;
    WriteCharacter;
    NewPattern
  UNTIL CharCount >= OutChars;
END. {Main Program}

Comments

  • edited February 2020

    I had to add those two \n's at the top of the code block to make the line numbers in this forum match what was printed in the magazine. For some reason the forum here kept insisting on representing two blank lines as <br/>'s

    I think I've transcribed pretty carefully, but there may be errors in what I've typed. Please let me know if you find any (and I've already caught one since posting it here.)

    Also, the code is printed in Helvetica in the magazine so, because that's not a monospace font, my representation of the spaces and indentation may not be exact.

    Finally, I seem to remember that when I first tried out this transcribed code, I encountered an error that turned out to be a typo in the the published code. I can't find that now, though, so perhaps I'm mistaken. If you do see something like that, let me know!

    ETA: Also the syntax highlighting has gone all wonky here. It may not be reliably interpreting Pascal.

  • @zachwhalen said:
    Since the term "travesty" is pretty dramatic, is there evidence in what it's doing that validates that idea of what it's doing to the text?

    I suppose it depends on how we characterize a "travesty." If we use it in the more light-hearted burlesque drama sense of "an adaptation with incongruous language" then the "incongruous" part certainly seems to fit. The etymology from "dressed up" and "disguised" (travesti) is also interesting: One of the central preoccupations of text generators and chatbots as cultural interventions has always been about their outputs passing or not passing... so it is an interesting twist to see the article illustrated (caption: "with apologies to James Joyce and Henry James") by two authors lightly disguised and then unmasked as each other. Interface indeed.

    I was also interested in this part of the article, on the optimized version of the code based on Shannon's text-as-lookup-table:

    After molasses and raindrops, this was a torrent! The new version flew like a bat out of hell and got nicknamed Hellbat. p468

    I went ahead and forked the repo, added a Hellbat branch and added a rough draft of the code and conversion notes for creating that variant. I haven't gone ahead and applied them to creating an actual hellbat.pas yet, but it might be interesting -- particularly given that this is a second version of the code that is only being distributed in the artiicle conceptually as a procedure or diff rather than a transcript.

  • The article itself was fantastic - a very nice write-up of various attempts to improve the algorithm given the constraints of the machines of the day.

    This may be of interest, by way of contrast. I've thrown a gist together that uses two approaches that would've been beyond the reach of machines the authors had access to.

    The first uses the approach they describe as unworkably slow - it just scans through the whole text to produce each successive character, building a frequency table as described in the article. Over the entirety of the text of Ulysses*, that runs at about two characters per second.

    The second builds a sparse table of every combination of n-gram prefixes, mapping each to a frequency table of succeeding characters. There's a small up-front cost for preprocessing, but then the whole thing runs blazingly fast.

    It goes to show how the constraints of yesteryear are turned completely on their head. Does that matter? Is the concern with the output of those algorithms (or their equivalent), or with the effort required to express them at the time?

    * I made no attempt to split the source back up into its eighteen constituent parts. Newlines were removed; the whole converted to lower-case before processing.

  • To respond to @zachwhalen and agree with @jeremydouglass, when I think of a travesty generator I do associate it with an over-the-top, outrageous performance. And I think this sits very comfortably with the 'fake research paper' genre which seems to be routinely used to scandalize and make an example of conferences and journals suspected to have low standards, or to call-out predatory conferences which attempt to pass as legitimate ones.

    One fairly involved travesty generated fake paper performance which I have heard of is the one put on by Jeremy Stribling, Daniel Aguayo and Maxwell Krohn, who are the creators of SCIgen. It's "...a program that generates random Computer Science research papers, including graphs, figures, and citations. It uses a hand-written context-free grammar to form all elements of the papers. ...[its aim] is to maximize amusement, rather than coherence. One useful purpose for such a program is to auto-generate submissions to conferences that you suspect might have very low submission standards"
    https://pdos.csail.mit.edu/archive/scigen/#talks

    Surprisingly, I wasn't able to find references to the BYTE article on the SCIgen webpage.

    A github repository for SCIgen exists, and looks like it's written in perl: https://github.com/strib/scigen

    The creators of SCIgen used it to generate a paper which was accepted to a predatory conference, but it was later declined because of the bad publicity that the generated paper had created. The creators were later able to raise money to book a space across the hallway from the predatory conference, and on the same day hosted a session where 3 randomly generated talks were presented. Photos and a video documenting the session show that the generated talk speakers/creators were using false names and were wearing novelty disguise or costume items such as wigs, although it is not explained why this is done.

Sign In or Register to comment.