02/03/2022 An RNN Trump replica
I came across an interesting application of Recurrent Neural Networks (RNNs): using character-based prediction to generate text that matches the style of the training text. Luckily, there is also a TensorFlow tutorial on how to implement such a magical model.
The tutorial feeds the RNN a collection of Shakespeare's work containing about a million characters. After a few minutes of training, it gave humorous results, some of which I quote here:
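To make "character-based prediction" concrete, here is a minimal sketch of how text is typically prepared for such a model: every distinct character gets an integer id, and each training example pairs a chunk of text with the same chunk shifted one character ahead. The sample string and the names `char_to_id`, `encode`, `decode` and `split_input_target` are my own illustration and should not be read as the tutorial's exact code.

```python
# Toy sample; a real run uses a much larger corpus.
sample = "Fear me not, I."

# Vocabulary: every distinct character, sorted for a stable id order.
vocab = sorted(set(sample))
char_to_id = {ch: i for i, ch in enumerate(vocab)}
id_to_char = {i: ch for ch, i in char_to_id.items()}

def encode(text):
    """Map a string to a list of integer character ids."""
    return [char_to_id[ch] for ch in text]

def decode(ids):
    """Map a list of character ids back to a string."""
    return "".join(id_to_char[i] for i in ids)

def split_input_target(ids):
    """Input is everything but the last id; target is the input shifted by one."""
    return ids[:-1], ids[1:]

inputs, targets = split_input_target(encode(sample))
print(decode(inputs))   # what the model sees
print(decode(targets))  # what it should predict at each position
```

At every position the model is asked one question: given the characters so far, which character comes next?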
ROMEO:
Fear me not, I.
QUEEN MARGARET:
Peace, veing Cominius, thou didst speak again.
QUEEN MARGARET:
The Lord Has it light love for none;
Unless a bring that shake I might be faint.
CLARENCE:
I fear you deport, and will have you judge
What's in him hither strike upon a crown.
TRANIO:
But sons a vision so break of a god to-do,
Be yet repeal'd; so with a villain, things for the
dearer.
COMINIUS:
Call that your grace shall I be?
Admittedly, these texts are easy to identify as not being Shakespeare's work, but the pure comedic value of this machine gave me some motivation to investigate it further.
'Donald Trump' is the name that comes to mind when I think about comedy, with his brutal and yet strangely humorous tweets bombing everyone he doesn't like, so let it be! Luckily, I'm not the first person to think of generating Trump tweets: someone has created a collection of all Trump tweets and generously shared it on GitHub. This will be very fun.
Let's feed this into the RNN and see what happens...
Democrats when he could not named as honor from nouline for ever time, Fake News judical for… https://t.co/VTeTC6bGwX",,Twitter for iPhone,18155,87326,False
1322333521660875744,Donald J. Trump,"Washington, DC",45th President of the United States of America🇺🇸,2009-03-18 13:46:38,88626622,51,6,True,2020-12-01 23:01:32,"Joe Biden is a corrupt politician and Texas!
https://t.co/fMPcTiERS0,['LSDCY will be going to the gives of Ohiota! I’m GREAT and… https://t.co/9EijGQiSDC",,Twitter for iPhone,38162,172720,False
1338544242865025682,Donald J. Trump,"Washington, DC",45th President of the United States of America🇺🇸,2009-03-18 13:46:38,85638946,50,4,True,2020-08-22 19:43:34,"Ho will bring & Scoter frien drugs, including my seav on Texas. We will some millions… https://t.co/eqAyQYvV8uC",,Twitter for iPhone,13087,92244,False
1322752355811031772,Donald J. Trump,"Washington, DC",45th President of the United States of America🇺🇸,2009-03-18 13:46:38,84533179,50,4,True,2020-07-29 21:00:38,Hard in Mail-In
The result seems to be a messy combination of mostly Twitter formatting. However, if you look carefully enough, there are some very Trump things being said:
LSDCY will be going to the gives of Ohiota! I’m GREAT and…
Joe Biden is a corrupt politician and Texas!
And I have to say, it's not optimal, but it is quite funny. Let's have a look at the CSV file and see if there's any processing we can do to tidy it up a bit:
id,user_name,user_location,user_description,user_created,user_followers,user_friends,user_favourites,user_verified,date,text,hashtags,source,retweets,favorites,is_retweet
1285299379746811915,Donald J. Trump,"Washington, DC",45th President of the United States of America🇺🇸,2009-03-18 13:46:38,84262854,50,4,True,2020-07-20 19:43:46,"We are United in our effort to defeat the Invisible China Virus, and many people say that it is Patriotic to wear a… https://t.co/zcdVQe2vRn",,Twitter for iPhone,57356,323246,False
1285196013020610562,Donald J. Trump,"Washington, DC",45th President of the United States of America🇺🇸,2009-03-18 13:46:38,84262854,50,4,True,2020-07-20 12:53:01,"Congratulations Allen, great job! https://t.co/jjeUk1SwGm",,Twitter for iPhone,20850,107710,False
The CSV file is a table with fields like id, user_name, user_location, etc., which we don't need when generating Trump tweets. I mean, those fields are practically the same in every row.
The easiest way to take only the text is to load the file into a pandas DataFrame, keep only the text column, and join its entries into a single string.
import pandas as pd
tweets = pd.read_csv('https://raw.githubusercontent.com/gabrielpreda/trump-tweets/master/trump_tweets.csv')
text = ' '.join(tweets['text'])
Let's start the generation with the keyword "Democrat" again and see what comes of it this time:
Democrat how cin democr h… https://t.co/aFcmlkiUFI Belease Militars, Presidente Eurence to couth Kemman was today. Heritication off the GREAT 41% that happen totially, so our Country, you’re there that is us a trilitedre… https://t.co/n2S8zVXt9G Are, https://t.co/8WpcvQTkie I’s recess suikest people thinger. I with they hid?) @FoxNews The ABChtten Eeports and Sespreamed of the hon… https://t.co/y8bxbKlgAk Lest watch his sudfere & Trump on any ushed voted. Jastes a very denate is doing voters – and saf to say the geet who get you help free. Vote for our it fon absz’ned? “The will be a misisMall for Firen Pesporsey diffts to said https://t.co/0dm48klcR8 Biggest Secure our Georgia. Bor dy aspermont by @NBCNews. Textan with the @nytimit States are crazy adf tAppGREALL THE ELECTION BEATE!!! Today that they are say th… https://t.co/dxApgtW0zv Biden is a 33r LEl ALL OFS, JOE BUSCNOSEVEST! https://t.co/vnWvunDftc The “Chinming, and pailted of the U.S.… https://t.co/v86ynxVZ6Y In the people! https:
Other than a brief mention of @FoxNews and some vaguely sensible sentences like "Jastes a very denate is doing voters – and saf to say the geet who get you help free.", the quality seems to have gotten worse. How could that be?
Maybe it's because of the way the training set is divided; maybe some paragraph breaks will help. So let's do it:
text = '\n'.join(tweets['text'])
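The difference between the two joins is easiest to see on a tiny example. The two sample tweets below are taken from the CSV rows shown earlier (shortened, with links dropped), and the variable names are just for illustration. The sketch only shows what the model gets to see; whether that explains the drop in quality is still a guess.

```python
# Two shortened sample tweets; assumed to contain no newlines themselves.
sample_tweets = [
    "Congratulations Allen, great job!",
    "We are United in our effort to defeat the Invisible China Virus",
]

joined_with_space = " ".join(sample_tweets)
joined_with_newline = "\n".join(sample_tweets)

# With a space, nothing marks where one tweet ends and the next begins.
print(repr(joined_with_space))

# With a newline, every tweet boundary is a character the model can learn.
print(repr(joined_with_newline))

# The newline version can be split back into the original tweets.
recovered = joined_with_newline.split("\n")
print(recovered == sample_tweets)
```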
And while I'm at it, the links the RNN generated are not particularly useful, as they just look like Twitter links but don't actually open, so we can get rid of the links in the training set to make things a bit cleaner.
This can be done the Pythonic way. The batteries-included module 're' does a perfect job with its function 'sub':
import re
text = re.sub(r'https?://\S+', '', text)
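As a quick check of what that pattern does, here it is applied to two tweets from the dataset. The helper name `strip_links` is made up for this sketch, and the extra step that trims the space left behind at the end of each line is my own addition, not part of the one-liner above.

```python
import re

# Same pattern as above: "http://" or "https://" followed by non-whitespace.
URL_PATTERN = re.compile(r"https?://\S+")

def strip_links(text):
    """Remove links, then trim the trailing spaces they leave on each line."""
    without_links = URL_PATTERN.sub("", text)
    return re.sub(r"[ \t]+$", "", without_links, flags=re.MULTILINE)

sample = (
    "Congratulations Allen, great job! https://t.co/jjeUk1SwGm\n"
    "many people say that it is Patriotic to wear a… https://t.co/zcdVQe2vRn"
)
print(strip_links(sample))
```

Truncated tweets still end in "…", which is why the generated text keeps that habit even after the links are gone.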
Now it's all set. Let's run it again and see what happens:
Democrats at 3:3% on Arizona... A dossible polling that will be the lost of the Presidenty will soou. Good Drug Count has been a disaster Biden, Crooks the Great Pountrican. Biggnt reloketting rempallevement “pootler poorcer a cance come dowh tho shorts out and VOTE and but other picclaw (Headiana for Elvering Sta…
Bod ExchropBDemmed Voter, Trump Nouch bet a STRONGEP BEK USA AMERICAND SPREMONY THANTY YOU TISTING NABHSTAVE AMERICA GREAT USD. AF SELTURE!
Presty Conored Biden his Republicans sign has noted on sardirg the extromy of 63%, inclaid on Democrats are…
Bid naming in that is doing and thi…
Peorgia: @O*LN’vomine.) Joe Biden me for fail. Thank you @VorChight for Bid Rup Well Tonster, that’s who helped for FLAR would ackoo, while apternoul nucterdows it. Big cump to courd colluce.
THANK YOU MICHiggating Ohilion, @NYCAMDF HIGT, lovely needs? All to through the vide Medic Dayen has to Dee. Step his fraj. Thank…
Tomelor so hin JimUR of Country, Faxal Earlith Media!
Shoting @Brieathorman
This comedy is of a satisfactory level :)