Salesforce, Python, SQL, & other ways to put your data where you need it

Purging my Twitter likes with Selenium and Python

04 Feb 2020 🔖 selenium python

I wanted to hit the “reset” button on Twitter a bit without deleting my account, choosing a new name, rebuilding all the follows, etc. (If only I could bring myself to do this with physical stuff. Is there Selenium for the real world?)

Background

I’m proud of the ideas I stand for, but I always meant to treat Twitter the way people treat LinkedIn: as “professional social media.”

Unfortunately, as in the real world, it can be hard to keep your professional life 100% isolated from the world at large – especially when it walks right up and stares you in the face “at work.” Most people use Twitter to talk about real-world issues. Those of us in tech who use it as a “cool LinkedIn” are the outliers.

And besides, humans shouldn’t completely separate their “work selves” from their “true selves.” It can cause stress to the person keeping their thoughts “hidden.” It can make a person afraid of “rocking the boat” and keep them from being a much-needed ally to others in the workplace itself.

I haven’t fully figured out how to handle that dynamic.

But in the meantime, I decided maybe I’d just semi-“start over,” as if I’d just joined Twitter this month, and see how I feel like interacting with the platform now that I’ve been on it for a while.

(It’s not like ancient “likes” really do much to signal-boost worthy tweets, anyway.)

Selenium vs. API

I didn’t really want to tell Twitter my phone number, so getting an API account was out of the question.

So I wrote a Python script leveraging Selenium to “start fresh” for me.

Import

Here are all the import statements:

import time
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import StaleElementReferenceException
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By

There might be a few more there than I actually used – I copied/pasted from older scripts.

Web Browser, Login, and Constants

I fired up a new browser window by running this code, then commented it out so I wouldn’t accidentally run it again:

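homepageLoginURL = 'https://www.twitter.com/login'  # Twitter's login page
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--incognito")  # Fresh session with no saved cookies
# "options=" replaces the deprecated "chrome_options=" keyword
browser = Chrome(options=chrome_options, executable_path='C:\\Users\\gump1149\\Documents\\Program Files\\ChromeDriver\\chromedriver79.exe')
browser.get(homepageLoginURL)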

I used that browser window to log into https://www.twitter.com/login.

Then I set up the following variables:

userName = 'MY_USERNAME'
likesURL = 'https://twitter.com/' + userName + '/likes'

Thoughts

I never got the automation code “perfect.” The DOM of Twitter feed web pages seems to be highly unstable, changing even as you click on things. I started with a pretty small unlike_article() definition and ended up with a massive one because I simply kept adding “Oh well, will deal with this tweet later” exception handling each time I got an error.
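The “will deal with this tweet later” idea boils down to something like the sketch below. It is not the real unlike_article() code, just an illustration of the pattern: process_all, action, and skippable are made-up names, and the exception types to skip are left as a parameter.

```python
def process_all(items, action, skippable=(Exception,)):
    """Apply action to each item, setting aside any item that raises a skippable error.

    Returns (done, deferred): the number of successes, and a list of
    (item, error) pairs to come back to later.
    """
    done = 0
    deferred = []
    for item in items:
        try:
            action(item)
            done += 1
        except skippable as err:
            # "Oh well, will deal with this tweet later."
            deferred.append((item, err))
    return done, deferred
```

With Selenium, skippable would presumably be a tuple such as (NoSuchElementException, StaleElementReferenceException, TimeoutException) from the imports above, so that one flaky tweet doesn't stop the whole pass.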

My approach to un-liking all my old likes (except the ones that mention me explicitly, which I wanted to go through by hand) was to simply run obliterate_likes_page() over and over again. I included browser.get(likesURL) before it if I wanted to “start over with what remained,” and commented that line out if the page had merely gotten “stuck” while scrolling and I wanted to pick up where I left off.
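Instead of commenting a line in and out, the same “reload or resume” choice could be a flag. This is only a sketch: run_passes, passes, and reload are hypothetical names, and one_pass stands in for obliterate_likes_page, which is assumed here to accept the browser as its only argument.

```python
def run_passes(browser, likes_url, one_pass, passes=1, reload=True):
    """Run one_pass(browser) a number of times.

    reload=True  -> load likes_url before each pass ("start over with what remained")
    reload=False -> leave the page as-is (resume where a "stuck" scroll left off)
    """
    for _ in range(passes):
        if reload:
            browser.get(likes_url)
        one_pass(browser)
```

Something like run_passes(browser, likesURL, obliterate_likes_page, passes=5) would then replace five manual re-runs, and reload=False covers the “merely stuck” case.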

Note that Twitter un-liking seems to be a bit asynchronous – it can take some minutes before a tweet you un-liked stops showing up on your “likes” feed.

Also, you have to “re-like” and then “un-like” older tweets to get them to stop showing up on your “likes” feed.

I suppose the proof of my work will come in a month when I re-export my Twitter data and see if likes.js is actually smaller.
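Comparing two exports doesn't have to mean eyeballing file sizes. Here is a rough sketch for counting the entries instead, assuming the export file is a JavaScript assignment whose right-hand side is a plain JSON array (that layout is an assumption, as is the count_likes name).

```python
import json


def count_likes(path):
    """Count the entries in a Twitter-export likes file.

    Assumes the file looks like `some.variable = [ {...}, {...} ]`,
    i.e. everything from the first "[" to the last "]" is valid JSON.
    """
    with open(path, encoding="utf-8") as f:
        text = f.read()
    start = text.index("[")       # skip the JavaScript assignment prefix
    end = text.rindex("]") + 1    # ignore anything after the array
    return len(json.loads(text[start:end]))
```

Running it on the old and new likes.js and comparing the two numbers would show whether the purge actually stuck.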

Fetch a list of currently-visible tweets

Each tweet is wrapped in an HTML <article> element, so that’s the first thing I use Selenium to grab.

Here’s a function called get_list_visible_articles() that is meant to be called after browser already has likesURL loaded:

def get_list_visible_articles(browser):
    # Wait up to 20 seconds for the "likes" page to be fully loaded
    waitinit20 = WebDriverWait(browser, 20)
    likes_xpath = "//div[contains(@aria-label, 's liked Tweets')]"
    waitinit20.until(EC.presence_of_element_located((By.XPATH, likes_xpath)))
    liked_tweets_outermost_div = browser.find_element(By.XPATH, likes_xpath)
    # Leading "." keeps the search inside the likes div; a bare "//article" would search the whole page
    articles_list = liked_tweets_outermost_div.find_elements(By.XPATH, ".//article")
    return articles_list
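One XPath detail matters when searching inside an element: in Selenium, a path that starts with // searches from the document root even when called on an element, while .// stays within that element's descendants. The sketch below shows the scoped form using Python's built-in ElementTree as a stand-in for a live browser. The markup and id values are invented for the demo, and ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# A toy page: two liked tweets, plus an unrelated <article> elsewhere
page = ET.fromstring(
    "<body>"
    "<div id='likes'><article>liked 1</article><article>liked 2</article></div>"
    "<div id='sidebar'><article>promoted</article></div>"
    "</body>"
)

likes_div = page.find(".//div[@id='likes']")

# ".//article" called on likes_div only sees that div's descendants
scoped = likes_div.findall(".//article")

# The same path called on the page root sees every <article>
everything = page.findall(".//article")

print(len(scoped), len(everything))
```

The scoped search returns only the two liked tweets, which is the behavior you'd want from a function meant to fetch articles inside the “liked Tweets” container.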