’57 Classics vs Burstbucker Pros: A Pickup Comparison

Every single one of Gibson’s current offering of humbuckers seems to have the mission of recreating the sound of the legendary PAFs of old. Simple task, one would think, no? Get the specification of those, and adhere to it? Ohhhhh no. Turns out that this is basically impossible, because it seems that the golden-era PAFs were lashed together “by feel” more than adhering to any particular design spec, and so varied widely from pair to pair in their tonal characteristics. In particular, the amount of wire wound onto the two bobbins was only approximately measured out, resulting in asymmetry between the two coils of the humbucker, the degree of which affects something called the “sonic bite”.

In an effort to capture this range, or at least a few discrete points within it, Gibson now offers countless variations and twists on the original design, the descriptions of which tend to include “but with a slightly more…”. Two of the most popular of these variants, appearing on stock ES-335s and Les Paul Standards respectively, are ’57 Classics and Burstbucker Pros. These are similar insofar as they’re both based on the PAF design at heart, but with the following divergences:

’57 Classics

  • Made to the exact same specs as the original PAFs
  • The two coils of the humbucker are balanced, which is to say, each coil has the same number of windings
  • The magnet itself is an Alnico II

Burstbucker Pros

  • The two coils are unbalanced, by a calculated degree to give rise to a little more attack
  • The magnet itself is an Alnico V, which is the same sort of thing as the Alnico II featured in the ’57 Classic, only a little stronger

The Difference: In Words

Attempting to verbally describe the difference in sound which these differences give rise to is a hilarious business. You don’t have to look far* to find descriptions such as the following, which can only be described as poetry:

  • “a bit woodier, honkier, and more hair around the edges”
  • “ultimately, they are PAF’s with a little sumthin’ sumthin'”
  • “the bass and mids just jumps out and kicks you right in the nuts!”

* Source: the interwebz.

The Difference: In Sounds

Translating those into a sound in my head is beyond my inter-sensory capabilities, so let’s have a look and listen to these two clowns comparing guitars featuring the respective types of pickup:

The Sexiest Interracial Pairings, Based on Porn Consumption

It’s wonderful what you can get away with in the interests of science. (And it’s also true that anything you do is bound to be offensive/distasteful to somebody, so trying to avoid offending anyone is a fool’s game.)

Anything is fair material for examination. And what more interesting thing to try to describe and quantify than human sexuality?

It’s many-faceted, there are patterns, there are overlaps, it’s fairly close to the fundamentals of human behaviour rather than a derived thing in its own right… In other words, a prime sort of thing to apply a cheeky bit of Python to. Of course, you need some vast online repository of sexual material. The good news here is that, when I looked for one, I was surprisingly successful. Turns out there’s this thing called porn. Sorted on the “sufficient data” front, then…

Of course, being rather broad, trying to capture everything would be the sort of thing which required chapters, so I decided to narrow down to a certain category which seems to titillate some people, which is the old “interracial” category.

By crawling the “Interracial” section of what I’m told is a well-known porn site¹, it’s fairly straightforward to parse 1,000 pages of video titles (encompassing ~45,000 individual videos) and aggregate the views for each, ah, combination…

¹ Full code provided below…

Most Viewed Pairings

Wow, that “black – white” is really big… Let’s smooth the extreme range of this graph by taking \log_{10} (Views)

Well ain’t that interesting? The top 10 most popular combinations based on views alone are:

  1. black – white
  2. ebony – white
  3. asian – black
  4. asian – white
  5. indian – white
  6. black – latin
  7. black – japan
  8. black – indian
  9. arab – white
  10. arab – black

(If you struggle to read the x-values from the chart because it’s cramped, you should be able to click through and get some sweet, sweet tooltips.)

My immediate thought here is that at least some of this effect is going to be down to biases in the production rate of these various pairings, as well as the inherent “consumability” of them, so we’ll divide each count of views by the number of individual videos within that category to get a “views per video” view…
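A minimal sketch of that normalisation, assuming each pairing’s total views and video count have already been tallied. The scraper further down only keeps views and title text, so the `videos` count, the function name and the numbers here are hypothetical placeholders rather than real results.

```python
def views_per_video(totals):
    """Return {label: total views / number of videos} for each label."""
    return {
        label: stats["views"] / stats["videos"]
        for label, stats in totals.items()
        if stats["videos"] > 0  # skip empty categories rather than divide by zero
    }


# Hypothetical placeholder tallies, purely to show the shape of the data.
example_totals = {
    "pairing a": {"views": 1000000, "videos": 500},
    "pairing b": {"views": 90000, "videos": 10},
}

normalised = views_per_video(example_totals)
# Print the categories from most to least viewed per video.
for label in sorted(normalised, key=normalised.get, reverse=True):
    print(label, round(normalised[label]))
```

The point of the example is that a category with far fewer total views can still come out on top once you divide by how many videos it contains.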

Normalised Views


These two graphs are different. The most-consumable categories on a “per video” basis are not the most produced. You can see that by overlaying the two:

If you put out one video featuring a black – Indian couple, you will get nearly 4 times as many views as if you produce one featuring a black – white couple, and even more if you go for Japan – Thai! If I were a porn producer, and I often wish I were, that’s where I’d be pumping my money…

(We’ll overlook the Asian – Japan combination as I suspect they might be two sides of the same coin and I could have misinterpreted the coincidence of the two words in titles.)

Bonus Filth

All of the above just looks for a certain combination of words in the titles of videos, but there’s a whole lot else being said in those titles, and that’s just begging to be compared. Here’s how it looks for those top 10; the size of the word represents its relative popularity in titles for that combination…

1. Black – White

2. Ebony – White

3. Asian – Black

4. Asian – White

5. Indian – White

6. Black – Latin

7. Black – Japan

8. Black – Indian

9. Arab – White

10. Arab – Black

I’d attempt a commentary, but I’m blushing.


There are efficiencies which can be made to this, but I was bashing it out after a few beers and under the disapproving eye of the gf, so once it worked, that seemed a good time to stop.

from urllib.request import urlopen
from bs4 import BeautifulSoup
from wordcloud import WordCloud

# NB: titles are lower-cased before matching, so the upper-case "BBC" entry never matches
races = ["black", "latin", "white", "asian", "BBC", "indian", "ebony", "japan", "thai", "mexican", "european", "czech", "arab"]

raceCombos = {}

vidCount = 0

x = 1
while x <= 1000:
	rawPage = urlopen("https://xhamster.com/channels/new-interracial-"+str(x)+".html").read()
	soup = BeautifulSoup(rawPage, "html.parser")
	vids = soup.find_all("div", class_="video")
	for vid in vids:
		mentions = []
		title = vid.find("u").contents[0].lower()
		views = int(vid.find("div", class_="views-value").contents[0].replace(",",""))
		# Collect every keyword that appears in this title
		for race in races:
			if race in str(title):
				mentions.append(race)
		if len(mentions) == 2: #Aint considering no group stuff fam
			mentions = sorted(mentions) #So black - white aggregates in line with white - black
			race1 = mentions[0]
			race2 = mentions[1]
			combo = race1+" - "+race2
			if combo in raceCombos:
				raceCombos[combo]["views"] += int(views)
				raceCombos[combo]["text"] += " "+title
			else:
				# First time we've seen this combination
				raceCombos[combo] = {}
				raceCombos[combo]["views"] = int(views)
				raceCombos[combo]["text"] = title
		vidCount += 1
	print("Done "+str(x)+" pages, "+str(vidCount)+" vids...")
	x += 1

for combo in raceCombos:
	print(combo, raceCombos[combo]["views"])
	text = raceCombos[combo]["text"]
	for word in combo.split(" - "):
		text = text.replace(word, "")
	wordcloud = WordCloud(width = 1200, height = 400).generate(text)

How Sharing a Bed with My Girlfriend Changed My Sleep

It’s not easy to catch some much-needed Zs with someone doing the horizontal version of the Macarena in bed next to you.


And I can tell you from experience, after many months of nights like this, one begins to fear for one’s health and sanity, but it’s a difficult one to broach. Polite requests such as “OMG CAN YOU FUCKIN KEEP STILL FOR A BIT?” inevitably lead to tension, but fortunately data solves arguments, so, delicate tact having failed, my way forward was clear…

As part of a short-lived and laughable health kick, I invested in a Fitbit in spring 2016 and therefore have a presumably reasonably-accurate record of my sleep since that time. In what can only be described as a generous nod, Fitbit allows you access to your own bodily information by way of an API, which facilitated a simple comparison of my sleep quality since we began sharing a bed, to that of a corresponding single-sleeping period beforehand.

At a naive first glance, the evidence was damning:

0.6 hours, or ~35 minutes, less sleep every night. Over the course of a lifetime that equates to 1.9 YEARS less time sleeping, assuming that the sleep deprivation itself didn’t shorten your life, or you weren’t murdered as part of the sleep arguments.
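The lifetime figure can be reproduced with a few lines of Python. The 76-year lifespan is an assumption on my part (it isn’t stated above); it’s simply the value that makes 0.6 hours a night come out at roughly 1.9 years.

```python
HOURS_LOST_PER_NIGHT = 0.6   # from the Fitbit comparison above
ASSUMED_LIFESPAN_YEARS = 76  # assumption: not stated in the post

minutes_lost_per_night = HOURS_LOST_PER_NIGHT * 60
# Each night loses 0.6 of its 24 hours, so the same fraction of a lifetime is lost.
years_lost = ASSUMED_LIFESPAN_YEARS * HOURS_LOST_PER_NIGHT / 24

print(round(minutes_lost_per_night), "minutes a night")
print(round(years_lost, 1), "years over a lifetime")
```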

Makes you think, though: was my previous sleep profile simply shrunk by a linear factor, or was there a more interesting effect going on?

Turns out, since we started sharing a bed, most nights I have the same amount of sleep as before. The change in average has happened because the distribution has squished a bit: I have more nights with few hours’ sleep, and fewer nights with a massive sleep (goodbye lie-ins):

Back in the epoch of solitude, I pretty much always got at least 5.5 hours’ sleep, and often as much as 9.5. As we say in East Lancashire, them were the days.

But now: look at some of those nights! There were a fair few when I got fewer than 5 hours’ sleep, and considering that my employer only starts the working day at 10:00 and therefore I need to be awake at about 08:30, that speaks volumes about when I actually went to bed. Fortunately, one of the things tracked by Fitbit is what it thinks is your “hours in bed”, which presumably is a period when your bodily signals are consistent with being horizontal, still, and quiet. Comparing this metric for the Apart and Together periods yielded the following…

Whoa whoa whoa. What’s going on here? There are several deeply disturbing conclusions to draw:

  1. I’m spending a shit load less time even giving myself a chance to sleep.
  2. I’ve become a shit load less consistent in how much time I spend in bed.
  3. When I was solo bunking, there were days when I would spend as much as 12.5 hours in bed. Did I have deep-seated problems? Almost certainly.
  4. I presume some of the lower extremes are from times when the Fitbit had its battery run out, cease tracking because I was in a weird position, etc etc. Seems consistent in both the Apart and Together samples, which would support this.

So basically, I’m in bed less, and that’s why I sleep less. How does that look?

So most of the actual effect here comes from getting up earlier, with the secondary influencing factor being going to bed later…



The natural progression from this is to ask, so of the time I am even eligible for drifting off, how much of it am I sleeping? The answer is reassuring:


Whilst spending nights with my girlfriend results in less time in a state which Fitbit recognises as “lying in bed ready to sleep”, because we go to bed later and get up earlier, when I am in that state, I sleep better.
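That “how much of my time in bed am I actually asleep” number is just minutes asleep divided by minutes in bed (Fitbit’s `minutesAsleep` and `timeInBed` fields). A minimal sketch of the calculation, with hypothetical function names; the sample nights are made-up placeholders, not my data:

```python
def sleep_efficiency(minutes_asleep, minutes_in_bed):
    """Fraction of time in bed that was actually spent asleep."""
    if minutes_in_bed <= 0:
        raise ValueError("minutes_in_bed must be positive")
    return minutes_asleep / minutes_in_bed


def mean_efficiency(nights):
    """Average efficiency over a list of (minutes_asleep, minutes_in_bed) pairs."""
    ratios = [sleep_efficiency(asleep, in_bed) for asleep, in_bed in nights]
    return sum(ratios) / len(ratios)


# Placeholder nights, for illustration only.
apart = [(420, 500), (450, 520)]
together = [(400, 430), (390, 420)]

print("Apart:", round(mean_efficiency(apart), 3))
print("Together:", round(mean_efficiency(together), 3))
```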


There isn’t really much to the code needed to rive the data down from Fitbit, but for anyone interested, here it is:

import fitbit
import datetime

authd_client = fitbit.Fitbit('client id here', 'client secret here'
	,access_token='you get the idea'
    , refresh_token='i am your biological mother')

outfile = open("output.txt", "a")
outfile.write("Period, Date, Sleep Start, Sleep End, Mins Asleep, Mins Awake, Times Awake, Mins Restless, Times Restless, Minutes Until Sleep, Minutes After Wakeup, Mins in Bed\n")

def writeData(sDate, eDate, period):

	startDate = datetime.datetime.strptime(sDate, "%Y-%m-%d")
	reportingDate = startDate
	endDate = datetime.datetime.strptime(eDate, "%Y-%m-%d")
	while reportingDate <= endDate:
		response = authd_client.sleep(date=reportingDate, user_id=None)
		for thing in response["sleep"]:
			if thing["isMainSleep"] == True:
				start = thing["startTime"]
				end = thing["endTime"]
				asleepTime = thing["minutesAsleep"]

				awakeTime = thing["minutesAwake"]
				awakeNumber = thing["awakeningsCount"]

				restlessTime = thing["restlessDuration"]
				restlessNumber = thing["restlessCount"]

				fallAsleep = thing["minutesToFallAsleep"]
				lieIn = thing["minutesAfterWakeup"]
				inBedTime = thing["timeInBed"]

				# Write one row per main sleep, in the same column order as the header
				row = [period, reportingDate.strftime("%Y-%m-%d"), start, end, asleepTime, awakeTime, awakeNumber, restlessTime, restlessNumber, fallAsleep, lieIn, inBedTime]
				outfile.write(", ".join(str(value) for value in row)+"\n")

		print("Done "+str(reportingDate))
		reportingDate += datetime.timedelta(days=1)

writeData("YYYY-MM-DD start of apart period", "YYYY-MM-DD end of apart period", "Before")
writeData("YYYY-MM-DD start of together period", "YYYY-MM-DD end of together period", "After")

outfile.close()


And the charts were made using the magnificent Pygal.

Oxford Physics Admission Interview 2: Questions and Answers

It’s difficult to concisely say how my first interview left me feeling, but if I were forced to choose a word, I’d go with “bemused”. I’d gotten through one question fairly cleanly (like, I thought?), and one only with a significant amount of assistance from the tutors. Was that good? Was that expected? Having arrived 10 minutes early to the second interview on account of being extremely over-cautious in my estimates of how long it would take me to walk from my room to Tom Quad, there was nothing for it but to agonise over the dilemma whilst listening to the muffled voices in the room. Again I speculated how big the effect of having a good/bad candidate precede me would be.

After an awkward brushing past the previous candidate – it was a different one to before my first interview, so at least they were mixing it up that way – a face poked through the door and informed me that the interviewers would be with me in a minute. That was good. That was another unknown removed.

The setup was much the same as in the first interview: two tutors on one side of a desk upon which was some paper and biros, and me on the other. Once again, there was absolutely zero chit-chat about what I’d written in my personal statement, and question one came out…

Question 1

Imagine that you have two clocks. You can’t tell anything about their inner workings, just watch them tick. You take them both to the moon, and one goes crazy.

This was made no less funny by his saying it in a thick German accent. (I mean, he was German, he wasn’t putting it on just for the question.) I wouldn’t rise to that kind of cheap shot though, and merely nodded my understanding: sure, crazy clock. What of it? The other tutor took over…

Intuitively, what might you put this difference down to?

“GRAVITY?” I exploded, before quickly qualifying it with the more sober “I mean, a less-strong gravitational field?”

Dramatic shifts in seating positions.

What sort of clock might be affected by the gravitational field strength?

At this point, being aware that they were supposed to ask you about AS-Level stuff, I had an idea of where they might be going. I answered that a pendulum clock would.

If I were to tell you that the field strength on the surface of the moon is approximately a sixth of that on the surface of the earth, what would be the difference between the two clocks?

Ha! Rumbled. Fortunately, the formula for the time period of a pendulum clock was fresh in my mind from AS Level physics, and, thanks to that interview, still is to this day:

T=2\pi\sqrt{\frac{l}{g}}

… where l is the pendulum length and g is the field strength. In trying to compare the behaviour of the two clocks, I wrote two versions for earth-clock and moon-clock:

T_E=2\pi\sqrt{\frac{l}{g_E}}

T_M=2\pi\sqrt{\frac{l}{g_M}}

And the interviewer had just told me that g_M=\frac{1}{6}g_E, so I could write

T_M=2\pi\sqrt{\frac{6l}{g_E}}

… or, to form a more direct comparison I suppose,

T_M=\sqrt{6}\,T_E

There was, again, that awkward period where I felt I’d gotten to an answer and the tutors just stared at me, so I vocalised the result I’d just derived with “the time period of the clock on the moon is root six times the time period of the one on the earth.”

And root six is about, what?

WHAT THE HELL KIND OF QUESTION IS THAT? Looking back, I suppose they were trying to see whether I thought that was a realistic answer or that alarm bells should be ringing, but at the time all I could think was “I’m being asked what the square root of 6 is in my Oxford interview.” I reasoned that it must be slightly above 2, so went with “about 2.3”. No feedback other than another awkward pause. The next question came out.
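For the record, the arithmetic can be checked in a couple of lines of Python. This is a sketch under stated assumptions: the 1 m pendulum length and g = 9.81 m/s² are illustrative values that weren’t part of the interview question, and the ratio doesn’t depend on either of them.

```python
import math


def pendulum_period(length_m, g):
    """Period of a simple pendulum for small swings: T = 2*pi*sqrt(l/g)."""
    return 2 * math.pi * math.sqrt(length_m / g)


G_EARTH = 9.81        # m/s^2, assumed illustrative value
G_MOON = G_EARTH / 6  # "approximately a sixth", as stated in the question
LENGTH = 1.0          # m, arbitrary: it cancels in the ratio

ratio = pendulum_period(LENGTH, G_MOON) / pendulum_period(LENGTH, G_EARTH)
print(ratio, math.sqrt(6))  # both come out a little under 2.45
```

So “about 2.3” was in the right area but a shade low.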

Question 2

Intuitively, what angle to the ground would you fire a cannon at in order to get the maximum range on the projectile?

I knew full-well that the answer was 45 degrees, having proved it at some stage during AS Level mechanics, and also having read recently that some guy had been the first to prove it. I was up front with this, but the interviewer was very keen that I was answering out of “intuition” rather than knowing the answer. Ah. Yes. Do let’s play that game. Talking about a reasonable compromise between vertical motion (for time of flight) and horizontal motion (for a greater rate of range progression), I went with 45 degrees with the air of one speculating that next summer might be warm.

Predictably, the extension was to prove this result, so I cracked out a cheeky diagram (framed prints are available – contact me):

[Diagram: a projectile fired from the ground at speed V, at angle θ to the ground]

… and wrote down the expressions for the vertical and horizontal components of the starting velocity:

V_y=V\sin\theta

V_x=V\cos\theta

The time of flight is decided solely by the vertical motion: the projectile undergoes constant acceleration due to gravity, until it reaches a vertical velocity of the same magnitude but in the opposite direction, by which point it will hit the ground again. (Neglecting effects due to air resistance and other important real-world factors obviously.) Using one of the constant acceleration equations (difference between initial and final velocities is equal to the product of the acceleration and time of flight), and denoting the time of flight as t and the acceleration as g, you get:

V\sin\theta-(-V\sin\theta)=gt

i.e., the time of flight is

t=\frac{2V\sin\theta}{g}

In the horizontal direction, no forces act, so the range of the projectile is just speed multiplied by time:

r=V\cos\theta\cdot\frac{2V\sin\theta}{g}

r=\frac{2V^2\sin\theta\cos\theta}{g}

This is all we need to answer the question – an expression for r as a function of θ, involving some constants V and g.

At this stage in the proceedings, the deities of trig identities threw me a bone in the form of a memory. To find the maximum of this with respect to theta, you’d normally have to go through the rigmarole of differentiating, but by a lucky quirk of fate the relation \sin(2\theta)=2\sin\theta\cos\theta was fresh in my mind, so I rewrote the above as

r=\frac{V^2\sin(2\theta)}{g}

At this stage I took a chance that the interviewers would back me to know what a sine curve looks like, and that the (first) maximum occurs when the argument of the sine function is equal to 90 degrees, i.e. r_{max} occurred when

2\theta=90^\circ

\theta=45^\circ

The tutors looked at each other. One was clearly satisfied with this, and the other clearly wanted me to go through the differentiation rigmarole. Entertainingly, for some reason neither of them wanted to say this, so there was a good deal of grunting and widened eyes before one of them caved in and came out with the next question.
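For anyone siding with the tutor who wanted more rigour, a brute-force numerical check of the 45 degree result is a reasonable halfway house. A minimal sketch, assuming flat ground and no air resistance as above; the launch speed of 20 m/s and g = 9.81 m/s² are arbitrary illustrative values.

```python
import math


def projectile_range(speed, angle_deg, g=9.81):
    """Range on flat ground, ignoring air resistance: r = V^2 * sin(2*theta) / g."""
    theta = math.radians(angle_deg)
    return speed ** 2 * math.sin(2 * theta) / g


SPEED = 20.0  # m/s, arbitrary illustrative value

# Try every whole-degree angle from 0 to 90 and keep the one with the longest range.
best_angle = max(range(0, 91), key=lambda angle: projectile_range(SPEED, angle))
print(best_angle)
```

The same function also shows the symmetry of the sine curve: 30 and 60 degrees give the same range.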

Question 3

You have a blob of metal that you can deform in any way you like. You start off with it in the form of a cuboid, and measure the resistance between each “end”. Now you want to deform it into another cuboid, where the resistance between the ends is double that of the original cuboid. How do you change its shape?

Ie, I wanted to go from something like this…

[Diagram: the original cuboid, with length l_1 and cross-sectional area A_1]

… to something like this…

[Diagram: the deformed cuboid, with length l_2 and cross-sectional area A_2]

… so that the resistance across the block in the latter case is double that of the block in the first place.

Given that we were talking about changing the length and cross-sectional area of a piece of metal, it seemed that the resistivity formula would be relevant, so I wrote that down:

R=\frac{\rho l}{A}

… where ρ is the resistivity of the material, l is the length of material, and A is the cross-sectional area.

One of the tutors encouraged me to write down “before” and “after” equations, and impose the relation between them that the “after” resistance be twice the “before” resistance.

R_1=\frac{\rho l_1}{A_1}

R_2=\frac{\rho l_2}{A_2}

\frac{\rho l_2}{A_2} = 2\frac{\rho l_1}{A_1}

At this stage I supposed I needed some relationship between the various lengths and areas in order to eliminate some stuff and solve, so given that I was only changing the shape of the same bit of metal, I argued that the volumes before and after must be the same. If I multiplied the numerators and denominators on each side by the appropriate length, I could express the two in terms of volumes before and after thus:

\frac{\rho l_2^2}{A_2l_2} = 2\frac{\rho l_1^2}{A_1l_1}

\frac{\rho l_2^2}{V_2} = 2\frac{\rho l_1^2}{V_1}

But, same piece of metal, so V_1=V_2, so…

l_2^2=2l_1^2

… i.e.,

l_2=\sqrt{2}\,l_1

So there you go, stretch it into a cuboid about 1.4 times as long and you’d have double the resistance. The interviewers nodded. There were a few very short questions about some areas of physics I’d mentioned in my personal statement, which amounted to little more than “so, you like that, do you?” And with that, my grand total of 40 minutes under the Oxford admissions microscope was over. In a haze of surreality I went and got my train.
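As a postscript to Question 3, the root-two result can be sanity-checked numerically. A minimal sketch: the resistivity and starting dimensions are arbitrary illustrative values (none were given in the interview), and the only physics assumed is R = ρl/A with the volume held constant.

```python
import math


def resistance(resistivity, length, area):
    """Resistance of a uniform bar: R = rho * l / A."""
    return resistivity * length / area


RHO = 1.7e-8           # ohm metres, assumed illustrative value
L1, A1 = 0.10, 1.0e-4  # original length (m) and cross-section (m^2), arbitrary
volume = L1 * A1       # fixed, because it is the same blob of metal

L2 = math.sqrt(2) * L1  # stretch to root-two times the original length
A2 = volume / L2        # the cross-section must shrink to conserve volume

ratio = resistance(RHO, L2, A2) / resistance(RHO, L1, A1)
print(ratio)
```

The length goes up by root two and the area comes down by root two, so the ratio of the two resistances comes out at two (give or take floating-point rounding).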

Oxford Physics Admission Interview 1: Questions and Answers

Finding yourself at the culmination of two years’ anticipation and attempted preparation is oddly giddying. I’d visualised the beginning of my interviews so many times and in so many ways that standing in a dingy staircase in Blue Boar quad counting down the clock was absurdly low-key. For something that I myself had long-since defined as an important crossroads in my life, there should have been more drama. Maybe a murder. Something like that.

There I was though, sitting in one of those mass-produced plastic chairs that are ubiquitous through school, making forced conversation with some posh twat interviewing for History of Art or some bullshit, whose Mum turned out to be from the same town as me. Looking back, it was sort of disheartening how utterly flabbergasted he was to have come across another person who came from the town.

The interviewee before me scuttled out of the room and gave me no more than an inscrutable glance before going down the stairs. My competition; my rival. It’s strange to think that now, as we became, and remain to this day, great friends and shared most of the ups and downs of uni. But in that moment, statistically, there was half a place that the two of us were going head-to-head for. Grr.

None of this was helpful to be thinking about, of course. A few minutes later I was sitting across the desk from the two interviewers, with a few sheets of blank A4 and a couple of biros between us. One was a tutorial fellow of the college who I’d stalked at length, the other a younger post-doc type. Someone had drawn a Gaussian and the number “1089” on the whiteboard. The manner of my showing in struck me at the time as bizarrely muted – was I in the bad books already? It didn’t occur to me until years later that the majority of professional physicists are simply awkward AF.

With very little in the way of preamble, the first question was forthcoming.

Question 1

Imagine that you have a length of fence, which you can bend at arbitrary points to make “corners”. You also have a wall, whose length is very much greater than the length of the fence. You’re going to use the fence to make three sides of a rectangle, and the fourth side will be some of the wall. What’s the maximum area you can enclose, and how do you arrange the fence to do that?

That was such an odd thing to hear said out loud that I thought I’d better draw a diagram to confirm whether I’d even understood the question…

[Diagram: a long wall, with the fence forming the other three sides of a rectangle; the side lying along the wall has length x]

(I know. Should have been an artist.)

The free variable was to be the length of wall incorporated, labelled x. Given that our enclosed area is rectangular, then the opposite side of the rectangle must also be of length x. If we denote the length of our fence by a, then we must have (a – x) length of fence left over for the “sides”. This must be shared equally, so each “side” must have a length of (a-x)/2.

Bringing out some pure Year 7 Maths gold, I proceeded to declare that the area must then be

A=\frac{x(a-x)}{2}

All going well. There were a few weary, affirmative nods. The lad is shit hot on rectangles. No worries there.

So what I wanted to find out was what value of x would result in the maximum A, which sounded comfortingly like a “differentiate, set to zero to find stationary point” thing. I announced that this was what I was going to do and sincerely hoped that they wouldn’t want me to go to first principles and show why that gave rise to a max/min. Thankfully they were happy with me just going ahead and doing it, so I wrote down the agreeable-looking expression

\frac{dA}{dx}=\frac{a}{2}-x

Firmly in my comfort zone by this stage, I ploughed on…

\frac{a}{2}-x=0

x=\frac{a}{2}

I underlined this with a flourish. There were frowns and tilting of heads as the tutors attempted to read upside down what I’d written. “So… what’s the area?” one said. Good point. I hadn’t actually answered the question. My value for x represented the length of wall which gave you the maximum area, rather than the area itself.

A_{max}=\frac{1}{2}\cdot\frac{a}{2}\left(a-\frac{a}{2}\right)

A_{max}=\frac{a^2}{8}

Yes? Yes. Great. That went well.

Can you draw a graph of the area enclosed as a function of x?

Erm… Well… My eyes strayed towards the Gaussian on the whiteboard. But that couldn’t be right. It must be a parabola. And it must have zero value when x=0 and x=a. AND I’VE MOTHERFUCKIN JUST FOUND the max point! With those three points I sketched something. With a=4 it looks like this:

[Graph: area A against x for a=4, an upside-down parabola which is zero at x=0 and x=4]

The interviewers nodded to each other. From that, I took that my answer sufficed. HA! Gaussian, pfft.
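If you’d rather not trust my differentiation, the fence problem is easy to brute-force. A minimal sketch, assuming the area formula x(a-x)/2 from the setup above and the same a=4 as the graph; the grid resolution is an arbitrary choice.

```python
def enclosed_area(x, fence_length):
    """Area when x of wall is used and the remaining fence is split into two equal sides."""
    return x * (fence_length - x) / 2


A_FENCE = 4.0  # same a = 4 as the sketch above
STEPS = 400    # grid resolution, arbitrary

# Scan x from 0 to a in equal steps and keep the value giving the largest area.
candidates = [A_FENCE * i / STEPS for i in range(STEPS + 1)]
best_x = max(candidates, key=lambda x: enclosed_area(x, A_FENCE))

print(best_x, enclosed_area(best_x, A_FENCE))
```

The scan lands on x = a/2 with an area of a²/8, and the area drops to zero at both x=0 and x=a, which are the three points I used for the sketch.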

The younger of the two interviewers piped up to move us onto the next question.

Right, let’s go on to something that’s not easy.

A layer or two beneath the nerves I was feeling, I felt a ripple of anger at this. “Don’t feel pleased to have done that because it was easy. This won’t be, though.” Since that night I’ve been on the other side of the interview table many, many times, and I can’t imagine being motivated to undermine or discourage someone like that. Anyway, there was no time for that, because…

Question 2

Choose any 3 digit number. The only constraint is that the first digit within it has to be bigger than the third. (Eg, 321). Reverse the order of its digits, and subtract this from the original number. (Eg, 321 – 123 = 198). Reverse the digits of this new number, and add the result of that, to the new number. (Eg, 198 + 891 = 1089). You’ll notice that the answer has been written on the whiteboard behind you for the duration of the interview. Why is it always 1089?

Now, I won’t lie, my initial reaction was simply to think “oh shit”, because I had absolutely no idea and didn’t really know where to start. The proof is as follows, and eventually I got to it, but it was very much with the prompting and cues of the interviewers. It would not be fair in any way to say I proved it myself, even laboriously.

The reason it happens is because the process of reversing and subtracting, then reversing and adding, effectively recovers and cancels out the original digits you chose, but with a few artefacts left over from all the dicking about reversing them (flipping digits between the hundreds and units columns), which sum to 1089.

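Denote your initial 3 digits (comprising your number) as A, B and C. (Recalling that it is stipulated that A>C.) The number is then given by

100A+10B+C

Following the sequence of steps set out in the question, then,

(100A+10B+C)-(100C+10B+A)

=100(A-C)+(C-A)

Now we are instructed to reverse this number, which is positive because of the constraint A>C, and which we treat as a 3 digit number (if A and C differ by only 1 it is 99, which counts as 099). We need to therefore rearrange it slightly to represent it in units of hundreds, tens and ones, and therefore be able to algebraically reverse it…

100(A-C)+10(0)+(C-A)

But again, A>C, so the units term (C-A) is negative and we have to borrow a 10 (and, the tens column being empty, a 100 from the hundreds to pay for it):

100(A-C-1)+10(9)+(C-A+10)

Reversing again,

100(C-A+10) +10(9) +(A-C-1)

And adding this to the last variation, we get

100(A-C-1)+10(9)+(C-A+10)+100(C-A+10)+10(9)+(A-C-1)

=100(9)+10(18)+9

=900+180+9

=1089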

With the tutors prompting me through, I was as surprised as anyone when this number appeared on my page, but made some agreeable “ahhh” sounds of understanding. You can see just from the above that all it boils down to is addition and subtraction algebra, but of the kind that’s awkward to keep track of when you’re nervous and under pressure.
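If the algebra still feels like sleight of hand, you can simply try every eligible number. A minimal sketch; the function names are mine, and the one judgement call is zero-padding the intermediate number to three digits (so 99 is reversed as 099, giving 990), which is what the algebra does implicitly.

```python
def reverse_digits(number):
    """Reverse a number's digits, zero-padding to three digits first (so 99 -> 990)."""
    return int(str(number).zfill(3)[::-1])


def trick(number):
    """Reverse and subtract, then reverse the difference and add."""
    difference = number - reverse_digits(number)
    return difference + reverse_digits(difference)


# Every three-digit number whose first digit is bigger than its third.
valid = [n for n in range(100, 1000) if n // 100 > n % 10]
results = {trick(n) for n in valid}
print(results)
```

Every one of the eligible starting numbers ends up in the same place, so the set of results has exactly one member.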

There was an awkward pause where nobody made eye contact. Obviously I felt pretty foolish. Apparently not being able to bring themselves to speak to me about maths and physics any further, The Man Who Found Things Easy piped up with “I’ll show you out, then” and proceeded to walk carefully to the door, open it, and watch me through it, all in utter silence. It was weird.

And with that, I was one interview down. I had literally no idea how I’d done. It was about half past five in Oxford in the pit of mid-winter. I headed back to the room I’d been loaned to de-suit and whiled away the 21 hours until my next interview. Read all about it in part 2…

Automated Anonymous Interactions with Websites Using Python and Tor

The debate around privacy and anonymity on the internet really grinds my gears, because tbh shouldn’t it be everyone’s basic right to make a script to vote 10,000 times from 10,000 different IPs in a poll on the website of your local newspaper? I think so. The battle against anonymity is kinda like the battle against piracy, in that thermodynamics favours the dissenters. What I mean by that is that whilst there may be many solutions to a problem like blocking The Pirate Bay, there are many MORE workarounds, so, although a certain system may be working now, there are inevitably many higher-entropy states. In a scenario where you’re trying to block, or even discern, a path between two entities, somehow there’ll be a way. And people are smart. Bless ’em. Leaky buckets.

ANYWAY, let’s bring things right back down and set out how you can do stuff on websites en-masse. Anonymity is only relevant because most sites are clever enough to recognise that when the same identity tries to do the same thing a few times, something a bit dodge may be afoot. To mitigate that, we’ll be harnessing the power of Tor. If you don’t know what Tor is, I hope you’re having a pleasant retirement.

A few swift disclaimers:

  • Don’t do anything bad. It’s not my fault if you do.
  • Whilst everything below seems to maintain anonymity, it’s not my fault if you end up living in the Ecuadorian embassy.

Python, wonderful, wonderful language that it is, has libraries that both interact with Tor, and interact with web pages. With a little jiggerypokery, you can get these to talk to each other and do amusing things.


  1. Install the Tor browser bundle. We’re not actually going to use Tor browser for this shizz, but it’s the easiest way to get a fully configured Tor setup, and to start that service up.
  2. Install the Python splinter library. This is what we’ll use to control a browser window in code: send it to websites, click on things, fill in forms, and anything else we fancy.
    sudo pip install splinter
  3. Install the Python stem module. We need this to talk to Tor.
     sudo pip install stem
  4. Install the Firefox web driver, which is included in the Selenium installation. We need this because Firefox plays most nicely with the rest of the setup, and I’m too lazy and/or inept to figure out how to get everything working nicely with my preferred browser.
     sudo pip install selenium

Now we’re laughing. HAHAHA.

Connecting to Tor

Open the Tor Browser, allow it to go through its little initialisation, then once you get to the green screen congratulating you on being connected, minimise but don’t close that window. Having the browser open means that a Tor connection is running on your computer, and, whilst we aren’t going to use the Tor Browser window we just opened because we can’t control it automatically through Python, we are going to funnel our traffic through said Tor connection.

What we now do, is instantiate a Splinter browser, and tell it to use the Tor instance running locally on port 9150 as a proxy for SSL, socks and ftp…

import stem.process
from stem import Signal
from stem.control import Controller
from splinter import Browser

proxyIP = "127.0.0.1" #The Tor instance is running locally
proxyPort = 9150

proxy_settings = {"network.proxy.type":1,
    "network.proxy.ssl": proxyIP,
    "network.proxy.ssl_port": proxyPort,
    "network.proxy.socks": proxyIP,
    "network.proxy.socks_port": proxyPort,
    "network.proxy.socks_remote_dns": True,
    "network.proxy.ftp": proxyIP,
    "network.proxy.ftp_port": proxyPort
}
browser = Browser('firefox', profile_preferences=proxy_settings)
browser.visit("http://www.icanhazip.com")

If you run this and it’s working, a Firefox browser will open itself, go to www.icanhazip.com (which, for the rain men among you, is a website which tells you the IP you’re using to access it), and receive an IP other than your actual one.

If you get some error related to loading “the profile” or similar, check you have the most recent version of selenium via

sudo pip install --upgrade selenium

If you get some error about the proxy server not accepting connections, make sure you have Tor Browser open. (I did mention that earlier, if you recall.)

NB: you have to include the “http://” in the website url.

Getting a New IP

Given that this is potentially something that we’ll want to do often in…. whatever it is that we’re doing, it makes sense to make it into a cheeky function:

def switchIP():
    with Controller.from_port(port=9151) as controller:

The “NEWNYM” signal is apparently what you send to your Tor connection when you want a new identity. Who knew?

To test that this is working, we can request a new IP 10 times, and after each request visit www.icanhazip.com to verify that we are indeed coming via a different exit node…

for x in range(10):

Greatness beckons.

Interacting With Websites

Ah yes. Doing stuff. This is where Splinter delivers.

What this code looks like depends very much on the website you want to do stuff on, and what you want to do. For this example I’m going to do a bit of commenting, but if that doesn’t cover whatever you want to do, then Splinter is extremely well-documented so just zip over there and look it up.

Purely by way of an example, let’s head over to dogdogfish.com and register our opinion once or twice…

def interactWithSite(browser, deduplication):
    browser.fill("comment", "But the thing is... Why would anyone ever want to do this? I must have thought that "+str(deduplication)+" times...")
    browser.fill("author", "Pebblor El Munchy")
    browser.fill("email", "barack@tehwhitehouz.gov")
    browser.fill("url", "https://upload.wikimedia.org/wikipedia/en/1/16/Drevil_million_dollars.jpg")
    button = browser.find_by_name("submit")

For finding the names of textboxes, buttons, etc, just do a cheeky “inspect element” on the page.

Then finally you can draw all of this together with the following few lines:

for x in range(1000):
    interactWithSite(browser, x)

Internet, I am thy master.

Using Machine Learning to Generate Lyrics in the Style of your Favourite Artist

I once saw a pyramid depiction of human needs. At the bottom was basic sustenance, and on top of that shelter and safety, and on top of that was art and culture and that. It left me confused, because it omitted something which I think all of us want on a deep and fundamental level: A way to capture the lyrical idiosyncrasies of an artist in a machine learning model, and thereby churn out an arbitrary amount of pure poetry in the style of that artist. Fortunately that’s going to be addressed in this post.

We need three things.

  1. A means to get a bunch of lyrics from a given artist, in order to learn about their style.
  2. Some way of capturing said style – the way this artist tends to put lyrics together – in a model.
  3. Off the back of 1 and 2, something that uses the rules to spit out some pure artistry.

And so, without further ado!

1 – Getting Lyrics to Learn From

The good eggs at lyricsnmusic.com have got this covered for us with their dank API. Sign yourself up for an API key then it’s as simple as this to get some lyrics for, say, Coldplay. I heard that the favourite browser of the guys over there is Mozilla/5.0 so I’ve added that as our “browser signature” as a gesture of good faith… 😉

import json
import urllib2

url = "http://api.lyricsnmusic.com/songs?api_key="+apiKey+"&artist=coldplay"

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')] #lol
response = opener.open(url).read()

feed = json.loads(response)
for song in feed:
    print(song["snippet"])

He said I'm gonna buy this place and burn it down
I'm gonna put it six feet underground
He said I'm gonna buy this place and watch it fall
Stand he...
Come on, oh my star is fading
And I swerve out of control
If I, if I'd only waited
I'd not be stuck here in this hole

Martin, you pretentious beauty! What we have here is a JSON structure of songs, each of which has properties like “title” and “snippet”, which, lamentably, is not the full lyrics. The boys have got us covered again, though, because they also provide a URL to the page containing the full lyrics, so we can just mosey on over there and scrape the shit out of those.

While we’re at it, let’s think ahead, and simply condense all this band’s lyrics into one big string, which we’ll call trainingText. Strange name, you say? No stranger than “Nigella”, I respond.

from bs4 import BeautifulSoup

trainingText = ""
for song in feed:
    snippet = song["snippet"]
    fullUrl = song["url"]
    fullPageHTML = opener.open(fullUrl).read()
    page = BeautifulSoup(fullPageHTML, "html.parser")
    try:
        lyrics = str(page.findAll("pre")[0]).replace("<pre itemprop=\"description\">","").replace("</pre>","")
        #There's probably a better way of doing this, but I'm not, nor have I ever been, the queen.
        trainingText += lyrics
    except:
        #No usable full lyrics on the page, so settle for the snippet
        try: trainingText += snippet+"\n"
        except: continue

What this gives you is essentially one big song of everything available for the artist. AND THAT’S IT. Let’s generalise this to a function which accepts a band and an API key as inputs, helpfully keeps you posted on its song-learning progress, and then gives you back the compiled vocal material of the artist you give it…

import json
import urllib
import urllib2
from bs4 import BeautifulSoup

def getLyrics(band, apiKey):
    encodedBand = urllib.quote_plus(band)
    url = "http://api.lyricsnmusic.com/songs?api_key="+apiKey+"&artist="+encodedBand

    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')] #lol
    response = opener.open(url).read()
    feed = json.loads(response)
    trainingText = ""
    songsProcessed = 0
    for song in feed:
        snippet = song["snippet"]
        fullUrl = song["url"]
        fullPageHTML = opener.open(fullUrl).read()
        page = BeautifulSoup(fullPageHTML, "html.parser")
        try:
            lyrics = str(page.findAll("pre")[0]).replace("<pre itemprop=\"description\">","").replace("</pre>","")
            #There's probably a better way of doing this, but I'm not, nor have I ever been, the queen.
            trainingText += lyrics
        except:
            #No usable full lyrics on the page, so settle for the snippet
            try: trainingText += snippet+"\n"
            except: continue
        songsProcessed += 1
        print("Learned "+str(songsProcessed)+" songs...")
    return trainingText


2 – Capturing the Artist’s Poetic Nuances

Where would such-and-such-an-artist go with a certain theme? How would their lines play out? How long would they be? When would they decide it was time for a new verse?

Based on the back catalogue provided by step 1, we can build a probabilistic model of this, called a Markov Chain. The below is a slight simplification, but captures the essence of how it works.

Supposing we feed into our model the line “the boys are back in town”.

The model takes the first two words – “the boys”, and looks at what follows. It finds “are back”. So at this point, we know that “are back” follows “the boys” in 1 out of 1 occurrences, ie, 100% of the time. If we’re asked the question of continuing something starting with “the boys”, we will choose “are back” every time, because it’s all we know.

But then supposing we read a little further in the catalogue and see the line “if the boys wanna fight, you better let ’em”. Now we enrich our model with the knowledge that if you see “the boys”, half the time “wanna fight” comes next. Half the time “are back” comes next. Based on that knowledge, if you asked us many times to carry on from “the boys”, on average half the time we would choose “are back”, and half the time we would choose “wanna fight”.
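If you'd like to see that bookkeeping spelled out, here's a rough sketch of the simplified word-level version. The function name `build_word_model` and the comma-stripping are my own assumptions for illustration; this isn't the code used later in the post, which works on characters.

```python
from collections import Counter, defaultdict

def build_word_model(lines, order=2):
    """Count which group of `order` words follows each group of `order` words."""
    model = defaultdict(Counter)
    for line in lines:
        # Crude tokenising: lowercase, drop commas, split on whitespace
        words = line.lower().replace(",", "").split()
        for i in range(len(words) - 2 * order + 1):
            context = tuple(words[i:i + order])
            following = tuple(words[i + order:i + 2 * order])
            model[context][following] += 1
    return model

lines = [
    "the boys are back in town",
    "if the boys wanna fight, you better let 'em",
]
model = build_word_model(lines)

# Turn the raw counts for "the boys" into probabilities
counts = model[("the", "boys")]
total = sum(counts.values())
for following, count in counts.items():
    print(" ".join(following), count / total)
```

Each continuation has been seen once out of two sightings of “the boys”, so each comes out at a half.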

Of course, you wouldn’t just look at what follows “the boys”; you’d look at what follows every other combination of words in your “training” text. That’s what we’re going to do with the massive long list of lyrics. But rather than looking ahead one word, we’re going to look ahead one character, which will be helpful in deciding when to put punctuation in, when to break to a new line, and when to break into a new verse. And, similarly, rather than looking at what’s likely to come next after two words, we’re going to look at what’s likely to follow a certain number of characters. We’ll move this “window” all the way through the past lyrics we got in step 1, like so:

[The boys] are back in town
T[he boys ]are back in town
Th[e boys a]re back in town
The[ boys ar]e back in town
The [boys are] back in town
The b[oys are ]back in town
The bo[ys are b]ack in town
The boy[s are ba]ck in town
The boys[ are bac]k in town
The boys [are back] in town
The boys a[re back ]in town
The boys ar[e back i]n town
The boys are[ back in] town
The boys are [back in ]town
The boys are b[ack in t]own
The boys are ba[ck in to]wn
The boys are bac[k in tow]n
The boys are back[ in town]

And every time we move the window, we have a look at what the next character is, and amend our record of how many times we’ve seen that next character, for this window.

The width of that window is called the order of the Markov Chain, and is essentially a measure of how long the “memory” of our model is. It has a profound bearing on the outcome of the lyric generation, because declaring that “s” follows “y” is a lot different to declaring that “s” follows “the boy”.
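To get a feel for how much the order matters, here's a small sketch that counts, for a few window widths, how many distinct windows turn up and how many different next characters each one has on average. The helper `count_options` is a name I've made up for this illustration, and the text is just a toy line.

```python
def count_options(text, order):
    """Map each window of `order` characters to the set of characters seen after it."""
    options = {}
    for i in range(len(text) - order):
        options.setdefault(text[i:i + order], set()).add(text[i + order])
    return options

text = "sometimes I run sometimes I hide"
for order in (1, 3, 8):
    options = count_options(text, order)
    # Average number of distinct next characters per window
    branching = sum(len(v) for v in options.values()) / len(options)
    print(order, len(options), round(branching, 2))
```

A short memory tends to leave lots of choices at every step, which reads as gibberish, while a long memory leaves very few, which reads as a straight copy of the source. The interesting stuff happens somewhere in between.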

Anyway. Here’s the code that does this with our lyrics:

def generateModel(text, order):
    model = {}
    for i in range(0, len(text)-order):
        fragment = text[i:i+order] #Range is exclusive at upper bound
        nextLetter = text[i+order] #So this is the next letter
        if fragment not in model:
            model[fragment] = {}
        if nextLetter not in model[fragment]:
            model[fragment][nextLetter] = 1
        else:
            model[fragment][nextLetter] += 1
    return model

Here’s what the model looks like for the line “sometimes I run sometimes I hide”:

('un somet', {'i': 1})
(' sometim', {'e': 1})
(' run som', {'e': 1})
('ometimes', {' ': 2})
(' I run s', {'o': 1})
('I run so', {'m': 1})
('imes I h', {'i': 1})
('metimes ', {'I': 2})
('imes I r', {'u': 1})
('es I run', {' ': 1})
('mes I ru', {'n': 1})
('times I ', {'h': 1, 'r': 1})
('sometime', {'s': 2})
('etimes I', {' ': 2})
('mes I hi', {'d': 1})
('n someti', {'m': 1})
('run some', {'t': 1})
('es I hid', {'e': 1})
('s I run ', {'s': 1})

Your way forward is pretty well determined, until you hit “times I “, at which point you have a 50/50 chance of the next letter being “r” or “h”. Once you choose one of those, you’re on a one-way track to either running or hiding. This is a trivial example, but running the entire back catalogue of a band through creates an interesting tree of probabilities which will take you in a different direction every time.
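The 50/50 claim is easy enough to check. Below is a self-contained sketch that rebuilds the model for the same line and converts the counts for a window into probabilities. The snake_case names and the `to_probabilities` helper are mine, added purely for illustration.

```python
def generate_model(text, order):
    """Count, for every window of `order` characters, what character comes next."""
    model = {}
    for i in range(len(text) - order):
        fragment = text[i:i + order]
        next_letter = text[i + order]
        counts = model.setdefault(fragment, {})
        counts[next_letter] = counts.get(next_letter, 0) + 1
    return model

def to_probabilities(counts):
    """Convert raw next-character counts into probabilities that sum to 1."""
    total = sum(counts.values())
    return {letter: count / total for letter, count in counts.items()}

model = generate_model("sometimes I run sometimes I hide", 8)
print(to_probabilities(model["times I "]))  # the fork in the road
print(to_probabilities(model["sometime"]))  # no choice at all here
```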

3 – Letting the car drive itself

What we need to do now is create a means of using the model to build up some lyrics. The way this works in Markov Chains is to look at the current window, get the next character probabilistically, then with that, move the window one place along, and repeat.

First let’s make a function which, given a current “fragment” and a model, gives you a next character…

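import random

def getNextCharacter(model, fragment):
    letters = []
    for letter in model[fragment].keys():
        for occurrences in range(0, model[fragment][letter]):
            letters.append(letter) #One entry per time we've seen it
    return random.choice(letters)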

This is doing something very simple. It looks up the possible next characters for this fragment in the model, creates a list where each character appears as many times as its weighting, then chooses one at random and returns it. Skill.
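If building a list with one entry per sighting feels wasteful, Python 3.6 and later can do the weighted pick directly with `random.choices`. This is an alternative sketch, not the approach used in this post; the `rng` parameter and the toy model are assumptions so that the example is repeatable.

```python
import random

def get_next_character(model, fragment, rng=random):
    """Pick a next character with probability proportional to its count."""
    counts = model[fragment]
    letters = list(counts.keys())
    weights = list(counts.values())
    # random.choices returns a list of k picks; we only want one
    return rng.choices(letters, weights=weights, k=1)[0]

# Toy model: "r" has been seen three times as often as "h"
toy_model = {"times I ": {"h": 1, "r": 3}}
rng = random.Random(0)  # seeded so the run is repeatable
draws = [get_next_character(toy_model, "times I ", rng) for _ in range(1000)]
print(draws.count("r") / len(draws))
```

Over a thousand draws the share of “r” should land somewhere near three quarters.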

So all that remains now is to use these functions to spool out some lyrics. Our function for that is going to look like this…

def generateLyrics(trainingText, order, length):
    model = generateModel(trainingText, order)
    currentFragment = trainingText[0:order]
    output = ""
    for i in range(0, length-order):
        newCharacter = getNextCharacter(model, currentFragment)
        output += newCharacter
        currentFragment = currentFragment[1:]+newCharacter
    return output

So what we’re saying to this function is: here’s some lyrics to learn from, use these to generate me some lyrics of a certain length, using a window size or order of such and such. The function uses the training lyrics to generate a model, then uses that to form, character-by-character, a meandering continuation of the first few characters of the training text. Once the lyrics are built up to the requested length, the string is returned.
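One wrinkle worth knowing about: the very last window of the training text never gets recorded in the model unless it also appears earlier, so a run can wander into a fragment with no known continuation. The sketch below is a self-contained Python 3 variant that simply stops when that happens. The names, the `seed` parameter and the choice to include the opening fragment in the output are all my own assumptions rather than anything from the code above.

```python
import random

def generate_model(text, order):
    """Count, for every window of `order` characters, what character comes next."""
    model = {}
    for i in range(len(text) - order):
        counts = model.setdefault(text[i:i + order], {})
        counts[text[i + order]] = counts.get(text[i + order], 0) + 1
    return model

def generate_text(training_text, order, length, seed=None):
    """Grow text one character at a time, stopping early at a dead end."""
    rng = random.Random(seed)
    model = generate_model(training_text, order)
    fragment = training_text[:order]
    output = fragment
    # Stop at the requested length, or when the window has no recorded successor
    while len(output) < length and fragment in model:
        counts = model[fragment]
        next_char = rng.choices(list(counts), weights=list(counts.values()))[0]
        output += next_char
        fragment = fragment[1:] + next_char
    return output

training = "sometimes I run sometimes I hide"
result = generate_text(training, 8, 40, seed=1)
print(result)
```

With a whole back catalogue as training text, dead ends are rare, but on tiny inputs like this one you hit “hide” and run out of road fairly quickly.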

The first thing you’re going to want to do with this is to try it for loads of different artists, and even try it for the same artist multiple times to appreciate how non-deterministic it is, so I’d suggest that the following is a nice way of calling the various functions:

band = raw_input("Enter artist:\n")

lyrics = getLyrics(band, apiKey)

newLyrics = generateLyrics(lyrics, 8, 600)

print(newLyrics)


I Can Haz Lyrics?

Yep. Who are some artists with distinctive styles?


Oh noooo
Let it burn
Oh oh ohhhh
Let it burned while I cried
‘Cause I heard it screaming out your name,
You said I’m stubborn and raised
In a summer haze bound by the surprise
And he will feel like he’s been there for hours
And you’ll walk that mile
Until you kissed my lips and you prefer the floor
God only known each other,
Think of me in the deep (Tears are gonna fall, rolling in the pavements?
Even if, it leads nowhere
And I hear but our eyes, and settle for wrong


Cradle of Filth

Devildom voyeurs
Ascend to smother the spite seething Draconist
And commit this wolf of the graveyard, of the moon
Lowered Her mask to me
Your soul
Live for the reams
Of verses and curses
That heavenly brow
Crippled seraph shalt cower in illustrious courts
Whilst She entranced divined from the wolves are the rustic summers of my excess
Expurse of a whore
Receiving sole communion from fate
By alighting to discredit rebirth
Alone as a stone cold wish
To see the witch scholared Her
In even darker spheres
Delighting in my cold cell, when the priest comes

Kanye West

yeah yeah yeah, I got packs to get the clouds to break me down
The only rapper AND a producer and a glove, but didn’t have a buzz bigger than the souls of men
Louboutin on the street trying to get by
Stack ya money to buy her a few pairs of new Airs
Cause lately’s been a whole life (Ohh)
And I wonder if you used to feel invisible
Now that wet mouth
Uh, I know she find out what he is owed?
And throw away my bus pass any and every class
Lookin’ at every ass
Cheated on every song and
save their whole deal, Their wrist is on chill
They house warmin’
Sittin’ here, grillin’ people say


Bob Dylan

Zanzinger killed for no reason to roam
Don’t forget to flash
We’re all gonna meet
At that million dollar bash
Well, the comic book and me, just us, we caught ’em
And that is not
It doesn’t matter of minutes, on bail was out walking
But you and I, we’ve been deceived by the Fountain Bank
One bird book, and a bottle of bread
Yea, heavy and a bottle of bread
Yea, heavy and a buzzard and his weapon took from him
As they rot
While paupers change possessions
Each one means
At times I think that you do
Make me glad I’m in love with your money, pull up your shawl
Won’t you descend from

Led Zeppelin

Brother, I brought you together baby, I’m sure my shot-gun will.
Gonna go walkin’ through the country lanes,
I’ll be singing a song,
Hear me call your friends coming back home.You know I’m the one you want.
I must be time I’m leavin’,
Baby, dry those silver,
I brought you smiling at me,
That’s alright, I’d be the western shore
So now you’d better lay your mornin’ time is now
To sing my song
I’m going ’round the world
I got to find you remember times like these?
To think of us again?
And I do
Tangerine, Tangerine, Tangerine, Tangerine, Tangerine

(The Tangerine line is interesting – I think what’s happened is that at some stage the fragment “Tangerine, Tangerine” has appeared, so literally the only thing that has a chance of following “Tangerine” is “, Tangerine”, because the word is longer than the order.)

Justin Bieber

Rumors spreading ’bout this other guys?
I can see right from wrong
Help me when I got-got your body
Baby no no nobody has got what I need
‘Cause I didn’t believe
When I need is one love, one heart
My one heart
‘Cause I’m in love with you I’m losin’ you
I’m a me tell you one time (girl I love you)
You look so deep
You know you’re standing in front of the camera
She don’t stop until I find) my runaway love
Why can I choose between us no one else,
Want me to,
Baby we can share mine!
I know you care
Just shout whenever


Swing the scene
In the city tonight
We are gathered here to maim and kill
‘Cause this is what we have done unto you
But what is truth?I cannot die
Trapped far beyond my fate
I give
You take
This life off from me
Hold my breath as I wish for death
Oh please, God, help me
Hold my breath as I wish for death
Oh please, God, help me
Death in the fast lane is just how it seems to fade away
Drifting further everyday
Getting lost within myself
Nothing can save you
Justice is lost
Justice is raped


Your mother’s eyes, from your eyes gonna make a big noise
Playin’ in the sand cannot heal me like a jelly fish
I kinda like it
You call me Mister Fahrenheit
I’m trav’ling at the peak of the land
I seen every Wednesday evening
There´s no ending
The Seer he said
Beware the championsFlash a-ah
Savior of the night followed day
And the hub caps all gleam
When I’m cruisin’ in overdrive
Don’t you take me back to you
In rain or shine
You say shark I say bite
You say your folks are telling you
Write my letter feel much better
And use my fancy patter on the multit

You get the idea.

And of course (there’s always an “of course”) the fact that we’re using the Markov Chain approach to generate song lyrics is incidental. This code can be used to generate text that looks like any other text you care to feed it. That could be religious texts, product reviews, sales pitches, anything.

There’s a pretty diverse amount of material already out there, so, now we have this, we probably don’t need humans any more.

The complete code is available on Github.

SMS Spoofing with Python for Good and Evil

It all started with the best of intentions. I was an excitable graduate going through the second puberty of discovering that if you propositioned customers in the right way, a small percentage of them would buy your stuff.

Having been reasonably successful with personalized emails, it seemed that SMS was fair game as an extension. Fuck, Dominos and Royal Mail do it, and they’re only partially more time-relevant than a sale on a website you bought a novelty Luis Suarez biting bottle opener from two years ago. Turned out that my at-the-time boss didn’t agree, reacting in a way that can be summarized as “strongly negative” when he looked at a shirt on our site and simultaneously received a text which said something to the tune of “that shirt would go great with the brown size 11 shoes you bought 82 days ago.”


(But bosssssss, it’s a nexus of many technologies! Absolutely no way? Bahhh, okay then. No, yeah, I’ve totally been working on other stuff too.)

I digress, though. Nobody cares about that.

It was the “misusing the tech to banter your friends” stage of that development which, ultimately, led to the discovery of something much creepier.

Experience gained from that same stage of development during my email days led me to start off with volume and persistence, because if there’s one thing that computers are good at (and there is), it’s the old water torture. Clockwork SMS, who will do your bidding at £0.05 per message, seemed as good a bet as any to me, so that’s who I “pip install”d. (And then angrily “sudo pip install”d, of course.)

Lolling delightedly to myself, I set this baby loose…

from clockwork import clockwork
import time

api = clockwork.API("Your API code here")

lyrics = open("Bohemian Rhapsody Lyrics.txt", "r").readlines()

for line in lyrics:
    payload = line.replace("\n", "")
    message = clockwork.SMS(from_name = "F Mercury", to = "447000123456", message = payload)
    response = api.send(message)

… and, with the warm fuzzy feeling that accompanies the knowledge that a computer is busy doing one’s dirty work, retired to enjoy my evening partaking in my favourite pastime of throwing stones at traffic.

My friend was sent a line from “Bohemian Rhapsody” every 5 minutes until the song was done. Picture it: the numbingly-familiar “ding” of an incoming text and the muscle memory that responds by reaching for your nearby phone, maybe even the rational thought that, in all probability, this was just the latest in the pontifications of one F. Mercury, but the agonizing, irresistible, small-but-not-negligible chance THAT IT’S A LEGIT TEXT. No real hardship just to check. That bastard. Wish I hadn’t checked. Not going to check the next one. Sublime.

Text from Freddie.

Turned out he was really ill and I made a basic arithmetic error which resulted in the texts continuing through the night instead of being spread over a couple of hours, so that was rather less funny than it could have been, but I was hardly to know that at the time. And anyway, the fires of curiosity had been stoked in my mind.

Because of course, annoying though this is, it’s fairly obviously banter. Beyond seeking a restraining order, it’s not really going to influence people. Now, saying that, I should also say that this is where the “in the field” element of my noble research ended: what follows involved the phones of others, but also always their knowledge and consent. The life-wrecking potential of unleashing this shit on people fo’ real should be reserved for elected governments and security agencies. DISCLAIMER: don’t do it etc.

You’ll have noticed that the “from” argument in the call is what appears on the recipient’s phone when the SMS comes in. That in mind, the logical extension from “F Mercury”, or any other experimental name, is “Mum”.

message = clockwork.SMS(from_name = "Mum", to = "447000123456", message = "SURPRISE MUTHAFUCKA! Yo punk ass goin down bitch boi!")

Mum checking in.

Tehehehehe. Excellent. What a time to be alive.

There’s one pig in the poke, though: my phone is clever enough to know that this is only from “Mum” in a superficial sense. If I were forced to guess why this was, I’d probably go with the fact that this is from a string, not a number, which is where my real messages from “Mum” come from. So this message isn’t part of any thread. The real messages labeled as “Mum” come, at a presumably more fundamental level, from a phone number, which is saved under an alias in my contacts.

Ahh, but it's not real.

I wonder. I JUST WONDER.

The magic step is absurdly simple. What would happen if I put the from address as a phone number, the phone number stored under “Mum” in my contacts?

message = clockwork.SMS(from_name = "07999654321", to = "447000123456", message = "You don't know anything, and you stink.")

Seems Legit

Long have critics conjectured on the inspiration behind Lionel Richie’s time-immemorial opus “Hello, is it Me You’re Looking For?”. Now we know.

Here you have a trivially simple method of injecting legit-looking messages into text conversations with anyone, providing you know how your target stores the spoofee sender in their contacts. In the UK, that’s largely the difference between starting with 07, and +447.

The utterly exquisite reality is that, if your target then replies, that reply goes to the real sender, who is overwhelmingly likely to reply something like “what the hell are you on about?”, this being the first they’ve heard of this shady business. Does that look like backtracking or denial? I ain’t no Member of Parliament, but it sure sounds like it to me.

And if I may quote Mouse from The Matrix, that makes you wonder about a lot of things. How many friendships and/or relationships have fallen afoul of a simple misunderstanding, a simple miscommunication, one statement taken the wrong way, such as could be easily achieved through the above means?

How many fledgling romances need a seductive kick-start to get them over the stalemate of shyness?

Given that your communication is limited to one-way, what series of messages should you go with for maximum impact, and maximum imperviousness to whatever the replies might be?

Could you even construct some confusing pseudo-two-way thing, if you knew the numbers of both parties? Like two colleagues maybe?

Doesn’t bear thinking about, really. Much.