Every now and then, I find a Kickstarter campaign worth backing. The problem is, I don't browse that service on a regular basis, so usually by the time I learn about a product I really want, all of the better reward tiers are gone. These are usually labelled "Early Bird" or "Super Early Bird". Sometimes there are even different tiers of these early bird rewards. The difference can be in price (usually early bird is about $10 cheaper, sometimes more), an earlier delivery time or what's included in the reward.

Random Kickstarter campaign with the Super Early Bird reward gone.

After seeing this, I'd think "damn, I should've browsed Kickstarter more frequently! Let me install their mobile app..." and that would be it. But a few years ago, after backing one campaign, I entered its page and noticed that there was one remaining spot for the early bird reward. What the?! Then came a realisation that is obvious in retrospect: it's normal for people to cancel their orders or change tiers. You can do that freely before the campaign ends, without any charges, because the money is only collected when the campaign ends.

What if I managed to snatch one free spot in a more attractive reward just as it became available? Doing it manually would mean I'd have to refresh the campaign page every few minutes during the entire period when I'm awake. I'm too lazy to do that, especially if the prize is "just" a $10 discount. But turn this into an automation challenge, and suddenly the reward turns into the satisfaction that the code I wrote works.

Proof of concept

The algorithm is pretty straightforward:

  1. Check if there are empty spots in the reward tier I'm interested in.
  2. If there are, send a notification.

Checking for free spots

Fortunately, this particular aspect of Kickstarter is easy to interact with from a machine's point of view. My language of choice for such tasks is still Python, especially because of the availability of great libraries like requests for making HTTP requests, and BeautifulSoup for navigating the HTML.

It's really simple: (if it doesn't look simple, just trust me it is!)

from bs4 import BeautifulSoup
import requests
import re

campaign_url = 'https://www.kickstarter.com/projects/...'
reward_id = 123456789

# Load the page and parse it using BeautifulSoup
resp = requests.get(campaign_url)
soup = BeautifulSoup(resp.content, features='html.parser')
# Find the reward box tag using the data-reward-id HTML attribute
li = soup.find(attrs={'data-reward-id': str(reward_id)})
# Check if that tag exists and has the pledge--available class
available = li is not None and 'pledge--available' in li.get_attribute_list('class')
if available:
  # Extract number of remaining spots by finding the appropriate
  # li tag and using regular expressions
  limit = li.find('', class_='pledge__limit')
  match = re.match(r'Limited \((\d+) left of \d+\)', limit.text.strip())
  remaining = int(match.group(1))
  print(f'{remaining} spots remaining')
else:
  print('No spots left')
Barebones code that will check availability of rewards

The script above, given the campaign URL and the ID of the reward you're interested in, will print either something like "5 spots remaining" or "No spots left", depending on whether that reward is still up for grabs.
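One fragile spot in the script is the regular expression: if the label text ever differs from what is expected, re.match returns None and the script crashes. A minimal sketch of a more defensive parser follows. The helper name parse_limit is made up for illustration, and the label format is simply the one the script above assumes.

```python
import re
from typing import Optional, Tuple

# Label format assumed from the script above, e.g. "Limited (5 left of 100)"
LIMIT_RE = re.compile(r'Limited \((\d+) left of (\d+)\)')


def parse_limit(text: str) -> Optional[Tuple[int, int]]:
    """Return (remaining, total) parsed from the limit label.

    Returns None when the text doesn't look like the expected label,
    instead of raising an exception.
    """
    match = LIMIT_RE.search(text.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

With this, the caller can treat None as "couldn't tell" and, for example, skip the notification rather than die with an AttributeError.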

Notifying me

That script wouldn't be very useful on its own. We also need a way to be notified whenever there's a free spot. Any solution that works can be used, but the one I found to be really simple and effective is just sending an email, which is easy thanks to Python's built-in smtplib.

import smtplib

msg = f'''From: me@example.com
To: <my-email>
Subject: Kickstarter reward available!

Hey! It looks like the Kickstarter reward became available!

(this message has been generated automagically)
'''
smtp = smtplib.SMTP('localhost')
smtp.sendmail('me@example.com', ['<my-email>'], msg)
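Building the message by hand works for a fixed ASCII text, but the standard library's email.message.EmailMessage can take care of header formatting and encoding. Below is a sketch of that alternative. The function names build_message and send are invented for this example, and it assumes, like the snippet above, that an SMTP server is listening on localhost.

```python
import smtplib
from email.message import EmailMessage
from typing import List


def build_message(sender: str, recipients: List[str],
                  subject: str, body: str) -> EmailMessage:
    """Create an email with properly formatted headers."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)
    msg['Subject'] = subject
    msg.set_content(body)
    return msg


def send(msg: EmailMessage, host: str = 'localhost') -> None:
    """Send the message through an SMTP server (assumed to run on `host`)."""
    with smtplib.SMTP(host) as smtp:
        # send_message reads the sender and recipients from the headers
        smtp.send_message(msg)
```

Keeping the building and the sending separate also makes it possible to inspect a message without having a mail server around.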

Scaling it up

This isn't the first time I've tried to have a machine notify me when something was ready. For example, judging by some old files lying around, over two years ago I trusted code to notify me when the Nintendo Switch was being released. The supply was scarce, and every restock disappeared in a matter of minutes. The script checked two online shops and emailed me as soon as the price changed from the dummy value (e.g. 1 zł) to something more real, meaning it was available for preordering, so that I could place the order.

Another time I did something similar was when my wife was awaiting her new passport after changing her last name. For some reason we didn't trust the promises we heard in the Consulate of Poland that they'd definitely surely 100% notify her when the passport was ready. Fortunately, there's the government website where you can check it yourself. Tired of checking manually every few days I automated it, and forgot about it. And then one time, when we were coming back from a dinner, we got surprised by this email:

Our surprise stemmed from two things: 1) the script actually worked properly, and 2) it took less than a month for the passport to be prepared, while the waiting time can be up to 3 months.

Scaled algorithm

The revisited algorithm from above, in the webscale version, needs to work in multiple scenarios that can be vastly different. All of them can be roughly summarised as:

  1. Check if it still needs to be done
  2. Do it
  3. Notify about results (optionally)
  4. Mark the job as done (optionally)

Since I've done this a few times already over the past few years, I finally grew tired of always duplicating the same code. So I created something more generic. I consider this function to be the heart of the machinery:

from importlib import import_module
from typing import List

def run(name: str, recipients: List[str], args: List[str]):
  if not should_execute(name):
    return
  module = import_module(f'.{name}', 'scripts')
  arguments = module.parse_args(args)
  response = module.fetch(arguments)
  status = module.is_finished(response)
  email = module.get_email(response)
  if email:
    send_mails(recipients, email)
  write_status(name, status)
The run function that calls all appropriate functions from an imported module (chosen by the name argument) in proper order.

It's responsible for finding and loading the appropriate module, and then doing these actions in order:

  • parse command line arguments
  • crawl the page and extract necessary data
  • check if the task is finished and shouldn't be crawled again (e.g. the passport is ready, so there's no need to check anymore)
  • check if it's necessary to notify, and if it is, call the send_mails function
  • write the status of the operation to the state management layer (described below)

In the first version, sending the email and considering the job done were pretty much the same thing. But that doesn't quite work with places like Kickstarter, where it's easy to miss the short timespan when there's a free spot available. Due to my laziness, I opted for having to terminate the script after I managed to secure my desired reward, as opposed to having to restart it after the task was marked as done.

And here's what an example module looks like:

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Arguments:
  pass

@dataclass
class Response:
  success: bool

@dataclass
class Email:
  subject: str
  body: str

def parse_args(args: List[str]) -> Arguments:
  """Parses command line arguments (eg. using argparse) and returns
  a structured object containing these arguments.
  """

def fetch(args: Arguments) -> Response:
  """Crawls over the page, extracts necessary data and puts it in the
  response object, which is then returned
  """

def is_finished(response: Response) -> bool:
  """Indicates whether the task is considered to be finished and should
  not be retried.
  """

def get_email(response: Response) -> Optional[Email]:
  """Returns an email to be used when sending the notification.
  
  Will return None if it determines that it's not yet time to
  send a notification.
  """
It's literally the example.py module.

The best thing? The runner part (run.py above) doesn't care what's inside a module. It only cares that it has this specific interface (i.e. these specific functions which return specific output).
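To make the interface more concrete, here's a rough sketch of how a Kickstarter module could fill it in. This is not the actual module. The field names and the email text are assumptions, the option names are borrowed from the crontab entry, and fetch is left out because it would just wrap the scraping code from the proof of concept.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Arguments:
    url_path: str
    reward_id: int


@dataclass
class Response:
    remaining: int  # free spots seen during the last fetch (assumed field)


@dataclass
class Email:
    subject: str
    body: str


def parse_args(args: List[str]) -> Arguments:
    """Parse the module-specific command line options."""
    parser = argparse.ArgumentParser(prog='kickstarter')
    parser.add_argument('--url-path', required=True)
    parser.add_argument('--reward-id', type=int, required=True)
    parsed = parser.parse_args(args)
    return Arguments(url_path=parsed.url_path, reward_id=parsed.reward_id)


# fetch(args) is omitted here: it would load the campaign page and
# return a Response with the number of remaining spots.


def is_finished(response: Response) -> bool:
    """Never mark the job as done, so it keeps checking until stopped."""
    return False


def get_email(response: Response) -> Optional[Email]:
    """Notify only when at least one spot is free."""
    if response.remaining <= 0:
        return None
    return Email(subject='Kickstarter reward available!',
                 body=f'{response.remaining} spots remaining')
```

Note how is_finished and get_email are decoupled: the module can send an email on every run that finds a free spot, while never telling the runner to stop.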

State management

There are also a few helper functions that do their best to manage the state:

from os.path import abspath, dirname, join

def get_filename(name: str) -> str:
  return join(dirname(abspath(__file__)), 'status', name)

def should_execute(name: str) -> bool:
  try:
    with open(get_filename(name), 'r') as f:
      return f.read().strip() != '1'
  except IOError:
    return True

def write_status(name: str, success: bool):
  with open(get_filename(name), 'w+') as f:
    f.write('1' if success else '0')
These are used in the run function above.

It's an extremely simple system for state management, using files in the filesystem to store 0 if a task still needs to be run, or 1 if it is completed. There are a couple of caveats:

  1. It's limited to one instance of each script at a time: it's currently not possible to run the Kickstarter script for multiple campaigns, as they'd overwrite each other's statuses. Fortunately for me, I rarely buy anything there, and it's easy to improve this part when necessary.
  2. It's binary: either the job succeeded, or not. In the passport case, that's sufficient: I get an email, and the next action is going to the consulate to retrieve it, in the real world, so I don't need another email. But in the Kickstarter case, what if I missed the first email and someone already took the reward? I'd like the script to continue working until I turn it off.
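One possible way around the first caveat is to derive the status file name from the arguments as well as the script name, so that each campaign gets its own file. The sketch below is only an illustration of that idea. The status_key helper is hypothetical, and the functions take the status directory as a parameter to keep the example self-contained.

```python
import hashlib
from pathlib import Path
from typing import List


def status_key(name: str, args: List[str]) -> str:
    """Build a file name that is unique per script and per argument list."""
    digest = hashlib.sha256('\0'.join(args).encode('utf-8')).hexdigest()[:12]
    return f'{name}-{digest}'


def should_execute(status_dir: Path, key: str) -> bool:
    """Run the job unless its status file says it's already done."""
    try:
        return (status_dir / key).read_text().strip() != '1'
    except FileNotFoundError:
        return True


def write_status(status_dir: Path, key: str, success: bool) -> None:
    """Store 1 for a finished job, 0 otherwise."""
    status_dir.mkdir(parents=True, exist_ok=True)
    (status_dir / key).write_text('1' if success else '0')
```

Two Kickstarter jobs with different --reward-id values would then end up with different keys and stop stepping on each other's toes.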

Scheduling

Because my needs are not webscale yet, I used crontab as a simple way to make the script run every now and then. The frequency depends on the job: picking up the passport had to be organised in real life, so one day of delay wouldn't make any difference, and that task ran daily. On the other hand, popular Kickstarter campaigns with a lot of demand change rapidly, so a much higher frequency was necessary.

0 12 * * * run.py passport --passport-no=<passport_no> --recipients <my email> --recipients <my wife's email>
*/2 * * * * run.py kickstarter --url-path=<author/campaign> --reward-id=<id> --recipients <email>
This crontab will make the Kickstarter script run every two minutes, and the passport one once per day at noon. Since I'm too lazy to always remember the syntax, I use helpers like crontab.guru to make sure my schedule is alright.
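The commands above mix options meant for the runner (the script name and --recipients) with options meant for the module. The entry point of run.py isn't shown here, so the following is only a guess at how that split could be done with argparse. The function name split_args is made up.

```python
import argparse
from typing import List, Tuple


def split_args(argv: List[str]) -> Tuple[str, List[str], List[str]]:
    """Separate runner options from the ones passed on to the module.

    Returns (script name, recipients, remaining module arguments).
    """
    parser = argparse.ArgumentParser(prog='run.py', allow_abbrev=False)
    parser.add_argument('name')
    # --recipients can be repeated, one address per occurrence
    parser.add_argument('--recipients', action='append', default=[])
    # Options the runner doesn't know are returned untouched in `rest`
    known, rest = parser.parse_known_args(argv)
    return known.name, known.recipients, rest
```

The three returned values line up with the name, recipients and args parameters of the run function.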

The downside of this approach is that after the job is completed, the cron task will keep executing according to the schedule. If everything goes right, then it should exit very early thanks to the state management layer. But in the worst case (i.e. I messed something up) it'll keep executing, and executing... until it's manually disabled.

What's the profit?

The obvious profit is that I saved some money. Whee!

That's not really the best part. I consider the biggest saving to be my time. By putting trust in my code, I don't have to frantically refresh the Kickstarter campaign page, or passport status page, or any other page and manually see if something changed. I can simply forget about it until I receive the notification, and use my brain cycles on something less repetitive. This is the same reason why I still use RSS to get notifications about new posts on blogs I read, instead of visiting them manually.

Moreover, isn't it satisfying to be able to forget about this script, and then be surprised by a sudden email—meaning that it's working correctly behind the scenes?