Building my first Shopify App

A few weeks ago, at my first quarterly hackathon at Shopify, I joined a team that was building a Shopify App to help charities issue donation receipts for orders on their store. We got pretty far during the hackathon, and afterwards I kept working on it in my free time. It was a good way to dogfood our API and tooling, which was my responsibility at work.

I finally finished the app, and a few days ago it launched on the Shopify App Store. The app uses webhooks to automate sending customers tax receipts for their donations to a non-profit Shopify store. It’s a pretty cool little app and a good example of how to build a simple piece of automation using webhooks.
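One step every webhook receiver needs is checking that a request really came from Shopify, which signs each webhook with a base64-encoded HMAC-SHA256 of the raw body in the X-Shopify-Hmac-Sha256 header. The app itself is written in Ruby, so the snippet below is only a minimal Python sketch of that check, not code from the app; the function name, secret and payload are made up for illustration.

```python
import base64
import hashlib
import hmac


def webhook_is_authentic(raw_body, hmac_header, shared_secret):
    """Return True if hmac_header matches the HMAC-SHA256 of raw_body.

    All three arguments are bytes. raw_body must be the unparsed request
    body, since re-serialized JSON will not hash to the same value.
    """
    digest = hmac.new(shared_secret, raw_body, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, hmac_header)


# Hypothetical secret and payload, signed locally to exercise the check.
secret = b"hypothetical-shared-secret"
body = b'{"id": 1, "total_price": "25.00"}'
header = base64.b64encode(hmac.new(secret, body, hashlib.sha256).digest())
print(webhook_is_authentic(body, header, secret))
```

Anything that fails the check should be rejected before any receipt logic runs.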

I knew the scope of the app was going to be small, so I wanted to pick an appropriately minimalist framework instead of Rails. I went with Sinatra and ended up extracting a small gem, shopify-sinatra-app, for others to use. The app itself is also open source; you can check out all the code and follow the ongoing development and maintenance here.

Testing JavaScript with Python

I was recently tasked with adding Mailcheck.js to some of our production pages and I want to describe a bit of the process I went through because I did some things a bit differently and had some fun along the way.

Let’s start with a PSA - do not simply drop Mailcheck onto your website as is! In my experience the default algorithm is way too greedy: it mostly suggests that every email should be ____@gmail.com. It is worth taking the time to tweak Mailcheck for your particular userbase; no one wants to see a correction for their proper email address!
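To make "greedy" a bit more concrete: Mailcheck scores candidate domains with a Sift3-style string distance and suggests the closest one if it is under a threshold. Below is a rough Python port of Sift3 written from memory, so treat it as a sketch that may differ in detail from the sift3Distance shipped in mailcheck.js. The function name and the max_offset default are my own choices.

```python
def sift3_distance(s1, s2, max_offset=5):
    """Approximate edit distance between two strings (Sift3-style)."""
    if not s1:
        return float(len(s2)) if s2 else 0.0
    if not s2:
        return float(len(s1))

    c = offset1 = offset2 = lcs = 0
    while c + offset1 < len(s1) and c + offset2 < len(s2):
        if s1[c + offset1] == s2[c + offset2]:
            lcs += 1  # characters line up, extend the common subsequence
        else:
            # Look a few characters ahead in either string to resynchronize.
            offset1 = offset2 = 0
            for i in range(max_offset):
                if c + i < len(s1) and s1[c + i] == s2[c]:
                    offset1 = i
                    break
                if c + i < len(s2) and s1[c] == s2[c + i]:
                    offset2 = i
                    break
        c += 1
    return (len(s1) + len(s2)) / 2.0 - lcs


# A genuine typo and a legitimate domain, both measured against gmail.com.
print(sift3_distance("gmil.com", "gmail.com"))
print(sift3_distance("gmx.com", "gmail.com"))
```

A real typo like gmil.com scores very close to gmail.com, but a perfectly valid short domain does not land far away either, so a loose threshold ends up "correcting" real addresses.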

The first thing I did was dump a ton of emails from our database to create a dataset to work with. I could have used Node to write some scripts to test out the Mailcheck behaviour, but Python is just so much more convenient for doing numerical analysis. Plus it’s what our data team uses, so I could leverage some of their knowledge and code. So now for the fun part - I ended up using PyV8 (a Python wrapper for calling out to Google’s V8 JavaScript engine). With this setup I was able to slice and dice through our production emails using Python and pandas, calling the exact JavaScript Mailcheck algorithm and collecting my results. After tweaking the algorithm I could take the settings and new JS code and put them in production.

Check out this wacky Frankenscript that got the job done (pandas not included):

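import json

import PyV8

ctxt = None  # shared V8 context, created by init_mailcheck()


def init_mailcheck():
  # Load mailcheck.js into a V8 context that stays entered for the whole run.
  global ctxt
  ctxt = PyV8.JSContext()
  ctxt.enter()
  with open("mailcheck.js") as f:
    ctxt.eval(f.read())


def js_str(s):
  # json.dumps returns a quoted, escaped literal that is also valid JavaScript,
  # so a quote or backslash in an email cannot break the generated script.
  return json.dumps(s)


def run_sift3Distance(s1, s2):
  script = "Mailcheck.mailcheck.sift3Distance(%s, %s)" % (js_str(s1), js_str(s2))
  return ctxt.eval(script)


def run_splitEmail(email):
  script = "Mailcheck.mailcheck.splitEmail(%s)" % js_str(email)
  return ctxt.eval(script)


def run_mailcheck(email):
  script = "Mailcheck.mailcheck.run({ email: %s })" % js_str(email)
  result = ctxt.eval(script)
  if result:
    try:
      # A suggestion object carries the local part and the corrected domain.
      result = result.address + '@' + result.domain
    except AttributeError:
      pass

  return result


if __name__ == "__main__":
  init_mailcheck()
  print(run_mailcheck("kevinhughes27@gmil.com"))
  # >>> kevinhughes27@gmail.com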
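
With a function like run_mailcheck in hand, the analysis side is mostly counting. Here is a minimal sketch of that loop using only the standard library (so still no pandas). tally_suggestions and fake_suggest are hypothetical names, fake_suggest is a stand-in so the snippet runs without PyV8, and the sample addresses are made up.

```python
from collections import Counter


def tally_suggestions(emails, suggest):
    """Count (old_domain, suggested_domain) pairs over a list of emails.

    `suggest` is any callable that returns a corrected address, or a falsy
    value when it has no suggestion (the same shape as run_mailcheck).
    """
    corrections = Counter()
    untouched = 0
    for email in emails:
        suggestion = suggest(email)
        if not suggestion or suggestion == email:
            untouched += 1
            continue
        old_domain = email.rsplit("@", 1)[-1].lower()
        new_domain = suggestion.rsplit("@", 1)[-1].lower()
        corrections[(old_domain, new_domain)] += 1
    return corrections, untouched


def fake_suggest(email):
    # Stand-in for run_mailcheck: "corrects" any non-gmail domain that
    # starts with "gm" to gmail.com, mimicking an overly greedy setup.
    local, _, domain = email.rpartition("@")
    if domain.startswith("gm") and domain != "gmail.com":
        return local + "@gmail.com"
    return False


sample = ["a@gmil.com", "b@gmx.com", "c@gmail.com", "d@example.org"]
corrections, untouched = tally_suggestions(sample, fake_suggest)
for (old, new), count in corrections.most_common():
    print("%s -> %s: %d" % (old, new, count))
```

Sorting the pairs by count makes the bad corrections easy to spot: any legitimate domain near the top of the list is a sign the settings are too aggressive for your users.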

A python library for Incremental PCA (pyIPCA)

I extracted some of the useful code and nifty examples from the background of my thesis as a Python library for your enjoyment. PCA, or Principal Component Analysis, is a pretty common data analysis technique. Incremental PCA lets you perform the same type of analysis but consumes the input data one sample at a time rather than all at once.
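
To give a feel for the "one sample at a time" idea, here is a small sketch for 2D points only. It keeps a running mean and covariance (Welford-style updates) and reads the principal axis off the 2x2 covariance in closed form. This is just one simple way to do PCA on a stream, not the algorithms implemented in pyIPCA; the class and method names are hypothetical.

```python
import math


class StreamingPCA2D(object):
    """Running mean and covariance of 2D points, updated per sample."""

    def __init__(self):
        self.n = 0
        self.mean_x = self.mean_y = 0.0
        self.sxx = self.sxy = self.syy = 0.0  # sums of centered products

    def partial_fit(self, x, y):
        """Fold one sample into the running statistics."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the old mean
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Old deviation times new deviation keeps the sums exact.
        self.sxx += dx * (x - self.mean_x)
        self.sxy += dx * (y - self.mean_y)
        self.syy += dy * (y - self.mean_y)
        return self

    def components(self):
        """Return ((var1, var2), (axis_x, axis_y)) for the data seen so far."""
        if self.n < 2:
            raise ValueError("need at least two samples")
        a = self.sxx / (self.n - 1)
        b = self.sxy / (self.n - 1)
        c = self.syy / (self.n - 1)
        # Closed-form eigen-decomposition of the symmetric matrix [[a, b], [b, c]].
        mid = (a + c) / 2.0
        spread = math.hypot((a - c) / 2.0, b)
        theta = 0.5 * math.atan2(2.0 * b, a - c)
        return (mid + spread, mid - spread), (math.cos(theta), math.sin(theta))


pca = StreamingPCA2D()
for x, y in [(-1, -2), (0, 0), (1, 2)]:  # points on the line y = 2x
    pca.partial_fit(x, y)
variances, axis = pca.components()
print(variances, axis)
```

Storing the full covariance only works because the data is 2D. Proper incremental PCA methods avoid that cost in high dimensions by updating the components directly.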

The code fully conforms to the scikit-learn API, and you should be able to easily use it anywhere you are currently using one of the sklearn.decomposition classes. In fact, this library is sort of on the waiting list for sklearn.

IPCA on 2D point cloud shaped like an ellipse

Check it out if you’re interested and holla at sklearn if you want this feature! github.com/kevinhughes27/pyIPCA