Testing javascript with python

I was recently tasked with adding Mailcheck.js to some of our production pages and I want to describe a bit of the process I went through because I did some things a bit differently and had some fun along the way.

Lets start with a PSA - do not simply drop Mailcheck onto your website as is! In my opinion / findings the default algorithm is way too greedy - aka it will mostly suggest all emails should be ____@gmail.com. It is worth taking the time to tweak mailcheck for your particular userbase, on one wants to see a correction for their proper email address!

The first thing I did was dumped a ton of emails from our database to create a dataset to work with. I could have used Node to write some scripts to test out the Mailcheck behaviour but Python is just so much more convient for doing numerical analysis. Plus it’s what our data team uses so I could leverage some of their knowledge and code. So now for the fun part - I ended up using PyV8 (a python wrapper for calling out to Google’s V8 javascript engine). With this setup I was able to slice and dice through our production emails using python and pandas calling the exact javascript mailcheck algorithm and collecting my results. After tweaking the algorithm I could take the settings and new js code and put it in production.

Check out this wacky franken script that got the job done (pandas not included):

import PyV8

def init_mailcheck():
  global ctxt
  ctxt = PyV8.JSContext()
  ctxt.enter()
  ctxt.eval(open("mailcheck.js").read())


def run_sift3Distance(s1,s2):
  script = "Mailcheck.mailcheck.sift3Distance('%s','%s')" %(s1,s2)
  return ctxt.eval(script)


def run_splitEmail(email):
  script = "Mailcheck.mailcheck.splitEmail('%s')" %(email)
  return ctxt.eval(script)


def run_mailcheck(email):
  script = """ Mailcheck.mailcheck.run({
         email: "%s",
       })
   """ % (email)
  result =  ctxt.eval(script)
  if result:
    try:
      result = result.address + '@' + result.domain
    except(AttributeError):
       pass

  return result

if __name__=="__main__":
  init_mailcheck()
  print run_mailcheck("kevinhughes27@gmil.com")
  # >>> @kevinhughes27@gmail.com