Using AI to Discern Cats from CDC Vaccination Cards
As of 8/9/2021, the NYC Covid Safe app can’t tell the difference between a valid CDC vaccination card and a picture of a cat. Telling them apart isn’t just possible, it’s pretty easy: with AI (node.js, Textract, and Rekognition), it takes fewer than 300 lines of code.
NYC Covid Safe recap
What happened, who was involved, and why I got interested.
😬 7/29/21 9:52pm, NYC Covid Safe was launched via tweet from Council Member Mark Levine
This is an app promoted by the city that certifies a person’s vaccination status. It’s not the only one out there, but it’s the one the city is getting behind.
😺 13 minutes later, an intrepid New Yorker uploaded an image of a cat to the app, and got a ✅
Huge Ma, developer of TurboVax, then did what no one over at NYC Covid Safe thought of doing: he found out if the app actually worked. Then he tweeted his findings here. To be honest, it made me a little bit sad as a New Yorker.
😢 8/4/2021 – The New York Times rained on Mark Levine’s parade
🤔 8/6/21 – I thought: Wait-justa-minnit
What if I could do a little bit better than NYC Covid Safe? Not like, a lot better. Just a little bit better. Seriously, this app can’t tell the difference between a valid CDC vaccination card, a cat, and Mickey Mouse. So on Friday, which was a day off for me, I looked into it.
Google’s Vision API
– A really awesome API with highly usable code samples and a jaw-dropping landmarks feature. In about 30 minutes I had a working code sample that can detect well-known architectural structures, like the Seagram Building in New York City, from just partial image samples. I will definitely use this in the future!
AWS Textract
– This Amazon API handles OCR from scanned text, receipts, and even images of handwriting. It’s easy to work with, has multiple language bindings, and there are many solid, usable code samples. I used Textract to verify the presence of specific words within the uploaded image (e.g. CDC, Vaccine, 1st Dose, 2nd Dose) and to validate a minimum viable document structure following the key-value-pair layout of a standard CDC card.
AWS Rekognition
– This Amazon API handles object detection, facial recognition, sentiment inference, and a whole lot more. I used this in conjunction with Textract to verify that a user’s image met baseline criteria for being vaccination-card-like (i.e. that it resembles a Document, Id Card, or a piece of paper). I didn’t want to make this so stringent that it rejects images with extraneous objects like a cat, a tree, or a bowl of cereal, because weird stuff winds up in hastily taken selfies. My solution was to simply ignore that stuff, and find the card.
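To make the two checks concrete, here is a minimal sketch of how they could work on the response objects the two services return. It is not the code from the repo. The word list comes from the examples above, but the label strings, the confidence cutoff of 80, and the "at least 3 words" threshold are all assumptions picked for illustration, and every function name is hypothetical.

```javascript
// Words the card text should contain (from the examples above).
const EXPECTED_WORDS = ['cdc', 'vaccine', '1st dose', '2nd dose'];

// Labels treated as "card-like". These exact strings are assumptions.
const CARD_LIKE_LABELS = ['Document', 'Id Cards', 'Paper', 'Text'];

// Collect the text of every LINE block in a Textract response, lowercased.
function extractLines(textractResponse) {
  return (textractResponse.Blocks || [])
    .filter((block) => block.BlockType === 'LINE' && block.Text)
    .map((block) => block.Text.toLowerCase());
}

// Return the expected words that appear somewhere in the recognized text.
function findExpectedWords(textractResponse, expectedWords = EXPECTED_WORDS) {
  const text = extractLines(textractResponse).join('\n');
  return expectedWords.filter((word) => text.includes(word));
}

// True if at least one card-like label clears the confidence threshold.
// Unrelated labels (cat, tree, cereal) are ignored rather than penalized.
function looksLikeCard(rekognitionResponse, minConfidence = 80) {
  return (rekognitionResponse.Labels || []).some(
    (label) =>
      CARD_LIKE_LABELS.includes(label.Name) && label.Confidence >= minConfidence
  );
}

// Combine both checks. minWords = 3 is an assumed threshold.
function isPlausibleCard(textractResponse, rekognitionResponse, minWords = 3) {
  return (
    looksLikeCard(rekognitionResponse) &&
    findExpectedWords(textractResponse).length >= minWords
  );
}
```

Because a cat label is simply skipped, a photo of a card with a cat in the background can still pass, while a cat on its own fails both checks. This sketch leaves out the key-value-pair structure check mentioned above.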
💡 Using AI to discern cats from CDC cards
So here’s how I think I cleared the bar for just a little bit better: I manually upload an image to an S3 bucket I control, then run my code (locally) to analyze that image (in its bucket location). Only two API calls happen:
textract.analyzeDocument(params)
rekognition.detectLabels(params)
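As a rough illustration of how those two calls could be dispatched against one image in S3, here is a minimal sketch. It is not the function from the repo. It assumes clients that follow the AWS SDK for JavaScript v2 convention of returning a request object with `.promise()`; the names `buildParams` and `analyzeImage`, the `MaxLabels` and `MinConfidence` values, and the bucket and key in the usage comment are all hypothetical.

```javascript
// Build the request parameters for both services from one S3 location.
function buildParams(bucket, key) {
  const s3Object = { Bucket: bucket, Name: key };
  return {
    // FORMS asks Textract for key-value pairs as well as plain text.
    textract: { Document: { S3Object: s3Object }, FeatureTypes: ['FORMS'] },
    // MaxLabels and MinConfidence are assumed values, not tuned ones.
    rekognition: { Image: { S3Object: s3Object }, MaxLabels: 20, MinConfidence: 70 },
  };
}

// Dispatch both calls in parallel and return the raw responses.
// The clients are passed in so the function can be run against stubs.
async function analyzeImage({ textract, rekognition }, bucket, key) {
  const params = buildParams(bucket, key);
  const [document, labels] = await Promise.all([
    textract.analyzeDocument(params.textract).promise(),
    rekognition.detectLabels(params.rekognition).promise(),
  ]);
  return { document, labels };
}

// Usage with real clients (assumes the `aws-sdk` v2 package, configured
// credentials and region, and a hypothetical bucket and key):
//   const AWS = require('aws-sdk');
//   analyzeImage(
//     { textract: new AWS.Textract(), rekognition: new AWS.Rekognition() },
//     'my-card-bucket', 'uploads/card.jpg'
//   ).then(console.log);
```

Running the two calls with `Promise.all` keeps the total wait close to the slower of the two services rather than the sum of both.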
Demo Video
Available here: https://youtu.be/0xaeA7-bLl0
Code
👇🏽 Here’s the main async function that dispatches the API calls.
The git repo is here.
Card Scanner Part 2…
It’s an interactive file uploader. You can read about it here.