Week 47 — Lessons Learned from Regulatory AI Projects Part 1

Scott McNaughton
Jan 31, 2020

What have we learned doing AI regulatory projects?

As I’m sure you have noticed, I’ve switched writing my weeknotes from Thursday evening to Friday evening. Doing it at the start of a weekend is much easier for me, and for whatever reason it feels easier to write after a long and fulfilling week than on a Thursday night when I still have one more day until the weekend.

This week I spent a lot of time plotting out the future of the projects I’ve shared with you in my weeknotes for 47 weeks. I’ve also started to look at what comes next, so I’ll take a few lines to describe some of the ideas I am most excited about. I can’t promise that everything will stick, but maybe it will inspire others to reach out or to do the work themselves. I like to think that as public servants we are not in competition with each other: there is enough space for anyone who wants to innovate, and ideally we should all be working together.

Hindsight is 20/20 — What does it take to be successful with AI?

Most of the week has been spent reflecting on what has been learned while doing the various regulatory artificial intelligence projects — incorporation by reference and the regulatory evaluation platform. In past weeknotes, I shared some of the lessons learned, especially around procurement, but for this week’s entry I wanted to share one particular lesson that I think is applicable for anyone who is thinking about or planning to start an AI project.

It should come as no surprise that data, both its quantity and its quality, is important for an AI project. Without data, you have nothing. I think most people understand how important data is for any AI project. I’m not arguing that point, because I think at some level we all understand that AI isn’t magic and it doesn’t just “learn” whatever we want it to (and if you think this… we should talk). Rather, I’d like to add detail about what I mean when I talk about the quantity and quality of data.

Once you have defined your problem and you have a pretty good idea of why you want to solve it, it’s time to do a data assessment. Be deliberate and honest in assessing what data you have, how good that data is and whether you have enough of it. There are no hard and fast rules about what these qualifiers mean, because they vary from use case to use case. However, you want to make sure that your data (and data collection practices) are consistent, standardized, commonly understood, and ideally machine readable. You want to make sure people understand that the data field “apples” refers to “all Granny Smith and Red Delicious apples” and that everyone who counted apples did so in the same way. You want to make sure you have enough data points to train a model (usually at least a couple of thousand) and that the problem you are trying to solve is actually a machine learning problem. If you don’t have enough data, whether in quality or quantity, do you have another way to fill the gaps? For example, if it’s an area where an expert distinguishes items between two categories, can you ask experts to label data points for you?
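For what it’s worth, here is a minimal sketch of what that kind of up-front assessment can look like in practice. The file name, column names and the idea of a single “label” column are all hypothetical; the point is just to make the quantity, quality and consistency questions concrete before any modelling starts.

```python
import pandas as pd

# Hypothetical dataset of labelled records we are considering training on.
df = pd.read_csv("labelled_records.csv")

# Quantity: do we have enough examples overall, and per label?
print("Total records:", len(df))
print(df["label"].value_counts())

# Quality: how complete is each field we plan to use?
print(df.isna().mean().sort_values(ascending=False))

# Consistency: do labels collapse cleanly once whitespace/case noise is removed?
raw = df["label"].nunique()
normalized = df["label"].str.strip().str.lower().nunique()
print(f"{raw} raw labels vs {normalized} after normalization")
```

If the per-label counts are tiny, the missing-value rates are high, or the label column shrinks dramatically after normalization, that is the honest signal that you need to fill gaps (for example with expert labelling) before treating this as a machine learning problem.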

Upon reflection, and with hindsight being 20/20, a lot of the work of the past year would have benefited from a more robust up-front analysis of the data available, strategies to fill in gaps and an honest assessment of whether the problem we were trying to solve was indeed one that could be solved with machine learning, AI and/or natural language processing. Sometimes the up-front investment of time and resources is worth it, even if all it tells you is that you aren’t ready for AI.

AI Demonstrator Projects (Incorporation by Reference, Regulatory Evaluation Platform, Rules as Code)

Regulatory Evaluation Platform: We have exercised the option on one of our contracts and are getting ready to kick off the next phase of this project. We took the time to reflect with a core set of regulators and data scientists on the results of the first phase of work. I am planning a post for next week where I talk about the next lesson learned, around the scope of an AI project, but for now we are taking stock. Our initial thoughts are to focus almost exclusively on one use case, hammer out an impressive result (with lots of iteration) and then move to another use case. Our selected use case is to associate a regulation with one or many industry codes so we can conclusively know which regulations impact which industries across the supply chain. Using this information, we can start making inferences about how many regulations a specific industry faces and how much burden that industry carries.
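As a deliberately over-simplified illustration of that use case, here is a keyword-matching sketch that tags a regulation with industry codes and tallies a rough burden count per code. The NAICS-style codes, keyword lists and sample texts are my own placeholders, not anything from the actual project, which will almost certainly use something more sophisticated than keyword matching.

```python
from collections import defaultdict

# Hypothetical keyword lists per NAICS-style industry code.
INDUSTRY_KEYWORDS = {
    "111": ["crop", "farm", "grain"],        # crop production
    "311": ["food", "processing", "label"],  # food manufacturing
    "484": ["truck", "freight", "carrier"],  # truck transportation
}

def industry_codes_for(regulation_text: str) -> list[str]:
    """Return every industry code whose keywords appear in the regulation text."""
    text = regulation_text.lower()
    return [code for code, words in INDUSTRY_KEYWORDS.items()
            if any(word in text for word in words)]

# Placeholder texts; in practice these would be the full regulatory texts.
regulations = {
    "REG-A": "Requirements for labelling processed food products...",
    "REG-B": "Hours of service rules for freight carriers and truck drivers...",
}

# Burden count: how many regulations touch each industry code.
burden = defaultdict(int)
for name, text in regulations.items():
    for code in industry_codes_for(text):
        burden[code] += 1

print(dict(burden))  # e.g. {'311': 1, '484': 1}
```

Even in this toy form, the quality of the mapping depends entirely on the regulation text and the code definitions it is matched against, which loops back to the data assessment lesson above.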

Incorporation by Reference: Nothing major to update on this project. We are meeting with our contractor next week before the major milestone on February 14 so I will have more to share next week.

Rules as Code: We had our introduction meeting between our contractor and the Labour Program, which is the owner of the use case we selected. We are looking at Sections 12 and 13 of the Canada Labour Standards Regulations, which detail vacation pay entitlements for eligible employees. These regulations are exciting ones to start with and involve a number of interesting concepts for us to work through, like: what is an employer? What is an employee? How do we define time and capture it? A number of other initial issues have been identified as well, but we are hopeful that, starting next week when we have our first formal working/facilitated session, we can share significant progress on this Discovery project on Rules as Code.
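To make the Rules as Code idea a bit more tangible, here is a minimal sketch of what an encoded vacation pay rule could look like. The thresholds and rates below are illustrative placeholders, not the actual text of Sections 12 and 13, and the Employee fields are a crude simplification of exactly the “what is an employee?” question the project still has to work through.

```python
from dataclasses import dataclass

@dataclass
class Employee:
    years_of_continuous_employment: float
    wages_for_year: float  # wages earned in the year of employment

def vacation_pay_rate(years: float) -> float:
    """Placeholder rate schedule; the real schedule would come from the regulations."""
    if years >= 10:
        return 0.08
    if years >= 5:
        return 0.06
    return 0.04

def vacation_pay(emp: Employee) -> float:
    return emp.wages_for_year * vacation_pay_rate(emp.years_of_continuous_employment)

print(vacation_pay(Employee(years_of_continuous_employment=6, wages_for_year=50_000)))  # 3000.0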

It’s been a cold one in Ottawa but Week 47 is done. Have a good week!
