Democratizing AI - new paper

Let’s discuss a new paper I have out this week:

This study asks whether, and to what extent, individuals who are not experts in AI or machine learning can contribute to designing machine learning problems. Machine learning is a technical and challenging field. Practitioners often have years of experience, so there may simply not be room for non-experts to design their own prediction tasks. Yet machine learning is becoming increasingly automated. The promise of the AutoML subfield may be a democratization of machine learning, broadening its accessibility.

Of course, non-experts have long contributed to machine learning by providing training data, but only for pre-established problems 3. It’s easy for someone to help train a machine learning classifier that distinguishes between objects in images by labeling those images 4. But what if you don’t want to predict the contents of a photo? What if you have a new task in mind, say you want to predict whether someone will respond positively or negatively to a piece of music, but you don’t have any experience in building binary classifiers? As a non-expert (in machine learning; you could be a deeply experienced and talented musician), what can you do?

Enter my paper. Here’s the abstract:

Non-experts have long made important contributions to machine learning (ML) by contributing training data, and recent work has shown that non-experts can also help with feature engineering by suggesting novel predictive features. However, non-experts have only contributed features to prediction tasks already posed by experienced ML practitioners. Here we study how non-experts can design prediction tasks themselves, what types of tasks non-experts will design, and whether predictive models can be automatically trained on data sourced for their tasks. We use a crowdsourcing platform where non-experts design predictive tasks that are then categorized and ranked by the crowd. Crowdsourced data are collected for top-ranked tasks and predictive models are then trained and evaluated automatically using those data. We show that individuals without ML experience can collectively construct useful datasets and that predictive models can be learned on these datasets, but challenges remain. The prediction tasks designed by non-experts covered a broad range of domains, from politics and current events to health behavior, demographics, and more. Proper instructions are crucial for non-experts, so we also conducted a randomized trial to understand how different instructions may influence the types of prediction tasks being proposed. In general, understanding better how non-experts can contribute to ML can further leverage advances in Automatic machine learning and has important implications as ML continues to drive workplace automation.

In essence, I used a crowdsourcing platform not to collect training data for a pre-existing prediction task, as is traditionally done, but to ask members of the crowd to propose their own prediction tasks. Here each prediction task is a set of input questions and one target question. Those familiar with machine learning will see that gathering answers to these questions gives us training data for a supervised learning problem: try to predict the answer to the target question given only the answers to the input questions.
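To make the mapping from questions to training data concrete, here is a minimal sketch. It is not code from the paper: the `PredictionTask` class, the `to_training_data` helper, and the survey answers are all made up for illustration. The questions are taken from the first task in Table 1.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PredictionTask:
    """Hypothetical container: one target question plus its input questions."""
    target: str
    inputs: List[str]


def to_training_data(task: PredictionTask,
                     responses: List[Dict[str, object]]) -> Tuple[list, list]:
    """Turn survey responses (question text -> answer) into features X and labels y."""
    X, y = [], []
    for response in responses:
        questions = [task.target] + task.inputs
        # Skip respondents who did not answer every question in the task.
        if any(q not in response for q in questions):
            continue
        X.append([response[q] for q in task.inputs])
        y.append(response[task.target])
    return X, y


task = PredictionTask(
    target="What is your annual income?",
    inputs=[
        "You have a job?",
        "How much do you make per hour?",
        "How many hours do you work per week?",
        "How many weeks per year do you work?",
    ],
)

# Made-up answers, for illustration only.
responses = [
    {task.inputs[0]: "yes", task.inputs[1]: 20, task.inputs[2]: 40,
     task.inputs[3]: 50, task.target: 40000},
    {task.inputs[0]: "yes", task.inputs[1]: 15, task.inputs[2]: 30,
     task.inputs[3]: 48, task.target: 21600},
    {task.inputs[0]: "no"},  # incomplete response, dropped below
]

X, y = to_training_data(task, responses)
print(len(X), "usable responses")
```

Each row of `X` holds one person's answers to the input questions, and the matching entry of `y` is that person's answer to the target question.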

Here are some examples of prediction tasks (Table 1 in the paper), all generated by study participants:

Prediction task
Target: What is your annual income?
Input: You have a job?
Input: How much do you make per hour?
Input: How many hours do you work per week?
Input: How many weeks per year do you work?

Prediction task
Target: Do you have a good doctor?
Input: How many times have you had a physical in the last year?
Input: How many times have you gone to the doctor in the past year?
Input: How much do you weigh?
Input: Do you have high blood pressure?

Prediction task
Target: Has racial profiling in America gone too far?
Input: Do you feel authorities should use race when determining who to give scrutiny to?
Input: How many times have you been racially profiled?
Input: Should laws be created to limit the use of racial profiling?
Input: How many close friends of a race other than yourself do you have?

By couching the problem in terms of target and input questions, the participants don’t need to know supervised learning details. Here’s a screenshot of the instructions and part of the web interface that participants used to build prediction tasks 5.

After gathering a bunch of new prediction tasks, I asked other members of the crowd to categorize the tasks (is it about health? Politics?) and describe the task in several ways, then vote on the “quality” of the task. Using these votes, I ran a ranking algorithm to efficiently determine the few best (according to the crowd votes) tasks, which I then sent out as traditional data-collecting jobs on the crowdsourcing platform. These data were then used to automatically train predictive models to determine if accurate predictions could be made.
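As a rough illustration of turning crowd votes into a shortlist, the sketch below ranks tasks by their mean rating and keeps the top few. This is a deliberately simple stand-in, not the ranking algorithm used in the paper (which was designed to be efficient with votes). The `top_tasks` function, the task names, and the 1 to 5 rating scale are assumptions made for the example.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def top_tasks(votes: Iterable[Tuple[str, float]], k: int) -> List[str]:
    """Return the k task ids with the highest mean rating."""
    totals = defaultdict(lambda: [0.0, 0])  # task id -> [sum of ratings, vote count]
    for task_id, rating in votes:
        totals[task_id][0] += rating
        totals[task_id][1] += 1
    means = {task_id: s / n for task_id, (s, n) in totals.items()}
    # Highest mean first; ties are broken by task id so the result is deterministic.
    return sorted(means, key=lambda t: (-means[t], t))[:k]


# Made-up votes on a hypothetical 1-5 quality scale.
votes = [
    ("income", 5), ("income", 4),
    ("doctor", 2),
    ("profiling", 4), ("profiling", 5), ("profiling", 3),
]

best = top_tasks(votes, k=2)
print(best)
```

A plain average treats a task with one vote the same as a task with many, which is one reason a more careful ranking scheme is worth using in practice.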

I found that machine learning methods could train accurate predictive models 6 but challenges remain. For example, if a question calls for a numeric answer with units (how tall are you?) but those units aren’t specified, then different answers won’t necessarily be comparable. I generally found that regression tasks (where the target question had a numeric answer) were more challenging for automatic predictive models. In contrast, classification tasks (where the target question was, in my study, binary) were more likely to lead to accurate predictive models. Participants were overall more likely to propose true/false questions, so it is plausible that they are more comfortable with this format.
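For readers who want to see what automatically training and evaluating a model on one binary task might look like, here is a minimal sketch using scikit-learn's random forest. It is an illustration under stated assumptions, not the paper's pipeline: the data are synthetic, and the number of respondents, the number of trees, and the 5-fold cross-validation are arbitrary choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for survey data: 200 respondents, 4 numeric input answers.
X = rng.normal(size=(200, 4))
# Binary target answer that depends on the first two inputs plus noise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Random forest classifier, scored by 5-fold cross-validated accuracy.
model = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=5)

# Accuracy of always guessing the more common answer, for comparison.
baseline = max(y.mean(), 1 - y.mean())

print(f"mean CV accuracy: {scores.mean():.2f}, majority baseline: {baseline:.2f}")
```

Comparing against the majority-class baseline matters because a lopsided true/false question can look predictable even when the inputs carry no information. For a numeric target, `RandomForestRegressor` from the same module would be the analogous choice.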

There’s a lot more in the paper, including a randomized trial to see if an example prediction task helps participants understand the problem or if it biases them to certain types of prediction tasks (Do participants who see an example about predicting obesity go on to propose more health-focused tasks than participants not given an example?).

From the conclusion:

In general, the more that non-experts can contribute creatively to ML, and not merely provide training data, the more we can leverage areas such as AutoML to design new and meaningful applications of ML. More diverse groups can benefit from such applications, allowing for broader participation in jobs and industries that are changing due to machine-learning-driven workplace automation.

Check out the paper for more.

  2. Here’s a BibTeX entry if you want a quick cite:

        title = {Democratizing AI: non-expert design of prediction tasks},
        author = {Bagrow, James P.},
        year = 2020,
        volume = 6,
        pages = {e296},
        journal = {PeerJ Computer Science},
        issn = {2376-5992},
        doi = {10.7717/peerj-cs.296}

    Other reference formats available for download on the journal page↩︎

  3. At this point, it may be worth mentioning Kaggle, a platform for crowdsourcing machine learning models. But problems on Kaggle are designed by the data providers, not the crowd. The crowd builds the predictive models; Kaggle is an expert crowdsourcing market, where the crowd are, or will be, experts in machine learning. ↩︎

  4. Heck, we all do this every time a Google ReCaptcha appears. ↩︎

  5. Full details are in the paper’s Supplemental Information↩︎

  6. Specifically, I used random forest models. Random forests are a little out-of-date in the era of deep learning, but they are generally robust to overfitting, work for both regression and classification, and make it easy to get quite accurate predictions without much fine-tuning. ↩︎

Jim Bagrow
Associate Professor of Mathematics & Statistics

My research interests include complex networks, computational social science, and data science.