

AlphaD3M now has grammar, baby
27 Jun 2019 · 533 words

There’s this lovely little paper called Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar. It’s by Drori et al. (ICML workshop 2019), and here’s the paper link: https://arxiv.org/pdf/1905.10345.pdf

But, what is AlphaD3M? Don’t know? Read this summary before continuing, brave adventurer.

I won’t give you an overview of this paper here (but the title kind of already does).

So, what’s interesting about it? The takeaways? Give it to me nice and straight, you cry, and don’t you dare waste words on frivolous sentences.

Sure. Here you go.

Interesting shit

Constraining primitives with a context-free grammar works well. A sentence is made from words, and a pipeline is made from primitives. But only some word combinations make a sentence, and only some primitive combinations make a pipeline. It’s clear why a grammar is useful here. Why allow pipelines that don’t make sense?
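To make that concrete, here’s a minimal sketch of what production rules over primitives could look like. This is my own toy example in Python, not the grammar from the paper, and the primitive names are made up:

    # Toy context-free grammar over pipeline primitives (illustrative only).
    # Non-terminals are uppercase; terminals are primitive names.
    GRAMMAR = {
        "PIPELINE":   [["PREPROCESS", "ESTIMATOR"], ["ESTIMATOR"]],
        "PREPROCESS": [["imputer"], ["imputer", "scaler"], ["scaler"]],
        "ESTIMATOR":  [["random_forest"], ["logistic_regression"]],
    }

    def derivations(symbol):
        """Expand a symbol into every primitive sequence the grammar allows."""
        if symbol not in GRAMMAR:  # terminal: a concrete primitive
            return [[symbol]]
        results = []
        for rule in GRAMMAR[symbol]:
            seqs = [[]]
            for part in rule:
                seqs = [s + tail for s in seqs for tail in derivations(part)]
            results.extend(seqs)
        return results

    # derivations("PIPELINE") contains ["imputer", "scaler", "random_forest"],
    # but never a nonsense ordering like ["random_forest", "imputer"].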

The grammar vastly reduced the MCTS search space. The branching factor went down three-fold and the average search depth by an order of magnitude. Now the MCTS searches through less stuff and it’s a lot faster.
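Roughly speaking (again my own illustration, not the paper’s code), the grammar means each node in the search tree only branches into the primitives that can legally come next, instead of branching into every primitive in the library:

    # Illustrative only: branching factor with and without a grammar.
    ALL_PRIMITIVES = ["imputer", "scaler", "pca",
                      "random_forest", "logistic_regression", "svm"]

    def legal_actions(partial_pipeline):
        """Primitives the toy grammar allows next, given a partial pipeline."""
        if not partial_pipeline:  # a pipeline has to start with preprocessing
            return ["imputer", "scaler", "pca"]
        if partial_pipeline[-1] in ("imputer", "scaler", "pca"):
            return ["pca", "random_forest", "logistic_regression", "svm"]
        return []  # an estimator ends the pipeline

    # Unconstrained MCTS expands len(ALL_PRIMITIVES) = 6 children at every node;
    # the grammar cuts that to 3-4 and forces rollouts to terminate quickly.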

Model performance didn’t suffer from using the grammar, and for some tasks it was actually much better.

When comparing AutoML methods, the early part of the computation is what matters most. Given enough time, many methods will eventually reach the same performance level, because they keep searching the space until an optimal solution is found. What’s important is how quickly you get to that solution.

Pre-trained AlphaD3M trains twice as fast as AlphaD3M learning from scratch, and AutoSklearn is slower still, by about another factor of two.

Code is available for this variant of AlphaD3M, which is cool. It comes as a Dropbox folder.

Some other stuff

I think the architecture of the model is the same as the original AlphaD3M paper. The paper doesn’t emphasise any changes, at least.

The pre-trained model is pre-trained on “other datasets”, which I guess means datasets not among the 74 OpenML datasets used here. You could say the context-free grammar is kind of pre-trained as well, since it’s based on machine learning pipelines that already exist.

Let’s talk a bit about primitives: I appreciated the authors giving more detail about them in this paper. It helped me understand the system a bit better.

Conclusion

AlphaD3M got faster and a bit better. It did this by (a) using a context-free grammar to restrict the space of ML pipelines, and (b) using a pre-trained model.

