Choosing the best algorithm for your Machine Learning problem can be challenging when you first apply Machine Learning techniques. However, you do not need hundreds of algorithms to get the best results; what matters is choosing the right one at each stage. We recommend that you split your Machine Learning project into three phases and apply the following models:
- Early stage - Rapid prototyping: when you first start applying Machine Learning techniques, decision trees and logistic regressions are the most appropriate options because they are very fast to train and easy to interpret. This is your opportunity to quickly see whether the Machine Learning task is likely to work with the available data and, if so, to identify the features where the biggest performance gains are hiding. At this stage you are mainly trying to minimize the risk that a more complex algorithm hides problems in your data that resurface later when you try to generalize your model.
- Middle stage - Proven application: once you have learned that your data contains a good amount of signal (and not just noise), and is therefore suitable for Machine Learning, the next step is to get better performance with BigML ensembles. With BigML ensembles you can create high-performing models that can then be used to build a smart application supported by BigML's workflow automation capabilities.
- Late stage - Critical performance: finally, when the Machine Learning problem is well understood and it is worth spending the additional compute time to squeeze out the last few percentage points in model accuracy and evaluation performance, we advise you to use deepnets or boosted trees (with a small learning rate). This won't make sense in every case due to the tradeoff between performance and complexity, but you can make this judgment for yourself based on the costs associated with incorrect predictions. For example, showing the wrong ad to a user is not nearly as costly as diagnosing a patient incorrectly.
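
The late-stage judgment about the cost of incorrect predictions can be made concrete with a little arithmetic. The following is a minimal sketch, not part of BigML: the function names, error rates, per-error costs, and compute cost are all hypothetical values chosen only to illustrate the ad versus diagnosis contrast. It assumes every wrong prediction costs the same amount, which is a simplification.

```python
def expected_error_cost(error_rate, n_predictions, cost_per_error):
    """Expected total cost of wrong predictions (hypothetical helper)."""
    return error_rate * n_predictions * cost_per_error


def worth_upgrading(baseline_error, candidate_error, n_predictions,
                    cost_per_error, extra_compute_cost):
    """Return True if the saving from fewer errors exceeds the extra compute cost."""
    saving = (
        expected_error_cost(baseline_error, n_predictions, cost_per_error)
        - expected_error_cost(candidate_error, n_predictions, cost_per_error)
    )
    return saving > extra_compute_cost


# All figures below are made-up assumptions for illustration.
# Ad placement: many predictions, but each mistake is cheap.
ad_case = worth_upgrading(
    baseline_error=0.10, candidate_error=0.08,
    n_predictions=100_000, cost_per_error=0.01,
    extra_compute_cost=500.0,
)

# Medical diagnosis: fewer predictions, but each mistake is very costly.
diagnosis_case = worth_upgrading(
    baseline_error=0.10, candidate_error=0.08,
    n_predictions=1_000, cost_per_error=10_000.0,
    extra_compute_cost=500.0,
)

print("Upgrade the ad model?", ad_case)
print("Upgrade the diagnosis model?", diagnosis_case)
```

With these assumed numbers, the same two-point drop in error rate saves far less than the extra compute cost in the ad scenario and far more in the diagnosis scenario, which is why the more complex model only pays off in the second case.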