Structuring Machine Learning Projects: Study Notes (Part 1)

Author: Tyan
Blog: noahsnail.com  |  CSDN  |  Jianshu

1. Introduction to ML Strategy

1.1 Why ML Strategy?

ML strategy teaches you ways of analyzing a machine learning problem that point you toward the most promising things to try.

1.2 Orthogonalization

Chain of assumptions in ML

  • Fit training set well on cost function
  • Fit dev set well on cost function
  • Fit test set well on cost function
  • Performs well in real world

Orthogonalization, or orthogonality, is a system design property that ensures that modifying an instruction or a component of an algorithm does not create or propagate side effects to other components of the system. This makes it easier to verify the algorithms independently of one another, which reduces testing and development time.

When a supervised learning system is designed, these are the 4 assumptions that need to be true and orthogonal.

  1. Fit training set well on cost function
    - If it doesn’t fit well, using a bigger neural network or switching to a better optimization algorithm might help.
  2. Fit development set well on cost function
    - If it doesn’t fit well, regularization or using a bigger training set might help.
  3. Fit test set well on cost function
    - If it doesn’t fit well, using a bigger development set might help.
  4. Performs well in real world
    - If it doesn’t perform well, either the development/test set is not set up correctly or the cost function is not evaluating the right thing.
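The chain above can be read as a lookup from "first stage that fails" to "knobs to try". The sketch below is only an illustration of that idea: the names `KNOBS`, `first_failing_stage`, and `suggest` are hypothetical, and comparing each error against a fixed target is a simplifying assumption, not something the notes prescribe.

```python
# Hypothetical lookup table: each stage of the chain maps to the knobs
# the notes suggest for that stage only (one knob set per problem).
KNOBS = {
    "train": ["bigger neural network", "better optimization algorithm"],
    "dev": ["regularization", "bigger training set"],
    "test": ["bigger dev set"],
    "real_world": ["change the dev/test set", "change the cost function"],
}

# Order of the chain of assumptions.
STAGES = ("train", "dev", "test", "real_world")


def first_failing_stage(errors, targets):
    """Return the first stage whose error exceeds its target, or None.

    Assumption: 'fits well' is modeled as error <= target for that stage.
    """
    for stage in STAGES:
        if errors[stage] > targets[stage]:
            return stage
    return None


def suggest(errors, targets):
    """Return the knobs for the first failing stage, or [] if all pass."""
    stage = first_failing_stage(errors, targets)
    return [] if stage is None else KNOBS[stage]


# Illustrative numbers only: training is fine, the dev set is not.
errors = {"train": 0.02, "dev": 0.10, "test": 0.11, "real_world": 0.12}
targets = {"train": 0.05, "dev": 0.05, "test": 0.05, "real_world": 0.05}
print(first_failing_stage(errors, targets), suggest(errors, targets))
```

Checking the stages in order reflects the orthogonal design: each problem is addressed with its own knobs before moving to the next stage.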

2. Setting up your goal

2.1 Single number evaluation metric

Set up a single real number evaluation metric for your problem.

A dev set combined with a single number evaluation metric makes it quick to tell which of two models is better.
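As one possible illustration of a single number metric, precision and recall can be folded into the F1 score (their harmonic mean). The notes do not name a specific metric, so F1 is an assumption here, and the classifier names and numbers below are made up for the example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up (precision, recall) pairs measured on a dev set.
classifiers = {"A": (0.95, 0.90), "B": (0.98, 0.85)}

# With one number per classifier, choosing the best is a single max().
best = max(classifiers, key=lambda name: f1_score(*classifiers[name]))
print(best)
```

With two separate numbers, neither classifier clearly wins (B has higher precision, A has higher recall); one combined number removes that ambiguity.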

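2.2 Satisficing and Optimizing metric

When there are multiple metrics, pick one to optimize and treat the others as satisficing metrics that only need to meet a threshold.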
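A minimal sketch of combining multiple metrics in the way this section's title suggests: treat one metric as the optimizing metric and another as a satisficing constraint. Here accuracy is maximized subject to a runtime limit; the function `pick_model`, the field names, the threshold, and all numbers are hypothetical.

```python
def pick_model(models, max_runtime_ms):
    """Return the most accurate model whose runtime meets the threshold.

    accuracy   -> optimizing metric (higher is better)
    runtime_ms -> satisficing metric (only needs to be <= max_runtime_ms)
    Returns None if no model satisfies the constraint.
    """
    feasible = [m for m in models if m["runtime_ms"] <= max_runtime_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda m: m["accuracy"])


# Made-up candidates for illustration.
models = [
    {"name": "A", "accuracy": 0.90, "runtime_ms": 80},
    {"name": "B", "accuracy": 0.92, "runtime_ms": 95},
    {"name": "C", "accuracy": 0.95, "runtime_ms": 1500},
]

chosen = pick_model(models, max_runtime_ms=100)
print(chosen["name"])
```

Model C is the most accurate but fails the runtime constraint, so it is excluded before accuracy is compared.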

2.3 Train/dev/test distribution

2.4 Size of the dev and test sets

2.5 When to change dev/test sets and metrics
