Why cross entropy is used as the loss function for classification problems in machine learning

by Play_With 2023. 2. 19.

Cross entropy is a commonly used loss function for classification problems in machine learning, especially problems with multiple classes, because it offers several advantages over the mean squared error (MSE) loss function. Here are some reasons why cross entropy is preferred over MSE:

 

1. Cross entropy is a better fit for classification problems: In classification problems, we are interested in predicting the probability of a sample belonging to each class. Cross entropy is a natural fit for this problem because it measures the dissimilarity between the predicted class probabilities and the true class probabilities. MSE, on the other hand, measures the squared error between the predicted values and the true values, which is not well-suited for the classification task. 
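To make the comparison in point 1 concrete, here is a minimal sketch that evaluates both losses on one sample. The three-class target and the predicted probabilities are made-up illustration values, and the helper names `cross_entropy` and `mse` are hypothetical, not taken from any library.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross entropy between a one-hot target and predicted class probabilities."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def mse(y_true, y_pred):
    """Mean squared error between the same two vectors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [0.0, 1.0, 0.0]   # one-hot target: the true class is index 1
y_pred = [0.2, 0.7, 0.1]   # hypothetical model output (sums to 1)

ce_value = cross_entropy(y_true, y_pred)   # reduces to -log(0.7) for a one-hot target
mse_value = mse(y_true, y_pred)
print(ce_value, mse_value)
```

With a one-hot target, cross entropy depends only on the probability assigned to the true class, while MSE averages the squared gap over every class entry.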

2. Cross entropy encourages better class separation: Cross entropy loss is more sensitive to the differences between the predicted probabilities of the correct and incorrect classes. This means that cross entropy encourages the model to assign high probabilities to the correct class and low probabilities to the incorrect classes, which improves the separation between the classes.
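The sensitivity described in point 2 can be seen by tracking how each loss reacts as the probability assigned to the correct class drops. The sketch below is a simplified per-sample view: it compares -log(p) with the squared error (1 - p)^2 on the correct-class probability alone, and the probability values are arbitrary illustration points.

```python
import math

def ce_loss(p_correct):
    """Cross entropy for one sample, given the probability of the correct class."""
    return -math.log(p_correct)

def squared_error(p_correct):
    """Squared error between the correct-class probability and its target of 1."""
    return (1.0 - p_correct) ** 2

probs = [0.9, 0.5, 0.1, 0.01]   # progressively worse predictions
rows = [(p, ce_loss(p), squared_error(p)) for p in probs]
for p, ce, se in rows:
    print(f"p={p:<5} cross entropy={ce:.3f} squared error={se:.3f}")
```

As p falls toward 0, the cross entropy term keeps growing (it exceeds 4.6 at p = 0.01), whereas the squared error can never exceed 1, so a confidently wrong prediction is penalized far more strongly under cross entropy.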

 

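3. Cross entropy is more robust to outliers: MSE is sensitive to outliers because the squared error grows quadratically for samples whose targets lie far from the predicted values. Cross entropy works on probabilities between 0 and 1 and depends only on the logarithm of the probability assigned to the true class. That logarithm is still unbounded as the probability approaches 0, so confidently wrong predictions, such as those on mislabeled samples, remain heavily penalized.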

 

 

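4. Cross entropy is mathematically more tractable: When the output layer is a softmax or sigmoid, the gradient of cross entropy with respect to the pre-activation outputs has a simple closed form, and it does not shrink when the output saturates the way the MSE gradient does. In many cases this makes cross entropy easier to optimize than MSE.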
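As an illustration of point 4, assume the class probabilities come from a softmax over raw scores (logits); the text does not name the output layer, so this is an assumption. In that setting the gradient of cross entropy with respect to the logits is the predicted probabilities minus the one-hot target. The sketch below checks that identity against central finite differences; all function names and input values are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ce_from_logits(logits, y_true):
    """Cross entropy of softmax(logits) against a one-hot target."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(y_true, probs))

def analytic_grad(logits, y_true):
    """Closed-form gradient with respect to the logits: probabilities minus target."""
    return [p - t for p, t in zip(softmax(logits), y_true)]

def numeric_grad(logits, y_true, h=1e-6):
    """Central finite-difference approximation of the same gradient."""
    grads = []
    for i in range(len(logits)):
        up, down = logits[:], logits[:]
        up[i] += h
        down[i] -= h
        grads.append((ce_from_logits(up, y_true) - ce_from_logits(down, y_true)) / (2 * h))
    return grads

logits = [2.0, -1.0, 0.5]   # made-up raw scores
y_true = [0.0, 1.0, 0.0]    # true class is index 1

g_analytic = analytic_grad(logits, y_true)
g_numeric = numeric_grad(logits, y_true)
print(g_analytic)
print(g_numeric)
```

The two printed gradients should agree to several decimal places, which is why no separate derivative of the softmax has to be evaluated during backpropagation.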

 

In summary, cross entropy is a better choice for classification problems because it is a better fit for the problem, encourages better class separation, is more robust to outliers, and is computationally efficient.

 

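Mean squared error (MSE) computes the squared difference between the predicted values and the true values, whereas cross entropy measures the dissimilarity between the predicted class probabilities and the true class, which makes it better suited to classification problems in machine learning.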
