Cross entropy is a commonly used loss function in machine learning classification problems, especially those with multiple classes, because it offers several advantages over the mean squared error (MSE) loss. Here are some reasons why cross entropy is preferred over MSE:
1. Cross entropy is a better fit for classification problems: In classification problems, we are interested in predicting the probability of a sample belonging to each class. Cross entropy is a natural fit for this problem because it measures the dissimilarity between the predicted class probabilities and the true class probabilities. MSE, on the other hand, measures the squared error between the predicted values and the true values, which is not well-suited for the classification task.
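The difference can be seen with a small numeric sketch (the probabilities below are illustrative, not from any particular model): cross entropy compares the predicted distribution against the one-hot true distribution, while MSE just measures squared differences between the same numbers.

```python
import math

# One-hot true label for a 3-class problem and a model's predicted probabilities.
y_true = [0.0, 1.0, 0.0]
y_pred = [0.2, 0.7, 0.1]

# Cross entropy compares the two probability distributions directly;
# only the probability assigned to the true class contributes.
cross_entropy = -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

# MSE treats the probabilities as plain regression targets.
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"cross entropy = {cross_entropy:.4f}")  # -log(0.7)
print(f"mse           = {mse:.4f}")
```

Note that the cross entropy term reduces to `-log(0.7)`: only the probability given to the correct class matters, which is exactly the quantity a classifier should maximize.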
2. Cross entropy encourages better class separation: Cross entropy loss is more sensitive to the differences between the predicted probabilities of the correct and incorrect classes. This means that cross entropy encourages the model to assign high probabilities to the correct class and low probabilities to the incorrect classes, which improves the separation between the classes.
3. Cross entropy is more robust to outliers: MSE is sensitive to outliers, as the squared error term can become very large for samples that are far from the predicted values. Cross entropy, on the other hand, only considers the logarithm of the predicted probabilities, which is less sensitive to outliers.
4. Cross entropy is mathematically more tractable: when paired with a softmax output layer, the gradient of cross entropy with respect to the logits reduces to the simple difference between the predicted and true probabilities, making it computationally cheap and easy to optimize.
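A minimal sketch of that gradient simplification (the logits below are arbitrary example values): for softmax followed by cross entropy, the gradient with respect to logit `i` collapses to `p_i - y_i`.

```python
import math

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

logits = [2.0, 1.0, 0.1]      # raw model outputs for 3 classes
y_true = [1.0, 0.0, 0.0]      # one-hot true label

probs = softmax(logits)

# With softmax + cross entropy, the gradient w.r.t. each logit
# is simply (predicted probability - true probability).
grad = [p - t for p, t in zip(probs, y_true)]
print(grad)
```

The gradient is negative for the true class (its logit should increase) and positive for the others, with no derivative of the log or the softmax Jacobian ever appearing explicitly.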
In summary, cross entropy is a better choice for classification problems because it is a better fit for the problem, encourages better class separation, is more robust to outliers, and is computationally efficient.
Whereas the mean squared error (MSE) function computes the error between predicted and actual values, cross entropy measures the dissimilarity between the predicted class distribution and the true class, which makes it better suited to classification problems in machine learning.