1st Problem: Data Table and Processing
Pre-processing a data table includes handling:
- Missing data
- Categorical data
- Numerical data
=> How can we handle the table when data is missing?
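One common way to deal with missing entries is simple imputation. The sketch below is only an illustration under stated assumptions: missing values are represented as `None`, numerical columns are filled with the median, and categorical columns with the most frequent value. The helper name `impute` and the sample columns are hypothetical and not part of the original material.

```python
from statistics import median, mode

def impute(column, kind):
    """Fill None entries: median for numerical data, most frequent value otherwise."""
    present = [v for v in column if v is not None]
    fill = median(present) if kind == "numerical" else mode(present)
    return [fill if v is None else v for v in column]

# Made-up example columns with missing entries
ages = [22, None, 35, 41, None]
colors = ["red", "blue", None, "red"]

ages_filled = impute(ages, "numerical")
colors_filled = impute(colors, "categorical")
```

Whether imputation, dropping rows, or a dedicated "missing" category is appropriate depends on the dataset and the model.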
Solution for 1st Problem:
There are a large number of encoding methods for categorical data. Which one is the most relevant to your model?
- Ordinal
- One-Hot
- Binary
- Frequency
- Hashing
- Helmert
- Backward Difference
- Target
- Leave One Out
- Weight Of Evidence
- James-Stein
- M-estimator
In practice, it is always worth trying every technique that applies to the feature and deciding which one works best for your model.
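To make a few of the encoders listed above concrete, here is a minimal pure-Python sketch of ordinal, one-hot, frequency, and (unsmoothed) target encoding. The function names and the sample data are assumptions made for illustration. In a real project a dedicated library would normally be used instead of hand-written helpers.

```python
from collections import Counter, defaultdict

def ordinal_encode(values):
    """Map each category to an integer in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values]

def one_hot_encode(values):
    """One 0/1 column per category; columns follow sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

def frequency_encode(values):
    """Replace each category by its relative frequency in the column."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]

def target_encode(values, targets):
    """Replace each category by the mean target of that category (no smoothing)."""
    sums, counts = defaultdict(float), Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    return [sums[v] / counts[v] for v in values]

# Made-up feature and binary target
colors = ["red", "green", "red", "blue", "red", "green"]
bought = [1, 0, 1, 0, 0, 1]

ordinal = ordinal_encode(colors)
one_hot = one_hot_encode(colors)
frequency = frequency_encode(colors)
target = target_encode(colors, bought)
```

Note that plain target encoding uses each row's own label, which can leak the target into the feature. Variants from the list such as Leave One Out or M-estimator are designed to reduce that effect.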
2nd Problem: Ranking in Competition
Ranking is a central part of many information retrieval problems, such as document retrieval, collaborative filtering, online advertising, and racing competitions. The experiment in the figure, for example, can be treated as a ranking problem for horse racing.
Solution for 2nd Problem:
To solve this problem, we use the following ranking algorithms:
- XGBoost Ranking
- LightGBM Ranking
- CatBoost Ranking
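Learning-to-rank libraries generally need to know which rows belong to the same query (here, the same race), either as a per-row group id or as a list of group sizes, and their output is often evaluated with a metric such as NDCG. The sketch below does not call any of these libraries. It only illustrates, on made-up data, how group sizes can be derived and how NDCG is computed for one race. The row layout, the scores, and the helper names are assumptions.

```python
import math
from itertools import groupby

# Hypothetical rows: (race_id, horse, relevance label, model score).
# Rows of the same race are contiguous; a higher relevance means a better finish.
rows = [
    ("race1", "A", 3, 0.9),
    ("race1", "B", 1, 0.2),
    ("race1", "C", 2, 0.5),
    ("race2", "D", 1, 0.7),
    ("race2", "E", 2, 0.4),
]

def group_sizes(rows):
    """Number of rows per race, in order of appearance."""
    return [len(list(g)) for _, g in groupby(rows, key=lambda r: r[0])]

def dcg(relevances):
    """Discounted cumulative gain of a list already in ranked order."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances, scores):
    """DCG of the score-induced order divided by the DCG of the ideal order."""
    ranked = [rel for rel, _ in sorted(zip(relevances, scores), key=lambda p: -p[1])]
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(ranked) / ideal if ideal > 0 else 0.0

sizes = group_sizes(rows)
ndcg_per_race = {
    race: ndcg([r[2] for r in grp], [r[3] for r in grp])
    for race, grp in ((k, list(g)) for k, g in groupby(rows, key=lambda r: r[0]))
}
```

In this toy data the scores order the first race perfectly, while in the second race the two horses are swapped, so its NDCG is below 1.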