Odds Ratio Preference Optimization Trainer