News

Amid a new round of competition in applied AI technology, Unicom (Sichuan) Industrial Internet Co., Ltd. recently obtained a patent titled "A Weak Target Detection Method Based on Decomposition Model ...
Currently, mainstream AI alignment methods such as Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) rely on high-quality human preference feedback data.