According to China Mobile, during the 2024 China Mobile Global Partner Conference the company joined forces with the Electronic Standards Institute and 16 key central enterprises to jointly build a large model evaluation system, and released the "Universal Large Model Evaluation Criteria".
According to reports, the standard is an important outcome of building the large model evaluation system and gives the industry an important reference for selecting high-quality large models. The first phase will focus on the general domain and four key industry sectors, with work proceeding along three lines: evaluation standardization, evaluation base construction, and pilot evaluation applications.
The Universal Large Model Evaluation Criteria are built on a "2-4-6" framework, as follows:
- "2": two evaluation perspectives. Oriented to the practical needs of key industries and aligned with the national standard's requirements for model capabilities, evaluation tasks are divided into two perspectives: comprehension and generation.
- "4": four categories of evaluation elements. Evaluation tools, evaluation data, evaluation methods, and evaluation metrics are extracted from the full evaluation lifecycle as the four key elements that ensure the evaluation can actually be carried out.
- "6": six evaluation dimensions. Drawing on the core capabilities needed when large models are put into application, six dimensions are defined: functionality, accuracy, reliability, security, interactivity, and applicability.
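
The "2-4-6" structure can be pictured as a small taxonomy. The Python sketch below is purely illustrative and is not taken from the standard: the names `PERSPECTIVES`, `ELEMENTS`, `DIMENSIONS`, `EvaluationTask`, and `framework_shape` are hypothetical, and the last dimension is rendered here as "applicability". It shows only how an evaluation task might be tagged with one perspective and one dimension, assuming the categories listed above.

```python
from dataclasses import dataclass

# Hypothetical labels derived from the "2-4-6" description above;
# the standard itself may name or define these differently.
PERSPECTIVES = ("comprehension", "generation")
ELEMENTS = ("tools", "data", "methods", "metrics")
DIMENSIONS = (
    "functionality",
    "accuracy",
    "reliability",
    "security",
    "interactivity",
    "applicability",
)


@dataclass(frozen=True)
class EvaluationTask:
    """One evaluation task, tagged with a perspective and a dimension."""

    name: str
    perspective: str
    dimension: str

    def __post_init__(self) -> None:
        # Reject tags that fall outside the framework's categories.
        if self.perspective not in PERSPECTIVES:
            raise ValueError(f"unknown perspective: {self.perspective}")
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension}")


def framework_shape() -> tuple[int, int, int]:
    """Return the category counts that give the framework its name."""
    return (len(PERSPECTIVES), len(ELEMENTS), len(DIMENSIONS))
```

For example, `EvaluationTask("summarization", "generation", "accuracy")` would be accepted, while a task tagged with a perspective outside the two listed would raise `ValueError`.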