According to Microsoft's latest research paper, the company plans to develop a new AI large language model, SpreadsheetLLM, for Excel, Google Sheets, and other spreadsheet applications.
The researchers say that existing spreadsheet applications are feature-rich and give users a large number of layout and formatting options, so traditional AI large language models struggle to cope with spreadsheet processing scenarios.
SpreadsheetLLM is an AI model designed specifically for spreadsheet applications. Microsoft has also developed SheetCompressor, a spreadsheet compression framework, to help SpreadsheetLLM better understand and process spreadsheet data.
According to the paper's abstract, SheetCompressor consists of three modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation.
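To make two of these ideas more concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes a sheet is represented as a plain dictionary from cell address to value. The function names `inverse_index`, `format_label`, and `aggregate_by_format`, as well as the coarse type labels, are hypothetical and chosen only for illustration. The sketch does not cover the structural-anchor step and does not merge neighboring addresses into ranges.

```python
from collections import defaultdict


def inverse_index(cells):
    """Map each distinct non-empty value to the list of addresses holding it.

    Empty cells are skipped, so they cost nothing in the encoded output,
    and a value repeated in many cells is written only once.
    """
    index = defaultdict(list)
    for address, value in cells.items():
        if value is None or value == "":
            continue
        index[value].append(address)
    return dict(index)


def format_label(value):
    """Return a coarse type label for a cell value (hypothetical labels)."""
    if isinstance(value, bool):
        return "Bool"
    if isinstance(value, int):
        return "Int"
    if isinstance(value, float):
        return "Float"
    return "Text"


def aggregate_by_format(cells):
    """Replace concrete values with type labels, then group addresses by label."""
    labeled = {
        address: format_label(value)
        for address, value in cells.items()
        if value is not None and value != ""
    }
    return inverse_index(labeled)


# A tiny made-up sheet: a header row, two data rows, and one empty row.
sheet = {
    "A1": "Region", "B1": "Sales",
    "A2": "North", "B2": 1200,
    "A3": "South", "B3": 1200,
    "A4": "", "B4": None,
}

by_value = inverse_index(sheet)
by_format = aggregate_by_format(sheet)
print(by_value)
print(by_format)
```

In this toy sheet, the repeated value 1200 is stored once with both of its addresses, and the empty row disappears entirely, which is the intuition behind encoding a sheet by value or by format rather than cell by cell.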
SpreadsheetLLM significantly improves performance on the spreadsheet table detection task, outperforming the vanilla approach by 25.6% in GPT-4's in-context learning setting; it also reduces token usage costs by 96% while delivering better results.
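As a rough illustration of what a 96% token reduction means, the short Python sketch below relates the reduction to a compression ratio. The token counts are made-up numbers chosen only to match the 96% figure, and the helper name `token_savings` is hypothetical.

```python
def token_savings(original_tokens, compressed_tokens):
    """Return (fractional reduction, compression ratio) for two token counts."""
    reduction = (original_tokens - compressed_tokens) / original_tokens
    ratio = original_tokens / compressed_tokens
    return reduction, ratio


# Illustrative numbers only: a sheet that needs 10,000 tokens when encoded
# cell by cell and 400 tokens after compression.
reduction, ratio = token_savings(10_000, 400)
print(f"reduction: {reduction:.0%}, compression ratio: {ratio:.0f}x")
```

Under these assumed counts, a 96% reduction means the compressed encoding uses one twenty-fifth of the original tokens.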
There is no word yet on when, or whether, Microsoft plans to make SpreadsheetLLM publicly available. The paper notes that the model still has some limitations: it cannot efficiently handle spreadsheets that use formatting such as background colors and borders, and SheetCompressor cannot currently compress cells containing natural language.