AI: Striving to Become a Trusted 'Future Advisor'

Can you imagine what predictive technology looks like? When the foundational capabilities of general-purpose large models, the precision of specialized predictive models, the practical value of external tools, and the assurance of trustworthy mechanisms are organically integrated, AI will gain a new kind of insight into the future. It can then become a trusted ‘future advisor’ in critical areas such as financial risk control, weather forecasting, public governance, and industrial production, giving people intelligent support for grasping future trends and becoming a significant force in empowering social development and the modernization of national governance.

Four Technical Paths for ‘Predicting the Future’

Faced with the increasingly complex prediction demands of the real world, researchers have developed two main lines of work and four specific technical paths for large-model prediction. These paths do not compete with one another; they complement each other in different scenarios and together form a complete research framework for large-model prediction.

The essential difference between the two lines is whether a dedicated model is built for the prediction task. One line is to “borrow a boat to go to sea,” making skillful use of existing, mature large language models; the other is to “build a ship for long voyages,” constructing specialized foundation models for prediction from the ground up. The two lines advance together and adapt to different task requirements.

Calling a large language model directly is the easiest way to put large-model prediction into practice. Researchers recast a prediction task as an ordinary natural-language question and supply historical information, event background, and constraints; the model then assesses future trends and outputs a prediction. The barrier to entry is low: the model needs no significant modification, only a change in how an existing tool is used. The method performs well on open-world problems such as news event analysis and business trend judgment. However, because large language models have inherent limitations in numerical computation, it falls short of the stringent requirements for high-precision numerical prediction in fields such as meteorology and finance.
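As a rough illustration of recasting a prediction task as a natural-language question, the sketch below assembles a prompt from the three ingredients named above: historical information, event background, and constraints. The `ForecastTask` container, the `build_forecast_prompt` helper, and the retail example are all hypothetical; the resulting string could be sent to any chat model, and no particular model API is assumed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ForecastTask:
    """Hypothetical container for one open-world prediction question."""
    question: str
    history: List[str]
    background: str
    constraints: List[str] = field(default_factory=list)


def build_forecast_prompt(task: ForecastTask) -> str:
    """Render the task as a plain natural-language prompt."""
    lines = ["Background:", task.background, "", "Historical information:"]
    lines += [f"- {item}" for item in task.history]
    if task.constraints:
        lines += ["", "Constraints:"]
        lines += [f"- {item}" for item in task.constraints]
    lines += [
        "",
        f"Question: {task.question}",
        "Answer with a probability between 0 and 1 and a one-sentence rationale.",
    ]
    return "\n".join(lines)


# Illustrative, made-up task
task = ForecastTask(
    question="Will the product's monthly sales exceed 10,000 units next quarter?",
    history=["Q1 monthly average: 7,200 units", "Q2 monthly average: 8,900 units"],
    background="A mid-sized retailer launched the product one year ago.",
    constraints=["Use only the information given above."],
)
prompt = build_forecast_prompt(task)
print(prompt)
```

Asking for a probability rather than a yes/no answer is a design choice of this sketch; it makes the output easier to score later against what actually happened.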

Time series tokenization is a piece of cross-domain “intelligent borrowing.” It brings classic natural language processing ideas into time series analysis: through scaling and quantization, continuous time series values are discretized into tokens that play the same role as words in a language. The representative model, Chronos, maps time series onto a fixed vocabulary to achieve probabilistic prediction and cross-dataset generalization, and it significantly reduces R&D costs by reusing mature language model architectures. This convenience comes at a price: the transformation inevitably loses numerical detail and introduces quantization error, like rough-polishing a precision part, and this can affect prediction accuracy.
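The scale-then-quantize idea, and the quantization error it introduces, can be sketched in a few lines. This is a simplified illustration and not Chronos's actual implementation: the `tokenize` and `detokenize` helpers, the 16-bin vocabulary, the clipping limit, and the sample series are all assumptions chosen for readability.

```python
import numpy as np


def tokenize(series, num_bins=16, limit=3.0):
    """Mean-scale a series, then map each value to an integer token in [0, num_bins)."""
    series = np.asarray(series, dtype=float)
    scale = np.mean(np.abs(series))
    if scale == 0:
        scale = 1.0  # avoid dividing by zero on an all-zero series
    scaled = series / scale
    # Evenly spaced bin edges over [-limit, limit]; using only the interior
    # edges means values outside the range fall into the first or last bin.
    edges = np.linspace(-limit, limit, num_bins + 1)
    tokens = np.digitize(scaled, edges[1:-1])
    return tokens, scale, edges


def detokenize(tokens, scale, edges):
    """Map tokens back to numbers using bin centers, then undo the scaling."""
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[tokens] * scale


series = np.array([10.0, 12.0, 11.5, 13.0, 15.0, 14.2])  # made-up values
tokens, scale, edges = tokenize(series)
restored = detokenize(tokens, scale, edges)
max_error = np.max(np.abs(series - restored))
print(tokens, restored, max_error)
```

Several distinct input values end up sharing a token, so the round trip cannot recover them exactly. For values inside the clipping range the error is bounded by half a bin width times the scale, which is the loss of numerical detail described above; a larger vocabulary shrinks the error at the cost of a bigger output space.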

Building dedicated foundation models for time series marks a shift in large-model prediction research from “borrowing strength” to independent innovation. Researchers no longer treat time series as pseudo-text; they design pre-training schemes and model architectures around the intrinsic regularities of time series data and the core requirements of prediction tasks. Google’s TimesFM uses a decoder-only architecture and shows strong zero-shot prediction capability; Lag-Llama, developed by a multi-institution team of university and industry researchers, focuses on probabilistic prediction and cross-domain generalization; and Moirai, developed by Salesforce’s AI research team, attempts to cover more scenarios with a unified training approach. These models are like custom-made “armor” for prediction tasks: because they fit the characteristics of those tasks closely, they tend to achieve higher precision in numerical prediction and are often the preferred choice for high-precision scenarios.

Reprogramming large language models and multimodal integration offer a low-cost route to large-model prediction. Research on Time-LLM shows that there is no need to retrain a massive model with hundreds of billions of parameters on time series data: reprogramming the input so that time series segments are aligned with text prototypes lets a “frozen” large language model take part in prediction tasks. This makes the “general large model plus specialized adaptation” route feasible and further promotes the joint modeling of text, numerical data, and contextual knowledge, so that predictions can integrate multi-source heterogeneous information and better match the complex and changeable needs of real-world prediction scenarios.
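The alignment step can be pictured as cross-attention: each segment (patch) of the series is rewritten as a weighted mixture of text prototype vectors, so that what reaches the frozen language model lives in its own embedding space. The sketch below is a heavily simplified illustration under stated assumptions, not the Time-LLM implementation: the prototypes and the query projection are random stand-ins, and the names `make_patches` and `reprogram` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_patches(series, patch_len, stride):
    """Cut a 1-D series into overlapping fixed-length patches."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - patch_len + 1, stride)
    return np.stack([series[s:s + patch_len] for s in starts])


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def reprogram(patches, prototypes, w_query):
    """Express each patch as an attention-weighted mix of text prototypes."""
    queries = patches @ w_query                                   # (num_patches, d_model)
    scores = queries @ prototypes.T / np.sqrt(prototypes.shape[1])
    weights = softmax(scores, axis=-1)                            # (num_patches, num_prototypes)
    return weights @ prototypes, weights


d_model, num_prototypes, patch_len = 8, 5, 4
# Stand-in for prototypes derived from a frozen model's word embeddings (assumption)
prototypes = rng.normal(size=(num_prototypes, d_model))
# In this sketch the query projection is the only part that would be trained
w_query = rng.normal(size=(patch_len, d_model)) * 0.1

series = np.sin(np.linspace(0, 6, 24))
patches = make_patches(series, patch_len, stride=2)
embedded, weights = reprogram(patches, prototypes, w_query)
print(embedded.shape, weights.sum(axis=1))
```

Because only the small projection is trained while the language model's weights stay fixed, the cost is a fraction of training a time series model from scratch, which is the low-cost point the paragraph makes.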

None of these four technical paths is absolutely better or worse; they are like different keys for different locks. When a prediction task requires open-ended trend judgment that draws on general knowledge and textual background, the routes built on large language models act as a master key and have a clear advantage; when a task demands high-precision numerical output and stable cross-domain generalization, dedicated time series foundation models are the custom-cut key. Depending on available R&D resources and actual task requirements, the paths support and reinforce one another, together moving large-model prediction technology steadily forward.

Moving Towards Real Application Scenarios

In the research race over large-model prediction technology, international research started earlier, has a more systematic technical framework, and has gone deeper into fundamental research and frontier exploration. Domestic research in China started slightly later but has caught up rapidly, forming distinctive advantages in scenario adaptation, the open-source ecosystem, and application deployment.

International academic research on large-model prediction has evolved from text reasoning to many kinds of prediction. Early studies mainly used large language models for text reasoning and for judging how events would develop, akin to cultivating a small plot of land; in recent years research has broken through these boundaries and expanded into time series, spatiotemporal data, and even scientific prediction, entering a new stage of “expanding territory.” In the more complex field of scientific prediction, Microsoft’s ClimaX pioneered a foundation model framework for weather and climate tasks, while Aurora, also developed by Microsoft, extends the foundation model idea to the Earth system and can handle prediction tasks such as weather, air quality, and ocean wave forecasting. It is as if the Earth had been fitted with an intelligent early warning system, and it shows the great potential of scientific foundation models for predicting complex systems.

Notably, the international academic community takes a rational and cautious view of the predictive capabilities of large models. Research has found that excellent performance on standardized tests does not equate to reliability in judging real-world future events: GPT-4, for instance, performed worse than the median of the human crowd in open-world forecasting tournaments. To address this core issue, international researchers have carried out tournament studies, retrieval augmentation studies, and uncertainty detection studies in succession. As a result, international research characteristically balances “model capability enhancement + prediction result validation + trustworthy mechanism construction,” laying a solid foundation for practical application.
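To make the comparison with a human median concrete, one common way to score probabilistic forecasts of yes/no events is the Brier score, the mean squared difference between the stated probability and the 0/1 outcome (lower is better). The article does not say which metric the tournaments used, so the metric here is an assumption, and every number below is invented purely for illustration.

```python
from statistics import median


def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


# Made-up example: four resolved yes/no questions
outcomes = [1, 0, 0, 1]
model_probs = [0.6, 0.5, 0.4, 0.5]   # hypothetical model forecasts
human_probs = [                      # hypothetical forecasts from three people per question
    [0.7, 0.8, 0.9],
    [0.2, 0.3, 0.1],
    [0.3, 0.2, 0.4],
    [0.6, 0.7, 0.8],
]

# The crowd forecast for each question is the median of the individual forecasts
crowd_probs = [median(per_question) for per_question in human_probs]

model_score = brier_score(model_probs, outcomes)
crowd_score = brier_score(crowd_probs, outcomes)
print(model_score, crowd_score)
```

A model that hedges near 0.5 on every question scores worse than a crowd that commits in the right direction, which is one way a strong benchmark performer can still lose to the human median on real future events.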

Domestic research has leveraged the rapid development of general-purpose large models to catch up impressively from a later start, gradually forming a healthy pattern of rapid iteration of general large models, systematic survey research, and steady progress in application deployment. In building the general model ecosystem, each player has its own strengths: Qwen3 (Qianwen 3) has developed a complete system for multilingual support and inference efficiency optimization, like building a multilingual intelligent bridge; DeepSeek-V3 has achieved breakthroughs in high-performance open-source models, making core technologies more accessible; and ERNIE 4.5 (Wenxin 4.5) continues to refine multimodal integration and engineering deployment, moving ever closer to practical application needs. Although these general large models are not focused solely on prediction, they give domestic large-model prediction research a solid capability foundation, allowing researchers to stand on the shoulders of “giants” and conduct more targeted studies.

In terms of application, domestic researchers are actively exploring how to bring large-model prediction technology out of the “ivory tower” and into real scenarios across industries. Some studies deeply integrate expert knowledge with large language models for strategic early warning, supporting trend judgment and risk identification in complex situations; others combine large models closely with meteorological monitoring data in an attempt to improve the accuracy and timeliness of short-term precipitation forecasts. These studies are not entirely equivalent to pure numerical time series prediction, but they signal that domestic large-model prediction technology is moving from theoretical discussion to practical application and beginning to explore technical paths suited to local needs and industry realities.

Overall, foreign research has gone deeper into specialized foundation models for prediction and into scientific prediction, forming a relatively complete technical system, while domestic research stands out in adaptation to Chinese scenarios, construction of a low-cost open-source ecosystem, and industry deployment, like constructing tall buildings suited to local ground conditions. As high-quality time series data and industry-specific data continue to accumulate domestically and specialized evaluation systems gradually improve, domestic foundation models aimed at prediction tasks still have significant room to grow, and they can be expected to contribute distinctive and valuable Chinese wisdom to the global development of large-model prediction technology.

Bridging the Gap from Powerful to Trustworthy

Compared with traditional prediction methods, large-model prediction technology has achieved a profound shift from “point calculation” to “comprehensive judgment,” evolving from a cold, mechanical calculation tool into an intelligent system that can understand context, weigh factors, and offer reasoned judgments. This ability stems from its core advantages. Like a rising star, the technology is now steadily moving “from powerful to trustworthy,” striving to become a reliable “future advisor” for humanity.

The core advantages of large-model prediction technology are especially evident in practical applications. First, it transfers well across tasks. A traditional agricultural yield prediction model cannot be applied directly to stock market trend analysis; switching fields means starting from scratch. Large models, by contrast, draw on general representations learned through extensive pre-training and can adapt quickly to fields such as agriculture, finance, and industry with only a few samples. Second, it has great potential for handling complex dependencies. River water levels in flood season, for instance, are shaped by rainfall, upstream discharge, terrain, and other factors whose interactions traditional models struggle to capture. Time series foundation models can learn patterns across their context window, as if they had the Monkey King’s “fiery golden eyes” for seeing the connections behind the data. Third, it excels at fusing multi-source information. Traditional meteorological prediction relies mainly on numerical monitoring data, while large models can integrate satellite cloud images, text weather bulletins, geographic information, and other sources, turning prediction from “viewing a leopard through a tube” into “panoramic observation.” Fourth, it offers strong prediction explanation and decision support. It can not only predict the trend of a particular stock but also explain the factors behind it, such as industry policy and market supply and demand, and even offer risk control suggestions, becoming a professional intelligent partner for decision-makers.

Despite these significant advantages, large-model prediction technology is not without flaws; a gap between the laboratory and real-world application still urgently needs to be bridged. First, a model’s generation and reasoning capabilities do not equate to actual predictive capability. Some models perform excellently in simulated meteorological prediction tests but repeatedly “fail” in real severe convective weather warnings, because the test answers were already hidden in the training data, whereas real prediction requires comprehensive judgment about events that have not yet occurred. Second, retrieval augmentation treats the symptoms rather than the root cause. Pairing a model with information retrieval improves prediction accuracy, but it also reveals that a model relying solely on its memorized knowledge is like someone guarding an old library, struggling to keep up with real-world change; real-time access to the latest knowledge is crucial. Third, hallucinations and unstable facts are core obstacles, like hidden time bombs. Finally, constraints of cost, data, and evaluation make large-scale application challenging. Training high-precision models requires massive computational resources, which drives up R&D costs; real-world time series data is fragmented and lacks uniform labeling, so it is hard to produce high-quality output from poor-quality material. Moreover, existing evaluation systems prioritize numerical error over factual stability, so many models look excellent on paper yet are hard to deploy.
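The first problem, test answers hidden in the training data, suggests a simple safeguard when building an evaluation set: score a model only on questions that were resolved after its training data ends. The sketch below shows one minimal way to do that split; the cutoff date, the question records, and the `split_by_cutoff` helper are all hypothetical.

```python
from datetime import date

TRAINING_CUTOFF = date(2024, 1, 1)  # hypothetical end of the model's training data

# Made-up evaluation questions with the date each outcome became known
questions = [
    {"id": "q1", "resolved_on": date(2023, 6, 30)},
    {"id": "q2", "resolved_on": date(2024, 3, 15)},
    {"id": "q3", "resolved_on": date(2024, 9, 1)},
]


def split_by_cutoff(questions, cutoff):
    """Separate questions the model could not have seen from those it might have memorized."""
    clean = [q for q in questions if q["resolved_on"] > cutoff]
    at_risk = [q for q in questions if q["resolved_on"] <= cutoff]
    return clean, at_risk


clean, at_risk = split_by_cutoff(questions, TRAINING_CUTOFF)
print([q["id"] for q in clean], [q["id"] for q in at_risk])
```

Only the questions in `clean` measure prediction in the sense used above; good results on the `at_risk` questions may reflect recall rather than foresight.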

Looking ahead, the direction of large-model prediction technology is clear and focused: moving “from powerful to trustworthy” to build a mature technical system that can stably serve real-world decision-making. First, general large models will evolve into specialized foundation models for prediction, which will be more competitive in high-precision scenarios such as meteorology and finance. Second, tool augmentation will become an important direction, allowing models to autonomously invoke external tools such as search and simulation, like equipping an intelligent agent with a toolbox for tackling complex scenarios. Third, trustworthiness, controllability, and interpretability will become research priorities; future prediction systems must not only be numerically accurate but also quantify risk and trace the basis of each judgment, which is key to deployment in high-risk scenarios. Fourth, low-cost deployment and industrialization will accelerate: as inference costs fall and open-source ecosystems mature, the technology will change from the exclusive asset of a few institutions into a universal tool across industries. Lastly, domestic research will focus on localized adaptation, creating specialized models that integrate the Chinese context and local data so that large models become more accurate, stable, and trustworthy in domestic scenarios such as financial risk control and government early warning.