July 9, 2025
NTT, Inc.
News Highlights:
TOKYO - July 9, 2025 - NTT has established a new learning framework called "Portable Tuning" that eliminates the need to retrain specialized AI models when the foundation model is updated or replaced. This technology enables the acquired specialized knowledge to be carried over to different foundation models by training and reusing an independent model that adjusts the outputs of the foundation model, achieving high specialization performance without additional retraining on new foundation models.
The results of this work will be presented at the prestigious International Conference on Machine Learning (ICML 2025) [1], held in Vancouver, Canada, from July 13 to July 19, 2025.
In recent years, high-performance and diverse AI foundation models (hereafter, foundation models) have become widely available, leading to increased adoption of generative AI across various companies and organizations. While foundation models can generally handle common tasks without additional training, it has become common practice to fine-tune them on organization-specific datasets to create specialized models that handle particular tasks or domains with higher accuracy.
As the use of generative AI continues over the long term, the cost of maintaining such specialized models has become a significant issue. In particular, foundation models are regularly updated to incorporate the latest knowledge or changes to their model architectures, and keeping specialized models aligned with these updates requires retraining each time the foundation model is updated. Training requires substantially more computational resources than inference, and hyperparameter tuning must also be redone for each foundation model, leading to challenges in both computational and human resource costs.
At NTT, to reduce these costs fundamentally, we previously proposed a new principle called "learning transfer", which transfers existing training results across various foundation models. As a proof of concept, we demonstrated that learning trajectories in the parameter space can be transferred at low cost between distinct foundation models [2]. However, two challenges remained for this learning-trajectory transfer:
Challenge 1: The transferred training results are only approximations, so the accuracy of the resulting specialized models can degrade compared with fine-tuning each foundation model directly.
Challenge 2: The transfer is applicable only between foundation models that share the same architecture.
In this research, aiming at a practical realization of the "learning transfer" principle described above, we reinterpreted the conventional fine-tuning method and derived a new learning framework called "portable tuning", which is suited to transferring fine-tuned results between foundation models (see Figure 1).
Conventional methods perform fine-tuning by directly optimizing the parameters of the foundation model for the given task and domain. To address Challenge 1, we introduced a "reward model" that corrects the outputs of the foundation model for each task or domain and is trained as an independent model, separate from the foundation model. Because the reward model is independent, it can be reused at inference time with new foundation models, achieving performance comparable to conventional fine-tuning without any retraining or additional learning.
Regarding Challenge 2, the reward model does not depend on the architecture of the foundation model but only on the output format (e.g., classification labels for image classification, token vocabulary for language generation), making it applicable even when the source and target foundation models have different architectures.
Although this technology adds overhead from the reward model during initial training and at inference time, it eliminates the need for retraining no matter how many times the foundation model is updated, so specialized models can be maintained at a constant training cost (see Figure 2).
Point 1: Reformulating fine-tuning as reward learning
This research first focused on the insight that conventional fine-tuning can be reinterpreted as reward learning: during training, the model implicitly learns a task-specific reward function (i.e., a value indicating the desirability of each output for a given task), and during inference, it implicitly aims to maximize this reward (see Figure 3).
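As one illustrative way to write this reinterpretation (our own sketch of the idea, not necessarily the paper's exact notation): if p0(y|x) denotes the output distribution of the foundation model and pft(y|x) that of the fine-tuned model, the implicitly learned reward can be read off as r(x, y) = log pft(y|x) - log p0(y|x); conversely, the fine-tuned model can be written as pft(y|x) ∝ p0(y|x) · exp(r(x, y)), i.e., the foundation model's outputs reweighted toward high-reward outputs.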
Based on this interpretation, we derived a more natural learning approach in which, instead of learning the reward implicitly, a reward model is explicitly trained during the learning phase. During inference, the output of the foundation model is then corrected in a direction that maximizes the values predicted by this reward model.
Furthermore, our analysis of the learning process revealed that the reward model learns in a desirable direction—increasing the reward for correct outputs, while suppressing the expected reward for outputs generated by the model itself—thus confirming that this approach functions as effective reward learning.
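To make the inference-time flow concrete, the following is a minimal, illustrative sketch of such reward-guided inference in a classification setting. It is hypothetical code for exposition, not NTT's released implementation; names such as RewardModel and specialized_log_probs are our own, and the random tensors stand in for a real foundation model's outputs and input features.

# Minimal sketch of reward-guided inference (hypothetical, for illustration only).
import torch
import torch.nn as nn

NUM_CLASSES = 10

class RewardModel(nn.Module):
    """Small independent model that scores each candidate label for an input."""
    def __init__(self, feature_dim: int, num_classes: int):
        super().__init__()
        self.head = nn.Linear(feature_dim, num_classes)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Returns one reward value per candidate output (here: per class label).
        return self.head(features)

def specialized_log_probs(base_logits: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    # Correct the foundation model's output distribution in the direction that
    # maximizes the predicted reward: p(y|x) ∝ p_base(y|x) * exp(r(x, y)).
    base_log_probs = torch.log_softmax(base_logits, dim=-1)
    return torch.log_softmax(base_log_probs + rewards, dim=-1)

# Usage with dummy tensors standing in for a real foundation model's outputs.
base_logits = torch.randn(4, NUM_CLASSES)   # foundation model outputs for a batch of 4 inputs
features = torch.randn(4, 512)              # stand-in input representation for the reward model
reward_model = RewardModel(feature_dim=512, num_classes=NUM_CLASSES)
log_probs = specialized_log_probs(base_logits, reward_model(features))
predictions = log_probs.argmax(dim=-1)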
Point 2: Learning transfer via reuse of the reward model
In this method, the trained reward model can be reused during inference with other foundation models, thereby transferring the effects of specialization even to models that were not involved in the original training.
Theoretically, it is also guaranteed that the closer the output probability distributions of the foundation models are, the more successful the transfer will be. While the reward model adds a slight cost during initial training and inference, this approach enables high-accuracy transfer without any additional training.
Moreover, since the transfer involves only correcting the output of the foundation model—and does not rely on its internal structure—the method is applicable across different model architectures.
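As an illustration of this architecture independence, the following hedged sketch (again hypothetical code, not NTT's implementation) reuses one reward model with two stand-in foundation models of different architectures that share the same label space; only the output format needs to match, so neither backbone is retrained.

# Minimal sketch of reusing one reward model across different architectures (hypothetical).
import torch
import torch.nn as nn

NUM_CLASSES = 10

# Two stand-in "foundation models" with different architectures but the same output format.
small_backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))
large_backbone = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, NUM_CLASSES))

# A single reward model, assumed here to score the same label set from the input.
reward_head = nn.Linear(32, NUM_CLASSES)

def apply_reward(base_logits: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    # Output correction only: p(y|x) ∝ p_base(y|x) * exp(r(x, y)), no access to backbone internals.
    return torch.log_softmax(torch.log_softmax(base_logits, dim=-1) + rewards, dim=-1)

x = torch.randn(4, 32)
rewards = reward_head(x)
# The identical reward model specializes both backbones, with no retraining of either.
out_small = apply_reward(small_backbone(x), rewards)
out_large = apply_reward(large_backbone(x), rewards)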
In our experiments, we successfully transferred the results of domain-specific fine-tuning (for vision foundation models) and instruction tuning (for language foundation models) to other foundation models with different sizes and pretraining datasets. These transfers achieved performance comparable to models that were fine-tuned individually for each foundation model.
(See Figure 4: In this experiment, reward learning was performed on ViT-B-16 [3], a vision foundation model pretrained on the LAION-400M [4] dataset (leftmost model), and the learned reward model was transferred to the other foundation models.)
This technology not only reduces the cost of retraining specialized models within companies and organizations, but also opens up new possibilities—such as simulating the expected effects of retraining in advance using this approach.
NTT will continue to contribute to research toward next-generation AI technologies, especially aimed at addressing the growing cost challenges of large-scale AI and realizing the concept of AI Constellations [5], where multiple AI systems operate in coordination.
This research will be presented at ICML 2025 (International Conference on Machine Learning), one of the most prestigious international conferences in the field of machine learning, to be held from July 13 to 19, 2025, under the following title and authorship.
Title: "Portable Reward Tuning: Towards Reusable Fine-Tuning across Different Pretrained Models"
Authors: Daiki Chijiwa*, Susumu Takeuchi (NTT Computer and Data Science Laboratories), Taku Hasegawa*, Kyosuke Nishida, Kuniko Saito (NTT Human Informatics Laboratories)
*: equal contribution
[1] ICML 2025
The International Conference on Machine Learning—one of the world's top-tier conferences in the field of machine learning.
https://icml.cc/Conferences/2025
[2] May 7, 2024 Press Release:
"Realize the World's First 'Learning Transfer' to Reuse Past Learning Trajectories, Significantly Reducing the Cost of Retraining AI Models"
https://group.ntt/en/newsrelease/2024/05/07/240507b.html
[3] Dosovitskiy et al., "An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale" (2020)
[4] Schuhmann et al., "LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs" (2021)
[5] AI Constellation
https://www.rd.ntt/e/cds/ai-constellation/
NTT contributes to a sustainable society through the power of innovation. We are a leading global technology company providing services to consumers and businesses as a mobile operator and as a provider of infrastructure, networks, applications, and consulting. Our offerings include digital business consulting, managed application services, workplace and cloud solutions, and data center and edge computing, all supported by our deep global industry expertise. We have over $90B in revenue, 340,000 employees, and $3B in annual R&D investments. Our operations span 80+ countries and regions, allowing us to serve clients in over 190 of them. We serve over 75% of Fortune Global 100 companies, thousands of other enterprise and government clients, and millions of consumers.
Media contact
NTT, Inc.
NTT Service Innovation Laboratory Group
Public Relations
Inquiry Form
Information is current as of the date of issue of the individual press release.
Please be advised that information may be outdated after that point.