August 27, 2024

Upgrade 2024: Vision

One of the responsibilities, and privileges, of a tech company is to tell the world what it thinks about the future: where it sees humanity heading and what it plans to do to help get there.

The Upgrade 2024 event in San Francisco was an opportunity for NTT Group to demonstrate its vision for the future. One technology that symbolizes how it intends to upgrade reality is "tsuzumi," its new Large Language Model (LLM). tsuzumi adheres strongly to the mindset: bigger isn't necessarily better; smarter is better.

Unlike traditional, huge LLMs, tsuzumi is lightweight yet powerful, designed to operate efficiently with fewer resources. The LLM comes in two flavors. The larger version has 7 billion parameters and stands out for its nimbleness: it can run on a single GPU. An even smaller version has 0.6 billion parameters and can run on a single CPU. A consequence of tsuzumi's compact size is its speed. It generates text much faster than models like ChatGPT 3.5, thanks to its architecture and an efficient tokenization process: tsuzumi's tokenizer uses a compact vocabulary that minimizes redundant tokens through word segmentation constraints. This efficiency in tokenization greatly speeds up processing and has been demonstrated by tsuzumi's excellent performance on the Rakuda benchmark.
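To put the two parameter counts in perspective, here is a rough back-of-the-envelope sketch of how much memory the model weights alone would need. The bytes-per-parameter values are common numeric formats assumed for illustration; the article does not say which format tsuzumi uses, and the figures ignore activations and other runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# The byte widths are common numeric formats, assumed for illustration;
# they are not NTT-published figures for tsuzumi.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Parameter counts as stated in the article.
MODEL_SIZES = {"7B version": 7e9, "0.6B version": 0.6e9}

for name, params in MODEL_SIZES.items():
    for dtype in BYTES_PER_PARAM:
        print(f"{name} @ {dtype}: {weight_memory_gb(params, dtype):.1f} GB")
```

Under these assumptions, the 7-billion-parameter version needs roughly 14 GB for its weights at 16-bit precision, which is within reach of a single modern GPU, while the 0.6-billion-parameter version needs roughly 1.2 GB, small enough for ordinary system memory.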

Another feature of tsuzumi that strongly aligns with NTT's vision of a rapid and clever AI system that works for humans is its ability to understand and answer questions based on document images. Real-world documents often contain both text and visual elements, which AIs have struggled to interpret up until now. NTT's "Visual Machine Reading Comprehension Technology" addresses this by enabling tsuzumi to understand and process visual information in documents.

Previous techniques required extensive training for specific tasks. tsuzumi, however, draws on its strong reasoning abilities to handle arbitrary tasks much more effectively. It uses an innovative adapter technology that converts document images into a form the model can understand, along with a comprehensive visual instruction tuning dataset. The dataset is made up of 12 tasks, such as question answering and information extraction, and it helps tsuzumi perform well on new tasks without extra training. NTT sees tsuzumi as being capable of web search and question answering based on visual documents, ultimately allowing it to work with humans and automate complex tasks.

NTT's long-term vision for tsuzumi is to create an Artificial General Intelligence (AGI) that can naturally coexist with humans in any environment, promoting human well-being: technology designed not for its own sake, but with the specific intention of helping humanity. This AGI would be designed to collaborate with humans, evolving into a life partner that grows with its human counterparts.

The AI Constellation concept is central to this vision. In it, multiple diverse, compact AI models collaborate to deliver unbiased solutions by blending varied perspectives, addressing complex issues that often elude single-model systems. By deploying multiple small AIs in a distributed configuration and linking them efficiently, NTT aims to create an architecture where they operate as an aggregate, solving complex social problems and creating unprecedented knowledge and value, and doing so sustainably, using energy as efficiently as possible.
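As a toy illustration of the general idea of several small models contributing to one answer, the sketch below polls three stand-in "models" and takes a majority vote. Everything here is hypothetical: the function names, the fixed answers, and majority voting itself are assumptions chosen for simplicity, and the article does not describe how AI Constellation actually combines its models.

```python
from collections import Counter
from typing import Callable, List, Tuple


# Hypothetical stand-ins for small, specialised models. A real system would
# call actual models here; these fixed answers are for illustration only.
def legal_view(question: str) -> str:
    return "approve"


def safety_view(question: str) -> str:
    return "reject"


def cost_view(question: str) -> str:
    return "approve"


def constellation_answer(
    models: List[Callable[[str], str]], question: str
) -> Tuple[str, float]:
    """Ask every model, then return the most common answer and its vote share."""
    answers = [model(question) for model in models]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


answer, share = constellation_answer(
    [legal_view, safety_view, cost_view], "Should the proposal go ahead?"
)
print(f"{answer} ({share:.0%} of models agree)")
```

Majority voting is only the simplest possible way to blend perspectives; a collaborative system of the kind the article describes could just as well have models debate, critique, or refine each other's answers.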

tsuzumi. It's fast, compact, very clever, doesn't gobble up resources, and makes it possible for different AI systems to play nice with each other. It's a vision for the future that focuses on humanity and what NTT Group can do to help upgrade reality.

For more details about the Upgrade 2024 event, please see this link:
https://ntt-research.com/upgrade/

We hope you have enjoyed this short series of articles on the recent Upgrade 2024 event held in San Francisco.
If you'd like to check out some features you may have missed, here are the links you need!

Our first Upgrade 2024 article introduced the event and explained the peace of mind coming from our Autonomous Closed-Loop Intervention System (ACIS) technology. You can see it here.

The following article was all about beauty: the "Kirameki" display technology. Click here for details.

Do you believe that technology can be kind? Read about NTT's Connected-AI system and let us persuade you!

Upgrade 2024: Vision through Technology

Daniel O'Connor joined the NTT Group in 1999 when he began work as the Public Relations Manager of NTT Europe. While in London, he liaised with the local press, created the company's intranet site, wrote technical copy for industry magazines and managed exhibition stands from initial design to finished displays. 

Later seconded to the headquarters of NTT Communications in Tokyo, he contributed to the company's first-ever global telecoms award wins and the digitalisation of internal company information exchange.

Since 2015 Daniel has created content for the Group's Global Leadership Institute and the One NTT Network, and is currently working with NTT R&D teams to grow public understanding of the cutting-edge research undertaken by the NTT Group.