[GENIAC Phase 2] The Story Behind Developing a Large-Scale Generative AI Project That Produced "Competitive Results"

  • NABLAS
  • 3 days ago
  • 4 min read

GENIAC, a project implemented in collaboration with METI and NEDO, aims to strengthen Japan's generative AI development capabilities. NABLAS undertook the development of a large-scale Vision-Language Model (VLM) as part of GENIAC Phase 2, from October 2024 to April 2025. This time, we interviewed Mr. Niidate, a research engineer at NABLAS, to hear about the story behind this development!

*Please note that NABLAS was also selected for GENIAC Phase 3, awarded in July 2025.


What is GENIAC?

GENIAC (Generative AI Accelerator Challenge) is a project implemented in collaboration with METI and NEDO, with the aim of strengthening Japan's generative AI development capabilities. It primarily provides computing resources for the development of foundational models, which are core generative AI technologies, and supports demonstration surveys for the utilization of data and AI.



Please give an overview of the project you undertook in GENIAC Phase 2.

In the GENIAC Phase 2 project, we worked on developing two models. The first was a general-purpose large-scale Vision-Language Model (VLM). A VLM is a multimodal generative AI model that processes both images and text, and we aimed for top-tier performance among domestic models in both Japanese and English. The second was a VLM specialized in the food domain. Since no benchmark existed for that domain, we started from scratch, building everything from the dataset to the benchmark itself, with the goal of achieving performance that surpassed GPT-4o.


Why did you decide to develop a large-scale Vision-Language Model?

Well, one of the major reasons we undertook this project was a strong in-house need to build a Japanese model specialized in food for our own services. We believed that developing a highly versatile VLM to serve as its foundation was essential for the future. Another motivation was that, at the time, there were hardly any powerful Japanese-capable VLMs. In particular, high-performance Japanese VLMs that could accept images, videos, and even multiple images as input at once were very limited.


After development, almost everything, including the training code, has been open-sourced. This has, I believe, created an environment where researchers and developers can fine-tune with their own datasets and freely test the model's potential. Although it may be a long-term goal, I hope this will help create a cycle for building powerful large-scale models in Japan.


What were the challenges and difficulties you faced in the GENIAC project and in VLM development?

In the GENIAC project, we felt the difficulty of coordination, as we needed to carefully estimate computing resources and overall schedules in collaboration with multiple stakeholders. Particularly in multi-GPU and multi-node training, checkpoint creation and log analysis took more time than expected, and technical trial-and-error occurred frequently.


From a technical perspective, the biggest challenge was the lack of high-quality, large-scale Japanese datasets. Most datasets were English data translated into Japanese, or only partially composed of Japanese, so we spent a lot of person-hours on preprocessing, noise removal, and fixing minor bugs ourselves. Through this process, we gained significant learning about "what kind of dataset will yield what kind of model."


Also, when adding additional modules based on existing open-source models, identifying and correcting errors specific to large models took time, and initial development was a race against the clock. Balancing multiple benchmarks was also difficult; we faced a dilemma where optimizing one would degrade others. In the absence of sufficient existing literature and standard methods, independently exploring and building solutions was a very challenging aspect throughout the entire project.


We were required to develop in a way that led to results within limited time and resources, so we constantly felt a sense of responsibility to move the project forward. For that reason, I believe we were able to approach the project with both caution and speed, from the planning stage to the completion of development.


How did you overcome the above challenges and difficulties?

We thoroughly applied the policy of "starting small, rather than going big right away." This meant accumulating small verifications to minimize risk while steadily improving accuracy and stability ahead of the production runs. We were conscious of this in every process, including dataset preprocessing, model implementation, and construction of the computing environment.


Furthermore, regarding troubleshooting, by organizing priorities and aligning understanding within the team, we were able to consistently achieve results even within limited time and resources.


Above all, what was significant was that the team shared a common understanding of "creating a competitive model" and the responsibility of "seeing it through to the end." I felt that not only technology but also the desire to advance the project and teamwork were great strengths in overcoming difficulties.



What parts of the project did you find rewarding or enjoyable?

Seeing the model we developed become competitive in Japan was a great source of satisfaction. It wasn't just about creating a model; we developed it with the intention of it actually being used in the services and products we would handle in the future, so the feeling of "being able to create a model that would be used in the field" was extremely gratifying. In terms of performance, even if it didn't reach the level of the latest frontier models, a sufficiently satisfactory level was achieved, and I felt a great deal of reward in that achievement.


Furthermore, in large-scale model development, it is extremely important to divide roles and proceed as a team in areas such as dataset creation, pre-training, and post-training. Being able to proceed with the project while being conscious of such a division of labor made working collaboratively with team members a great source of motivation and enjoyment.


What are your future prospects and goals?

Our immediate goal is to utilize the VLM developed in this project in our own services and commercialize it as a competitive offering. We don't want to simply release services built on this VLM to the market; we want to create attractive services that can truly compete with other companies.


Mr. Niidate, thank you for the interview!


NABLAS is currently recruiting new members as we expand our business!

