Nota AI, an artificial intelligence optimization technology company, has unveiled a breakthrough quantization technology that dramatically reduces the memory footprint of Solar, a high-performance large language model (LLM) developed by Upstage. With this innovation, the company successfully compresses the model size while maintaining strong accuracy and performance. As a result, the new approach lowers inference costs and improves processing speed, making large-scale AI models easier to deploy across various environments.
The development took place as part of the “Sovereign AI Foundation Model Project,” an initiative led by South Korea’s Ministry of Science and ICT. Through this program, Nota AI applied its proprietary lightweighting and optimization technologies to Solar Open 100B. Consequently, the company significantly enhanced memory efficiency without compromising the model’s capabilities. By reducing the heavy memory requirements typically associated with 100-billion-parameter models, the solution enables more practical deployment of Korean AI foundation models in physical AI environments, including robotics, mobility systems, and other edge devices.
Furthermore, the newly introduced technology addresses key challenges of the Mixture of Experts (MoE) architecture, which is gaining traction in next-generation large language models because it improves computational efficiency and scalability. However, traditional quantization techniques often compress the entire model uniformly, ignoring the distinct characteristics of individual experts. This limitation frequently leads to reduced accuracy or performance degradation.
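To illustrate why MoE improves computational efficiency, here is a minimal top-k routing sketch (not Upstage's or Nota AI's implementation; all function and variable names are illustrative): a gate scores every expert, but only the k highest-scoring experts actually compute, so per-token cost scales with k rather than the total expert count.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Top-k MoE routing sketch: score all experts with the gate,
    then run only the k best-scoring experts on the input."""
    scores = x @ gate_w                      # one score per expert
    topk = np.argsort(scores)[-k:]           # indices of selected experts
    weights = np.exp(scores[topk])
    weights /= weights.sum()                 # softmax over selected experts only
    # Weighted sum of the selected experts' outputs; the other
    # experts are never evaluated for this input.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, topk))
```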
To solve this issue, Nota AI developed a proprietary algorithm specifically optimized for MoE structures, called “Nota AI MoE Quantization.” Unlike conventional methods that apply uniform precision reduction across all operations, this algorithm selectively preserves precision in critical components of the model while compressing less sensitive areas. Therefore, the technology minimizes quantization distortion during the inference process and maintains overall performance even after significant compression.
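The selective-precision idea described above can be sketched in a few lines. The following is a toy mixed-precision scheme, not Nota AI's patented algorithm: each expert's weights are scored with a simple sensitivity proxy (ratio of peak to mean magnitude, a stand-in for whatever criterion the real method uses), outlier-heavy experts keep 8-bit precision, the rest are compressed to 4 bits, and the router stays in full precision. All names and the threshold value are assumptions for illustration.

```python
import numpy as np

def quantize_tensor(w, bits):
    """Symmetric quantization of a tensor to the given bit width."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int32), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

def quantize_moe(experts, router, sensitive_bits=8, default_bits=4,
                 sensitivity_threshold=5.0):
    """Mixed-precision sketch: experts with a high peak-to-mean
    magnitude ratio (a crude sensitivity proxy) keep more bits;
    the router is left untouched in full precision."""
    quantized = []
    for w in experts:
        sensitivity = np.max(np.abs(w)) / (np.mean(np.abs(w)) + 1e-8)
        bits = sensitive_bits if sensitivity > sensitivity_threshold else default_bits
        q, scale = quantize_tensor(w, bits)
        quantized.append((q, scale, bits))
    return quantized, router
```

The point of the sketch is the asymmetry: precision-critical components are identified and spared, instead of applying one uniform bit width across the whole model.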
When Nota AI applied this technique to the Solar 100B model, the results showed a substantial improvement over standard quantization methods. The company reduced Solar’s memory usage from 191.2GB to 51.9GB, a 72.8% reduction, while the model maintained performance close to the original version: Solar achieved a perplexity (PPL) of 6.81, close to the baseline model’s 6.06. In contrast, several generic quantization methods drove perplexity to more than five times the baseline level. To protect its innovation, Nota AI has already filed a patent application for this technology.
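The reported figures can be checked with a line of arithmetic. Using the numbers from the article:

```python
# Memory reduction reported by Nota AI (figures from the article).
baseline_gb, quantized_gb = 191.2, 51.9
reduction = (baseline_gb - quantized_gb) / baseline_gb * 100  # ~72.8-72.9%

# Relative perplexity increase versus the baseline model.
baseline_ppl, quantized_ppl = 6.06, 6.81
ppl_increase = (quantized_ppl - baseline_ppl) / baseline_ppl * 100  # ~12.4%
```

The perplexity penalty of roughly 12% stands against the more-than-fivefold degradation the article attributes to generic methods.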
Traditionally, many quantization techniques reduce memory usage at the cost of performance. However, Nota AI’s approach demonstrates that companies can achieve both efficiency and accuracy simultaneously. By enabling faster AI services and supporting more users on limited GPU infrastructure, the technology offers significant operational advantages for enterprises.
Moreover, the reduced memory footprint opens new possibilities for deploying high-performance AI directly on devices used in robotics, automotive systems, and other real-world applications. Organizations that lack access to large GPU clusters can now serve more users using the same hardware, which ultimately helps reduce operational costs and infrastructure limitations.
“This achievement is meaningful because we were able to apply Nota AI’s proprietary quantization technology to Solar 100B, a Korean AI foundation model, significantly reducing memory usage while maintaining performance,” said Myungsu Chae, CEO of Nota AI. “As demand grows for deploying large-scale models directly on devices, Nota AI’s lightweighting and optimization technologies will play a critical role in enabling high-performance AI.”