Image Quality Assessment Using Machine Learning

Published: Apr 20, 2024

Image Quality Assessment (IQA), specifically Objective Blind or no-reference IQA, is a crucial technique for determining image fidelity - that is, how accurately an image represents its content. IQA helps maintain the integrity of visual data, ensuring its accurate representation. Here, we share an analysis of the machine learning models that support IQA, diving deeper into their operation, their challenges and advantages, and their significance in the ever-evolving field of image quality assessment.

  • Damon Wang / Android Engineer
  • Roy Xie / Tech Lead


In this article, we delve into the process of Image Quality Assessment (IQA) - a crucial function designed to determine the accuracy or fidelity of images. This measure of precision is vitally important in maintaining the integrity of visual data and ensuring its accurate representation.

Fundamentally, IQA algorithms take an arbitrary image as input and quantify its quality. The primary result of this process is a quality score that corresponds directly to the input image.

IQA comes in three distinct forms or methodologies, which vary based on how much reference data you have at your disposal when measuring an image's quality:

• Full-Reference IQA:
This methodology uses a 'clean' or undistorted reference image as a benchmark standard against which the distorted test image's quality deviation is measured and quantified.
• Reduced-Reference IQA:
This approach does not require an intact reference image. It instead employs an image that features only selected information elements about the standard reference, such as a watermarked version. This reduced information is then used as the comparative standard to decipher the distorted image's quality.
• Objective Blind IQA, also known as no-reference IQA:
This algorithm operates under the most stringent conditions. It has only one item of input data - the image under analysis. Consequently, its assessment of image quality is performed without the advantage of any comparative or reference data, meaning it relies solely on the inherent attributes of the image itself.

The primary focus of this article is Objective Blind IQA, also known as no-reference IQA. This methodology is unique in that it operates with minimal resources: the sole input is the image being analyzed. Lacking comparison data from any reference image, the algorithm assesses quality based on the image's inherent attributes.
While each approach to IQA has its strengths, Objective Blind IQA is particularly intriguing because it provides an unbiased, independent assessment of image quality. In the following sections, we delve deeper into its operation, the challenges and advantages it presents, and its significance in the ever-evolving field of image quality assessment.

No-Reference IQA

Several significant research studies and projects have explored No-Reference Image Quality Assessment (IQA) metrics, producing a variety of evaluation methods.

Before immersing ourselves more deeply in the theory of these methodologies, it's crucial to familiarize ourselves with two fundamental terms.

Image Distortions

These are commonly manifested as White Noise (WN), Gaussian Blur (GB), JPEG compression, and JP2K compression. White noise, for instance, often appears in low-light conditions, such as photographs taken at night with a mobile device. Gaussian blur can be introduced inadvertently if an image is not correctly focused before capture.

Fig. 1. (a) reference image, (b) JPEG compression
Fig. 2. (a) gaussian blur, (b) white noise
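To make these distortion types concrete, the two most common ones can be simulated in a few lines of NumPy. The kernel width and noise level below are illustrative choices, not values prescribed by any IQA benchmark:

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    # Separable 1-D Gaussian kernel, truncated at ~3 sigma.
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma=1.5):
    # Apply the 1-D kernel along rows, then columns (separable 2-D blur).
    k = gaussian_kernel_1d(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

def white_noise(img, sigma=25.0, seed=0):
    # Add zero-mean Gaussian noise and clip back to the 8-bit range.
    rng = np.random.default_rng(seed)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0, 255)

img = np.full((64, 64), 128.0)   # a flat gray test image
noisy = white_noise(img)         # WN distortion
blurred = gaussian_blur(noisy)   # GB distortion smooths the noise back down
```

In practice these simulated distortions are exactly how training pairs for no-reference models are often generated from clean source images.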

Natural Image

This term refers to an image that is directly captured by a camera with no subsequent post-processing. These images retain their original, undiluted form, preserving the essence of the shot as seen through the lens of the camera.

Fig. 3 Natural Image (left) and Noisy Image (distorted, right)

Possible Solutions

Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE)

At its core, Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) operates by leveraging the unique characteristics that differentiate distorted images from natural ones. Specifically, it capitalizes on the variations found in pixel intensity distributions between natural and anomalous or distorted images.

In a natural image, the pixel intensities often have a distinct pattern. After undergoing a process called normalization, wherein pixel intensities are uniformly rescaled, a characteristic distribution emerges within these pixel intensities. Remarkably, this distribution typically adheres to a Gaussian Distribution, also known as a Bell Curve - a pattern that is ubiquitous across natural images.

In contrast, images subjected to distortion or unnatural alterations exhibit a different pattern. Post-normalization, the pixel intensity distribution of these images does not fall into the predictable pattern of the Gaussian Distribution. Instead, they depict a noticeable deviation from this ideal bell curve.

BRISQUE takes advantage of this marked contrast. It measures the degree to which the pixel intensity distribution strays from the Gaussian distribution - a measure that effectively predicts the level of distortion in the image. Essentially, the larger the deviation from the ideal bell curve, the more significant the detected distortion. This approach provides a simple yet highly effective tool for evaluating image quality without any reference image.

Fig. 4 Left: a natural image with no artificial effects; its normalized coefficients fit a Gaussian distribution. Right: an artificially distorted image, which does not fit the same distribution well.
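The normalization BRISQUE relies on computes mean-subtracted, contrast-normalized (MSCN) coefficients: each pixel has a local mean subtracted and is divided by a local standard deviation. The sketch below uses a simple box window instead of the Gaussian weighting window of the original method, purely to stay dependency-free; the window size is an illustrative choice:

```python
import numpy as np

def local_mean(x, k=7):
    # Box-window local mean via summed shifted views (edge-padded).
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    acc = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            acc += xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return acc / (k * k)

def mscn(img, k=7, c=1.0):
    # Mean-subtracted, contrast-normalized coefficients.
    img = img.astype(float)
    mu = local_mean(img, k)
    var = np.maximum(local_mean(img**2, k) - mu**2, 0.0)
    sigma = np.sqrt(var)
    return (img - mu) / (sigma + c)   # c stabilizes flat regions

rng = np.random.default_rng(0)
natural_like = rng.normal(128, 20, (64, 64))
coeffs = mscn(natural_like)
# For natural-looking statistics, the coefficients cluster around zero
# with roughly unit spread, approximating the expected Gaussian shape.
```

BRISQUE then fits a generalized Gaussian to these coefficients and feeds the fitted parameters to a regressor (an SVM in the original implementation) to produce the final quality score.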

Deep CNN-Based Blind Image Quality Predictor (DIQA)

Image Quality Assessment poses its own challenges, one of the most notable being the laborious task of labeling images. The developers of the Deep CNN-Based Blind Image Quality Predictor (DIQA) sidestepped this obstacle by implementing a two-step training process that can benefit from large volumes of data.


Fig. 5. Overall flowchart of DIQA. Source: http://bit.ly/2Ldw4pz

This can be seen in Figure 5, which maps out the overall flowchart of DIQA. The details of the process are as follows:

  1. The first stage trains a Convolutional Neural Network (CNN) to predict an objective error map, a process that requires no subjective human opinion scores. The CNN learns to map the discrepancies between a natural image and its distorted counterpart.
  2. The second stage addresses subjective human perception: two fully connected layers are appended after the eighth convolutional layer, and the CNN is then fine-tuned on human opinion scores - subjective evaluations that reflect how people perceive image quality.


Fig. 6. The architecture for the objective error map prediction

The structure for this objective error map prediction can be seen in Figure 6. The diagram showcases the dual-stage flow of processing, with the red and blue arrows indicating the first stage and second stage, respectively. The use of this two-step training process has effectively augmented the predictability and reliability of the DIQA system.
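For the first training stage, the target the CNN regresses is an objective error map computed directly from a reference/distorted pair. The DIQA paper defines it as the per-pixel absolute difference raised to a small exponent (p ≈ 0.2), which compresses large errors into a narrow, learnable range. A minimal sketch of that target computation:

```python
import numpy as np

def objective_error_map(reference, distorted, p=0.2):
    """Per-pixel training target for DIQA's first stage.

    e = |I_ref - I_dist| ** p, where p < 1 compresses large
    differences so the map stays in a narrow, learnable range.
    """
    reference = reference.astype(float) / 255.0
    distorted = distorted.astype(float) / 255.0
    return np.abs(reference - distorted) ** p

rng = np.random.default_rng(0)
ref = rng.uniform(0, 255, (32, 32))
dist = np.clip(ref + rng.normal(0, 25, ref.shape), 0, 255)  # noisy version
err = objective_error_map(ref, dist)
# err is a map in [0, 1]; the stage-1 CNN learns to predict it
# from the distorted image alone - no human opinion scores needed.
```

Because these targets can be generated automatically from any clean image, stage 1 can exploit arbitrarily large datasets before the small, expensive human-labeled dataset is used in stage 2.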

Google's Neural Image Assessment (NIMA)

Google's innovative approach to image quality assessment comes in the form of Neural Image Assessment (NIMA). NIMA operates by predicting the distribution of human opinion scores. These scores play a crucial role in illustrating subjective human perspectives of image quality. To accurately reflect this spectrum of human opinions, NIMA employs the capabilities of a convolutional neural network (CNN).

The CNN is trained to forecast the range of human opinion scores on images, offering potential scores that could be assigned by human viewers. By doing so, NIMA successfully captures a wide spectrum of human aesthetic preferences and perceptions, ultimately offering predictions that can score images in a manner closely correlating to human reception.

This methodology is unique as it seeks to integrate the subjectivity of human perception within its objective framework, by predicting how a variety of humans would perceive, interpret, and grade the quality of an image. As such, NIMA offers scores that resonate well with the human means of evaluation, making it a unique and effective tool in the field of Image Quality Assessment.
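Since NIMA outputs a distribution over the ten possible ratings rather than a single number, a scalar quality score is typically recovered as the distribution's mean, with the standard deviation serving as a confidence signal. A small sketch of that reduction, assuming a softmax output over ratings 1-10:

```python
import numpy as np

def nima_score(logits):
    # Convert raw network outputs to a probability distribution
    # over the ten rating buckets (1..10), then take its mean.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    ratings = np.arange(1, 11, dtype=float)
    mean = (ratings * probs).sum()
    std = np.sqrt(((ratings - mean) ** 2 * probs).sum())
    return mean, std

# A distribution peaked at high ratings yields a high mean score.
peaked_mean, peaked_std = nima_score(
    np.array([0., 0., 0., 0., 0., 0., 1., 2., 3., 2.]))
```

Reporting the full distribution rather than just the mean is what lets NIMA distinguish a universally "average" image from a polarizing one that some raters love and others dislike.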

The OpenCV Library

Apart from utilizing machine learning methodologies, the OpenCV library presents another viable option for image quality assessment. An integral function used within this library is the Laplacian operation, which is employed to compute the second derivative of an image. This mathematical operation is significant as it offers insights into the edge information of the image.

For an identical subject, higher-definition images typically yield a greater variance when filtered by the Laplacian. This method has been tested with various images from the LIVE dataset and consistently delivered excellent results, barring a few issues with images containing white noise.
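The variance-of-Laplacian check described above needs only a single convolution. The sketch below implements it in plain NumPy rather than with OpenCV's cv2.Laplacian, and any sharp/blurry threshold on the resulting score would have to be tuned per use case:

```python
import numpy as np

LAPLACIAN = np.array([[0.,  1., 0.],
                      [1., -4., 1.],
                      [0.,  1., 0.]])

def convolve2d(img, kernel):
    # Direct 3x3 convolution via summed shifted views (edge-padded).
    kh, kw = kernel.shape
    xp = np.pad(img.astype(float), (kh // 2, kw // 2), mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * xp[dy:dy + img.shape[0],
                                       dx:dx + img.shape[1]]
    return out

def laplacian_variance(img):
    # Higher variance of the second derivative = more edge energy = sharper.
    return convolve2d(img, LAPLACIAN).var()

rng = np.random.default_rng(0)
sharp = rng.uniform(0, 255, (64, 64))               # high-frequency detail
blurry = convolve2d(sharp, np.full((3, 3), 1 / 9.0))  # crude box blur
# laplacian_variance(sharp) exceeds laplacian_variance(blurry)
```

Note the white-noise caveat from the text: noise is itself high-frequency, so a noisy image can score "sharp" by this measure even when it looks poor.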

Furthermore, the OpenCV library enables assessment of overexposure or underexposure within an image. This is achieved by analyzing the mean and variance of the grayscale version of the image. These calculations help determine the aesthetic quality of an image, specifically its exposure level. However, it must be noted that while effective, this method can also produce inaccuracies when applied to images with white noise.
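The exposure check reduces to first-order statistics of the grayscale image. The cut-off values below are illustrative assumptions, not constants from OpenCV; in practice they would be tuned against labeled photos:

```python
import numpy as np

def exposure_check(gray, low=60.0, high=190.0, min_std=10.0):
    """Classify exposure from grayscale mean/std (8-bit range).

    The thresholds here are hypothetical defaults chosen for
    illustration only.
    """
    mean = float(gray.mean())
    std = float(gray.std())
    if mean < low:
        return "underexposed"
    if mean > high:
        return "overexposed"
    if std < min_std:
        return "low contrast"
    return "ok"

dark = np.full((32, 32), 20.0)
bright = np.full((32, 32), 240.0)
```

Because heavy white noise inflates both the mean and the variance, this simple statistic can misclassify noisy images - the same caveat the text raises for the Laplacian method.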

Suggestions for Machine Learning and OpenCV Solutions

Machine learning models, when employed for Image Quality Assessment, produce outputs as floating-point scores. These results do not definitively identify the type of distortion within a low-quality image; they can only offer approximations or educated guesses that the image might be blurry or have other display issues. As such, the communication to the user may be limited to hinting that the image appears blurry, without specifically identifying the kind of distortion present.

On the other hand, solutions utilizing the OpenCV library possess an expanded range of capabilities. Not only can these methods alert users about potential blurriness within the image, but they also provide information about the image's exposure. The system can indicate whether the exposure level is under ideal conditions or if it may be too strong, thereby guiding users to adjust their approach for achieving the desired image quality.

The versatility of both machine learning and OpenCV solutions allows them to be applied in various scenarios. This can include real-time analysis during photo previews, providing instant feedback about the image's quality. It could also involve analyzing individual images post-capture, offering a detailed assessment of various image quality parameters. Ultimately, it is the combination of these solutions that can offer a comprehensive Image Quality Assessment, thereby enhancing user experience and photographic outcomes.

• BRISQUE - Advantages: simple implementation. Disadvantages: moderate accuracy (around 0.8). Dependencies: Python, OpenCV, libsvm.
• DIQA - Advantages: high accuracy (over 0.9); can be incorporated within MediaPipe. Disadvantages: requires a subjective dataset. Dependencies: deep convolutional neural networks.
• NIMA - Advantages: high accuracy; compatible with MediaPipe. Disadvantages: no notable disadvantages. Dependencies: deep convolutional neural networks.
• OpenCV - Advantages: simple implementation. Disadvantages: moderate accuracy, similar to BRISQUE. Dependencies: Python, OpenCV.


Utilizing machine learning solutions for Image Quality Assessment presents its unique set of challenges.

Firstly, the task involves identifying an efficient machine learning model that can effectively assess image quality. Once the model is chosen, it must be trained. The data collection and organization that goes into training can be a complex and laborious process, requiring the handling, sorting, and labeling of massive amounts of image data.

Secondly, proficiency in machine learning itself is essential. The user needs sufficient understanding of and experience with machine learning principles, algorithms, and applications to effectively implement and refine the chosen model. This learning process can be time-consuming and resource-intensive, which may be a barrier for those new to the field.

Implementing solutions with OpenCV presents a separate challenge. Its efficacy in detecting images with white noise remains to be validated. White noise refers to random pixels spread throughout an image, potentially distorting the image quality. Essentially, the robustness and accuracy of OpenCV in discerning white noise within images need to be assessed and confirmed.

Ultimately, despite the potential these techniques hold, these challenges must be weighed for effective and efficient application in image quality assessment.


