I was tasked with improving product images, especially in scenes like kitchens and family rooms where the product tends to blend into the background. I also had to handle images that were too wide or too tall, which caused the system to pad them with unwanted whitespace. After searching Google without finding a suitable solution, I wrote two Python scripts, both relying heavily on AI models, to address the problem.
For the image cropping script, I used Roboflow to train an AI model on a series of product images, identifying the subject type in each photo (e.g., "pendant," "chandelier," etc.). The trained Roboflow model helps the code locate the product's X and Y pixel coordinates, along with its pixel width and height. The code then extracts the product and surrounding areas into a new image.
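The cropping step described above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the function name `crop_around_detection` and the `padding` parameter are hypothetical, and it assumes the detection's X and Y values mark the center of the bounding box.

```python
import numpy as np


def crop_around_detection(image, x, y, width, height, padding=0.25):
    """Crop the detected product plus a margin of surrounding scene.

    Assumption: (x, y) is the CENTER of the detection box in pixels and
    width/height are the box size. `padding` is a hypothetical knob: the
    fraction of the box size to keep on each side as context.
    """
    img_h, img_w = image.shape[:2]
    pad_w = width * padding
    pad_h = height * padding

    # Convert center/size to corner coordinates, then clamp to the image.
    left = max(0, int(round(x - width / 2 - pad_w)))
    top = max(0, int(round(y - height / 2 - pad_h)))
    right = min(img_w, int(round(x + width / 2 + pad_w)))
    bottom = min(img_h, int(round(y + height / 2 + pad_h)))

    # NumPy images are indexed rows first (y), then columns (x).
    return image[top:bottom, left:right].copy()


# Usage with a blank 200x100 placeholder image and a made-up detection.
scene = np.zeros((100, 200, 3), dtype=np.uint8)
crop = crop_around_detection(scene, x=100, y=50, width=40, height=20)
print(crop.shape)
```

Clamping to the image bounds keeps the crop valid when the product sits near an edge, at the cost of a smaller margin on that side.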
For AI upscaling, I used a forked version of the Real-ESRGAN Python script to enhance the cropped image. The tool can also increase the image resolution by up to 4x. Additionally, I added logic to move processing from the CPU to the GPU through NVIDIA CUDA, which significantly speeds up processing.
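A rough sketch of the CPU-to-GPU switch might look like this. It is not the fork's actual code: the helper names `pick_device` and `upscaled_size` are hypothetical, and it assumes the upscaler runs on PyTorch, which Real-ESRGAN is built on.

```python
def pick_device(prefer_gpu=True):
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch  # assumption: the upscaler is PyTorch-based
    except ImportError:
        # Without PyTorch there is nothing to run on the GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def upscaled_size(width, height, scale=4):
    """Pixel dimensions after upscaling by an integer factor from 1 to 4."""
    if not 1 <= scale <= 4:
        raise ValueError("scale must be between 1 and 4")
    return width * scale, height * scale


device = pick_device()
print(device, upscaled_size(640, 480))
```

Falling back to the CPU keeps the script usable on machines without an NVIDIA GPU, just more slowly.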
Examples of the tools in action:
Visual diagram showing how the AI focal-cropping and upscaling tools work.
Demo Images:
Languages and libraries:
- Python 3.11
- NumPy
- OpenCV
- InferenceHTTPClient and COCOn
- Roboflow and Roboflow API
- Real-ESRGAN modified fork
- NVIDIA CUDA
Documentation:
- Authored comprehensive README and setup guide
- Authored setup guide for Roboflow