AI Images Generated with ExpLoRA
Taking A Photo of Food is Difficult
Humanity has gotten pretty good at faking appetizing food for advertising (see any McDonald’s ad). However, that sorcery tends to be gate-kept behind $$$ that the average person doesn't have. When a friend tried to launch a Turkish Delight brand, here is how our DIY photoshoot came out.
This is with studio lights & a DSLR camera… Clear skill issue.
However, it turns out that AI is really good at generating product images, if you properly train it.
We researched different models and training procedures and came upon a concept called LoRA training. LoRA training is a way to customize an AI model by fine-tuning it on a dataset of images of your choice (e.g., your face). After using a LoRA trainer to train a model on your images, you can prompt it to generate new images of the subject in your dataset (e.g., you in a new country, at a cafe, eating a bowl of rice, etc.). We used the Flux Pro model from Black Forest Labs, and we did our training on Replicate.
While this may sound complicated, the process takes just a few button clicks. The catch is that the quality of the results depends directly on the quality of the dataset. Prepping a dataset is extremely tedious, though: each image needs to be high resolution and saved as a .png, and the images then need to be packaged into a zip file along with .txt files containing a caption for each image.
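To make the dataset layout described above concrete, here is a minimal sketch of how the files inside the zip could be named. The names (`CaptionedImage`, `DatasetEntry`, `planDataset`) are hypothetical, and pairing each image with a caption file that shares its base name is an assumption about what the trainer expects, so check the trainer's documentation before relying on it.

```typescript
// Hypothetical types and names for illustration; not taken from the ExpLoRA codebase.
interface CaptionedImage {
  originalName: string; // file name as uploaded, e.g. "IMG_0042.jpeg"
  caption: string;      // caption text for this image
}

interface DatasetEntry {
  path: string;              // file name inside the zip
  kind: "image" | "caption";
  source: string;            // original file name (image) or caption text (caption)
}

// Left-pads a positive integer with zeros to at least three digits.
function padIndex(n: number): string {
  const digits = String(n);
  return digits.length >= 3 ? digits : "000".slice(digits.length) + digits;
}

// Plans the zip contents: every image gets a .png entry and a .txt caption
// entry with the same base name (an assumed convention, see the note above).
function planDataset(images: CaptionedImage[], prefix = "img"): DatasetEntry[] {
  const entries: DatasetEntry[] = [];
  images.forEach((image, index) => {
    const base = prefix + "_" + padIndex(index + 1);
    entries.push({ path: base + ".png", kind: "image", source: image.originalName });
    entries.push({ path: base + ".txt", kind: "caption", source: image.caption.trim() });
  });
  return entries;
}
```

Each planned entry could then be written into the archive with a zip library. The sketch only decides names, so it stays independent of whichever library does the zipping.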
So, I set out to make an image processing tool to streamline and finetune the LoRA training process…
Planning Process
Product Workflow
Project Scoping
Requirements:
- Easy to use - I want LoRA training to be accessible to everyone, so the tool should simplify the entire process
- Accurate results - needs to provide a clean and perfect dataset every time, with accurate captions
Wireframes
Build Process
The build was relatively straightforward: a Next.js project, the OpenAI API for caption generation, Vercel for deployment, and the JSZip library for zip file creation. At the time, I was learning how to implement Tailwind Components, so I had a lot of fun with those in this project.
An issue that I kept running into while building was rate limiting from the OpenAI API. To work around it, I spaced out requests to the API at intervals and used exponential backoff to retry any request that did not go through. These tactics keep the website running quickly while helping users avoid rate limit errors.
Next, I originally planned on using an external PNG converter, but that was overcomplicated, so instead I wrote a function that converts images using the canvas toDataURL() method. Implementing JSZip for the zip file creation pushed me in terms of reading documentation and executing on it.
Overall, a great learning experience, and super helpful to my friend! Just look at the images we created with our polished dataset!
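For readers curious about the rate limiting fix, retrying with exponential backoff can be sketched roughly as follows. This is not the code from the site: `backoffDelayMs`, `withRetry`, and the default values (500 ms base delay, 8 s cap, 4 retries) are assumptions chosen for illustration.

```typescript
// Hypothetical helpers for illustration; names and defaults are assumptions.

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Resolves after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task`, retrying up to `maxRetries` times when it throws.
// `wait` is injectable so the delays can be skipped in tests.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 4,
  wait: (ms: number) => Promise<void> = sleep
): Promise<T> {
  let attempt = 0;
  while (true) {
    try {
      return await task();
    } catch (err) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= maxRetries) throw err;
      await wait(backoffDelayMs(attempt));
      attempt++;
    }
  }
}
```

A caption request would be wrapped as `withRetry(() => requestCaption(image))`, where `requestCaption` stands in for whatever function calls the captioning API. A real version would probably retry only on rate limit responses and add some random jitter to the delay.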
Future Features
- Better Landing Page - I want before-and-after examples with clear descriptions so users can see the difference my tool makes
- Fully Developed Tool - I would love to abstract away the training process for the user. Ideally, they upload images and ExpLoRA returns a fully trained model that they can interact with from my site