This blog was written by Alex Beckner, 4th year SDSU Lavin entrepreneur, as part of his internship program.
In the dynamic realm of life sciences, the power of visual representation cannot be overstated. Whether you’re a researcher seeking to communicate complex findings, an educator striving to engage students, or a communicator bridging the gap between science and the wider world, the need for striking and informative images is paramount. Enter Midjourney – a cutting-edge tool that is revolutionizing the way we develop and present images related to life sciences. In this blog, we’ll delve into the incredible potential of Midjourney, exploring how it empowers professionals across the spectrum of life sciences to visualize and convey intricate concepts with unprecedented clarity and creativity. Say goodbye to the limitations of traditional imagery; it’s time to embark on a transformative journey through the world of visual storytelling in the life sciences, with Midjourney as our trusted guide.
My first attempts at using Midjourney’s text-to-image AI to generate images related to the medical field looked something like this:
The artificial intelligence took a very “artistic” approach to the prompt, creating these gothic scenes of painful and bloody medical procedures that were far from the light-hearted and happy images I was expecting. So, I told the AI to make the doctor and patient “happy” and “smiling” and put them in a “well lit” room:
This was a slight improvement, but their overly happy expressions somehow made the generated images even more eerie. I wanted the images to have a “stock photo” feel to them, like you would see in a generic pharmaceutical commercial. I figured that the easiest way to accomplish this was to simply put “stock photo” in the prompt:
This was by far the biggest improvement; the people looked more realistic, the colors were dulled but bright, and, overall, it suppressed the AI’s artistic tendencies in favor of a more photorealistic approach. Interestingly, however, prompts that included the word “photorealistic” had a tendency to generate more cartoonish, non-photorealistic images:
Anyway, I spent some time in Midjourney’s public servers, where anyone can type a prompt and generate images for all to see, to get an understanding of how other people engineered their prompts to get their desired outcome. Many people used words like “photorealistic,” “hyperrealistic,” “4k,” or “8k” to tell the AI to generate the most realistic, high-quality photos possible. However, I found that including such phrases made very little difference to the generated images:
What did make a difference, however, were the various “parameters” that you can add to your prompt to tell the AI to make your desired image a certain way (some of which can be seen in the above prompts). Midjourney has a number of these parameters that it understands natively, so you don’t have to spend time tweaking the wording of your prompt. To use a parameter in Midjourney, you simply type “--[parameter] [value]” (two hyphens, then the parameter name and its value) at the end of your prompt.
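To make the parameter syntax concrete, here is a minimal Python sketch that assembles a prompt string with parameters written the way Midjourney expects them, with two hyphens before each name. The helper name `append_parameters` and the example prompt text are hypothetical and only for illustration; this is not a Midjourney API, it just builds the text you would paste into the prompt box.

```python
def append_parameters(prompt, parameters):
    """Return the prompt followed by "--name value" pairs (hypothetical helper)."""
    parts = [prompt.strip()]
    for name, value in parameters.items():
        # Each parameter is two hyphens, the name, a space, then the value.
        parts.append(f"--{name} {value}")
    return " ".join(parts)


# Example prompt text is made up for illustration.
example = append_parameters(
    "stock photo of a doctor talking with a patient",
    {"ar": "16:9", "v": "5.2"},
)
print(example)
```

Keeping the descriptive text and the parameters separate like this makes it easy to rerun the same description while changing one setting at a time.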
The first one I’ll go over is the “--ar” or “--aspect” parameter, which allows you to set the aspect ratio of your image. I used a 16:9 ratio for most of my images as it is the aspect ratio used by the vast majority of computer monitors and television screens. Interestingly, I later learned that Midjourney does not actually support a 16:9 aspect ratio and automatically converts it to a 7:4 ratio. The two are close enough to not make a noticeable difference (1.78 vs 1.75), but I still found it odd that it does this. The next step is to choose your desired version, indicated by the “--v” parameter. The current Midjourney model is Version 5, and you can choose between versions 5.0, 5.1, or 5.2:
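On the aspect ratio point above, the gap between 16:9 and 7:4 can be checked with a few lines of Python. This is only a sketch of the arithmetic using the two ratios mentioned in this post; the variable names are illustrative.

```python
from fractions import Fraction

requested = Fraction(16, 9)    # the ratio typed into the prompt
substituted = Fraction(7, 4)   # the ratio this post says it was converted to

# Exact difference between the two ratios, and that difference
# expressed as a share of the requested ratio.
difference = requested - substituted
relative = difference / requested

print(f"16:9 is about {float(requested):.2f}, 7:4 is {float(substituted):.2f}")
print(f"relative difference: {relative} of the requested width-to-height ratio")
```

The relative difference works out to exactly 1/64, or a little over 1.5 percent, which is why the two framings look essentially the same.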
I found that versions 5.0 and 5.2 were the best at generating the most realistic looking images, as the people generated by version 5.1 had slightly blurred faces and the lighting in the photos was more muted. The last two parameters are “--style raw” and “--s” (or “--stylize”). The “--style raw” parameter is accepted by versions 5.1 and 5.2, with the purpose of making your images slightly less “artistic” and slightly more “photographic”:
The “--stylize” parameter works similarly, but can have a much stronger effect on the artsy-ness of your image. The value of the stylize parameter ranges from 0 to 1000, with 100 being the default value. A higher value will generate a significantly more “artistic” depiction of your prompt, but will be more liberal in terms of how closely it follows the prompt. A lower value will follow the prompt more strictly, but will be less “artistic”:
As you can see, the images prompted with higher “--stylize” values have noticeably more variation in color and more theatrical lighting, but the AI followed the prompt so loosely that only one of the images even contains a person in a doctor’s clothes, and none of them are set in a doctor’s office or hospital. On the other hand, the image prompted with the lowest “--stylize” value is duller and whiter in color, but it followed the prompt perfectly. I found that setting the “--stylize” parameter to 200, or twice the default value, generated the best balance between “artistic” and “photographic.”
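Putting the settings from this post together, the sketch below builds a complete prompt in Python and checks the two constraints mentioned above: the stylize value must fall between 0 and 1000, and “--style raw” is only paired with versions 5.1 and 5.2. The function name `build_prompt`, its defaults, and the example description are assumptions made for illustration; Midjourney itself only sees the final text.

```python
# Versions that accept "--style raw", according to this post.
STYLE_RAW_VERSIONS = {"5.1", "5.2"}


def build_prompt(description, aspect="16:9", version="5.2",
                 stylize=200, style_raw=False):
    """Assemble a prompt string (hypothetical helper, not a Midjourney API)."""
    if not 0 <= stylize <= 1000:
        raise ValueError("stylize must be between 0 and 1000")
    if style_raw and version not in STYLE_RAW_VERSIONS:
        raise ValueError("--style raw is only used here with versions 5.1 and 5.2")

    parts = [description.strip(), f"--ar {aspect}", f"--v {version}"]
    if style_raw:
        parts.append("--style raw")
    parts.append(f"--stylize {stylize}")
    return " ".join(parts)


# Example description is made up; defaults mirror the settings favored in this post.
print(build_prompt("stock photo of a doctor talking with a patient"))
```

Validating the values before building the string catches typos such as a stylize value of 2000 before any image generations are spent on them.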
In our exploration of Midjourney and its text-to-image AI capabilities, we’ve uncovered some intriguing insights into the art of crafting the perfect image. While the initial attempts may have yielded unexpected results, we’ve learned that the magic lies in the nuances of the prompts. Words like “stock photo” turned out to be the golden ticket to achieving that familiar, polished look we often associate with commercial imagery. Surprisingly, phrases like “photorealistic” didn’t always do the trick. Instead, it’s the clever use of parameters that truly fine-tunes the AI’s output. The aspect ratio, version selection, and stylization parameters proved to be the keys to achieving the delicate balance between artistry and realism. As we navigate the ever-evolving world of AI-generated visuals, it’s clear that Midjourney empowers us to wield creativity and precision in equal measure, unlocking new possibilities in the realm of visual storytelling for life sciences and beyond. So, whether you’re a scientist, educator, or communicator, harness the potential of Midjourney and embark on your own transformative journey through the world of imagery with confidence.
AI Disclosure Statement: Some of this text content was generated with an AI content tool.