Deploy Stable Diffusion Inference Service for Realistic Image Generation for Marketing
In today's digital age, marketing strategies increasingly rely on dynamic, visually engaging content to capture consumer attention. Using artificial intelligence to create realistic images can change how brands produce advertisements, social media posts, and other marketing materials. This tutorial guides you through setting up and deploying an AI-based image generation model on Covalent Cloud, with a focus on generating high-quality, photorealistic images.
The major benefit of using Covalent Cloud is that it lets us deploy a production-ready image generation backend with little effort, which can then be integrated into various marketing workflows to enhance visual content dynamically.
Before you start, ensure you have the latest version of covalent-cloud installed. You can update or install it using:
pip install covalent_cloud -U
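To confirm which version ended up installed, a small standard-library check can help. This is a minimal sketch: the helper name `installed_version` is hypothetical, and it only reads package metadata without importing the package.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("covalent-cloud"))
```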
Environment Configuration
First, let's set up an environment on Covalent Cloud for running our image generation model. This environment will include the necessary libraries, such as PyTorch and Hugging Face's diffusers and transformers. To learn more, see the Covalent Cloud documentation on environments. Here's how you can create the environment:
import covalent_cloud as cc
cc.create_env(
    name="stable-diffusion-env",
    pip=["torch", "diffusers", "transformers", "peft", "huggingface_hub"],
)
Environment Already Exists.
Executor Configuration
To ensure our model runs smoothly, we will configure a cloud executor with the appropriate resources. Other GPU types are available; see the Covalent Cloud documentation for the options. Here's how to configure it:
service_executor = cc.CloudExecutor(
    env="stable-diffusion-env",
    num_cpus=2,
    memory="100GB",
    num_gpus=1,
    gpu_type=cc.cloud_executor.GPU_TYPE.A100,
    time_limit="30 minutes",
)
Model Deployment
Let's define a service that hosts the image generation model. This service will be used to generate images based on the input parameters. You can learn more about defining services in the Covalent Cloud documentation. Here's how you can deploy the model:
Note:
Since we cannot JSON serialize a PIL Image, we will convert the image to a base64 string before returning it and then decode it on the client side.
@cc.service(executor=service_executor, auth=False, name="RealVis-XL")
def image_model(model="SG161222/RealVisXL_V4.0"):
    import torch
    from diffusers import StableDiffusionXLPipeline
    from diffusers.models import AutoencoderKL

    # Load a VAE that is stable in half precision
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix",
        torch_dtype=torch.float16,
    )
    pipe = StableDiffusionXLPipeline.from_pretrained(
        model,
        vae=vae,
        torch_dtype=torch.float16,
        custom_pipeline="lpw_stable_diffusion_xl",
        use_safetensors=True,
        add_watermarker=False,
        use_auth_token=None,
        variant="fp16",
    )
    # The returned key is passed to each endpoint as the `pipe` argument
    return {"pipe": pipe}

@image_model.endpoint("/generate_image")
def generate_image(
    pipe,
    prompt: str,
    negative_prompt: str = "",
    seed: int = 0,
    guidance_scale: float = 7.0,
    num_inference_steps: int = 30,
    use_upscaler: bool = False,
    upscaler_strength: float = 0.55,
    upscale_by: float = 1.5,
):
    import base64
    import io

    import torch
    from diffusers import StableDiffusionXLImg2ImgPipeline

    pipe.to("cuda")

    def seed_everything(seed):
        import random

        import numpy as np

        # Seed every random number generator in play so results are reproducible
        torch.manual_seed(seed)
        np.random.seed(seed)
        random.seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
        return torch.Generator().manual_seed(seed)

    generator = seed_everything(seed)

    if use_upscaler:
        # Reuse the already loaded components for an image-to-image pass
        upscaler_pipe = StableDiffusionXLImg2ImgPipeline(**pipe.components)
        # First pass: stop at the latent stage instead of decoding to pixels
        latents = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            generator=generator,
            output_type="latent",
        ).images
        # Enlarge the latents, then refine them with the image-to-image pipeline
        upscaled_latents = torch.nn.functional.interpolate(
            latents, scale_factor=upscale_by, mode="nearest"
        )
        images = upscaler_pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            image=upscaled_latents,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            strength=upscaler_strength,
            generator=generator,
            output_type="pil",
        ).images
    else:
        images = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            generator=generator,
            output_type="pil",
        ).images

    # Convert image to base64 string
    image = images[0]
    bytes_io = io.BytesIO()
    image.save(bytes_io, format="PNG")
    image_as_str = base64.b64encode(bytes_io.getvalue()).decode("utf-8")
    return image_as_str
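To get a feel for what `upscale_by` does to the output size, the arithmetic can be sketched as follows. It rests on assumptions the tutorial does not state: that the pipeline produces 1024-pixel edges by default, that the VAE maps each latent cell to an 8x8 pixel patch, and that `interpolate` rounds the scaled latent size down. The helper name `upscaled_edge` is hypothetical.

```python
import math

VAE_SCALE_FACTOR = 8  # assumption: one latent cell covers an 8x8 pixel patch
BASE_PIXELS = 1024    # assumption: default output edge length of the pipeline


def upscaled_edge(upscale_by, base_pixels=BASE_PIXELS):
    """Estimate the output edge length in pixels after latent upscaling."""
    latent_edge = base_pixels // VAE_SCALE_FACTOR
    # Assumption: interpolate(..., scale_factor=s) yields floor(size * s) cells
    return math.floor(latent_edge * upscale_by) * VAE_SCALE_FACTOR


for factor in (1.5, 1.55):
    print(factor, upscaled_edge(factor))
```

Under these assumptions the default of 1.5 gives 1536 pixels per edge, and the 1.55 used later in this tutorial gives 1584.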
For realistic image generation, we will use the RealVisXL V4.0 model (SG161222/RealVisXL_V4.0) from the Hugging Face model hub.
# Deploy the function service
client = cc.deploy(image_model)(model="SG161222/RealVisXL_V4.0")
image_generator = cc.get_deployment(client, wait=True)
print(image_generator)
╭────────────────────────────── Deployment Information ──────────────────────────────╮
│ Name RealVis-XL │
│ Description Add a docstring to your service function to populate this section. │
│ Function ID 664c31d9f7d37dbf2a468a6a │
│ Address https://fn.prod.covalent.xyz/1664c31d9f7d37dbf2a468a6a │
│ Status ACTIVE │
│ Tags │
│ Auth Enabled No │
╰────────────────────────────────────────────────────────────────────────────────────╯
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ POST /generate_image │
│ Streaming No │
│ Description Either add a docstring to your endpoint function or use the endpoint's 'description' parameter │
│ to populate this section. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
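The deployment information above lists an address and a POST /generate_image endpoint. A minimal sketch of preparing an HTTP request for it with the standard library is shown below. It assumes the endpoint accepts its keyword arguments as a JSON object, which this tutorial does not confirm; the helper name `build_generate_request` is hypothetical, and no authentication header is added because the service was deployed with auth=False.

```python
import json
import urllib.request


def build_generate_request(address, prompt, **params):
    """Build (but do not send) a POST request for the /generate_image endpoint."""
    url = address.rstrip("/") + "/generate_image"
    # Assumption: the endpoint takes its keyword arguments as a JSON object
    body = json.dumps({"prompt": prompt, **params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request(
    "https://fn.prod.covalent.xyz/1664c31d9f7d37dbf2a468a6a",
    prompt="closeup portrait with cinematic lighting, photorealistic",
    seed=24,
    num_inference_steps=20,
)
print(request.get_method(), request.full_url)

# Sending needs a live deployment, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     image_str = json.loads(response.read())
```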
Let us quickly check the quality of the generated images using the deployed model:
Note: Here we use the client directly to interact with the deployed model. You can also call the REST API endpoint; to learn more, see the Covalent Cloud documentation.
import io
import base64
from PIL import Image
prompt = "closeup portrait view of an american cow boy with cinematic lighting, photorealistic,canon mark 5"
negative_prompt = "(octane render, render, drawing, anime, bad photo, bad photography:1.3), (worst quality, low quality, blurry:1.2), (bad teeth, deformed teeth, deformed lips), (bad anatomy, bad proportions:1.1), (deformed iris, deformed pupils), (deformed eyes, bad eyes), (deformed face, ugly face, bad face), (deformed hands, bad hands, fused fingers), morbid, mutilated, mutation, disfigured ,unrealistic, cartoonish, CGI, 3D render, sketch, painting, illustration, low quality, blurry, grainy, pixelated, distorted, deformed, disfigured, out of focus, overexposed, underexposed, oversaturated, washed out, bad anatomy, bad proportions, extra limbs, missing limbs, floating limbs, disconnected limbs, mutated hands, mutated feet, fused fingers, elongated fingers, text, watermark, signature, logo, frame, border"
seed = 24
guidance_scale = 6.5
num_inference_steps = 20
use_upscaler = True
upscaler_strength = 0.52
upscale_by = 1.55
image_str = image_generator.generate_image(
prompt=prompt,
negative_prompt=negative_prompt,
seed=seed,
guidance_scale=guidance_scale,
num_inference_steps=num_inference_steps,
use_upscaler=use_upscaler,
upscaler_strength=upscaler_strength,
upscale_by=upscale_by,
)
image_arr = io.BytesIO(base64.b64decode(image_str))
raw_image = Image.open(image_arr)
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(raw_image)
ax.axis("off")
plt.show()
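The base64 round trip used above can be tried without a deployment or PIL. The sketch below uses a short byte string as a stand-in for real PNG data, and the "image" key in the JSON payload is purely illustrative, since the service in this tutorial returns the bare string.

```python
import base64
import json

# Stand-in for PNG data; the service takes the real bytes from image.save(...)
png_bytes = b"\x89PNG\r\n\x1a\n" + bytes(range(16))

# Raw bytes cannot be placed in a JSON document directly
try:
    json.dumps(png_bytes)
except TypeError as err:
    print("not JSON serializable:", err)

# Server side: encode the bytes as base64 text, which JSON can carry
image_as_str = base64.b64encode(png_bytes).decode("utf-8")
payload = json.dumps({"image": image_as_str})

# Client side: decode the text back into the original bytes
decoded = base64.b64decode(json.loads(payload)["image"])
print(decoded == png_bytes, len(png_bytes), len(image_as_str))
```

Base64 text is about one third larger than the bytes it encodes, which is worth keeping in mind for large upscaled images.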
Marketing Use Cases: Generating Images for Tourism Campaigns
With our realistic image generation model deployed on Covalent Cloud, we now turn to practical marketing applications. In this section, we explore the model's utility in generating high-quality images for various tourism campaigns. Such campaigns often demand captivating visuals that not only highlight the destination but also evoke a sense of adventure and allure.
Creating Images for Different Tourism Themes
We'll demonstrate how our model can be employed to create tailored images for different tourism sectors—each with unique themes like adventure, luxury, and culture. Here’s how you can utilize the model to generate images that adhere to specific marketing narratives:
travel_prompts = {
"Adventure Tourism": "National Geographic style photo of sun-drenched canyon carved through red rock, lone hiker silhouetted against dramatic sky, weathered boots and map resting on sun-warmed boulder. (realistic, epic, dramatic, adventure)",
"Luxury Tourism": "Architectural Digest style photo of a sprawling overwater bungalow perched above crystal-clear turquoise water. Hammock gently swaying on private deck, sunlight dappling the pristine wood floor. (luxurious, aspirational, paradise, serene)",
"Cultural Tourism": "Golden hour photo, vibrant mosaic wall with local folklore, partially hidden by blooming flowers. Cobblestone street bathed in warm light, long shadows hinting at rich history. (cultural immersion, vibrant, timeless, storytelling)",
"Wildlife Tourism": "Award-winning wildlife photo, majestic snow leopard perched on rocky outcrop, gaze fixed on snow-capped peaks. Faint trail of footprints in pristine snow, hinting at the rare encounter. (endangered species, raw beauty, conservation, awe-inspiring)",
"Sustainable Tourism": "Morning light, rustic wooden lodge nestled in lush greenery, solar panels gleaming. Bicycle leaning against weathered porch, inviting exploration of surrounding nature. (eco-friendly, responsible travel, tranquil, harmonious)",
"Culinary Tourism": "Food magazine style photo, top-down view of rustic wooden table laden with local delicacies: crusty bread, ripe fruits, steaming pot of stew. Single empty plate inviting participation. (farm-to-table, authentic cuisine, vibrant colors, shared experience)"
}
def generate_image(prompt):
    # Append quality keywords; negative_prompt is reused from the earlier cell
    prompt += "- correct texture, realistic, photorealistic, detailed, high quality, high resolution, natural,lifelike, sharp, crisp, Canon Mark 5"
    seed = 24
    guidance_scale = 5
    num_inference_steps = 30
    use_upscaler = True
    upscaler_strength = 0.52
    upscale_by = 1.55
    image_str = image_generator.generate_image(
        prompt=prompt,
        negative_prompt=negative_prompt,
        seed=seed,
        guidance_scale=guidance_scale,
        num_inference_steps=num_inference_steps,
        use_upscaler=use_upscaler,
        upscaler_strength=upscaler_strength,
        upscale_by=upscale_by,
    )
    # Decode the base64 string returned by the service into a PIL image
    image_arr = io.BytesIO(base64.b64decode(image_str))
    return Image.open(image_arr)

# Generate one image per category, in the order the prompts are defined
images = []
for category, prompt in travel_prompts.items():
    images.append(generate_image(prompt))
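For use in a campaign, the generated images usually need to end up on disk. One possible way to derive file names from the category labels is sketched here; the helper `image_path` and the `campaign_images` directory are hypothetical names, and the actual saving step is left as a comment because it needs the PIL images produced above.

```python
import re
from pathlib import Path


def image_path(category, out_dir="campaign_images"):
    """Map a label such as 'Adventure Tourism' to out_dir/adventure_tourism.png."""
    slug = re.sub(r"[^a-z0-9]+", "_", category.lower()).strip("_")
    return Path(out_dir) / f"{slug}.png"


categories = ["Adventure Tourism", "Luxury Tourism"]
paths = [image_path(category) for category in categories]
print([path.name for path in paths])

# With the tutorial's objects in scope, saving could look like this:
# for category, image in zip(travel_prompts, images):
#     path = image_path(category)
#     path.parent.mkdir(parents=True, exist_ok=True)
#     image.save(path)  # PIL infers the PNG format from the extension
```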
Shutdown Deployment
Before looking at the images, let's shut down the deployment to avoid any unnecessary charges:
image_generator.teardown()
'Teardown initiated asynchronously.'
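If generation fails partway through, the teardown call might never be reached. One possible safeguard is a small context manager that always calls teardown() on exit. This is a sketch rather than a Covalent Cloud feature: `torn_down_afterwards` and `FakeDeployment` are hypothetical names, and the fake object stands in for the real deployment client so the example runs offline.

```python
from contextlib import contextmanager


@contextmanager
def torn_down_afterwards(deployment):
    """Yield the deployment and call teardown() even if the body raises."""
    try:
        yield deployment
    finally:
        deployment.teardown()


class FakeDeployment:
    """Offline stand-in for a deployment client exposing teardown()."""

    def __init__(self):
        self.torn_down = False

    def teardown(self):
        self.torn_down = True
        return "Teardown initiated asynchronously."


fake = FakeDeployment()
try:
    with torn_down_afterwards(fake) as deployment:
        raise RuntimeError("simulated failure during generation")
except RuntimeError:
    pass
print(fake.torn_down)
```

With the real client, the generation loop would go inside the `with` block in place of the simulated failure.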
Adventure Tourism
National Geographic style photo of sun-drenched canyon carved through red rock, lone hiker silhouetted against dramatic sky, weathered boots and map resting on sun-warmed boulder. (realistic, epic, dramatic, adventure)
images[0]
Luxury Tourism
Architectural Digest style photo of a sprawling overwater bungalow perched above crystal-clear turquoise water. Hammock gently swaying on private deck, sunlight dappling the pristine wood floor. (luxurious, aspirational, paradise, serene)
images[1]
Cultural Tourism
Golden hour photo, vibrant mosaic wall with local folklore, partially hidden by blooming flowers. Cobblestone street bathed in warm light, long shadows hinting at rich history. (cultural immersion, vibrant, timeless, storytelling)
images[2]
Wildlife Tourism
Award-winning wildlife photo, majestic snow leopard perched on rocky outcrop, gaze fixed on snow-capped peaks. Faint trail of footprints in pristine snow, hinting at the rare encounter. (endangered species, raw beauty, conservation, awe-inspiring)
images[3]
Sustainable Tourism
Morning light, rustic wooden lodge nestled in lush greenery, solar panels gleaming. Bicycle leaning against weathered porch, inviting exploration of surrounding nature. (eco-friendly, responsible travel, tranquil, harmonious)
images[4]
Culinary Tourism
Food magazine style photo, top-down view of rustic wooden table laden with local delicacies: crusty bread, ripe fruits, steaming pot of stew. Single empty plate inviting participation. (farm-to-table, authentic cuisine, vibrant colors, shared experience)
images[5]
Conclusion
This tutorial demonstrated how to deploy and use an AI-powered image generation model on Covalent Cloud to create marketing visuals. Experiment with different prompts and configurations to explore what realistic image generation can do for your marketing campaigns. This is just a starting point: you can further customize the model to suit your requirements. With Covalent Cloud, you can also fine-tune and train the model on specific datasets so that the generated images align with your brand's vision and marketing goals, right from Python and without worrying about the infrastructure.