Pipeline for text-guided image inpainting using Stable Diffusion.

- vae (AutoencoderKL) - Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
- text_encoder (CLIPTextModel) - Frozen text-encoder (clip-vit-large-patch14).
- unet (UNet2DConditionModel) - A UNet2DConditionModel to denoise the encoded image latents.
- scheduler (SchedulerMixin) - A scheduler to be used in combination with unet to denoise the encoded image latents. Can be one of DDIMScheduler, LMSDiscreteScheduler, or PNDMScheduler.
- safety_checker (StableDiffusionSafetyChecker) - Classification module that estimates whether generated images could be considered offensive or harmful. Please refer to the model card for more details.
- feature_extractor (CLIPImageProcessor) - A CLIPImageProcessor to extract features from generated images; used as inputs to the safety_checker.

This model inherits from DiffusionPipeline. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, running on a particular device, etc.).

The pipeline also inherits the following loading methods:

- load_textual_inversion() for loading textual inversion embeddings
- load_lora_weights() for loading LoRA weights
- save_lora_weights() for saving LoRA weights
- load_ip_adapter() for loading IP Adapters

__call__ ( prompt: Union[str, List[str]] = None, image: Union[torch.FloatTensor, PIL.Image.Image, np.ndarray, List[torch.FloatTensor], List[PIL.Image.Image], List[np.ndarray]] = None, mask_image: Union[torch.FloatTensor, PIL.Image.Image, np.ndarray, List[torch.FloatTensor], List[PIL.Image.Image], List[np.ndarray]] = None, masked_image_latents: torch.FloatTensor = None, height: Optional[int] = None, width: Optional[int] = None, strength: float = 1.0, num_inference_steps: int = 50, timesteps: List[int] = None, guidance_scale: float = 7.5, negative_prompt: Union[str, List[str], NoneType] = None, num_images_per_prompt: Optional[int] = 1, eta: float = 0.0, generator: Union[torch.Generator, List[torch.Generator], NoneType] = None, latents: Optional[torch.FloatTensor] = None, prompt_embeds: Optional[torch.FloatTensor] = None, negative_prompt_embeds: Optional[torch.FloatTensor] = None, ip_adapter_image: Union[PIL.Image.Image, np.ndarray, torch.FloatTensor, List[PIL.Image.Image], List[np.ndarray], List[torch.FloatTensor], NoneType] = None, output_type: Optional[str] = 'pil', return_dict: bool = True, cross_attention_kwargs: Union[Dict[str, Any], NoneType] = None, clip_skip: int = None, callback_on_step_end: Union[Callable[[int, int, Dict], None], NoneType] = None, callback_on_step_end_tensor_inputs: List[str] = ["latents"], **kwargs ) → StableDiffusionPipelineOutput or tuple

- prompt (str or List[str], optional) - The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.
- image (torch.FloatTensor, PIL.Image.Image, np.ndarray, List[torch.FloatTensor], List[PIL.Image.Image], or List[np.ndarray]) - Image, numpy array or tensor representing an image batch to be inpainted (which parts of the image to be masked out with mask_image and repainted according to prompt). For both numpy array and pytorch tensor, the expected value range is between 0 and 1. If it's a tensor or a list of tensors, the expected shape should be (B, C, H, W) or (C, H, W). If it is a numpy array or a list of arrays, the expected shape should be (B, H, W, C) or (H, W, C). It can also accept image latents as image; if passing latents directly, they are not encoded again.
- mask_image (torch.FloatTensor, PIL.Image.Image, np.ndarray, List[torch.FloatTensor], List[PIL.Image.Image], or List[np.ndarray]) - Image, numpy array or tensor representing an image batch to mask image. White pixels in the mask are repainted while black pixels are preserved. If mask_image is a PIL image, it is converted to a single channel (luminance) before use. If it's a numpy array or pytorch tensor, it should contain one color channel (L) instead of 3, so the expected shape for a pytorch tensor would be (B, 1, H, W), (B, H, W), (1, H, W), or (H, W), and for a numpy array (B, H, W, 1), (B, H, W), (H, W, 1), or (H, W).
- height (int, optional, defaults to self.unet.config.sample_size * self.vae_scale_factor) - The height in pixels of the generated image.
- width (int, optional, defaults to self.unet.config.sample_size * self.vae_scale_factor) - The width in pixels of the generated image.
- strength (float, optional, defaults to 1.0) - Indicates extent to transform the reference image. image is used as a starting point, and more noise is added the higher the strength. When strength is 1, added noise is maximum and the denoising process runs for the full number of iterations specified in num_inference_steps.
- num_inference_steps (int, optional, defaults to 50) - The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. This parameter is modulated by strength.
- timesteps (List[int], optional) - Custom timesteps to use for the denoising process with schedulers which support a timesteps argument. If not defined, the default behavior when num_inference_steps is passed will be used.
- guidance_scale (float, optional, defaults to 7.5) - A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.
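As a minimal sketch of what guidance_scale does, the following shows the standard classifier-free guidance combination of the unconditional and text-conditioned noise predictions. This is an illustration of the general technique, not the diffusers internals; plain floats stand in for the per-pixel noise tensors, and guided_noise is a hypothetical helper name.

```python
def guided_noise(noise_uncond: float, noise_text: float, guidance_scale: float) -> float:
    # Classifier-free guidance: start from the unconditional prediction and
    # move toward (or past) the text-conditioned one.
    return noise_uncond + guidance_scale * (noise_text - noise_uncond)

# At guidance_scale == 1 the result is exactly the text-conditioned
# prediction; values > 1 extrapolate past it, tying the image more
# strongly to the prompt (at the cost of some image quality).
guided_noise(0.0, 1.0, 1.0)  # 1.0
guided_noise(0.0, 1.0, 7.5)  # 7.5
```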
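The interplay between strength and num_inference_steps described above can be sketched as follows. This is a hypothetical helper illustrating the common image-to-image convention that strength scales how many denoising iterations are actually run, assumed here for illustration rather than copied from the diffusers source.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    # Noise is added up to a timestep proportional to strength; denoising
    # then only runs over that noised tail of the schedule.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start

effective_steps(50, 1.0)  # 50: maximum noise, full schedule runs
effective_steps(50, 0.5)  # 25: only half the iterations actually run
```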