Object Generative Fill: a code pattern idea (with example in C#)

What is Generative Fill?

You are probably familiar with the Generative Fill feature in graphic editing software like Adobe Photoshop. The idea is that well-trained AI models can guess how to fill blank areas in the picture. The model can either try to guess what to do by itself or accept prompts to steer it toward generating something specific.

When we use a generative fill feature to add more context to the photo, the result might look like this:

Example of an input image.
Example of a Generative Fill transformation performed by the Cloudinary AI service.

Idea: Use the Generative Fill pattern in object-oriented programming

I want to present a pattern for object-oriented programming that is similar in concept to generative fill for images. The idea is to have an abstraction over a generative AI API that fills in the missing properties of an object. Let me illustrate it with an example:

// A model partially initialized by the user and then filled out by an AI service
class Country(string countryName) : ObjectWithId
{
    public string CountryName { get; init; } = countryName;

    [FillWithAI]
    [FillWithAIRule("Fill with the first historically known capital of the country, not the current one!")]
    public string? FirstCapital { get; set; } = null;

    [FillWithAI]
    [FillWithAIRule("Fill the value with name of the current president of the country")]
    [FillWithAIRule("Use CAPITAL LETTERS for this value")]
    public string? President { get; set; } = null;
}
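
The model above relies on two attributes and a base class that the post does not show. A minimal sketch of how they could be declared follows; the shape of `ObjectWithId`, the `AllowMultiple` setting, and the `FillRules` helper are assumptions made for illustration, not the proof-of-concept code mentioned later in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Assumed base class: an Id lets input and output items be matched
// when several objects are sent in one request.
public abstract class ObjectWithId
{
    public int Id { get; set; }
}

// Marks a property that the AI service should fill when it is null.
[AttributeUsage(AttributeTargets.Property)]
public sealed class FillWithAIAttribute : Attribute { }

// Carries one natural-language rule. AllowMultiple = true is required,
// because the President property above has two rules.
[AttributeUsage(AttributeTargets.Property, AllowMultiple = true)]
public sealed class FillWithAIRuleAttribute(string rule) : Attribute
{
    public string Rule { get; } = rule;
}

// Hypothetical helper: reads the rules back through reflection so a
// service can later turn them into prompt text.
public static class FillRules
{
    public static IEnumerable<(string Property, string Rule)> For(Type modelType) =>
        from property in modelType.GetProperties()
        where property.IsDefined(typeof(FillWithAIAttribute), inherit: false)
        from attribute in property.GetCustomAttributes<FillWithAIRuleAttribute>()
        select (Property: property.Name, Rule: attribute.Rule);
}
```

For the `Country` model, `FillRules.For(typeof(Country))` would yield one pair for `FirstCapital` and two for `President`. The relative order of repeated attributes on one property should not be relied on.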

To fill out the missing fields of an object described this way, we need a simple abstraction (here named GenerativeFill) that creates a clever prompt and uses the data we provide:

// fill values in one object instance
Country country = await GenerativeFill.FillMissingProperties(new Country("USA"));

// fill values in a list of objects
List<Country> countries = await GenerativeFill.FillMissingProperties(new Country[] { new("USA"), new("Poland"), new("France") });

Running this code should produce a result like the following (for the last example):

The result of running the previous example. I highlighted the data filled out by the OpenAI model in green.

Benefits

The benefit of an abstraction like the GenerativeFill class is that all the ugly code is hidden behind a single service with a small, simple public API. Some more concrete benefits:

  • We create a clever, generic prompt just once, and it covers many use cases. Creating a prompt string in my apps was always a pain point.
  • We insert the input data into the prompt with a generic implementation. It’s also an opportunity to optimize the payload, for example by dropping unnecessary JSON indentation to reduce cost 🙂
  • We can also automatically insert an example of the response schema into the prompt. We already have all the information we need in the C# model!
  • Despite its simplicity, this abstraction allows the processing of multiple object instances in a single request to the API. It reduces the overhead and cost of input tokens. I believe that easy processing of batches of input data is a significant cost optimization and convenience win!
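
As a rough sketch of the second and third points, the input payload and the response schema example could both be derived from the model with reflection. Everything named here (`PromptParts`, `InputJson`, `SchemaJson`) is hypothetical, and the trimmed `Country` copy exists only so the sketch compiles on its own. It relies on `System.Text.Json` writing compact, unindented JSON by default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.Json;

public abstract class ObjectWithId { public int Id { get; set; } }

[AttributeUsage(AttributeTargets.Property)]
public sealed class FillWithAIAttribute : Attribute { }

// Trimmed copy of the model from this post (rules omitted).
public class Country(string countryName) : ObjectWithId
{
    public string CountryName { get; init; } = countryName;
    [FillWithAI] public string? FirstCapital { get; set; }
    [FillWithAI] public string? President { get; set; }
}

public static class PromptParts
{
    private static bool IsAiFilled(PropertyInfo p) =>
        p.IsDefined(typeof(FillWithAIAttribute), inherit: false);

    // Input payload: only the properties supplied by the user, i.e. the
    // ones NOT marked [FillWithAI]. Default options give compact JSON.
    public static string InputJson<T>(IEnumerable<T> items) where T : ObjectWithId
    {
        var known = typeof(T).GetProperties().Where(p => !IsAiFilled(p)).ToArray();
        var rows = items
            .Select(item => known.ToDictionary(p => p.Name, p => p.GetValue(item)))
            .ToList();
        return JsonSerializer.Serialize(new { Items = rows });
    }

    // Schema example: the [FillWithAI] properties set to null plus a sample Id.
    public static string SchemaJson<T>() where T : ObjectWithId
    {
        var row = typeof(T).GetProperties()
            .Where(IsAiFilled)
            .ToDictionary(p => p.Name, p => (object?)null);
        row[nameof(ObjectWithId.Id)] = 1;
        return JsonSerializer.Serialize(new { Items = new[] { row } });
    }
}
```

Because `InputJson` takes a whole sequence, one call covers a batch of objects, which is what makes the single-request processing from the last point cheap to offer.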

Drawbacks

The drawbacks I can see so far:

  • As with all high-level abstractions, it can be leaky. Most of the time, we can forget the complexity, and it works. But sometimes networking fails, we run out of credits, or the AI model randomly breaks the contract and doesn’t return all expected results. Some of it cannot be auto-resolved and will show up as exceptions to be handled by the caller code.
    I am not currently sharing my implementation because such an abstraction needs to be battle-tested for various edge cases before someone uses it in production.
  • My proposal relies on a mutable object model. The user partially fills out the model and then passes it to the service to have the rest of the properties filled out. Many developers would prefer immutable objects to avoid pitfalls, which is possible with separate input and output models. I kept the mutable version because it is simpler, at least for conveying the general idea.
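
One way to surface the "model breaks the contract" case from the first point is to compare the Ids that were sent with the Ids that came back and throw a dedicated exception for the caller to handle. This is only a sketch; `IncompleteFillException` and `ResponseContract` are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical exception that tells the caller exactly which items
// the model failed to return.
public sealed class IncompleteFillException(IReadOnlyCollection<int> missingIds)
    : Exception($"The model returned no item for Id(s): {string.Join(", ", missingIds)}")
{
    public IReadOnlyCollection<int> MissingIds { get; } = missingIds;
}

public static class ResponseContract
{
    // Throws when at least one requested Id is absent from the response.
    public static void EnsureComplete(IEnumerable<int> requestedIds, IEnumerable<int> returnedIds)
    {
        var missing = requestedIds.Except(returnedIds).ToList();
        if (missing.Count > 0)
            throw new IncompleteFillException(missing);
    }
}
```

A caller could catch `IncompleteFillException` and retry only the items listed in `MissingIds` instead of resending the whole batch.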

Example of a working prompt

I implemented a proof of concept of the GenerativeFill abstraction described above. To help the imagination, here is an example of a generic prompt that works pretty well. The service inserts user-provided data in three places: the input JSON, the output schema, and the list of rules. The rest is just a constant template.

Input contains array of items to process (in the `Items` property):

```json
{"Items":[{"CountryName":"USA","Id":1},{"CountryName":"Poland","Id":2},{"CountryName":"France","Id":3}]}
```

The output should be in JSON and contain array of output items with following schema:

```json
{"Items":[{"FirstCapital":null,"President":null,"Id":1}]}
```

For each input item you should generate one output item, using the `id` property as a key linking input and output.
Your job is to replace the null values with content.

Use the following rules when filling values of properties:
- For `FirstCapital` property: Fill with the first historically known capital of the country, not the current one!
- For `President` property: Fill the value with name of the current president of the country
- For `President` property: Use CAPITAL LETTERS for this value

The response looks something like this (I added indentation manually for readability):

{
  "Items": [
    { "FirstCapital": "Philadelphia", "President": "JOE BIDEN", "Id": 1 },
    { "FirstCapital": "Gniezno", "President": "ANDRZEJ DUDA", "Id": 2 },
    { "FirstCapital": "Tournai", "President": "EMMANUEL MACRON", "Id": 3 }
  ]
}
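
To close the loop, a response like this has to be copied back into the original objects. The sketch below shows one possible approach: match items by `Id` and set only the `[FillWithAI]` properties that are still null. `ResponseMerger` is a hypothetical name, the trimmed `Country` copy is there only to make the block compile on its own, and the code assumes the model returned bare JSON with no surrounding text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.Json;

public abstract class ObjectWithId { public int Id { get; set; } }

[AttributeUsage(AttributeTargets.Property)]
public sealed class FillWithAIAttribute : Attribute { }

// Trimmed copy of the model from this post (rules omitted).
public class Country(string countryName) : ObjectWithId
{
    public string CountryName { get; init; } = countryName;
    [FillWithAI] public string? FirstCapital { get; set; }
    [FillWithAI] public string? President { get; set; }
}

public static class ResponseMerger
{
    public static void Merge<T>(IReadOnlyCollection<T> items, string responseJson)
        where T : ObjectWithId
    {
        var byId = items.ToDictionary(item => item.Id);
        var fillable = typeof(T).GetProperties()
            .Where(p => p.CanWrite && p.IsDefined(typeof(FillWithAIAttribute), inherit: false))
            .ToArray();

        using var doc = JsonDocument.Parse(responseJson);
        foreach (var row in doc.RootElement.GetProperty("Items").EnumerateArray())
        {
            // Skip rows whose Id does not match any input item.
            if (!byId.TryGetValue(row.GetProperty("Id").GetInt32(), out var target))
                continue;

            foreach (var property in fillable)
            {
                // Never overwrite a value the user already provided.
                if (property.GetValue(target) is null
                    && row.TryGetProperty(property.Name, out var value))
                {
                    property.SetValue(target, value.Deserialize(property.PropertyType));
                }
            }
        }
    }
}
```

Leaving non-null properties untouched means a caller can pre-fill `President` by hand and still ask the service for `FirstCapital` only.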

Summary

Integrating a C# application with a natural-language-based API like ChatGPT or Gemini involves several steps, which are quite repetitive. Designing a prompt, serializing input data, deserializing output data, handling errors, retrying in case of failures, optimizing the prompt to work on chunks of data: all this is a pain in the butt when you do it again and again.

Even though chat-like APIs already have ultra-high levels of abstraction, we can still build useful abstractions above them when we use them from object-oriented languages like C#. Maybe the pattern described here will make your life easier at some point, too 🙂
