The Dark Side is calling me! Help! (with designing a C API)

I have a design problem that seems well-suited to OOP, but I want a clean C implementation. I'd like the community's thoughts on the matter. A bit of background:

I've been working on this image editor project, called Papaya, for some time now. Recently, I made a big breakthrough in high-level design, and want to implement everything as nodes. Here's what I would like to implement:

Layering and masking
Traditional image applications have layers. You can put pixels on a layer, and you can put layers on top of each other to composite an image. Each layer can also have a blend mode (e.g. Normal, Multiply, Overlay, etc.). Layers can also have masks. Photoshop confuses this even further by having two types of masks - a black and white mask that is represented as a second image in the same layer, and a clipping mask, which uses the layer below as a mask.

Papaya won't have any layers at all. It will have bitmap nodes instead. A bitmap node is just a type of node that will contain a bitmap image.

1
2
3
4
5
6
7
8
9
         ▲ Output
         |
     ____|___
    |        |
    | Bitmap |◄--- Mask (optional)
    |________|
         ▲
         |
         | Input (optional)



A bitmap node will have two input slots and one output slot. It will contain its own image data and a blend type. If given an input image, it will blend its own data with the input image, and will produce a resultant output image. If a mask is connected, it will be used. We can thus connect bitmap nodes to each other to obtain layering and masking functionality.

When a bitmap node is selected, users will be given bitmap/raster tools to operate on the data for that node. Brushes, erasers, selection tools, etc. will be available.


Effects
In Papaya, just like everything else, effects will be nodes too. For example, a hue/saturation effect node may look like this:

1
2
3
4
5
6
7
8
9
         ▲ Output
         |
     ____|____
    |         |
    | Hue/Sat |◄--- Mask (optional)
    |_________|
         ▲
         |
         | Input



It will have two input slots, and one output slot. All effect parameters will be node parameters, just like blend type was a node parameter for bitmap nodes. Masking will be possible just like it was for bitmap nodes.

A huge amount of effects such as blurs, color corrections, exposure, brightness, contrasts, chromatic effects, vignettes, dropshadows, embosses, etc. will be possible to atomically implement as nodes.

Vectors
The node approach means that in Papaya, both vector and raster images can be first-class features. The "document" will have no resolution whatsoever. Each bitmap node may have a different resolution. The final image is just whatever node you choose to be the final image. This means that vector nodes will have first-class support.

When a vector node is selected, the raster tools on the left bar of the editor will disappear, giving way to vector tools instead.

If you connect a vector node's output to a bitmap node's input, that vector image will be rasterized at the bitmap node's resolution.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
         ▲ 
         |
     ____|___
    |        |
    | Bitmap |
    |________|
         ▲ 
         | Rasterization occurs here, at the Bitmap node's resolution
     ____|___
    |        |
    | Vector |
    |________|
         ▲
         |
         |


(Note that vector support is non-trivial to implement. All I'm saying is that it fits natively into this node-based model.)

Lossless transforms

Another kind of node will be a transformation node, where you can translate, scale, rotate and otherwise deform the input node, but this will be lossless, since we will be retaining the original node. Tiling nodes will also be possible, which will make creating patterns like checkers and stripes easy.

Parametric images

The file format used for storage of Papaya documents will just describe this node graph and properties of each node. There may be sparse document information, such as grid-line size, guide lines, document-specific settings, etc.

Papaya will consist of two main parts:
1) The Core, which is the GUI, platform layer, etc.
2) The Engine, which will be able to serialize and deserialize nodes, and also evaluate them.


The Core will call the Engine to actually produce images by compositing the nodes. I'd like to make the Engine a separate component, such that it should be able to be integrated into other code bases like games or game engines. This way, foreign code bases will be able to read, tweak and evaluate the node graph from Papaya files, and get images in return. This way games can parameterize image generation wherever required.

---

I think it'll be valuable to create the Engine in pure C, since that makes integration easier for other people.

I'm having trouble coming up with a clean non-OOP solution. Here's the general outline of what I've been thinking about:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
// Public API - Used in Papaya's Core, and in consumer applications
// ----------------------------------------------------------------

    ppy_load(filename, &x, &y, &n) . . Loads file, returns bitmap.
    ppy_parse(...) . . . . . . . . . . Loads file, returns document data.
    ppy_node_evaluate(...) . . . . . . Takes in a node graph, and then evaluates
                                       and returns an evaluated bitmap for a
                                       specific node.

    ???  . . . . . . . . . . . . . . . Some way of selectively modifying nodes.



// Authoring API - Used primarily in Papaya's Core, but can be exposed to consumers anyway
// ---------------------------------------------------------------------------------------

     ppy_node_init() . . . . . . . . . . . Create/init a node
     ppy_node_destroy(). . . . . . . . . . Destroy a node
     ppy_write(filename, w, h, n, data). . Write document to a file


So, finally, the question: What would be a good way to structure this API?

I need to have different kinds of nodes - e.g. BitmapNode, HueSatNode, VectorNode, etc. I need to expose their internal member vars to users, allow for generic evaluation calls, and store them in a homogeneous array. This seems tantalizingly suitable for an OOP solution.

This spec will have to define what the Node structs and the function calls will look like.

Thoughts?

Edited by Apoorva Joshi on
APoorvaJ
I need to have different kinds of nodes - e.g. BitmapNode, HueSatNode, VectorNode, etc. I need to expose their internal member vars to users, allow for generic evaluation calls, and store them in a homogeneous array. This seems tantalizingly suitable for an OOP solution.


It seems like unions can accomplish what you need with the API. Of course, that may waste some space, so that could be a disqualifier.

Different kinds of nodes via a type flag. Exposing internal members is obvious. Allowing for generic evaluation calls either by a switch statement for the type flag and/or separating the node into components that will have similar behavior across multiple nodes and passing that component in rather than the entire node. Storing in a homogeneous array as the node struct containing the above.

I hope this makes sense and at least has 1% of something you haven't thought of yourself :-)

P.S. I wanted to mention that I think your design for Papaya's architecture is beautiful. I love the node structure you've created.
People reading this, just because you respect my work, please don't take my comments here as a final word. It's just an opinion.

What I would try is:

- Bottom layer:

For each type of "node", define a single function which takes all the node parameters explicitly and does the equivalent of ppy_node_evaluate. Probably take all the parameters except mask & input in a "node structure". Expose these functions and structures for use by apps that want to just leverage your operations. (Immediate mode vs retained mode.)

- Next layer:

Define a union of all the above structs, plus a type field to indicate which type. (I usually just put this field in all the structures to make everything clear, but if you do it here, because the bottom layer is exposed that might be a bad idea.) ppy_node_evaluate() goes here. it has to be implemented as a switch which calls all the above functions; or if they all have the same function signature, you can make a table of functions, one per type. (Not a function pointer per node, nor a vtable per node type. Just a flat array of functions indexed by type.)

- Next layer:

ppy_init / ppy_destroy. Though I'm not sure about this part of the design; you want the client to manage arrays of nodes, but you want to manage the storage for the nodes themselves yourself? Is the library managing graphs at all, or is still requiring the client to traverse the graph? It seems like managing and traversing the graph could be useful to library-ize as well (though it could be the next level out). ppy_destroy() will need to switch on the type to clean up ancillary data, but I expect a switch will be enough, a table of functions isn't needded.

- But:

This is what I would think about. I would sketch it out and see if it made sense without actually implementing it. I might try to write code to use it and see how it fits together. And I certainly don't know exactly what you're trying to do here so my comments on memory management above may not really be relevant.

== Unrelated comments

Tools like Photoshop pretty much do the same stuff, just with layers rather than nodes; vector objects are layers, adjustement layers are layers; embedded bitmaps (smart objects? not sure the name) allow applying a transformation matrix losslessly (I think).

The advantage of layers are purely on the UI side; it's much easier to manipulate a linear stack of layers in a UI than it is to edit a node-based graph. Also, most of the times you're going to be creating trees, not DAGs, and PS's "layer folders" allows expressing trees. But there might be times where you want to mask multiple things with one thing and you can't easily do that in PS, so there might be occasional win, but I would expect far more pain than win. Of course you can always implement a layer UI using a node-based backend, and not vice versa, so that doesn't really have to impact the API, and of course if your goal is to enable expressing things that other editors can't, well, that's your goal then.

Edited by Sean Barrett on
If we just take a literal representation of your description, you could end up with something that looks like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
enum ppy_node_type
{
    ppy_node_type_bitmap,
    ppy_node_type_huesat,
    // ... etc ...
};

struct ppy_node
{
    ppy_node *Input1;
    ppy_node *Input2;

    // Caches the result when evaluate is called
    ppy_bitmap_data *Cache;

    ppy_node_type Type;
    ppy_node_type_data *Data;
};

struct ppy_node_huesat_data
{
    f32 Hue;
    f32 Saturation;
    f32 Value;
    // ... other stuff ...
};


I think you'd want to tweak this some as you start playing with you to shape the APIs so they are easier to use. For instance, you might want to create a type that puts a functor and the pp_node_type together, or you might want a lookup table of ppy_node_type -> functors.

So the usage might look something like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
ppy_node *node = ppy_node_huesat_init(1.f /* hue */, 0.5f /* sat */, 0.1f /* value */);
// node.Type == ppy_node_type_huesat
// node.Data == pointer to ppy_node_huesat_data { Hue = 1.f, Sat = .5f, Value = 0.1f }

// Build the graph
node->Input1 = SomeOtherNode1;
node->Input2 = SomeOtherNode2;

// Walks the graph.
ppy_node_evaluate(node);


The reason to use this approach over unions is that adding new node types is merely adding an entry to the ppy_node_type enum, adding the registration for the function that evaluate can call, and adding a data structure. If you use a union, you need to keep updating that single structure which can become quite unwieldy over time. Also, this approach allows for a plug-in architecture allowing 3rd parties to add their own node types (though you'd need to update the ppy_node_type to a struct and store a type ID and the functor together for that).

This is the direction I'd probably start in, but I've only spent 15 minutes thinking about the problem.
Do you want to support custom nodes?

nothings

== Unrelated comments

Tools like Photoshop pretty much do the same stuff, just with layers rather than nodes; vector objects are layers, adjustement layers are layers; embedded bitmaps (smart objects? not sure the name) allow applying a transformation matrix losslessly (I think).

The advantage of layers are purely on the UI side; it's much easier to manipulate a linear stack of layers in a UI than it is to edit a node-based graph. Also, most of the times you're going to be creating trees, not DAGs, and PS's "layer folders" allows expressing trees. But there might be times where you want to mask multiple things with one thing and you can't easily do that in PS, so there might be occasional win, but I would expect far more pain than win. Of course you can always implement a layer UI using a node-based backend, and not vice versa, so that doesn't really have to impact the API, and of course if your goal is to enable expressing things that other editors can't, well, that's your goal then.


I find Blender's node editor very powerful and don't think it's harder to work with than a stack-based ui, so I'm very much looking forward to how this turns out :)
CaptainKraft

It seems like unions can accomplish what you need with the API. Of course, that may waste some space, so that could be a disqualifier.


I don't have a lot to say about the main topic of the question. But my experience on the issue of union and switch vs inheritance and function pointers has led me to believe that I strongly prefer the former. In fact, I have gone so far as to just do struct and switch in some cases, because even though it wastes space a little more than a union and switch, not having to manage any memory overlapping at all just makes the code slightly easier to handle and we're talking about wasting 32 or 64 bytes per item with a very low n in my case...

But anyway, the point I really want to stress about the inheritance and function pointers approach, is that it's not just poorly implemented by C++, it's just a terrible way to design code all together. The idea is you're going to try to compress your data by saying that each item is exactly as big as it should be, so if you have some items that are very big, and then lots of items that are very small, you are then saving some space.

However in reality, and in the case of your problem, that is never the case. You never really need to treat two things the same way that have vastly different struct sizes. If you're thinking "well the bitmap is really big, and an effect node is really small" then just recall we're talking about the fixed size portion of each item. The bitmap node just contains a pointer to the actual bitmap data.

On top of this, I have found that in any system like this eventually you will discover a neat way to take advantage of the fact that a node has the fields for a bitmap and an effect at the same time, and if you've aggressively separated these into different node types, you're either going to be duplicating a lot of code, doing something extraordinarily hacky with diamond inheritance structure, or just merging the code that you spent all that time separating anyway.

This isn't really API advice as much as implementation advice, but I guess what I'm saying is if you keep your OOP mindset and tweak it so that you try to flow every node type through just one struct, you'll probably be much happier with the result.

Edited by Allen Webster on
GreenLightning
Do you want to support custom nodes?


In most scenarios I find custom-node type stuff to encourage over-engineering for low benefit (in most scenarios nothing custom will ever happen; for example the author of stb_image_resize.h wanted to have the option for customized filter kernels, since the system for that was pretty generic internally, but I advised against it--"you can always add it later if people ask for it", and nobody did.).

OTOH image processing plugins are super-common, so this may be the case that really wants it. Nevertheless, I personally think you're better off designing a plug-in API separately, e.g. a node type that does have a callback or one or more callbacks, and not trying to one-size-fits-all by making your own internal nodes and externally-customized nodes have to work the same way. And indeed if you do this you can always design that node type and that plug-in API further down the road.

Edited by Sean Barrett on
Sean,

Thanks a lot for pitching in. Much appreciated. :)

Exposing the bottom-most layer as an image editing library had completely skipped my mind. It does make a lot of sense, and gives me something more to think about.

As for ppy_node_init/destroy, it does make sense to put the node data in the library, and offer an immediate-mode API.

I will think about these points and sketch up an API along with some code that uses it.

In DAGs vs trees, DAGs seem to be the clear winner to me. Usability is a UI problem, and I have a solution in mind, to keep the interface familiar, while sticking to nodes.

---

CaptainKraft,

I was considering unions+switch cases vs function pointers vs macro tricks, and I think I've eliminated macro tricks. :D

Sean's answer gives me another dimension to think about, so I'll see what fits in best.

---

owensd,

I did something similar with the current undo system in Papaya, but having done it, I prefer switch cases or function pointers to type casting opaque pointers to memory. Safer and easier to debug that way.

---

Thanks again for all of your answers. It's given me lots to think about. Feel free to continue the thread. I'll be keeping an eye on it, and will post updates as I have something more solid. :)

[Edit: New replies appeared while I was typing this out. I shall have to come back to the conversation tomorrow, though.]

Edited by Apoorva Joshi on
A couple of thoughts. I like what nothings is saying about splitting into layers. It's the approach that best allows you to experiment and change things as you develop it.

Some words of caution:

This model is well known and explored in computer graphics.

I appreciate that this approach was a breakthrough for you. That's great! However, just because this perfectly fits your mental model of how the program works internally, it does NOT mean that users will have ANY interest in directly using a node graph to express their ideas. For programmers, there are lots of obvious benefits to this approach -- BUT IT IS NOT GENERALLY SO FOR PEOPLE WHO MAKE GRAPHICS.

If you look around I am sure you can find programs that already perfectly implemented this and presented it to users. They have one thing in common: no actual users.

Internally it's great to represent the data flow like this, and even add an API. But please don't let the internal data representation dictate how users interact with your software. At least not if one of your goals is getting people to use Papaya.

PS. Sorry for caps. I feel very strongly about this. I have seen people waste years trying to get users for software based on this kind of UI.

Edited by Ralph Brorsen on Reason: PS
A couple of random thoughts.

First, the way you describe your node system, you only really need two nodes, bitmap and vector. With lossless transforms, this is just a new node linked from the previous node with the transform applied.

So effect would be added as an input to each node. And you could even have the ability to add multiple effects per node. Internally it can still be a small node, but API wise for the outside world, it perhaps doesn't matter.

Second, I think there are two useful API use cases discussed here, with a third that I don't know if you care about.

The first is the game engine usage. Do you imagine the game engine is going to modify this DAG? Or is it going to just use the final output of Papaya. If the latter, then it only really needs a read only style API that allows it to actually get the final bitmap. This would ideally be a simple call to output the DAG to a common Bitmap format. If you think they may want to makes changes, then that actually falls under the third use case, editors.

The second use case would be addons. If you want to allow developers to create custom effects, some of the earlier suggestions sound good to me. (Use your own tastes, of course.) But they have different needs, that should actually be simpler than the others. (You could even just create a simple math evaluator and allow them to write text files for this part, if you want to make development smoother.)

The last case is if you want to allow third party editors. This actually seems like the API structure you are actually envisioning, but your use cases you described don't seem to really match up.

Last advice. Write a simple display program, that does nothing but display the final output. What did you need to get that going? That would basically be the first use case for the game.

Now think about how you would want to come into and write a custom effect, but limit yourself to use as little as possible of the other APIs.

And finally, the editor would be Papaya itself, so don't go rewriting it. Design the API for it how you want to use it.

Yes, I just advised Snuffaluffagus Oriented Programming (SOP).
ApoorvaJ
Sean,

owensd,

I did something similar with the current undo system in Papaya, but having done it, I prefer switch cases or function pointers to type casting opaque pointers to memory. Safer and easier to debug that way.



You can do the type checking, it just becomes run-time validation. Though really, you're essentially doing the same thing with the switch statement.

For example, a sample evaluate function:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
bool ppy_filter_emboss_evaluate(const ppy_node *Input, ppy_node *Output)
{
    if (!Input || !Input->Input1 || !Output) { return false; }
    
    // ppy_node_data_for_type is a MACRO that validates the type.
    ppy_node_data_emboss *InputData = ppy_node_data_for_type(Input, emboss);
    if (!InputData) { return false; }

    Output->BitmapData.Pixels = i32(Input->BitmapData.Pixels + InputData->Embossness);

    return true;
}


The MACRO looks like this:

1
2
#define ppy_node_data_for_type(NODE, TYPE) \
    (ppy_node_data_ ## TYPE *)((!NODE || NODE->Type != ppy_node_type_ ## TYPE) ? 0 : NODE->TypeData)


And the structures look something like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
enum ppy_node_type
{
    ppy_node_type_bitmap,
    ppy_node_type_emboss
};

struct ppy_node_bitmap_data
{
    i32 Pixels;
};

struct ppy_node_data_emboss
{
    static const i32 Type = ppy_node_type_emboss;
    f32 Embossness;
};

typedef void* ppy_node_data;

struct ppy_node
{
    ppy_node *Input1;
    ppy_node *Input2;

    ppy_node_bitmap_data BitmapData;

    ppy_node_type Type;
    ppy_node_data TypeData;
};


You have to store a little bit of extra data with this approach (basically the type info), but I find that this alleviates many of the silly issues with wrong types during development. It's possible to then strip the runtime checks away at release builds if you really need to.
Allen,

I agree with the fact that function pointers tend to get ugly really quickly. I'll probably opt for a union & switch mechanism.

Your aggregated struct & switch suggestion is interesting, and seems like a valuable alternative to keep in mind when taking these kinds of design decisions. In this case, it is not suitable because even though the amount of nodes in any document will be very low, I'm hoping that there will be a lot of types of nodes. Storing them as an aggregated struct doesn't feel like a good way to go, at least at the moment.
Bill,

BillDStrong
First, the way you describe your node system, you only really need two nodes, bitmap and vector. With lossless transforms, this is just a new node linked from the previous node with the transform applied.


I think there is real value in keeping effects as nodes, instead of making them input/output slot modifiers. I think it keeps simple conceptually and also in terms of the UI design. Additionally, we can have nodes like channel splitters/mixers which will ease the difficulty of storing different data in different channels, while also keeping the data easy to visualize.

BillDStrong
Do you imagine the game engine is going to modify this DAG? Or is it going to just use the final output of Papaya.


This isn't fully fleshed-out yet, but the primary use case I was thinking about was that engines will be able to change the params of individual nodes, e.g. the value of a hue/sat node, the radius/type of a blur node, etc. Basically, my main use case was param tweaking, not changing the topology of the DAG.

BillDStrong
The last case is if you want to allow third party editors. This actually seems like the API structure you are actually envisioning, but your use cases you described don't seem to really match up.


Papaya itself will need all of the functions and structs used to alter the topology and display node information. I don't see why a different third-party editor API is needed.
I think this is in the intersection of API design and implementation detail.


You could let the user define a callback so they can use their own write function.

So you would have, for example:

1
ppy_load(char* filename, ...); 


and

1
2
3
typedef ppy_write_func(void* context, void* data, size_t num_bytes);

ppy_load_with_func(ppy_write_func* func, void* context, ...);


and ppy_load would internally call ppy_load_with_func with a default write function and context.

One example of how this can be useful. If a user of the API wanted to run on Windows with UTF-16, they could implement something like

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
void my_write(void* context, void* data, size_t num_bytes)
{
    FILE* fd = (FILE*)context;
    fwrite(data, num_bytes, 1, fd);
}

// Calling code
{
    my_file = wfopen(...);  // UTF-16 path.

    ppy_load(my_write, my_file, ...)
}


This transfers the details of writing data to the user. If they want to use memory-mapped files or the Win32 API or whatever they need, they can.

I implemented this in my JPEG library when I ran into the UTF-16 problem, and I got it from reading @nothings' code, of course :)

As a side note, your concept seems to be very analogous to Direct2D Effects on the implementation side.