empulse
I personally prefer formats like the ones in Jonathan Blow's stream or .INI files. I don't understand the ambiguity problem. If you just treat any run of whitespace as a separator between identifiers, it would be very obvious to the user. If your file looks like:
| name value
name value
name value
|
then the user would know not to write namevalue. As long as you don't assign any significance to the amount of whitespace, it should be fine.
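For illustration, here is a minimal Python sketch of that rule. The function name `parse_pairs` and the "exactly two tokens per line" check are assumptions made for the example, not part of any existing format:

```python
def parse_pairs(text):
    """Parse 'name value' lines where any run of whitespace separates tokens."""
    pairs = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # splits on any amount of spaces or tabs
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 2:
            # Assumed rule: a line is exactly one name and one value,
            # so 'namevalue' (one token) is rejected with a clear message.
            raise ValueError(f"line {lineno}: expected 'name value', got {line!r}")
        name, value = fields
        pairs[name] = value
    return pairs
```

With this rule the amount of whitespace never matters, and a line like `namevalue` fails loudly instead of being silently misread.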
Yeah, I guess whitespace is an easily solvable example. What I worry about is what happens once you want to define nested elements and other structures. For example, maybe I have an item with a name and two values, or multiple key-value pairs under another name. Do we put each pair on a separate line with significant indentation:
| name value
name value
name value
name value
|
Or do we do some sort of inline syntax:
| name1 value1 name2 value2 name3 value3
|
What happens when I want the value to be a string with spaces or special characters ('\n', '\t', etc.)?
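As a rough sketch of one possible answer for the inline form, Python's standard `shlex` module can split a line on whitespace while keeping quoted strings together. The function name `parse_inline` and the "tokens alternate name, value" rule are assumptions for the example:

```python
import shlex

def parse_inline(line):
    """Parse 'name1 value1 name2 "two words" ...' into a dict."""
    tokens = shlex.split(line)  # shell-style splitting; quotes group words
    if len(tokens) % 2 != 0:
        # Assumed rule: tokens must come in name/value pairs.
        raise ValueError(f"odd number of tokens in {line!r}")
    # Even positions are names, odd positions are their values.
    return dict(zip(tokens[0::2], tokens[1::2]))
```

For example, `parse_inline('name1 value1 name2 "two words"')` keeps `two words` as a single value. Turning escapes such as '\n' or '\t' into real control characters would still be a separate rule to define and document, which is exactly the kind of thing the user then has to learn.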
I think all of these problems can be solved, but the fact that they have to be solved makes it hard for the user to know what they can and can't type. I think there's some benefit to going with a predefined language that has decent documentation and an established user base. However, I can't really say whether JSON or YAML are commonly understood, so maybe the benefit isn't as large as I'd like to hope.
It might also be important to mention that I'm not really looking for a fully featured serialization language, but I would like some support for strings, grouped key-value pairs, and possibly arrays. There's also a possibility that I'll add options that use regular expressions, so string formats that require less escaping would be better.
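To make the escaping concern concrete, here is a small Python comparison of the same regular expression stored in JSON and in a hypothetical plain "name value" line. The key name `version_pattern` and the "value is the rest of the line, verbatim" rule are assumptions for the example:

```python
import json
import re

# The same regular expression, \d+\.\d+, stored two ways.

# JSON: every backslash has to be doubled inside the string.
json_text = r'{"version_pattern": "\\d+\\.\\d+"}'
from_json = json.loads(json_text)["version_pattern"]

# Hypothetical plain format: the value is everything after the first run
# of whitespace, taken verbatim, so the regex is written exactly as-is.
plain_line = r"version_pattern \d+\.\d+"
_, from_plain = plain_line.split(None, 1)

# Both routes produce the same pattern text.
pattern = re.compile(from_plain)
print(from_json == from_plain)          # prints True
print(bool(pattern.fullmatch("3.14")))  # prints True
```

The result is identical, but in the JSON file the user has to type twice as many backslashes, which gets error-prone quickly with longer patterns.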