[jam] Math thingy

First day is over for me.

My project is a program to help me work out math equations.

What I started with

I started the project a few moth ago but only worked a few hours on it. What I already had done was

  • writing an equation with identifier ( a, b, x, y...), multiply, divide, add, subtract.
  • display (in text form) the equations as trees or on a single line
  • addition/subtraction property of equality y = x => y + a = x + a
  • add similar identifiers y = x + a - a => y = x, well, sort of.

question:

  • how to handle negative ?
  • Only add operator, with negative operand or subtract operator ?
  • Should constants be signed or use a negative "property" ?
  • Should identifier have a negative property or use a * -1 system ?

Observations

  • Identifiers seem to always be leafs of the tree
  • At the moment a node has either 0 or two child, never one.

Day 1

Not much advance made. After getting used to the code again, I found an issue with one of the tree traversal functions and decided to make some tests and additional way to print out the path between "identifiers" in the equations. Hopefully tomorrow will be better.

It's not interactive at the moment, it's more an API.

/* y = ax + b */
node_t* equal = math_equal( );
node_t* y = math_identifier_l( "y" );
node_set_left( equal, y );
node_t* plus = math_add( );
node_t* a = math_identifier_l( "a" );
node_t* x = math_identifier_l( "x" );
node_t* mul = math_mul( );
node_set_left( mul, a );
node_set_right( mul, x );
node_t* b = math_identifier_l( "b" );
node_set_left( plus, mul );
node_set_right( plus, b );
node_set_right( equal, plus );

print_equation( &out_buffer, equal );

node_t* step_1 = math_step_start( equal ); {
    node_t* b_copy = math_identifier_l( "b" );
    math_equality_property_subtraction( step_1, b_copy );
    math_step_finish( &out_buffer, step_1 );
}

node_t* step_2 = math_step_start( step_1 ); {
    math_add_similar_identifiers( step_2, string_l( "b" ) );
    math_step_finish( &out_buffer, step_2 );
}

Prints out (the second step output isn't correct):

y=a*x+b
--------------------
y-b=a*x+b-b
      =                                                        
  -               -                                            
y   b         +     b                                          
          *     b                                              
        a   x                                                  

--------------------
y-b=a*x+b
      =                                                        
  -           +                                                
y   b     *     b                                              
        a   x       

Edited by Simon Anciaux on Reason: typo

Here's an idea, you could take the "mathy" approach:

"a - b" is represented as "a + (-b)"

where "-b" is a unary operator giving the "additive inverse of b" and "b" could be any expression.

then have the property: a + (-a) = 0

I think only having to deal with addition could simplify the implementation.

That's what I wanted to express with "Only add operator, with negative operand or subtract operator", but I did a poor job. One of my problem is that I never know the name of things in math which makes it hard to find information (and English is not my mother tongue so it's even worse).

I was looking for "Additive inverse" without knowing the name, so thanks for that.

From the registration thread: "I'll try to work on a program to help me work out math equations. It's not meant to solve equations by itself, only let me write an equation and do operations on it as I would on a piece of paper."

Strangely enough, this is exactly the problem I'm trying to solve too! :) But I'm approaching it more from the visual/tactile side of the problem, so I'm going to try to implement a lot of natural/intuitive mouse interactions alongside fast keyboard shortcuts to make things feel fluid.

Here are some brainstorm ideas I had, on the off chance that something might be helpful to you: https://www.evernote.com/shard/s237/sh/0b3ea4b8-9956-eb20-d3a1-81e73e493c4c/bf473b55436f015d04a454bc365a21b4

Good luck! And I'd love to chat about the idea if you are ever interested.


Edited by Tyler on

@HarmOUR Thanks. My goal is also to have easy and fast interaction with a GUI, but I prefer doing the "engine" first pretending to have any UI I want. And I don't think I'll get to the interactive part of the program during the jam. Also thanks for sharing your notes.

Day 2

I started by fixing the issue seen in the output from yesterday. It was an issue with removing nodes from the tree. After fixing the issue, I started to make the node removal function more robust with things like handling what happens when you remove the root of the tree or the last node on a side of the =. And quickly found that it wasn't the remove node function that needed to handle that.

So I tried to fix the calling code and realize that I was trying to automate the process of "adding similar elements" of the equation, which might be usefull, but the first goal of the program is to let the user choose what should happen. For example instead of having "Try to add all the x in this equation", I want the user to select two x and tell the application "Add those x".

I could do a simple version of that, but it didn't seem to help me much going toward my goal, so I tried to think a bit about "things you can add in an equation".

I chose to only allow multiply and add at the moment. After a bit I decided to try to name things with names I found on the internet. It seems obvious but to this point I used terms without certainty about what they meant. If those aren't correct, please let me know, as it would really help.

A variable is something we don't know or are searching (this definition isn't clear yet). e.g. x = a + 5 (x is a variable).

A constant is a things on its own (letter or number). e.g. y = x + 4 (4 is a constant) or y = x + a (a is a constant).

A term is a group of things (1 or more) without addition or subtraction. e.g. y = 2x (2x is a term) or y = ax (ax is a term).

A coefficient is something that multiplies a variable. e.g. y = ax (a is a coefficient) or y = 2x (2 is a coefficient).

An expression is an addition/subtraction of terms.

With those definition it became clear that I needed to be able to detect terms and work with them. For example y - b = ax + b - b, b on the right side are terms and I need to find their coefficient after adding them. It made it clear to me how to add more "complex" term like abc or 2cd4.

A thing I keep in mind is that at the moment I consider 4ac a term but not (2+2)*ac. I don't have parenthesis in the program at the moment, so I'll handle that later.

So today was about "adding term together", which is close to completion (I need to finish adding the result of the add back in the tree). Here is some code.

/* TODO simon: handle parenthesis. */
b32 math_is_term( node_t* node ) {
    
    b32 result = ( node->type == node_type_identifier || node->type == node_type_mul || node->type == node_type_div || node->type == node_type_constant );
    
    if ( result && node->left ) {
        result = math_is_term( node->left );
    }
    
    if ( result && node->right ) {
        result = math_is_term( node->right );
    }
    
    return result;
}

/* TODO simon: Handle division, paranthesis. */
/* NOTE simon: The returned node is a new tree for the term without coefficient and needs to be added to the main tree. */
node_t* math_separate_term_and_coefficient( node_t* term, s64* coefficient ) {
    
    node_t* result = 0;
    
    if ( term->type == node_type_constant ) {
        
        ( *coefficient ) *= term->constant;
        
    } else if ( term->type == node_type_identifier ) {
        
        result = math_identifier( term->identifier );
        
    } else if ( term->type == node_type_mul || term->type == node_type_div ) {
        
        node_t* left = math_separate_term_and_coefficient( term->left, coefficient );
        node_t* right = math_separate_term_and_coefficient( term->right, coefficient );
        
        if ( left && right ) {
            result = math_mul( );
            node_set_left( result, left );
            node_set_right( result, right );
        } else if ( left ) {
            result = left;
        } else if ( right ) {
            result = right;
        }
    }
    
    return result;
}

/* TODO simon: This function consider 2 * 2 * x != 4 * x. Also a * b != b * a.
The first issue can be solved by using math_separate_term_and_coefficient.
I don't know for the second issue at the moment.
 */ 
b32 math_are_terms_equal( node_t* term_1, node_t* term_2 ) {
    
    b32 result = ( term_1->type == term_2->type );
    b32 term_1_left = ( term_1->left != 0 );
    b32 term_1_right = ( term_1->right != 0 );
    b32 term_2_left = ( term_2->left != 0 );
    b32 term_2_right = ( term_2->right != 0 );
    result = result && ( term_1_left == term_2_left ) && ( term_1_right == term_2_right );
    
    if ( result ) {
        
        switch ( term_1->type ) {
            
            case node_type_identifier: {
                result = ( term_1->identifier == term_2->identifier );
            } break;
            
            /* TODO simon: No add and sub at the moment (needs parenthesis for that). */
            case node_type_mul:
            case node_type_div: {
                result = math_are_terms_equal( term_1->left, term_2->left );
                result = result && math_are_terms_equal( term_1->right, term_2->right );
            } break;
            
            case node_type_constant: {
                result = ( term_1->constant == term_2->constant );
            } break;
        }
    }
    
    return result;
}

void math_add_terms( node_t* term_1, node_t* term_2 ) {
    
    _assert( term_1 && term_2 );
    
    b32 term_1_is_valid = math_is_term( term_1 );
    b32 term_2_is_valid = math_is_term( term_2 );
    
    if ( term_1_is_valid && term_2_is_valid ) {
        
        s64 term_1_coefficient = 1;
        s64 term_2_coefficient = 1;
        node_t* term_1_no_coefficient = math_separate_term_and_coefficient( term_1, &term_1_coefficient );
        node_t* term_2_no_coefficient = math_separate_term_and_coefficient( term_2, &term_2_coefficient );
        
        b32 equal = math_are_terms_equal( term_1_no_coefficient, term_2_no_coefficient );
        
        if ( equal ) {
            
            path_t path = { 0 };
            b32 path_exists = math_path_between( term_1, 0, term_2, &path );
            
            if ( path.first_path_node_is_on_the_left_of_last_node ) {
                swap_type( term_1, term_2, node_t* );
                swap_type( term_1_coefficient, term_2_coefficient, s64 );
            }
            
            _assert( term_2->parent );
            node_t* parent = term_2->parent;
            _assert( parent->parent );
            node_t* operator_node = parent->parent;
            _assert( operator_node->type == node_type_add || operator_node->type == node_type_sub );
            
            s64 coefficient = term_1_coefficient;
            
            if ( operator_node->type == node_type_add ) {
                coefficient += term_2_coefficient;
            } else {
                coefficient -= term_2_coefficient;
            }
            
            math_remove_node( term_1 );
            math_remove_node( term_2 );
            
            if ( coefficient != 0 ) {
                node_t* mul = math_mul( );
                node_t* coefficient_node = math_constant( );
                coefficient_node->constant = coefficient;
                node_set_left( mul, coefficient_node );
                node_set_right( mul, term_1_no_coefficient );
                node_t* add = math_add( );
                node_set_right( add, mul );
                /* TODO simon: Add in the equation tree. */
            }
        }
    }
}

Edited by Simon Anciaux on Reason: typo

Day 3

I started the day by trying to finish the code for adding similar terms. Mostly I needed a way to find the right most thing of the equation and insert the coefficient and term after, which I didn't expect to be so easy.

While doing that, I encounter some "questions" about how to handle removing nodes and keeping the tree/equation coherent, which was a concern from day 2. Until now, when I removed something from the tree I was removing the node and a parent or grandparent (to this point the only nodes I wanted to remove were leaf, meaning they were only identifiers, no operator). I needed a little carefulness, for example when you remove the leaf on the right, you can just remove/replace the parent with the opposing node of the leaf, but the same is not true for removing the leaf on the left, where I need to remove the grandparent and keep the parent... That wasn't working great, because as said yesterday it gets more and more complicated when you try to handle the removal of the root of the tree, or the first node after the = sign...

It got me thinking that there are 2 concept here: removing something from the tree, and keeping the equation coherent. So I decided to split that into two functions.

  • The removal of a node in the tree now is only setting the appropriate branch of the parent to null. This lives the tree in a non coherent state.
  • Then I have a function to reorganize the tree to make it coherent again, by re-arranging and removing nodes when necessary.

It's not well tested and generates meaningless 0 in the equation (for example y = 0 + x), but it seem to handle the few cases I tested correctly. It also allows me to remove several things from the tree before making it coherent. That also led me to try to separate functions that "act on the tree" and functions that "do math on the tree".

While writing test equations, it became obvious that it's annoying and slow having to type all the function calls. I thought that writing a small parser to convert equation into a tree would be easy and fast to do. The first version was written fast but wasn't working correctly on things harder than y=a+a, and reveal that it was harder then I first thought. I ended up with the following code that seems to work, but I didn't take much time to test it properly. I'll clean it up tomorrow.

node_t* math_parser( data_t string ) {
    
    u8* start = string.bytes;
    u8* end = memory_end( string );
    u8* current = start;
    
    while ( current < end && ( *current ) != '=' ) {
        current++;
    }
    
    node_t* result = math_equal( );
    
    node_t* left = math_parse( 0, &start, current );
    current++;
    node_t* right = math_parse( 0, &current, end );
    
    node_set_left( result, left );
    node_set_right( result, right );
    
    return result;
}

node_t* math_parse( node_t* last_node, u8** current_pointer, u8* end ) {
    
    node_t* result = 0;
    node_t* node = 0;
    
    if ( ( *current_pointer ) < end ) {
        
        u8 c = ( **current_pointer );
        ( *current_pointer )++;
        
        if ( char_is_decimal( c ) ) {
            
            node = math_constant( c - '0' );
            
            if ( last_node ) {
                node_set_right( last_node, node );
                result = math_parse( last_node, current_pointer, end );
            } else {
                result = math_parse( node, current_pointer, end );
                
                if ( !result ) {
                    result = node;
                }
            }
            
        } else if ( char_is_letter( c ) ) {
            
            u64 id = math_get_id_for( memory_construct( 1, 1, ( *current_pointer ) - 1 ) );
            node = math_identifier( id );
            
            if ( last_node ) {
                node_set_right( last_node, node );
                result = math_parse( last_node, current_pointer, end );
            } else {
                result = math_parse( node, current_pointer, end );
                
                if ( !result ) {
                    result = node;
                }
            }
            
        } else if ( c == '+' ) {
            
            node = math_add( );
            node_set_left( node, last_node );
            result = math_parse( node, current_pointer, end );
            
            if ( !result ) {
                result = node;
            }
            
        } else if ( c == '-' ) {
            
            node = math_sub( );
            node_set_left( node, last_node );
            result = math_parse( node, current_pointer, end );
            
            if ( !result ) {
                result = node;
            }
            
        } else if ( c == '*' ) {
            
            node = math_mul( );
            node_set_left( node, last_node );
            result = math_parse( node, current_pointer, end );
            
            if ( !result ) {
                result = node;
            }
            
        } else if ( c == '/' ) {
            
            node = math_div( );
            node_set_left( node, last_node );
            result = math_parse( node, current_pointer, end );
            
            if ( !result ) {
                result = node;
            }
        }
    }
    
    return result;
}

Edited by Simon Anciaux on

Day 4

Not as much advance as I hoped for.

I started by cleaning up the parser a little, as all the operator "paths" did the same thing. I also realized that the parser was helpful but having all the functions call and variable names allows to "select" terms, so I also made a function to convert the tree into function calls that I can paste in the code. It's not perfect, as on each step/operation I do on the equation I create a copy of the tree, so the pointers are not valid in the new equation, but it makes it easier to create a copy of a term (sub-tree). Also to make sure variables have unique names, they are named with a _somenumber.

I generated some equation trees using the parser, and after printing them I realized that while they were "correct", they required more work to identify terms as I needed to process the operator priorities (* and / before + and -).

For example y=2*a*x+2*b+a*x produced the following tree:

y=2*a*x+2*b+a*x
  =                                                            
y                         *                                    
                      +     x                                  
                  *     a                                      
              +     b                                          
          *     2                                              
      *     x                                                  
    2   a                                                      

After a bit of thinking, I realized that I could get more meaning to the position of elements in the tree, and make it easier to work with. The idea is to have terms (things that only contain multiplication or division) near the bottom of the tree, and only have addition and subtraction between them.

So I reworked the parser so that it processes things by terms (which I wanted yesterday but failed to do) separated by addition and subtraction. It now produces:

y=2*a*x+2*b+a*x
  =                                                            
y                     +                                        
              +           *                                    
          *       *     a   x                                  
      *     x   2   b                                          
    2   a                                                      
node_t* math_parse_term( u8* start, u8* end ) {
    
    node_t* result = 0;
    
    u8* current = start;
    
    while ( current < end ) {
        
        u8 c = *current;
        
        if ( char_is_decimal( c ) ) {
            
            node_t* node = math_constant( c - '0' );
            
            if ( result ) {
                _assert( result->type == node_type_mul || result->type == node_type_div );
                node_set_right( result, node );
            } else {
                result = node;
            }
            
        } else if ( char_is_letter( c ) ) {
            
            u64 id = math_get_id_for( memory_construct( 1, 1, current ) );
            node_t* node = math_identifier( id );
            
            if ( result ) {
                _assert( result->type == node_type_mul || result->type == node_type_div );
                node_set_right( result, node );
            } else {
                result = node;
            }
            
        } else {
            
            node_t* node = 0;
            
            if ( c == '*' ) {
                node = math_mul( );
            } else {
                _assert( c == '/' );
                node = math_div( );
            }
            
            node_set_left( node, result );
            result = node;
        }
        
        current++;
    }
    
    return result;
}

node_t* math_parse( u8* start, u8* end ) {
    
    node_t* result = 0;
    
    u8* current = start;
    
    while ( current < end ) {
        
        u8* term_start = current;
        
        while ( current < end && *current != '+' && *current != '-' ) {
            current++;
        }
        
        node_t* node = math_parse_term( term_start, current );
        
        if ( result ) {
            _assert( result->type == node_type_add || result->type == node_type_sub );
            node_set_right( result, node );
        } else {
            result = node;
        }
        
        if ( current < end ) {
            
            u8 c = ( *current );
            
            if ( c == '+' ) {
                node = math_add( );
            } else if ( c == '-' ) {
                node = math_sub( );
            }
            
            _assert( result );
            node_set_left( node, result );
            result = node;
            
            current++;
        }
    }
    
    return result;
}

void tree_verify_links( node_t* node ) {
    
    _assert( node );
    
    if ( node->parent ) {
        _assert( node->parent->left == node || node->parent->right == node );
    }
    
    if ( node->left ) {
        _assert( node->left->parent == node );
        tree_verify_links( node->left );
    }
    
    if ( node->right ) {
        _assert( node->right->parent == node );
        tree_verify_links( node->right );
    }
}

node_t* math_parser( data_t string ) {
    
    u8* start = string.bytes;
    u8* end = memory_end( string );
    u8* current = start;
    
    while ( current < end && ( *current ) != '=' ) {
        current++;
    }
    
    node_t* result = math_equal( );
    node_t* left = math_parse( start, current );
    node_t* right = math_parse( current + 1, end );
    
    node_set_left( result, left );
    node_set_right( result, right );
    
    tree_verify_links( result );
    
    return result;
}

I then tried to do some simple addition of similar terms, which highlighted some bugs and obsolete assumptions. And it's starting to work, although selecting terms is not user friendly.

node_t* step_1 = math_step_start( equal ); {
    math_equality_property_subtraction( step_1, b_7 );
    // math_equality_property_addition( step_1, b_7 );
    math_step_finish( &out_buffer, step_1 );
}
        
node_t* step_2 = math_step_start( step_1 ); {
    node_t* right = tree_find_right_most( step_2->right ); 
    math_add_terms( right->parent->left->right, right );
    math_step_finish( &out_buffer, step_2 );
}
y=a*x+b
--------------------
y-b=a*x+b-b
      =                                                        
  -               -                                            
y   b         +     b                                          
          *     b                                              
        a   x                                                  

--------------------
y-b=a*x+0-0
      =                                                        
  -               -                                            
y   b         +     0                                          
          *     0                                              
        a   x                                                  
node_t* step_1 = math_step_start( equal ); {
    node_t* t1 = step_1->right->right;
    node_t* t2 = step_1->right->left->left;
    math_add_terms( t1, t2 );
    math_step_finish( &out_buffer, step_1 );
}
y=2*a*x+2*b+a*x
---------------
y=0+2*b+0+3*a*x
  =                                                            
y             +                                                
      +           +                                            
    0     *     0     *                                        
        2   b       3     *                                    
                        a   x                                  

Edited by Simon Anciaux on

Day 5

Not much code written today, but I spent quite some time making sure some math and how the tree represent it was correct.

The first thing I did was to improve the math_make_tree_coherent function to remove all the useless zeroes. I also handled multiplication and division although I didn't test those yet. Division by zero should set an error, and the equation will display the zero denominator to show where the problem is.

I also updated some function that only worked with the right side of the = sign because it was easier at first.

While improving some of the functions, the idea that it would be nice to have easier "navigation" in the tree might be useful. By navigation I mean that the first thing when you use equal->right should be the first term of the right side of the equation (or it's parent). When you look at the tree at the moment, the first term is "far" from the =. In the following example, the equal->right node is the top right +.

y=a*x+2*b-2*a*x+3*a*b*c+d+e*f
  =                                                                                                                            
y                                                 +                                                                            
                                              +       *                                                                        
                              +                 d   e   f                                                                      
                  -                       *                                                                                    
          +               *           *     c                                                                                  
      *       *       *     x     *     b                                                                                      
    a   x   2   b   2   a       3   a                                                                                          

After thinking about it a bit more while working, I came to the conclusion that it wouldn't bring any improvement, it would have the same pros and cons except in a different direction. I still made a function to "inverse the direction" of the tree with no real reason.

#define node_is_add_or_sub( node ) ( node->type == node_type_add || node->type == node_type_sub )

node_t* math_tree_left_to_right( node_t * node ) {
    
    node_t* side_root = tree_find_side_root( node );
    b32 is_left = node_is_left( side_root );
    node_t* root = side_root->parent;
    
    node_t* current = side_root;
    b32 is_valid = true;
    
    while ( is_valid && current && node_is_add_or_sub( current ) ) {
        
        is_valid = !node_is_add_or_sub( current->right );
        
        if ( is_valid ) {
            current = current->left;
        }
    }
    
    if ( is_valid ) {
        
        current = side_root;
        
        while ( current && node_is_add_or_sub( current ) ) {
            
            _assert( current->left && current->right );
            
            if ( node_is_add_or_sub( current->left ) ) {
                node_t* left = current->left;
                node_set_left( current, left->right );
                node_set_right( left, current );
                current = left;
            } else {
                break;
            }
        }
        
        if ( is_left ) {
            node_set_left( root, current );
        } else {
            node_set_right( root, current );
        }
    }
    
    return current;
}
y=a*x+2*b-2*a*x+3*a*b*c+d+e*f
  =                                                                                                                            
y         +                                                                                                                    
      *           -                                                                                                            
    a   x     *               +                                                                                                
            2   b         *                   +                                                                                
                      *     x             *       +                                                                            
                    2   a             *     c   d     *                                                                        
                                  *     b           e   f                                                                      
                                3   a                                                                                          

Also at some point notepad.exe started considering the output text file as a Chinese or Japanese text. It's a plain ascii file, and opening it in notepad++ is working correctly.

out.txt

Day 6

Started the day by making the mirror function to change the direction of the tree. I found out that it's a common operation in binary tree called tree rotation.

I also made a decision about how to represent negatives. It was pretty easy, and I don't know why I didn't do it earlier, maybe the knowledge gained this week made it clear.

  • number constants are just signed number;
  • letter constant have a sign field;
  • there are addition and subtraction operators.

Pretty straightforward and easy to work with. I hadn't any issue working with that today, but the program does so little at the moment. I spent some time making sure existing code properly handled negatives, and upgraded the parser, function call generator and printing functions.

Last, I worked on the add_terms functions to avoid generating 0-something by replacing the - operator by a negative constant in the next term. That implied making some helper functions to get the previous/next term, and testing if a term evaluates to zero.

y=-2*-a*b*-c-2*a*b*c
  =                                                                                                                            
y                    -                                                                                                         
                *                *                                                                                             
            *     -c         *     c                                                                                           
       *      b          *     b                                                                                               
    -2   -a            2   a                                                                                                   

--------------------
y=-4*a*b*c
  =                                                                                                                            
y      *                                                                                                                       
    -4         *                                                                                                               
           *     c                                                                                                             
         a   b                                                                                                                 

Edited by Simon Anciaux on

Day 7

Not much to show, but I made a small video anyway.

https://www.sisyphe.be/donotdelete/hmn_jam_2021.mkv

Bitwise is in the list of things I should watch. Should probably put it at the top of the list.

Here is a small video update on the math thingy: Math wip 23/12/21.