Subst() rewrite proposal

This is a proposal for a rewrite of the Subst logic in SCons.

Current issues

Issues with the current engine include:

  1. The processing of {} symbols is not handled correctly. The engine mishandles nested statements such as:
    • "${Foo(${Boo})}"
  2. Escape handling has issues. This might not be as simple to deal with, but the current logic is much worse than it should be.
  3. Non-string types are not handled well, especially in recursive substitution calls in which the escape handling gets invoked. For example:
    •    env['FOO']=['Hello','world']
         env['CPPDEFINES']="$FOO"
         print(env.subst("$_CPPDEFINES"))

    We would like to see:
    • /DHello /Dworld
    but we get:
    • /DHello world
    which is incorrect. In other cases, depending on how the substitute path is set, you may get:
    • "/DHello world"
    which is also incorrect.

  4. Current APIs like subst_list() don't work correctly or act in strange ways. For example, subst_list() always returns a list containing one list of items, not a list of items. (This is the difference between getting "hello", "world" and ["hello", "world"].)

  5. Caching logic is not well defined. This leads to a lot of other code in SCons trying to cache certain values that it thinks should be OK to cache. This increases memory usage and makes it difficult to correctly "clear" the cache or reevaluate when it might make sense to do so.
  6. There are some internal cases in which the user can hit bugs in the subst() engine: internal classes that should support "adding" two string objects with the "+" operator don't, due to a bug in UserString or in the class that subclasses it.

  7. Certain actions may be cumbersome to express. For example:
    •    env['Boo']="$somestuff"+env.Literal("${leave alone}")+"$otherstuff"

    It would be nice to be able to say instead:
    •    env['Boo']="$somestuff${Literal('${leave alone}')}$otherstuff"

    This gives a clearer understanding of what is needed than depending on special "hidden" string object types that handle this and that might be corrupted by a str() call.
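
The nested case from item 1 can be handled by tracking brace depth instead of stopping at the first "}". The following is a minimal sketch of that idea, not code from SCons: find_expression_end is a hypothetical helper name, and the sketch assumes that any braces inside quoted strings within the expression are balanced.

```python
def find_expression_end(text, start=0):
    """Return the index of the '}' that closes the '${' beginning at `start`.

    Hypothetical helper for illustration only. It counts every '{' and '}'
    so that nested '${...}' expressions are matched as a unit.
    """
    if not text.startswith("${", start):
        raise ValueError("expected '${' at position %d" % start)
    depth = 0
    # Begin at the '{' of the opening '${' so it is counted as depth 1.
    for i in range(start + 1, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced '${' in %r" % text)


source = "${Foo(${Boo})}"
end = find_expression_end(source)
# The text between the outer '${' and its matching '}' keeps the inner
# expression intact, so it can be substituted recursively first.
inner = source[2:end]  # "Foo(${Boo})"
```

With the outer expression isolated this way, the inner ${Boo} could be resolved before Foo(...) is evaluated.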

Suggested improvements to the "string" grammar

To help improve the syntax used when defining what to substitute I suggest that we allow for this syntax:

$$
resolves to $
$var
resolves the value of var
${expression}
evaluates the expression based on values defined in the Environment object or in a dictionary object provided. If the value is a list or other non-string type, a string conversion is applied to it.
$( value $)

excludes the value from processing by the signature function. In our case this means a certain "raw" mode will handle this value differently.

${IGNORE('value')}
long form of the $( $) syntax
${LITERAL('value')}
another way to say env.Literal('value'), but embedded in the string.
${APPEND('expression')}
evaluates the expression and returns a list of raw types (for example, Node objects might be returned), appending the values to the parent container, provided that the container is not a string type (wording??). This allows an expression that results in a list to be expanded in place correctly. For example:
  •    env.Append(MYPATH=['path1', 'path2 with space'])
       env.Replace(CPPPATH=['path0',"$[MYPATH]"])
    
would result in the expected
  • /Ipath0 /Ipath1 /I"path2 with space"

not in the current

${APPENDUNIQUE('expression',move=True)}, ${PREPENDUNIQUE('expression',move=True)}, ${PREPEND('expression')}
Like APPEND, but with different logic. In the unique cases, the optional move argument controls how ordering is handled. It defaults to True (False might be better if I have my logic backwards), which says that if the value already exists, it is moved to the end (or start) of the collection. If False, the value is not added, so the existing value is used.
$['expression']
shorthand for the APPENDUNIQUE case, as this is generally what we want with paths and flags; i.e., we want only one occurrence, and the ordering of paths matters in that we want what we depend on to move to the end of the list. (I have found this logic generally useful for large builds on POSIX systems for values such as LIBPATH and LIBS when resolving a tree of values with the substitute engine.)
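
The move behaviour described for the unique variants can be sketched on plain Python lists. This is only an illustration of the ordering rule under the stated default (move=True); append_unique and prepend_unique are hypothetical function names, not existing SCons APIs.

```python
def append_unique(target, values, move=True):
    """Append each value to `target`, keeping a single copy of it.

    With move=True an existing value is moved to the end; with move=False
    the existing value keeps its position and nothing is added.
    """
    for value in values:
        if value in target:
            if not move:
                continue  # leave the existing entry where it is
            target.remove(value)
        target.append(value)
    return target


def prepend_unique(target, values, move=True):
    """Prepend counterpart: values end up at the start in their given order."""
    for value in reversed(values):
        if value in target:
            if not move:
                continue
            target.remove(value)
        target.insert(0, value)
    return target


libs = append_unique(["a", "b", "c"], ["b", "d"])
# libs == ["a", "c", "b", "d"]: "b" moved to the end, "d" added
kept = append_unique(["a", "b", "c"], ["b", "d"], move=False)
# kept == ["a", "b", "c", "d"]: "b" stays where it was
```

Moving the existing entry is what makes this useful for values like LIBS, where a library that others depend on needs to end up after them.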

API

Functions

In package SCons.Subst

SubstString(input,subst_mode,use_cache,**dict)

input
the value to be substituted; it may be a string or a collection of strings
subst_mode

special rules to apply. A bit field of values such as SIG (remove values in $( $)) and WHITESPACE (keep white space). (what else??)

use_cache
controls if the value is cached or fully reevaluated.
dict
the environment object or dictionary of values within which the expressions in a ${}-like statement are evaluated.
Returns
This function returns a string object, or an internal object subclassed from the Python str type.
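
As a rough illustration of how a subst_mode bit field might behave, the sketch below models SIG and WHITESPACE as flags and applies them to an already-expanded string. The SubstMode class, the NONE member, the numeric values, and the apply_mode helper are all assumptions made for this example; only the SIG and WHITESPACE names and their meanings come from the description above.

```python
import enum
import re


class SubstMode(enum.IntFlag):
    """Hypothetical flag values; the proposal does not fix the numbers."""
    NONE = 0
    SIG = 1         # remove the values inside $( $)
    WHITESPACE = 2  # keep white space as written


# Matches a "$( ... $)" region and captures what is inside it.
_IGNORED = re.compile(r"\$\((.*?)\$\)")


def apply_mode(text, mode=SubstMode.NONE):
    """Apply the mode flags to `text` (illustrative helper, not an SCons API)."""
    if mode & SubstMode.SIG:
        # Signature mode: drop the markers together with their content.
        text = _IGNORED.sub("", text)
    else:
        # Otherwise keep the content and drop only the markers.
        text = _IGNORED.sub(r"\1", text)
    if not (mode & SubstMode.WHITESPACE):
        # Collapse runs of white space unless asked to keep them.
        text = " ".join(text.split())
    return text


command = "cc $( -I/tmp $) -c  foo.c"
for_signature = apply_mode(command, SubstMode.SIG)  # "cc -c foo.c"
for_execution = apply_mode(command)                 # "cc -I/tmp -c foo.c"
```

Flags combine with "|", for example SubstMode.SIG | SubstMode.WHITESPACE, which is the main reason to prefer a bit field over separate boolean arguments.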

Classes

class SubstStrBase(str):

class Literal(SubstStrBase):

class Ignore(SubstStrBase):

class SubstStr(SubstStrBase):
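
One possible reading of this hierarchy is sketched below. The class names are the ones listed above; the docstrings, the needs_expansion helper, and the roles assigned to each class are assumptions for illustration, inferred from the Literal and $( $) discussion earlier in this proposal.

```python
class SubstStrBase(str):
    """Common base so the engine can recognise its own string types."""


class Literal(SubstStrBase):
    """Assumed role: text that must be passed through without expansion."""


class Ignore(SubstStrBase):
    """Assumed role: text excluded from the signature, like $( $)."""


class SubstStr(SubstStrBase):
    """Assumed role: the result type of a substitution."""


def needs_expansion(value):
    """Hypothetical check: expand only non-Literal strings containing '$'."""
    return "$" in value and not isinstance(value, Literal)


needs_expansion("$FOO")                       # True
needs_expansion(Literal("${leave alone}"))    # False

# Plain str concatenation returns a bare str, so the subclass is lost
# unless __add__ is overridden; this is the kind of "+" pitfall noted
# in the issues list.
combined = Literal("a") + "b"
lost_type = type(combined) is str             # True
```

Because these classes derive from str, they work anywhere a string is accepted, but any design along these lines would need to decide which operations preserve the subclass.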