Thinking out loud

TODO:
I think I need to change the parser.body() grammar so that function calls are contained by expression productions.

Done! FUNCCALL productions are now contained by EXPRFCALL productions. The body grammar no longer parses function calls directly; it parses only function definitions and expressions, so function calls are reached only through those expression productions.

This is so I can eventually interpret code like this (once the ‘return’ statement is working).

def addem(x, y):
    return x + y

print(addem(addem(2,3) + 4, 5))
>>> 14
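
For reference, here's a tiny, self-contained sketch of what I mean (the token format and helper names are made up for illustration, not necessarily how my parser.py is actually organized):

```python
# Sketch: body() never matches a function call directly; it asks expression()
# for the next item, and expression() decides whether the tokens reduce to an
# EXPRFCALL that wraps a FUNCCALL.

tokens = ['NAME:hello', 'LPAREN', 'NUM:10', 'COMMA', 'NUM:20', 'RPAREN']

def expression(toks):
    # NAME followed by '(' reduces to an expression containing a call
    if toks[0].startswith('NAME:') and len(toks) > 1 and toks[1] == 'LPAREN':
        return ('EXPRFCALL', funccall(toks))
    return ('EXPRNUM', toks.pop(0).split(':')[1])

def funccall(toks):
    name = toks.pop(0).split(':')[1]
    toks.pop(0)                          # consume LPAREN
    params = []
    while toks[0] != 'RPAREN':
        if toks[0] == 'COMMA':
            toks.pop(0)
            continue
        params.append(expression(toks))  # params are themselves expressions
    toks.pop(0)                          # consume RPAREN
    return ('FUNCCALL', name, params)

def body(toks):
    contents = []
    while toks:
        contents.append(expression(toks))   # calls only arrive via expressions
    return ('BODY', contents)

print(body(tokens))
# ('BODY', [('EXPRFCALL', ('FUNCCALL', 'hello', [('EXPRNUM', '10'), ('EXPRNUM', '20')]))])
```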

Here’s the code coverage report.

Name             Stmts   Miss  Cover
------------------------------------
parser.py           81      1    99%
productions.py      60      0   100%
scanner.py          93     10    89%
test_parser.py      13      2    85%
------------------------------------
TOTAL              247     13    95%

Some of the scanner’s error handling code wasn’t tested.

Awesome job! I like this rolling report. I wonder if there’s some place more official for journals like this here?

Thanks! I’ve poked around this forum for some kind of journal or weblog feature, but I didn’t see anything under profile management. That said, one thing I like about this forum is that you can download all your posts from the profile activity page.

Having names for things can be helpful! While working out which kinds of parameters my grammars need, this is what I learned:
- Function definitions use formal arguments.
- Function calls use actual arguments.
http://ecomputernotes.com/what-is-c/function-a-pointer/what-are-the-differences-between-formal-arguments-and-actual-arguments-of-a-function
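
A quick Python illustration of the difference (just an example, not code from my project):

```python
def addem(x, y):        # x and y are formal arguments (the names in the definition)
    return x + y

result = addem(2, 3)    # 2 and 3 are actual arguments (the values at the call site)
print(result)           # 5
```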

This means I’ll have to write two kinds of parameter productions, and possibly a new kind of variable-identifier production.
I found some related grammars in this handy grammar spec.


(Yes, I’ve referred to this pdf before)

I don’t fully understand them, but I’m working on it (gotta read more of the Augmented Backus-Naur Form RFC). I’m using these snippets to model my own grammars. Unfortunately, the symbols get messed up during copy-paste from the pdf, and I’m not sure how to add/escape those characters inside a markdown code block (```).

<function-definition> ::= <word>: {func | function} <spec-block> <body-block>
<spec-block> ::= [ {[ <func-attrs> ]}o <formal-arg> * <return-spec> o {/local <local-var>+} o ]
<func-attrs> ::= infix | stdcall | cdecl {variadic | typed} o| {variadic | typed} cdecl o
<formal-arg> ::= <word> [ <type> ] <c-string>o
<context-definition> ::= <word>: context <body-block>

The {/local <local-var>+}o part of the grammar gets me thinking that /local is some kind of scope-handling flag. The symbol probably has something to do with the function block’s namespace.

Here are some snippets from function calls.

<fixed-arguments-function-call> ::= <qualified-word> <expression>*
<qualified-word> ::= <word> | <qualifier><word>
<qualifier> ::= <context-name>/ | <qualifier><context-name>/
<context-name> ::= <word>

The qualified-word is an identifier matched to the function’s namespace… probably to assign the value to the parameter only within the executing function’s scope. Question: how does this address recursion? There would have to be separate namespaces for each level of the recursion, no? Maybe red-lang isn’t built to support recursion (which would be weird)?

Read more about expression and assignment grammars.

<qualifier> ::= <context-name>/ | <qualifier><context-name>/

It looks as though qualifiers support nesting… so there probably is recursion in red-lang. Maybe this has something to do with the function … like that is the ‘instance’ of the function block being executed.
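
To convince myself the nesting really does fall out of that recursive rule, here’s a throwaway Python sketch (nothing to do with red-lang’s actual implementation) that reads a qualified word the way I understand the grammar:

```python
# Toy reading of:  <qualified-word> ::= <word> | <qualifier><word>
#                  <qualifier>      ::= <context-name>/ | <qualifier><context-name>/
# Each recursive expansion of <qualifier> adds one more "<context-name>/" prefix,
# so a qualified word can carry an arbitrarily deep chain of contexts.
def parse_qualified_word(text):
    *contexts, word = text.split('/')
    return contexts, word

print(parse_qualified_word('word'))                # ([], 'word')
print(parse_qualified_word('ctx/word'))            # (['ctx'], 'word')
print(parse_qualified_word('outer/inner/word'))    # (['outer', 'inner'], 'word')
```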

Lots of guessing. All guessing.

For the record: LMPTHW is a doggone challenge, but absolutely fascinating. I don’t think I’ll ever need to write my own interpreter or database from scratch again, but it feels so great to learn how they work.

It is taking me much, much longer to get through part four of the book than I had anticipated (over a month now!). As a result, I’ve finally given up on my goal of ‘getting through the course’. In exchange, I’ve committed to solving problems every day. Solving problems is more than just typing code; it’s also research, documentation, review, and insight.

Zed calls the activities in the book ‘exercises’, but if someone ever asks me about LMPTHW, I’d call them “independent study units”. He gives you the basic knowledge and expectations, but you have to extend and apply them as far as you can handle (and then some).

(EDIT: clarify problem solving)


I just found BNF grammars for Ruby!! This is so cool. So many resources out there.

This one seems to be more complete than the pdf. It includes the grammar for CALL_ARGS (which is really important to me right now!)

http://docs.huihoo.com/ruby/ruby-man-1.4/yacc.html

Here’s one for ECMAScript… probably an old version, but the symbols are all hyperlinked!!

http://tomcopeland.blogs.com/EcmaScript.html

FunctionDeclaration ::= "function" Identifier ( "(" ( FormalParameterList )? ")" ) FunctionBody
FunctionExpression ::= "function" ( Identifier )? ( "(" ( FormalParameterList )? ")" ) FunctionBody

This is from the ECMAScript grammars. I think the word “function” in quotes represents the lexeme or string representation of the token. The same goes for the parentheses.

What I find amazing is that the grammars for declaration/definition are nearly identical to the grammars for expression/calling. I think the only difference is that the Identifier is optional for the function expression… I think that’s what “( Identifier )?” means… although how can you call a function without an identifier??
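
For my own notes, here’s a Python analogue of a function expression with no identifier of its own: the value still gets called, just through whichever name or position it ends up bound to.

```python
# An anonymous function expression: no identifier of its own.
addem = lambda x, y: x + y           # bound to a variable, then called through it
print(addem(2, 3))                   # 5

# Or invoked directly where the expression appears.
print((lambda x, y: x + y)(2, 3))    # 5
```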

Anyhow, if the grammars are pretty much the same for funcdef and funccall, maybe I don’t need to write additional versions of the params grammar->production reduction to accommodate assignment and reference. Maybe the changes are higher up in the variable/identifier productions.

I’ve been thinking it over. I’m pretty sure I just need to split Parser.params() into Parser.args_formal() and Parser.args_actual(). They’ll each reduce to their own productions. When I write the visitor methods, ArgsFormal.visit() will create identifiers in the World.variables dict, whereas ArgsActual.visit() will contain variable assignments for the values which will eventually be passed to the function’s namespace (once I figure out that part).

It works! I kept the old parser.params() grammar because it already does what I need, but I modified the ExprVar.visit() method of the variable expression production. This way, variable expressions no longer ‘register’ themselves in the World.variables dict. Instead, I created a formal argument production called ArgF. Unlike expressions, ArgF productions always have an identifier and are always registered with the World.variables dict. The analyzer runs OK so far (no problems with the visitor methods). I believe my next task is to implement scope (and maybe assignment checking), and then I might be ready to start on the interpreter, ex35.
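
To sketch the idea (simplified, with made-up class bodies rather than my real productions.py):

```python
# Simplified sketch, assuming a World object that carries a 'variables' dict.
class World:
    def __init__(self):
        self.variables = {}
        self.functions = {}

class ExprVar:
    """A variable expression: references a name but no longer registers it."""
    def __init__(self, identifier):
        self.identifier = identifier
    def visit(self, world):
        return world.variables.get(self.identifier)   # read-only lookup

class ArgF:
    """A formal argument: always has an identifier and always registers it."""
    def __init__(self, identifier):
        self.identifier = identifier
    def visit(self, world):
        world.variables[self.identifier] = None       # declare in World.variables

world = World()
ArgF('x').visit(world)
ArgF('y').visit(world)
print(world.variables)   # {'x': None, 'y': None}
```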

To summarize…

Function calls reduce to the function name and parameters. The parameter values are eventually assigned to their positional counterparts in the formal arguments of the function definition.

Function definitions reduce to the function name, formal arguments, and body.

Formal arguments are initialized in the World.variables dictionary.

Ok. I think the analyzer is making sense now. I still haven’t implemented variable assignment, indent levels, or scope, and I’m not sure which order to tackle them in. My suspicion is to handle scope before I take on assignment… or maybe I could start with rudimentary, global-scope assignment and then get into local scope.
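
One common way to model nested scope is a chain of dicts: look names up from the innermost scope outward, and declare names in the innermost one. A throwaway sketch of that idea (not analyzer code):

```python
# Nested scope as a chain of dicts: lookups walk outward from the innermost
# scope; declarations always go into the innermost one.
class Scope:
    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent

    def declare(self, name, value=None):
        self.names[name] = value

    def lookup(self, name):
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(name)

globals_ = Scope()
globals_.declare('greeting', 'hello')
call_frame = Scope(parent=globals_)    # e.g. one frame per function call
call_frame.declare('x', 10)
print(call_frame.lookup('x'))          # 10, from the local frame
print(call_frame.lookup('greeting'))   # 'hello', found in the enclosing scope
```
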
Anyhow, here’s my analyzer test’s sample output

>>>FuncDefNAME:  print
>>>FuncDefNAME:  hello
>>>	script:  [
<ROOT>
 body: [
<FUNCDEF>
 name: print 
 args_formal: 
<ARGSFORMAL>
 arguments [
<ARGF>
 identifier: printarg
</ARGF>]
</ARGSFORMAL> 
 body: 
<BODY indent: 0 >
 contents: []
</BODY>
</FUNCDEF>, 
<FUNCDEF>
 name: hello 
 args_formal: 
<ARGSFORMAL>
 arguments [
<ARGF>
 identifier: x
</ARGF>, 
<ARGF>
 identifier: y
</ARGF>]
</ARGSFORMAL> 
 body: 
<BODY indent: 0 >
 contents: [
<FUNCCALL>
 name: print
 params: 
<PARAMS>
 parameters: [
<EXPRPLUS>
 left: 
<EXPRVAR>
 identifier: x
</EXPRVAR>
 right: 
<EXPRVAR>
 identifier: y
</EXPRVAR>
</EXPRPLUS>]
</PARAMS>
</FUNCCALL>]
</BODY>
</FUNCDEF>, 
<FUNCCALL>
 name: hello
 params: 
<PARAMS>
 parameters: [
<EXPRNUM>
 value: 10
</EXPRNUM>, 
<EXPRNUM>
 value: 20
</EXPRNUM>]
</PARAMS>
</FUNCCALL>]
</ROOT>]
>>>	variables:  {'printarg': None, 'x': None, 'y': None}
>>>	functions: {'print': 
<FUNCDEF>
 name: print 
 args_formal: 
<ARGSFORMAL>
 arguments [
<ARGF>
 identifier: printarg
</ARGF>]
</ARGSFORMAL> 
 body: 
<BODY indent: 0 >
 contents: []
</BODY>
</FUNCDEF>, 'hello': 
<FUNCDEF>
 name: hello 
 args_formal: 
<ARGSFORMAL>
 arguments [
<ARGF>
 identifier: x
</ARGF>, 
<ARGF>
 identifier: y
</ARGF>]
</ARGSFORMAL> 
 body: 
<BODY indent: 0 >
 contents: [
<FUNCCALL>
 name: print
 params: 
<PARAMS>
 parameters: [
<EXPRPLUS>
 left: 
<EXPRVAR>
 identifier: x
</EXPRVAR>
 right: 
<EXPRVAR>
 identifier: y
</EXPRVAR>
</EXPRPLUS>]
</PARAMS>
</FUNCCALL>]
</BODY>
</FUNCDEF>}

In the previous example, >>> variables and >>> functions show that the visitor methods are indeed declaring function definitions and variables in the analyzer’s world object. The analyzer absolutely requires that functions be defined/declared before being called, which is pretty cool. The same will apply to variables: the analyzer will require that they be declared before being assigned any value or unsimplified expression.
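
The define-before-call check itself can be pretty small; something along these lines (illustrative only, not the real visitor code):

```python
# Illustrative sketch of a define-before-call check.
class World:
    def __init__(self):
        self.variables = {}
        self.functions = {}

class FuncCall:
    def __init__(self, name, params):
        self.name = name
        self.params = params
    def visit(self, world):
        if self.name not in world.functions:
            raise NameError("function %r called before it was defined" % self.name)
        # ...otherwise carry on visiting the params...

world = World()
try:
    FuncCall('hello', []).visit(world)   # 'hello' has not been defined yet
except NameError as err:
    print(err)                           # function 'hello' called before it was defined
```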

Reading about scope, functions and variables
