Tuesday, June 14, 2011

Working towards compiler compiling

I missed some weekly posts due to 'okay, just one more feature and I'll make the post' syndrome and the overall exam-time mess. I have one more exam on the 16th, and after that nothing should keep me from hacking on pyjamas full-time.

So here's what I've been doing meanwhile:

Libs status:
I've implemented a simple pyjamas app that tracks the dependency trees of particular modules and reports on each dependency: http://pyjslibs.appspot.com/
The big goal is to get the 'translator' target green. Its direct prerequisite is getting the 'compiler' target green.
The report lists some unneeded modules as well, since at the moment the linker tries to compile all imports, even if they are inside try/except or if statements.
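
To illustrate the difference between imports the linker must follow and imports it could skip, here is a minimal sketch that separates unconditional imports from those guarded by try/except or if. It uses the modern CPython `ast` module rather than the pyjamas linker, and `classify_imports` is a hypothetical helper name, not something from the pyjamas code base.

```python
import ast


def classify_imports(source):
    """Return (unconditional, conditional) sets of imported module names.

    An import counts as conditional when it sits anywhere inside a
    try/except or if statement. Hypothetical helper, for illustration only.
    """
    unconditional, conditional = set(), set()

    def visit(node, guarded):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.Import):
                names = [alias.name for alias in child.names]
            elif isinstance(child, ast.ImportFrom):
                # Relative imports without a module name are skipped here.
                names = [child.module] if child.module else []
            else:
                names = []
            (conditional if guarded else unconditional).update(names)
            # Everything below a Try or If node is treated as guarded.
            visit(child, guarded or isinstance(child, (ast.Try, ast.If)))

    visit(ast.parse(source), False)
    return unconditional, conditional


sample = """
import os
try:
    import json
except ImportError:
    import simplejson as json
if os.name == 'nt':
    import ntpath
from sys import path
"""

unconditional, conditional = classify_imports(sample)
print(sorted(unconditional))
print(sorted(conditional))
```

A dependency report could then mark the conditional set as optional instead of treating every name as a hard requirement.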

  • pyv8run, my primary testing and development tool for pyjamas, is now fixed and significantly improved. I've merged it with my older pyv8shell tool, so it now supports REPL mode itself. REPL mode still has some importing issues, but importing itself is still a big mess in Pyjamas: bigger things like `sys.path`, import hooks, the imp module and the `__import__` builtin are yet to be implemented properly.
  • depstest - a tool I've created to generate data for the pyjslibs status app above.
  • generate_stdlib - a simple script that puts together a stdlib from pyjamas/lib, pypy/lib and the stock cpython lib. The idea here is to support as many stock (unchanged) cpython modules as possible, so we don't have to maintain them ourselves.
  • test.py - a script that runs all the different tests we have to run each time something is modified. I fixed some import issues with libtest and cpython along the way.
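
The merging idea behind generate_stdlib can be sketched roughly as follows. This is not the actual script: `overlay_stdlib` is a hypothetical name, and the rule that earlier source directories win over later ones is my assumption about how precedence between pyjamas/lib, pypy/lib and the cpython lib could work.

```python
import os
import shutil


def overlay_stdlib(sources, dest):
    """Copy .py files from each source directory into dest.

    When the same relative path exists in several sources, the first
    source in the list wins (assumed precedence, for illustration).
    Returns a dict mapping relative path -> chosen source file.
    """
    chosen = {}
    for src in sources:
        for root, _dirs, files in os.walk(src):
            for name in files:
                if not name.endswith('.py'):
                    continue
                full = os.path.join(root, name)
                rel = os.path.relpath(full, src)
                # setdefault keeps the file from the earliest source.
                chosen.setdefault(rel, full)
    for rel, full in chosen.items():
        target = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(full, target)
    return chosen
```

Calling it with the pyjamas directory first, for example `overlay_stdlib(['pyjamas/lib', 'pypy/lib', 'cpython/Lib'], 'stdlib')` (placeholder paths), would let a hand-patched module shadow the stock cpython one while everything else is taken unchanged.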
Translator, pyjslib, pyjs/lib

I've implemented some missing features and added some libraries, though this wasn't the primary goal yet:
  • pyjspath (os.path) module, which does path manipulation but obviously lacks filesystem functions
  • types module - pyjamas does not use separate types/classes for some things like tracebacks/frames/code, so the module had to be hacked a bit to work.
  • Added __doc__ attribute to modules
  • Implemented __builtins__ alias for builtins
  • Implemented __builtin__ importing
  • Partially implemented globals() 
  • Implemented `from module import *`
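
For reference, the CPython behaviour that `from module import *` has to reproduce can be sketched in plain Python: copy the names listed in `__all__` if the module defines it, otherwise every name that does not start with an underscore. `star_import` is a hypothetical helper for illustration, not the pyjamas implementation.

```python
import types


def star_import(module, namespace):
    """Emulate `from module import *` into the given namespace dict."""
    names = getattr(module, '__all__', None)
    if names is None:
        # No __all__: export every public (non-underscore) name.
        names = [n for n in vars(module) if not n.startswith('_')]
    for name in names:
        namespace[name] = getattr(module, name)
    return namespace


# A throwaway module to try it on.
demo = types.ModuleType('demo')
demo.public = 1
demo._private = 2

without_all = star_import(demo, {})

# With __all__ defined, only the listed names are exported,
# even if they start with an underscore.
demo.__all__ = ['_private']
with_all = star_import(demo, {})
```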
Internal compiler and parser modules

As I've explained on the mailing list, pgen/lib2to3 were too heavy to use directly, so I've spent this week chopping them into small, separate `compiler`, `parser`, `symbol` and `token` modules with the same interface as the corresponding cpython lib modules, plus a separate script to generate the grammar, symbol and token files.
This work is done, and the translator in my internalast branch now uses only the internal compiler and related modules. As Luke said, there is indeed some performance drop, but with caching it's not very significant, and there is still a lot of room for optimization.
Here are some time measurements for compiling LibTest and all its dependencies, first with the cpython compiler and then with the internal compiler:

real    0m27.505s
user    0m24.262s
sys     0m2.664s

real    0m38.957s
user    0m35.622s
sys     0m2.668s

Next week I will continue working towards getting this new compiler/parser to compile via pyjs and pass some basic python-parsing tests. The current status is listed in the pyjslibs status app, under the 'compiler' target.
I've solved all translator errors for the compiler.*/parser.* modules, but there are still more for their cpython dependencies:
 #> ./pyv8run.py --strict ../stdlib/test/test_compiler.py
Traceback (most recent call last):
__main__.TranslationError: _weakrefset line 59:
unsupported type (in _stmt)
With(CallFunc(Name('_IterationGuard'), [Name('self')], None, None), None, Stmt([For(AssName('itemref', 'OP_ASSIGN'), Getattr(Name('self'), 'data'), Stmt([Assign([AssName('item', 'OP_ASSIGN')], CallFunc(Name('itemref'), [], None, None)), If([(Compare(Name('item'), [('is not', Name('None'))]), Stmt([Discard(Yield(Name('item')))]))], None)]), None)]))

The `with` statement is the #1 offender now, but a typical pyjamas app, being executed by a browser, has no open() capability anyway, so it's not obvious whether it's worth investing time into implementing `with` now. I'll decide once the other `compiler` issues are solved; maybe it's easier to just pull the python2.5 modules instead.
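
For context on what supporting `with` would involve, here is a rough sketch of the standard expansion of `with manager as value: body` into `__enter__`, `__exit__` and exception handling, following the semantics described in PEP 343. `run_with` and `Guard` are hypothetical names for illustration. A translator would have to emit this expansion inline rather than call a helper, since the body in the traceback above contains a `yield`.

```python
import sys


def run_with(manager, body):
    """Run body(value) the way `with manager as value: ...` would."""
    # Special methods are looked up on the type, not the instance.
    exit_ = type(manager).__exit__
    value = type(manager).__enter__(manager)
    try:
        result = body(value)
    except BaseException:
        # __exit__ sees the exception; a true return value suppresses it.
        if not exit_(manager, *sys.exc_info()):
            raise
        return None
    else:
        # Normal completion: __exit__ is called with three Nones.
        exit_(manager, None, None, None)
        return result


class Guard:
    """Tiny context manager that records what happens to it."""

    def __init__(self):
        self.log = []

    def __enter__(self):
        self.log.append('enter')
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append('exit')
        return False  # do not suppress exceptions


guard = Guard()
run_with(guard, lambda g: g.log.append('body'))
```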

@jnowl was asking for a skulpt vs emscripten vs pyjamas comparison, and since he is not the first to ask about it, I'll post a more detailed answer as a separate post later.
Short answer:
1) Pyjamas is more mature and production-ready
2) Pyjamas overall is geared towards the pyjamas (gwt port) widget set, and is a solid solution for web application development. For any other usage, it can shoot you in the foot, and there are no docs.
3) Pyjamas translates python code into javascript code, which is executed by a javascript VM. Emscripten translates the python interpreter into javascript and runs python code through it, adding another layer to the stack. The latter is the more 'correct' approach, while the former is more practical. We are okay with lacking some cpython capabilities if it still gets the job done.