Tuesday, December 14, 2010

Notes from the Lorito Braindump - Contexts

Last Thursday, allison, chromatic, dukeleto and I met to discuss the direction that Lorito was taking and to try to get as much as we could out of chromatic's head and into the wider world.  As it turns out, we came up with some significant changes in the design of Lorito as an interpreter, but I think they'll end up being quite beneficial once they solidify a bit.  The following summary is a bit less warty and incomplete than the rough notes I nopasted to #parrot as soon as I'd typed them up after the meeting, but there are still a number of unanswered questions.  I'll recap these at the end.

Terminology

M0 - Lorito ops.  Think of the "M" as standing for magic: M0 has no magic, i.e. no complex behaviors or subtleties.  Higher levels are M1 (anything built from M0, e.g. PIR), M2 (nqp-rx and winxed) and M3 (Rakudo and Partcl).

Context is the new Interp

The biggest decision we made was that contexts would play most of the roles that the interpreter currently fills.  They will contain all the mutable state needed by a running program.  This includes the PC, registers, return PC, exception handler PC, exception payload and a pointer to the calling context.  Some things such as bytecode segments and iglobals will still belong to the interp, but it will be going on a pretty severe diet for Lorito.  The GC may or may not live in the interp.  We'll flesh this out as we go.
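
Here's a rough C sketch of what that mutable state could look like.  This is only my guess at a shape, not a settled design; all of the names and types are invented for illustration.

  /* Hypothetical sketch of a Lorito context; every field name and type here
   * is invented for illustration and is not a settled design. */
  typedef struct Lorito_Context {
      unsigned long          pc;            /* current program counter          */
      unsigned long          return_pc;     /* where to resume in the caller    */
      unsigned long          handler_pc;    /* current exception handler        */
      void                  *exc_payload;   /* payload of a thrown exception    */
      void                  *registers;     /* this context's register storage  */
      struct Lorito_Context *caller;        /* pointer to the calling context   */
  } Lorito_Context;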

Having an explicit PC also means that a dedicated goto op is no longer necessary in M0.  Jumping around within (or between) bytecode segments simply means that the PC is explicitly set to an address rather than automatically incremented.  We can also allow the PC to escape into the system stack for ffi, though this idea hasn't been sanity-checked yet and may in fact be insane.  This is all, of course, very low-level M0 stuff.  Higher-level languages will have all of the proper control flow constructs.
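
To make that concrete, here's a minimal, hand-wavy sketch (reusing the hypothetical Lorito_Context above): at the M0 level an unconditional "goto" is nothing more than a store into the context's PC field.

  /* Hypothetical: branching in M0 is just an assignment to the PC. */
  void m0_goto(Lorito_Context *ctx, unsigned long target)
  {
      ctx->pc = target;   /* the next fetch happens at 'target' instead of pc + 1 */
  }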

It's important to realize that M0 is designed to be as powerful as C, just easier to analyze.  If an attacker can get a context to execute arbitrary M0, that'll be sufficient to own a machine.  Security will be present, but it will live above M0, e.g. M0 bytecode verification or modification of the current context.

Each context will also have its own REPR and HOW according to jnthn's 6model work.  What this means is that we plan on using the MOP as the basis of our contexts.  A context will have control over how it implements cloning and subclassing.  This will give us numerous specialization possibilities.  We can make contexts that only allow a restricted subset of operations for something like PL/Perl6 or a more static-oriented context for low-power embedded or mobile platforms.  A context can decide that it will no longer allow itself to be subclassed or cloned, and there'll be no way to do so without circumventing the MOP.  All security concerns need a great deal of thought and scrutiny, but I believe that this will give us a solid foundation to build on.

We will also take advantage of representation polymorphism to allow for different types based on differing storage constraints, e.g. compactness, speed, or compatibility with calling conventions.

The current context will be the first argument to each M0 op.  We're now going with a fixed-length 4 argument op format.  The context may be implicit or explicit, depending on what we can figure out.  A fixed op width will go a very long way toward simplifying any code that needs to work with bytecode.  It will be a most welcome change to get away from pbc and its variable-length (and occasionally variadic) ops.  It'll be a joy to rip that code out.  We need to make sure that this doesn't cause enough pain in other places to cancel out the benefit.
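
As a hedged sketch of what fixed-width dispatch could look like (one plausible reading of the four-word format, with an opcode word followed by three argument words and the context passed along in C; again reusing the hypothetical Lorito_Context from above, and nothing here is settled):

  /* Hypothetical fixed-width encoding: every op record is exactly four words,
   * so op N always starts at bytecode[N * 4].  Whether the context slot is
   * encoded explicitly or passed implicitly is still an open question.      */
  typedef void (*m0_op_func)(Lorito_Context *ctx,
                             unsigned long arg1,
                             unsigned long arg2,
                             unsigned long arg3);

  void m0_step(Lorito_Context *ctx, const unsigned long *bytecode,
               m0_op_func *op_table)
  {
      const unsigned long *op = bytecode + ctx->pc * 4;
      ctx->pc++;                                   /* default: fall through       */
      op_table[op[0]](ctx, op[1], op[2], op[3]);   /* an op may overwrite ctx->pc */
  }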

During the discussion, chromatic wondered out loud if there were a way to make contexts immutable.  I'm not entirely sure what he meant, but I'm recording the question here to try to keep it from being forgotten.

With the context-based approach, on function invocation (or any CPS-based control flow changes), a clone of the context is created and given a pointer to its caller.  When this happens, data from the calling context will be COW'd to the called context to avoid excessive memory usage.

One of my burning questions was how CPS could work in a low-level assembly language where there weren't any continuations or closures.  The answer is that we'll fake it by using the context as a continuation.  We can get at a context's guts by a few simple loads and derefs.  I'm a little fuzzy on the details, but I can at least see how it's possible to do CPS in M0 with a bit of hand-waving.
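
Here's roughly how I picture it, with heavy hedging: the names are invented, clone_context() is a hypothetical helper, and the COW, MOP and GC interactions are all waved away.

  /* Hand-wavy sketch of CPS-ish invocation using contexts as continuations. */
  Lorito_Context *clone_context(const Lorito_Context *src);  /* hypothetical COW clone */

  Lorito_Context *m0_invoke(Lorito_Context *caller, unsigned long target_pc)
  {
      Lorito_Context *callee = clone_context(caller);
      callee->caller    = caller;
      callee->return_pc = caller->pc;   /* where the caller should resume    */
      callee->pc        = target_pc;
      return callee;                    /* execution continues in the callee */
  }

  /* "Returning" is just another jump: load the caller out of the current
   * context and restore its PC. */
  Lorito_Context *m0_return(Lorito_Context *callee)
  {
      Lorito_Context *caller = callee->caller;
      caller->pc = callee->return_pc;
      return caller;
  }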

I had originally intended to reformat all of my notes into a nice post, but it's already close to bedtime and I'm only through the first point.  The rest of my notes will have to wait for another day.  Until then, here are some of the remaining unanswered questions:
  1. What kind of data belongs in the interp, and what do we need in the context?  The answers are settling, but there's still some uncertainty.
  2. Where does the GC live?  Is it a separate context, part of the interp or something else?
  3. Is manipulating the PC a reasonable primitive to build an ffi on top of?
  4. What pain will be caused by fixed-argument ops?  Is it a worthwhile trade-off?
  5. How would an implicit context as the first argument to each op work?
  6. Is it possible to have immutable contexts and to do so more efficiently than straightforward COW'd contexts?

Monday, December 6, 2010

Roadmaps: Fact or Fiction?

Parrot's roadmaps haven't historically been a great source of encouragement or accurate information.  Our goals have often been overly optimistic, with the result that most of the time spent dealing with our roadmap has gone into pushing back uncompleted tasks.  The current system has been based on tickets attached to a specific version of Parrot, in the hope that they would be completed by the time that version rolled around.  Sometimes the tasks had champions, sometimes not.

Unfortunately these tickets are often placeholders for ideas that are fully-formed only in the mind of one person.  This prevents otherwise willing developers from jumping in and makes tasks hard to re-start after a break.  There are also tasks that have received a good deal of attention but that simply haven't been completed.  These tasks make the roadmap into a reminder of what we haven't accomplished rather than a list of our accomplishments and a source of encouragement.

Parrot's hackers have been hard at work making valuable contributions, but work has been largely independent of the current roadmap.  It's always a challenge to keep an accurate roadmap in a project based on volunteer tuits, but whiteknight and I are sure that we can do better.

He and I chatted briefly on #parrot earlier this evening about how we want to structure Parrot's roadmap in the future.  What we'd propose follows:

The roadmap will be based on major versions (essentially calendar years).  Each year at the post-x.0 Parrot Developer's Summit, we will finalize the roadmap for that year.  This roadmap will be wiki-based, since the wiki integrates nicely with Trac's ticket system but also allows a more flexible structuring of information.  We will have a solid plan for the next year centered around the supported (.0, .3, .6 and .9) releases.  The roadmap will list only major features which have a champion* and which we are confident we will be able to deliver.  If we aren't confident of being able to deliver a feature in time for a supported release, it's better to have a release with no planned roadmap items than to have a pleasant fiction.  We will also have a fuzzy plan for the following year, though it shouldn't be considered binding.  Anything beyond two years will be planned only in a very general sense.  We will maintain a wishlist for tasks which we want to undertake but don't have any dedicated volunteers, so that such features won't be lost or clog up the roadmap.

Parrot has an unfortunate history of over-promising and under-delivering.  This has not helped our reputation among other OSS hackers and I want us to correct the trend.  I want our new roadmaps to center around promising only what we're highly confident of being able to deliver.  Establishing a track record will take time and effort, but two or three years from now I want to be able to look back with pride and say that we proved we could deliver what we promised.


*In this case, a champion means that this person is dedicated to seeing a feature to completion.  "Owner" is another way of communicating the idea.

Tuesday, November 23, 2010

What happened in the dynop_mapping branch?

Several months ago, fperrad opened a ticket complaining about difficulties with dynops*.   (For those just joining us, dynops are libraries which can be loaded at compile-time to create user-defined PIR ops.)  Parrot worked reasonably well with one or fewer dynop libraries loaded, but the problem people were seeing occurred when multiple dynop libraries could be loaded in a different order.  At the time, Parrot's approach to storing ops in bytecode was simply to store the op's number directly in bytecode, followed by any arguments needed by the op.  When a dynop library was loaded, its ops would simply be appended to the interpreter's op tables and those offsets would be stored in bytecode.

You might already see where this is going.  When dynop libraries were loaded in a different order between compilation and loading, the dynop offsets within the interpreter's op tables would no longer be valid and hilarity would ensue.  Also, by hilarity I mean segfaults.  This could happen when a program which loads foo_ops followed by bar_ops was compiled to pbc.  If that pbc were then loaded by a separate program which loaded bar_ops first, boom.
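
To illustrate the old scheme with some invented numbers:

  /* Illustration only; the numbers are made up.
   *
   *   at compile time:  core ops occupy 0..N-1, foo_ops is appended starting
   *                     at N, and bar_ops is appended after foo_ops
   *   in the bytecode:  the program stores N+3, meaning "foo_ops' fourth op"
   *
   *   at load time:     bar_ops happens to be loaded first, so it now occupies
   *                     the slots starting at N and foo_ops lands later
   *   at dispatch:      op_func_table[N+3] now points at an unrelated bar_ops
   *                     op with different argument expectations; segfaults
   *                     (a.k.a. hilarity) follow
   */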

plobsing took it upon himself to fix Parrot so that dynop libraries could be loaded in any order without invalidating previously compiled bytecode.  His solution was to do away with the per-interpreter op tables and move the op tables down into the excessively-capitalized PackFile_ByteCode struct.  When a bytecode segment uses ops, they're given entries in its op mapping table.  The offset into that table is what's stored in bytecode.  The first op used will always get 0x00, the second will get 0x01, etc, no matter what the ops are.  If you've been looking at pbc_dump's disassembly output, this is why the op numbers don't correlate with the numbers in src/ops/core_ops.c after the dynop_mapping merge.  As part of the process of wrapping my head around plobsing's changes, I modified pbc_dump to output op mappings as well as the disassembled ops:

cotto:/usr/src/parrot $ ./pbc_dump -d hello.pbc
<snip>
BYTECODE_hello.pir => [ # 5 ops at offs 0x30
  map #0 => [
    oplib: "core_ops" version 2.9.1 (3 ops)
    00000000 => 00000164 (say_sc)
    00000001 => 00000022 (set_returns_pc)
    00000002 => 0000001d (returncc)
  ]
 0000:  00000000 00000000   say_sc
 0002:  00000001 00000001   set_returns_pc
 0004:  00000002            returncc
]
<snip>

Since I'm trying to reimplement the code in the PackFile PMCs, it was important to figure out how this code works at a low level so that non-imcc code can once again build a valid pbc file.  For this to work, the PackFile PMCs need to be updated to do the same thing that imcc's pbc code does now.  The first question, then, is what exactly the current code does.  This breaks down into three stages: loading a packfile from a stream (usually a file), executing loaded bytecode and serializing bytecode to a stream.

Execution is the simplest change.  In C, it means that code that deals with ops now needs to perform lookups on a packfile bytecode segment's op tables rather than on the interpreter's (now removed) global op tables.  There are two important tables: op_info_table, which contains information on ops such as their names, family, arguments, etc; and op_func_table, which contains a list of pointers to the op functions.  There's also save_func_table, which is used as temporary storage when something messes with op_func_table.  These three pointers now live in the PackFile_ByteCode struct, so most code that deals with ops only needs to be changed as follows:
-        op_info_t * const op_info  = interp->op_info_table[*base_pc];
+        op_info_t * const op_info  = interp->code->op_info_table[*base_pc];
The value of *base_pc will generally be lower than before, but that's an implementation detail.
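
For reference, here's a simplified sketch of how the inner dispatch loop reads with the per-segment tables.  This is condensed from the real slow runcore (event checks and error handling omitted), so treat it as an approximation rather than the actual code.

  #include "parrot/parrot.h"

  /* Simplified sketch: the op function table now hangs off the current
   * bytecode segment (interp->code) instead of the interpreter itself. */
  opcode_t *
  run_ops_sketch(PARROT_INTERP, opcode_t *pc)
  {
      while (pc)
          pc = (interp->code->op_func_table[*pc])(pc, interp);
      return pc;
  }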

For storing and loading, the PackFile_ByteCode_OpMappingEntry, PackFile_ByteCode_OpMapping and PackFile_ByteCode structs (see https://github.com/parrot/parrot/blob/master/include/parrot/packfile.h#L235 ) are used.  Because the bytecode segment (the PackFile_ByteCode struct) now contains op maps, the op maps need to be stored and loaded before the bytecode segment can be meaningfully used.  An op map (PackFile_ByteCode_OpMapping) consists of an array of entries, with each entry containing all the mappings which use the same library.  In the simple case where all ops are core, the op map will have only one entry, for the "core_ops" library.  "core_ops" is the name for the ops that are built as part of Parrot and are always available.  There will be another op mapping entry for each loaded dynop library such as "perl6_ops" or "math_ops".

The contents of an op mapping entry are minimal.  The PackFile_ByteCode_OpMappingEntry contains the name of the library (*lib), the number of ops (n_ops) and two arrays called lib_ops and table_ops.  table_ops holds an op's number according to the op mapping table and lib_ops holds its number within an op library.  When imcc needs to look up an op's number (using this function), it will ensure that the necessary library is loaded and perform a linear search through all mapped ops and all loaded op libraries, looking for an op with the correct function pointer.  When it finds a previously unmapped op, it will add it to the entry for the right library and return its index.
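
Here's a hedged sketch of that lookup.  It's not the actual imcc code: the op_mapping container fields (n_libs, libs) are my guesses at the struct layout, and op_func_for() and map_new_op() are invented helpers standing in for the real details.

  #include "parrot/packfile.h"

  /* Hypothetical helpers: resolve a mapped op to its function pointer, and
   * append a not-yet-mapped op to the right library's entry. */
  op_func_t op_func_for(const PackFile_ByteCode_OpMappingEntry *entry, opcode_t lib_op);
  opcode_t  map_new_op(PackFile_ByteCode *bc, op_func_t wanted);

  /* Sketch only: find an op's number in the segment's mapping table,
   * adding a new mapping if the op hasn't been used yet. */
  opcode_t
  find_or_map_op(PackFile_ByteCode *bc, op_func_t wanted)
  {
      for (opcode_t e = 0; e < bc->op_mapping.n_libs; e++) {       /* one entry per op library */
          PackFile_ByteCode_OpMappingEntry * const entry = &bc->op_mapping.libs[e];
          for (opcode_t i = 0; i < entry->n_ops; i++) {
              /* lib_ops[i] is the op's number within its own library;
               * table_ops[i] is its number in this segment's mapping table */
              if (op_func_for(entry, entry->lib_ops[i]) == wanted)
                  return entry->table_ops[i];
          }
      }
      return map_new_op(bc, wanted);   /* append to the right entry, return the new index */
  }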

This is a problem because a single packfile implementation isn't good enough.  We actually have five.  And by five, I mean two.  The first implementation is the one that works and is implemented as C structs and functions.  The second implementation is a PMC-based interface which is intended to allow PIR code to generate valid pbc.  (It also allows the generation of wildly invalid pbc with hilarious results, but that's an unintended benefit.)  The PMCs are what PIRATE uses to generate pbc that worked with Parrot before the dynop_mapping branch merged.  The packfile PMCs are largely untested apart from PIRATE, so the pbc format change didn't cause any new test failures for those PMCs and they were never updated.

The packfile PMCs are important because they're the future.  imcc, which is our current PIR compiler, is widely disliked and has been used by at least one developer to frighten his children.  imcc's code has several performance issues, poor maintainability and an undesirably low bus number.  It's also tied into Parrot's internals much too tightly for anyone's good.  Once PIRATE is ready, I want us to be able to rip out imcc and use the parrot executable as nothing more than a bytecode interpreter.  In addition to decoupling imcc from Parrot, this will let us use more self-hosted tools and will help us work out how to make pbc manipulation more accessible to Parrot's external users.

Making the Packfile PMCs opmap-aware is an important step because it will mean that PIR code will once again be able to produce valid pbc.  From there, world domination is a smop.


* As always, Parrot is better off because people mentioned problems they ran into.  There was some pain in the interim, but Parrot is more robust as a result of the reports we received.

Monday, October 25, 2010

Parrot's Teams: Five Scenarios

Parrot's concept of teams was rushed into service without being fully formed.  That doesn't make it an automatic disaster, but it does mean that we're figuring out inter-team and intra-team dynamics as we go.  To help this process along, here are some hypothetical (or not) events and my best guess as to how the different teams would interact in addressing them.  After each example, I've tried to list the major advantages and disadvantages that the team structure creates, but more are welcome.  Note that these cases are idealized somewhat and are still speculative.  Real life is always messier.

1: Research Paper

We've got a couple of developers who keep their eyes peeled for new research papers, and we're always glad to use relevant research to improve our code.  If someone presents us with some research that they think is relevant to Parrot, here's how I'd envision our process working:

  • Someone posts to parrot-dev or #parrot saying that they found a research paper we should consider.
  • The architecture team takes the lead and looks over it, explicitly soliciting feedback from the community and from other teams.
  • If the improvements look viable, the architecture team says so and writes up the algorithm as it's relevant to Parrot on the wiki, along with any relevant notes.
  • The architecture team puts out the call for someone to implement the code.
  • A Parrot hacker picks up the project.
  • Members of the architecture and product teams follow the progress of the branch and review commits.
  • As the branch stabilizes, the product team benchmarks it (or ensures that it's benchmarked) to demonstrate a meaningful improvement.
  • As the branch stabilizes, QA also makes sure that it has good test coverage and documentation.
  • As the branch gets ready for merging, the product team checks that external projects won't be disrupted by the change.
  • The code is merged, well-documented and tested and doesn't break anything for Parrot's users.

advantages: Teams will ensure that Parrot has a unified direction as new research comes to our attention.  They'll also give us a clear path from paper to mergable code and will help enforce a higher bus number for new code, in addition to ensuring that code is documented and tested before it gets merged.
disadvantages: There will be a higher barrier to entry and increased dependence on the architecture team.


2: Significant Design Change

Say that a Parrot developer proposes a significant design change to address a bug or misfeature.  An example of this is Peter Lobsinger's dynop_mapping merge, which made some small but significant changes to bytecode.  The branch did a good job of solving the problem at hand, but one important test and a significant external project (examples/pir/make_hello_world_pbc.pir and PIRATE, respectively, which will be the subject of a later post) broke because of it and have yet to be fixed.  Here's how the process might work with teams in full force:

  •  Someone files a ticket or posts to parrot-dev or #parrot about a design flaw in Parrot that requires some redesigning.
  •  A Parrot hacker steps forward to fix it.
  •  Said hacker figures out a fix and discusses it with the architecture team.
  •  The architecture team reviews it and either gives the ok or helps iterate the design.
  •  The hacker starts implementing his changes.
    • While hacking, he describes the API consequences to the product and QA teams, who update the relevant docs and/or add tests.
  • When the code is ready to merge (and ideally while the branch is being developed):
    • the architecture team reviews the code for bugs and to make sure design changes go as planned.
    • the product team reviews the code for user-facing changes.
    • QA makes sure that the changes are well-tested and documented.
  •  The code is merged, the relevant ticket is closed and everyone's happy.

advantages: Parrot maintains a unified direction across design decisions.  The team structure ensures that code is well-reviewed for different aspects while it's being worked on and that when coding is done, the branch will be (mostly) ready to merge.
disadvantages: This process will take more effort from the originator of the fix to explain his thinking and to answer questions during code review.  This will raise the bus number of the code, but will also raise the barrier to entry.


3: API Overhaul

Let's say that we decide that some part of our API needs a massive overhaul.  An example of this may be coming soon: Andrew Whitworth has expressed some distaste at the state of Parrot's embedding API and may soon take a much-needed jackhammer to it.  Here's how I envision the process working with teams:

  • The product team decides that an API needs massive refactoring in order to be useful to users, either through review or due to user feedback.
  • The product team figures out what the API should look like.
  • The product team hacks everything together in a branch.
  • QA looks at the branch to make sure that the new API functions are well-tested and that upcoming deprecations are documented.
  • The architecture team does a brief review for sanity.
  • After the proper time for deprecations has passed, the changes are merged into trunk, causing much user jubilation.

advantages: API changes will have more dedicated code review with a specific aim.  More people will be looking over code changes and will be familiar with what will be merged into trunk.
disadvantages: The refactor will be more sensitive to tuit shortages on the part of different teams.



4: Lorito

Lorito is an upcoming major reenvisioning of Parrot at a low level.  Currently most of Parrot is written in C and PIR, and the impedance mismatch between the two is a significant bottleneck.  Lorito will be a very low-level and minimalist set of ops which will provide sufficient power to reimplement most of the C components of Parrot, eliminating the impedance mismatch, among other benefits.  Here's one way Lorito could become a reality:

  • We decide that Lorito is a good idea.
  • The architecture team leads the effort to figure out a rough timeline and order of events.
  • The architecture team leads the design and documentation effort to work out what a Lorito VM will look like.  Everyone is actively encouraged to participate.
  • Volunteers are solicited to implement prototypes to find holes in the design.  These holes are filled in as they're discovered.
  • As the design stabilizes, the product team looks at Lorito from a product perspective, helping further refine the design.
  • Once the design is settled, hacking on the final implementation begins in earnest according to the timeline.
  • The architecture, product and QA teams review major branches for design, test coverage and documentation as they progress.
  • After much effort, we are able to use Lorito overlays* as a replacement for internal Parrot components currently implemented in C.

advantages:  There's a consistent force ensuring that progress is made and a well-defined timeline.  All relevant parties have opportunity to voice their concerns and influence the final product.
disadvantages:  The process depends on having input from different teams and will be sensitive to tuit shortages.

* By "Lorito overlay", I mean anything that compiles down to Lorito ops.


5: Major Security Vulnerability

Let's say that a major security vulnerability is discovered and made known to Parrot's developers.  For this example, say that the latest supported release was 3.9.0 and that the latest developer release was 3.11.0.  Here's how we'd deal with this to ensure a minimal turnaround time:

  • The issue is raised and both 3.9.0 and 3.11.0 are found to be vulnerable.  Consistent with our support policy, the supported 3.9.0 release needs to be fixed.
  • Someone writes a proposed fix, either as a patch or a branch, depending on the vulnerability.
  • Representatives from the QA, product and architecture teams briefly meet to make sure that the fix is sane (architecture), that the fix is valid, tested and documented as being fixed (QA), and that the fix doesn't negatively impact users (product).
  • The fix is committed to trunk, along with a backported version for 3.9.0.  QA makes sure that a new 3.9.1 release is produced and distributed with appropriate notification.

advantages:  We provide a known-good fix in a timely manner, along with a regression test to ensure that the bug doesn't resurface.
disadvantages:  The structure requires some synchronization of schedules.



I hope that this provides a good idea of what I think the teams will look like as they work together to improve Parrot.  Nothing's set in stone yet, but my hope here is to provide a starting point for further discussion.
The internal organization of the architecture team is a subject for another day.

Thursday, October 21, 2010

Parrot has a new architect. What now?

Close followers of Parrot have probably noticed that Allison Randal, our esteemed architect, hasn't been very active over the last few months.  After her recent announcement that she'd been hired as Technical Architect for an obscure Linux distribution called "Ubuntu", folks might be wondering what Parrot's future looks like.  This is doubly true because the architect position has had a bus number of one.  If Allison were hit by a bus or otherwise incapacitated, there was no structure in place to ensure that someone could step up and keep Parrot moving in a consistent direction.

Burnout has also been a problem for Parrot's past architects, partly because the architect ended up being responsible for managing most of Parrot.  We've done a great job of making the Release Manager role a straightforward process that can be performed by any Parrot developer with a commit bit.  The Release Manager position, however, has been the exception.  Most of the interesting roles, e.g. managing Parrot as a product or working with the wider OSS community, haven't been formalized and have fallen to the architect in the absence of someone willing to take the lead.  Allison is a capable leader and an A-list hacker, but Parrot has passed the point where it can be formally managed by a single volunteer, even one of her caliber.

It was in this environment that Jim Keenan put together a meeting of Parrot developers in Portland, Oregon.  Many topics were discussed, among which was a restructuring of Parrot to split responsibilities into separate roles.  Andrew Whitworth has already covered the idea in its current state, which will undoubtedly change as we progress.  The end result is that we'll be splitting responsibilities into 5 teams, only one of which will cover architecture.  We'll be solidifying the structure and formally voting on leads in the coming weeks, but interim leads have already volunteered for most available positions to get the process bootstrapped.  Andrew is provisionally in charge of Product Management and, in addition to posting some thoughts on the team structure, has already started fleshing out his vision for the Product Management team.

Then at last Tuesday's #parrotsketch meeting, Allison announced that she would be stepping down immediately, and that she had chosen me to succeed her as head of the architecture team.

What this means for Parrot's immediate future is that while I'll be the closest analog to Allison, Parrot won't rest primarily on my shoulders in the same way that it did on previous architects'.  It will be the architecture team's job to look to the future and determine where Parrot needs to go, but other jobs will be delegated to different teams, allowing all of us to specialize without letting anything important fall by the wayside.

Allison mentioned that after the meeting, she felt like a huge weight had been lifted from her shoulders.  She plans on staying with Parrot as a developer, but will be focusing most of her energy on Pynie.  For those of us wondering what Parrot's future looks like, we now have part of the answer and a reason for optimism.  It will take some time until we figure out just how the different teams will interact and what it means to be on a team, but the new team structure promises to help us become a more focused community and to produce a high-quality production-ready platform for interoperable dynamic language implementations.