Saturday, July 30, 2011

M0 Roadmap Goals for Q3 2011

M0 has been coming down the pipeline for several months. It's still pretty raw and has a number of known functionality holes, but it's getting better by the week. I'd like to make the next few stages of M0 part of our official roadmap, so this post spells out the overall plan and what I think we can accomplish in the next three months.

M0 currently exists as a fairly hacky Perl 5 prototype. This is of necessity because Perl isn't generally intended to operate at the level that M0 requires. Perl is still serviceable as a prototype implementation language, but the form that will be integrated into Parrot will be written in C. There will be many stages between now and when the M0 migration is complete, but the goal I'll focus on is noop integration. I'll explain what I mean by that below.

I see Parrot's migration to M0 falling into eight stages:

M0 Prototype

We're working out bugs in the Perl 5 M0 interpreter and making certain that M0 will be a sufficient foundation for Parrot. M0 may change significantly but we're making an effort to stabilize it.

C89 Implementation

We're happy with M0 and have a reasonably efficient compiler-agnostic implementation of M0, written in C89, which passes all tests. Separate compiler-specific implementations are fine, but not a priority.

Noop Integration

C/M0 is linked into libparrot and exposes an interface that C code can use to call into M0 code. At this point no subsystems have been reimplemented in M0.

Mole

We specify and implement Mole, which will be a C-family language that compiles directly to M0. Writing M0 is painful (this was an explicit design goal), so Mole is what a large chunk of the M0 that implements Parrot will be written in. M0 bytecode is what will be run from Parrot, so other code generation possibilities exist.

Early Integration

We've started moving subsystems over to M0. The order of which systems hasn't been determined yet, but producing a complete list and making sure we're aware of the dependencies will prove important.

C/6model in Core

Having a solid implementation of 6model in core will eventually be a blocker. Implementing our current object semantics in M0, only to switch to 6model later, isn't a wise use of our hackers' tuits.

Pervasive Integration

At this point, everyone can jump in. We have a couple major subsystems converted and have worked most of the kinks out of the process of translating C into M0. We'll be converting every subsystem that we can find to M0 and will have plenty of example code and documentation to lower the barrier to entry.

Complete Integration

Parrot has a fairly small core of C code consisting of little more than the M0 VM and the GC.


Committing to a timeline can be tricky. It's much more important to have an M0 that's thoroughly well thought-out than one that's usable by a certain date. That said, the M0 spec and prototype are coming along nicely. Completing the "Noop Integration" stage and possibly getting a solid Mole compiler by the 3.9 release are reasonable goals, depending on how many interested parties make themselves known. I'm happy to see that whiteknight has made C/6model one of his roadmap goals. C/6model in Core is largely orthogonal to M0 except that it needs to be integrated and solid before we start translating Parrot's object-related C code into Mole.

Wednesday, July 20, 2011

When Interpreters Collide

Note: this post is about implementing an M0 interpreter in Perl and is more a lightly edited braindump than a polished presentation of a concept.

Recently some test failures in M0's test suite revealed that the prototype Perl interpreter had been sneaking some of its perl-nature into the implementation.  The M0 assembler had been storing all values as strings and the interpreter had been secretly using its perlishness to convert the number-like values into ints at runtime.  This doesn't work well for an M0 implementation because M0 needs to be very specific about the low-level behavior of an implementation and the way it treats registers.

Perl is not C, and the basic problem I'm running into is that Perl is not designed to operate at the low level that M0 (as it currently stands) requires.  M0 is all about bytes and assigning meaning to the value in a register by using certain classes of ops on it.  Perl is much higher-level and doesn't even have a particularly strong distinction between strings and integer values.  If I want Perl to have string byte-oriented C-like semantics, it means that I'll be widely (ab)using the bytes pragma and pack/unpack.  This is doable, but it's also torturing Perl into implementing something even further from its intended use case than the current (and subtly-incorrect) M0 implementation already is.  sorear rightly freaked out when he looked at the M0 interp code, because it's doing something that Perl wasn't intended to do and something that Perl isn't particularly well-suited to.
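To make the "bytes plus ops" idea concrete, here is a minimal sketch in Python rather than Perl, using the struct module in the role that pack/unpack would play.  The 8-byte register width, the little-endian layout and the op names (set_int, set_num, add_int) are assumptions made up for illustration; none of them come from the M0 spec.

```python
import struct

class Register:
    """A register is just 8 raw bytes; it carries no type of its own."""
    def __init__(self):
        self.raw = bytes(8)

def set_int(reg, value):
    # Store a signed 64-bit integer as little-endian bytes.
    reg.raw = struct.pack("<q", value)

def set_num(reg, value):
    # Store an IEEE 754 double in the same 8 bytes.
    reg.raw = struct.pack("<d", value)

def get_int(reg):
    # Interpret whatever bytes are in the register as a signed integer.
    return struct.unpack("<q", reg.raw)[0]

def add_int(dst, a, b):
    # The op, not the register, decides that these bytes are integers.
    x = get_int(a)
    y = get_int(b)
    # Wrap to 64 bits so overflow behaves like fixed-width arithmetic.
    total = (x + y) & 0xFFFFFFFFFFFFFFFF
    dst.raw = struct.pack("<Q", total)
```

Nothing is ever stored as a string and silently converted: a value only becomes an integer or a float at the moment an op unpacks the bytes.  Reading a register written by set_num through get_int gives the bit pattern of the double, which is the kind of explicitness the prototype was missing.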

Still, JavaScript has been used to emulate at least x86, 6502, Z80 and 5A22 with surprisingly reasonable performance.  Arguably that's also pretty far from JavaScript's intended use case, and still it works.  This may just be an issue of finding the least hacky way to do something inherently very hacky.

The alternative is to specify M0 to have flexible underlying semantics, but I don't know that it'd be either practical or advisable to go too far down this road.  It's worth giving some thought to making the M0 spec be minimally unnatural to implement in a high-level language, but M0 is by its nature a low-level beast.  Implementations are bound to reflect that to some degree.

In the end, the best way forward will probably be to plow through the craziness of implementing a simplified CPU in Perl and look forward to building on chromatic's C implementation, where the intent of the implementation language is much closer to the aim of the project.

Sunday, July 3, 2011

Parrot Weekly News for July 3rd, 2011

Welcome to the first edition of PWN.  At YAPC::NA, long-time developer chromatic expressed frustration at the fact that Parrot as a community hasn't been effective in communicating the knowledge of its members.  IRC, while great for immediate communication, doesn't lend itself to transparency for those who don't have time to hang out on #parrot 24/7 or to follow our irc logs.  My hope for this newsletter is to make Parrot's development more transparent, even for those who only have an hour or two per week to keep up with Parrot.  I also hope that this will serve as a common channel of communication for all Parrot developers in order to provide a basic understanding of what's been happening in Parrot and what's needed.

YAPC::NA

The past week contained YAPC::NA, a grassroots Perl conference organized by the Perl community for the Perl community.  There were three Parrot-related talks given by kid51, dukeleto and me, and one Perl 6 talk given by colomon.  There was also a well-attended Parrot/Perl6 BoF session on Tuesday and a hackathon on Thursday.  The hackathon was largely focused on coding and didn't generate significant directed discussion.

kid51's 10 Questions

kid51 had a short talk in which he raised a number of important questions about OSS projects in general.  He then proceeded to apply those questions to Parrot, with less than stellar results.  He had some good points, particularly that Parrot needs to become production-ready before it can be considered a true success, that Parrot needs to have a better-defined purpose and focus, and that the project needs to "get to the point".  Asking tough questions isn't usually fun, but kid51 did Parrot a great service by honestly and directly pointing out some of the flaws of our community.  I hope his feedback will lead to positive changes in the way we look at ourselves and the products we're producing.

kid51's slides and a recording of his talk are here.

dukeleto's Visual Introduction to Parrot

dukeleto presented an introduction to the world of Parrot.  His intent was to give Parrot newbies a high-level overview of Parrot, its community and its ecosystem.  It was lighter in content due to being targeted toward less experienced audiences.  Nevertheless, it was an entertaining talk for people who already knew Parrot and provided a novel metaphor for understanding VTABLEs.  Once we're based on 6model, I look forward to seeing what kind of metaphor he comes up with.

dukeleto's slides are here.

cotto's State of Parrot

I presented a talk on the state of Parrot just after dukeleto's talk.  I covered developments in Parrot over the past year, some of the issues we need to deal with and what we expect the future to hold.  The short version is that there are a number of problems that are keeping Parrot from realizing its potential, but I think we have it within ourselves to overcome them and to produce an exciting production-ready virtual machine with some novel and useful properties.

My slides are here.

colomon's Numerics in Perl 6

colomon gave a worthwhile talk about performing numerical calculations in Perl6, both in Rakudo and Niecza (pronounced "niecha").  The talk was a good display of how people are using code that's built on top of Parrot and Rakudo.  As with all beta software, there were places where colomon ran into holes in the implementations of both Niecza and Rakudo, but the talk was hopeful and made me proud to be a Parrot hacker.

His slides are here.

Parrot/Perl6 BoF

The Perl6 and Parrot BoF session was considerably more organization-focused than most attendees were expecting.  Although the majority of attendees were from Parrot, Perl 6 (Larry Wall) and Rakudo (colomon) were also represented.  A primary point was that Parrot needs to get better at communicating communal knowledge among its members and users.

Someone also suggested an intriguing way of reframing participation in Parrot.  Many of us developers work to scratch our own itches, but the question "What would you be doing if the Parrot Foundation were paying you a salary?" provided a new way to look at how we manage Parrot and spawned a couple threads on parrot-dev.  For my part, this question provided the motivation for putting together this newsletter.  I hope it will also provide a motivation for all developers to take a more complete view of Parrot.

Room For Improvement

In this section of the newsletter, I will highlight areas of Parrot that are ripe for optimization.  Due to YAPC::NA this newsletter is already filling up quickly, so I'll highlight just one area.

config_lib.pir creates a hash that contains all data picked up by Configure.pl during configuration.  It has more than 250 entries, the majority of which don't provide any useful information.  Figuring out which entries in the hash are necessary and removing all the rest will help trim Parrot's startup time and make parrot_config a bit easier to sort through.  If you're interested in this, drop by #parrot or parrot-dev and chances are good that someone will be able to put you to work.

Other possible areas for optimization are listed on the following pages on our wiki:
http://trac.parrot.org/parrot/wiki/PerformanceImprovements
http://trac.parrot.org/parrot/wiki/chromaticTasks
http://trac.parrot.org/parrot/wiki/PCCPerformanceImprovements

Submitting

If you see an interesting conversation on either #parrot, parrot-dev or #perl6, please mark it by saying "PWN".  When preparing this newsletter, I'll search through irclog (moritz++) for any mentions of "PWN" and add a summary of the conversation to the next edition of PWN.

Sunday, May 15, 2011

Thoughts on the PDS

A number of useful conclusions and targets came from the Q2 2011 Parrot Developers Summit that happened yesterday.  This post will contain a summary of the event and my take on what we'll be doing as a result.  Props go out to kid51 for organizing an agenda for the meeting and keeping us more-or-less in line.  Strict organization isn't vital for an irc meeting, but he did a good job of making sure that our limited time was used effectively.

We started out reviewing the state of our previous roadmap goals.

The Deprecations-as-Data goal was substantially met.  I love this goal because it has potential to make life easier for our users (especially Rakudo) by expressly delineating what features are going to need upgrading.  A recent issue with nci and the 't' type demonstrates that we still have more room for improvement.  (pmichaud and whiteknight discussed a proposed solution after the meeting, but it needs a little experimentation first.)  My hope for data-based deprecations is that we end up with a better early warning system that alerts Parrot's users and gets discussions started before things break horribly.  pmichaud's concern was that the web tends toward passivity and that what's needed is active notification of pending and actual removals.  I think this will be a boon.

whiteknight's IMCC Isolation goal is making excellent progress.  pmichaud commented that it's had no negative impact on Rakudo's development, which is impressive given its scope and invasiveness.  IMCC isn't yet an optional component, but it's quite possible to run libparrot without initializing IMCC at all.   Excising it completely is quickly becoming a possibility.  whiteknight has been doing a bang-up job and isn't showing any signs of slowing down.

The third goal is one that dukeleto and I have been working on, of getting M0 prototyped.  dukeleto's working on the assembler and I've got the interpreter, both being written in Perl 5 with the binary M0 format (".m0b") being the only interaction between them.  The punchline is that the interpreter is fully-implemented with stubs for all ops and the assembler is a couple weeks from being usable, depending on duke's tuits.  On the one hand I'm a little disappointed that we don't have a fully usable prototype, but it is what it is.  Even once both prototypes are "complete", there are several questions we need to get together with allison and/or chromatic to answer.  Our M0 plan is to get the prototypes as complete as we know how and to have another meeting where we get all our questions answered, possibly even hacking the last few needed bits into the prototypes as we meet.

Once we moved away from the retrospective, pmichaud quickly asked what Parrot's plans were concerning Rakudo.  He specifically asked if Rakudo should consider itself officially blessed in developing against master rather than a release (we said "yes"), and if we planned to use Rakudo for regular benchmarking.  This second concern is especially important because Rakudo has seen some significant performance regressions in the last couple months, in spite of the introduction of the new generational mark & sweep GC.  The expectation is that regular performance testing would have brought this to light sooner and that once it's in place, we'll be more conscious of how our changes affect Rakudo's performance.  We've had a distinct lack of benchmarking in the last few months.  I hope this is the first of many attempts to revitalize our efforts to improve performance.

On the same note, Codespeed (which runs speed.pypy.org) was mentioned as a possibility.  I remember mentioning this in the past without effect, but hopefully the time was right at PDS.  We didn't formally ask for someone to investigate it though.  I hope it doesn't get dropped on the floor again.

The next PDS was scheduled for July 30th or 31st, which seems comfortably far away from any known conferences.  whiteknight volunteered to set up a Doodle, which is proving to be a very handy tool for scheduling these things.

The next topic to come up was profiling.  While working on Rakudo, pmichaud hacked out a very quick and dirty sub-level profiler that immediately pointed out an important hotspot.  This indicated to me that we need to up the game of the profiling tools that we provide as part of Parrot.  whiteknight and I were on the same page, so one of our new roadmap goals is to dig into the current profiling runcore, find out what's keeping it from being useful and fix it.  It currently depends on IMCC to get its information about the currently running code, so there's potential for much yak-shaving.  On paper the goal is only to investigate.  I hope we can get much more done.  I love providing useful tools to people, so I'm glad to have a chance to redeem the profiling runcore.  Unfortunately having whiteknight work on profiling will mean that he won't be spending as much time figuring out how to apply 6model to Parrot, but that's what it means to have priorities.
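For readers who haven't seen one, a quick and dirty sub-level profiler can be very small.  The sketch below is in Python and is only an illustration of the general idea (count calls and accumulate wall-clock time per sub); it is not pmichaud's profiler and has nothing to do with how Parrot's profiling runcore works.  The names profiled, calls, elapsed and report are all invented here.

```python
import time
from collections import defaultdict
from functools import wraps

calls = defaultdict(int)      # sub name -> number of calls
elapsed = defaultdict(float)  # sub name -> total inclusive seconds

def profiled(func):
    """Wrap a sub so that every call is counted and timed."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if the sub raised an exception.
            calls[func.__name__] += 1
            elapsed[func.__name__] += time.perf_counter() - start
    return wrapper

def report():
    """Return sub names ordered from most to least total time."""
    return sorted(elapsed, key=elapsed.get, reverse=True)
```

Decorating the subs of interest with @profiled and looking at the first entry of report() after a run is often enough to spot a hotspot.  The times are inclusive, so a sub is also charged for the time spent in the subs it calls.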

A third concern was raised by pmichaud, who said that it's difficult to gauge what Parrot's leadership thinks about certain issues.  One of the triggers in this case was my rather foolish removal of the initialization of Parrot's PRNG (pseudo-random number generator) using the system clock.  At the time Peter Lobsinger made the reasonable-sounding argument that there's no single way to correctly do PRNG that will satisfy the needs of every possible use case.  After too little thought, I decided to interpret that as meaning that it didn't matter that I'd changed Parrot's PRNG behavior because Rakudo should be doing what makes sense for them.  This ended up being a bad idea that caused some pain for Rakudo, and while I eventually reinstated PRNG initialization from the system clock and later from the system entropy pool, it showed the need for a better-delineated interface to gather opinions from Parrot's developers as a whole.  To this end, whiteknight and I will serve as a sort of ombudsmen for when technical decisions end up harming users and need to be appealed.  I don't think we'll need to put on our ombudsmen hats often, but we'll be glad to have them when we do.
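The difference between the seeding strategies is easy to show in a few lines.  This is a rough sketch in Python, with Python's random.Random standing in for a PRNG; it is not Parrot's code or API, and the function names are invented for the example.

```python
import os
import random
import time

def seed_from_clock():
    # Changes from run to run, but is easy to guess and repeats
    # if two processes start within the same second.
    return int(time.time())

def seed_from_entropy(nbytes=8):
    # Ask the operating system's entropy pool for random bytes.
    return int.from_bytes(os.urandom(nbytes), "little")

def make_prng(seed=None):
    """Return a PRNG; with no explicit seed, use the entropy pool."""
    if seed is None:
        seed = seed_from_entropy()
    return random.Random(seed)
```

A generator that always starts from the same fixed seed produces the same "random" sequence on every run, which is roughly what users ran into once the default initialization was gone.  Passing an explicit seed, as in make_prng(42), keeps the reproducible behavior available for anyone who wants it.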

Breaks in compatibility are inevitable, but what whiteknight and I hope to achieve as ombudsmen is to make sure that users have a respectful ear and will get fair consideration for their problems.  A disconnect between the needs of our users and our goals is very unhealthy and can only harm both parties.

Overall, it felt like a very productive and well-organized discussion.  pmichaud did a great job of representing Rakudo's concerns and I think that the coming months will see several improvements in Parrot's process and tools to make it a better platform for Rakudo to build on.

Sunday, May 1, 2011

M0ving Forward

dukeleto and I shared a hotel room at LinuxFestNorthwest and had a great opportunity to talk about M0 after our respective talks.  We went over the state of the spec and what the best way forward might be.  We also tried to look at what the future M0-based Parrot workflow will look like and how we can get there, though we got distracted before the crystal ball was delivered.

First, dukeleto mentioned that M0 is less discoverable than it needs to be, especially for a project that we expect to become Parrot's new foundation.  He suggested that we write a document that someone can read to get a clear 10,000 foot view of M0 and how its pieces fit together, a glossy brochure of sorts.  This could be either an introductory section in the M0 spec or a separate document.  The important thing is to have something we can point people at so that dukeleto and I aren't the only ones who can readily articulate what M0 is and where M0 is headed.

We also made some updates to the spec to make getting values from the variables table less confusing.  This is fairly minor in the scheme of things, but so is Perl's "say".

Last of all, we hammered out a plan for how to get a working M0 prototype assembler and interpreter.

atrodo has been very valuable in providing his prototype Lorito implementation, both in his documentation and in the way he's had to bring assumptions to the surface to get a runnable interpreter.  His implementation differs from the spec in a number of ways (many of which are because it predates the spec), but it's been helpful in those places because it shows us what we want by counterexample.  The next (brief) stage was a set of prototype PIR dynops of M0 I hacked together.  This was great to get some runnable code that was close to the spec, but it very quickly ran into the impedance mismatch between the high level of PIR and the low level of M0.  The effort on the m0 prototype dynops wasn't wasted, but they've reached the limit of their usefulness.

The next step we've decided to take is to implement a separate prototype M0 assembler and interpreter.  dukeleto will be working on the assembler and I'll do the interpreter, both based on the M0 spec in the m0-spec branch on GitHub.  The only interface between the two will be M0's binary representation, so we can easily change one without needing to modify the other.  We're trying to converge on the structure of both the interpreter and assembler, but we expect this to be the last prototype rather than a final implementation.   We'll also be writing tests against both the interpreter and assembler which we can later use against any future implementations.

dukeleto has started hacking in the m0-prototype branch in src/m0 and managed to get some very basic tests passing before he went to sleep.  We'll both be using Perl 5.10 as an expedient, since we don't expect these projects to serve as more than prototypes.  As a temporary measure one of us will need to hand-generate a couple simple bytecode files to verify that the assembler is working correctly.  These files will live in t/m0 in the branch.  The test code will be a minimal hello world program and a slightly more complex multi-chunk M0 program to help iron out inter-chunk interaction.  We haven't decided on what the complex example will be yet.  This is a part of the spec we'll need to work on as we come to understand what implementation makes the most sense.
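The shape of this plan (two independent tools that only share a binary format, plus hand-generated bytecode to check the assembler) can be sketched with a toy.  The example below is in Python and uses an instruction format invented purely for illustration: three made-up ops, each encoded as four bytes.  It is not the M0 instruction set or the .m0b file layout.

```python
import struct

# Invented opcodes for this toy only; these are NOT M0's ops.
OPCODES = {"set_imm": 0, "add": 1, "halt": 2}

def assemble(source):
    """Turn toy assembly text into bytes: 4 bytes per instruction."""
    out = bytearray()
    for line in source.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        name, *args = line.replace(",", " ").split()
        # Pad missing operands with zeros so every op has three.
        operands = [int(a) for a in args] + [0, 0, 0]
        out += struct.pack("4B", OPCODES[name], *operands[:3])
    return bytes(out)

def run(bytecode, nregs=8):
    """Interpret the bytes; knows nothing about the assembler."""
    regs = [0] * nregs
    for pc in range(0, len(bytecode), 4):
        op, a, b, c = struct.unpack_from("4B", bytecode, pc)
        if op == 0:      # set_imm: regs[a] = immediate b
            regs[a] = b
        elif op == 1:    # add: regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        elif op == 2:    # halt
            break
        else:
            raise ValueError("unknown opcode %d" % op)
    return regs
```

Because run() only ever sees bytes, a hand-written byte string can serve as the reference that the assembler's output is compared against, and either side can be rewritten without touching the other.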

Overall, rooming together at LinuxFestNorthwest has been very helpful in moving M0 forward.  Both of us have used the opportunity to bounce ideas off each other and to get the M0 train out of the station.  We're still a couple stages (and probably one more face-to-face meeting with allison and/or chromatic) away from a final implementation, but we can see the light at the end of the igloo, and it's looking pretty good.

There are a couple things that still need to get done.  In the interest of trying to keep them from getting dropped on the floor, they are:

  • Map out what a future M0 workflow will look like and what we need to do now to make it possible.
  • Make M0's roadmap and status more discoverable by making a glossy brochure that will communicate the idea effectively to someone who hasn't heard of M0 before.

Monday, January 10, 2011

Thanks, Code-In students!

This post is intended for students who've participated in this year's Google Code-In, specifically for those who worked on some of the tasks for Parrot.  If you worked on another project, this post is still for you.  Just ignore the Parroty bits.

When we Parrot developers first decided that Parrot would be participating in Google's new Code-In program (gci), I was quite skeptical.  Most of our initial tasks were for translations and many didn't seem to me like they'd help Parrot as a project, especially since Code-In was a new (and untested) initiative.  If you'd asked me what I thought before the start of gci, I'd have said that I had low expectations but would be glad if proven wrong.

I'm glad to say that the amount and quality of the contributions we've received from gci students has proven me very wrong.  We've had a few low-quality results, but the large majority have been of excellent quality.  Over the course of gci, we've added thousands of lines of tests and code, squashed lots of bugs and had several reported, and have increased our test coverage by about 3.5%, all of which represents a great deal of work for a large project like Parrot.  As gci progressed, we've even been able to bump up the difficulty of our "difficult"-rated tasks substantially to challenge our most ambitious students.  Parrot is much better off because of the efforts of all of you.

But gci isn't what this post is about.  Now that gci is over, you students will have the opportunity to continue hacking on OSS projects such as Parrot, but you won't be doing it for the artificial currency that Google has been kind enough to create.  If you continue, you'll be working for the same reasons any other developer hacks on an OSS project: for scratching an itch, for the excitement of having people use something you've helped build, and for the ability to contribute to something useful that's much bigger than any one person could create.

The Parrot project will welcome your contributions, as I'm sure any other gci projects will also do.  Google gave you a motivation to get over the initial hump of finding a project and figuring out a couple accessible things to contribute to, but now it'll be your job to keep going.  Much of OSS development happens because people are scratching their itches*.  I have mine**, and I hope some of you gci students will find your own itches to scratch too.  Along the way you'll run into all kinds of roadblocks, from broken libraries to half-assed implementations to outright lies in documentation, but those are just some of the hazards of building something new.  The best you can do is shave the requisite yaks*** so the road won't be as bumpy for the next hacker and get back on the track to making something awesome.

I hope to see all of you continue to make contributions to Parrot after the end of gci.  Your incentives will be different from now on, but they'll also become much more exciting.  If you're interested and don't know quite what you want to do, we'll always try to help you find something awesome to keep you busy.  Please stick around and keep on hacking!
Thanks,

Christoph Otto
Architect, Parrot VM



* This includes corporate-sponsored OSS development, where you get hired to scratch an itch.  gci and OSS experience looks good when companies search for these kinds of people.

** My personal itch is to make a PHP interpreter on Parrot that can interoperate with other Parrot-based languages.  Yeah, it's a big itch.

*** Yak shaving means solving problems to solve problems to solve problems, etc.  You may end up doing a lot of that in Parrot since it's more meta than many projects.