The Perl Foundation: Grant Proposal: Fixing Perl5 Core Bugs

David Mitchell has submitted a grant proposal, which if accepted would make use of a portion of the funding generously provided to TPF by Booking.com.

Before the Board votes on this proposal we would like to get feedback and endorsements from the Perl community. Please leave feedback in the comments or send email with your comments to karen at perlfoundation.org.

Grant Title: Fixing perl5 core bugs

Name: David Mitchell

Amount Requested: $25,000

Synopsis

Recently, booking.com donated $50K for the "further development and
maintenance of the Perl programming language". I would like part of that
money to be used to fund me for approximately six months to devote 50%
of my time to fixing "hard" core perl5 bugs.

Benefits to the Perl Community

There are currently approximately 1200 open and 300 new bug reports in the
perl5 bug queue. Although some of these are of the "5.003_08 does not
build on platform X" variety, many are current: for example, almost 500 of
them were created after the release of 5.10.0. As the perl core has become
more and more gnarly, and the pool of experienced but active core hackers
has declined, these bugs are just piling up and not getting fixed,
especially the hard ones. With this funding, I would be able to
devote serious time and effort to making a dent in this queue.

Note that unlike many large open source projects, perl has no paid
developers devoted to bug fixing.

Deliverables

Unusually for a TPF grant, there are not clear-cut deliverables for this
project. I intend to devote 500 hours of my time over the next six months
fixing perl core bugs. The net result will be a list of bug numbers that
have been diagnosed, and (hopefully) fixed. Because it's impossible to
predict in advance how difficult a bug is going to be to diagnose and fix
(or indeed whether it is even fixable), I can't commit in advance to a
fixed list of bugs that I will fix over the course of the grant. Nor is it
realistic to have a bounty per fixed bug; I would end up not getting
rewarded for time spent on difficult bugs, and conversely I would have a
strong incentive to cherry-pick easy bugs, defeating the purpose of the
grant.

Therefore, monitoring of my progress will become important (see below).

Project Details

I think this has been fully covered above.

Inch-stones

Note that due to the length and scale of this project, it is suggested
that there be two project managers, who can spread the monitoring load
between them as they see fit.

Since this project is heavily based on hours worked and the monitoring
thereof, I would post a weekly summary on the p5p mailing list which
details, for each bug worked on that week, how many hours were spent on
diagnosis and fixes, plus any bug status changes. This frequent feedback
would allow the grant managers and active core developers (who will be
aware of any recent commits and other activity of mine) to observe whether
my claimed hours bear any relation to actual activity and results, and
thus allow early flagging of any concerns.

Missing two weekly reports in a row without prior notice would be grounds
for terminating the project.

Once per calendar month I would claim an amount equal to $50 x hours worked.
I would issue a report similar to the weekly ones, but summarizing the
whole month. The report would need to be signed off by one of the
project managers before I get paid. Note that this means I am paid
entirely in arrears.

At the time of my final claim, I would also produce a report summarising
the activity across the whole project period.

Also (the "nuclear option"), I suggest that either of the project managers
be allowed, at any time, to inform the board that in their opinion the
project is failing badly, and that the TPF board may then, after allowing
me to present my side of things, vote on whether to terminate the project
at that point (i.e. to not pay me for any hours worked after I was first
informed that a manager had "raised the alarm").

To ensure that there are at least some visible results for the hours
spent, I would be required to have closed at least one bug per 20 hours
before being able to claim money for those hours. (I would hope to close
more bugs than that, but by setting a low baseline, I'm not tying my
hands, while still allowing TPF to have something visible for publicity
purposes during the interim.)
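
The payment rules above ($50 per hour, at least one closed bug per 20 hours claimed) can be sketched in Python. This is only an illustration under one reading of the rule: hours beyond 20 per closed bug are treated as not yet claimable. Whether closed bugs carry over between months is not stated in the proposal, and the function names are hypothetical.

```python
HOURLY_RATE = 50           # dollars per hour, as stated in the proposal
HOURS_PER_CLOSED_BUG = 20  # at least one closed bug per 20 hours claimed


def claimable_hours(hours_worked, bugs_closed):
    """Hours that may be claimed, capped at 20 hours per closed bug.

    Assumption: hours above the cap are simply not claimable yet.
    """
    return min(hours_worked, bugs_closed * HOURS_PER_CLOSED_BUG)


def monthly_claim(hours_worked, bugs_closed):
    """Dollar amount of a claim for the given hours and closed bugs."""
    return HOURLY_RATE * claimable_hours(hours_worked, bugs_closed)
```

For example, 80 hours with only 3 closed bugs would allow a claim for 60 hours under this reading, while the full 500 hours with 25 or more closed bugs corresponds to the requested $25,000.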

Project Schedule

I am available to start work on this project immediately.

The project is expected to take six months. I am self-employed, which
allows me a good deal of flexibility. By promising approximately 50% of my
time, this gives me the ability to continue with my existing commitments
to other clients, while deferring seeking new clients. As such, the weekly
hours I devote to perl are likely to be highly variable, but hopefully
averaging out to about 20 hours per week. If for some reason I find that I
have spent less than 500 hours at the end of the six months, then I will
continue the project until the 500 hours have been spent, with the
proviso that the TPF board are free to terminate the project at any
time after the six months. Conversely, if I manage to devote more than 20
hours per week, then my monthly payments will be accordingly larger, and
the project will terminate early (once the 500 hours are spent).

Note that it is currently my intention that after six months I will
apply for a further $25K extension, although there is no obligation for me
to do so, nor for TPF to approve it.

Bio

I'm a freelance UNIX sysadmin and programmer living in the UK. I have been
using perl since 1993, and have been fixing core perl 5 bugs since 2001.
I have had commit rights since 2003 and I was pumpking for the 5.10.1 perl
release.

In short, I am one of only a handful of active people who understand large
parts of the perl internals and who can thus fix "hard" bugs.

The Perl Foundation: 2010Q1 Grant Proposals

For this quarter TPF has three grant proposals that were not funded in the 2009Q3 round and that will be discussed and voted on again in this round, and four new grant proposals:

Please take some time to comment on these proposals. The TPF Grants Committee is very interested in community feedback on these projects' relevance. Please be polite.

The Perl Foundation: 2010 Grant Proposal: Enhancing Perl 6 Pattern Matching

Enhancing Perl 6 Pattern Matching with Ideas from Snobol4 and Other Sources

Name:

Morris M. Siegel, Ph.D.

Email:

[hidden email]

Amount Requested:

$3000 (negotiable)

The substance of my proposed alternative pattern-matching specification has already been essentially worked out as a self-funded research project, as it were. I felt this project was of such importance that it was worthwhile giving myself a sort of sabbatical to develop it. It has taken rather longer than I originally anticipated, and my personal funds have dropped to an uncomfortably low level. I can no longer afford to keep concentrating my efforts on this project on an unfunded basis.

My grant request is intended to retroactively fund some of my past development, and as well to enable me to continue focusing my efforts on the project. (The main task remaining is to write it up carefully and precisely, but there are some aspects that still need to be thought out.) Without a grant I would have to relegate the project to my limited spare time, and by the time it would be done it could well be too late for serious consideration.

I selected the figure of $3000 since I understand it is the upper range of typical Perl Foundation grants. I think the merits of the project, plus its time requirement (past and future), would justify a larger sum if the money is available. In addition, a larger sum would enable me to spend more time on fleshing out those ideas that still need to be thought out.

Synopsis

One of the chief reasons for Perl's popularity is its regex pattern-matching facility; no other part of Perl has been made into a stand-alone package (PCRE) or borrowed so extensively by other languages. The very name "Perl" alludes to the fundamental nature of pattern matching: the "E" of the acronym "PERL" stands for "extraction," which mostly means pattern matching. Perl 6 pattern matching is substantially more powerful than that of Perl 5.

Snobol4 is arguably the first widely-available language providing a pattern-matching facility, and despite its age, and despite all the new features of Perl 6, there are still some aspects in which Snobol4 pattern matching is more powerful than that of Perl 6.

Aside from the above, the current specification for Perl 6 pattern matching, Synopsis 05 (available as http://svn.pugscode.org/pugs/docs/Perl6/Spec/S05-regex.pod, http://perlcabal.org/syn/S05.html, or http://perl6.cz/wiki/Synopses/S05), is quite complicated to learn and remember, as is evident simply by reading through all of S05. Moreover, many of its multitudinous capture mechanisms lead to code which is brittle, hard to read and maintain, and often non-mnemonic. These complications and problems do not conform to the practical usability which is supposed to be a hallmark of Perl (the "P" of the acronym), and are not a necessary price that must unfortunately be paid for the power.

The purpose of this project is to formulate an alternative specification for Perl 6 pattern matching that is (1) enhanced by ideas inspired by Snobol4 and other sources but adapted to Perl's idiom, (2) simpler to learn and use, and leading to code which is easier to read and maintain, and (3) at least as powerful, and arguably more powerful, than the current specification.

Benefits to the Perl Community

If pattern matching is enhanced as indicated by the preceding paragraph, then potentially all Perl programmers needing to do non-trivial pattern matching will benefit.

In addition, the very acceptance of Perl 6 in the wider computing community could well be facilitated, since I think it quite probable that many would be put off by the complexity of the current pattern-matching specification. Along these lines, enhanced pattern matching would be more likely to inspire the adaptation of PCRE to Perl 6 and similar imitation by other languages, and thereby benefit not just the Perl community but the entire computing community.

I am well aware that much work has already been done in the implementation of the current specification, and that much development has been done based on the current specification, notably Larry Wall's STD grammar for Perl 6, and also the grammars for the various Parrot-based languages. As such, it is admittedly bold to suggest at this late stage that the specification be significantly revised. However, I believe the advantages afforded by the alternative specification warrant its serious consideration. (In a conversation I had once with Larry Wall, he stated that although he does not agree with all my ideas, he finds it worthwhile to listen to them.)

At YAPC|10 in Pittsburgh, on Jun 24, 2009, I gave a talk entitled "Enhancing Perl 6 Pattern-Matching with Ideas from Snobol4" (http://yapc10.org/yn2009/talk/1988), whereby I intended to present my ideas to the Perl community and get feedback. Unfortunately, I did not time my talk well, and by the time I finished presenting an overview of Snobol4 (to provide background for my ideas) and a sampling of problems with the current specification (to justify revising it), there was not enough time left to actually explain the alternative specification. Conversations with other YAPC participants did reveal interest in hearing my ideas. In particular, in discussions with Patrick Michaud, who is the chief if not sole implementer of the current specification, he (1) acknowledged that the core Perl 6 developers realize that S05 is hard to read (Larry Wall confirmed this), (2) complimented me on my examples illustrating the brittleness and other problems of the current capture mechanism, and (3) stated that if an alternative specification were even better than the current one, he would be happy to implement it.

Technically it would be possible for both the current and the alternative specifications to coexist in the same implementation, so on-going Perl 6 development efforts could proceed unimpeded while the alternative specification was being implemented and refined. If at some point it were decided to actually replace the current specification with the alternative (which is the ultimate intent), I believe the conversion of existing code should not be too laborious, so the initial release of Perl 6 would not be unduly delayed. There is some precedent for this, viz. the two different threading models of Perl 5.

Deliverables

This project should initially result in a document published on the Web presenting (1) an overview of the relevant parts of Snobol4, to help motivate the Snobol4-inspired features of the alternative specification, (2) a discussion of problems with the current specification, and (3) the alternative specification, in fairly complete detail.

After publication, a notice would be emailed to the appropriate mailing lists (perl6-language, yapc, snobol, perhaps others) informing subscribers of the existence of the document and inviting feedback. Based on feedback, the alternative specification might be revised. After a few iterations of this, assuming sufficient interest expressed by the Perl community, the alternative specification should be stable enough to proceed to implementation and further refinement as appropriate.

Project Details

The alternative specification document would assume the reader knows Perl 5 and has read S05 at least cursorily, and would rely on S05 to provide details on those features common to the alternative and current specifications. However, the alternative specification has a sufficiently different flavor from the current one that the document would have to present many ideas from scratch, so it should be reasonably accessible even to someone who understands pattern matching conceptually but is unfamiliar with S05.

It is difficult to go into more detail on the content of the alternative specification document without summarizing it, which I feel is beyond the scope of a grant proposal. The "inch-stones" listed below are far too terse to give the reader any notion of the content. However, to provide some sort of glimpse of the content, we list the following features of Snobol4 pattern matching that are absent from Perl 6 as it stands now but would be present in alternative pattern matching:

(A) Compile time vs. build time vs. match time

The pattern structures used in Snobol4 pattern matching are not built at compile time. Rather, at run time, a pattern structure is built as a result of evaluating a pattern-valued expression; once built, this structure can be used to do pattern matching, either immediately or later on.

As a result of having two distinct run-time operations, pattern building and pattern matching, the Snobol4 programmer has the ability (1) to choose during which operation to bind the value of pattern components (e.g. LEN(N) vs. LEN(*N)), and (2) to define new pattern-matching functions in a convenient high-level manner, without having to resort to writing macros or low-level code.

Understanding this three-way distinction among compile time, build time, and match time is crucial. On one hand, a careless or novice programmer who conflates compile time and build time can inadvertently write a program that inefficiently reconstructs the same pattern numerous times (although to mitigate this an optimizing compiler can precompute constant patterns or subpatterns). On the other hand, this distinction encourages a mind-set and facilitates a programming style in which the programmer writes pattern-valued functions to effectively extend the language of pattern-matching expressions, since the execution of these functions takes place during build time and does not cost anything at match time. Writing such pattern-valued functions seems at least conceptually easier than writing macros, and the ability to do so enhances the expressiveness of the programming language.

If the equivalent distinction existed in Perl 6, then not only would the expressiveness of pattern notation be increased, but also some of the complexity of the core pattern-matching specification could be offloaded to modules that define pattern-valued functions or methods.
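
The build-time versus match-time distinction can be illustrated with a minimal Python sketch. It is neither Snobol4 nor the proposed Perl 6 notation: patterns are modelled as ordinary functions, and the names LEN, lit, seq, repeat, anchored_match, and env are chosen here purely for illustration. Passing a plain integer to LEN binds it when the pattern is built (like LEN(N)); passing a zero-argument callable defers the lookup until matching (like LEN(*N)).

```python
# A pattern is a function (subject, cursor) -> iterator of new cursor positions.

def LEN(n):
    """Match exactly n characters.

    An int is fixed at build time; a zero-argument callable is
    evaluated each time the pattern is matched.
    """
    def match(subject, pos):
        length = n() if callable(n) else n
        if pos + length <= len(subject):
            yield pos + length
    return match


def lit(text):
    """Match the literal string text at the cursor."""
    def match(subject, pos):
        if subject.startswith(text, pos):
            yield pos + len(text)
    return match


def seq(*parts):
    """Match each part in turn, backtracking over earlier parts."""
    def match(subject, pos):
        if not parts:
            yield pos
            return
        rest = seq(*parts[1:])
        for mid in parts[0](subject, pos):
            yield from rest(subject, mid)
    return match


def repeat(pattern, count):
    """A pattern-valued function: runs at build time, costs nothing extra at match time."""
    return seq(*([pattern] * count))


def anchored_match(pattern, subject):
    """Return the matched prefix of subject, or None if there is no match."""
    for end in pattern(subject, 0):
        return subject[:end]
    return None


env = {"N": 2}
early = seq(LEN(env["N"]), lit("!"))         # N is bound now, at build time
late = seq(LEN(lambda: env["N"]), lit("!"))  # N is looked up at match time
env["N"] = 3                                 # only `late` sees this change
```

After the final assignment, `early` still expects two characters before the "!", while `late` expects three. The `repeat` helper shows how a pattern-valued function extends the pattern vocabulary without any macro machinery.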

(B) Conditional capture

In the Snobol4 operation of conditional value assignment (binary "."), assignment ("capture," in Perl terminology) takes place only if the value is captured from a subpattern that is part of an ultimately successful match. That is, in the semantics of Snobol4 pattern matching, there is a distinct "conditional" phase following the successful conclusion of a match and prior to the substitution phase, in which conditional value assignments (which may include arbitrary side effects) are carried out. This phase, which currently has no analogue in Perl whatsoever, enables the Snobol4 programmer to write patterns that backtrack without having to undo side effects performed by alternands that initially succeed but are later backtracked out of. Although unrestricted backtracking can result in unacceptably slow performance, limited backtracking can be quite efficient, and reworking a pattern to avoid backtracking entirely can be tedious and result in code that is less natural. If Perl 6 had a similar conditional phase, then the programmer would no longer have to rid his patterns of backtracking in order to avoid performing inappropriate side effects. This would clearly facilitate the task of formulating patterns, especially complex ones.
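
A small Python sketch may help make the conditional phase concrete. It is a toy backtracking matcher, not Snobol4 or Perl 6 semantics in full; all names (lit, seq, alt, capture_now, capture_if_success, run) are hypothetical. An immediate capture writes to a store as soon as its subpattern matches, so it leaves a stale value behind when that alternand is later backtracked out of. A conditional capture only records a pending assignment, which is carried out after the whole match succeeds.

```python
# A pattern is a function (subject, cursor, pending) -> iterator of
# (new cursor, pending assignments), where pending is a tuple of
# (name, value) pairs to apply only if the overall match succeeds.

def lit(text):
    def match(subject, pos, pending):
        if subject.startswith(text, pos):
            yield pos + len(text), pending
    return match


def seq(first, second):
    def match(subject, pos, pending):
        for mid, carried in first(subject, pos, pending):
            yield from second(subject, mid, carried)
    return match


def alt(left, right):
    def match(subject, pos, pending):
        yield from left(subject, pos, pending)
        yield from right(subject, pos, pending)
    return match


def capture_now(inner, name, store):
    """Immediate capture: a side effect that backtracking does not undo."""
    def match(subject, pos, pending):
        for end, carried in inner(subject, pos, pending):
            store[name] = subject[pos:end]
            yield end, carried
    return match


def capture_if_success(inner, name):
    """Conditional capture: only queue the assignment."""
    def match(subject, pos, pending):
        for end, carried in inner(subject, pos, pending):
            yield end, carried + ((name, subject[pos:end]),)
    return match


def run(pattern, subject, store):
    """Match at position 0; apply queued assignments only on success."""
    for _end, pending in pattern(subject, 0, ()):
        for name, value in pending:   # the "conditional phase"
            store[name] = value
        return True
    return False


eager = {}
eager_pattern = seq(
    alt(capture_now(lit("a"), "short", eager),
        capture_now(lit("ab"), "long", eager)),
    lit("c"))
eager_ok = run(eager_pattern, "abc", eager)

deferred = {}
deferred_pattern = seq(
    alt(capture_if_success(lit("a"), "short"),
        capture_if_success(lit("ab"), "long")),
    lit("c"))
deferred_ok = run(deferred_pattern, "abc", deferred)
```

Both matches succeed on "abc" via the second alternand, but `eager` ends up holding a leftover "short" entry from the abandoned first alternand, whereas `deferred` contains only the "long" capture.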

(C) Miscellaneous primitive patterns

Snobol4 has some useful primitive patterns which cannot easily be emulated in Perl 6: TAB and RTAB, which move the cursor to a given position from the beginning or the end of a string, and (in the Snobol4+ dialect) ATAB, ARTAB, and LEN, which can move the cursor to the left as part of normal pattern matching (not as look-behind). Unlike (A) and (B) above, these do not reflect a fundamental difference between the Snobol4 and Perl 6 pattern-matching models, and thus could be included (if desired) even into the current specification.
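
As a rough Python sketch of what TAB and RTAB do in standard Snobol4 (cursor movement to the right only; the leftward-moving Snobol4+ variants are not modelled), patterns are again written as functions from a subject and cursor to new cursor positions. The helper matched_text is a hypothetical convenience, not part of either language.

```python
def TAB(n):
    """Match up to absolute position n; fail if the cursor is already past n."""
    def match(subject, pos):
        if pos <= n <= len(subject):
            yield n
    return match


def RTAB(n):
    """Match up to the position n characters before the end of the subject."""
    def match(subject, pos):
        target = len(subject) - n
        if pos <= target:
            yield target
    return match


def matched_text(pattern, subject, pos=0):
    """Return (matched substring, new cursor), or None on failure."""
    for end in pattern(subject, pos):
        return subject[pos:end], end
    return None
```

With the subject "2010Q1-grants", TAB(4) from the start picks out "2010", and RTAB(7) from cursor 4 picks out "Q1"; TAB(2) from cursor 4 fails because it would require moving left.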

A significant part of the challenge of adapting these features for inclusion in Perl 6 lies not merely in altering the notation to conform to the style of Perl 6, but rather in appropriately generalizing the features themselves to harmonize with the rest of Perl 6 pattern matching, and in particular to accord with Perl features absent from Snobol4 such as lexical scope.

Inch-stones and Project Schedule

Experience with my dissertation and other long papers I have written has shown that writing up even already-worked-out ideas is more time-consuming than one anticipates, so I have tried to be conservative in the timing estimates for the inch-stones listed below. The estimates are in units of work days and appear in curly braces following each milestone. The sum total comes to 46 work days, or (allowing for slippage) about 10 work weeks. Taking into account some personal obligations during this period, I believe the essential deliverable -- the initial specification document -- could be completed in three elapsed months, which could begin at once. How much time would be needed after that for revision would obviously depend heavily on the promptness, quantity, and content of feedback.

I mentioned above that although the ideas are essentially worked out, there are still some aspects needing further reflection. They are: (a) verifying conformity with the other Synopses (which is non-trivial, given how voluminous and dense the Synopses are); (b) fleshing out details of a possible Pattern role; and (c) providing additional examples of patterns written according to the alternate specification. These are the issues that, given a larger grant, could get extra attention. Even if the project is expanded to include them, the initial document would not be delayed: I think it important that the Perl community be able to begin considering the alternative specification as soon as feasible, and any expansion of the project could be done thereafter while feedback would be (hopefully) received and considered.

The inch-stones of the specification document are:--

I. Summary of salient parts of Snobol4 {2}

II. Problems with the current specification of Perl 6 pattern matching (S05); justification for considering an alternate specification {2}

III. The alternative specification

0. influence of prior work; disclaimer: possible Perl5ish spirit; perhaps could be simplified {.4}

1. terminology: "pattern", "special form", "subpattern", "subrule", "P6c", "P6a" {.3}

2. overview of model: data structure, with arbitrary embedded values, that acts like code during pattern matching {1}

2.1. pattern code vs. normal code (PE vs. NE [PNE, listNE, numNE]) {.2}

2.2. p{PE}, p/PE/, /PE/, p(@args):attrs{PE}; perhaps pat{PE} or pattern{PE}; rule, token {1}

2.3. incorporation (similar to Lisp quasiquotation) {.3}

2.4. compile time vs. build time vs. match time {.2}

2.5. named patterns (declared rather than assigned); build time at UNITCHECK/INIT/BEGIN {.2}

2.6. substantiation, persubstantiation {.2}

3. matching a string, :i etc., :approx (cf. agrep, TRE) {.4}

4. matching a Boolean (True, <null>, False, <fail>), <do> {.4}

5. matching a number, :fuzzy {.3}

6. matching a CharSet (O-O character classes -- <[ ... ]>; cf. Icon charsets) {1.3}

7. matching a [sub]pattern (primitive or composite) {.5}

8. matching a closure {.3}

9. parametrized patterns; <bind> {.5}

10. quantification with separation {.2}

11. scoping [sub]patterns: [ ... ], <LABEL>:[ ... ] {.5}

11.1. unique properties: $/ (pattern-local, not normal-local), emission, conditional emission {.5}

11.2. possible "minor scope": e.g. (:i PE) vs. [:i PE] {.1}

11.3. <my>, <state>, NORMAL:: {.3}

11.4. <abort>, undef {.1}

11.5. <commit>, :: {.2}

11.6. <emit>, @() or @EMIT; uncaptured emissions {.5}

11.7. <yield> {.3}

11.8. @($/.quant) {.2}

11.9. identities and other examples {.5}

12. :P5 {.2}

13. capture (of emission of scoping subpattern) {.5}

13.1. overview: data flow model, "~>" (if not ">>"), target lists, repetition, using coroutine logic to capture to next target not yet processed, :take(n) {2}

13.2. passive targets: scalars, arrays, slices, * {.5}

13.3. active targets: functions, code references, (perhaps) $*TAKE {.8}

13.3. active targets: plain blocks, pointy blocks, <do> {.8}

13.4. active targets: p{PE}, [PE], <named_pattern> {.8}

13.5. secondary targets, splitting and joining of data streams {.5}

13.6. chaining of subpatterns {.5}

13.7. examples {1}

14. conditional phase {.5}

14.1. "~>?": conditional capture {.5}

14.2. <do?>: conditional side-effect {.5}

14.3. <confirm> {.5}

14.4. behavior w.r.t. backtracking {.5}

14.5. examples {1}

15. <tell>, <seek>, <at>, :forward, :bidi {.5}

16. <reverse> {.2}

17. :decl (or :par or :parallel), :proc (or :seq or :sequential), (:proc) to establish sequence point {.8}

18. :canon, :quick {.5}

19. generic meaning of <name> and of <op args> {.3}

20. <before PE>, <after PE>, <!before PE>, <!after PE> {.5}

21. <eval>, :memo {.3}

22. {any @arr}, {cat @arr}, {all @arr}, :eval, :lazyeval, :memo {.6}

23. <to>, <from>, <(PE)> {.2}

24. <cut>; <subst> {.7}

25. Rationalized m, M, s {.3}

25.1. m {.4}

25.2. M {.1}

25.3. s {.3}

25.4. possible generalization of m {.3}

25.5. possible generalization of s {.3}

25.6. relation to "~~" {.1}

25.7. dwimmy laxity in placement of attributes for m, s, and p {.3}

26. OO interface {.2}

26.1 m {.4}

26.2 s {.3}

26.3 resumed matching after <yield> (coroutine-style) {.4}

27. :keepall {1}

28. :g -- top-level result is list/array of Match objects {.3}

29. <try>, <catch> {.6}

30. perhaps: <lazy PE> -- like {{ p{PE} }}, but when p{PE} is first evaluated it replaces (memoizingly) the closure { p{PE} } in the pattern structure. {.3}

31. <literal>, :eval {.4}

32. matching a Range {.3}

33. matching an arbitrary object: Pattern role (patternization method) {1}

34. summary of members of $/ {.5}

35. other notational differences {.1}

35.1. :sigspace should retain the colon (m:s, s:s, p:s). (If not, at least let m:s abbreviate to ms, not mm.) {.1}

35.2. {overlay(p,q)} instead of [p & q] or [p && q] {.2}

35.3. {juxta(a,b,c)} instead of [a ~ c b] {.3}

36. :panic {.1}

37. comparison of P6a with P6c; features of P6c not directly present in P6a -- i.e. handled differently or (like ~~ and <prior>) subsumed by other features {2}

38. more examples {3}

39. co-existence with current specification {.8}

40. concluding remarks {1}

Bio

I have a Ph.D. in Computer Science from Cornell University; my dissertation is entitled Proving Properties of Snobol4 Patterns. I have long been interested in regular expressions, context-free and other formal languages, and pattern matching.

As mentioned above, the ideas constituting my proposal are basically already worked out, and there was interest expressed by some participants at YAPC|10 in seeing them. As far as I know, no one else has proposed or intends to propose an alternate pattern-matching specification for Perl, so it would follow that I am the best person to do this.

The Perl Foundation: 2010 Grant Proposal: Improve Dist::Zilla

Improve Dist::Zilla's Tests, Documentation, and Structure

Name:

Ricardo Signes

Email:

rjbs@cpan.org

Amount Requested:

$2000

Synopsis

Dist::Zilla is a tool that helps Perl programmers build distributions for the CPAN. It eliminates boilerplate, handles packaging, interfaces with changelogs and version control, improves prerequisite management, and generally makes it easier to be a CPAN author. This grant will fund work to make it easier for new users to adopt Dist::Zilla and for Dist::Zilla itself to be more easily extended, maintained, and understood.

Benefits to the Perl Community

Dist::Zilla makes the CPAN better. More code can be released because the work required to do so is greatly lessened. The code that is released can be of a higher quality because more time can be spent on the code rather than the packaging. It can also improve the lives of CPAN authors in general: if you don't want to spend the time that Dist::Zilla saves you on writing more code, you can spend it on anything else you like: skiing, sleeping, or eating ice cream.

Dist::Zilla has already been adopted by dozens of authors and used to release hundreds of distributions.

Deliverables

Each deliverable below is also an "inch-stone."

proper logging facility

Right now, Dist::Zilla logs with "print." It has always been meant to use Log::Dispatch (via Log::Dispatchouli) but these changes need to be made, presumably before testing begins, so that the testing system can incorporate logged data.

Estimated time: one half day

reusable testing tools

Dist::Zilla and most of its plugins (both core and otherwise) are not well tested, because testing them is tedious. This could be greatly improved by writing a few test classes or mock plugins.

Estimated time: two days

extensive testing of the core

The reusable test tools will be put to use (and thus proven useful) when tests are written for all the core functionality. These tests may not be exhaustive, but they will be extensive and will be written with the goal of making contributors feel that they can trust the test suite to catch most regressions.

Estimated time: four days

simplification of the command line tool's code

Right now, a number of hookable events are defined only in the code implementing the dzil command, which too tightly couples the main class behavior to the command line tool. As much as is possible, the App::Cmd-based code for dzil will be turned into a very thin wrapper around Dist::Zilla's methods.

Estimated time: one half day

event structure for distribution creation

In other words, plugins will be able to attach more behavior to distribution creation, to create new source code repositories, start files, and so on.

Estimated time: one half day

core set of well-known FileFinder plugins

The FileFinder plugin role allows other plugins to operate on dynamically located sets of files like "all Perl modules that will be installed" or "all files marked executable." At present, there are no predefined FileFinder plugins with Dist::Zilla. By providing a few core finders with well-known names, it is easier for new third-party plugins to behave more like core plugins.

This requires writing the finders, testing them, and updating existing plugins to use them. It also must be possible for a user to override the behavior at the well-defined name.

Estimated time: one day

improved prerequisite handling

This will include improved methods for specifying versions required by allowing shorthand identifiers for the latest version of a prerequisite, or the version with which the author has tested.

(If the META.json 2.0 specification is sufficiently finalized by the time this work is approved, the core Dist::Zilla prerequisite system will be improved to match it. I am familiar with the proposed changes to META and have a plan for how to support them.)

Estimated time: one day

improvements for authoring distributions containing XS

I do not write XS code or C, but a number of users of Dist::Zilla do and have asked whether I can improve Dist::Zilla's ability to accommodate them. Florian Ragwitz has given me some ideas on how to do this, and I would like to carry out his plan so that Dist::Zilla does not discriminate against XS authors.

Estimated time: one half day

documentation: improved new user's guide

This will extend and supplement the existing Dist::Zilla::Tutorial, starting from the position, "So you want to release code to the CPAN..." There will be a Pod version shipped with Dist::Zilla, but also an HTML document and slidecast or screencast to more clearly walk new users through the process.

Estimated time: four days

Project Schedule

I can begin work immediately upon receipt of the first-third payment. I predict about ten or twelve Saturdays of work. I believe that work can be completed this quarter.

Bio

I'm RJBS on the CPAN. I have released or adopted hundreds of modules, and Dist::Zilla is the result of my own desire for a tool to make maintenance of CPAN distributions simpler. My previous TPF grant-supported work on Pod-munging tools was also in furtherance of making it easier to maintain CPAN distributions. That work was completed without problems and the released code has been successfully adopted by a number of CPAN authors.

The Perl Foundation: 2010 Grant Proposal: CPAN Reviews

CPAN reviews

Name:

Alexandr Ciornii ('chorny' on IRC and PAUSE).

Email:

[hidden email] (backup) [hidden email]

Amount Requested:

$1200

Synopsis

Many CPAN modules have good documentation; many have bad documentation. But there is no such thing as enough documentation. There are many good reviews, examples, and descriptions outside CPAN. I propose to collect and catalogue them.

I want to make a site with links to reviews of CPAN modules. In general this site should be community-moderated and community-edited, and should allow users adding links to do minimal work first and enhance later, i.e. to use this site as a bookmarking service.

Benefits to the Perl Community

Simplify learning CPAN modules for novices and experienced users: there is no need to scan Google search results, and the opinions of others show whether a review is worth reading.

Ability to store a list of useful links and share it with others.

Possibility of integrating a list of links into an author's own page.

Additional benefits:

Ready-made site code to copy and use for similar purposes.

Support for OpenID/Bitcard in CGI::Application.

Deliverables

Code of the web app (under an open license)

Working site

CPAN module supporting OpenID/Bitcard in CGI::Application.

After release I will maintain and enhance the code and site further.

Project Details

I plan to develop it using CGI::Application. I will need to develop a CGI::Application::Plugin::Authentication plugin for OpenID/Bitcard.

Users will be able to vote a link up or down, report a spam or duplicate link, and comment. Every link will have a title and description (only one of them will be mandatory), language, date (original), tags, and a list of modules described in the review. After a link is added, some info will be fetched automatically, so the user will only need to edit it.

No user registration at all; OpenID/Bitcard only.

There would be ready JavaScript widgets for other sites:

  1. To display list of links for a module (sorted by popularity).
  2. To display number of links for a module.

They would be customizable by the language of the links, or the language list can be taken from HTTP headers. JSON output should also be available.

The site would be able to get a list of links from RSS feeds by tag (I propose "cpanreview", but this will be discussed with the Perl community). Tags like "cpanreview-Module::Name" or "cpanreview-Dist-Name" would also add an association with a module. Unassociated links would be displayed separately on a special page for anyone who would like to review some links.
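The tag convention above can be sketched as a small classifier. This is only an illustration in C, not the proposed implementation (which is planned in Perl): the function name classify_tag and the enum are hypothetical, and "cpanreview" is the tag name proposed above, still subject to discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result kinds for a feed tag. */
typedef enum { TAG_OTHER, TAG_REVIEW, TAG_REVIEW_OF } tag_kind;

/* Classify a tag. For "cpanreview-<name>", *target is set to point at
 * <name> inside the tag string; otherwise *target is set to NULL. */
static tag_kind classify_tag(const char *tag, const char **target)
{
    static const char base[] = "cpanreview";
    const size_t base_len = sizeof base - 1;

    assert(tag != NULL && target != NULL);
    *target = NULL;
    if (strncmp(tag, base, base_len) != 0)
        return TAG_OTHER;            /* unrelated tag */
    if (tag[base_len] == '\0')
        return TAG_REVIEW;           /* review link, not yet associated */
    if (tag[base_len] == '-' && tag[base_len + 1] != '\0') {
        *target = tag + base_len + 1;
        return TAG_REVIEW_OF;        /* review associated with a module or dist */
    }
    return TAG_OTHER;                /* e.g. "cpanreview-" or "cpanreviews" */
}
```

A link tagged only "cpanreview" would land on the unassociated page; one tagged "cpanreview-Module::Name" would be attached to that module.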

It would be possible for any user with a sufficient number of upvotes (for example, 2) to modify the title/description/module_list of a link. The number of votes required should be customizable for every operation.

Later I want to ask the owners of http://search.cpan.org and http://kobesearch.cpan.org/ to include links to the corresponding pages on module pages.

Github will be used for hosting code.

Inch-stones

  1. OpenID/Bitcard plugin
  2. Adding links
  3. Automatic fetching of data about a link added by a user (title, modules mentioned)
  4. Voting/spam/dupe
  5. Comment system
  6. Community editing
  7. JavaScript output, export to JSON
  8. RSS fetching
  9. Refactoring based on the Perl community's opinion of the real version.

Project Schedule

I will begin work immediately, with 10-15 hours a week. A first version with reduced capabilities should be available in 1.5 months, and the full version in 3 months.

Bio

I'm a Perl programmer from Moldova (Europe). I've been working in Perl since 2000, and joined the Strawberry Perl project in 2006. I'm an active member of the Perl community, maintain 18 modules on CPAN, and several more are planned for release next month. I have a large number of patches for Perl modules, including CGI::Application plugins, ExtUtils::MakeMaker, and Module::Install.

The Perl Foundation: 2010Q1 Grant Proposal: perl core memory improvements

perl core memory improvements

Name:

Jim Cromie

Email:

[hidden email]

Amount Requested:

$3000

Synopsis

Memory allocation enhancements in core (sv.c).

Perl's variable namespace model is very flexible; users can:

 - create vars, in any package, or in my scope, by naming them;
 - give them complex values: my $foo = [ 1, { a => 2}, 3 ];
 - share/assign/shallow-copy them: $main::bar = $foo;
 - crosslink or self ref them: $a[2] = [$a[2], $a[1]];
 - other hairy stuff

This user data is all built on-demand from an inventory of sv-parts which is kept on the interpreter's freelists (sv_root, PL_body_roots). These are refilled periodically by S_more_bodies, which gets-an-arena, slices it into sv-parts, and threads them onto the freelist.
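The refill step described above (get an arena, slice it into parts, thread them onto a freelist) can be sketched in C as follows. This is a simplified illustration, not the code in sv.c: free_body, thread_arena and take_body are hypothetical names, and the real S_more_bodies handles per-type body sizes and offsets that are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical free-body header: while a body is free, its first bytes
 * hold the link to the next free body. */
typedef struct free_body { struct free_body *next; } free_body;

/* Slice an arena of arena_size bytes into bodies of body_size bytes and
 * push each one onto the freelist at *root. Returns the number added. */
static size_t thread_arena(void *arena, size_t arena_size,
                           size_t body_size, free_body **root)
{
    assert(body_size >= sizeof(free_body));
    assert(body_size % sizeof(void *) == 0);   /* keep bodies aligned */

    size_t count = arena_size / body_size;
    char *base = (char *)arena;
    for (size_t i = 0; i < count; i++) {
        free_body *b = (free_body *)(base + i * body_size);
        b->next = *root;
        *root = b;
    }
    return count;
}

/* Pop one body off the freelist, or return NULL when it is empty
 * (the point at which a real allocator would refill from a new arena). */
static void *take_body(free_body **root)
{
    free_body *b = *root;
    if (b)
        *root = b->next;
    return b;
}
```

Once bodies from many arenas are interleaved on one shared freelist like this, no single arena can be returned to the system without first proving that none of its bodies are still in use.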

This can result in user data spread across memory like a spiderweb in a corner; it's hard to clean the corner without destroying the web. IOW, it makes memory reclaim "hard", and probably ineffective. As a result, I think, perl core has never really seen the need/benefit to bother reclaiming arenas.

One important workload, however, could benefit: Storable::freeze() uses a ptr-table to track SVs that it has %seen, but its PTEs hang off the interpreter until process termination. For a long-running process, this is clearly suboptimal.

Benefits to the Perl Community

1st, there's this in perltodo:

  use less 'memory'
       Investigate trade offs to switch out perl's choices on memory
       usage.  Particularly perl should be able to give memory back.
       This task is incremental - even a little bit of work on it will help.

This is deep core work; benefits accrue to users of 5.14, which is the eventual target. Since the interfaces changed are internal, it may be possible to get it into 5.12.x.

Currently, Storable::freeze() uses ptr-tables to track seen SVs as it freezes them, so that it honors shared linkages. Doing this on large datasets will allocate a huge ptr-table, which when freed, releases all those PTEs back to the interpreter-global freelist, where they hang uselessly until process death (or interpreter shutdown).

The work proposed below appears to provide a workable mechanism to implement the private-arenas that Tim Bunce expressed want/need for, with Nicholas Clark's comments, here:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2009-12/msg00821.html

By my 1st reads, Tim wants to coax a set of SV allocations to be taken out of separate arenas, to protect them from others. Nick outlined a solution that largely fits with my earlier revision of this grant proposal (Aug, 09), but added a discussion of savestacks, and implied (to me, at any rate) a need for a robust underlying mechanism, prompting this revision of the proposal.

General benefits will likely flow from finding out what nytprof needs, and figuring out how to provide it :-D

Deliverables, Project Details

Here are the major elements:

Private Arenas

  The short version:
  - adapt get-arenas(sig): sv_type arg2 -> (void*) reqid
    and track allocs by the reqid
  - propagate that to S_more_bodies and its macro wrappers
  - add release-arenas() stub 1st

With get-arenas(reqid), we can track arenas by their users; with S_more_bodies we can extend that tracking to the interpreter's svtype consumers individually. With unique tracking of arena users, we can offer release-arenas(reqid) and, since this is an internal sub-system interface, expect them to use it properly.

Design Benefits

S_more_bodies() outer-users (disregarding the macro-wrapper) keep their current interface; the arenas provisioned by it for each sv_type are transparently tracked, and can soon be reclaimed.

get-arena/release-arena give a balanced api for clients to manage slabs of memory themselves. The api is minimal, allowing and requiring simply that callers of get-arena(reqid) do:

 - call release-arenas(reqid) when done with mem.
 - know they're not sharing parts of the arenas when done.
 - don't abuse the reqids of others, ref your own object.
 - users can create and abandon arenas (be careful!)

With this, users hacking in core can allocate many slabs, of various sizes, using just one reqid, assemble them with pointers into arbitrary structures, and when done, know that they're all cleaned up together. Users may also use multiple reqids to simplify their memory reclaim operations.
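A minimal sketch of the balanced get-arena/release-arenas pairing, assuming a simple linked list of bookkeeping records keyed by reqid. The signatures, the arena_rec struct and the global arena_list are illustrative assumptions; they are not the existing core interface, and a real version would live inside the interpreter structure rather than in a file-scope variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record: one per arena handed out. */
typedef struct arena_rec {
    struct arena_rec *next;
    const void *reqid;      /* opaque tag identifying the requester */
    void *mem;
    size_t size;
} arena_rec;

static arena_rec *arena_list = NULL;

/* Allocate an arena of size bytes and record it under reqid. */
static void *get_arena(size_t size, const void *reqid)
{
    assert(reqid != NULL);
    arena_rec *rec = malloc(sizeof *rec);
    if (!rec)
        return NULL;
    rec->mem = malloc(size);
    if (!rec->mem) {
        free(rec);
        return NULL;
    }
    rec->reqid = reqid;
    rec->size = size;
    rec->next = arena_list;
    arena_list = rec;
    return rec->mem;
}

/* Free every arena recorded under reqid. Returns the bytes released. */
static size_t release_arenas(const void *reqid)
{
    size_t released = 0;
    arena_rec **link = &arena_list;
    while (*link) {
        arena_rec *rec = *link;
        if (rec->reqid == reqid) {
            *link = rec->next;       /* unlink, then free arena and record */
            released += rec->size;
            free(rec->mem);
            free(rec);
        } else {
            link = &rec->next;
        }
    }
    return released;
}
```

A caller would typically pass the address of its own object as the reqid, request as many slabs as it needs, and make a single release_arenas() call when finished.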

It should also be flexible and efficient enough for use by XS libraries, given their tolerance for newness.

Private-Arenas 1st user: ptr-tables

Given the Storable use case, this has potential merit; being parsimonious with PTE mem by default will work for some users.

But for less specific cases, the global PTE freelist probably wins a performance contest: the malloc demand is intrinsically lower when PTEs are reused, freeze() is not the only user of ptr-tables, and it's only pathological cases that would even cause notice.

Nonetheless, it provides a test-case for 1st use of the new interface, and an alternate ptr-table implementation, possibly providing support for 'use less memory'.

Note that with the stubbed release_arenas, we only pretend to free the private allocations; this may cause problems in make test, but the overall demand for ptr-tables is quite limited (iirc the big user, t/re/regexp_qr_embed_thr.t creates ~2000 ptr-tables), and on 1GB machines, we may not run out of memory.

This has some probative value for OOM handling also, especially in a setrlimit()d sandbox.

Design Benefit

Private arenas in ptr-tables provide a concrete basis to consider other resource reclaim strategies, narrowly 1st, but perhaps also broadly for other potential users.

When ptr_table_free is called, we know that:

  - we start with an empty, private PTE freelist, fill it as needed
  - pt-store consumes PTEs from private PTE freelist
  - all PTEs in the table came from our arenas
  - all PTEs cleared back to it are from our arenas
  - no other users of those PTEs exist
  - all our arenas have our reqid

With this, we should be able to just whack the whole table (by finding and freeing the arenas with the reqid), skipping all the rethreading to the global freelist, and immediately releasing the memory back to the system. This sounds possibly useful later.
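The "whack the whole table" idea can be sketched with a toy pointer table whose entries all come from arenas it owns. This is an assumption-laden illustration, not the core ptr_table code: the pt_* names, the fixed bucket count and the 64-entry arena size are all invented here, and pt_store does not check for an existing key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PTES_PER_ARENA 64    /* assumed arena granularity */
#define NBUCKETS 16          /* fixed bucket count, for brevity */

typedef struct pte {
    struct pte *next;
    const void *oldval;
    void *newval;
} pte;

/* A private arena: a slab of PTEs owned by exactly one table. */
typedef struct pte_arena {
    struct pte_arena *next;
    size_t used;
    pte slots[PTES_PER_ARENA];
} pte_arena;

typedef struct {
    pte *buckets[NBUCKETS];
    pte_arena *arenas;       /* every PTE in the table lives in one of these */
} ptr_table;

static ptr_table *pt_new(void) { return calloc(1, sizeof(ptr_table)); }

static size_t pt_hash(const void *p) { return ((uintptr_t)p >> 3) % NBUCKETS; }

/* Store a mapping, taking the PTE from the table's own newest arena. */
static int pt_store(ptr_table *t, const void *oldval, void *newval)
{
    assert(t != NULL);
    pte_arena *a = t->arenas;
    if (!a || a->used == PTES_PER_ARENA) {
        a = malloc(sizeof *a);
        if (!a)
            return -1;
        a->used = 0;
        a->next = t->arenas;
        t->arenas = a;
    }
    pte *e = &a->slots[a->used++];
    size_t h = pt_hash(oldval);
    e->oldval = oldval;
    e->newval = newval;
    e->next = t->buckets[h];
    t->buckets[h] = e;
    return 0;
}

static void *pt_fetch(const ptr_table *t, const void *oldval)
{
    for (pte *e = t->buckets[pt_hash(oldval)]; e; e = e->next)
        if (e->oldval == oldval)
            return e->newval;
    return NULL;
}

/* Free the table by freeing its arenas; no per-entry walk and no
 * rethreading onto a shared freelist. Returns the arenas released. */
static size_t pt_free(ptr_table *t)
{
    size_t freed = 0;
    pte_arena *a = t->arenas;
    while (a) {
        pte_arena *next = a->next;
        free(a);
        a = next;
        freed++;
    }
    free(t);
    return freed;
}
```

Because no other user can hold a PTE from these arenas, the cost of freeing is proportional to the number of arenas rather than the number of entries.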

release_arenas(reqid)

I've separated this deliverable because private-arenas in ptr-tables can be mostly validated without it (using the stub) and because in some respects it's our 1st new feature, where the previous focus was on refactoring the existing code to accommodate the feature.

The 1st test of this code will be in perl_destruct:

  release_arenas(&PL_body_roots[$_]) foreach @sv_types;
  release_arenas(&PL_sv_root);

Then we call it from ptr_table_clear.

support for NYTProf

Given the recent p5p traffic 12/20 (link above), I think this path to private arenas helps; it adds support needed beneath the fancy freelist pushing-and-popping briefly described there. What nytprof needs will take further study.

register_arena_consumer

The design thus far does nothing to protect against (or even advise of) reqid trampling between 2 users: get_arena() implicitly allows callers to start new reservations with the given ID, which allows sharing amongst knowing users. Formal registration will provide at least advisory protection. This could be done with a flag too.
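One possible shape for advisory registration, sketched in C under the assumption of a small fixed-size registry. The return-code convention, the unregister_arena_consumer counterpart and the MAX_CONSUMERS limit are hypothetical; the proposal leaves these details open.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONSUMERS 32     /* assumed registry capacity */

static const void *consumers[MAX_CONSUMERS];
static size_t consumer_count = 0;

/* Register reqid as in use. Returns 1 on success, 0 if the reqid is
 * already registered (an advisory conflict), -1 if the registry is full. */
static int register_arena_consumer(const void *reqid)
{
    assert(reqid != NULL);
    for (size_t i = 0; i < consumer_count; i++)
        if (consumers[i] == reqid)
            return 0;
    if (consumer_count == MAX_CONSUMERS)
        return -1;
    consumers[consumer_count++] = reqid;
    return 1;
}

/* Remove reqid from the registry. Returns 1 if it was registered. */
static int unregister_arena_consumer(const void *reqid)
{
    for (size_t i = 0; i < consumer_count; i++) {
        if (consumers[i] == reqid) {
            consumers[i] = consumers[--consumer_count];  /* fill the gap */
            return 1;
        }
    }
    return 0;
}
```

The check is advisory only: nothing stops a caller from skipping registration, but two cooperating users would learn of a clash before sharing a reqid by accident.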

Semi-Deliverables

These have real merit in my estimation, but are rather speculative, and I'm reluctant to call them committable deliverables. I think they help illustrate the potential of the above work.

use less memory, pte

One way to nudge this rock forward is to plug in a 2nd (private) ptr-table-* function set, addressing the Storable::freeze use case.

I suspect, however, that freelist pushing and popping, along with get_arenas() and release_arenas(), will ultimately be a better tool than this specialized fix for PTEs, but it serves as a point of discussion (strawman); we don't even have decent terminology yet, let alone a few paths forward.

use my_arenas

Storable::thaw() might want to put the perl-data it vivifies into a constrained region of memory, as this may improve processor cache performance, especially with their modern prefetch systems. So would perl routines, such as parsers, data generators, etc.

  # Doing it lexically would be nice;
  # (my_arenas is the speculative pragma discussed here; it does not exist)
  sub get_tight_hash {
    my $var;
    use my_arenas depth => 1, 'xs';
    return { Storable::thaw($packet) };
  }

Here, my_arenas seeks to capture only SVs in the contained xs scope (the thaw), and those in {} composition. depth => 1 sounds safest wrt the spiderweb problem, N might be nice if it makes sense (depth=>0 makes me nervous). I also suppose that xs might somehow be different than just depth => 1.

This doesn't attempt to migrate perl data into a container; that would be tantamount to lifting the spiderweb without damaging it, and is out of scope here. But this may shed some light in a dimly lit corner.

Inch-stones

The deliverables above are largely self-explanatory, but will also include responding to and resolving issues; they're then largely defined by porters and particularly pumpkings.

Tim Bunce, given his interest for nytprof, will hopefully offer guidance as to what he needs; I'd treat those as immediate goals.

There are no doubt numerous knock-on effects to the rest of core, some of these will be in-scope, though I hope not all.

setrlimit()d sandbox, OOM tests. Work this into fresh_perl, maybe wrap this as sandboxed_perl().

p5p discussion, review, responses, revisions, variations, etc.

Project Schedule

1-2 months

Bio

I've been hacking in perl for a while:

  [jimc@groucho perl-git]$ git log blead | grep Cromie | wc -l
     102

I've also hacked in pertinent parts of core and ext/ code:
  - added arena-sets into the arena allocator
  - reworked the body-allocator around S_more_bodies
  - helped refactor sv_upgrade (Nick did the heavy lifting)
  - added struct body_details (says the blamelog)
  - extended B::Concise feature set
  - implemented OptreeCheck and tests using it

The Perl Foundation: 2010 Grant Proposal: Perl Compiler

Perl Compiler

Name: Reini Urban
Email: [hidden email]
Duration: Until March 2011
Amount Requested: € $1000 (just for motivation)

Synopsis

Fix most of the remaining perl compiler, i.e. B::C, B::CC, B::Bytecode bugs.

Improve documentation a bit.

Maintain the planned compiler.perl.org site.

Benefits to the Perl Community

A working compiler.

Faster startup time.

Optionally faster run-time if the B::CC optimizations work out as expected.

parrot? I've worked with them. I gave up. Better a half-ass perl5 compiler now, than the ongoing ... with parrot/perl6.

Deliverables

Extend the testsuite reasonably - but less is more. The full author tests for all tested perls on all platforms need 2 days.

Fix the existing SKIPs and TODOs

Testsuite passing on my main platforms cygwin, MSWin32, debian5, centos5, freebsd7, solaris10.

compiler.perl.org

More fun, less headaches

Inch-stones

I don't think I need this.

See below at Project Schedule. From now until end of March 2011, the next surf-season.

Project Details

I successfully ported the abandoned compiler to 5.10 and blead and fixed most of the old bugs, so that the tests pass now on most platforms.

But there's more to do. The bugs still to be found cannot be detailed here. Some are in the core suite, some are in the top100 modules, and the community will come up with more. Some are known, some not yet. So far all found bugs could be fixed within 1-2 days; sometimes they are just hard to catch.

  1. Adjust the perl core suite and find limitations (runperl issues) vs bugs
  2. Check modules
  3. Check user reports
  4. Check weird platforms, compilers, programming tricks.

CC bugs

Well, some bugs are run-time limitations which will require run-time solutions. The sortcv bug [CPAN #53536] is easily understood but hard to fix. It will need at least 2 days of concentration.

Planned CC Optimizations

Static initialization of readonly data: SVs, AVs, HVs.

-fcog for strings (copy on grow by using a custom destructor)

Fill in missing Opcodes flags for most optimisable ops. Maybe even automatically.

Check possible type declarations with Devel::TypeCheck, MooseX::Types, attributes and such.

I've finished 50% of Malcolm's Todo during the winter surf-holidays, and fixed 90% of Malcolm's bugs in the last year, so I'm confident.

I've already got a sponsor for my conference travel expenses. A tip: They could be persuaded to sponsor this grant also :) (cPanel)

Project Schedule

During summer-time I prefer surfing over coding to keep emotional stability in the corporate environment. Winter 2009-2010 was very productive, because I got a kick from cPanel, who needed it.

For the next coding season I might need further kicks, a mini-grant like this might be enough.

2010: Find and fix all remaining bugs. I suspect there are still 5-6 major ones.

2010: Faster testsuite. Now: 8 min user - 40min author - 2 days all perls + plats.

Until March 2011: CC type and sub optimisations

Later (not part of this proposal)

Until 2012: CC unrolling => jit within perl (perl -j)

Bio

Reini Urban, living in Graz, Austria. Born 1963, pretty old, yes.

Born Lisper, but I've been writing perl programs since 1992 and released my first module to CPAN in 1995, the perl5.hlp file for Windows, created by some pod2rtf.pl. cygwin maintainer (perl, parrot, postgresql, clisp, ...) for a couple of years, and several B::* modules.

I work for a large HW+SW company (>2000 developers), 8-16 o'clock.

Since nobody is able to help me with the compiler it looks like I'm alone. Hopefully this will change! I even had to write my own Debugger. Yes, I'm aware of trucks. Surfing is not risky at all. Bicycling is more dangerous.

Icerocket blog search: yapc: A Pic of Perl’s Benevolent Dictator For Life. A photo from YAPC::2009 ...

Larry Wall was such a nice and humble person. It was great to meet the creator of Perl and its BDFL.

Icerocket blog search: yapc: Ligations failed

yapC-gfp, rodz-gfp did not yield PCR products of correct size. Possible next steps 1) Clean-up of digestion instead of heat treating. 2) Check ligation on gel after heat treating, apparently the ligase is very sticky

Icerocket blog search: yapc: YAPC Europe Foundation financial reports

YAPC Europe Foundation is a nonprofit organisation that helps with the organisation of YAPCs and Perl Workshops in Europe. They help by giving kickstart bonuses for conferences, workshops, and hackathons, and by providing payment processing. Recently they published financial reports http://www.yapceurope.org ...

Bloglines Search: "yapc": YAPC::EU

Event Description: http://conferences.yapceurope.org/ye201...