Thursday, October 16, 2008

Are you listening?

Did you know that the Eclipse SDK alone has over 250 different listener interfaces? In the following screenshot, look at the size of the scrollbar slider!



This is only the tip of the iceberg - corresponding to all these listener interfaces, there are about 250 different event classes. Passing a dedicated event object to the listener means that additional event data can be added later on without breaking binary API compatibility. (Remember that where we forgot to use event objects, we ended up with warts like IPerspectiveListener4.)
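
To make the compatibility argument concrete, here is a minimal sketch of the event-object pattern. All names in it (TitleChangeEvent, TitleChangeListener, Part) are hypothetical and are not Eclipse API. The listener method takes a single event object, so a later release could add more getters to the event class without touching the listener interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of this is actual Eclipse API.

/** Event object: getters can be added later without changing the listener signature. */
final class TitleChangeEvent {
    private final Object source;
    private final String oldTitle;
    private final String newTitle;

    TitleChangeEvent(Object source, String oldTitle, String newTitle) {
        this.source = source;
        this.oldTitle = oldTitle;
        this.newTitle = newTitle;
    }

    Object getSource() { return source; }
    String getOldTitle() { return oldTitle; }
    String getNewTitle() { return newTitle; }
    // A later release could add further getters here. Existing listeners are
    // unaffected because titleChanged(TitleChangeEvent) stays the same.
}

interface TitleChangeListener {
    void titleChanged(TitleChangeEvent event);
}

final class Part {
    private final List<TitleChangeListener> listeners = new ArrayList<>();
    private String title = "";

    void addTitleChangeListener(TitleChangeListener listener) { listeners.add(listener); }
    void removeTitleChangeListener(TitleChangeListener listener) { listeners.remove(listener); }

    void setTitle(String newTitle) {
        String oldTitle = title;
        title = newTitle;
        TitleChangeEvent event = new TitleChangeEvent(this, oldTitle, newTitle);
        // Iterate over a copy so that listeners may remove themselves while being notified.
        for (TitleChangeListener listener : new ArrayList<>(listeners)) {
            listener.titleChanged(event);
        }
    }
}

class EventObjectDemo {
    public static void main(String[] args) {
        Part part = new Part();
        part.addTitleChangeListener(
                e -> System.out.println("'" + e.getOldTitle() + "' -> '" + e.getNewTitle() + "'"));
        part.setTitle("Navigator");
    }
}
```

Had the listener method taken the old and new title as separate parameters, passing an extra piece of data later would have required a new method or a new numbered interface.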

To top it off, there are probably about 500 methods for adding and removing all these kinds of listeners. addLaunchLabelChangedListener/removeLaunchLabelChangedListener, addLocalSiteChangedListener/removeLocalSiteChangedListener, addLogEntryListener/removeLogEntryListener and so on.

In the end, when you think about it at a more abstract level, there are only a couple of possible changes that can happen in a program. A value can change, from an old value to a new value. A set can change, by adding or removing elements. A list can change, by adding elements at a certain index, removing elements at an index, or moving or replacing elements.

How about we use only three listener interfaces, IValueChangeListener, ISetChangeListener, and IListChangeListener? Or we settle on something that looks like EMF's notification API. We might need pre- and post-notifications, and maybe also changes to maps, so the resulting number of listener interfaces might be closer to ten. Definitely not 500 - and I only looked at the base Eclipse SDK...
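
As a rough illustration of how small such a generic API could be, here is a sketch of the value and set cases. The interface names come from the paragraph above, but their method signatures, the event classes, and the ObservableValue and ObservableSet holders are assumptions made for this example only. The list case is left out for brevity and would follow the same shape with index information in its event.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Signatures and helper classes are assumptions for illustration only.

final class ValueChangeEvent<T> {
    final T oldValue;
    final T newValue;
    ValueChangeEvent(T oldValue, T newValue) { this.oldValue = oldValue; this.newValue = newValue; }
}

interface IValueChangeListener<T> {
    void handleValueChange(ValueChangeEvent<T> event);
}

final class SetChangeEvent<E> {
    final Set<E> additions;
    final Set<E> removals;
    SetChangeEvent(Set<E> additions, Set<E> removals) { this.additions = additions; this.removals = removals; }
}

interface ISetChangeListener<E> {
    void handleSetChange(SetChangeEvent<E> event);
}

/** Holds a single value and reports every replacement of it. */
final class ObservableValue<T> {
    private final List<IValueChangeListener<T>> listeners = new ArrayList<>();
    private T value;

    ObservableValue(T initial) { value = initial; }
    T getValue() { return value; }
    void addValueChangeListener(IValueChangeListener<T> l) { listeners.add(l); }
    void removeValueChangeListener(IValueChangeListener<T> l) { listeners.remove(l); }

    void setValue(T newValue) {
        T oldValue = value;
        value = newValue;
        ValueChangeEvent<T> event = new ValueChangeEvent<>(oldValue, newValue);
        for (IValueChangeListener<T> l : new ArrayList<>(listeners)) l.handleValueChange(event);
    }
}

/** Holds a set and reports only effective additions and removals. */
final class ObservableSet<E> {
    private final List<ISetChangeListener<E>> listeners = new ArrayList<>();
    private final Set<E> elements = new LinkedHashSet<>();

    void addSetChangeListener(ISetChangeListener<E> l) { listeners.add(l); }
    void removeSetChangeListener(ISetChangeListener<E> l) { listeners.remove(l); }

    void add(E element) {
        if (elements.add(element)) fire(Collections.singleton(element), Collections.<E>emptySet());
    }

    void remove(E element) {
        if (elements.remove(element)) fire(Collections.<E>emptySet(), Collections.singleton(element));
    }

    private void fire(Set<E> additions, Set<E> removals) {
        SetChangeEvent<E> event = new SetChangeEvent<>(additions, removals);
        for (ISetChangeListener<E> l : new ArrayList<>(listeners)) l.handleSetChange(event);
    }
}

class GenericListenerDemo {
    public static void main(String[] args) {
        // A "launch label changed" listener becomes a value listener on a String.
        ObservableValue<String> launchLabel = new ObservableValue<>("Run");
        launchLabel.addValueChangeListener(e -> System.out.println(e.oldValue + " -> " + e.newValue));
        launchLabel.setValue("Debug");

        // A "log entry" listener becomes a set listener on the set of entries.
        ObservableSet<String> logEntries = new ObservableSet<>();
        logEntries.addSetChangeListener(e -> System.out.println("added " + e.additions));
        logEntries.add("first entry");
    }
}
```

With this shape, a domain-specific add/remove method pair turns into an accessor that returns an observable, and the listener interfaces stay the same across domains.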

What do you think? Can we make the Eclipse API more consistent and uniform in this way? Do you agree that all these domain-specific listener interfaces, each with a corresponding event class, and add/remove methods, are a source of accidental complexity, and settling on a small number of generic listener APIs can reduce code bloat?

Tuesday, October 07, 2008

Avoiding Bloat

Martin Oberhuber asked on the e4 mailing list what had happened to the pervasive architectural themes that were identified at the summit, such as reducing bloat, too many listeners, and becoming more asynchronous. I started writing a response, focusing on one of the topics, bloat, and it quickly became more than just an email response so I am posting it here.

Before we go into the details, let me state the obvious: It is pretty much guaranteed that we will cause more bloat, overall, for the case of the Eclipse SDK based on the new e4 platform, as long as that SDK still contains 3.x plug-ins that require compatibility layers. This is because all the old (bloated?) functionality and the new (lean?) functionality will be there at the same time.



It seems the best we can do is to avoid bloating the new platform itself, when it is used without any compatibility layers. Unfortunately, we have all these cool new technologies that we would like to use - EMF, CSS, declarative UIs, data binding, cross-compiling of Java to ActionScript, being able to use multiple languages, client-server split, etc. Put them together and the likely result is bloat. Or is there a way to avoid bloat and use cool new technology at the same time?

So what is bloat? Let's look at Wikipedia's definition of software bloat (thanks John Arthorne for pointing me to it):

Software bloat, also known as bloatware or elephantware, is a term used in both a neutral and disparaging sense, to describe the tendency of newer computer programs to be larger, or to use larger amounts of system resources (mass storage space, processing power or memory) than necessary for the same or similar benefits from older versions to its users.

Let me dive into one concrete example, to show why this is a hard problem:

Code bloat through redundancy, caused by low-level API, occurs when clients of that API have to write the same boilerplate code over and over again. Think of all the code we have to write for SWT layouts, for example:
Composite contents = new Composite(parentComposite, SWT.NONE);
contents.setLayoutData(new GridData(GridData.FILL_BOTH));
GridLayout layout = new GridLayout();
layout.marginHeight = convertVerticalDLUsToPixels(IDialogConstants.VERTICAL_MARGIN);
layout.marginWidth = convertHorizontalDLUsToPixels(IDialogConstants.HORIZONTAL_MARGIN);
layout.verticalSpacing = convertVerticalDLUsToPixels(IDialogConstants.VERTICAL_SPACING);
layout.horizontalSpacing = convertHorizontalDLUsToPixels(IDialogConstants.HORIZONTAL_SPACING);
layout.numColumns = 2;
contents.setLayout(layout);

Label label = new Label(contents, SWT.LEFT);
label.setText(WorkbenchMessages.FileExtension_fileTypeLabel);
GridData data = new GridData();
data.horizontalAlignment = GridData.FILL;
label.setLayoutData(data);

filenameField = new Text(contents, SWT.SINGLE | SWT.BORDER);
data = new GridData();
data.horizontalAlignment = GridData.FILL;
data.grabExcessHorizontalSpace = true;
filenameField.setLayoutData(data);
Whenever there is a low-level way of doing things, you can come up with a higher-level way and reduce the code size. Of course, you are only reducing the overall code size when the higher-level abstraction is used widely enough to amortize the cost of its implementation. In our SWT layout example, you could write instead:
Composite contents = new Composite(parentComposite, SWT.NONE);
contents.setLayoutData(new GridData(GridData.FILL_BOTH));

new Label(contents, SWT.LEFT).setText(WorkbenchMessages.FileExtension_fileTypeLabel);

filenameField = new Text(contents, SWT.SINGLE | SWT.BORDER);

Point defaultMargins = LayoutConstants.getMargins();
GridLayoutFactory.fillDefaults().numColumns(2).margins(
defaultMargins.x, defaultMargins.y).generateLayout(contents);
Now this is looking a lot shorter, and maybe even more elegant. However, even if GridLayoutFactory is used widely enough to amortize the additional footprint caused by its implementation, there are still two problems: first, the original code ran faster, and second, you now have to learn two APIs - the higher-level one, and the lower-level one when the abstraction gets in your way.

You can see where I am going - there is no clear-cut solution to this. It is really a hard problem, and in many cases, we will have to trade off one of the factors (disk size, memory size, CPU consumption) against the others.



Taking it just a little further, here is another idea, taken from the Wikipedia article on code bloat:

The difference in code density between various languages is so great that often less memory is needed to hold both a program written in a "compact" language (such as a domain-specific programming language, Microsoft P-Code, or threaded code), plus an interpreter for that compact language (written in native code), than to hold that program written directly in native code.

So if we had a domain-specific language for creating SWT widgets and specifying their layout, we could get away with no Java code at all! I don't know if the .class file is a space efficient encoding for SWT widget hierarchies and layouts, but even if it is, consider this: The byte code for creating the widgets will stay in memory for as long as its class is referenced. Chances are that this will be a very long time; at least for the time that particular part of the UI is materialized somewhere. By comparison, if we had a domain-specific language, it would have to be read once to create the widgets and layout, after which the memory could be freed.
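
A minimal sketch of the idea follows. The description format (one widget per line, nested by two-space indentation) is invented for this example, and WidgetNode is a stand-in for real SWT widgets so that the sketch runs without SWT on the classpath. The point is only that the description is read once and can be garbage collected afterwards, while the interpreter code is shared by every UI that uses it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a real widget such as an SWT Control. */
final class WidgetNode {
    final String type;
    final String text;
    final List<WidgetNode> children = new ArrayList<>();

    WidgetNode(String type, String text) {
        this.type = type;
        this.text = text;
    }
}

/** Interpreter for an invented description format: "<type> <text>", nested by two spaces. */
final class UiInterpreter {
    static WidgetNode build(String description) {
        WidgetNode root = new WidgetNode("root", "");
        // The stack holds the chain of parents; its size minus one is the depth of the top node.
        Deque<WidgetNode> stack = new ArrayDeque<>();
        stack.push(root);
        for (String line : description.split("\n")) {
            if (line.trim().isEmpty()) continue;
            int indent = 0;
            while (indent < line.length() && line.charAt(indent) == ' ') indent++;
            int depth = indent / 2; // depth 0 means a direct child of the root
            String[] parts = line.trim().split(" ", 2);
            WidgetNode node = new WidgetNode(parts[0], parts.length > 1 ? parts[1] : "");
            // Pop until the top of the stack is this node's parent.
            while (stack.size() > depth + 1) stack.pop();
            stack.peek().children.add(node);
            stack.push(node);
        }
        return root;
    }
}

class UiInterpreterDemo {
    public static void main(String[] args) {
        String description =
                "composite\n"
              + "  label File type:\n"
              + "  text\n";
        WidgetNode root = UiInterpreter.build(description);
        // After this point the description string is no longer needed.
        WidgetNode composite = root.children.get(0);
        System.out.println(composite.type + " with " + composite.children.size() + " children");
    }
}
```

A real implementation would create actual widgets and layout data instead of WidgetNode objects, and would need a richer format for attributes such as margins and column counts.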

So maybe we can have our cake and eat it too! After thinking about this a bit, I am all excited about using cool new technologies, as long as they don't cause bloat.



We also have to be very careful not to use multiple redundant technologies to achieve the same thing, because that is another source of bloat. As in, for example, letting everyone plug in their favourite domain-specific language for creating SWT widgets and layouts. This kind of redundancy would be just as bad as redundancy through repetitive boilerplate code, so let's pick one way of doing declarative UIs!

Note that there are lots of other sources of bloat, for example, unneeded functionality, too many layers of abstraction, or unnecessary flexibility. I am running out of time but it is probably interesting to think about these as well. I'd like to know if you have any pointers for me in the comments!

If avoiding bloat is one of the goals of e4, we need to keep this goal in mind all the time. Every bit of functionality should be pulling its own weight. For example, do not add convenience API unless its additional weight can be justified by reduced weight somewhere else.

I believe we should start watching our weight from the very beginning, and from time to time, it is probably healthy to discuss the weight of the various pieces. I can't wait until we have some kind of continuous build in place, so that we can make it visible for everyone how big (or small!) the components are, and how they are growing (or shrinking!) over time.

We could also borrow some ideas from the business world and introduce budgets. You want to provide a component for declarative UI? How about you get an allowance of 300 K? Would that be enough?
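
A budget check of this kind could be a very small step in a build. The following sketch assumes that a component is simply a directory of build output and that its weight is the sum of its file sizes; the SizeBudget class and its method names are hypothetical, and the 300 K figure is just the allowance mentioned above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical build helper: measures a component directory against a size budget. */
final class SizeBudget {

    /** Returns the total size in bytes of all regular files below the given directory. */
    static long sizeOf(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile).mapToLong(p -> {
                try {
                    return Files.size(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }).sum();
        }
    }

    /** Returns true if the directory's total size does not exceed the budget. */
    static boolean withinBudget(Path dir, long budgetBytes) throws IOException {
        return sizeOf(dir) <= budgetBytes;
    }

    public static void main(String[] args) throws IOException {
        // The component directory is taken from the command line; "." is only a default.
        Path component = Paths.get(args.length > 0 ? args[0] : ".");
        long budget = 300 * 1024L; // the 300 K allowance from the text
        long actual = sizeOf(component);
        System.out.println(component + ": " + actual + " bytes, budget " + budget + " bytes"
                + (actual <= budget ? " (ok)" : " (over budget)"));
    }
}
```

Run against each component on every build, output like this would also give the growth-over-time numbers mentioned earlier.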

What do you think?

Thursday, September 25, 2008

Accidental Complexity

You can blame bit rot, API backwards compatibility, plain stupidity, or that we just didn't know any better at the time. Whatever the cause, we have introduced a good deal of accidental complexity over the last eight years or so, and with e4, we have a chance to reduce this or get rid of the "accidental" part altogether.

Here are some examples. If you know of other examples, please let me know!

First example: We have lots of early attempts to define API that had to be tweaked later; to see what I mean, press Ctrl+T for the Open Type dialog and enter 'I*2<' to see all the interfaces where we had to introduce a second and hopefully better version. Like IPerspectiveListener2, which was introduced in 3.0. This example is particularly embarrassing, because we introduced IPerspectiveListener3 in 3.1. And, believe it or not, IPerspectiveListener4 in 3.2. Boy, am I glad that this has finally converged, or we would be looking at IPerspectiveListener7 in Eclipse 3.5.



Second example: The confusing (and untold) preferences story comes to mind, as a representative of a whole class of problems. Basically, whenever we ended up with many different ways to do the same thing, we were not able to remove old code because clients still depended on it. I am optimistic that the capabilities offered by the platform can be pared down to a manageable list. Ideally, down to something like twenty services, sometimes called "the twenty things" or the "Eclipse Application Model" in the context of e4. Like being able to persist data. Receive input. Produce selections. Schedule background work. Report progress. Provide pointers into the help system. Contribute to the menus and toolbars. And so on, but I don't have a full list at this point so I should write about it when I know more.

Third example: Like many other Java APIs, we live in a kingdom of nouns. We sometimes joke that Eclipse, for every concept, has an adapter, a factory, and a manager. And an adapter factory, and a factory manager. Sadly, this is not a joke at all. ContextManagerFactoryModifierHelperFactory. AbstractRefactoringDescriptorResourceMapping. I wish I could make concrete suggestions for how we can improve on this, but I am afraid we need to look at these APIs in detail. It's just a gut feeling that names with three or more nondescript nouns are making things more complicated than necessary. By the way, my mother tongue is German, so I should be used to putting many nouns together, but I still find that something like INodeAdapterFactoryManager is way over the top.

Fourth example: A good deal of complexity and bloat is caused by the proliferation of preference pages, leading to countless lines of code that support all the different combinations of all the supported preferences. Are we really helping our users by exposing and maintaining all these options?



That's it for today. I'd love to hear what you think about this, or if you have more examples of accidental complexity.

(Disclaimer: I am well aware that e4 will need to be backwards-compatible, so that 3.x plugins continue to run. When I wrote "get rid of" I meant something a little more subtle, as in "move it to compatibility plug-ins so that adopters of e4 don't have to worry about it when they develop new functionality.")

Monday, September 15, 2008

Better Than Newsgroups?

For programming-related questions and answers, check out stackoverflow.com, which launched today with a public beta. It looks very interesting - a forum, crossed with digg (user-rated), wiki capabilities (user-edited), and delicious (tags). The hope is that these three mechanisms will make the site more valuable over time, as questions are answered, rated, tagged, and edited for clarity.

Currently, the site is a bit Microsoft-heavy. For example, C# has twice as many questions as Java, and EMF refers to the vector graphics "Enhanced Metafile Format". Still, Eclipse is well represented with 58 questions compared to NetBeans with 5.

I find it amazing to watch how questions come in at a rate of roughly one per minute, and how they get answered at a similar rate. It's just like on the EMF newsgroup, but so much more Web 2.0!

Friday, September 05, 2008

Simply models

I have been working on a small demo application that (when finished) is meant to be an exemplary RCP-style desktop application based on e4. This is how it looks so far:



There is a very simple navigator on the left, a thumbnails view, and a preview area.

As I wrote the application based on Eric's and Tom's work, I started by implementing a couple of views that I contributed to the new e4 Workbench through an extension point.

Then I realized that all I need is an EMF model representing the parts that make up the UI: the views, the menu, and the toolbar. I don't actually need any extension points, because my application is not an extensible IDE - it is a rich client application for which I know how the UI should be structured. I am pretty happy about the end result - my application is fully described by one XMI file that I can edit using an editor for EMF models, and I can just go and modify that file if I want to rearrange views, add new ones, delete unnecessary ones, build my menu structure and so on.



Compare this to how it works with the current 3.x Workbench: I would have to write a perspective extension, a couple of view extensions, a command extension, a menu and a toolbar extension. Then I would have to assign unique IDs to all these pieces, and ensure that the IDs match up properly. How many hours we've wasted in the past when IDs did not match up!

Of course, this only works for closed applications, not for open and extensible ones like the Eclipse IDE. However, it is still pretty useful to be able to reason about a concrete EMF model representing the UI application structure. For example, when you have lots of little pieces contributed by many plug-ins, we can now talk about what it means to produce that concrete model from the many little pieces. One interesting question based on this is: Can we maybe generalize the many ad-hoc mechanisms we have grown over the last years (most of which involve matching up IDs) to something that is less ad-hoc and more general?

It's been fun to work on this little application so far. If you would like to participate, let us know on the e4 mailing list, and/or attend the bi-weekly e4 conference calls. You can always email me directly, of course.

P.S. The code is available in CVS - check out the ui/demos project under /cvsroot/eclipse/e4-incubator to get the launch configuration and project set file (emf-workbench.psf).

Thursday, June 05, 2008

Unpaid Volunteers Do Exist

Bjorn blogged about a recent discussion on the SWT and Foundation newsgroups, and claims that Eclipse does not do enough to attract unpaid volunteers.

From my little corner of the ecosystem, I can offer a counterexample. As of today, three of the nine active committers on the Eclipse Platform project's UI component (JFace, Workbench, IDE) are unpaid, part-time volunteers. It seems to be attractive enough to be a Platform committer, at least to some, resulting in 33% active Platform UI committers who are not employees of IBM.

Interestingly, they are all 'unpaid volunteers' (this applies to two more who are not listed as active committers now but were active in the past). They are independent consultants, employed software developers investing some of their spare time, or come from small companies that use Eclipse technology. To me, this contradicts Bjorn's assumption that Eclipse is only 'paid volunteers'. In fact, I would like to understand why in the past, we haven't seen 'paid volunteer' people approach us because they want to contribute to the Platform. (It looks like the e4 effort may change this.)

I admit that we could be more open and transparent, beyond just explaining how to contribute and hanging out on IRC and the newsgroups. However, what we cannot do is invest a lot of time into contributors who only contribute once (as opposed to ongoing, even if it is part-time). Working with the community on their contributions takes a lot of time. This is why we invest more time on those who may turn into committers at some point: if/when they become committers, we can hope for a return on our time investment.

By the way Bjorn, by disagreeing with one of the replies saying: "this is entirely a volunteer effort", you disagreed with an actual unpaid volunteer, one of the part-time non-IBM committers on the Platform UI component. Francis managed to break through the "glass wall", and the Rizzo Ceiling! I would recommend that you read Francis' post again in this light. ;-P

P.S.
A large part of the high barrier of entry and learning curve is inherent in what we do, and results from all the IP rules, process rules, API compatibility rules, accessibility rules, internationalization rules, performance considerations, architectural integrity considerations, and some more that I probably forgot. I don't think we can do much about this, but I agree with Francis that we need to get a lot better at encouraging contributions and contributors.

Thursday, April 10, 2008

e4 summit: May 22nd/23rd

I hope that all interested parties are already subscribed to the e4 mailing list. For those who are not, just an update that we now have a date (May 22nd/23rd, 2008) and a location (Ottawa) for the e4 summit, and that work on the agenda has started. If you are interested in working on the new Platform, please consider attending the summit and add yourself to its wiki page.

Wednesday, March 19, 2008

e4 pointers, and a teaser

This is just a quick update for those who would like to contribute to e4, and those who would like to get our demo code to compile and run. I just updated the e4 wiki page (wiki.eclipse.org/E4) with a link to the mailing list, and instructions on how you can run the demos.

Oh, and here is a screenshot of the demo that was not shown in the Eclipse 4.0 talk. Come to the Eclipse 4.0 (e4) Kick Off BoF tonight to see it live:

Wednesday, March 05, 2008

Pre-emptive snarky comment

It appears I am not the only one reading the "Old New Thing" blog from a Microsoft employee who sometimes ends his posts with pre-emptive snarky comments.

I enjoy reading that blog, and not for the snarky comments. I find it interesting to read from someone who is working in a similar position but different context: there seem to be striking parallels with similar experiences we have had on the Eclipse Platform team.

Anyway, it is time to make a pre-emptive snarky comment myself. It is not actually related to this blog entry. It is related to an e-mail announcing the creation of a new component e4 in the Eclipse Incubator project:

"Just as I expected from those guys at IBM. We'll never see diversity in the Platform."

Now that this is out of the way, here is the scoop: We want to make the code for our EclipseCon demos available in the open. We realized (admittedly, very late) that nobody from the SWT team had commit rights in the existing Eclipse Incubator project. Creating a new component in the Eclipse Incubator project was the fastest way to create a home for the experimental code that we will be demoing, with write access to everybody who has been involved so far.

The key words are "so far" - our hope is to find more people and companies who would like to work with us on e4 - our current name for the next-generation platform on which an Eclipse 4.0 can be built. To find out more about this, and to see the demos, come to these two talks at EclipseCon: Eclipse 4.0, and The Future of SWT.

By the way, when I wrote "we", I was referring to people at IBM, Innoopract, and Code 9. So far.