kohsuke

(I started cross-posting blogs to my own website.)

I was working on Hudson yesterday which led me to develop this little tool called Bridge method injector.

When you are writing a library, there are various restrictions about the kind of changes you can make, in order to maintain binary compatibility.

One such restriction is that you cannot narrow the return type of a method. Say in v1 of your library you had the following code:

 public Foo getFoo() { return new Foo(); } 

In v2, say you introduce a subtype of Foo called FooSubType, and you want to change the getFoo method to return FooSubType.

 public FooSubType getFoo() { return new FooSubType(); } 

But if you do this, you break binary compatibility: existing clients were compiled against a method that returns Foo, so they need to be recompiled to work with the new signature. This is where the bridge method injector can help. By adding an annotation like the following:

 @WithBridgeMethods(Foo.class) public FooSubType getFoo() { return new FooSubType(); } 

... and running the bytecode post processor, your class file will get the additional "bridge methods." In pseudo-code, it'll look like this:

// your original definition
@WithBridgeMethods(Foo.class)
public FooSubType getFoo() { return new FooSubType(); }

// added bridge method
public Foo getFoo() {
    invokevirtual this.getFoo()LFooSubType;
    areturn
}

Such code isn't allowed in Java source files, but class files allow that. With this addition, existing clients will continue to function.

In this way, you can evolve your classes more easily without breaking backward compatibility.
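As a side note, javac itself generates the same kind of method when a subclass overrides a method with a narrower return type. The following is only a sketch to make the idea visible with reflection; the class names are made up for illustration and it does not use the bridge method injector at all.

```java
import java.lang.reflect.Method;

public class BridgeMethodDemo {
    static class Foo {}
    static class FooSubType extends Foo {}

    static class Base {
        public Foo getFoo() { return new Foo(); }
    }

    static class Derived extends Base {
        // Covariant override: javac also emits a synthetic bridge "Foo getFoo()".
        @Override
        public FooSubType getFoo() { return new FooSubType(); }
    }

    /** Number of getFoo methods present in Derived's class file. */
    static int declaredGetFooCount() {
        int n = 0;
        for (Method m : Derived.class.getDeclaredMethods()) {
            if (m.getName().equals("getFoo")) n++;
        }
        return n;
    }

    /** Number of those getFoo methods that the compiler flagged as bridges. */
    static int bridgeGetFooCount() {
        int n = 0;
        for (Method m : Derived.class.getDeclaredMethods()) {
            if (m.getName().equals("getFoo") && m.isBridge()) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Lists both methods; they differ only in their return type.
        for (Method m : Derived.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName()
                    + "() bridge=" + m.isBridge());
        }
    }
}
```

Two methods with the same name and parameters but different return types coexist in one class file here, which is exactly the situation the injector creates for a method that has no superclass version to override.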

For more about how to use it in your Maven project, see the project website.

I will be presenting Hudson (with a focus on its Selenium support/integration) at the upcoming San Francisco Selenium Meetup event on Jun 22nd in San Francisco. There are several Selenium-related plugins in Hudson, but running Selenium tests on Hudson involves some initial setup cost. I'll discuss those, plus general-purpose features in Hudson that work really well with Selenium.

In the week after that, from June 30th to July 3rd, I'll be in Israel, thanks to JFrog. This is my first visit to Israel, so I'm really excited. If there are any Hudson users there who'd like to meet up, please let me know, as I'm always interested in seeing different deployments of Hudson and learning from them. Or if you are interested in having me do some short on-site work, there won't be any travel cost, so this would be a good opportunity ;-)

Further down the road, I'll be speaking at JavaOne 2010. Historically we have had a good number of Hudson committers/users at JavaOne, so we've been doing some get-togethers. I hope we can do it again, so please stay tuned as the details of the conference develop over the summer.

kohsuke

Interview with DZone Blog

Posted by kohsuke Apr 29, 2010

I did a quick interview with DZone about my new company, InfraDNA, which they published on their website. Thank you DZone for the opportunity!

As I wrote in my farewell note, I was working on starting a new company around Hudson. It took longer than I initially anticipated, but it's finally open for business!

The company will provide two things; one is support, so that I can answer your questions and problem reports in a timely fashion, and the other is consulting, so that I can help you develop custom plugins, or provide on-site support to work on some tricky problems.

The name of the company is InfraDNA because I think of Hudson more as an infrastructure on which all kinds of server-side automation/tools can be built/deployed, and because I think this stuff is built into me (as in DNA): when I look back on my career as a software engineer, I always somehow seem to come back to tooling. (Plus, the domain name was available!)

Looking forward to hearing from you.

kohsuke

POTD: GitHub API for Java Blog

Posted by kohsuke Apr 18, 2010

My project of the day (or "POTD") is GitHub API for Java — a library for accessing GitHub programmatically.

As the Hudson community is embracing plugins developed in Git more and more, I needed to interact with GitHub as a part of the community infrastructure automation. I did a quick Google search to locate existing implementations, but unfortunately I couldn't find anything good. So I decided to just write my own. Thanks to GitHub's reasonable API design and good documentation, it was very easy to do so. The trick is to use the right library, which handles most of the JSON/Java databinding.

The library so far only covers the part of the GitHub API that I care about, which is a small subset of the entire GitHub API. But hopefully this library is easy enough to extend so that other people can add the remaining APIs. The source code is available on GitHub.

kohsuke

Hudson console markups Blog

Posted by kohsuke Apr 14, 2010

Despite all the kinds of reports that Hudson understands, such as JUnit, PMD, FindBugs, etc., log files still hold a special place in terms of capturing what has really happened. Hudson does a bit of AJAX in this space to let you follow output as it comes, but the log is basically just plain text without any real structure.

But that is changing. One of the recent improvements in Hudson is the infrastructure and extension points for Hudson (and its plugins) to mark up the console output to improve interactivity and do some cool stuff.

I prepared two kinds of extension points for this. One is the ability to scan the console output line by line and add arbitrary markup to it. This can be used for context-independent markup, for example to turn URLs into hyperlinks, look for keywords like "ERROR", that sort of thing.
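To give a feel for the line-by-line kind of markup, here is a minimal stand-alone sketch that turns URLs in a single line of output into hyperlinks. It does not use Hudson's extension point API; the class name, the method name and the deliberately simple URL pattern are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlLineAnnotator {
    // Deliberately simple URL pattern (an assumption); real detection needs more care.
    private static final Pattern URL = Pattern.compile("https?://[^\\s<>\"]+");

    /** Escapes the characters that matter in HTML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Returns the line as HTML, with every URL wrapped in an anchor tag. */
    static String annotate(String line) {
        Matcher m = URL.matcher(line);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Plain text between the previous match and this one.
            out.append(escape(line.substring(last, m.start())));
            String url = escape(m.group());
            out.append("<a href=\"").append(url).append("\">").append(url).append("</a>");
            last = m.end();
        }
        out.append(escape(line.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(annotate("Fetching http://example.com/x now"));
    }
}
```

Because each line is handled on its own, this style of markup needs no knowledge of what the build is doing, which is what makes it context-independent.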

The other kind is more interesting, where we can place anchors (I call them 'notes') at arbitrary points during the output, and those notes can then in turn generate markups. This enables highly context sensitive markups, which I think has a lot of potential.

For example, I started putting a note for every Ant target that gets executed during the Ant execution. I can use this to generate an outline for the console output, so that you can jump to the interesting targets, or move up/down to the next target very quickly. For simple build scripts, I can let users click the target name and jump to its definition in the build script.

Another place I do this today is when Hudson reports an exception. I can make a stack trace foldable so as not to overwhelm users, and I can also hyperlink each stack trace element to its source file, as a way to encourage people to start hacking Hudson. Or if a build fails, I can present a UI that gives you actions that you might want to take: 1. edit config, 2. rebuild, 3. report to the admin, etc.

With Maven, where Hudson puts a little spying agent inside the Maven process, I can do even better. For example, wouldn't it be nice if you could hide all the "[INFO]" messages with one mouse click? How about navigation from compilation failure reports to source files? Or an outline of the modules that were built, so that you can jump to them quickly?

If you are a user, this is just a sneak preview of what will come. If you are a plugin developer, think about all the things you might want to do with this mechanism!

My project of the day (or "POTD") is Custom Access Modifier: an annotation and an enforcer that let you define application-specific custom access modifiers.

So let me explain this a bit more. Say you have a library that people use, and say you are thinking about deprecating one of the methods. Yes, you can just put @Deprecated, but that doesn't actually prevent people from continuing to use it. This is where you can put the custom access modifier, like this:

public class Library {
    @Deprecated @Restricted(DoNotUse.class)
    public void foo() {
        ...
    }
}

This causes compilation to fail for new source files that try to call the foo method. But the resulting class file still contains the method, so existing applications continue to work. As far as the JVM spec is concerned, this constraint enforcement is strictly in user land and thus voluntary, and at runtime there is no check and no overhead.

Or say you have a "public" class that's never intended to be used outside your library? Not a problem.

@Restricted(NoExternalUse.class)
public class FooBarImpl {
    ...
}

In the first version, I packaged the enforcer as a Maven mojo, but it should be trivial to write an Ant task or CLI. A real usability improvement would be to do this as a JSR-269 compatible annotation processor, but unfortunately the enforcer needs bytecode level access to the source files being compiled, and I don't think JSR-269 gives me that, which is a pity.

The real flexibility here is that you can define your own access restrictions, not just using those that I provided out of the box.

The reason I came up with this is to better assist feature deprecation in Hudson. With 6+ years of code history, there is a fair amount of deprecated code in the foundation. We'd eventually like to remove it, but we can't just delete it all of a sudden, because there might be plugins out there using it. But with this plugin, I can actually make sure that plugins are not using those deprecated features that are candidates for removal.

I hope you'll find this tool useful. The source is on GitHub.

kohsuke

Good bye, Sun/Oracle Blog

Posted by kohsuke Apr 5, 2010

I started working for Sun Microsystems in January 2001, when I first came to the US. During these years I was able to work on many different projects, such as MSV, JAXB, JAX-WS, Metro, GlassFish v3, and Hudson, to name a few, with many great people. It was all quite an enjoyable journey. I won't list all those names one by one here, for the list would be too long, but if you are one of them, I think you know that I'm talking about you. As my colleague Abhijit once said, a large part of enjoying your work is the people you work with.

So with a bit of sadness and a lot of excitement, I announce that today is my last day at Oracle.

Where am I heading next? I'm actually starting my own company to take Hudson to the next stage. This has always been in the back of my mind, and I'm very excited that I'm finally doing it. Stay tuned for more details, in a week or so. But in the meantime, if you'd like to get any custom development/support done on Hudson, please let me know at kk@kohsuke.org so that we can start having a conversation.

Even though I leave Oracle, I'll continue to lead the Hudson project. I'll be working with Oracle to transfer the infrastructure services to their IT operations team. There might be some out-of-schedule releases, service disruptions, and other inconveniences during this period, but hopefully things will be back in order relatively quickly.

And finally, a big thank you to everyone in the Hudson community, and in the broader java.net community. I wouldn't be here without you guys, and I feel very proud to be a part of it. Thanks for your patronage of my projects, and I hope our relationship will continue.

kohsuke

Hudson Hackathon Day 1 Blog

Posted by kohsuke Mar 19, 2010

Hudson Hackathon Day 1 is over, and I'm just back to the office.

A total of 9 people came and we had a great time talking about infrastructure issues, possible enhancements, and design discussions, exchanging tips and plugins that people have developed, and otherwise building personal relationships. It was a beautiful day outside, and fortunately the meeting room had a lot of sunlight to create a warm atmosphere.

As for me, I didn't get much hacking done, but that's OK because my job there was to help others more than to get hacking done myself.

If you are living in the San Francisco Bay Area, or if you are visiting the area for EclipseCon next week, make sure to come to Hudson Hackathon this Friday 3/19 and/or Saturday 3/20. The plan is to meet up, hang out, chat, hack code, and have fun. If you are planning to attend, please RSVP by leaving your name on the Wiki.

We planned this for two days, so that people doing Hudson for work can come Friday during their business hours, and people doing Hudson outside work can come Saturday without conflicting with day job commitments. On Friday it'll be hosted at the Oracle Santa Clara campus (and I booked a nice conference room that we only use for special occasions), and on Saturday it'll be hosted at Hacker Dojo in Mountain View. See the Wiki page for more details.

It should be a lot of fun — please come join us.

James Lorenzen had an excellent blog post about the importance of a descriptive commit comment. I can't agree more.

Unfortunately, I think getting better at writing commit messages takes trial and error. The way I've learned it is by getting frustrated by the lack of commit messages. So in the spirit of encouraging everyone (including myself) to do a better job, I thought I'd list what I try to leave in the comments.

  • Bug ID. In fact, bug/commit association is so useful that you often use (or write) programs that analyze these relationships, so it's preferable for this information to be machine readable.
  • URL to the e-mail in the archive that prompted me to produce a change. In Hudson, a conversation with users often reveals an issue or an enhancement that results in a commit. This URL lets me retrieve the context of that change, and I find it tremendously useful.
  • If the discussion of a change was in IM, not e-mail, I just paste the whole conversation log, as it doesn't have a URL. Ditto if the e-mail was sent to me privately.
  • The input value and/or the environment that caused a misbehavior. In Hudson, I have this one method that needs to do some special casing for various application servers. When I later generalized it a bit more, commit messages that recorded the weird inputs from WebSphere, OC4J, etc. turned out to be very useful.
  • For a fix, a stack trace that indicates the failure. Sometimes I misdiagnose the problem, and later when I suspect I did, being able to see the original output really helps me.
  • If I tried/considered some other approaches to the problem and abandoned them, record those and why. I sometimes look back at an old change and go "why did I fix it this way? Doing it that way would be a whole lot better!", only to discover later that "ah, because this won't work if I do that!", and realize that I've gone through the same train of thought before. If I'm in the middle of a big change and decide to abandon a considerable portion of it, I sometimes even commit that and roll it back, just so that I can later revisit why I abandoned it.
  • If a change should have been logically a part of a previous change, just say so. If I happen to know the commit ID of the previous change, I definitely leave that, but if I don't remember it, I still try to point to where that previous change was, like roughly when it was made, which file it was made in, who did it, etc., so that my future self can narrow down the search space if I need to find it.
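To illustrate what "machine readable" buys you for the bug ID, here is a small sketch that pulls issue IDs out of a commit message. The "PROJECT-123" format and the sample message are assumptions made up for this example, not a convention taken from any particular tracker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommitIssueIds {
    // Hypothetical convention: an upper-case project key, a dash, then a number.
    private static final Pattern ISSUE_ID = Pattern.compile("\\b[A-Z][A-Z0-9]*-\\d+\\b");

    /** Returns every issue ID mentioned in the message, in order of appearance. */
    static List<String> issueIds(String commitMessage) {
        List<String> ids = new ArrayList<>();
        Matcher m = ISSUE_ID.matcher(commitMessage);
        while (m.find()) {
            ids.add(m.group());
        }
        return ids;
    }

    public static void main(String[] args) {
        // prints [EXAMPLE-12, EXAMPLE-7]
        System.out.println(issueIds("[FIXED EXAMPLE-12] handle null workspace; see also EXAMPLE-7"));
    }
}
```

Once the IDs follow a fixed shape like this, linking commits to issues (or the other way around) is a few lines of code rather than guesswork.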

What do you try to leave in your commit messages?

kohsuke

ASM incompatible changes Blog

Posted by kohsuke Feb 12, 2010

ObjectWeb ASM is a great library that's used to parse Java class files. It's used in all kinds of projects, such as Hibernate, Corba, JAX-WS, Jersey, Spring, Hudson, to name a few.

But I have a pet peeve with this otherwise great library, namely its insistence on small size (which by itself isn't a bad thing), and its consequences.

One of the choices they made to achieve this was to omit the debug information entirely from the class files, including the line number tables. This is unlike every other OSS Java project. So you can't step through the ASM code from the debugger, your IDE won't offer any assistance while editing, and it generally makes the library hard to use.

Another choice they made to achieve the small size is to ignore backward compatibility. Whenever ASM changes, and it has to change every so often (at least every time the class file format changes), it doesn't do anything to keep existing applications working. For example, from 2.x to 3.x, the ClassReader.accept method, which parses the class file, changed in an incompatible way. It used to take a ClassVisitor and a boolean, which was a flag, but in 3.x it needed to take more flags, so they opted for taking an integer as a bit mask.

It was entirely possible to retain backward compatibility by defining the following simple one-line method, but no, keeping asm.jar smaller was more important for ASM, so they just removed the method instead.

public void accept(ClassVisitor cv, boolean skipDebug) {
  accept(cv, skipDebug?SKIP_DEBUG:0);
}

The end result is that if your application is using one framework that uses one version of ASM underneath, and if you have another library that uses another version of ASM underneath, then there just isn't any way to make them work together — using the latest version of ASM breaks libraries/frameworks/apps that are built with earlier versions. Google can tell you how widespread this problem is.

In the face of this, framework/library developers work around the problem by doing package renaming. So everyone ends up getting their own private copy of ASM, and the result is far from optimal. For example, if I remember correctly, GlassFish now has 4 copies of ASM in it, one used by Corba, one used by EclipseLink, one used by JAX-WS, and another one by Jersey. The CDI implementation might have another one, too. Needless to say, the size increase caused by this far outweighs the minimal size decrease obtained by not keeping backward compatibility.

I think the ASM developers should step back and think hard about whether reducing the jar file size by a few tens of KB is really worth this much pain throughout the food chain. More concretely, I humbly propose:

  • ASM ships their jars with full debug info, like everyone else does. If small footprint is important for some users, a minimized jar can be delivered separately.
  • ASM retains backward compatibility in their API for minor releases, and for every API-breaking major release, ASM should deliver an official package renamed jar. This allows me to write a library/framework that depends on ASM 2.x, and my library can peacefully coexist with someone else's library that uses ASM 3.x. Official package renamed jars cut the # of duplicates.
kohsuke

MSI installers for Hudson Blog

Posted by kohsuke Jan 26, 2010

I've finally managed to produce the Windows installer for Hudson, as originally raised by Håkan Reis. Please try it out and let me know how it works.

This one took much longer than the installer for any other platform, and while I normally think very highly of Microsoft technologies, Windows installers and WiX are a real disappointment. For example, you write the description of the installer in XML, but the language design is such that you need to write an ID for various XML elements even if you never reference them (and up until the previous version you had to write both long names and 8.3 short names). It definitely sets a new record for badly designed XML languages. And once that's over, there's the never-ending pain of making sure that the upgrade works correctly. Anyway, hopefully that's all taken care of now, and it won't be visible to you users.

One of the highlights of this installer is that it comes with a JRE, to be fully self-contained. This is because Windows users don't normally know what to do with the *.war file, and they generally don't like going to the command prompt and running Java command manually.

Another highlight is the way I use Hudson to build the installer: I run the release process from Unix, but I need to build the installer on a Windows system. To achieve this, I use the "distfork" plugin, and ask Hudson to provision a Windows system while I build an installer. You can think of it as ssh without specifying the host name (and instead I let Hudson rent me an available Windows slave.) Unlike designating one machine to do the job, I can create as many installers as I want in parallel without a slowdown.

I think this mechanism can be used for all sorts of batch processing. One more use for my Hudson cluster.

At work, I have two monitors hooked up to my workstation, which gives me about 4300x1600 combined screen real estate (one of them had to come out of my own pocket, but that's a separate story.) When I switched from a single-monitor setup, my behavior changed a bit.

It used to be that I had most of the applications maximized, and used Alt+Tab to switch between them, most of the time. The screen wasn't big enough to run two apps side by side.

With multiple monitors, this is no longer the case. At least two, more often three applications are visible. Say a maximized IDE on the primary monitor, Gnome terminal on the left hand side of the second monitor, and Thunderbird on the right hand side of the second monitor. This arrangement lets me work on code while looking at the stack trace of an exception in a terminal, or a bug report in the e-mail, etc.

One of the pains I've been feeling is the lack of an efficient way to switch window focus to different applications. And when I say "efficient", I mean by using the keyboard, without reaching out to a mouse.

Let's say I'm typing an e-mail, and realize that I need to check code in the IDE. For me to switch focus, I have to first hit Alt+Tab, comprehend the list of windows that appears, then hit Alt+Tab a few more times until the right IDE window gets the focus. That's a lot of time and cognitive overhead.

So instead I developed a little script that lets me shift focus by relative directions from the current window. In the above scenario, if the IDE is on the left of e-mail, I hit Win+Left, and I get the focus shifted to the left. Ditto for other arrow keys. This turns out to be much more efficient, as the spatial relationships between windows are easier to grasp and are something you can see all the time. I think this would also scale to an even larger number of monitors, something which I'd love to have some time in the future.

The script uses a Ruby library that manipulates X windows underneath. It was originally developed by someone else whose name I forgot, and I have since started to maintain my own fork that fixes various problems. I then wrote more Ruby code that figures out which window is in the general up/down/left/right direction of the current window. This turns out to be a very interesting puzzle, as it has to rely on heuristics. In the current version, the script considers such things as visible regions of each window, their center of gravity, distance and angle from the current window, etc. Another fun part was to think about the algorithm to compute that efficiently, even though the benefit is largely theoretical and not practical, since no one opens 1000 windows. It currently takes O(N^3) time for N windows.
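To make the direction puzzle a little more concrete, here is a much simplified sketch in Java. It is not the Ruby script and it ignores visible regions entirely: it only looks at window centers, keeps the candidates that lie within 45 degrees of the requested direction, and picks the nearest one. All names and the 45-degree cutoff are assumptions made for this illustration.

```java
import java.util.List;

public class DirectionalFocus {
    enum Direction {
        // Screen coordinates: x grows to the right, y grows downward.
        LEFT(-1, 0), RIGHT(1, 0), UP(0, -1), DOWN(0, 1);
        final int dx, dy;
        Direction(int dx, int dy) { this.dx = dx; this.dy = dy; }
    }

    static final class Window {
        final String name;
        final int x, y, width, height;
        Window(String name, int x, int y, int width, int height) {
            this.name = name; this.x = x; this.y = y; this.width = width; this.height = height;
        }
        double centerX() { return x + width / 2.0; }
        double centerY() { return y + height / 2.0; }
    }

    /** Returns the nearest window in the given direction, or null if there is none. */
    static Window pick(Window current, List<Window> others, Direction dir) {
        Window best = null;
        double bestDist = Double.MAX_VALUE;
        for (Window w : others) {
            double vx = w.centerX() - current.centerX();
            double vy = w.centerY() - current.centerY();
            double dist = Math.hypot(vx, vy);
            if (dist == 0) continue; // same center, no meaningful direction
            // Cosine of the angle between the offset and the requested direction.
            double cos = (vx * dir.dx + vy * dir.dy) / dist;
            if (cos < Math.cos(Math.PI / 4)) continue; // outside the 45-degree cone
            if (dist < bestDist) { best = w; bestDist = dist; }
        }
        return best;
    }

    public static void main(String[] args) {
        Window ide = new Window("ide", 0, 0, 1920, 1200);
        Window terminal = new Window("terminal", 1920, 0, 960, 1200);
        Window mail = new Window("mail", 2880, 0, 960, 1200);
        // From the terminal, "left" lands on the IDE.
        System.out.println(pick(terminal, List.of(ide, mail), Direction.LEFT).name);
    }
}
```

This version is linear in the number of windows per key press; handling overlapping windows and visible regions is where the real heuristics, and the extra cost, come in.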

I think there's still some room for improvements, but I'm generally happy with the result. Beyond that, I think this library can be also used for doing all kinds of other Window manipulations that you may think of.

I use the Compiz commands plugin to kick off a script when a particular key combination is pressed. If you are using Linux/Solaris with multiple monitors, you should give it a try, too.

As a programmer, I spend a lot of time fixing bugs. And a considerable portion of that is the time spent on reproducing a problem. Here is how a typical such session goes. Your user reports that your program doesn't work and throws such and such exception. Or given the symptom he's describing, you suspect some "if" statements to be evaluating to false.

If you are lucky and experienced, you can sometimes fix the problem through careful reasoning, but often you need to collect more data to be able to fix the bug. Logging is one such measure, but I often find myself thinking "if only I could ask him to run the program with a debugger and report what the variable 'x' refers to!" After all, a debugger is the ultimate data collection tool. Unlike logging, your program doesn't need to be written a priori to report data. Similarly, with the ability to inspect stack frames all the way down, evaluate arbitrary expressions, and modify the state of the target program, a debugger is a far more powerful trouble-shooting tool. The one and only problem with asking your user to run a debugger is that it's too difficult.

So to that end, I developed YouDebug.

YouDebug is a debugger but it's not a debugger. It's a debugger, because it builds on top of Java Platform Debug Architecture, and therefore is capable of doing everything your debugger can do, such as attaching to another process, breaking when certain conditions are met, inspecting/manipulating variables, and so on.

But at the same time, it's not your typical debugger, because it's not interactive. You don't need source code either. Instead of using point-and-click and GUI, it comes with a DSL-like syntax sugar on top of Groovy that controls what YouDebug would do against the target program. Groovy was chosen so that Java programmers can comfortably write scripts, while still allowing me to do enough magic behind the scenes.

Let's say you have the following program, which computes a String and then takes a substring of it.

public class SubStringTest {
    public static void main(String[] args) {
        String s = someLengthComputationOfString();
        System.out.println(s.substring(5));
    }

    private static String someLengthComputationOfString() {
        ...;
    }
}

One of your users is reporting that it's failing with a StringIndexOutOfBoundsException:

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.String.substring(String.java:1949)
        at java.lang.String.substring(String.java:1916)
        at SubStringTest.main(SubStringTest.java:7)

So you know s.substring(5) isn't working, but you want to know the value of the variable 's'.

To do this, let's set the breakpoint at line 7 and see what 's' is:

breakpoint("SubStringTest",7) {
  println "s="+s;
}

Now, you send this script to the user and ask him to run the program as follows:

$ java -agentlib:jdwp=transport=dt_socket,server=y,address=5005 SubStringTest
Listening for transport dt_socket at address: 5005

And then the user executes YouDebug on a separate terminal. YouDebug attaches to your program, and your script will eventually produce the value of 's':

$ java -jar youdebug.jar -socket 5005 SubStringMonitor.ydb
s=test
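Seeing s=test is enough to explain the failure: the string is only 4 characters long, so substring(5) starts past its end. As a quick sanity check, here is a tiny stand-alone sketch (the class and method names are made up for this example) that reproduces the condition without the original program.

```java
public class SubStringRepro {
    /** Returns true if s.substring(beginIndex) throws StringIndexOutOfBoundsException. */
    static boolean substringFails(String s, int beginIndex) {
        try {
            s.substring(beginIndex);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(substringFails("test", 5));    // prints true: 5 > length 4
        System.out.println(substringFails("testing", 5)); // prints false: 5 <= length 7
    }
}
```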

See the user guide about all the other things you can do with YouDebug and how you do it. The download is available from here (I posted a new version here that fixes several bugs), and the source code is hosted here.