At HubSpot we use Graphite for most of our operational graphing needs, tracking such things as server uptime, response times, response codes (errors, etc.), cache layer capacity and hit rates, and more.
We also use Jenkins (formerly Hudson) for continuous integration, since we like to deploy very often. Jenkins, like all systems, can hit capacity bottlenecks: when no build executors are available, a build has to wait. These delays frustrate developers and slow us down.
But how often does it happen? When should we add Jenkins nodes to our cluster?
One of our developers, Jeremy Katz, recently wrote a script that tracks exactly this using the Jenkins API and pushes the results to Graphite for easy viewing and analysis. He went a step further and open-sourced the script under the Apache License.
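
To give a feel for the approach, here is a minimal sketch in Python. It is not Jeremy's actual script, just an illustration of the same idea: it reads executor counts from Jenkins' `/computer/api/json` endpoint and the build queue from `/queue/api/json`, then sends the numbers to Graphite using its plaintext protocol (one `path value timestamp` line per metric). The host names, the `jenkins.executors` metric prefix, and the function names are all assumptions made up for this example.

```python
import json
import socket
import time
from urllib.request import urlopen

# All of these values are hypothetical; substitute your own.
JENKINS_URL = "http://jenkins.example.com"
GRAPHITE_HOST = "graphite.example.com"
GRAPHITE_PORT = 2003  # Graphite's default plaintext listener port
PREFIX = "jenkins.executors"  # assumed metric namespace


def fetch_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def build_metrics(computers, queue, now):
    """Turn Jenkins API responses into Graphite plaintext lines.

    `computers` is the parsed /computer/api/json response and
    `queue` is the parsed /queue/api/json response.
    """
    total = computers["totalExecutors"]
    busy = computers["busyExecutors"]
    values = {
        "total": total,
        "busy": busy,
        "idle": total - busy,
        "queued": len(queue["items"]),  # builds waiting for an executor
    }
    return [f"{PREFIX}.{name} {value} {now}" for name, value in values.items()]


def send_to_graphite(lines, host=GRAPHITE_HOST, port=GRAPHITE_PORT):
    """Send newline-terminated metric lines to Graphite over TCP."""
    payload = "\n".join(lines) + "\n"
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(payload.encode("ascii"))


def main():
    """Take one sample from Jenkins and push it to Graphite."""
    computers = fetch_json(JENKINS_URL + "/computer/api/json")
    queue = fetch_json(JENKINS_URL + "/queue/api/json")
    send_to_graphite(build_metrics(computers, queue, int(time.time())))
```

Calling `main()` on a schedule, for example once a minute from cron, would produce a time series in which a sustained `idle` of zero alongside a growing `queued` count is the signal that the cluster needs more nodes.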