We’re happy to announce that the changes we’ve been planning for our GitHub authentication integration are now live in our production environment!
As we described in an earlier post, we’ve changed our OAuth model to let users select the privilege level they grant Tddium to communicate with GitHub. Now, when you link a GitHub account, you’ll see a menu of privilege levels that you can authorize. You can always change the level you’ve authorized by visiting your User Settings page, where you’ll see the same menu. For more information on Tddium’s use of GitHub permissions, see our documentation.
At Solano Labs, we believe that a seamless integration between our service and our customers’ tools provides the best user experience. Many of our customers today use GitHub and have connected a GitHub account with their Tddium account using OAuth.
We take the security of our customers’ code very seriously, and we’re making some important changes to our GitHub OAuth integration that should give you much finer-grained control over the privileges you grant Tddium to operate on your GitHub account.
What we do now
Our current GitHub OAuth functionality requests nearly complete permissions to your GitHub account (“user,repo” scope in GitHub’s API terminology). Tddium requests these privileges so that it can fully automate the setup of the CI workflow (commit hooks, deploy keys, and keys to install private dependencies). Our updated GitHub integration allows for multiple privilege levels so that you can make a tradeoff between permissions and automated setup.
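For context, GitHub OAuth scopes are requested as part of the authorization URL, so a narrower privilege level simply means asking for a smaller scope string. The sketch below is illustrative only: the client ID is a placeholder and the level-to-scope mapping is an assumption for illustration, not our exact implementation.

```ruby
require "uri"

# Illustrative mapping from a privilege level to GitHub OAuth scopes.
# The scope names ("repo:status", "public_repo", "repo", "user") are GitHub's;
# the mapping itself is an assumption for illustration, not Tddium's code.
SCOPES = {
  sso_only:      "",            # Single Sign-On, no other API access
  commit_status: "repo:status", # post commit status to pull requests
  public_repos:  "public_repo", # webhooks/deploy keys for public repos
  all_repos:     "repo",        # webhooks/deploy keys for private repos too
  full:          "user,repo"    # the current, broad permission set
}

def authorize_url(level, client_id = "YOUR_CLIENT_ID")
  "https://github.com/login/oauth/authorize?" +
    URI.encode_www_form(client_id: client_id, scope: SCOPES.fetch(level))
end

puts authorize_url(:commit_status)
```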
In the next week or so
We’ll roll out changes that will:
- Allow basic Single Sign-On with no other GitHub API access.
- Let you choose among three privilege levels that allow Tddium to:
  - post commit status to update pull requests (for public and private repos)
  - automate CI webhooks and deploy keys for public repos
  - automate CI webhooks and deploy keys for public and private repos
- Give instructions on creating bot GitHub users to allow your builds to pull dependencies installed from private GitHub repos.
If you have already linked your GitHub account, it will continue to be linked, and will give Tddium the current high level of permissions. After the rollout, you’ll be able to easily edit Tddium’s permissions on your GitHub account on your User Settings page.
We look forward to your feedback at firstname.lastname@example.org.
The Solano Labs Team
by Carl Furrow of Lumos Labs
Making sure your test suite runs quickly ensures that it will be run often. We at Lumos Labs (lumosity.com) have been working on an in-house Jenkins CI setup to run our ~2500 tests across ~360 files in under 10 minutes. Our Jenkins setup consists of about 24 executor VMs. For each build we allocate 12 executors, and each executor gets a subset of the total files to run. For example, with 360 test files, each executor VM is responsible for running 30 of them.
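To give a sense of the mechanics, splitting the spec files across a fixed pool of executors can be as simple as bucketing them by index. This is an illustrative sketch, not our actual Jenkins scripts; the WORKER_INDEX variable is a stand-in for however your CI passes an executor’s identity:

```ruby
# Illustrative sketch: split spec files evenly across a pool of executors.
# Each executor runs only the bucket matching its index.
spec_files   = Dir.glob("spec/**/*_spec.rb").sort
num_workers  = 12
worker_index = Integer(ENV.fetch("WORKER_INDEX", "0"))

my_files = spec_files.each_with_index
                     .select { |_, i| i % num_workers == worker_index }
                     .map(&:first)

puts "Worker #{worker_index} runs #{my_files.size} of #{spec_files.size} files"
# e.g. system("bundle", "exec", "rspec", *my_files)
```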
Under this configuration a build would complete in anywhere from 12 to 20 minutes. That’s fine when production releases are coming slowly, but it’s an eternity when three or more people are queueing up changes that need to go into production. Running just the rspec tests locally can take 45 minutes, so parallelization is a necessity when testing the entire suite.
As our company grew and more developers were creating feature branches, more CI builds were being queued in Jenkins. With the limited number of executors we had, builds were backing up. If you were behind 2-3 people in the build queue, you’d be waiting 30-40 minutes for your build to even start running! It was becoming a headache for all of us, so we looked at increasing the number of executor VMs, as well as beefing up the processing power of each one.
Adding more VMs to the cluster brought on additional headaches. With the increase in speed, we were noticing more segfaults occurring during builds, each one marking the build as a failure. But re-running the build would usually get it to pass. We spent many hours debugging the different environments, gems, etc., trying to determine where the segfaults were happening, and eventually coded up scripting solutions that could detect a segfault and re-run the subset of tests where it occurred. Not a permanent solution, but it was one that could get our builds passing more often without these ‘flickering’ segfaults. Coupled with this was a constant hunt to determine whether a failed cucumber scenario was legitimate or perhaps something related to Capybara Webkit. More developer time was spent re-working our selectors and specs that hit Capybara, which was time well spent, but it took a long time to re-code and to deal with version changes to the Capybara API. Obviously you cannot rid yourself of all responsibility, but running tests and managing our own servers was becoming tedious (wait for it).
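The retry logic amounted to something like the sketch below. It’s a simplification for illustration (our real scripts were wired into Jenkins): run a batch of spec files, and if the rspec process dies on a signal rather than failing normally, re-run the batch once before marking the build red.

```ruby
# Simplified sketch of the segfault-retry idea, not the actual Jenkins scripts.
# Retries a batch only when rspec is killed by a signal (e.g. SIGSEGV);
# ordinary test failures are reported immediately.
def run_with_retry(files, attempts = 2)
  attempts.times do |attempt|
    system("bundle", "exec", "rspec", *files)
    status = $?
    return true  if status.success?
    return false if status.exited?  # normal exit with failures: don't retry
    warn "rspec died with signal #{status.termsig}; retry ##{attempt + 1}"
  end
  false
end

exit(run_with_retry(Dir.glob("spec/**/*_spec.rb")) ? 0 : 1)
```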
Knowing that we wanted to stop managing our own CI environment, we went looking at CI service providers (hosted and self-hosted) to see how they would perform. Unfortunately, we ended up investing days into configuration and setup, and still the test suite times were worse than what we were seeing in our own setup. It seemed we already had the best CI around for us, and we’d have to give up on finding a hosted CI service that was easy to set up and, more importantly, faster than what we currently had. So we started building a beefier set of servers and VMs to run our Jenkins setup, and that was promising, but it was expensive.
Flash-forward to a testing-related meetup this past August, hosted by Solano Labs in SF. They showed off their hosted CI product, tddium, along with a general discussion on testing strategies and horror stories. I had a chance to talk with co-founders Jay and William about our current CI setup, and they felt strongly they could improve the running time, if nothing else.
After setting up the trial account, creating a tddium.yml configuration file, and working with Solano’s support staff to set up an environment that more closely resembled our current Jenkins setup, I had a green build!
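For reference, the tddium.yml is just a small YAML file describing things like the ruby version and which test files to run. The sketch below is simplified and from memory; treat the exact keys as assumptions and check Tddium’s documentation for the authoritative schema.

```yaml
# Illustrative tddium.yml sketch; exact keys are assumptions, see the docs.
:tddium:
  :ruby_version: 'ruby-1.9.3-p327'
  :test_pattern:
    - 'spec/**/*_spec.rb'
    - 'features/**/*.feature'
```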
Today most of our builds run in about 5 minutes on 1.9.3-p327.
We even had our ruby 2.0.0-p247 branch under 4 minutes!
Now that our tests run via tddium, we’ve phased out our Jenkins setup, and the testing queue has been all but eliminated. We ended up with a setup that allows three builds at once, and that seems like the sweet spot for us, with builds taking about five minutes apiece.
| System | Average Build Time | Executors/Workers | Speed Improvement % |
| --- | --- | --- | --- |
| Self-Hosted Jenkins | 17 minutes | 12 | - |
| tddium (ruby 1.9.3-p327) | 5 minutes | 24 | 340% improvement! |
| tddium (ruby 2.0.0-p247) | 4 minutes | 24 | 425% improvement! |
At approximately 2:14pm PT on Oct 24, 2013, Tddium’s DB master server experienced a CPU usage spike that cascaded into a server stoppage. No data was lost.
Examining data (thanks, New Relic!) and logs, our conclusion is that although average usage hovers around 20-30%, our DB master sees bursts of CPU usage close to 100%. Once Postgres crosses into “queue backup” territory, it never comes back.
Tonight, we will upgrade our DB cluster to use faster servers. This upgrade should only take a few minutes, but it will require the app to be down.
We appreciate your patience as we address these infrastructure issues.
- The Solano Labs Team
```
tddium rerun <session_id>     # rerun failed tests from session
tddium describe <session_id>  # show session details
```

You can get the same effect as `tddium rerun` by combining `tddium describe` with a small shell-script wrapper:

```
rspec `tddium describe $session_id --names --type=rspec`
```
Note: `tddium rerun` is pretty simple right now — it doesn’t do much in the way of local sanity checking, so if you ask it to rerun the tests for the wrong repo, it’ll happily try.
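If you want a wrapper with a little more guard-railing than the one-liner above, here’s a minimal sketch (illustrative only, not part of the tddium gem) that bails out when the session has nothing to rerun. It assumes, as above, that `tddium describe <session_id> --names --type=rspec` prints the failed spec names:

```ruby
#!/usr/bin/env ruby
# Minimal rerun wrapper sketch around `tddium describe`; not part of the
# tddium gem. Feeds the failed spec names for a session back into rspec.
session_id = ARGV[0] or abort "usage: #{$0} <session_id>"

names = `tddium describe #{session_id} --names --type=rspec`.split
abort "Nothing to rerun for session #{session_id}" if names.empty?

exec("bundle", "exec", "rspec", *names)
```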
by Drew Blas, Software Engineer, Chargify.com
At Chargify we rely heavily on automated testing to ensure that we always maintain a working app. With so many customers and a heavily utilized API, it’s critical that we maintain complete backwards compatibility and ensure we don’t impact existing customer operations. That’s why our test suite consists of thousands of tests for gateway interactions, workflows, and response formats. Unfortunately, it also made for painfully slow development. By integrating Solano Labs’ TDDium, we were able to reduce our complete test suite run from 2.5 hours to 20 minutes. This incredible improvement helped to promote a radical change in our testing attitudes while keeping our deployment cycle as fast and agile as possible.
What we were doing before Solano Labs’ Tddium
Our previous, homegrown Continuous Integration environment relied on a single server to do each build. Unfortunately, because of the intense load from our test suite, we had numerous limitations. We could only run a single build at a time, we often couldn’t see results until it was complete, and debugging test failures was a major pain that involved SSHing into the server to track down environment-specific issues. Worst of all, waiting for several hours between builds meant we weren’t able to get quick feedback about broken tests and prevented us from truly making good use of our tests during development.
These factors meant not only that we spent a lot of time ‘fixing’ the build, but also that we often skipped steps or didn’t test properly because of time constraints. Thanks to TDDium, that process has been greatly streamlined so that we can perform TDD the way it was meant to be.
Faster Testing with Solano Labs
Of course, with a codebase as big as ours, switching to TDDium was not instantaneous. We found a lot of areas in our tests that had to be improved or refactored. Some of these changes were due to the different runtime environments, but most had to do with brittle tests that did not respond well to the randomized, distributed testing model. Many of our tests were order dependent or conflicted with other tests and couldn’t be reliably run in parallel. However, TDDium did a great job of giving us the tools needed to re-create and fix these issues. There’s extensive logging available and even an option to pinpoint the exact tests and ordering used in a particular run. In the cases where we needed help, the support from Solano Labs was top-notch. They worked side-by-side with us on issues where we needed assistance and saved us even more time. All this helped us to decouple our tests and prepare them for highly distributed execution. The process was definitely worth it: we wound up with a much more robust test suite that runs in a fraction of the time!
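To make the problem concrete, here’s a tiny, hypothetical example of the kind of order dependence we had to hunt down (not code from our actual suite): a spec that leans on state left behind by an earlier spec passes in one ordering and fails under randomized, distributed execution. The fix is to have each example set up exactly the state it needs.

```ruby
# order_dependence_spec.rb - a self-contained, hypothetical illustration.
require "rspec/autorun"

# Shared mutable state is the classic source of order-dependent specs.
class Counter
  @@count = 0
  def self.increment; @@count += 1; end
  def self.count;     @@count;      end
  def self.reset;     @@count = 0;  end
end

RSpec.describe Counter do
  # Brittle version (commented out): it only passes if some earlier example
  # already called Counter.increment, so randomized ordering breaks it.
  #
  #   it "has been incremented" do
  #     expect(Counter.count).to eq(1)
  #   end

  # Decoupled version: each example builds and resets its own state.
  before { Counter.reset }

  it "counts only its own increments" do
    Counter.increment
    expect(Counter.count).to eq(1)
  end
end
```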
Having a test suite that runs quickly means we can get much faster feedback about changes that we make. Instead of waiting until the next day to see if a simple change ‘breaks the build’, we can instead update, test, package, and deploy at a much more rapid pace. Reducing the turnaround time on getting code to production is truly a lynchpin in any agile operation!
Easy database support!
Surprisingly, one of the easiest integration points with TDDium is the database support. The connections are mostly automatic and it’s easy to specify multiple databases in specific versions. Our tests rely on Redis, Mongo, & MySQL, and all worked basically out of the box! External service integration (like Campfire & Github) has also helped to make our lives easier and improve our team’s workflow.
Help with Rails Upgrades
Finally, we’re especially thankful to have set up TDDium before we started on our major upgrade from Rails 2.3 to 3.2. It was a huge undertaking, and the ever-watchful eye of our tests is a big reason for our success. Constantly running builds and getting rapid feedback about major architectural changes allowed us to keep making progress while not affecting our day-to-day development. TDDium handled the constantly changing configuration of our app with aplomb!
Ultimately, TDDium has given us a wonderful collaborative environment for running our tests AND provided an order of magnitude improvement in build time to keep our development team happily coding away. Thanks!
by Arian Radmand, CTO @ www.coachup.com
The CoachUp engineering department is constantly refining its development process for the sake of efficiency. I wanted to spend some time talking about one change we’ve recently made that I really feel has maximized our development speed: setting up Tddium’s continuous integration environment (solanolabs.com).
I should begin by talking a bit about our development process at CoachUp. First, we attempt to get new features into production as quickly as possible. We push new releases of our code into production every single day. To ensure that we do not break existing functionality, we put a heavy emphasis on testing; more specifically, a heavy emphasis on automated testing. We’re a Ruby on Rails shop, so we leverage the great testing tools available for Rails (primarily rspec, but a variety of other utilities as well). Although none of our engineers subscribe to TDD fully, we are meticulous in ensuring that every piece of functionality that enters our codebase is accompanied by a corresponding set of tests. We have operated in this way since the company was started.
As a result of our practices, the test suite has grown larger each day, which has been both a blessing and a curse. The engineering team has a policy of not pushing any release to production if our test suite is not completely green. It was great that we were maintaining adequate test coverage across our application, but with a large test suite the problems we ran into were twofold:
1. The amount of time our test suite took to run constantly increased
2. As it took longer and longer to run our test suite, the frequency of test suite runs began to decrease (after all, we were running everything manually)
Our solution: the Tddium continuous integration environment.
For those of you unfamiliar with Tddium and continuous integration, I’ll explain a bit about how we’ve integrated Tddium into our dev process to make us faster and more efficient. At CoachUp, we used Tddium to address the two problems mentioned above. We signed up for a Tddium account, which involved hooking up our GitHub account and selecting a plan in accordance with the size of our test suite. After we were set up, the rest was really effortless!
From our perspective, we basically just develop as normal: create a new feature branch from our GitHub repository, develop, push to GitHub, issue a pull request, and merge when ready. In the background, Tddium works on our behalf to do several things. It monitors our GitHub repository, and when a new feature branch is pushed up, Tddium springs into action by grabbing the new branch and cranking through our test suite. We then conveniently get an email report detailing the results of the test run. If the new feature branch introduces a regression bug, we know about it immediately and can fix it well before it even has a chance to become a problem. Further, Tddium makes it super easy to switch plans and add/remove workers based on your scaling needs.
For us, the move to Tddium greatly cut down on development time by letting us really step on the gas and develop at a fast pace, knowing all the while that we would be notified immediately if we introduced any regressions into the codebase.
We’re constantly trying new tools and processes here at CoachUp to make us more efficient, and Tddium has been one of our biggest successes.
Bottom line: if you’re looking for a quick, easy, non-intrusive way to speed up your test suite and make your dev team more efficient, definitely check Tddium out!
This is what has worked for us. What tools have others used to boost the efficiency/productivity of their team?
See the post live on their site: http://engineering.coachup.com/continuous-integration/