GTAC 2015: Presentations

Opening Remarks

Yvette Nameth (Google)

Opening Keynote

Jürgen Allgayer (Google)

The Uber Challenge of Cross-Application/Cross-Device Testing

Apple Chow (Uber) and Bian Jiang (Uber)

Links: Video, Slides

Soon after joining Uber in March 2015, we encountered an Uber-unique challenge while investigating UI testing tools for our mobile applications: many of our sanity tests require our rider and driver applications to communicate and coordinate their actions with each other in order to complete an end-to-end testing scenario. In this talk, we will present our platform-agnostic solution, called Octopus, and discuss how it coordinates communication across different apps running on different devices. This solution can be adopted for any tests that require coordination or communication across different apps or devices (e.g. a multi-user game or a multi-user messaging/communication app).
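The abstract doesn't spell out Octopus's internals, but the coordination problem it solves can be sketched minimally: give each device-side test a way to fire named events and block on its counterpart's events. Below is a toy signaling service illustrating that idea (the server, endpoints, and helper names are all hypothetical, not Octopus's actual API):

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.Scanner;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy signaling server: tests on different devices rendezvous on named events.
public class SignalServer {
  private static final Set<String> fired = ConcurrentHashMap.newKeySet();

  // Run one instance reachable by all devices under test.
  public static void main(String[] args) throws IOException {
    HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
    // A device fires an event: GET /fire?riderRequested
    server.createContext("/fire", ex -> {
      fired.add(ex.getRequestURI().getQuery());
      reply(ex, "ok");
    });
    // A device polls for its counterpart's event: GET /poll?driverAccepted
    server.createContext("/poll", ex ->
        reply(ex, fired.contains(ex.getRequestURI().getQuery()) ? "1" : "0"));
    server.start();
  }

  private static void reply(HttpExchange ex, String body) throws IOException {
    ex.sendResponseHeaders(200, body.length());
    try (OutputStream os = ex.getResponseBody()) { os.write(body.getBytes()); }
  }

  // Client-side helpers a device test would call (hypothetical API).
  static void fire(String host, String event) throws Exception {
    new URL("http://" + host + ":8000/fire?" + event).openStream().close();
  }

  static void await(String host, String event) throws Exception {
    while (true) {
      try (Scanner s = new Scanner(
          new URL("http://" + host + ":8000/poll?" + event).openStream())) {
        if ("1".equals(s.next())) return;
      }
      Thread.sleep(500);  // poll until the other device fires the event
    }
  }
}
```

A rider test could call fire(host, "riderRequested") and then await(host, "driverAccepted"), while the driver test does the mirror image, keeping the two scripted flows in lock-step.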

Robot Assisted Test Automation

Hans Kuosmanen (OptoFidelity) and Natalia Leinonen (OptoFidelity)

Links: Video, Slides

OptoFidelity is a Finnish high-tech company with 10 years of experience in developing and delivering R&D test automation solutions. This talk covers our experiences with, and the future outlook of, non-intrusive test methods used in mobile device UI performance testing. Did you know that the Chrome OS team uses a robot solution from OptoFidelity to measure end-to-end latency of Android and Chrome OS devices?

Juggling Chainsaws for Fun and Profit: Lessons Learned from Mobile Cross-Platform Integration Testing

Dan Giovannelli (Google)

Links: Video, Slides

Mobile development is hard. Building test infrastructure is hard. Working cross-platform is hard. Combine the three and you have a recipe for disaster. In this talk, Dan Giovannelli will share his experiences working on a cross-platform mobile test infrastructure project. He'll discuss the things that went right, the things that went (very) wrong, and what he now wishes he had known starting out. Come for insights on designing mobile tools for non-mobile engineers; stay to find out just what the heck The Matrix is and how to beat it at its own game.

Mobile Game Test Automation Using Real Devices

Jouko Kaasila (Bitbar/Testdroid)

Links: Video, Slides

Mobile games are the biggest money-making category in today's app stores, so ensuring that every version of each game works on any user's device is a high priority for any game developer. Despite the importance of validating this, there are very few examples or frameworks for automating the testing of mobile games, forcing game developers to resort to manual testing that doesn't scale to the coverage gaming companies need for their global market. One main reason is the unique nature of games as mobile apps: they access the screen directly, bypassing the UI services provided by the OS, which renders most test automation frameworks useless since traditional UI objects are not exposed.

Luckily, there are ways to use standard mobile test automation frameworks to drive test automation for games on real mobile devices, with some creativity and publicly available libraries. In his presentation, Jouko Kaasila from Testdroid will discuss three different approaches, with real-world examples and some sample code.
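One commonly used approach (sketched here under assumptions, since the talk's own sample code isn't reproduced in this abstract) is image recognition: grab a raw screenshot from the device, locate a game element by template matching, and tap the resulting coordinates through a standard driver. A minimal OpenCV version, assuming the OpenCV Java bindings are installed and the image files are examples:

```java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

// Locate a game element in a raw screenshot with template matching,
// since the game exposes no UI objects to standard frameworks.
public class FindButton {
  public static void main(String[] args) {
    System.loadLibrary(Core.NATIVE_LIBRARY_NAME);  // load native OpenCV
    Mat screen = Imgcodecs.imread("screenshot.png");    // pulled from the device
    Mat button = Imgcodecs.imread("play_button.png");   // reference image

    Mat result = new Mat();
    Imgproc.matchTemplate(screen, button, result, Imgproc.TM_CCOEFF_NORMED);
    Core.MinMaxLocResult match = Core.minMaxLoc(result);

    if (match.maxVal > 0.9) {  // confidence threshold, tune per game
      double x = match.maxLoc.x + button.cols() / 2.0;
      double y = match.maxLoc.y + button.rows() / 2.0;
      // Tap (x, y) with your driver, e.g. Appium or "adb shell input tap".
      System.out.printf("Found at (%.0f, %.0f), confidence %.2f%n",
          x, y, match.maxVal);
    }
  }
}
```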

How to Component Test Soup Dumplings

Toni Chang (Google)

Links: Video, Slides

People who have spent too much time stabilizing flaky tests will agree that we need to decompose tests. However, some find that difficult and are not sure how, while others are challenged by teammates who believe we need E2E tests to validate all scenarios. Since the idea can be hard to grasp when you are not used to viewing your product in components, I will use the abstract example of a soup dumpling to demonstrate how to break down what seems to be an inseparable whole into components and apply tests to them.

I will take you through a fun journey of translating an E2E test into component tests that will give you confidence in the final product. Hopefully this will give you a fresh perspective when you look at your own product.

Chromecast Test Automation

Brian Gogan (Google)

Links: Video, Slides

The internet of things has led to a proliferation of connected devices. Validating behavior across diverse interoperating devices is a significant testing challenge. To test Chromecast, several approaches were taken. We outline test frameworks, lab infrastructure and test tooling that we developed to generate reliable quality signals from the product. We detail the challenges of testing a product that operates in noisy networked environments. We propose that test tooling for devices such as Chromecast is in its infancy and presents opportunities for innovation in software test engineering.

Using Robots for Android App Testing

Dr. Shauvik Roy Choudhary (Georgia Tech/Checkdroid)

Links: Video, Slides

Software robots such as Monkey can be used to test an Android application without much manual effort. Several such tools have been proposed in academia with the goal of automatically generating test inputs to drive Android applications. In this talk, I will introduce a set of representative test input generation tools and present a comparative study highlighting their strengths and limitations. You will learn about the internals of these tools and how you can use them to test your application. The details of the study, along with a VM set up with the tools, are available at: http://bear.cc.gatech.edu/~shauvik/androtest/

Your Tests Aren't Flaky

Alister Scott (Automattic)

Links: Video, Slides

Flaky tests are the bugbear of any automated test engineer; as someone (probably Alister) once said, "insanity is running the same tests over and over again and getting different results". Flaky tests cause no end of despair, but perhaps there's no such thing as a flaky or non-flaky test; perhaps we need to look at this problem through a different lens. We should spend more time building more deterministic, more testable systems, and less time building resilient and persistent tests. Alister will share some examples of when test flakiness hid real problems in the underlying system, and how it's possible to solve test flakiness by building better systems.

Large-Scale Automated Visual Testing

Adam Carmi (Applitools)

Links: Video, Slides

Automated visual testing is a major emerging trend in the dev/test community. In this talk you will learn what visual testing is and why it should be automated. We will take a deep dive into some of the technological challenges involved in visual test automation and show how modern tools address them. We will demo cutting-edge technologies that enable running cross-browser and cross-device visual tests, and provide key tips for succeeding with large-scale visual testing.
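At its core, visual testing compares a fresh screenshot against a stored baseline. The deliberately naive sketch below shows that core idea and why it is hard: exact pixel equality is too brittle, so real tools (Applitools among them) layer perceptual comparison, anti-aliasing tolerance, and ignore-regions on top. This is illustrative code, not any vendor's algorithm:

```java
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

// Naive visual check: compare a screenshot against a stored baseline
// pixel by pixel, allowing a small tolerance for rendering noise.
public class VisualCheck {
  public static void main(String[] args) throws Exception {
    BufferedImage baseline = ImageIO.read(new File("baseline.png"));
    BufferedImage current = ImageIO.read(new File("current.png"));

    if (baseline.getWidth() != current.getWidth()
        || baseline.getHeight() != current.getHeight()) {
      throw new AssertionError("Screenshot dimensions changed");
    }
    long diffs = 0;
    for (int y = 0; y < baseline.getHeight(); y++)
      for (int x = 0; x < baseline.getWidth(); x++)
        if (baseline.getRGB(x, y) != current.getRGB(x, y)) diffs++;

    double ratio = diffs / (double) (baseline.getWidth() * baseline.getHeight());
    if (ratio > 0.001)  // exact equality is too brittle in practice
      throw new AssertionError("Visual diff: " + ratio * 100 + "% of pixels changed");
  }
}
```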

Hands Off Regression Testing

Karin Lundberg (Twitter) and Puneet Khanduri (Twitter)

Links: Video, Slides

Your team has just finished a major refactor of a service and all your unit and integration tests pass. Nice work! But you’re not done just yet. Now you need to make extra sure that you didn’t break anything and that there aren’t any lurking bugs that you haven’t caught yet. It’s time to put Diffy to work.

Unlike tools that ensure your code is sound, like unit or integration tests, Diffy compares the behavior of your modified service by standing up instances of your new service and your old service side by side, routing example requests to each, comparing the responses, and reporting any regressions that surface from those comparisons.
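Stripped to its essence, the idea looks like the sketch below (illustrative Java, not Diffy's actual code; the ports and endpoint are made up): replay the same request against the old and new builds and diff the responses. Diffy additionally runs two instances of the old code side by side to learn which differences are just nondeterministic noise before flagging anything as a regression.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal regression diff: send one sampled request to the old and new
// builds of a service and compare what comes back.
public class MiniDiff {
  public static void main(String[] args) throws Exception {
    HttpClient client = HttpClient.newHttpClient();
    String path = "/api/v1/users/42";  // example sampled request

    String oldBody = get(client, "http://localhost:9000" + path);  // old build
    String newBody = get(client, "http://localhost:9001" + path);  // new build

    if (!oldBody.equals(newBody)) {
      System.out.println("Potential regression on " + path);
      System.out.println("old: " + oldBody);
      System.out.println("new: " + newBody);
    }
  }

  private static String get(HttpClient client, String url) throws Exception {
    HttpRequest req = HttpRequest.newBuilder(URI.create(url)).build();
    return client.send(req, HttpResponse.BodyHandlers.ofString()).body();
  }
}
```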

We've also just open sourced the tool, and it is quickly becoming one of Twitter's most popular open source projects.

Automated Accessibility Testing for Android Applications

Casey Burkhardt (Google)

Links: Video, Slides

This talk will introduce the core accessibility affordances on the Android platform and illustrate some common developer pitfalls related to accessibility. You'll learn about the new Android Accessibility Test Framework and its integration into the Espresso and Robolectric testing frameworks. Finally, you'll learn how easy it is to add automated accessibility checking to your existing Android project tests.
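For reference, enabling the checks in Espresso is a one-liner in the support-library packaging of the era (later androidx releases move the class to androidx.test.espresso.accessibility); R.id.login below is a placeholder for a view in your own app:

```java
import static android.support.test.espresso.Espresso.onView;
import static android.support.test.espresso.action.ViewActions.click;
import static android.support.test.espresso.matcher.ViewMatchers.withId;

import android.support.test.espresso.contrib.AccessibilityChecks;
import org.junit.BeforeClass;
import org.junit.Test;

// Once enabled, every ViewAction run through Espresso also runs accessibility
// checks (touch target size, missing content descriptions, contrast, ...)
// on the view hierarchy and fails the test on violations.
public class LoginActivityAccessibilityTest {

  @BeforeClass
  public static void enableAccessibilityChecks() {
    AccessibilityChecks.enable();
    // Or check the entire screen, not just the view being acted on:
    // AccessibilityChecks.enable().setRunChecksFromRootView(true);
  }

  @Test
  public void loginFlowIsAccessible() {
    // The click itself triggers the accessibility checks.
    onView(withId(R.id.login)).perform(click());
  }
}
```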

Statistical Data Sampling

Celal Ziftci (Google) and Ben Greenberg (MIT graduate student)

Links: Video, Slides

It is common practice to use a sample of production data in tests. Examples are:

  • Sanity Test: Feed a sample of production data into your system to see if anything fails.
  • A/B Test: Take a large chunk of production data, run it through the current and new versions of your system, and diff the outputs for inspection.

To get a sample of production data, teams typically use ad-hoc solutions, such as:

  • Manually looking at the distribution of specific fields (e.g. numeric fields),
  • Choosing a totally random sample

However, these approaches have a serious downside: they can miss rare events (e.g. edge cases), which increases the risk of uncaught bugs in production. To mitigate this risk, teams choose very large samples. However, such large samples have even more downsides:

  • Rare events can still be missed,
  • Runtime of tests greatly increases,
  • Diffs are too large for a human being to comprehend, and there is a lot of repetition.

In this talk, we propose a novel statistical data sampling technique to "smartly" choose a "good" sample from production data that:

  • Guarantees rare events will not be missed,
  • Minimizes the size of the chosen sample by eliminating duplicates.

Our technique catches rare/boundary cases, keeps the sample size to a minimum, and implicitly reduces the manual burden on developers of looking at test outputs/diffs. It also supports parallel execution (e.g. MapReduce) so that vast amounts of data can be processed in a short time frame to choose the sample.
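The talk's actual statistical technique isn't reproduced in this abstract, but the goal can be illustrated with a simple signature-based deduplication sketch (the field handling below is a made-up example): keep one representative record per distinct signature, so every category in the data, however rare, appears in the sample exactly once.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration of sampling by deduplication: records with the same coarse
// "signature" are considered equivalent, and one representative is kept.
public class RepresentativeSample {

  // Example signature: which fields are set, plus order-of-magnitude buckets
  // for numeric fields. Rare shapes of data get their own signature.
  static String signature(Map<String, Object> record) {
    StringBuilder sig = new StringBuilder();
    record.forEach((field, value) -> {
      if (value instanceof Number) {
        long v = ((Number) value).longValue();
        sig.append(field).append(":10^")
           .append((int) Math.log10(Math.max(1, v))).append(';');
      } else {
        sig.append(field).append(value == null ? ":null;" : ":set;");
      }
    });
    return sig.toString();
  }

  static List<Map<String, Object>> sample(List<Map<String, Object>> production) {
    Map<String, Map<String, Object>> bySig = new LinkedHashMap<>();
    for (Map<String, Object> record : production) {
      bySig.putIfAbsent(signature(record), record);  // first of each kind wins
    }
    return new ArrayList<>(bySig.values());
  }
}
```

Because records are keyed by signature independently, this style of sampling parallelizes naturally, e.g. as a MapReduce that shuffles on the signature and keeps one record per key.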

Nest Automation Infrastructure

Usman Abdullah (Nest), Giulia Guidi (Nest) and Sam Gordon (Nest)

Links: Video, Slides

Nest’s vision for the Thoughtful Home involves interconnected, intelligent devices working together to make your home safer, more energy efficient, and more aware. This talk will focus on the automation infrastructure and test tools that have been built to help make that vision a reality. Various teams within Nest have been working on both cross-platform and device/feature-specific systems for automated regression testing and analysis. Using specific examples from real-world product testing, we will cover cross-product hardware-in-the-loop testing infrastructure and power regression analysis tools, along with camera- and motion-detection-specific tool sets.

Event Generators

Roussi Roussev (Splunk)

Links: Video, Slides

This talk covers our experiences in developing and using software event generators at Splunk. Inspired by particle physics where event generators have been indispensable in understanding the physical world without running large experimental machines, log generators have improved the way we test our numerous integrations with modern and legacy third-party software. The talk covers the basic functionality and the challenges in generating realistic logs.
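As a flavor of what such a generator does (the format and distributions below are invented for illustration, not Splunk's), here is a toy access-log generator that skews status codes and latencies so that rare error events still show up in test data:

```java
import java.time.Instant;
import java.util.Random;

// Toy log event generator: emits realistic-looking access-log lines with a
// skewed status distribution, so integrations can be tested without waiting
// for real traffic to produce rare error events.
public class AccessLogGenerator {
  private static final Random rnd = new Random();
  private static final String[] PATHS = {"/", "/login", "/search", "/api/items"};

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      // Mostly 200s, occasional 500s; errors also get slower latencies.
      int status = rnd.nextInt(100) < 97 ? 200 : 500;
      System.out.printf("%s %d.%d.%d.%d \"GET %s\" %d %dms%n",
          Instant.now(),
          10, rnd.nextInt(256), rnd.nextInt(256), rnd.nextInt(256),
          PATHS[rnd.nextInt(PATHS.length)],
          status,
          status == 500 ? 2000 + rnd.nextInt(3000) : rnd.nextInt(200));
    }
  }
}
```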

Multithreaded Test Synthesis

Murali Krishna Ramanathan (Indian Institute of Science, Bangalore)

Links: Video, Slides

Subtle concurrency errors in multithreaded libraries that arise because of incorrect or inadequate synchronization are often difficult to pinpoint precisely using only static techniques. On the other hand, the effectiveness of dynamic detectors is critically dependent on multithreaded test suites whose execution can be used to identify and trigger concurrency bugs including data races, deadlocks and atomicity violations. Usually, such multithreaded tests need to invoke a specific combination of methods with objects involved in the invocations being shared appropriately to expose a bug. Without a priori knowledge of the bug, construction of such tests can be challenging.

In this talk, I will present a lightweight and scalable technique for synthesizing tests for detecting thread-safety violations. Given a multi-threaded library and a sequential test suite, I will describe a fully automated analysis that examines sequential execution traces, and produces as its output a concurrent client program that drives shared objects via library method calls to states conducive for triggering a concurrency bug. Experimental results on a variety of well-tested Java libraries demonstrate the effectiveness of our approach in revealing many complex bugs.
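The output of such an analysis is a small concurrent client program. The hand-written example below (illustrative, not the tool's output) shows the shape of such a client: two threads drive a shared object through method calls seen in sequential tests, exposing a thread-safety violation in the non-thread-safe ArrayList:

```java
import java.util.ArrayList;
import java.util.List;

// Two threads invoke the same library method on a shared object. ArrayList is
// not thread-safe, so most runs either lose updates (size < 20,000) or throw
// ArrayIndexOutOfBoundsException during an internal resize.
public class SynthesizedClient {
  public static void main(String[] args) throws InterruptedException {
    List<Integer> shared = new ArrayList<>();
    Runnable payload = () -> {
      for (int i = 0; i < 10_000; i++) shared.add(i);
    };
    Thread t1 = new Thread(payload);
    Thread t2 = new Thread(payload);
    t1.start(); t2.start();
    t1.join(); t2.join();
    // A dynamic detector (or this check) flags the thread-safety violation.
    if (shared.size() != 20_000)
      System.out.println("Thread-safety violation: size = " + shared.size());
  }
}
```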

Enabling Streaming Experiments at Netflix

Minal Mishra (Netflix)

Links: Video, Slides

The streaming experience of our 69+ million users is of paramount importance to Netflix. In order to improve it swiftly, we moved the adaptive streaming algorithms into the JavaScript layer. This posed a unique challenge: frequently releasing client JavaScript software that directly impacts consumers' streaming experience. Borrowing the continuous delivery paradigm, which has been widely and successfully adopted for service applications, we use it to retire risk over the lifecycle of a check-in and deliver updates frequently. In this talk we will describe a key component of this paradigm that enables software updates. We will dive into the rollout procedure for the JavaScript client and the tools used to accurately compare its health against the current version. We will also share the challenges we faced with this process.

Mock the Internet

Yabin Kang (LinkedIn)

Links: Video, Slides

This talk introduces a new mocking system at LinkedIn that mocks all outbound traffic for service-level integration tests. It also gives a brief overview of LinkedIn's overall mocking strategy and shares what we have learned along the way.
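LinkedIn's system itself isn't described in this abstract, but the underlying pattern is easy to sketch: stand up a local stub that impersonates an external service, and point the service under test at it, so integration tests never touch the real internet. A minimal JDK-only version (the endpoint and payload are made up):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// Local stub that impersonates an external HTTP dependency with a canned
// response, keeping service-level integration tests hermetic.
public class FakeExternalService {
  public static void main(String[] args) throws Exception {
    HttpServer stub = HttpServer.create(new InetSocketAddress(8081), 0);
    stub.createContext("/v1/profile", exchange -> {
      byte[] canned = "{\"id\": 123, \"name\": \"Test User\"}".getBytes();
      exchange.getResponseHeaders().set("Content-Type", "application/json");
      exchange.sendResponseHeaders(200, canned.length);
      try (OutputStream os = exchange.getResponseBody()) { os.write(canned); }
    });
    stub.start();
    // Configure the service under test with http://localhost:8081 instead of
    // the real upstream, then run the integration test suite as usual.
  }
}
```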

Effective Testing of a GPS Monitoring Station Receiver

Andrew Knodt (Lockheed Martin)

Links: Video, Slides

The existing GPS monitoring stations used by the Air Force have become difficult to maintain, and an effort is underway to replace them with a GPU-accelerated Software Defined Radio (SDR) approach. An overview of the unique testing challenges of this specialized GPS receiver, along with an examination of several testing approaches, will be presented. While focused on a GPS application, these testing approaches could easily be applied to other production-level SDR efforts.

Automation on Wearable Devices

Anurag Routroy (Intel)

Links: Video, Slides

As wearable technology rises in personal and business use, companies with a solid presence in the Android market have shifted their focus to this emerging technology, creating apps with wearable support. This increases the effort required to test those apps on wearable devices, so automation on wearables becomes important for reducing testing effort and increasing efficiency.

Unified Infra and CI Integration Testing (Docker/Vagrant)

Maxim Guenis (Supersonic)

Links: Video, Slides

Developers struggle every day to have a working local development environment ready for developing, debugging, and going through the continuous integration cycle. We solve that by integrating Docker and Vagrant with our CI tool. This combination allows applications to be controlled at the stack level on development machines, while the same stack can be used in integration tests. In this talk we will discuss:

  • Using Docker in CI integration tests
  • Controlling the whole stack rather than a single container or app
  • Version-controlling development and test environments, easily distributed with Git and Docker tools
  • Seamless support for running Docker on Mac and Windows

Eliminating Useless Test Bits

Patrick Lam (University of Waterloo)

Links: Video, Slides

Specializing static analysis techniques for test suites has yielded interesting results. We've previously learned that most tests are simple straight-line code: a sequence of setup statements followed by a payload consisting of asserts. We show how static analysis can identify useless setup statements, enabling developers to simplify and speed up their test cases.
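As an illustration (our example, not one from the study), the first two statements below build state that never flows into the asserted payload; the analysis would flag them as useless setup that can be deleted:

```java
import static org.junit.Assert.assertEquals;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.junit.Test;

public class SetupSliceTest {
  @Test
  public void joinsNames() {
    Map<String, Integer> counts = new HashMap<>(); // useless: never read below
    counts.put("widget", 3);                       // useless: mutates dead state

    List<String> names = new ArrayList<>();        // needed setup
    names.add("Ada");
    names.add("Grace");

    assertEquals("Ada,Grace", String.join(",", names)); // the payload
  }
}
```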

Coverage is Not Strongly Correlated with Test Suite Effectiveness

Laura Inozemtseva (University of Waterloo)

Links: Video, Slides

The coverage of a test suite is often used as a proxy for its ability to detect faults. However, previous studies that investigated the correlation between code coverage and test suite effectiveness have failed to reach a consensus about the nature and strength of the relationship between these test suite characteristics. Moreover, many of the studies were done with small or synthetic programs, making it unclear whether their results generalize to larger programs, and some of the studies did not account for the confounding influence of test suite size. We have extended these studies by evaluating the relationship between test suite size, coverage, and effectiveness for realistic Java programs; our study is the largest to date in the literature. We measured the statement coverage, decision coverage, and modified condition coverage of these suites and used mutation testing to evaluate their fault detection effectiveness. We found that there is a low to moderate correlation between coverage and effectiveness when the number of test cases in the suite is controlled for. In addition, we found that stronger forms of coverage do not provide greater insight into the effectiveness of the suite.

Fake Backends with RpcReplay

Matt Garrett (Google)

Links: Video, Slides

Keeping tests fast and stable is critically important. This is hard when servers depend on many backends. Developers must choose between long and flaky tests, or writing and maintaining fake implementations. Instead, tests can be run using recorded traffic from these backends. This provides the best of both worlds, allowing developers to test quickly against real backends.
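A minimal sketch of the record/replay idea (illustrative only, not the internal implementation):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Record/replay stub for a backend: in record mode, calls go through to the
// real backend and the responses are saved keyed by request; in replay mode
// (the default for tests), responses come from the recording instead.
public class ReplayingStub<Req, Resp> {
  private final Map<Req, Resp> recording = new HashMap<>();
  private final Function<Req, Resp> realBackend;  // used only while recording
  private final boolean recordMode;

  public ReplayingStub(Function<Req, Resp> realBackend, boolean recordMode) {
    this.realBackend = realBackend;
    this.recordMode = recordMode;
  }

  public Resp call(Req request) {
    if (recordMode) {
      Resp response = realBackend.apply(request);
      recording.put(request, response);  // persist to disk in a real system
      return response;
    }
    Resp recorded = recording.get(request);
    if (recorded == null)
      throw new IllegalStateException("No recorded response for: " + request);
    return recorded;
  }
}
```

One recording run hits the real backends; subsequent test runs replay from the recording, so they stay fast and hermetic while still reflecting real backend behavior.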

ChromeOS Test Automation Lab

Simran Basi (Google) and Chris Sosa (Google)

Links: Video, Slides

ChromeOS currently ships on 60+ different Chromebooks/boxes, each running its own software. In the field, customers get a fresh system every 6 weeks. This would not be possible without a robust continuous integration system vetting check-ins from our 200+ developers. In this talk, we describe the overall architecture, with specific emphasis on our test automation lab. In addition, we discuss Moblab (short for mobile (test) lab), our entire test automation infrastructure running from one Chromebox. This system is used by many of our partners so that they too can run tests the way we do.