Black box testing is the process of trying to discover flaws in a software product without explicit knowledge of the source code behind it. This method of testing is the industry standard for finding flaws in consumer applications, scripting languages, APIs, and other UI-driven products. It contrasts with white box testing, a method of testing based on the source code. Turning a blind eye to the source code gives black box testing numerous benefits over its cousin that make it particularly well suited to larger software houses.

First and foremost, it separates the QA team from the development team. Keeping your source code churn and dispersal to a minimum (open source or proprietary is not under scrutiny here) allows you to employ people who will become expert users of a product, learning its intricacies fully. This frees them from having to learn the dark wizardry of the code, so they can focus on problem solving and on learning the product inside and out. In my experience, a headcount ratio of 2:1 between testing and development seems to be a functional and resourceful guideline: one person (the developer) to think of what needs to be done, and two people looking over his or her shoulder to figure out what might have been missed.

Black box testers have a different skill set than your average developer. Their cleverness and problem solving skills take different and almost antithetical forms. A developer seeks to solve a problem in a solid and extensible way, taking performance and project design goals into consideration. Your testing engineer seeks to find clever ways to dodge the safeguards of a developer and to find holes (however obvious or obscure they are) in the execution of a program through repeated use. It is widely held that developers cannot test their own code. When creating something, a crafter has an intrinsic sense of what would hurt it, and thus tends to shy away from doing those things. Would you stand on a delicate statue you built? Of course not; it was never designed for such a thing. A tester is less nice with a developer’s "baby". Even when the cause is a design decision that leans toward certain performance strengths (rather than laziness about bugs), developers tend to have a fairly good idea of what would be good for the code, and what would be bad. Having a set of eyes to represent the consumer's usage and interests makes all the difference to a piece of software, come ship-time.

Black box testers are oftentimes called "the conscience of a project". Their job is quality through hard perseverance and strong, solid work. Rather than finishing code tasks and meeting shipping deadlines, the job of a QA team is to make a product as good as it can be, almost to the point of being a thorn in the side of development and project leadership. Testing is also responsible for focusing on what is actually important. Stomping on ant bugs ("the program crashes whenever I create a gigantic square, and fill it with yellow blinking text, and then resize all of the windows really quickly ten times") is a waste of people's time, and a good tester knows in his or her heart where that line is.

Black box testing itself is a difficult task. It requires a lot of discipline and a lot of actual product use. Because testers are locked out of the source code, most of the actual testing needs to be done by hand; automation is quite tough in a UI application. You can make the OS dance over your application like a puppet, but in increasingly large and complex applications (such as an OS), the automation system itself can cause errors in the system under test.

Complexity favors black box testing, as do older codebases. Expecting a person to become familiar with 20 million lines of code is nigh impossible. On some projects, no single person could possibly understand that many lines, yet a typical tester could exercise, say, 5 million of them in any given test pass. Black box testing is as much about simplification as it is about being grounded in the real world. A person performing this testing is more likely to find the obscure bugs that live in the interactions between multiple code bases (drag and drop, data import, etc.) than is a person attempting to eyeball test cases from code. In fact, many items in code simply do not manifest themselves fully until the OS gets through with its devious ways.

This is particularly the case with UIs and interfaces. A dialog box written in code could be perfectly lined up, fit and trim, until someone on a different resolution messes it up, or until the window manager's font engine chokes on the choices you gave it and reverts to the default. By and large, testing visual items such as bitmaps, fonts, clip art, and drawing gains little from source code knowledge. A person with even a basic sense of look-and-feel can find these flaws. Perhaps the loading time is not good, or items do not drag the way they should under certain conditions. There are many things that can go wrong with an application, and it is up to the testers to find them.

Black box testing is not always ideal. There are some cases where it pales in comparison to white box testing:
  • Small applications or small teams: When there is not much code, it's fairly easy to audit every line rather than go through all of the interactions. By small, think teams of fewer than five people and projects of 50K lines or less. The ramp-up time for black box testers on projects of this size is about the same as on larger apps, so the up-front cost is harder to justify.
  • Protocols, kernels, and mission-critical apps: Black box testing oftentimes leaves holes in weird places. While that is fine for consumer end apps (where holes are acceptable, and oftentimes necessary to ship within your lifetime), for things that have zero tolerance for bugs, white box is better. It is worth the time to enumerate the code paths and to take a really close look at each of them.

Otherwise, black box testing has strengths that make it the industry standard choice for smart and fast development of large software projects. Apple, Microsoft, Adobe, and many others use it. Tips for black box testers:
  • Make a thorough plan, and stick to it: Make a test plan, and make it good, sometimes going so far as to list out huge permutations of items and run through them. Stick to it. It can sometimes be dull work, but the product that comes out the other side is rock solid.
  • Assume nothing about the underlying code: Assuming that certain items in the code will work a certain way is a sure way to cause gaps in your coverage. Testing should have no knowledge of any sort of sacrifices or hacks that development had to put into place for whatever reason. It's tough to understand what goes on in the trenches without being there. Turn a blind eye, and don't assume anything; it's on the other side of the Chinese wall.
  • Work feature by feature, and test their interactions separately: If you can't break something up by feature (such as an API set, or font package), break it down into parts and test their interactions atomically. Assume no relation between parts, even if they do seem like they are the same. "Save" and "Save As" can be two different beasts... just because they save doesn't mean they go to the same places, for instance.
  • Reproducibility and the fewest steps always win: Whereas in white box testing you can find "theoretical bugs", the only bugs that are good in black box testing are the ones that happen with some degree of certainty, or that come with steps the developer can trace through to get there. Otherwise, it's a wasted effort.
  • Worship the assertion: The "assertion failed!" message is great because it comes straight from the programmer, and it is usually quite easy to trace back to something bad happening. It's a gimme that something is going wrong, or about to go wrong. "Debug" or "checked" builds of software are invaluable for that reason alone.
  • Be a user: "Dogfood" the product and use it in common situations. Do what a user would do. Come up with user scenarios that show why a fix should be a priority. Don't suffer from pride in your bugs, because that can detract from the greater focus on the product.
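
To make the first and third tips a little more concrete, here is a minimal sketch in Python of how one might list out the permutations for a test plan and the feature pairs whose interactions deserve their own pass. Everything in it is an assumption made up for illustration: the editor being tested, the dimension names and values, and the feature list are all hypothetical, and a real plan would be driven by the actual product.

```python
from itertools import combinations, product

# Hypothetical test dimensions for an imaginary document editor.
# None of these names come from a real product.
DIMENSIONS = {
    "file_format": ["txt", "rtf", "pdf"],
    "resolution": ["800x600", "1920x1080"],
    "font": ["default", "custom"],
}

# Hypothetical feature list, used only for interaction testing.
FEATURES = ["save", "save_as", "drag_and_drop", "data_import"]


def build_test_matrix(dimensions):
    """Return every combination of dimension values as a list of dicts."""
    names = list(dimensions)
    return [dict(zip(names, values)) for values in product(*dimensions.values())]


def build_interaction_pairs(features):
    """Return every unordered pair of features to be tested together."""
    return list(combinations(features, 2))


matrix = build_test_matrix(DIMENSIONS)
pairs = build_interaction_pairs(FEATURES)

# Print a checklist a tester could walk through by hand.
for number, case in enumerate(matrix, start=1):
    print(f"case {number:02d}: {case}")
for first, second in pairs:
    print(f"interaction: {first} + {second}")
```

With these made-up inputs the matrix has 3 x 2 x 2 = 12 cases and there are 6 feature pairs. The full product grows very quickly as dimensions are added, so on a real project you would likely have to prune it by hand to the combinations that matter.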

In closing, the discussion of black box vs. white box is not one of open source vs. proprietary software. In fact, many open source applications are tested essentially black box. Your QA engineers and your core programmers have entirely different skill sets and goals, as mentioned above. The distinction is one of methodology, and transcends development license. Testing teams are just as important in the software creation process as the people writing the code, because they save those people time, and for many projects in the industry black box testing is the method of choice.