Clean Code

Table of Contents

NOTE: TOC and these notes do not strictly correspond to the order used in the book itself.

Clean Code

  • Code is really the language in which we ultimately express the requirements.
  • It is unprofessional for programmers to bend to the will of managers who don’t understand the risks of making messes.
  • We are authors. Ratio of time spent reading vs. writing is well over 10:1. Making it easy to read actually makes it easier to write.
  • The Boy Scout rule: Leave the campground cleaner than you found it. Isn’t continuous improvement an intrinsic part of professionalism?

What Is Clean Code? (Answers from experts)

  • Bad code tempts the mess to grow! When others change bad code, they tend to make it worse.
  • Clean code is focused. Each function, each class, each module exposes a single-minded attitude that remains entirely undistracted, and unpolluted, by the surrounding details.
  • Clean code can be read, and enhanced by a developer other than its original author. It has unit and acceptance tests. It has meaningful names. It provides one way rather than many ways for doing one thing. It has minimal dependencies, which are explicitly defined, and provides a clear and minimal API. Code should be literate since depending on the language, not all necessary information can be expressed clearly in code alone.
  • Runs all the tests; Contains no duplications; Expresses all the design ideas that are in the system; Minimizes the number of entities such as classes, methods, functions, and the like.

Prequel and Principles

  • Single Responsibility Principle: every module, class, or function should have responsibility over a single part of the program’s functionality, and it should encapsulate that part.
  • Open Closed Principle: software entities (modules, classes, functions) should be open for extension but closed for modification.
  • Dependency Inversion Principle: Classes should depend on abstractions, not on concrete details.

Meaningful Names

  • Use Intention-Revealing Names. Choosing good names takes time but saves more time than it takes.

  • Avoid Disinformation

  • Meaningful Distinctions. It is not sufficient to add number series or noise words, even though the compiler is satisfied. If names must be different, then they should also mean something different.

  • Use Pronounceable Names. Intelligent conversation is now possible.

  • Use Searchable Names.

    • Single-letter names and numeric constants have a particular problem that they are not easy to locate across a body of text.
    • The length of a name should correspond to the size of its scope.
  • Avoid Encodings

    • Hungarian Notation
    • Member Prefixes
    • I prefer to leave interfaces unadorned (no I prefix).
  • Avoid Mental Mapping. One difference between a smart programmer and a professional programmer is that the professional understands that clarity is king.

  • Class Names: nouns or noun phrases. Avoid words like Manager, Data, Processor, or Info in the name of a class.

  • Method Names: verbs or verb phrase. Accessors, mutators, predicates should have get, set or is prefix.

  • Don’t Be Cute

    • Don’t tell little culture-dependent jokes.
    • Say what you mean. Mean what you say.
    • Pick One Word per Concept. Using the same term for two different ideas is essentially a pun.
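
The naming guidelines above can be sketched with a small before/after example. All class, method, and constant names here are hypothetical, chosen only for illustration; the commented-out loop shows the kind of terse code the rules argue against.

```java
class TaskEstimates {
    // Searchable, intention-revealing constants instead of bare 4 and 5.
    static final int REAL_DAYS_PER_IDEAL_DAY = 4;
    static final int WORK_DAYS_PER_WEEK = 5;

    // Before (hard to search, hard to pronounce):
    //   int s = 0; for (int j = 0; j < t.length; j++) s += (t[j] * 4) / 5;
    static int sumRealTaskWeeks(int[] taskEstimatesInIdealDays) {
        int sum = 0;
        for (int idealDays : taskEstimatesInIdealDays) {
            int realTaskDays = idealDays * REAL_DAYS_PER_IDEAL_DAY;
            sum += realTaskDays / WORK_DAYS_PER_WEEK;
        }
        return sum;
    }
}
```

The short loop variable is acceptable because its scope is a few lines, while the constants have class-wide scope and get long, searchable names.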

Functions

  • Duplication may be the root of all evil in software.
  • Writing clean software is like any other kind of writing. When you write a paper or an article you get your thoughts down first, then you massage it until it reads well.
  • Master programmers think of systems as stories to be told rather than programs to be written. They use the facilities of their chosen programming languages to construct a much richer and more expressive language that can be used to tell that story.
  • Your real goal is to tell the story of the system.

Small!

  • Functions should not be 100 lines long. Functions should hardly ever be 20 lines long.
  • Every function in this program was just two, or three, or four lines long. Each was transparently obvious. Each told a story. And each led you to the next in a compelling order.
  • The blocks within if, else, while etc. statements should be one line long.
  • The indent level of a function should not be greater than one or two.

Do One Thing

  • FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY.
  • Sections within Functions: This is an obvious symptom of doing more than one thing. Functions that do one thing cannot be reasonably divided into sections.
  • One Level of Abstraction per Function

Use Descriptive Names

  • The smaller and more focused a function is, the easier it is to choose a descriptive name.
  • Don’t be afraid to make a name long. A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.

Function Arguments

  • The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided where possible. More than three (polyadic) requires very special justification - and then shouldn’t be used anyway.
  • Arguments are hard. They take a lot of conceptual power.
  • Arguments are even harder from a testing point of view. If there are no arguments, this is trivial. If there’s one argument it’s not too hard.
  • assertEquals might be better written as assertExpectedEqualsActual(expected, actual). This strongly mitigates the problem of having to remember the ordering of the arguments.

Output Arguments

  • Are harder to understand than input arguments.
  • Using an output argument instead of a return value for a transformation is confusing. If a function is going to transform its input argument, then the transformation should appear as the return value.
  • In general output arguments should be avoided. If your function must change the state of something, have it change the state of its owning object (use OOP).
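
A minimal sketch of the two alternatives to an output argument described above. The `Report` and `Texts` classes and the footer text are hypothetical names invented for this example.

```java
class Report {
    private final StringBuilder text = new StringBuilder();

    Report(String body) {
        text.append(body);
    }

    // Instead of a free function appendFooter(report) that mutates its
    // argument, the object changes its own state.
    void appendFooter() {
        text.append("\n-- end of report --");
    }

    String text() {
        return text.toString();
    }
}

class Texts {
    // A transformation returns its result rather than writing into an
    // output argument.
    static String withTrailingNewline(String input) {
        return input.endsWith("\n") ? input : input + "\n";
    }
}
```

A call such as `report.appendFooter()` leaves no doubt about what is being modified, which `appendFooter(report)` does not.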

Flag Arguments

  • Flag arguments are ugly. Passing a boolean into a function is a truly terrible practice. It does one thing if the flag is true and another if the flag is false!
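
One way to remove a flag argument is to split the function in two, one per behaviour. The sketch below assumes a hypothetical `render(String page, boolean isSuite)` as the starting point; the names and markup are illustrative only.

```java
class Renderer {
    // Before (hypothetical): static String render(String page, boolean isSuite)
    // After: one function per behaviour, so call sites read unambiguously.
    static String renderForSuite(String page) {
        return "<suite>" + page + "</suite>";
    }

    static String renderForSingleTest(String page) {
        return "<test>" + page + "</test>";
    }
}
```

A reader seeing `Renderer.renderForSuite(page)` does not need to look up what `true` would have meant.")) throw new AssertionError("single");
</test>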

Argument Objects

  • Reducing the number of arguments by creating objects out of them may seem like cheating, but it’s not. Likely it is a part of a concept that deserves a name of its own.
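
As a sketch of an argument object: a hypothetical three-argument `makeCircle(double x, double y, double radius)` becomes a two-argument constructor once `x` and `y` are recognised as a concept of their own. The class names are assumptions made for this example.

```java
final class Point {
    final double x;
    final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

final class Circle {
    final Point center;
    final double radius;

    // x and y travel together, so they are passed as one named concept.
    Circle(Point center, double radius) {
        this.center = center;
        this.radius = radius;
    }

    double area() {
        return Math.PI * radius * radius;
    }
}
```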

Have No Side Effects

  • Side effects are lies. Your function promises to do one thing, but it also does another hidden thing.
  • Anything that forces you to check the function signature is equivalent to a double-take. It’s a cognitive break and should be avoided.

Command/Query Separation

  • Functions should either do something or answer something, but not both.
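
A small sketch of command/query separation, using a hypothetical `Attributes` class. The query answers a question without changing anything; the command changes state and returns nothing.

```java
import java.util.HashMap;
import java.util.Map;

class Attributes {
    private final Map<String, String> values = new HashMap<>();

    // Query: answers a question, changes nothing.
    boolean attributeExists(String name) {
        return values.containsKey(name);
    }

    // Command: changes state, returns nothing.
    void setAttribute(String name, String value) {
        values.put(name, value);
    }

    // Query: reads the current value (null if the attribute was never set).
    String attribute(String name) {
        return values.get(name);
    }
}
```

A caller then writes `if (attributes.attributeExists("username")) { attributes.setAttribute("username", "unclebob"); }`, which reads without ambiguity, unlike a single `set` method that both mutates and returns a boolean.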

Prefer Exceptions to Returning Error Codes

  • Returning error codes from command functions is a subtle violation of command/query separation.
  • Extract Try/Catch Blocks into functions of their own.
  • Error Handling Is One Thing. Functions should do one thing. Thus, a function that handles errors should do nothing else. This implies that if the keyword try exists in a function, it should be the very first word in the function and that there should be nothing after the catch/finally blocks.
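
The points above can be sketched as follows, assuming a hypothetical in-memory `PageRegistry`. The public `delete` does nothing but error handling, and the normal path lives in its own function.

```java
import java.util.HashMap;
import java.util.Map;

class PageRegistry {
    private final Map<String, String> pages = new HashMap<>();
    private final StringBuilder log = new StringBuilder();

    void add(String name, String content) {
        pages.put(name, content);
    }

    // Error handling is this function's only job: try is the first word,
    // and nothing follows the catch block.
    void delete(String name) {
        try {
            deletePageAndReferences(name);
        } catch (IllegalArgumentException e) {
            logError(e);
        }
    }

    // The normal path, free of error-handling clutter.
    private void deletePageAndReferences(String name) {
        if (pages.remove(name) == null) {
            throw new IllegalArgumentException("No such page: " + name);
        }
    }

    private void logError(Exception e) {
        log.append(e.getMessage()).append('\n');
    }

    boolean contains(String name) {
        return pages.containsKey(name);
    }

    String errorLog() {
        return log.toString();
    }
}
```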

Comments

  • Don’t comment bad code - rewrite it.
  • The older a comment, and the farther away it is from the code it describes, the more likely it is to be just plain wrong. Programmers can’t realistically maintain them.
  • Inaccurate comments are far worse than no comments at all.
  • Comments do not make up for bad code. “Ooh, I’d better comment that!” No! You’d better clean it!
  • Explain yourself in code:
// Check to see if the employee is eligible for full benefits
if ((employee.flags & HOURLY_FLAG) &&
(employee.age > 65))

vs.

if (employee.isEligibleForFullBenefits())
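
For completeness, a sketch of what could sit behind `isEligibleForFullBenefits()`. The flag value and the field layout are assumptions; the notes only show the call site.

```java
class Employee {
    // Assumed values: the notes do not define the flag or name the threshold.
    static final int HOURLY_FLAG = 0x1;
    static final int FULL_BENEFITS_AGE = 65;

    private final int flags;
    private final int age;

    Employee(int flags, int age) {
        this.flags = flags;
        this.age = age;
    }

    // The method name carries the explanation the comment used to carry.
    boolean isEligibleForFullBenefits() {
        return isHourly() && age > FULL_BENEFITS_AGE;
    }

    private boolean isHourly() {
        return (flags & HOURLY_FLAG) != 0;
    }
}
```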

Good Comments

  • Legal Comments
  • Informative Comments
  • Explanation of intent
  • Clarification. Before writing, take care there is no better way and then take even more care that they are accurate.
  • Warning of Consequences
  • TODO comments. This should not be an excuse to leave bad code in the system.
  • Amplifications. Javadocs can be just as misleading, nonlocal, and dishonest as any other kind of comment.

Bad Comments

  • Mumbling
  • Redundant Comments
  • Misleading Comments
  • Mandated Comments. It’s plain silly to have a rule that says that every function must have a javadoc, or every variable must have a comment.
  • Journal Comments (see Tanzer’s code).
  • Noise Comments
  • Don’t use a comment when you can use a function or a variable.
  • Position Markers
  • Closing Brace Comments. Try to shorten your function instead.
  • Attributions and bylines
  • Commented-Out code
  • HTML Comments. It should be the responsibility of the tool that generates HTML documentation to transform them.
  • Nonlocal information.
  • Too much information. Don’t put interesting historical discussions or irrelevant descriptions of details into your comments.
  • Inobvious Connection
  • Function headers
  • Javadocs in nonpublic code

Formatting

  • Team should agree on a single set of formatting rules and all members should comply.
  • Have an automated tool that applies those formatting rules for you.
  • Code formatting is about communication and communication is the professional developer’s first order of business.

Vertical Formatting

  • Small files are usually easier to understand than large files.
  • Vertical openness between concepts <-> vertical density.
  • Closely related concepts should be kept vertically close to each other.
  • Variables should be declared as close to their usage as possible. Because our functions are very short, local variables should appear at the top of each function.
  • If one function calls another, they should be vertically close and the caller should be above the callee, if possible.

Horizontal Formatting

  • Keep lines short: 100 or 120 characters at most. Beyond that it is probably just careless.
  • Use horizontal white space to associate things that are strongly related and disassociate things that are more weakly related.
  • Horizontal alignment (see Tanzer) is not useful. It seems to emphasize the wrong things and may lead away the eye from the true intent.
  • If we have a long list that needs to be aligned the problem is the length of the list, not the lack of alignment.
  • Avoid collapsing scopes down to one line (if, while … in C/Java).
  • A good software system is composed of a set of documents that read nicely. They need to have a consistent and smooth style.

Objects and Data Structures

  • There’s a reason that we keep our variables private. We don’t want anyone else to depend on them.
  • Hiding implementation is about abstractions. We do not want to expose the details of our data. Rather we want to express our data in abstract terms.
  • Serious thought needs to be put into the best way to represent the data an object contains. The worst option is to blithely add getters and setters.

Data/Object Anti-Symmetry

  • Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.
  • OOP makes it easy to add new classes without changing existing functions. Procedural code makes it easy to add new functions without changing existing data structures.
  • Understand this and choose the approach that is best for the job at hand.
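
A sketch of the object-oriented side of this trade-off, with hypothetical shape classes. Adding a new `Shape` implementation requires no change to `Geometry.totalArea`; adding a new operation such as a perimeter, on the other hand, would require touching every shape class.

```java
import java.util.List;

interface Shape {
    double area();
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    public double area() {
        return side * side;
    }
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    public double area() {
        return width * height;
    }
}

class Geometry {
    // Works unchanged for any future Shape implementation.
    static double totalArea(List<Shape> shapes) {
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}
```

The procedural alternative would keep `Square` and `Rectangle` as plain data structures and put a type switch inside `Geometry`, which reverses the trade-off.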

Law of Demeter

  • A module should not know about the innards of the objects it manipulates.
  • Train Wrecks (a().b().c()): whether this is a violation of the Law of Demeter depends on whether the internal structure should be hidden or exposed. If a, b, and c are just data structures with no behaviour, the Law of Demeter doesn’t apply.
  • Hybrids make it hard to add new functions but also make it hard to add new data structures. Avoid creating them.
  • Hiding structure: If ctxt is an object, we should be telling it to do something; we should not be asking it about its internals.
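
A minimal "tell, don’t ask" sketch. The `Context` and `Options` classes, the method name, and the path layout are all hypothetical; the point is only that callers ask the context for what they need instead of navigating through its internals.

```java
class Options {
    private final String scratchDir;

    Options(String scratchDir) {
        this.scratchDir = scratchDir;
    }

    String scratchDir() {
        return scratchDir;
    }
}

class Context {
    private final Options options;

    Context(Options options) {
        this.options = options;
    }

    // Callers never see Options or the directory layout; they only say
    // what they want. Talking to its own field is not a Demeter violation.
    String scratchFilePathFor(String className) {
        return options.scratchDir() + "/" + className.replace('.', '/') + ".class";
    }
}
```

If the directory layout changes later, only `Context` has to change, not every caller.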

Data Transfer Objects

  • A DTO is a class with public variables and no functions.
  • Beans have private variables and are accessed by getters/setters. The quasi-encapsulation provides no benefit.
  • Active Records are a special form of DTO with some navigational methods like save and find. Don’t put business logic in them (it creates a hybrid), treat an Active Record as a data structure and create objects that contain the business rules and hide their internal data (probably just instances of the Active Record).

Error Handling

  • Error handling is important, but if it obscures logic, it’s wrong.
  • Separate different concerns: the algorithm for … and error handling.
  • It’s good practice to start with a try-catch-finally when you’re writing code that could throw exceptions. try blocks are like transactions; catch has to leave the program in a consistent state.
  • Write tests that force exceptions, then add behaviour to your handler to satisfy your tests.
  • Use unchecked exceptions. The price of checked exceptions is Open/Closed Principle violation.
  • Provide context with exceptions - a stack trace can’t tell the intent of the operation that failed.
  • Wrap 3rd party APIs, incl. custom exception structure.
  • Don’t return null. When we do, we’re essentially creating work for ourselves and foisting problems upon our callers.
  • Don’t pass null: returning null from methods is bad, passing null into methods is worse. In most programming languages there is no good way to deal with a null that is passed by a caller accidentally. Therefore, the rational approach is to forbid passing null by default.
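
The two null rules above can be sketched like this, using a hypothetical `EmployeeDirectory`. A lookup that finds nothing returns an empty list, and null arguments are rejected immediately at the entry point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class EmployeeDirectory {
    private final Map<String, List<String>> employeesByTeam = new HashMap<>();

    // Passing null is forbidden: fail fast with a clear message.
    void addTeam(String team, List<String> employees) {
        Objects.requireNonNull(team, "team must not be null");
        Objects.requireNonNull(employees, "employees must not be null");
        employeesByTeam.put(team, new ArrayList<>(employees));
    }

    // Never returns null: an unknown team simply has no employees.
    List<String> employeesOf(String team) {
        List<String> employees = employeesByTeam.get(team);
        if (employees == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(employees);
    }
}
```

Callers can iterate over the result of `employeesOf` without a null check.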

Boundaries

  • Do not pass Maps (or any other interface at a boundary) around your system. Wrap them.
  • Learning tests - to check our understanding of new APIs. They are free (we need to learn the API anyway) and they allow us to check if the 3rd party packages are working as expected on new releases.
  • Good SW design accommodates change without huge investment and rework. We should avoid letting too much of our code know about the third party particulars. It is better to depend on something you control than on something you don’t control, lest it end up controlling you.
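
A sketch of wrapping a `Map` at a boundary. `Sensor` and `Sensors` are illustrative names; the rest of the system sees only the narrow `Sensors` API, so the choice of `Map` (and its many methods, such as `clear()`) stays an implementation detail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class Sensor {
    final String id;

    Sensor(String id) {
        this.id = id;
    }
}

class Sensors {
    // The Map never leaves this class.
    private final Map<String, Sensor> sensorsById = new HashMap<>();

    void register(Sensor sensor) {
        sensorsById.put(sensor.id, sensor);
    }

    // Optional instead of null for a missing sensor.
    Optional<Sensor> findById(String id) {
        return Optional.ofNullable(sensorsById.get(id));
    }

    int count() {
        return sensorsById.size();
    }
}
```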

Unit Tests

  • Law 1: You may not write production code until you have written a failing unit test.
  • Law 2: You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
  • Law 3: You may not write more production code than is sufficient to pass the currently failing test.

Keeping Tests Clean

  • Having dirty tests is equivalent to, if not worse than, having no tests.
  • Test code is just as important as production code. It is not a second-class citizen. It requires thought, design, and care. It must be kept as clean as production code.

Clean Tests

  • What makes a clean test? Three things: Readability, readability, and readability. In unit tests readability is perhaps more important than in production code.
  • Readability means clarity, simplicity and density of expression.
  • Build-Operate-Check pattern. The first part builds up the test data, the second operates on that data, and the third checks that the operation yielded the expected results.
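
The Build-Operate-Check pattern can be sketched without any test framework. `Cart` is a hypothetical class under test, and the test method returns a boolean only to keep the example dependency-free; a real project would use its framework’s assertions.

```java
import java.util.ArrayList;
import java.util.List;

class Cart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) {
        pricesInCents.add(priceInCents);
    }

    int totalInCents() {
        int total = 0;
        for (int price : pricesInCents) {
            total += price;
        }
        return total;
    }
}

class CartTest {
    static boolean totalIsSumOfItemPrices() {
        // Build: set up the test data.
        Cart cart = new Cart();
        cart.add(250);
        cart.add(199);

        // Operate: perform the one operation under test.
        int total = cart.totalInCents();

        // Check: compare against the expected result.
        return total == 449;
    }
}
```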

Domain-Specific testing languages

  • Rather than using the APIs of the system directly we build up a set of functions and utilities that make use of those APIs and make tests more convenient to write and easier to read. These functions and utilities become a specialized API used by the tests.

A Dual Standard

  • The code within the testing API does have a different set of engineering standards than production code. It must be simple, succinct, and expressive, but it need not be as efficient as production code.

One Assert Per Test

  • The number of asserts in a test ought to be minimized.

Single Concept per Test

  • We want to test a single concept in each test function.

F.I.R.S.T.

  • Fast - tests should be fast
  • Independent - tests should not depend on each other
  • Repeatable - tests should operate in any environment. You should be able to run tests in the production environment, in the QA environment, and on your laptop while riding home on the train without a network. If your tests aren’t repeatable in any environment, you’ll always have an excuse for why they fail.
  • Self-Validating - tests should have a boolean output. Either they pass or fail.
  • Timely - unit tests should be written just before the production code that makes them pass.

Classes

  • There is seldom a good reason to have a public variable.
  • We like to put private utilities called by a public function right after the public function itself.

Small!

  • The first rule of classes is that they should be small. The second rule of classes is that they should be smaller than that.
  • Naming helps: if we cannot derive a concise name for a class, then it’s likely too large.
  • Class names including weasel words like Processor or Manager or Super often hint at unfortunate aggregation of responsibilities.
  • Getting software to work and making software clean are two very different activities.
  • The primary goal in managing complexity is to organize it so that a developer knows where to look for things and need only understand the directly affected complexity at any given time.
  • We want our system to be composed of many small classes, not a few large ones. Each small class encapsulates a single responsibility, has a single reason to change, and collaborates with a few others to achieve the desired system behaviour.

Cohesion

  • Classes should have a small number of instance variables.
  • Generally the more variables a method manipulates the more cohesive that method is to its class.
  • It is neither advisable nor possible to create maximally cohesive classes; on the other hand, we would like cohesion to be high.

Organizing for Change

Isolating from Change

  • We want to structure our systems so that we muck with as little as possible when we update them with new or changed features. In an ideal system, we incorporate new features by extending the system, not by making modifications to existing code.
  • The lack of coupling means that elements of our system are better isolated from each other and from change.

Systems

Separate Constructing a System from Using It

  • Construction is a very different process from use.
  • SW systems should separate the startup process (the main function), when the application objects are constructed and the dependencies “wired” together, from the runtime logic that takes over after startup.
  • Factories
  • Dependency Injection. Inversion of Control (IoC) moves secondary responsibilities from an object to other objects that are dedicated to that purpose, therefore supporting the Single Responsibility Principle.
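
A minimal constructor-injection sketch of the separation described above. `TimeSource`, `SessionTimer`, and `Application.wire()` are hypothetical names; in a real program `wire()` would be called once from `main`.

```java
// Abstraction the runtime logic depends on.
interface TimeSource {
    long nowMillis();
}

// Runtime logic: uses its dependency but never constructs it.
class SessionTimer {
    private final TimeSource timeSource;
    private final long startedAt;

    SessionTimer(TimeSource timeSource) {
        this.timeSource = timeSource;
        this.startedAt = timeSource.nowMillis();
    }

    long elapsedMillis() {
        return timeSource.nowMillis() - startedAt;
    }
}

// Construction: the only place that knows which concrete
// implementation is wired in.
class Application {
    static SessionTimer wire() {
        return new SessionTimer(System::currentTimeMillis);
    }
}
```

Because `SessionTimer` receives its `TimeSource`, a test can inject a fake clock and check elapsed time deterministically.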

Scaling Up

  • It’s a myth that we can get systems “right the first time”. Instead, we implement only today’s stories, then refactor and expand the system to implement new stories tomorrow. TDD, refactoring, and clean code make this work at the code level.
  • Software systems are unique compared to physical systems. Their architecture can grow incrementally, if we maintain the proper separation of concerns.

Pure Java AOP Frameworks

  • In Spring, you write your business logic as POJOs. POJOs are purely focused on their domain. They have no dependencies on enterprise frameworks (or any other domain). They allow for truly test driving the application, without doing a Big Design Up Front.
  • We can start a SW project with a “naively simple” but nicely decoupled architecture, deliver working user stories quickly, then add more infrastructure as we scale up.
  • A good API should largely disappear from view most of the time, so the team expends the majority of its creative efforts focused on the user stories being implemented. If not, then the architectural constraints will inhibit the efficient delivery of optimal value to the customer.

Optimize Decision Making

  • Postpone decisions until the last possible moment. This isn’t lazy or irresponsible; it lets us make informed choices with the best possible information.

Systems Need Domain-Specific Languages

  • If you are implementing domain logic in the same language that a domain expert uses, there is less risk that you will incorrectly translate the domain into implementation.

Conclusion

  • At all levels of abstraction, the intent should be clear.
  • Never forget to use the simplest thing that can possibly work.

Emergence

  • Is there a set of simple practices that can replace experience? Clearly not.

Rule 1: Runs All the Tests

  • Tight coupling makes it difficult to write tests.
  • OOP goal of low coupling and high cohesion. Writing tests leads to better designs.

Rule 2-4: Refactoring

No Duplication

  • Template-method helps
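
A sketch of how the Template Method pattern removes duplication: the shared algorithm is written once in the base class, and subclasses supply only the step that differs. The policy names and all numbers are invented for illustration.

```java
abstract class VacationPolicy {
    // Hypothetical accrual rule, written once for all policies.
    private static final int WORK_DAYS_PER_VACATION_DAY = 20;

    // Template method: the common algorithm, not overridable.
    final int accrueVacationDays(int daysWorked) {
        int earnedDays = daysWorked / WORK_DAYS_PER_VACATION_DAY;
        return Math.max(earnedDays, legalMinimumDays());
    }

    // The only varying step is left to subclasses.
    protected abstract int legalMinimumDays();
}

class USVacationPolicy extends VacationPolicy {
    protected int legalMinimumDays() {
        return 0; // assumed value for the example
    }
}

class EUVacationPolicy extends VacationPolicy {
    protected int legalMinimumDays() {
        return 20; // assumed value for the example
    }
}
```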

Expressive

  • Code should clearly express the intent of its author.
  • You can express yourself by choosing good names. We want to be able to hear a class or a function name and not be surprised when we discover its responsibilities.
  • You can also express yourself by keeping your functions and classes small. Small classes and functions are usually easy to name, easy to write, and easy to understand.
  • The most important way to be expressive is to try. Remember, the most likely next person to read that code will be you.
  • Take a little pride in your workmanship. Spend a little time with each of your functions and classes. Choose better names, split large functions into smaller functions and generally just take care of what you’ve created. Care is a precious resource.

Minimal Classes and Methods

  • High class and method counts are sometimes the result of pointless dogmatism.
  • Our goal is to keep our overall system small while we are also keeping our functions and classes small. Remember, however, that this rule is the lowest priority of the four rules of Simple Design. So, although it is important to keep class and function count low, it’s more important to have tests, eliminate duplication, and express yourself.

Concurrency

  • Objects are abstractions of processing. Threads are abstractions of schedule.
  • Decoupling what gets done from when it gets done. This can dramatically improve both the throughput and structure of an application.

Writing clean concurrent programs is hard - very hard.

  • The design of a concurrent algorithm can be remarkably different from the design of a single-threaded system.
  • Concurrency incurs some overhead, in both performance as well as writing additional code.
  • Concurrency bugs aren’t usually repeatable.
  • Think about shut-down early and get it working early. It’s going to take longer than you expect.

Concurrency Defense Principles

  • Keep your concurrency-related code separate from other code.
  • Take data encapsulation to heart; severely limit the access of any data that may be shared.
  • Use copies of data. Collect results from multiple threads and then merge the results in a single thread.
  • Partition data into independent subsets that can be operated on by independent threads, possibly on different processors.
  • Keep synchronized sections small.
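
Several of these principles can be combined in one sketch: the data is partitioned into independent slices, each task accumulates into its own local variable, and the partial results are merged in a single thread. `ParallelSum` is a hypothetical helper, and it assumes `partitions` is at least 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSum {
    static long sum(int[] values, int partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            List<Future<Long>> partialSums = new ArrayList<>();
            int chunk = (values.length + partitions - 1) / partitions;
            for (int p = 0; p < partitions; p++) {
                final int from = Math.min(values.length, p * chunk);
                final int to = Math.min(values.length, from + chunk);
                // Each task reads only its own slice and writes only to a
                // local variable, so no locking is needed.
                Callable<Long> task = () -> {
                    long partial = 0;
                    for (int i = from; i < to; i++) {
                        partial += values[i];
                    }
                    return partial;
                };
                partialSums.add(pool.submit(task));
            }
            // Merge the results in a single thread.
            long total = 0;
            for (Future<Long> partial : partialSums) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A partial sum failed", e);
        } finally {
            // Shut-down is handled from the start, not bolted on later.
            pool.shutdown();
        }
    }
}
```

The concurrency-related code is confined to this one class; callers see an ordinary `sum` method.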

Testing Threaded Code

  • Get your nonthreaded code working first.
  • Treat spurious failures as candidate threading issues. Don’t ignore system failures as one-offs.
  • Make your threaded code pluggable, so you can run it in various configurations.
  • Run with more threads than processors (cores), to encourage task switching.
  • Jiggle the code so that threads run in different orderings at different times. The combination of well-written tests and jiggling can dramatically increase the chance of finding errors.

Successive Refinement

  • You’d be able to read the code from top to the bottom without a lot of jumping around or looking ahead.
  • Programming is more a craft than it is a science. To write clean code, you must first write dirty code and then clean it.
  • Most freshman programmers think that the primary goal is to get the program working. Once it is “working” they move on to the next task, leaving the program in whatever state they got it to “work”. Most seasoned programmers know this is professional suicide.
  • “If the structure of this code was ever going to be maintainable, now was the time to fix it. So I stopped adding features and started refactoring”.
  • Refactoring is a lot like solving a Rubik’s cube. There are a lot of little steps required to achieve a large goal. Each step enables the next.
  • Bad schedules can be redone, bad requirements can be redefined. Bad team dynamics can be repaired. Bad code rots and ferments, becoming an inexorable weight that drags the team down.

Refactoring SerialDate

  • It is only through critiques that we learn. Doctors do it. Lawyers do it. Pilots do it. And we programmers need to learn how to do it too.

  • First make it work. Get the coverage and extend/make the unit tests work.

  • Then make it right.

  • Get your code into a form that’s easy to change.

Smells and Heuristics

Comments

  • Inappropriate information
  • Obsolete comment
  • Redundant comment
  • Poorly written comment
  • Commented-out code

Environment

  • Build requires more than one step
  • Running tests requires more than one step

Functions

  • Too many arguments
  • Output arguments
  • Flag arguments
  • Not used (dead) functions

General

  • Multiple languages in one source file.
  • Obvious behaviour is unimplemented - Principle of least surprise.
  • Incorrect behaviour at the boundaries. Don’t rely on your intuition. Prove that your code works in all the corner cases.
  • Overridden safeties - turned off failing tests, compiler warnings etc.
  • Duplication.
  • Code at wrong level of abstraction. Constants, variables, and utility functions that pertain only to the detailed implementation should not be present in the base class. Don’t mix higher and lower level concepts together. You cannot lie or fake your way out of a misplaced abstraction. Isolating abstractions is one of the hardest things that software developers can do, and there is no quick fix when you get it wrong.
  • Base classes depending on their derivatives.
  • Too much information. Well-defined modules have very small interfaces that allow you to do a lot with a little. Hide your data, hide your utility functions, hide your constants and hide your temporaries. Don’t create classes with a lot of methods and instance variables. Don’t create lots of protected variables and functions for your subclasses. Help keep coupling low by hiding information.
  • Dead code
  • Vertical separation
  • Inconsistency - goes back to the principle of least surprise.
  • Clutter.
  • Artificial coupling.
  • Feature envy. The methods of a class should be interested in the variables and functions of the class they belong to, and not the variables and functions of other classes.
  • Selector arguments. In general it is better to have many functions than to pass some code into a function to select behaviour.
  • Obscured intent.
  • Misplaced responsibility - code should be placed where a reader would naturally expect it to be.
  • Inappropriate static - there might be a reasonable chance that we’ll want the function to be polymorphic.
  • Use explanatory variables.
  • Function names should say what they do.
  • Understand the algorithm.
  • Make logical dependencies (i.e. assumptions, e.g. a constant) physical (calling a method providing the data).
  • Prefer polymorphism to If/Else or Switch/Case
  • Follow standard conventions - coding standards.
  • Replace magic numbers with named constants.
  • Be precise - don’t be lazy
    • expecting the first match to be the only is naive
    • don’t use floating point numbers to represent currency
    • use locks and/or transactions where concurrent updates are possible
    • don’t be too specific, e.g. declaring a variable as ArrayList when List is good enough
    • making all variables protected by default is not constraining enough
  • Structure over convention, e.g. abstract methods > switch with nicely named enumerations.
  • Encapsulate conditionals. Turn if (expected == null || actual == null || areStringsEqual()) into a method -> bool.
  • Avoid negative conditionals.
  • Functions should do one thing.
  • Hidden temporal couplings -> make explicit.
  • Don’t be arbitrary.
  • Encapsulate boundary conditions. We don’t want swarms of +1s and -1s all over the code. Encapsulate them in variables.
  • Functions should descend only one level of abstraction.
  • Keep configurable data at high levels.
  • Avoid transitive navigation (Law of Demeter - write shy code).
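
A few of the heuristics above (named constants instead of magic numbers, encapsulated conditionals, encapsulated boundary conditions, and exact arithmetic for currency) can be sketched together. `Paginator`, `Prices`, and the value 55 are hypothetical.

```java
import java.math.BigDecimal;

class Paginator {
    // Named constant instead of a magic number scattered through the code.
    static final int LINES_PER_PAGE = 55;

    private final int totalLines;

    Paginator(int totalLines) {
        this.totalLines = totalLines;
    }

    int pageCount() {
        return (totalLines + LINES_PER_PAGE - 1) / LINES_PER_PAGE;
    }

    // Encapsulated conditional: callers ask a question by name.
    boolean hasPageAfter(int pageIndex) {
        // Boundary condition captured once in a named variable,
        // instead of "+ 1" repeated at every use.
        int nextPageIndex = pageIndex + 1;
        return nextPageIndex < pageCount();
    }
}

class Prices {
    // BigDecimal keeps decimal amounts exact; double would not.
    static BigDecimal total(BigDecimal first, BigDecimal second) {
        return first.add(second);
    }
}
```

With `double`, `0.1 + 0.2 == 0.3` evaluates to false, which is why floating point is a poor fit for money.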

Java

  • Avoid long import lists by using wildcards.
  • Don’t inherit constants. Use static import instead.
  • Constants vs. enums - use enums.
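
A sketch of why an enum is preferable to a set of `int` constants: the enum is type-safe and can carry its own data and behaviour. The grade names and rates are invented for the example.

```java
enum HourlyPayGrade {
    // Hypothetical grades; the rate is a percentage of the base rate.
    APPRENTICE(100),
    JOURNEYMAN(150),
    MASTER(200);

    private final int ratePercent;

    HourlyPayGrade(int ratePercent) {
        this.ratePercent = ratePercent;
    }

    // Behaviour lives with the constant it belongs to.
    int payInCents(int hours, int baseRateInCents) {
        return hours * baseRateInCents * ratePercent / 100;
    }
}
```

A method that takes a `HourlyPayGrade` cannot be handed an arbitrary integer, which a method taking `int grade` could.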

Names

  • Choose descriptive names. Don’t be too quick to choose a name.
  • Choose names at the appropriate level of abstraction.
  • Use standard nomenclature where possible.
  • Unambiguous names.
  • Use long names for long scopes.
  • Avoid encodings.
  • Names should describe side-effects.

Tests

  • Insufficient tests.
  • Use a coverage tool - they report gaps in your testing strategy.
  • Don’t skip trivial tests.
  • An ignored test is a question about an ambiguity.
  • Test boundary conditions.
  • Exhaustive tests near bugs. Bugs tend to congregate. When you find a bug in a function, it is wise to do an exhaustive test of that function.
  • Test coverage patterns can be revealing.
  • Tests should be fast.