Improving my personal workflow

  • Create a pre-commit hook (see the sketch after this list)
  • Create an alias to check for focused specs (force)
    • git diff --cached | grep -E "(ddescribe|iit|fdescribe|fit|describe\.only|it\.only|debugger)"
  • Automatically exclude node_modules and bower_components from the WebStorm index
  • Create git aliases
    • git pon - push origin next
    • git pun - push upstream next
    • g run - pull --rebase upstream next
    • g rud - pull --rebase upstream dev_gulp
    • g pud - push upstream dev_gulp
    • g pod - push origin dev_gulp
  • Create alias to run all tests
    • nt (next-test) - gulp protractor; gulp js:unit; grunt test-unit (server)
  • Create git PR from command line
    • https://hub.github.com/
  • SCM Breeze (git workflow shortcuts): https://github.com/ndbroadbent/scm_breeze#installation
  • Window management: Breeze (https://itunes.apple.com/us/app/breeze/id414857071?mt=12)
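
Here's a sketch of what that pre-commit hook might look like (save it as .git/hooks/pre-commit and make it executable); the grep pattern matches the focused-spec keywords from the alias above:

```sh
#!/bin/sh
# Abort the commit if any staged change contains a focused spec
# (ddescribe/iit/fdescribe/fit/describe.only/it.only) or a stray
# debugger statement.
if git diff --cached | grep -E "(ddescribe|iit|fdescribe|fit|describe\.only|it\.only|debugger)"; then
  echo "Focused spec or debugger statement found; aborting commit."
  exit 1
fi
```

The git aliases can be registered similarly, e.g. git config --global alias.pon "push origin next".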

Anatomy of a Web App

Some of the terminology I will use below has a variety of meanings, but for this blog post I'm going to define it in (what I hope is) a simple and straightforward manner.

The basic building blocks:

  • Components - A physical section of the app. This includes the HTML, CSS, and JS needed to make that particular section work. This aligns with the idea of Web Components and the component model that React espouses.
  • Utilities - These are small, single-purpose tools that are used multiple times. Think of them as the embodiment of the Unix way.
  • Coordination - This is what facilitates the interactions across the various components (see the sketch after this list).
    • Note: I know I should think of a better name, but I wanted to highlight that some architectures don't have a single unifying interface, while others, such as Facebook's React-Flux architecture, use a single Dispatcher instance (arguably an example of the Singleton pattern) to coordinate the sending of events. The popular Reflux architecture / library, on the other hand, takes a more streamlined approach and does away with the single Dispatcher by having actions trigger the various data stores directly.
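
To make the coordination idea concrete, here's a minimal sketch of a Flux-style dispatcher. The names (Dispatcher, todoStore, ADD_TODO) are illustrative, not Facebook's actual implementation:

```js
// A single dispatcher fans every action out to the registered stores.
var Dispatcher = {
  callbacks: [],
  register: function (callback) {
    this.callbacks.push(callback);
  },
  dispatch: function (action) {
    this.callbacks.forEach(function (callback) {
      callback(action);
    });
  }
};

// A store registers itself and reacts only to the actions it cares about.
var todoStore = { items: [] };
Dispatcher.register(function (action) {
  if (action.type === 'ADD_TODO') {
    todoStore.items.push(action.text);
  }
});

// Components coordinate by dispatching actions, never by calling
// stores directly.
Dispatcher.dispatch({ type: 'ADD_TODO', text: 'write blog post' });
```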

A more complete anatomy:

  • Router - Frequently implemented on both the client and the server.
  • Data layer - The source of truth for your user data. This can mean either accessing your database directly or using an ORM (object-relational mapping) to interface with it indirectly.
    • Caching layer - As websites scale up, there's typically a caching intermediary between the service asking for the data and the database itself. A popular open source choice is currently Redis.
  • API endpoints - The most popular style being RESTful interfaces, which help you decouple the interface from the implementation. Also, versioning and good documentation are very important, especially if these are public API endpoints for outside consumption (see the sketch after this list).
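
As a concrete example, here's a minimal sketch of a versioned REST endpoint using Express (one popular Node.js choice); the route and response fields are made up for illustration:

```js
var express = require('express');
var app = express();

// Versioning the path (/v1/...) lets you evolve the interface later
// without breaking existing consumers.
app.get('/v1/users/:id', function (req, res) {
  // In a real app this would go through your data layer / ORM.
  res.json({ id: req.params.id, name: 'Ada' });
});

app.listen(3000);
```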

Technical Infrastructure:

  • Testing - Even within testing there are many parts, but I will list the big categories:
    • Unit testing - verifies individual modules in isolation and enables refactoring (see the example after this list)
    • Functional testing / Acceptance testing - interacts with the application from the user interface and treats it like a black box
    • Integration testing - ensures that data flows properly between various services
    • UI testing - catches visual regressions in a semi-automated way
  • Source Code Management (SCM) - Using a Distributed Version Control system such as Git or a traditional centralized Version Control System. Tools such as GitHub are also an important part of this codebase check-in / check-out process.
  • Dependency management - In the JavaScript world, there's still a bit of a divide between client-side dependencies (the most popular manager probably being Bower) and server-side dependencies (where NPM is the standard). The trend seems to be toward 'isomorphic' dependencies that can run in both the browser and the server, with NPM as the de facto registry.
  • Build system - Packages up the individual files and puts the application into a releasable state. Build systems (Grunt and Gulp are the most popular ones for JavaScript right now) help you concatenate, minify, and lint files to make them ready for production.
  • Integration / Deployment - Along the same lines, this helps you move your application through the various stages up until production.
  • Monitoring / Logging - Once your application is in production, you need to ensure its health through a variety of tools and metrics.
  • Load Balancer - Helps you scale your application across multiple machine instances.
  • Serving Static vs Dynamic content - You will usually use something like AWS S3 for static content and AWS EC2 for dynamic content. This helps reduce the load on the workers generating the dynamic content.
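
As a small example of the unit testing category, here's a minimal Jasmine-style spec for a hypothetical formatPrice() helper; tests like these are what make later refactoring safe:

```js
// Hypothetical utility under test.
function formatPrice(amount) {
  return '$' + amount.toFixed(2);
}

describe('formatPrice', function () {
  it('prefixes the amount with a dollar sign', function () {
    expect(formatPrice(10)).toBe('$10.00');
  });

  it('rounds to two decimal places', function () {
    expect(formatPrice(3.14159)).toBe('$3.14');
  });
});
```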

Bonus Points:

  • I18n - Making the website usable for those not in your primary locale / language
  • Accessibility - Making the website usable for those with disabilities
  • CDN (content delivery network) - Making your website physically closer to people around the world by distributing static content to servers in various regions.
  • Analytics - Third party services such as Google Analytics and Mixpanel make this easier.
  • Offline caching - This can be done through a mechanism like Local Storage, which allows users to store a certain amount of data on their computer so they can (potentially) interact with the application even when they don't have an internet connection (see the sketch after this list).
  • Comprehensive testing - Once you've covered the major testing categories above, there are ways of improving test coverage even further by testing across different devices, operating systems, browser vendors, browser versions, and i18n locales.
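
Here's a minimal sketch of the offline caching idea with localStorage; the endpoint and cache key are illustrative, and it assumes jQuery is available for the request:

```js
// Serve cached data when the network (or server) is unavailable.
function loadTodos(callback) {
  $.getJSON('/api/todos')
    .done(function (todos) {
      // Online: refresh the cache for next time.
      localStorage.setItem('todos', JSON.stringify(todos));
      callback(todos);
    })
    .fail(function () {
      // Offline: fall back to the cached copy, if any.
      var cached = localStorage.getItem('todos');
      callback(cached ? JSON.parse(cached) : []);
    });
}
```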

Fighting complexity

Working on a significant codebase for the first time over the past couple of months has been an interesting experience. I've realized that it's easy to complain about a particular section of a codebase and say, "this part is so confusing! Who would ever write it this way?" only to later make a quick patch or hacky workaround that causes you to scratch your head just days later.

I have listed below several common trends or themes that I've noticed driving complexity, along with some thoughts on how to manage complexity and simplify things when possible.

Generalizing for future use. It's natural as a programmer to want to make things reusable, but I think the greatest problem with this is that while we may instinctively know we will want to reuse a certain module or component in the future, we don't know exactly what that use case is. What ends up happening is that we build a framework without actually validating its utility with real-world usage. The XP methodology can best be summarized as advocating for emergent design and following the mantra of YAGNI (you ain't gonna need it). Rather than doing up-front design or creating things that are reusable from the get-go, XP prefers doing something once, and then later, when you end up needing to do something similar, refactoring your code so you can actually reuse it. The key is that you make a component reusable at the last possible minute (a contrived sketch of this follows below). I think what makes creating a framework upfront so appealing is that it seems more straightforward than building something once and then refactoring it to fit another use case.
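
To make that concrete, here's a contrived sketch: write the logic inline the first time, and only extract a helper when a real second use appears (all names are illustrative):

```js
var user = { name: 'Ada Lovelace' };
var post = { title: 'Fighting Complexity' };

// First use: just write it inline; don't guess at future needs.
var userSlug = user.name.toLowerCase().replace(/\s+/g, '-');

// Later a second call site appears, so now (and only now) extract a
// helper, shaped by the actual uses rather than by imagined ones.
function slugify(text) {
  return text.toLowerCase().replace(/\s+/g, '-');
}

var postSlug = slugify(post.title); // the second use drives the extraction
userSlug = slugify(user.name);      // the first use is refactored to match
```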

Creating a complex dependency graph. This can actually stem from the previous trend. In the desire to generalize for future use cases, it may seem appealing to create a dependency (e.g. an NPM module) that provides a specific piece of functionality. And then, to make part of that dependency reusable by other dependencies, you split that out into a micro-dependency. Soon, understanding one feature of a website requires hopping through multiple files in your project repo and then going through several layers of dependencies to see how something truly works. I do want to balance this by saying that using libraries can of course provide significant benefits, and it's infeasible to do "everything by hand" without external dependencies. But when you do use dependencies, I think the key is to pursue these principles:

  • Use the least number of dependencies possible. All things being equal, if there are dependencies that aren't pulling their weight or aren't currently being used, I think it's better to pull them out. First, it reduces bloat (important for client-side dependencies that have to be sent down the wire); second, it reduces build times; and third, most importantly, it makes it simpler to see which dependencies are doing what. When you have a long dependency list with many small (little-heard-of) dependencies, it can be quite tricky to figure out which dependency is responsible for what. Of course, at least in the npm world, each dependency probably has another set of 8 dependencies, and each of those has more. It's easy to imagine one top-level dependency that ends up requiring 20 total dependencies when all is said and done. In the end, though, I would focus more on the number of top-level dependencies than on the overall number (see the snippet after this list).
  • Use dependencies with good documentation and stable APIs. The above statement should be qualified with a quality aspect. Dependencies that act like good citizens, with thorough, easy-to-understand documentation and a stable API that follows semantic versioning, will cause far fewer issues. When you're picking a dependency, as I mentioned in an earlier post, it's worth making good docs and test coverage key evaluation criteria. It's harder to evaluate how stable an API is, unless you know a long-time user of that library who can either vouch for or lament its stability.
  • Dependencies should follow the Unix philosophy - do one thing well. I think this is an easy thing to say and a really hard thing to follow. In a way, it basically rules out any strongly opinionated framework (e.g. any large MVC framework). Using a framework provides a lot of benefits in terms of scaffolding a project and getting up to speed quickly, but in the long run frameworks require a lot of energy to use effectively (e.g. upgrading a site from one version to another).
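
A quick way to audit this in an npm-based project is to list only the direct dependencies:

```sh
# Show only your top-level dependencies, not the full transitive tree;
# this is the list worth keeping short.
npm ls --depth=0
```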

Getting lazy and doing hacky workarounds (aka "I'll refactor tomorrow [insert date]"). I've said this to myself so many times, and it's bitten me many times. In the end, making quick fixes may be a necessity from time to time, but most of the time I'm making a quick fix in a dirty way because I'm being lazy or can't think of a better approach at the moment. I'd like to become more disciplined and focus on refactoring sooner rather than later, and when I really get stuck on how to refactor something, either make a to-do item for myself or ask someone else for help.

Learning Agile

Excellent book on various Agile methodologies. I wrote a review on Amazon, which I've pasted below. Buy the book here:

---

For anybody who works on an Agile team or wants to help guide their team toward becoming more Agile, I would recommend reading this book. I think the biggest selling point for me was that it actually describes the differences between the various Agile methodologies. With all the different Agile methodologies that exist (and I'm sure there are many more beyond the 4 described in the book), it's easy to feel like you're swimming in an ocean of buzzwords. Luckily, the authors recognize the importance of actually "showing" you what Agile looks like, and vignettes sprinkled throughout the book show you teams that are struggling and, eventually, teams that are succeeding with Agile.

The first methodology they discuss is Scrum. They do this very intentionally, because Scrum is probably the easiest Agile methodology to actually "adopt": it has clear practices and more closely resembles the "traditional" software development process than the other methodologies do.

The second methodology they discuss is XP, which I've heard is highly influential in the Agile field, with luminaries such as Kent Beck and Martin Fowler as early leaders. To be honest, though, I have never heard anyone describe the software team they work on as "XP". The most charitable interpretation is that many of the XP values, principles, and practices have trickled down into mainstream "Agile" consciousness. The easiest way to summarize XP is "embracing change", and the authors show how practices such as unit testing to facilitate refactoring and delaying decisions until the last moment support that overarching goal.

The last part describes the Lean and Kanban methodologies, which are closely related. In short, they focus on continuous improvement. Before reading the book, I had heard about Kanban boards and the idea of moving tasks across different columns, but the real eye-opener was the emphasis on Work In Progress (WIP) limits. The authors show how vital WIP limits are and why you need to be careful not to ignore them (which, as they demonstrate, happens often).

Probably my favorite part of the book is when they describe teams that partially apply Agile methodologies while still retaining much of their legacy software development practices, and end up achieving OK but merely "better than nothing" results. It's easy to think of Agile as the be-all and end-all of good software development practice, but actually achieving it is a journey, and I like how they show you the realistic challenges of going from a traditional software development methodology to Agile.

Heuristics for picking frameworks / libraries / tools

Like many software engineers, I oftentimes get caught up with wanting to learn about the next new thing without critically evaluating whether it's worth the effort of learning and using. Even if I don't end up using it, it can still be a time sink to read about every new tool that's out there. As many people on Hacker News have lamented, the JavaScript world especially seems to churn rather quickly.

So I have decided to create a set of heuristics for picking the next xyz dependency for the latest project. In other words, here are some general rules of thumb that I've decided to follow:

  • Popularity is important, but not the most important thing. Of course, popular libraries and frameworks like jQuery and Angular have gotten big for a reason, but you shouldn't pick them solely, or even primarily, for that reason. You should pick the right tool for the job, which sounds obvious but is easy to forget in the pursuit of using the latest library. For example, Meteor.js is a very powerful real-time, full-stack JavaScript framework, but unless you're creating a web application with demanding real-time requirements (e.g. a task management app like Asana, which incidentally inspired the Meteor framework), it's probably overkill.
  • Pick the simplest tool that gets the job done. Having used Angular for a little while now, I have a more balanced perspective than when I first heard about it and started using it. The two biggest benefits of Angular, in my mind, are that it enables you to write web apps more quickly (e.g. two-way data binding, powerful HTML templating; see the sketch after this list) and that it has testability baked into the framework. The biggest problem I have with Angular is that it's a very complex framework with many concepts that are significantly, if not radically, different from vanilla JavaScript, and its documentation is not that... wonderful. When a framework has too many abstractions built in, you end up learning more about the framework itself and less about the underlying language.
    • Sidebar: For example, if you read the ngTransclude documentation (https://docs.angularjs.org/api/ng/directive/ngTransclude), unless you have a solid understanding of transclusion to begin with, you are probably scratching your head. On the other hand, I think Facebook has done a very nice job of providing accessible documentation for its open source libraries. When Flow was released, it had very comprehensive documentation and a few tutorials on day one. This leads me to my next point...
  • Pick a library that has good test coverage and ample documentation. Why do I care about these two things? They are usually markers of a library developed and maintained by a responsible team. Sure, you can have lots of documentation and still have a clunky tool with an awkward API, but it at least shows that the maintainers put effort into making their tool usable by others. The same goes for test coverage. You can have 100% code coverage and still ship buggy software, but I'm willing to bet that any library with less than 50% code coverage has at least a few bugs, if not many. Some libraries like Lodash really emphasize code coverage and boast 100% coverage (https://lodash.com/), but my general belief is that ~80% is already pretty good.
  • Licensing - stay legal. I'll keep this brief, but if you're doing proprietary software development and you want to use open source dependencies, you'll need to be careful to avoid certain open source licenses or you might run into some legal troubles. See my earlier post: http://willchen.posthaven.com/open-source-licenses
  • Lastly - don't get stuck in analysis paralysis. When I was first learning CSS, I spent days reading articles discussing the differences between Less and Sass. Eventually I decided to use Sass because I had a subscription to CodeSchool, which had some well-developed tutorials on Sass specifically but not on Less. In the end, I think the few days I spent researching the differences could have been better spent just writing toy projects in either of the languages. After a couple hours of researching and analyzing the different options, it's time to just pick one. After all, once you learn Sass, it's much easier to pick up a similar preprocessor language like Less or Stylus.
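
As a tiny illustration of the two-way data binding mentioned above, here's a minimal AngularJS 1.x sketch (the module and directive names are made up):

```js
// The input and the greeting stay in sync automatically thanks to
// ng-model's two-way binding; no manual DOM code is needed.
angular.module('demo', []).directive('greeter', function () {
  return {
    restrict: 'E',
    template: '<input type="text" ng-model="name">' +
              '<p>Hello, {{name}}!</p>'
  };
});
```

With ng-app="demo" on the page and a <greeter></greeter> element, typing in the input updates the greeting immediately.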

Hopefully this guide helps. And as I said at the beginning, these are really just rules of thumb, so don't feel like any of them are unbreakable.