Will Chen

Docker project pitch

Deployment made painless

Rather than trying to fight all the difficulties of deployment through various tricks and programs, you can use Docker to ensure that your development and production environments are consistent.

What are potential roadblocks to using Docker?

Docker is a relatively new technology. It was first released in March 2013, and version 1.0 followed in June 2014. Since it has only recently been "production-ready", Docker hasn't yet seen widespread adoption in the industry. This means documentation and general community support still have room to grow.
Security issues, particularly around sharing a host OS with others. There are several recommendations for tightening security beyond the default settings; however, these require a good understanding of Linux and may scare away developers with less DevOps experience.
Uncertainty about how to move an existing project to Docker. If you've already been developing a project, it can be challenging to figure out how to docker-ize it. For example, should an entire MEAN stack be in one container? Or should there be one container for the DB and one container for the webserver?
Support for Docker across various IaaS/PaaS providers is relatively nascent. I'm not sure about others, but for Azure, support for Docker is quite recent. For example, on Azure you can only set up Docker via the CLI and not through the web portal (which probably doesn't matter, because if you're not comfortable with the CLI, you will find it quite challenging to use Docker anyhow).

Proposed Idea: DockerJS

Overview: Straightforward generator to create minimal, connected containers for new and existing projects to simplify the deployment process.

Benefits: Enables you to deploy (painlessly) right away and do continuous deployment and integration with minimal initial work.

Ecosystem: Abstraction layer on top of Docker for a specific type of developer (full-stack web developers / teams).

The solution is to provide a step-by-step method for web developers to take advantage of Docker quickly.

Steps:

Choose what your stack will be
Create containers based on the stack selection (will likely create at least two containers)
Configure the containers to ensure a) the containers talk to each other and b) security
Install the targeted portion of the stack along with dependencies
Create containers on the cloud platform (e.g. Azure, AWS, Google Cloud) and link them to the containers on the local dev machine

The idea is that these containers have a very bare-bones footprint. The generator shouldn't seek to replace scaffolding generators such as Yeoman (which are oftentimes accused of being bloated). Instead, it should create the smallest scaffold that still demonstrates that the container works (e.g. a simple Hello World on an Express server).

Then, if it's a new project, a developer can use a scaffolding generator if they want, or if it's an existing project, copy in existing code.

Yeoman-style generator that allows you to create Minimum Containers.

Initially, you can create a MEAN stack or more generally, whatever stack you want.

It creates several containers:
1. Primary DB (e.g. MongoDB, MySQL)
2. Secondary DB (e.g. Redis)
3. Server (e.g. Node, Express + MVC front-end framework)
4. Testing Framework
5. Continuous Integration (if they're doing Jenkins)
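
To make the container list above concrete, here is a minimal sketch of how a stack selection could be turned into a set of container specs. DockerJS is only a proposal, so everything here is an assumption made for illustration: the function name `plan_containers`, the role names, and the shape of the spec dictionaries are hypothetical, and the image names are simply the public Docker Hub images I would expect to start from. The testing-framework container is left out to keep the sketch short.

```python
# Hypothetical mapping from a stack component to a Docker Hub image name.
DEFAULT_IMAGES = {
    "mongodb": "mongo",
    "mysql": "mysql",
    "redis": "redis",
    "node": "node",
    "jenkins": "jenkins/jenkins",
}


def plan_containers(primary_db, secondary_db=None, server="node", ci=None):
    """Return one container spec (a dict) per selected role.

    This only plans the containers; it does not talk to Docker.
    """
    selection = [
        ("primary-db", primary_db),
        ("secondary-db", secondary_db),
        ("server", server),
        ("ci", ci),
    ]
    plan = []
    for role, component in selection:
        if component is None:
            continue  # role not selected, so no container for it
        plan.append({"name": role, "image": DEFAULT_IMAGES[component]})

    # Only the server needs to reach the databases, so only it gets links.
    db_names = [spec["name"] for spec in plan if spec["name"].endswith("-db")]
    for spec in plan:
        spec["links"] = list(db_names) if spec["name"] == "server" else []
    return plan


if __name__ == "__main__":
    for spec in plan_containers("mongodb", secondary_db="redis"):
        print(spec)
```

A real generator would still have to write out the container configuration and handle the security settings mentioned in the steps; this sketch only covers the "which containers, and who talks to whom" decision.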

The Cloud of Tomorrow

With the recent announcements by both Google and Amazon of Docker container management services, the shift towards shipping Docker containers instead of just application code (e.g. a git repository) seems all but certain. This isn't too surprising given that these two and Microsoft already offered basic support for Docker containers.

Amazon has announced a compelling suite of new features at its AWS re:Invent industry event. The two that strike me as potentially the most disruptive are Amazon's relational DB service Aurora and its event-driven compute service Lambda.

In a nutshell, both of them provide a major abstraction layer that essentially shifts the responsibility of scaling and maintenance from Amazon's customers to AWS.

In the future, the underlying cloud-based infrastructure of web and mobile-based applications will likely differ in these ways:

Shipping containers instead of code: in the pursuit of making deployments predictable across all environments, technologies such as Docker containers have become ascendant. By fully capturing the runtime environment and all of an application's dependencies in a snapshot, DevOps teams will be able to work in fully normalized environments across thousands of instances of their application.
Real-time compute services: Amazon Lambda frees developers from having to think about provisioning instances, load balancing, and maintaining systems. Instead, they can focus almost entirely on the code the service will execute.
Micro-services (internal and external): Complex applications will be composed of dozens if not hundreds of services that are stitched together. Some of these services will be internal (e.g. those core to the product and company) and others will be external (e.g. application performance analytics).
Abstracting away the issues of scaling and maintenance: Two of the largest challenges of scaling up are 1) handling a large number of concurrent requests on your compute instances (e.g. a traffic spike after getting written up by TechCrunch) and 2) managing a large database, including replication, sharding, backups, and failover. Databases are some of the most valuable assets of a company, and a single catastrophic database error could close the doors of the business. As with the compute services above, providers will take on this work through:
- Database as a service
- Scalable compute services
Yeoman's work will be efficiently automated and handled by external service providers: There's a lot of work in developing and maintaining a high-performance application that most developers know they should do, but don't want to do. Think:
- Optimizing front-end performance through concatenation, minification, and client-side caching (think JavaScript tools like Grunt, Gulp, etc.)
- Monitoring and managing availability. Tracking application performance data (think New Relic)
- Managing a Content Delivery Network (CDN) to ensure quick access of your application anywhere in the world
- Maintaining IT security. In addition to applying patches, companies need to proactively monitor intrusions and identify any potential security gaps
- Meeting regulatory compliance. Keeping track of administrative issues, logging, and managing user permissions
Cloud services will become more secure than the alternatives: Certain industries such as government, healthcare, and financial services have been reluctant to adopt cloud services due to concerns around regulatory compliance. As cloud services mature and hybrid approaches (part public, part private) proliferate, these industries will accelerate their adoption of cloud services.

A lean, scalable cloud-based infrastructure:

Handle API requests with Amazon Lambda (event-based compute services)
Deliver static assets with Amazon S3 (storage services)
Database as a Service (Amazon Aurora for SQL)
Cached data / CDN-ified (Redis & services)

Essentially, the backbone of various online applications will look increasingly similar, with the only major difference being the product-related code. This might seem like an obvious point. After all, tech companies are primarily interested in creating great products and services, not in managing the complex and costly IT infrastructure that supports their offerings.

A simple approach to design and CSS

Tools to use:

Bootstrap (Front-end framework)
Sass (preprocessor language)
Compass
Grunt (to compile the Sass files to CSS)

Step 1: Draft the structure

Ideally, use a whiteboard or draw it out by hand. Once you feel comfortable with the layout, create a wireframe using a specialized app like MockFlow, or just use PowerPoint.

Step 2: Create the structure in HTML

Create as many of the DOM elements as possible and don’t worry about aesthetics.
The key is to get all the elements on the page in the right order.
Add CSS classes in logical places.
Tip: using a very basic scaffolding (e.g. a bare-bones Bootstrap page) helps you get started.

Step 3: Create the CSS outline based on the HTML structure

Write down the HTML tags and CSS classes that you want to reference
Write pseudo-code inside each CSS class describing what you want it to do

Step 4: Iterate with the CSS styles until it looks ideal

Gradually put in actual CSS code so the page begins to resemble the design you set out to create
Make one feature and then test
Frequently commit throughout the process as it’s easy to break the visual design of a page

Step 5: Validate the HTML and CSS page (code-wise and visually)

Use an HTML / CSS / JavaScript validator to make sure the code is valid
Do a visual test to make sure all the elements that you need are visible on the page

Introduction to Bloom filters

Last week, I learned what Bloom filters are and implemented one.

One of the Hackers-in-Residence at Hack Reactor wrote a very helpful article introducing the concept of Bloom filters, which I would recommend reading:
http://kiafathi.azurewebsites.net/quick-thoughts-bloom-filters/

Bloom filters overview

In a nutshell, Bloom filters let you screen requests to access and manipulate a database by telling you whether a key is potentially stored or is definitely not stored. This way, you can avoid an expensive operation by ruling out keys that are definitely not in the database.

An instance of a Bloom filter is an object with a storage array of a pre-determined size (note: Bloom filters cannot be re-sized). At each index, the value can be either 1 (a truthy value) or undefined (a falsey value). When you create a Bloom filter, you determine the total size of the array and the number of hashing functions that will be used to generate a set of marks. Each key is run through every hashing function to produce its pattern of marks (each marked element is set to 1). Different keys can share some or all of their marks, which is why a lookup can only ever say "potentially stored".

Once you have added keys to a Bloom filter, you can check whether a candidate key potentially exists in it by running the key through the same hashing functions to generate its pattern of marks. If every spot in that pattern is set to the truthy value of 1, the key potentially exists. If any spot in that pattern is still unset (falsey), the key definitely does not exist, because adding it would have set all of those spots to 1.
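
The add and lookup steps described above can be sketched in a few lines. This is not the implementation I wrote at Hack Reactor, just a minimal illustration under some assumptions: the class and method names are my own, unset spots are stored as 0 rather than undefined, and the multiple hashing functions are simulated by salting a single SHA-256 hash with the function's index.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; size and hash count are chosen at creation."""

    def __init__(self, size, num_hashes):
        self.size = size
        self.num_hashes = num_hashes
        self.storage = [0] * size  # 0 = unmarked, 1 = marked

    def _positions(self, key):
        # Assumption: simulate several hash functions by salting one hash
        # with the index of the function.
        positions = []
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).hexdigest()
            positions.append(int(digest, 16) % self.size)
        return positions

    def add(self, key):
        # Mark every spot in the key's pattern.
        for pos in self._positions(key):
            self.storage[pos] = 1

    def might_contain(self, key):
        # True means "potentially stored"; False means "definitely not stored".
        return all(self.storage[pos] == 1 for pos in self._positions(key))


if __name__ == "__main__":
    bf = BloomFilter(size=64, num_hashes=3)
    bf.add("alice")
    print(bf.might_contain("alice"))  # True: an added key is never missed
```

Note that a lookup for a key that was never added usually returns False, but it can return True if other keys happen to have marked all of its spots. That is the false positive case.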

Pros and Cons of Bloom Filters:

Pro: Bloom filters allow you to avoid trying to do expensive operations by ruling out keys that definitely do not exist in a database.
Pro: By appropriately sizing the Bloom filter relative to the number of keys stored, you can keep the false positive rate reasonably low (calculation for false positive rate).
Con: Bloom filters do not allow you to definitively conclude whether a key is stored in a filter.
Con: Because the filter stores only marks and not the keys themselves, Bloom filters cannot be automatically re-sized without creating a brand new Bloom filter and re-inserting all of the keys from a source of truth.
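
To put a number on "reasonably low", the standard approximation for the false positive rate of a filter with m slots, k hashing functions, and n stored keys is (1 - e^(-kn/m))^k. Below is a small sketch of that calculation; the function and parameter names are my own, and the formula assumes the hashing functions spread keys evenly across the array.

```python
import math


def false_positive_rate(size, num_hashes, num_keys):
    """Approximate false positive rate: (1 - e^(-k*n/m))^k."""
    fraction_marked = 1.0 - math.exp(-num_hashes * num_keys / size)
    return fraction_marked ** num_hashes


if __name__ == "__main__":
    # Same 100 keys and 3 hash functions, two different array sizes.
    print(false_positive_rate(size=1000, num_hashes=3, num_keys=100))
    print(false_positive_rate(size=200, num_hashes=3, num_keys=100))
```

With 1000 slots, 3 hashing functions, and 100 keys, this works out to roughly 1.7%, and the rate climbs quickly as the array gets smaller relative to the number of keys.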

Ideas for future software engineering side projects

Once I've finished with Hack Reactor, I'd like to take on a few additional side projects to get some more experience with various technologies. I've made a quick list so I don't forget:

Learn statically typed altJS: I'd like to get more experience working with statically typed altJS variants like TypeScript. I recently watched a YouTube video of Facebook announcing their new type checker called Flow, which they will be releasing as open source sometime later this year. I'm very interested because, while coding in JS, I've noticed many times where static type checking would have caught an error much earlier in the development process.
- Learn TypeScript and Flow basics
- Create a plug-in for Atom to enable better TypeScript support (and written in TypeScript)
Google App Add-on: Create a Google App add-on for Google Sheets that identifies commonly made mistakes such as summing the wrong rows, or entering in numbers as a string rather than a numerical value.
- Note: This would be a good use of TypeScript as type checking will be an essential part of the functionality.
HIPAA-compliant Docker containers: During my brief experience of trying to make a healthcare tech product, I realized that creating a mobile or web application that complies with HIPAA is very difficult and costly. I'd like to create a standard configuration for managing Docker containers so that they would be compliant with HIPAA. I still need to do more research to understand the granular details of what it takes to meet HIPAA compliance, but my rough understanding is that Docker containers can play a fairly important part in maintaining HIPAA compliance for online applications.
- Depending on how low-level the solution is, this might be a good excuse to learn Go, since that's the language that Docker is written in
Open Doc - I started working on a side project that uses a very large public dataset from the US government on the number of procedures each doctor performed and the average cost of these procedures by specific procedure number (e.g. knee surgery). I'm currently using a MEAN stack, except the M has been switched from MongoDB to MySQL after I realized halfway through that I wanted to run somewhat complex queries that had a relational component. The two main outstanding tasks:
- Fix the search functionality to extend it to the entire database
- Deploy the web app and database to Azure
Learn other languages - So far, my sole programming language has been JavaScript. I'd like to learn some of the languages listed below:
- Go: a relatively new systems language that seems to be gaining traction and offers strong performance
- Python / Ruby: more traditional choices for web back-end
- Java / C / C++ : traditional languages, mostly learning to understand general CS concepts