Sep 28, 2020

Processes in IT and the cost of a web site

This article describes the software development life cycle in the language of business and money. Many project leaders expect short and simple answers to complex technical challenges, and do not get them. At the same time, all projects are technically similar, and the solutions to most of their problems are generally known.
In this post, I summarize my experience of talking with the leaders of different companies in situations where short phrases were not enough. I hope it will help business people and engineers understand each other better.

Once upon a time a man talked to a construction foreman. The house the man lived in with his family was made of bricks. The walls were cemented bricks laid directly on the ground. The two men were discussing one problem: the walls were tilting. Even propped up with sticks, they tilted more and more, and the man was worried that the house would collapse sooner or later.

"The house requires reconstruction," said the foreman. "Here's what I propose: bring in a power line for the construction equipment, dig the pit, make drainage, pour the foundation…"

"Oh, no!" the house owner interrupted the foreman. "I need the House! Walls! Not a pit!"

"Maybe you should consider buying a modular house?" the foreman offered.

These days

Last month I talked to a startup. Their web service had been in development for several years, and now they needed to do something with it. The founders told me about their plan to hire about 10 programmers to rewrite or refactor the system. I asked whether they had user stories, API documentation, or a task management system, and they did not. They asked me to write a list of what I thought they should do, and I wrote this:

  1. List the key parameters affecting sales of the product: SLA, features - anything that binds the virtual tasks to the real world.
  2. Study the business logic, define contexts according to DDD, and create a high-level description to assist the onboarding of new developers.
  3. Identify the system bottlenecks that cause problems with scaling and availability.
  4. Agree on the mid-term goals of the IT team with stakeholders.
  5. Establish a hiring and onboarding process.
  6. Create a workflow using team collaboration tools such as kanban boards, a ticket tracker, chat, and a git repo.
  7. Set up system monitoring, logging and backup/recovery plans.
  8. Decompose the mid-term development goals into milestones and agree on calendar plans.
  9. Create a CI/CD development-test-release flow.
  10. Write the architecture modification plan.
  11. Prioritise tasks in the backlog.
  12. Set up regular meetings with a product owner and reporting to stakeholders.

But where is the development in this list? The answer is: all of it. Development is a complex process, not just coding. It is a Systems Development Life Cycle (SDLC).

The founders of the company told me they didn't need management; they were looking for someone who codes. Thank you, guys, for your feedback, because it made me feel I should write this article and explain what development is. Everything in the list is related to money.

How each element in the list is related to money.

  1. Key parameters affecting sales - traffic, retention, etc. We all love R&D and discussing the culture of coding, but the valuation of an IT company is its revenue or user base multiplied by some factor, and the code is not taken into account at all. I believe the work should be bound to measurable parameters that are used to set goals for technical staff. It is important for the key parameters to be named and tracked. I have been involved in a lot of projects, and I saw the managers of a couple of growing projects ignore significant changes in key parameters. Ignoring those changes ruined the projects within just several months, after years of hard work. Despite being technically reliable and still making some money, the projects lost the market and could not recover. I remember the opposite situation at Oracle Corporation. Everyone was told how many millions of dollars of revenue the product we worked on earned, its market share, how many new customers came, how many left, who the competitors were, and so on. There were common meetings for all staff where VPs presented the report they had made to the board. Judging by Oracle's financial results, this focus and involvement work well.

  2. Studying the business logic of a project is an essential part of hiring new employees. Defining contexts is required so that people understand each other and areas of responsibility are clear. Investment in general high-level documentation pays off with every newcomer's salary.

  3. Bottlenecks are the technical problems that prevent the project from reaching its goals for the key parameters. They need to be addressed in the backlog and project plans.

  4. Mid-term goals are what the company pays programmers for. It is not vital to have the mid-term plans written down, but their absence usually means that developers play at research and fight over their preferred coding standards.

  5. About hiring. Everything in a project gets done only once people are hired. The cost of onboarding is the sum of the salaries of the employees involved in learning and teaching, plus the lost profit from the tasks not implemented during that time. Large enterprises may afford months of onboarding, but most projects need programmers to join the work as soon as possible. Whether hiring should be a process, or something top management does through occasional interviews, depends on the plans to scale the business. Startups change fast. Testing new markets and scaling require different people. Intensively growing startups need the hiring process to run with minimum top management involvement. Whether a company treats hiring as a regular process or as a finite task is a simple indicator of its plans to grow.

  6. A development workflow with boards, tasks, code repos, daily meetings, building, testing and deployment is essential for most teams. We just cannot deploy features predictably and keep the uptime promised in the SLA without a software development life cycle.
    All the tools are ready-made, they are free or cheap, and one can set them all up in a couple of days. Jira or Clubhouse, Slack / Element / Riot / Skype, GitHub or GitLab, Jenkins / Actions / Jobs, Docker with Registry, Swarm or K8s, Sentry / Rollbar / Logstash - everything already exists, just choose and apply.

  7. System monitoring, logging and backup/recovery plans are not the first tasks you do when deploying the MVP. They become essential when the project starts burning money to attract customers. Everything breaks sometimes, and the bigger the budget grows, the more important it becomes to keep the system available so that the cash keeps flowing. Setting up backups and monitoring is not a management decision, it is an essential way to reduce downtime.

  8. Decomposing the mid-term development goals into milestones and epics is required to follow the process, predict the results and intervene when problems appear, rather than waiting until it is too late.

  9. The benefit of a CI/CD development-test-release flow is not obvious. By Lehman's laws, the complexity of software increases as it evolves, unless work is done to maintain it. CI/CD is the developers' part of the effort to control the growth of complexity. Covering the code with tests and running them in a dedicated build phase curbs the growth of the development budget. It's OK to live without it; you will just grow slower, because you will need a QA team of the same size as the development team, doubling the office space, the accounting, and the number of managers in meetings. This task is not at the top of the list because it makes sense only for successful and actively developed projects. For average projects, deployment from a release git branch works well.

  10. The architecture modification plan is one of the most important tools for speeding up the development of features that seriously impact the business, and for scaling it. This plan helps prioritise tasks and cut off unimportant ones. I place it at the bottom of this list because it is not crucial: you do not need it to keep the system working. We need it when we have ambitious mid-term goals and the resources for them. This is not a part of management, it is the architect's role.

  11. Prioritizing tasks in the backlog can be done differently, and there is no one true way for all. In Scrum it is done by the team with input from the product owner. In traditional workflows priorities are defined by a team lead. In seagull-style management they are not defined at all. People are more comfortable when they know the rules for setting task priorities. When people know how priorities are defined, they learn to pick the more important tasks in order to feel important.

  12. Regular meetings with management, reporting to stakeholders, and attention let people feel they are a part of something, which helps teams survive crises.
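
The onboarding cost from item 5 can be put into numbers. The sketch below is only an illustration in POSIX sh: the function name onboarding_cost and every figure in it are hypothetical, and it assumes that everyone listed is fully occupied with onboarding for the whole period.

```shell
#!/bin/sh
# Hypothetical helper: estimate the cost of onboarding as described in item 5:
# the salaries of everyone involved in learning and teaching, plus the profit
# lost from tasks that are not implemented during that time.
# Usage: onboarding_cost WEEKS LOST_PROFIT_PER_WEEK WEEKLY_SALARY...
onboarding_cost() {
    weeks=$1
    lost_per_week=$2
    shift 2
    salaries=0
    for salary in "$@"; do
        salaries=$((salaries + salary))
    done
    echo $(( weeks * (salaries + lost_per_week) ))
}

# Illustrative numbers only: a newcomer (2000 per week) and a mentor
# (1500 per week) spend 4 weeks on onboarding, while the team loses
# 1000 per week in delayed work.
onboarding_cost 4 1000 2000 1500
```

With these made-up figures the script prints 18000, which is the kind of number that can be compared with the cost of writing high-level documentation once.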

How much does it cost to do a startup?

So, how long will it take to implement a project? Most likely, the answer should be something like "Use WordPress, as 38% of web sites do". In many cases you can avoid in-house development. Use a SaaS product or a ready-made package, or outsource the work, and get the total cost of a finished solution and a calendar date in a contract. Unless your business is IT itself, of course. By Lehman's laws, the functionality of software must be continually increased to maintain user satisfaction over its lifetime. To compete in the market, the development of the software never ends. As the project grows, more operational, business analysis, management and architecture tasks are added.

What if you just code without all that CI/CD, repositories and trackers, and simply make calls and discuss? Well, maybe programmers will understand you and make the software that does what you want it to. Or maybe you will have to fire and hire several development teams and rewrite the software several times. The difference is in the predictability of results.

Sep 16, 2020

Nginx, Let's Encrypt, IPv6, HTTP/2 and healthcheck

While working on a generic configuration set for Nginx/PHP/MySQL web applications in Docker, the most difficult problem I faced was writing scripts and configuration files for Nginx. I needed a simple, generic, small and reliable solution to issue and renew Let's Encrypt certificates in Docker, reload Nginx, and support HTTP/2 and IPv6.

Each of these tasks alone has a pretty straightforward solution. It became much more complicated when I started writing a generic solution one could copy-paste among different sites and domains.

The requirement to be generic means I don't want to build my own Nginx image; I want to use an official image from Docker Hub: nginx:alpine. I use acme.sh, which just works everywhere.

The requirement to be simple means that I don't use a separate container to issue certificates, and no HTTP proxy to process HTTP requests from Let's Encrypt.

And the requirement to be reliable means I don't pass the Docker Unix socket to a container to reload Nginx when the certificate is renewed.

I use a POSIX sh script running in the background in the Nginx container, with the plain "webroot" method to pass Let's Encrypt authorization. I also implement a healthcheck for Docker.

The limitation is scalability and replication: this solution is for a single Nginx service setup.
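
To make the approach more concrete, here is a minimal sketch of what such a background script and healthcheck could look like. It is not the script from the repository linked below: the function names, the paths, the example.com domain and the availability of curl in the image are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: names, paths and the domain are hypothetical placeholders.
DOMAIN="${DOMAIN:-example.com}"
WEBROOT="${WEBROOT:-/var/www/acme}"   # folder Nginx serves ACME challenges from
CERT_DIR="${CERT_DIR:-/etc/nginx/ssl}"

# Issue a certificate with the "webroot" method, then tell acme.sh where to
# copy it and which command to run after every renewal.
issue_cert() {
    acme.sh --issue -d "$DOMAIN" -w "$WEBROOT" &&
    acme.sh --install-cert -d "$DOMAIN" \
        --key-file "$CERT_DIR/$DOMAIN.key" \
        --fullchain-file "$CERT_DIR/$DOMAIN.crt" \
        --reloadcmd "nginx -s reload"
}

# Background loop: once a day let acme.sh decide whether a renewal is due.
renew_loop() {
    while :; do
        sleep 86400
        acme.sh --cron
    done
}

# Healthcheck helper: treat 2xx and 3xx status codes as healthy.
is_healthy() {
    case "$1" in
        2??|3??) return 0 ;;
        *) return 1 ;;
    esac
}

# Healthcheck entry point: ask the local Nginx for its status code.
healthcheck() {
    code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost/")
    is_healthy "$code"
}
```

In a container, a startup script would call issue_cert once and then start renew_loop in the background with "renew_loop &", while healthcheck is the kind of command a Docker healthcheck would run. How the very first certificate is obtained before Nginx serves the webroot is left out of this sketch.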

Here is a standalone pattern for Nginx configs:

https://github.com/grikdotnet/docker-patterns/tree/master/1.nginx-acme.sh-healthcheck

Jun 7, 2020

Fast deployment of Nginx/PHP/MySQL with Let's Encrypt, HTTP/2 and IPv6 using Docker Swarm

Periodically I deploy simple web sites with Nginx, PHP and MySQL. Usually it is one or two virtual servers, all with similar requirements and configs. I arrived at a "set up and forget" set of configs to deploy a site and upgrade it a few years later.

Here is my "Infrastructure as Code" for LEMP sites that you can deploy pretty fast.

Functional requirements:
  1. HTTPS with an ACME-issued certificate (Let's Encrypt, BuyPass, and others)
  2. Local deployment with a self-signed certificate
  3. Nginx with HTTP/2
  4. IPv6 support with a single IP, as most virtual servers have
  5. Simple upgrade and migration of services among server providers
  6. Use environment variables for secrets such as a database password
  7. Healthcheck for services
  8. Works on Linux as the production environment
  9. Works in Docker Desktop for both Mac and Windows as the development environment

Non-functional requirements:
  1. Lightweight enough to fit a cheap virtual server with 1 GB of RAM.
  2. Uses official well-known repositories and images, no third-party dependencies.
  3. Single site per server. VPSes are cheap enough. It can serve multiple domains, of course.


Why Docker Swarm?

Kubernetes could be a great choice, but it consumes gigabytes of RAM. It just won't work in a tiny VPS. Ansible/Vagrant are popular tools among system administrators, but they don't provide service decoupling. I like upgrading PHP, Nginx and MySQL by downloading new images for containers with a single command.
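
As an illustration of the single-command upgrade, the sketch below wraps "docker service update" for one service of a stack. The function name upgrade_service and the stack, service and image names are placeholders, not part of the configs described in this post.

```shell
#!/bin/sh
# Sketch: roll out a newer image for one service of a Swarm stack.
# Swarm names stack services "<stack>_<service>".
upgrade_service() {
    stack=$1
    service=$2
    image=$3
    docker service update --image "$image" "${stack}_${service}"
}
```

For example, "upgrade_service mystack nginx nginx:alpine" would ask Swarm to update the Nginx service of a stack called mystack to the current nginx:alpine image.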

Problems:
  1. Docker Swarm does not provide cron job scheduling. ACME renewal should be done every 2 months. I have a dedicated article describing this solution.
  2. Swarm does not support IPv6 options, and the Docker documentation asks for a /80 IPv6 range.

Key points:

* You can see an "acme.sh" package used instead of Certbot. Acme.sh in the "Nginx mode" configures Nginx to issue certificates. This way certificates can be obtained before starting Nginx with production configs. Local deployment generates a self-signed certificate.

* Recent Nginx images run init scripts from "/docker-entrypoint.d/", and MySQL images run them from "/docker-entrypoint-initdb.d/". You can see scripts mounted into the Nginx and MySQL containers that install the openssl and acme.sh packages and initialize the database when a container starts. This way you can avoid maintaining a custom image.

* Official PHP images do not support init scripts, so I substitute the entrypoint script with a custom one, and still use an official Docker image for PHP.

* There is no simple solution to work with IPv6 in Docker Swarm. Docker uses IPv4 NAT to route traffic, and IPv6 is just something Docker was not designed for.
The Docker documentation suggests using Compose file version 2, asking the provider for a /80 range, and setting the "enable_ipv6" flag for the Docker engine. This requires too much manual work.
Another way is to add an IPv6 NAT service in a Docker container. But the best way I found is to run Nginx in the "host" network.

* A host network can't be used in Docker Desktop on Windows and Mac. This is solved with YAML inheritance in a "docker-compose.override.yml" file.
A local deployment with a "docker-compose up" uses both "docker-compose.yml" and "docker-compose.override.yml" files.
In production the command "docker stack up -c docker-compose.yml mystack" reads just the "docker-compose.yml" file and runs Nginx in a host network mode.
 
* Environment variables take precedence over values in the "mysql/db.env" file. This trick allows using values from a .env file in a local deployment, and environment variables are used in production.

* A "clear_env = no" clause in "fpm.conf" file allows passing environment variables to PHP scripts. Only variables listed in "docker-compose.yml" are passed, so it's safe.

* PHP scripts access the MySQL service over a separate network, while Nginx communicates with PHP over a Unix socket. The database can easily be moved and replicated to separate servers in a cluster with minor configuration changes.
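
For the local deployment mentioned in the key points, a self-signed certificate can be produced with a single openssl call. This is a minimal sketch rather than the init script from my configs: the function name make_self_signed, the output folder and the default host name are assumptions.

```shell
#!/bin/sh
# Sketch: create a self-signed certificate and key for local development.
# Usage: make_self_signed OUTPUT_DIR [HOSTNAME]
make_self_signed() {
    dir=$1
    host=${2:-localhost}
    mkdir -p "$dir" &&
    openssl req -x509 -newkey rsa:2048 -nodes \
        -days 365 \
        -subj "/CN=$host" \
        -keyout "$dir/$host.key" \
        -out "$dir/$host.crt" 2>/dev/null
}
```

Calling "make_self_signed /etc/nginx/ssl localhost" would write localhost.key and localhost.crt into that folder. Browsers will still show a warning for such a certificate, which is acceptable for development.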

A single-instance PHP site is quite productive if done right: it may handle traffic on the scale of a million daily users, and it is very cost-effective. Just keep in mind that a single-instance architecture has SPOFs, does not provide failover, and complicates blue-green deployment.

That's it, now I can deploy simple PHP scripts with Nginx, MySQL, TLS with ACME certificates, HTTP/2 and IPv6 using a single command.

Apr 29, 2016

PHP 7 on Mac OS X custom build, Xdebug and Composer

How to build PHP 7 on Mac OS X.


  1. Install Xcode
  2. Add the command line tools. They can be added with the command xcode-select --install
  3. Download the latest version from http://php.net/downloads.php
  4. Install OpenSSL with Homebrew: brew install openssl
  5. Find your OpenSSL version: ls /usr/local/Cellar/openssl/
  6. Use it in the configure command:
'./configure' \
'--prefix=/usr/local' \
'--with-config-file-path=/usr/local/etc' \
'--enable-debug' \
'--with-openssl=/usr/local/Cellar/openssl/1.0.2g' \
'--enable-mbstring' \
'--enable-fpm' \
'--enable-mysqlnd' \
'--with-pdo-mysql' \
'--with-mysqli=shared' \
'--with-bz2=shared,/usr/lib' \
'--enable-bcmath=shared' \
'--with-curl=shared' \
'--with-freetype-dir=/usr' \
'--with-png-dir=/usr' \
'--with-gd=shared' \
'--enable-gd-native-ttf' \
'--with-jpeg-dir=shared,/usr' \
'--with-zlib=shared' \
'--with-xsl=shared' \
'--with-iconv=shared' \
'--with-pear' \
'--with-mhash=shared' \
'--disable-exif' \
'--disable-ftp' \
'--disable-sockets' \
'--disable-sysvsem' \
'--disable-sysvshm' \
'--disable-shmop' \
'--disable-ipv6' \
'--disable-posix' \
'--with-layout=PHP'

Note: the openssl and mbstring extensions are required by Composer, so it is convenient to compile them as non-shared.

7. Compile and install: make && make install

8. Copy the "php.ini-development" file from the source folder to /usr/local/etc/php.ini and edit it as needed

9. Add Xdebug: pecl install xdebug
Add "zend_extension=/usr/local/lib/php/extensions/debug-non-zts-20151012/xdebug.so" to /usr/local/etc/php.ini

10. Add an alias to ~/.bash_profile to run Composer without Xdebug. The -n option makes php ignore php.ini, so Xdebug is not loaded:
alias composer='php -n ~/bin/composer'
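
Steps 5 and 6 hardcode the OpenSSL version folder, which changes with every Homebrew upgrade. As a hedged alternative, the sketch below asks Homebrew for the install prefix with "brew --prefix openssl"; the helper name openssl_flag is hypothetical, and I have not checked this against every PHP and Homebrew version.

```shell
#!/bin/sh
# Sketch: build the --with-openssl flag from Homebrew's own answer instead
# of a hardcoded Cellar version folder.
openssl_flag() {
    prefix=$(brew --prefix openssl) || return 1
    echo "--with-openssl=$prefix"
}
```

The result can then replace the hardcoded flag in the configure command, for example by passing "$(openssl_flag)" to ./configure together with the other options.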

Feb 29, 2016

Docker image for PHP with common extensions and Composer

I created a Docker image for PHP 7.

The official PHP image lacks many commonly used extensions.
My image extends it, adding most of the extensions one uses in everyday work, and Composer. It is useful for setting up a PHP-FPM service quickly, without compiling extensions manually. Just initialize the configs in the mounted folders, edit them, disable the unused extensions, and you are done.

The image is here: https://hub.docker.com/r/grigori/phpextensions/

Later I will add a Docker Compose config file to set up PHP together with MySQL and Nginx.

Jul 11, 2011

my package is published at phpclasses.org

Here is the link:
http://www.phpclasses.org/package/7020-PHP-Retrieve-and-filter-request-variable-values.html

This is a set of classes in a namespace that let you use the filter_input() function conveniently, without remembering all those FILTER_ constants.
It provides an API for several frequently used filters, with convenient autocompletion in IDEs.

May 30, 2011

security issue with regular expressions

Ok, Yii fixed its security issue with regular expressions in validators that I was worried about.

It turns out that serious PHP applications use regular expressions as a tool for checking user input without paying attention to the documented limitations.

Everyone around talks about SQL injections, but when it comes to regular expressions, you need to explain this even to the authors of the framework.

In fact, I like how Qiang closes bugs in Yii in minutes; the most important thing is to make him notice them :)